Artificial Intelligence Inference
Artificial intelligence (AI) inference focuses on efficiently and reliably deploying trained AI models for real-world applications. Current research emphasizes optimizing inference speed and resource utilization across diverse hardware platforms, including edge devices, cloud systems, and high-performance computing clusters, often employing techniques like parameter-efficient fine-tuning and novel model architectures designed for specific hardware. This work is crucial for enabling widespread adoption of AI in various fields, from healthcare and finance to scientific discovery, by addressing challenges related to latency, cost, security, and energy consumption. Furthermore, research is actively exploring methods to improve the fairness and environmental sustainability of AI inference.
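To make the latency and resource trade-offs above concrete, here is a minimal sketch of post-training int8 weight quantization, one common inference optimization for edge deployment. This is an illustrative example, not the method of any paper listed below; the function names and the per-tensor scaling scheme are assumptions for the sketch.

```python
# Illustrative sketch: symmetric per-tensor int8 quantization of weights.
# Shrinks storage from 32-bit floats to 8-bit integers plus one scale factor,
# trading a small accuracy loss for lower memory and faster integer math.

def quantize_int8(weights):
    """Map float weights to int8 values in [-127, 127] with one shared scale."""
    scale = max(abs(w) for w in weights) / 127 or 1.0  # avoid divide-by-zero
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from int8 values and the scale."""
    return [v * scale for v in q]

weights = [0.5, -1.27, 0.03]
q, scale = quantize_int8(weights)
recovered = dequantize(q, scale)
```

Real deployments typically quantize per channel and calibrate activations as well; this per-tensor version only shows the core idea of trading numeric precision for inference cost.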
Papers
Symbolic Regression on FPGAs for Fast Machine Learning Inference
Ho Fung Tsoi, Adrian Alan Pol, Vladimir Loncar, Ekaterina Govorkova, Miles Cranmer, Sridhara Dasu, Peter Elmer, Philip Harris, Isobel Ojalvo, Maurizio Pierini
A Blockchain-based Platform for Reliable Inference and Training of Large-Scale Models
Sanghyeon Park, Junmo Lee, Soo-Mook Moon