Model Inference
Model inference, the process of using a trained machine learning model to make predictions on new data, is an active area of research focused on improving efficiency, accuracy, and robustness. Current efforts concentrate on mitigating hallucinations in large language and vision-language models, optimizing resource allocation for inference and retraining (especially in edge computing scenarios), and strengthening privacy and security during inference. These advances are essential for deploying machine learning models effectively across applications ranging from real-time IoT systems to large-scale data analysis, while addressing challenges of computational cost, data heterogeneity, and model interpretability.
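To make the core idea concrete, the sketch below shows inference in its simplest form: applying a model's fixed, already-trained parameters to a new input to produce a prediction. The linear model and its weights are hypothetical, chosen only to illustrate the inference step itself (no training happens here).

```python
def predict(weights, bias, features):
    """Linear model inference: y = w . x + b, with parameters held fixed."""
    return sum(w * x for w, x in zip(weights, features)) + bias

# Hypothetical "trained" parameters, standing in for a real model.
weights = [0.5, -0.2]
bias = 1.0

# Inference on a previously unseen input: 0.5*2.0 - 0.2*3.0 + 1.0
print(predict(weights, bias, [2.0, 3.0]))
```

The key property distinguishing inference from training is that the parameters are read-only: the same input always yields the same prediction, which is what makes inference amenable to the efficiency, privacy, and resource-allocation optimizations discussed above.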