Model Inference
Model inference, the process of using a trained machine learning model to make predictions, is an active research area focused on improving efficiency, accuracy, and robustness. Current efforts concentrate on mitigating hallucinations in large language and vision-language models, optimizing resource allocation for inference and retraining (especially in edge computing scenarios), and strengthening privacy and security during inference. These advances are essential for deploying machine learning models effectively in applications ranging from real-time IoT systems to large-scale data analysis, while managing computational cost, data heterogeneity, and limited model interpretability.
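To make the core concept concrete, below is a minimal sketch of the inference step itself, assuming PyTorch as the framework; the model architecture, input shapes, and data here are hypothetical placeholders, not drawn from any paper in this digest.

```python
# Minimal inference sketch (hypothetical model and inputs).
import torch
import torch.nn as nn

# Stand-in for a trained model: a small classifier over 16 features.
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))

model.eval()  # switch layers like dropout/batch-norm to inference behavior
with torch.no_grad():  # skip gradient tracking to save memory and compute
    x = torch.randn(8, 16)        # a batch of 8 hypothetical inputs
    logits = model(x)             # forward pass: raw class scores
    preds = logits.argmax(dim=1)  # predicted class per input
print(preds)
```

The `eval()` and `no_grad()` pair is the standard way to separate inference from training: it disables training-only layer behavior and gradient bookkeeping, which is where much of the efficiency concern above originates.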
Papers
MLink: Linking Black-Box Models from Multiple Domains for Collaborative Inference
Mu Yuan, Lan Zhang, Zimu Zheng, Yi-Nan Zhang, Xiang-Yang Li
InFi: End-to-End Learning to Filter Input for Resource-Efficiency in Mobile-Centric Inference
Mu Yuan, Lan Zhang, Fengxiang He, Xueting Tong, Miao-Hui Song, Zhengyuan Xu, Xiang-Yang Li
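The InFi paper above studies learned input filtering for resource-efficient inference. As a generic, hedged illustration of that idea only (not the paper's actual method), a cheap filter network can score incoming inputs and let the expensive model run on just the ones worth processing; the filter architecture and threshold below are hypothetical.

```python
# Generic input-filtering sketch (illustrative only; not InFi's method).
import torch
import torch.nn as nn

full_model = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 4))
filter_net = nn.Sequential(nn.Linear(16, 8), nn.ReLU(), nn.Linear(8, 1))

THRESHOLD = 0.5  # hypothetical cutoff, tuned on validation data in practice

with torch.no_grad():
    x = torch.randn(32, 16)                      # incoming batch
    keep_score = torch.sigmoid(filter_net(x))    # cheap per-input score
    kept = x[keep_score.squeeze(1) > THRESHOLD]  # drop low-value inputs
    outputs = full_model(kept)                   # expensive model runs on fewer inputs
print(f"filtered {len(x) - len(kept)} of {len(x)} inputs")
```

The design point is that the filter must be much cheaper than the full model, so the saved forward passes outweigh the filtering overhead on resource-constrained (e.g., mobile) devices.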