Mobile Inference
Mobile inference focuses on optimizing the execution of deep learning models on mobile devices, aiming for lower latency, lower energy consumption, and stronger privacy. Current research emphasizes efficient model architectures such as transformers and MobileNets, incorporating techniques like sparsity, quantization, and novel attention mechanisms to reduce computational cost. These advances are crucial for enabling resource-constrained mobile devices to run complex AI applications, with impact on mobile vision, natural language processing, and collaborative intelligence. Research is also actively addressing privacy concerns through data masking and selective offloading to the cloud.
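Quantization, one of the techniques mentioned above, shrinks model size and speeds up arithmetic by storing weights in low-precision integers instead of 32-bit floats. A minimal sketch of symmetric per-tensor int8 quantization in NumPy (the function names here are illustrative, not from any specific mobile framework):

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor int8 quantization: map floats to [-127, 127]."""
    scale = np.max(np.abs(w)) / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover an approximate float tensor from int8 values and the scale."""
    return q.astype(np.float32) * scale

w = np.array([0.5, -1.2, 0.03, 0.9], dtype=np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
# int8 storage is 4x smaller; reconstruction error is bounded by the scale
assert np.max(np.abs(w - w_hat)) <= scale
```

Production mobile runtimes typically refine this with per-channel scales and calibration data, but the storage and bandwidth savings follow the same principle: 4x fewer bytes per weight, with arithmetic that maps well to mobile integer SIMD units.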