Device LLM
Device LLMs are large language models deployed directly on mobile devices to enhance privacy, reduce latency, and enable new classes of mobile applications. Current research emphasizes efficient model compression techniques such as quantization, novel architectures designed for resource-constrained hardware (including NPUs), and derivative-free optimization methods for on-device fine-tuning. The field is significant because it addresses critical limitations of cloud-based LLMs, paving the way for personalized, privacy-preserving AI applications on mobile devices. Addressing security vulnerabilities, such as data leakage during inference, is another key area of ongoing investigation.
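Two of the techniques named above lend themselves to a short illustration. The sketch below, in plain NumPy, shows (a) symmetric per-tensor int8 post-training quantization, a common baseline for the compression work referenced here, and (b) a two-point zeroth-order gradient estimate of the kind used by derivative-free fine-tuning methods. The function names, hyperparameters, and toy objective are illustrative assumptions, not drawn from any of the listed papers.

```python
import numpy as np

def quantize_int8(w):
    # Symmetric per-tensor int8 quantization: one float scale, 1 byte per weight.
    scale = float(np.max(np.abs(w))) / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q, scale):
    return q.astype(np.float32) * scale

def zeroth_order_grad(loss_fn, theta, eps=1e-3, rng=None):
    # Two-point (SPSA-style) estimate: two forward passes, no backpropagation,
    # so the memory footprint stays close to inference level.
    if rng is None:
        rng = np.random.default_rng()
    u = rng.standard_normal(theta.shape)
    g = (loss_fn(theta + eps * u) - loss_fn(theta - eps * u)) / (2 * eps)
    return g * u

# Toy demonstration on a random weight matrix.
w = np.random.randn(8, 8).astype(np.float32)
q, s = quantize_int8(w)
print("max abs quantization error:", np.abs(w - dequantize_int8(q, s)).max())

loss = lambda th: float(np.sum(th ** 2))  # stand-in objective
w -= 0.01 * zeroth_order_grad(loss, w)    # one derivative-free update step
```

The zeroth-order step trades backpropagation memory for extra forward passes, which is the trade that makes fine-tuning plausible on memory-limited mobile hardware; the quantized weights cut storage and bandwidth by 4x relative to float32.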
Papers
[Paper list: 12 entries, dated June 1, 2023 to November 7, 2024; titles not retained in this extract]