Memory-Efficient Fine-Tuning
Memory-efficient fine-tuning adapts large pre-trained language and vision models to specific downstream tasks while minimizing the compute and memory required. Current research emphasizes low-rank adaptation (LoRA), quantization (e.g., 2-bit, 4-bit), and selective parameter updates (e.g., freezing layers or training small adapter modules), often combined with strategies such as reversible networks or approximate backpropagation. These advances matter for deploying large models on resource-constrained devices and for making advanced AI accessible to a wider range of users and applications, and they reduce both the financial and environmental costs of training and inference.
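To make the savings from low-rank adaptation and quantization concrete, here is a minimal back-of-the-envelope sketch in plain Python. It assumes the common LoRA formulation in which a frozen weight matrix W of shape (d_out, d_in) is adapted as W + B·A, with B of shape (d_out, r) and A of shape (r, d_in). The layer size (4096 x 4096), the rank (8), and the helper names are hypothetical choices for illustration and are not taken from any specific paper or library.

```python
def lora_trainable_params(d_out: int, d_in: int, rank: int) -> int:
    """Trainable parameters for a low-rank update B @ A.

    B has shape (d_out, rank) and A has shape (rank, d_in); the original
    d_out x d_in weight stays frozen and contributes no trainable parameters.
    """
    return rank * (d_out + d_in)


def weight_memory_bytes(num_params: int, bits_per_param: int) -> float:
    """Storage for the weights alone, ignoring quantization metadata
    (scales, zero points), optimizer state, and activations."""
    return num_params * bits_per_param / 8


# Hypothetical layer dimensions and rank, chosen only for illustration.
d_out, d_in, rank = 4096, 4096, 8

full_params = d_out * d_in
lora_params = lora_trainable_params(d_out, d_in, rank)

print(f"full fine-tuning: {full_params:,} trainable parameters")
print(f"LoRA (rank {rank}): {lora_params:,} trainable parameters")
print(f"trainable fraction: {lora_params / full_params:.4%}")

# Memory for the frozen base weights at different precisions.
for bits in (16, 4, 2):
    mib = weight_memory_bytes(full_params, bits) / 2**20
    print(f"{bits}-bit base weights: {mib:.0f} MiB")
```

Under these assumptions, the low-rank update trains well under 1% of the layer's parameters, and storing the frozen base weights at 4 bits instead of 16 cuts their footprint by a factor of four. Real systems also carry per-block quantization metadata and optimizer state, so actual savings will differ from this idealized estimate.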