Memory-Efficient Fine-Tuning
Memory-efficient fine-tuning focuses on adapting large pre-trained language and vision models to specific downstream tasks while minimizing the computational resources and memory required. Current research emphasizes techniques like low-rank adaptation (LoRA), quantization (e.g., 2-bit, 4-bit), and selective parameter updates (e.g., freezing layers, using adapters), often combined with strategies like reversible networks or approximate backpropagation. These advancements are crucial for deploying large models on resource-constrained devices and making advanced AI accessible to a wider range of users and applications, reducing both the financial and environmental costs of training and inference.
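To make the low-rank adaptation idea concrete, below is a minimal PyTorch sketch of a LoRA-style wrapper around a linear layer: the pre-trained weight is frozen and only a small low-rank update is trained, which also illustrates the selective-parameter-update strategy of freezing layers. The layer width (768) and the rank and scaling hyperparameters (r=8, alpha=16) are illustrative placeholders, not values from any particular paper; practical setups typically rely on a library such as Hugging Face PEFT rather than hand-rolled modules.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Minimal LoRA sketch: frozen base weight plus a trainable
    low-rank update, W x + (alpha / r) * B A x."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False          # freeze the pre-trained weights
        self.scale = alpha / r
        # A is initialized with small random values, B with zeros,
        # so the adapted layer starts out identical to the base layer.
        self.lora_A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, r))

    def forward(self, x):
        # Base output plus the low-rank correction; only A and B get gradients.
        return self.base(x) + self.scale * (x @ self.lora_A.T @ self.lora_B.T)

# Usage: swap a layer in a pre-trained model and optimize only the adapter weights.
layer = LoRALinear(nn.Linear(768, 768))
trainable = [p for p in layer.parameters() if p.requires_grad]
optimizer = torch.optim.AdamW(trainable, lr=1e-4)
```

Because only the two small matrices receive gradients, the optimizer state and gradient buffers shrink accordingly, which is where most of the memory savings come from; combining this with weight quantization of the frozen base model reduces memory further.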