LLM Training
Large language model (LLM) training focuses on developing these powerful models efficiently and effectively, primarily by optimizing resource allocation, minimizing communication overhead during distributed training, and improving data selection and utilization strategies. Current research emphasizes techniques such as parameter-efficient fine-tuning (PEFT), novel optimization algorithms (e.g., those that reduce communication or improve memory efficiency), and data-centric approaches such as curriculum learning and synthetic data refinement. These advances are crucial for making LLM training more scalable, cost-effective, and robust, benefiting applications that range from on-device deployment to high-performance computing and improving the reliability and safety of LLMs.
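Of the techniques named above, parameter-efficient fine-tuning is the easiest to make concrete. The sketch below shows a minimal LoRA-style adapter in PyTorch; the class name, rank, scaling factor, and layer size are illustrative assumptions rather than details taken from any of the papers listed here.

```python
# Minimal LoRA-style PEFT sketch: freeze a pretrained linear layer and learn
# only a low-rank update B @ A on top of it. All hyperparameters are examples.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen base layer plus trainable low-rank update: y = Wx + (alpha/r) * B A x."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False                     # pretrained weights stay fixed
        self.lora_a = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(base.out_features, r))
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scaling * (x @ self.lora_a.T @ self.lora_b.T)

# Only the two low-rank factors receive gradients, so optimizer state and
# gradient memory shrink from O(d_in * d_out) to O(r * (d_in + d_out)).
layer = LoRALinear(nn.Linear(4096, 4096))
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(f"trainable parameters: {trainable}")             # 65,536 instead of ~16.8M
```

Because gradients and optimizer state exist only for the adapter weights, PEFT methods of this kind cut fine-tuning memory and compute substantially, which is what makes them attractive for resource-constrained and on-device settings.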
Papers
Towards Transparency: Exploring LLM Trainings Datasets through Visual Topic Modeling and Semantic Frame
Charles de Dampierre, Andrei Mogoutov, Nicolas Baumard
ACCO: Accumulate while you Communicate, Hiding Communications in Distributed LLM Training
Adel Nabli, Louis Fournier, Pierre Erbacher, Louis Serrano, Eugene Belilovsky, Edouard Oyallon
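The ACCO title points to a broader theme in distributed LLM training: overlapping gradient communication with computation so that network transfers are hidden behind useful work. The sketch below is a generic illustration of that idea using asynchronous all-reduce in torch.distributed, not the authors' algorithm; it assumes a process group has already been initialized (e.g., via torchrun) and that model, optimizer, loss_fn, and batch are defined elsewhere.

```python
# Generic communication/computation overlap sketch (not ACCO itself): each
# parameter's gradient is all-reduced asynchronously as soon as backward
# produces it, so communication runs while earlier layers are still computing.
import torch
import torch.distributed as dist

def train_step_with_overlap(model, optimizer, loss_fn, batch):
    inputs, targets = batch
    pending = []                                   # (async handle, param, grad tensor)

    def make_hook(param):
        def hook(grad):
            work = dist.all_reduce(grad, async_op=True)   # non-blocking sum
            pending.append((work, param, grad))
            return grad
        return hook

    handles = [p.register_hook(make_hook(p))
               for p in model.parameters() if p.requires_grad]

    optimizer.zero_grad(set_to_none=True)
    loss = loss_fn(model(inputs), targets)
    loss.backward()                                # communication overlaps with this call

    world_size = dist.get_world_size()
    for work, param, grad in pending:
        work.wait()                                # ensure the reduced gradient arrived
        param.grad = grad.div_(world_size)         # turn the global sum into an average
    optimizer.step()

    for h in handles:                              # clean up so hooks are not duplicated
        h.remove()
    return loss.item()
```

Production frameworks such as PyTorch DDP perform the same overlap with gradient buckets rather than per-parameter hooks; the specific accumulate-while-communicating scheme proposed in ACCO is described in the paper itself.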