LLM Training

Large language model (LLM) training focuses on developing these models efficiently and effectively, primarily by optimizing resource allocation, minimizing communication overhead during distributed training, and improving data selection and utilization strategies. Current research emphasizes techniques like parameter-efficient fine-tuning (PEFT), novel optimization algorithms (e.g., those that reduce communication or improve memory efficiency), and data-centric approaches such as curriculum learning and synthetic data refinement. These advances are crucial for making LLM training more scalable, cost-effective, and robust, with impact on applications ranging from mobile devices to high-performance computing, and for improving the reliability and safety of LLMs.
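
To make the parameter-efficient fine-tuning idea above concrete, the sketch below wraps a frozen linear layer with trainable low-rank adapters in the style of LoRA, using only PyTorch. The class name, rank, and scaling hyperparameters are illustrative assumptions, not taken from any specific paper listed here.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen pretrained linear layer plus a trainable low-rank update (LoRA-style sketch)."""

    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        # Freeze the pretrained weights: only the low-rank factors receive gradients.
        for p in self.base.parameters():
            p.requires_grad = False
        in_f, out_f = base.in_features, base.out_features
        # Low-rank factors: A projects down to `rank`, B projects back up.
        self.lora_a = nn.Parameter(torch.randn(rank, in_f) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(out_f, rank))
        self.scaling = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Base output plus the scaled low-rank correction.
        return self.base(x) + (x @ self.lora_a.T @ self.lora_b.T) * self.scaling

# Usage sketch: swap in the adapter and optimize only the trainable parameters,
# which is a small fraction of the full layer's parameter count.
layer = LoRALinear(nn.Linear(768, 768))
trainable = [p for p in layer.parameters() if p.requires_grad]
optimizer = torch.optim.AdamW(trainable, lr=1e-4)
out = layer(torch.randn(4, 768))
```

Because only the two low-rank factors are updated, optimizer state and gradient memory shrink accordingly, which is the main reason PEFT methods reduce the cost of adapting large pretrained models.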

Papers