Large Language Model Training
Large language model (LLM) training research aims to develop increasingly capable models efficiently and reliably from massive datasets and large-scale compute. Current work emphasizes distributed training strategies (data, tensor, and pipeline parallelism) and mitigates bottlenecks such as communication overhead and memory limits through techniques like compression, near-storage processing, and efficient communication topologies. The field is central to advancing AI capabilities across applications; it also drives innovation in high-performance computing and raises challenges around data quality, copyright, and environmental sustainability.
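To make two of the ideas above concrete, here is a minimal sketch of data parallelism with optional top-k gradient compression. It is a toy simulation in plain Python, not any framework's API: the linear model, the in-process "workers", and every function name (local_gradient, top_k_sparsify, all_reduce_mean, data_parallel_step) are assumptions chosen for illustration.

```python
# Toy simulation of data parallelism with optional top-k gradient compression.
# Every name here is hypothetical; no real training framework is involved.

def local_gradient(weights, shard):
    """Mean-squared-error gradient of a linear model on one worker's shard."""
    grad = [0.0] * len(weights)
    for features, target in shard:
        error = sum(w * x for w, x in zip(weights, features)) - target
        for i, x in enumerate(features):
            grad[i] += 2.0 * error * x / len(shard)
    return grad

def top_k_sparsify(grad, k):
    """Keep the k largest-magnitude entries and zero the rest (lossy)."""
    order = sorted(range(len(grad)), key=lambda i: abs(grad[i]), reverse=True)
    keep = set(order[:k])
    return [g if i in keep else 0.0 for i, g in enumerate(grad)]

def all_reduce_mean(grads):
    """Average per-worker gradients, standing in for an all-reduce."""
    return [sum(column) / len(grads) for column in zip(*grads)]

def data_parallel_step(weights, shards, lr=0.1, k=None):
    """One SGD step: each worker computes a gradient on its own shard,
    optionally sparsifies it, and the averaged result updates the weights."""
    grads = [local_gradient(weights, shard) for shard in shards]
    if k is not None:
        grads = [top_k_sparsify(g, k) for g in grads]
    mean_grad = all_reduce_mean(grads)
    return [w - lr * g for w, g in zip(weights, mean_grad)]

if __name__ == "__main__":
    data = [([1.0, 0.0], 1.0), ([0.0, 1.0], 2.0),
            ([1.0, 1.0], 3.0), ([2.0, 1.0], 4.0)]
    shards = [data[:2], data[2:]]  # two equally sized worker shards
    print(data_parallel_step([0.0, 0.0], shards))
    print(data_parallel_step([0.0, 0.0], shards, k=1))
```

With equally sized shards, the averaged per-worker gradient equals the full-batch gradient, so the uncompressed step matches single-worker training on the whole batch. Passing k trades some of that exactness for less data exchanged between workers, which is the communication-overhead trade-off the summary refers to.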
Papers
December 9, 2024
November 19, 2024
November 10, 2024
October 23, 2024
October 15, 2024
October 14, 2024
September 20, 2024
September 4, 2024
August 14, 2024
July 30, 2024
July 24, 2024
July 19, 2024
July 3, 2024
July 1, 2024
June 25, 2024
June 3, 2024
May 16, 2024
May 9, 2024
April 16, 2024