Large Language Model Training
Large language model (LLM) training research aims to build increasingly capable models efficiently and reliably from massive datasets and large-scale compute. Current work concentrates on distributed training strategies (data, tensor, and pipeline parallelism) and on easing bottlenecks such as communication overhead and memory limits through compression, near-storage processing, and efficient communication topologies. The field underpins advances in AI capabilities across many applications, drives innovation in high-performance computing, and must contend with challenges of data quality, copyright, and environmental sustainability.
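As a rough illustration of the data parallelism mentioned above, the following sketch simulates several workers on one machine. Each worker computes a gradient on its own shard of the batch, and an averaging step stands in for the all-reduce that a real cluster would perform over the network. The one-parameter linear model, the function names, and the toy batch are all assumptions made for this example; they are not taken from any specific paper or framework.

```python
# Minimal single-process sketch of synchronous data parallelism.
# Assumptions: a toy model y_hat = w * x with mean squared error loss,
# equal-sized shards, and a plain Python average in place of a real all-reduce.

def local_gradient(w, shard):
    """Gradient of the mean squared error over one worker's shard."""
    return sum(2.0 * (w * x - y) * x for x, y in shard) / len(shard)

def all_reduce_mean(values):
    """Stand-in for an all-reduce: every worker would receive this mean."""
    return sum(values) / len(values)

def data_parallel_step(w, batch, num_workers, lr):
    """One SGD step where the batch is split evenly across simulated workers."""
    if len(batch) % num_workers != 0:
        raise ValueError("batch size must be divisible by num_workers")
    shard_size = len(batch) // num_workers
    shards = [batch[i * shard_size:(i + 1) * shard_size]
              for i in range(num_workers)]
    # Each simulated worker computes a gradient on its own shard only.
    grads = [local_gradient(w, shard) for shard in shards]
    # Averaging equal-sized shard gradients reproduces the full-batch gradient.
    return w - lr * all_reduce_mean(grads)

# Hypothetical toy data drawn from y = 2 * x.
batch = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
w_parallel = data_parallel_step(0.0, batch, num_workers=2, lr=0.01)
w_single = data_parallel_step(0.0, batch, num_workers=1, lr=0.01)
```

With equal-sized shards, the two-worker update matches the single-worker update up to floating-point rounding. That equivalence is why the averaging step, and therefore communication cost, sits at the center of data-parallel training.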