Model Scaling
Model scaling investigates how increasing model size, training data, and compute affects the performance of large language models (LLMs) and other deep learning architectures, with the goal of optimizing both capability and efficiency. Current research focuses on characterizing scaling laws across model types, including transformers and graph neural networks, and on techniques such as parameter-efficient tuning and test-time computation optimization that maximize performance gains per unit of resources. These findings are crucial both for advancing the theoretical understanding of deep learning and for building more powerful, resource-efficient AI systems across diverse applications.
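To make the idea of a scaling law concrete, the following is a minimal sketch of fitting a saturating power law of loss versus parameter count. The data points, constants, and function names are purely illustrative assumptions, not results from any of the papers listed below.

```python
# Minimal sketch: fitting a power-law scaling curve of the form
# loss(N) = a * N^(-alpha) + c, where N is the parameter count.
# All data below is synthetic and purely illustrative.
import numpy as np
from scipy.optimize import curve_fit

def scaling_law(n_params, a, alpha, c):
    """Saturating power law often used to model loss vs. model size."""
    return a * n_params ** (-alpha) + c

# Hypothetical (parameter count, validation loss) observations.
n = np.array([1e7, 3e7, 1e8, 3e8, 1e9, 3e9])
loss = np.array([3.97, 3.70, 3.44, 3.24, 3.05, 2.89])

# Fit the three coefficients; p0 gives a rough starting point.
(a, alpha, c), _ = curve_fit(scaling_law, n, loss, p0=(10.0, 0.1, 2.0))
print(f"a={a:.3f}, alpha={alpha:.3f}, irreducible loss c={c:.3f}")

# Extrapolate the fitted curve to a larger model size.
print(f"predicted loss at 10B params: {scaling_law(1e10, a, alpha, c):.3f}")
```

Fits of this kind are what allow practitioners to budget model size, data, and compute before committing to a large training run.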
Papers
A Collaborative Ensemble Framework for CTR Prediction
Xiaolong Liu, Zhichen Zeng, Xiaoyi Liu, Siyang Yuan, Weinan Song, Mengyue Hang, Yiqun Liu, Chaofei Yang, Donghyun Kim, Wen-Yen Chen, Jiyan Yang, Yiping Han, Rong Jin, Bo Long, Hanghang Tong, Philip S. Yu
Training Bilingual LMs with Data Constraints in the Targeted Language
Skyler Seto, Maartje ter Hoeve, He Bai, Natalie Schluter, David Grangier