Multiplicative Size Scaling
Multiplicative size scaling in machine learning investigates how model performance changes as model parameters, training data, and other resources are scaled up. Current research focuses on characterizing and improving this scaling behavior across model architectures, including transformers, diffusion models, and graph neural networks, often employing techniques such as parameter-efficient fine-tuning and improved data-sampling strategies to enhance efficiency and generalization. These investigations are crucial for developing more powerful and resource-efficient AI systems, with impact on fields ranging from natural language processing and computer vision to scientific computing and robotics. A key theme is moving beyond simple scaling to understanding and optimizing the interplay between model size, data quality, and training methodology.
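To make the idea of "how performance changes with scale" concrete, the sketch below fits a power-law scaling curve of the kind commonly reported in empirical scaling-law studies, where loss decreases roughly as loss(N) ≈ a · N^(−α) + c with parameter count N. The constants, data points, and the assumption that the irreducible term c is known are illustrative only, not taken from any of the papers listed here.

```python
# Minimal sketch: recovering a power-law scaling exponent from
# hypothetical (parameter count, validation loss) measurements.
# Assumed form: loss(N) = a * N**(-alpha) + c (values are made up).
import numpy as np

rng = np.random.default_rng(0)

n_params = np.array([1e6, 3e6, 1e7, 3e7, 1e8, 3e8, 1e9])
true_a, true_alpha, true_c = 50.0, 0.35, 1.7

# Excess loss above the irreducible floor, with small multiplicative noise.
excess = true_a * n_params ** (-true_alpha)
excess *= 1.0 + 0.02 * rng.standard_normal(excess.shape)
loss = excess + true_c

# With the floor c subtracted, the relation is linear in log-log space,
# so a least-squares line recovers the exponent alpha and coefficient a.
slope, intercept = np.polyfit(np.log(n_params), np.log(loss - true_c), 1)

print(f"fitted exponent alpha ~ {-slope:.3f}")      # close to 0.35
print(f"fitted coefficient a ~ {np.exp(intercept):.1f}")
```

In practice the irreducible term and the exponents along the parameter and data axes are fit jointly from many training runs; this single-axis fit is only meant to show the shape of the relationship the summary above refers to.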
Papers
Scaling Large-Language-Model-based Multi-Agent Collaboration
Chen Qian, Zihao Xie, Yifei Wang, Wei Liu, Yufan Dang, Zhuoyun Du, Weize Chen, Cheng Yang, Zhiyuan Liu, Maosong Sun
Scaling up masked audio encoder learning for general audio classification
Heinrich Dinkel, Zhiyong Yan, Yongqing Wang, Junbo Zhang, Yujun Wang, Bin Wang