Data Mixing
Data mixing is the practice of combining diverse datasets to train machine learning models, with the aim of improving generalization and training efficiency. Current research focuses on optimizing the proportion of each data source in the mixture, often using gradient alignment algorithms or bivariate scaling laws to predict optimal combinations and reduce computational costs. These advances are particularly relevant to large language models and self-supervised learning, where they improve performance on downstream tasks and data efficiency in resource-constrained environments. The resulting gains in model accuracy and robustness benefit applications including satellite navigation and natural language processing.
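To make the idea of mixture proportions concrete, the following is a minimal sketch of proportional sampling from several datasets. It does not reproduce the method of any paper listed here: the function name `mix_datasets`, the dataset names, and the weights are all hypothetical, and the weights are assumed to be given rather than optimized.

```python
import random


def mix_datasets(datasets, weights, n_samples, seed=0):
    """Draw n_samples examples from several datasets.

    For each draw, a source dataset is chosen with probability
    proportional to its weight, then one example is picked uniformly
    from that dataset (with replacement).

    datasets: dict mapping a source name to a non-empty list of examples
    weights:  dict mapping the same source names to non-negative weights
    Returns a list of (source_name, example) pairs.
    """
    if set(datasets) != set(weights):
        raise ValueError("datasets and weights must have the same keys")
    names = sorted(datasets)
    probs = [weights[name] for name in names]
    if any(p < 0 for p in probs) or sum(probs) <= 0:
        raise ValueError("weights must be non-negative and not all zero")

    rng = random.Random(seed)  # local generator, so results are reproducible
    mixed = []
    for _ in range(n_samples):
        # Pick the source according to the mixture proportions.
        name = rng.choices(names, weights=probs, k=1)[0]
        # Pick one example uniformly from the chosen source.
        mixed.append((name, rng.choice(datasets[name])))
    return mixed


if __name__ == "__main__":
    # Hypothetical toy data: 70% web text, 30% code.
    toy = {"web": ["w1", "w2", "w3"], "code": ["c1", "c2"]}
    batch = mix_datasets(toy, {"web": 0.7, "code": 0.3}, n_samples=10)
    for source, example in batch:
        print(source, example)
```

In this sketch the weights are fixed by hand. The research summarized above is concerned with choosing those weights, for example by fitting scaling laws to small-scale training runs rather than by trial and error.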
Papers
Mix Data or Merge Models? Optimizing for Diverse Multi-Task Learning
Aakanksha, Arash Ahmadian, Seraphina Goldfarb-Tarrant, Beyza Ermis, Marzieh Fadaee, Sara Hooker
RICASSO: Reinforced Imbalance Learning with Class-Aware Self-Supervised Outliers Exposure
Xuan Zhang, Sin Chee Chin, Tingxuan Gao, Wenming Yang