Training Data
Training data is central to machine learning model development. Current research focuses on improving data quality, using data more efficiently, and mitigating bias. Active areas include generating synthetic data to address scarcity or privacy concerns, developing algorithms that optimize data selection and usage (e.g., self-paced learning, active learning), and addressing problems like data contamination and imbalance through techniques such as data augmentation, selective parameter merging, and novel loss functions. The quality and characteristics of training data strongly affect model performance, generalization, and robustness in applications ranging from natural language processing and image recognition to scientific computing and medical diagnosis.
Papers
FedComLoc: Communication-Efficient Distributed Training of Sparse and Quantized Models
Kai Yi, Georg Meinhardt, Laurent Condat, Peter Richtárik
Multilingual Audio-Visual Speech Recognition with Hybrid CTC/RNN-T Fast Conformer
Maxime Burchi, Krishna C. Puvvada, Jagadeesh Balam, Boris Ginsburg, Radu Timofte
Fairness Feedback Loops: Training on Synthetic Data Amplifies Bias
Sierra Wyllie, Ilia Shumailov, Nicolas Papernot
Annotations on a Budget: Leveraging Geo-Data Similarity to Balance Model Performance and Annotation Cost
Oana Ignat, Longju Bai, Joan Nwatu, Rada Mihalcea
IM-Unpack: Training and Inference with Arbitrarily Low Precision Integers
Zhanpeng Zeng, Karthikeyan Sankaralingam, Vikas Singh
Gaussian Loss Smoothing Enables Certified Training with Tight Convex Relaxations
Stefan Balauca, Mark Niklas Müller, Yuhao Mao, Maximilian Baader, Marc Fischer, Martin Vechev
Re-Simulation-based Self-Supervised Learning for Pre-Training Foundation Models
Philip Harris, Michael Kagan, Jeffrey Krupa, Benedikt Maier, Nathaniel Woodward
Transfer Learning for Security: Challenges and Future Directions
Adrian Shuai Li, Arun Iyengar, Ashish Kundu, Elisa Bertino
Flatten Long-Range Loss Landscapes for Cross-Domain Few-Shot Learning
Yixiong Zou, Yicong Liu, Yiman Hu, Yuhua Li, Ruixuan Li
Robust Deep Reinforcement Learning Through Adversarial Attacks and Training: A Survey
Lucas Schott, Josephine Delas, Hatem Hajri, Elies Gherbi, Reda Yaich, Nora Boulahia-Cuppens, Frederic Cuppens, Sylvain Lamprier