Training Point

Selecting and generating training points strongly affects machine learning model performance, particularly when data is limited or heterogeneous. Current research develops algorithms that select or synthesize the most informative training points, using techniques such as Shapley value-based data valuation, diffusion transformers for synthetic data generation, and adaptive collocation strategies for physics-informed neural networks. These methods aim to improve model robustness, generalization, and efficiency across diverse applications, from medical image analysis and federated learning to speech recognition and code summarization.
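As a rough illustration of Shapley value-based data valuation, the sketch below scores each training point by its average marginal contribution to validation accuracy over all orderings of the training set. Everything in it is an assumption made for illustration, not taken from any particular paper: the toy one-dimensional dataset, the 1-nearest-neighbor model, the choice to score the empty set as 0, and the hypothetical helper names `knn1_predict`, `utility`, and `exact_shapley`. Exact enumeration is only feasible for a handful of points; practical methods approximate it by sampling.

```python
from itertools import permutations

def knn1_predict(train, x):
    """Predict the label of x using the nearest training point (1-NN, scalar feature)."""
    nearest = min(train, key=lambda point: abs(point[0] - x))
    return nearest[1]

def utility(train, val):
    """Validation accuracy of 1-NN fit on `train`; an empty set scores 0 (assumption)."""
    if not train:
        return 0.0
    correct = sum(knn1_predict(train, x) == y for x, y in val)
    return correct / len(val)

def exact_shapley(train, val):
    """Exact Shapley value of each training point, averaged over all permutations."""
    n = len(train)
    values = [0.0] * n
    orderings = list(permutations(range(n)))
    for ordering in orderings:
        subset = []
        previous = utility(subset, val)
        for i in ordering:
            subset.append(train[i])
            current = utility(subset, val)
            values[i] += current - previous  # marginal contribution of point i
            previous = current
    return [v / len(orderings) for v in values]

# Toy data (hypothetical): (feature, label). The last training point is mislabeled.
train = [(0.0, 0), (1.0, 0), (4.0, 1), (5.0, 1), (0.5, 1)]
val = [(0.2, 0), (0.8, 0), (4.2, 1), (4.8, 1)]

values = exact_shapley(train, val)
for point, value in zip(train, values):
    print(point, round(value, 4))
```

On this toy data the mislabeled point receives the smallest value, and the values sum to the accuracy of the full training set, which is the efficiency property of the Shapley value. A selection procedure could then drop or down-weight the lowest-valued points.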

Papers