Sample Selection

Sample selection aims to make machine learning model training more efficient and effective by choosing a subset of the available data rather than using all of it. Current research develops selection strategies that combine local information (e.g., the difficulty of individual samples) with global information (e.g., the structure of the dataset as a whole), often using graph-based methods, modified Frank-Wolfe algorithms, and ensemble approaches. These methods target noisy labels, imbalanced datasets, and limited annotation budgets, with the goal of more robust and accurate models in applications such as medical image analysis and natural language processing.
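
To make the local/global distinction concrete, here is a minimal sketch in Python. It is not taken from any of the papers listed here: the function name `select_samples`, the per-sample `difficulty` scores, and the precomputed `cluster_ids` are all hypothetical inputs chosen for illustration. The sketch uses cluster membership as a stand-in for global data structure and a difficulty score as the local signal, then fills a fixed budget by visiting clusters in round-robin order and taking the hardest remaining sample from each.

```python
from collections import defaultdict


def select_samples(difficulty, cluster_ids, budget):
    """Return up to `budget` sample indices.

    Assumptions (illustrative only): `difficulty[i]` is a local score where
    higher means harder, and `cluster_ids[i]` is a precomputed cluster label
    that stands in for the global structure of the data.
    """
    if len(difficulty) != len(cluster_ids):
        raise ValueError("difficulty and cluster_ids must have the same length")

    # Global step: group sample indices by cluster.
    groups = defaultdict(list)
    for idx, cid in enumerate(cluster_ids):
        groups[cid].append(idx)

    # Local step: within each cluster, order samples hardest-first.
    queues = [
        sorted(members, key=lambda i: -difficulty[i])
        for _, members in sorted(groups.items())
    ]

    # Round-robin over clusters so no single cluster dominates the budget.
    selected = []
    position = 0
    while len(selected) < budget and any(position < len(q) for q in queues):
        for queue in queues:
            if position < len(queue) and len(selected) < budget:
                selected.append(queue[position])
        position += 1
    return selected


if __name__ == "__main__":
    scores = [0.9, 0.1, 0.8, 0.3, 0.7, 0.2]
    clusters = [0, 0, 0, 1, 1, 2]
    print(select_samples(scores, clusters, budget=4))  # [0, 4, 5, 2]
```

In this toy run, each of the three clusters contributes its hardest sample before cluster 0 is allowed a second pick, which is the kind of coverage-versus-difficulty trade-off the strategies above try to balance in more principled ways.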

Papers