Sample Importance

Sample importance concerns identifying and prioritizing the most informative data points in a dataset so that models train more efficiently and perform better. Current research explores several ways to quantify it, ranging from information-theoretic approaches (e.g., structural entropy or data-compression principles) to second-order methods that analyze model sensitivity (e.g., Hessian-based influence functions). These techniques are applied in active learning, data pruning for large language models, and robust training of models for audio and video processing.
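
As a rough illustration of the Hessian-based influence idea mentioned above, the sketch below scores each training point by a first-order estimate of how much a single test loss would change if that point were removed. It is a minimal example under stated assumptions, not the method of any particular paper listed here: the model is ridge regression with squared loss (chosen because its Hessian has a closed form), and the function names `fit_ridge` and `removal_effect_on_test_loss`, the synthetic data, and the regularization strength are all hypothetical.

```python
import numpy as np

def fit_ridge(X, y, lam):
    """Minimize (1/n) * sum_i 0.5*(x_i.theta - y_i)^2 + (lam/2)*||theta||^2."""
    n, d = X.shape
    H = X.T @ X / n + lam * np.eye(d)        # Hessian of the objective (constant for ridge)
    theta = np.linalg.solve(H, X.T @ y / n)  # closed-form minimizer
    return theta, H

def removal_effect_on_test_loss(X, y, x_test, y_test, lam=0.1):
    """First-order estimate of the test-loss change if each training point were removed."""
    n = X.shape[0]
    theta, H = fit_ridge(X, y, lam)
    grads = (X @ theta - y)[:, None] * X         # per-sample loss gradients, shape (n, d)
    g_test = (x_test @ theta - y_test) * x_test  # test-loss gradient, shape (d,)
    # Influence of upweighting point i is -g_test^T H^{-1} grad_i.
    # Removal corresponds to a weight change of -1/n, which flips the sign.
    return grads @ np.linalg.solve(H, g_test) / n

# Synthetic data (assumed for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.3 * rng.normal(size=200)
x_test, y_test = np.array([1.0, -0.5, 2.0]), 4.0

effects = removal_effect_on_test_loss(X, y, x_test, y_test)
ranking = np.argsort(-np.abs(effects))  # training indices, most influential first
```

A negative entry in `effects` suggests that removing the point would lower the test loss, and a positive entry suggests the point is helping. For larger models the Hessian cannot be formed or solved directly, so practical methods typically rely on approximations.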

Papers