Data Complexity

Data complexity, encompassing both the inherent intricacy of data structures and the difficulty of learning from them, is a central challenge in machine learning and related fields. Current research focuses on quantifying data complexity through a range of metrics, including geometric measures such as local intrinsic dimension and information-theoretic approaches based on compression algorithms and influence functions, and on applying these metrics to improve model performance and efficiency. This work is crucial for developing more robust and reliable machine learning models, particularly in high-dimensional settings and for tasks involving complex data distributions, and it ultimately affects the accuracy and efficiency of a wide range of applications.
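To make the two metric families above concrete, the following is a minimal sketch (not any specific paper's method) of two simple complexity estimates: a TwoNN-style local intrinsic dimension estimate (in the spirit of Facco et al.'s two-nearest-neighbor estimator) and a compression-ratio score as a rough information-theoretic proxy. Function names and the toy data are illustrative assumptions.

```python
import math
import random
import zlib

def compression_complexity(data: bytes) -> float:
    """Rough information-theoretic proxy: compressed size over raw size.
    Values near 1.0 suggest incompressible (high-complexity) data."""
    if not data:
        return 0.0
    return len(zlib.compress(data, 9)) / len(data)

def twonn_intrinsic_dimension(points) -> float:
    """TwoNN-style intrinsic dimension estimate from the ratio of each
    point's second- to first-nearest-neighbor distance."""
    logs = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        r1, r2 = dists[0], dists[1]
        if r1 > 0:
            logs.append(math.log(r2 / r1))
    # Maximum-likelihood estimate: d = N / sum(log(r2/r1))
    return len(logs) / sum(logs)

# Illustrative usage on toy data.
random.seed(0)
# Points sampled uniformly in 3-D: the estimate should land near 3.
pts = [tuple(random.random() for _ in range(3)) for _ in range(300)]
dim_estimate = twonn_intrinsic_dimension(pts)

# A highly repetitive byte string compresses far better than random bytes.
repetitive = compression_complexity(bytes(range(256)) * 10)
incompressible = compression_complexity(random.randbytes(2560))
```

The brute-force nearest-neighbor search here is quadratic in the number of points; practical implementations would use a k-d tree or approximate nearest-neighbor index for large datasets.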

Papers