Intrinsic Dimension

Intrinsic dimension (ID) is the minimum number of parameters needed to represent a dataset without significant loss of information, even when the data is embedded in a much higher-dimensional space. For example, points lying on a curved surface in three-dimensional space have an intrinsic dimension of two. Current research focuses on robustly estimating ID across diverse data types (images, text, biological data), using methods that range from manifold learning and quantum machine learning to topological data analysis and autoencoders, often in the context of deep neural networks. Understanding and exploiting ID helps improve model efficiency, generalization, fairness, and robustness in machine learning, and it offers insight into the underlying structure of complex datasets across scientific domains.
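
To make the definition concrete, the following is a minimal sketch of one of the simplest possible estimators: counting how many principal components are needed to explain most of the variance. It is an illustration only, not a method taken from the work summarized here. It captures only linear structure, so it will overestimate the ID of curved manifolds. The function name `pca_intrinsic_dimension`, the 0.99 variance threshold, and the synthetic data (a 3-dimensional latent signal embedded in 20 dimensions with small noise) are all assumptions chosen for the example.

```python
import numpy as np


def pca_intrinsic_dimension(X, variance_threshold=0.99):
    """Estimate a linear intrinsic dimension as the number of principal
    components needed to reach `variance_threshold` of the total variance.

    Hypothetical helper for illustration; the threshold is an assumption.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)  # center each feature
    # Singular values of the centered data; their squares are proportional
    # to the variance captured by each principal component.
    s = np.linalg.svd(Xc, compute_uv=False)
    explained = s**2 / np.sum(s**2)
    cumulative = np.cumsum(explained)
    # First index where the cumulative variance reaches the threshold.
    k = int(np.searchsorted(cumulative, variance_threshold)) + 1
    return min(k, len(s))


# Synthetic data (assumed for the example): 3 latent parameters,
# embedded in a 20-dimensional ambient space, plus small noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 3))
basis, _ = np.linalg.qr(rng.normal(size=(20, 3)))  # 20x3, orthonormal columns
X = latent @ basis.T + 0.01 * rng.normal(size=(500, 20))

estimated_id = pca_intrinsic_dimension(X)
print("ambient dimension:", X.shape[1])
print("estimated intrinsic dimension:", estimated_id)
```

The ambient dimension here is 20, but the estimate should recover the much smaller number of latent parameters that generated the data. Nonlinear estimators, such as nearest-neighbor-based ones, follow the same idea but do not assume the data lies on a flat subspace.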

Papers