Intrinsic Dimension
Intrinsic dimension (ID) is the minimum number of parameters needed to effectively represent a dataset, even when the data are embedded in a much higher-dimensional ambient space. Current research focuses on robustly estimating ID across diverse data types (images, text, biological data) using methods that range from manifold learning and topological data analysis to autoencoders and quantum machine learning, often in the context of deep neural networks. Understanding and leveraging ID is crucial for improving model efficiency, generalization, fairness, and robustness in machine learning, and it offers insight into the underlying structure of complex datasets across scientific domains.
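To make the idea concrete, below is a minimal sketch of one widely used ID estimator, the Two-NN method of Facco et al. (2017), which infers the dimension from the ratio of each point's second- to first-nearest-neighbor distance. This is an illustrative implementation in plain NumPy/SciPy, not the specific method of any paper listed here; the function name and the synthetic example are our own.

```python
# Minimal sketch of the Two-NN intrinsic dimension estimator (Facco et al., 2017),
# assuming the data is a point cloud X of shape (N, D).
import numpy as np
from scipy.spatial import cKDTree

def twonn_id(X):
    """Estimate intrinsic dimension from ratios of 2nd to 1st nearest-neighbor distances."""
    tree = cKDTree(X)
    # k=3 returns each point itself plus its two nearest neighbors
    dists, _ = tree.query(X, k=3)
    r1, r2 = dists[:, 1], dists[:, 2]
    # drop exact duplicates, which would give a zero first-neighbor distance
    mask = r1 > 0
    mu = r2[mask] / r1[mask]
    # under the Two-NN model, mu follows a Pareto law with exponent d,
    # so the maximum-likelihood estimate is N / sum(log mu)
    return mask.sum() / np.sum(np.log(mu))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # a 2-D linear manifold embedded in a 10-D ambient space
    latent = rng.normal(size=(2000, 2))
    X = latent @ rng.normal(size=(2, 10))
    print(f"Estimated ID: {twonn_id(X):.2f}")  # expected to be close to 2
```

On data like the synthetic example above, the estimate recovers the latent dimension (here 2) despite the 10-dimensional ambient representation, which is exactly the gap between ambient and intrinsic dimension that the papers below study with different tools.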
Papers
Unsupervised detection of semantic correlations in big data
Santiago Acevedo, Alex Rodriguez, Alessandro Laio
Intrinsic Dimensionality of Fermi-Pasta-Ulam-Tsingou High-Dimensional Trajectories Through Manifold Learning
Gionni Marchetti
Understanding Variational Autoencoders with Intrinsic Dimension and Information Imbalance
Charles Camboulin, Diego Doimo, Aldo Glielmo