Data Centric
Data-centric AI treats data quality as the primary driver of machine learning performance, shifting attention away from model optimization alone. Current research emphasizes improving data quality through techniques such as data augmentation, feature engineering, and careful dataset curation, often using transformer-based models and other deep learning architectures for analysis. The approach is important for mitigating algorithmic bias and for improving model robustness and generalization, which in turn makes AI systems more reliable and trustworthy across applications ranging from healthcare and finance to earth observation and natural language processing.
Papers
Model-free feature selection to facilitate automatic discovery of divergent subgroups in tabular data
Girmaw Abebe Tadesse, William Ogallo, Celia Cintas, Skyler Speakman
Towards Efficient Data-Centric Robust Machine Learning with Noise-based Augmentation
Xiaogeng Liu, Haoyu Wang, Yechao Zhang, Fangzhou Wu, Shengshan Hu