Annotated Dataset
Annotated datasets are collections of data points labeled with specific information, crucial for training and evaluating machine learning models, particularly in complex domains like medicine and robotics. Current research emphasizes creating high-quality annotations, often incorporating AI-assisted methods to reduce manual effort, and addressing challenges like noisy or partially annotated data through techniques such as active learning, multi-task learning, and self-supervised learning. These datasets are vital for advancing various fields, enabling the development of more accurate and robust models for applications ranging from medical image analysis and natural language processing to robotics and e-commerce.
Papers
StatMix: Data augmentation method that relies on image statistics in federated learning
Dominik Lewy, Jacek Mańdziuk, Maria Ganzha, Marcin Paprzycki
ASL-Homework-RGBD Dataset: An annotated dataset of 45 fluent and non-fluent signers performing American Sign Language homeworks
Saad Hassan, Matthew Seita, Larwan Berke, Yingli Tian, Elaine Gale, Sooyeon Lee, Matt Huenerfauth