Data Set
Datasets are crucial for training and evaluating machine learning models, particularly in natural language processing, computer vision, and audio analysis. Current research emphasizes creating diverse, high-quality datasets that address specific challenges such as data imbalance, cross-lingual inconsistencies, and the need for realistic representations of real-world scenarios. This work involves developing novel annotation techniques, incorporating multiple data modalities (e.g., text, images, audio), and using various model architectures (e.g., transformers, convolutional neural networks) to analyze datasets and establish benchmarks. Well-designed datasets directly affect how robust and reliable the resulting models are, which in turn advances scientific understanding and improves practical applications across many fields.
Papers
ConstScene: Dataset and Model for Advancing Robust Semantic Segmentation in Construction Environments
Maghsood Salimi, Mohammad Loni, Sara Afshar, Antonio Cicchetti, Marjan Sirjani
S2M: Converting Single-Turn to Multi-Turn Datasets for Conversational Question Answering
Baokui Li, Sen Zhang, Wangshu Zhang, Yicheng Chen, Changlin Yang, Sen Hu, Teng Xu, Siye Liu, Jiwei Li
Understanding News Creation Intents: Frame, Dataset, and Method
Zhengjia Wang, Danding Wang, Qiang Sheng, Juan Cao, Silong Su, Yifan Sun, Beizhe Hu, Siyuan Ma
The State of Documentation Practices of Third-party Machine Learning Models and Datasets
Ernesto Lang Oreamuno, Rohan Faiyaz Khan, Abdul Ali Bangash, Catherine Stinson, Bram Adams
BonnBeetClouds3D: A Dataset Towards Point Cloud-based Organ-level Phenotyping of Sugar Beet Plants under Field Conditions
Elias Marks, Jonas Bömer, Federico Magistri, Anurag Sah, Jens Behley, Cyrill Stachniss
DSAP: Analyzing Bias Through Demographic Comparison of Datasets
Iris Dominguez-Catena, Daniel Paternain, Mikel Galar
CaptainCook4D: A Dataset for Understanding Errors in Procedural Activities
Rohith Peddi, Shivvrat Arya, Bharath Challa, Likhitha Pallapothula, Akshay Vyas, Bhavya Gouripeddi, Jikai Wang, Qifan Zhang, Vasundhara Komaragiri, Eric Ragan, Nicholas Ruozzi, Yu Xiang, Vibhav Gogate