Data Set
Datasets are crucial for training and evaluating machine learning models, particularly in natural language processing, computer vision, and audio analysis. Current research emphasizes building diverse, high-quality datasets that address specific challenges such as data imbalance, cross-lingual inconsistencies, and the need to represent real-world scenarios realistically. That work involves developing new annotation techniques, combining multiple data modalities (e.g., text, images, audio), and using model architectures such as transformers and convolutional neural networks to analyze the data and establish benchmarks. Well-designed datasets directly affect how robust and reliable the resulting models are, and in turn how far they advance scientific understanding and practical applications.
Papers
Measuring the Quality of Text-to-Video Model Outputs: Metrics and Dataset
Iya Chivileva, Philip Lynch, Tomas E. Ward, Alan F. Smeaton
M3Dsynth: A dataset of medical 3D images with AI-generated local manipulations
Giada Zingarini, Davide Cozzolino, Riccardo Corvi, Giovanni Poggi, Luisa Verdoliva
Dhan-Shomadhan: A Dataset of Rice Leaf Disease Classification for Bangladeshi Local Rice
Md. Fahad Hossain
Rank2Tell: A Multimodal Driving Dataset for Joint Importance Ranking and Reasoning
Enna Sachdeva, Nakul Agarwal, Suhas Chundi, Sean Roelofs, Jiachen Li, Mykel Kochenderfer, Chiho Choi, Behzad Dariush
Flows for Flows: Morphing one Dataset into another with Maximum Likelihood Estimation
Tobias Golling, Samuel Klein, Radha Mastandrea, Benjamin Nachman, John Andrew Raine
SMPLitex: A Generative Model and Dataset for 3D Human Texture Estimation from Single Image
Dan Casas, Marc Comino-Trinidad
Transfer Learning between Motor Imagery Datasets using Deep Learning -- Validation of Framework and Comparison of Datasets
Pierre Guetschel, Michael Tangermann
Artificial Empathy Classification: A Survey of Deep Learning Techniques, Datasets, and Evaluation Scales
Sharjeel Tahir, Syed Afaq Shah, Jumana Abu-Khalaf
Prompt me a Dataset: An investigation of text-image prompting for historical image dataset creation using foundation models
Hassan El-Hajj, Matteo Valleriani
NumHG: A Dataset for Number-Focused Headline Generation
Jian-Tao Huang, Chung-Chi Chen, Hen-Hsen Huang, Hsin-Hsi Chen