Multimodal Dataset
Multimodal datasets integrate data from diverse sources, such as text, images, audio, and sensor readings, to improve the performance of machine learning models on complex tasks. Current research focuses on developing and applying these datasets across various domains, including remote sensing, healthcare, and robotics, often employing transformer-based architectures and contrastive learning methods to effectively fuse information from different modalities. The availability of high-quality multimodal datasets is crucial for advancing research in artificial intelligence and enabling the development of more robust and accurate systems for a wide range of applications.
Papers
The Effects of Selected Object Features on a Pick-and-Place Task: a Human Multimodal Dataset
Linda Lastrico, Valerio Belcamino, Alessandro Carfì, Alessia Vignolo, Alessandra Sciutti, Fulvio Mastrogiovanni, Francesco Rea
DiveSound: LLM-Assisted Automatic Taxonomy Construction for Diverse Audio Generation
Baihan Li, Zeyu Xie, Xuenan Xu, Yiwei Guo, Ming Yan, Ji Zhang, Kai Yu, Mengyue Wu
LEMoN: Label Error Detection using Multimodal Neighbors
Haoran Zhang, Aparna Balagopalan, Nassim Oufattole, Hyewon Jeong, Yan Wu, Jiacheng Zhu, Marzyeh Ghassemi
RoBus: A Multimodal Dataset for Controllable Road Networks and Building Layouts Generation
Tao Li, Ruihang Li, Huangnan Zheng, Shanding Ye, Shijian Li, Zhijie Pan
MAN TruckScenes: A multimodal dataset for autonomous trucking in diverse conditions
Felix Fent, Fabian Kuttenreich, Florian Ruch, Farija Rizwin, Stefan Juergens, Lorenz Lechermann, Christian Nissler, Andrea Perl, Ulrich Voll, Min Yan, Markus Lienkamp
BIOSCAN-5M: A Multimodal Dataset for Insect Biodiversity
Zahra Gharaee, Scott C. Lowe, ZeMing Gong, Pablo Millan Arias, Nicholas Pellegrino, Austin T. Wang, Joakim Bruslund Haurum, Iuliia Zarubiieva, Lila Kari, Dirk Steinke, Graham W. Taylor, Paul Fieguth, Angel X. Chang
Language and Multimodal Models in Sports: A Survey of Datasets and Applications
Haotian Xia, Zhengbang Yang, Yun Zhao, Yuqing Wang, Jingxi Li, Rhys Tracy, Zhuangdi Zhu, Yuan-fang Wang, Hanjie Chen, Weining Shen
Creating a Lens of Chinese Culture: A Multimodal Dataset for Chinese Pun Rebus Art Understanding
Tuo Zhang, Tiantian Feng, Yibin Ni, Mengqin Cao, Ruying Liu, Katharine Butler, Yanjun Weng, Mi Zhang, Shrikanth S. Narayanan, Salman Avestimehr
SemanticSpray++: A Multimodal Dataset for Autonomous Driving in Wet Surface Conditions
Aldi Piroli, Vinzenz Dallabetta, Johannes Kopp, Marc Walessa, Daniel Meissner, Klaus Dietmayer
Industrial Language-Image Dataset (ILID): Adapting Vision Foundation Models for Industrial Settings
Keno Moenck, Duc Trung Thieu, Julian Koch, Thorsten Schüppstuhl