Multimodal Dataset
Multimodal datasets integrate data from diverse sources, such as text, images, audio, and sensor readings, to improve the performance of machine learning models on complex tasks. Current research focuses on building and applying these datasets across domains including remote sensing, healthcare, and robotics, often using transformer-based architectures and contrastive learning to fuse information from different modalities (a minimal sketch of such a contrastive objective follows below). High-quality multimodal datasets are essential for advancing artificial intelligence research and for developing more robust and accurate systems across a wide range of applications.
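As a rough illustration of the contrastive fusion idea mentioned above, the sketch below implements a symmetric CLIP-style InfoNCE loss over paired image and text embeddings. It is not the method of any paper listed here; the function name, embedding dimension, and temperature value are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def contrastive_fusion_loss(image_emb, text_emb, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of paired image/text embeddings.

    Both inputs are (batch, dim) tensors produced by modality-specific
    encoders (e.g. a vision transformer and a text transformer).
    """
    # Normalize so that dot products are cosine similarities.
    image_emb = F.normalize(image_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)

    # Pairwise similarity matrix: entry (i, j) compares image i with text j.
    logits = image_emb @ text_emb.t() / temperature

    # Matching image/text pairs lie on the diagonal.
    targets = torch.arange(logits.size(0), device=logits.device)

    # Cross-entropy in both directions (image-to-text and text-to-image).
    loss_i2t = F.cross_entropy(logits, targets)
    loss_t2i = F.cross_entropy(logits.t(), targets)
    return (loss_i2t + loss_t2i) / 2

# Usage example with random tensors standing in for encoder outputs.
if __name__ == "__main__":
    images = torch.randn(8, 512)
    texts = torch.randn(8, 512)
    print(contrastive_fusion_loss(images, texts))
```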
Papers
Creating a Lens of Chinese Culture: A Multimodal Dataset for Chinese Pun Rebus Art Understanding
Tuo Zhang, Tiantian Feng, Yibin Ni, Mengqin Cao, Ruying Liu, Katharine Butler, Yanjun Weng, Mi Zhang, Shrikanth S. Narayanan, Salman Avestimehr
SemanticSpray++: A Multimodal Dataset for Autonomous Driving in Wet Surface Conditions
Aldi Piroli, Vinzenz Dallabetta, Johannes Kopp, Marc Walessa, Daniel Meissner, Klaus Dietmayer
Industrial Language-Image Dataset (ILID): Adapting Vision Foundation Models for Industrial Settings
Keno Moenck, Duc Trung Thieu, Julian Koch, Thorsten Schüppstuhl
Enhancing Adverse Drug Event Detection with Multimodal Dataset: Corpus Creation and Model Development
Pranab Sahoo, Ayush Kumar Singh, Sriparna Saha, Aman Chadha, Samrat Mondal
EmpathicStories++: A Multimodal Dataset for Empathy towards Personal Experiences
Jocelyn Shen, Yubin Kim, Mohit Hulse, Wazeer Zulfikar, Sharifa Alghowinem, Cynthia Breazeal, Hae Won Park