Multimodal Dataset
Multimodal datasets integrate data from diverse sources, such as text, images, audio, and sensor readings, to improve the performance of machine learning models on complex tasks. Current research focuses on building and applying such datasets across domains including remote sensing, healthcare, and robotics, often using transformer-based architectures and contrastive learning methods to fuse information from different modalities. High-quality multimodal datasets are crucial for advancing artificial intelligence research and for developing more robust and accurate systems across a wide range of applications.
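As a concrete illustration of the contrastive methods mentioned above, the sketch below implements a CLIP-style symmetric contrastive (InfoNCE) objective that aligns paired image and text embeddings. It is a minimal, hypothetical PyTorch example; the function and variable names are assumptions for illustration and do not reproduce the method of any paper listed here.

```python
# Minimal sketch (assumed example, not from any cited paper): a CLIP-style
# contrastive objective that aligns image and text embeddings from paired data.
import torch
import torch.nn.functional as F

def contrastive_loss(image_emb: torch.Tensor,
                     text_emb: torch.Tensor,
                     temperature: float = 0.07) -> torch.Tensor:
    """Symmetric InfoNCE loss over a batch of paired image/text embeddings."""
    # L2-normalise so that dot products become cosine similarities.
    image_emb = F.normalize(image_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)

    # Pairwise similarity matrix: entry (i, j) compares image i with text j.
    logits = image_emb @ text_emb.t() / temperature

    # Matching image/text pairs lie on the diagonal.
    targets = torch.arange(logits.size(0), device=logits.device)

    # Average the image-to-text and text-to-image cross-entropy terms.
    return (F.cross_entropy(logits, targets) +
            F.cross_entropy(logits.t(), targets)) / 2

if __name__ == "__main__":
    # Example: a batch of 8 paired embeddings of dimension 512.
    imgs = torch.randn(8, 512)
    txts = torch.randn(8, 512)
    print(contrastive_loss(imgs, txts).item())
```

In practice the embeddings would come from modality-specific encoders (for example, a vision transformer and a text transformer) trained jointly on a multimodal dataset of paired samples.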
Papers
ICC: Quantifying Image Caption Concreteness for Multimodal Dataset Curation
Moran Yanuka, Morris Alper, Hadar Averbuch-Elor, Raja Giryes
REWIND Dataset: Privacy-preserving Speaking Status Segmentation from Multimodal Body Movement Signals in the Wild
Jose Vargas Quiros, Chirag Raman, Stephanie Tan, Ekin Gedik, Laura Cabrera-Quiros, Hayley Hung
Non-contact Multimodal Indoor Human Monitoring Systems: A Survey
Le Ngu Nguyen, Praneeth Susarla, Anirban Mukherjee, Manuel Lage Cañellas, Constantino Álvarez Casado, Xiaoting Wu, Olli Silvén, Dinesh Babu Jayagopi, Miguel Bordallo López
A Multimodal Dataset and Benchmark for Radio Galaxy and Infrared Host Detection
Nikhel Gupta, Zeeshan Hayder, Ray P. Norris, Minh Huynh, Lars Petersson