Multimodal Information
Multimodal information processing integrates data from multiple sources, such as text, images, audio, and sensor data, to achieve a more comprehensive understanding than any single modality allows. Current research emphasizes robust model architectures, including large language models (LLMs), transformers, and autoencoders, that fuse and interpret this diverse information, often while addressing challenges such as missing data and noise. The field matters for a wide range of applications, from medical diagnosis and e-commerce search to robotic perception and human-computer interaction.
Papers
Experimenting with Multi-modal Information to Predict Success of Indian IPOs
Sohom Ghosh, Arnab Maji, N Harsha Vardhan, Sudip Kumar Naskar
An Entailment Tree Generation Approach for Multimodal Multi-Hop Question Answering with Mixture-of-Experts and Iterative Feedback Mechanism
Qing Zhang, Haocheng Lv, Jie Liu, Zhiyun Chen, Jianyong Duan, Hao Wang, Li He, Mingying Xv
Compositional Image Retrieval via Instruction-Aware Contrastive Learning
Wenliang Zhong, Weizhi An, Feng Jiang, Hehuan Ma, Yuzhi Guo, Junzhou Huang
Fragmented Layer Grouping in GUI Designs Through Graph Learning Based on Multimodal Information
Yunnong Chen, Shuhong Xiao, Jiazhi Li, Tingting Zhou, Yanfang Chang, Yankun Zhen, Lingyun Sun, Liuqing Chen
CMATH: Cross-Modality Augmented Transformer with Hierarchical Variational Distillation for Multimodal Emotion Recognition in Conversation
Xiaofei Zhu, Jiawei Cheng, Zhou Yang, Zhuo Chen, Qingyang Wang, Jianfeng Yao
VMID: A Multimodal Fusion LLM Framework for Detecting and Identifying Misinformation of Short Videos
Weihao Zhong, Yinhao Xiao, Minghui Xu, Xiuzhen Cheng
A Pattern to Align Them All: Integrating Different Modalities to Define Multi-Modal Entities
Gianluca Apriceno, Valentina Tamma, Tania Bailoni, Jacopo de Berardinis, Mauro Dragoni
Improving Multi-modal Large Language Model through Boosting Vision Capabilities
Yanpeng Sun, Huaxin Zhang, Qiang Chen, Xinyu Zhang, Nong Sang, Gang Zhang, Jingdong Wang, Zechao Li