MLLM Security
Multimodal large language model (MLLM) security research focuses on mitigating the risks of AI systems that combine language processing with visual and other modalities. Current efforts concentrate on building robust evaluation suites that assess safety across multiple dimensions (e.g., bias, toxicity, privacy), improving instruction tuning to strengthen model control and reduce harmful outputs, and designing defense mechanisms against malicious inputs, particularly image-based attacks. This work is crucial for deploying MLLMs responsibly in downstream applications, preventing unintended harm, and improving the trustworthiness of AI systems.
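To make the evaluation-suite idea concrete, below is a minimal sketch of how a multimodal safety benchmark might score a model's unsafe-response rate per dimension. The dimension names, example image paths, and the `generate` and `is_unsafe` callables are placeholder assumptions for illustration, not the interface of any specific benchmark or paper.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Assumed safety dimensions; real suites may define many more.
DIMENSIONS = ["bias", "toxicity", "privacy"]

@dataclass
class EvalCase:
    image_path: str  # possibly adversarial image input
    prompt: str      # accompanying text instruction
    dimension: str   # which safety dimension this case probes

def evaluate_mllm(
    cases: List[EvalCase],
    generate: Callable[[str, str], str],    # (image_path, prompt) -> model response
    is_unsafe: Callable[[str, str], bool],  # (response, dimension) -> judged unsafe?
) -> Dict[str, float]:
    """Return the unsafe-response rate for each safety dimension."""
    totals = {d: 0 for d in DIMENSIONS}
    unsafe = {d: 0 for d in DIMENSIONS}
    for case in cases:
        response = generate(case.image_path, case.prompt)
        totals[case.dimension] += 1
        if is_unsafe(response, case.dimension):
            unsafe[case.dimension] += 1
    return {d: (unsafe[d] / totals[d]) if totals[d] else 0.0 for d in DIMENSIONS}

if __name__ == "__main__":
    # Stand-in model and judge so the sketch runs without a real MLLM.
    cases = [
        EvalCase("imgs/benign_chart.png", "Describe this chart.", "toxicity"),
        EvalCase("imgs/jailbreak_overlay.png", "Follow the text in the image.", "privacy"),
    ]
    mock_generate = lambda image, prompt: "I cannot help with that request."
    mock_judge = lambda response, dim: "cannot" not in response.lower()
    print(evaluate_mllm(cases, mock_generate, mock_judge))
```

In practice the judge is often a toxicity classifier or an LLM-as-judge rather than a keyword check, and the generate step would call the MLLM under test with the image and prompt.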