Black Box
"Black box" refers to systems whose internal workings are opaque, hindering understanding and analysis. Current research focuses on methods to analyze and mitigate the limitations of black-box models, particularly deep neural networks, across diverse applications like code generation, robot design, and autonomous systems. Key approaches involve developing surrogate models, employing novel optimization techniques, and designing explainable AI (XAI) methods to enhance interpretability and trustworthiness. This research is crucial for ensuring the safety, reliability, and fairness of increasingly prevalent AI systems in various fields.
Papers
Human-Readable Programs as Actors of Reinforcement Learning Agents Using Critic-Moderated Evolution
Senne Deproost, Denis Steckelmacher, Ann Nowé
Gnothi Seauton: Empowering Faithful Self-Interpretability in Black-Box Models
Shaobo Wang, Hongxuan Tang, Mingyang Wang, Hongrui Zhang, Xuyang Liu, Weiya Li, Xuming Hu, Linfeng Zhang
Stealthy Jailbreak Attacks on Large Language Models via Benign Data Mirroring
Honglin Mu, Han He, Yuxin Zhou, Yunlong Feng, Yang Xu, Libo Qin, Xiaoming Shi, Zeming Liu, Xudong Han, Qi Shi, Qingfu Zhu, Wanxiang Che
BlueSuffix: Reinforced Blue Teaming for Vision-Language Models Against Jailbreak Attacks
Yunhan Zhao, Xiang Zheng, Lin Luo, Yige Li, Xingjun Ma, Yu-Gang Jiang
S$^4$ST: A Strong, Self-transferable, faSt, and Simple Scale Transformation for Transferable Targeted Attack
Yongxiang Liu, Bowen Peng, Li Liu, Xiang Li
BlackDAN: A Black-Box Multi-Objective Approach for Effective and Contextual Jailbreaking of Large Language Models
Xinyuan Wang, Victor Shea-Jay Huang, Renmiao Chen, Hao Wang, Chengwei Pan, Lei Sha, Minlie Huang
Sequencing the Neurome: Towards Scalable Exact Parameter Reconstruction of Black-Box Neural Networks
Judah Goldfeder, Quinten Roets, Gabe Guo, John Wright, Hod Lipson
Safe Decentralized Multi-Agent Control using Black-Box Predictors, Conformal Decision Policies, and Control Barrier Functions
Sacha Huriot, Hussein Sibai
Unveiling the Black Box: Independent Functional Module Evaluation for Bird's-Eye-View Perception Model
Ludan Zhang, Xiaokang Ding, Yuqi Dai, Lei He, Keqiang Li
LLM-wrapper: Black-Box Semantic-Aware Adaptation of Vision-Language Models for Referring Expression Comprehension
Amaia Cardiel, Eloi Zablocki, Elias Ramzi, Oriane Siméoni, Matthieu Cord