Black Box
"Black box" refers to systems whose internal workings are opaque, hindering understanding and analysis. Current research focuses on methods to analyze and mitigate the limitations of black-box models, particularly deep neural networks, across diverse applications like code generation, robot design, and autonomous systems. Key approaches involve developing surrogate models, employing novel optimization techniques, and designing explainable AI (XAI) methods to enhance interpretability and trustworthiness. This research is crucial for ensuring the safety, reliability, and fairness of increasingly prevalent AI systems in various fields.
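The surrogate-model approach mentioned above can be sketched briefly: train an interpretable model to mimic an opaque one, then inspect the surrogate instead. This is a minimal illustration, not any specific paper's method; the choice of scikit-learn, a random forest as the stand-in black box, and a shallow decision tree as the surrogate are all assumptions made here for the example.

```python
# Global surrogate sketch: approximate an opaque model with an
# interpretable one and measure how faithfully it mimics the original.
# The "black box" is a random forest standing in for any opaque
# predictor (an assumption for illustration only).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=2000, n_features=10, random_state=0)

black_box = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Fit the surrogate on the black box's *predictions*, not the true labels:
# the goal is to replicate the opaque model's behavior, then read the
# resulting shallow tree to understand what the black box learned.
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0)
surrogate.fit(X, black_box.predict(X))

# Fidelity: how often the surrogate agrees with the black box.
fidelity = accuracy_score(black_box.predict(X), surrogate.predict(X))
print(f"surrogate fidelity: {fidelity:.2f}")
```

A high fidelity score means the surrogate's decision rules are a usable proxy for the black box; a low score means the opaque model's behavior is too complex for the chosen surrogate and its explanations should not be trusted.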
Papers
Illuminating the Black Box: A Psychometric Investigation into the Multifaceted Nature of Large Language Models
Yang Lu, Jordan Yu, Shou-Hsuan Stephen Huang
Benchmarking and Defending Against Indirect Prompt Injection Attacks on Large Language Models
Jingwei Yi, Yueqi Xie, Bin Zhu, Emre Kiciman, Guangzhong Sun, Xing Xie, Fangzhao Wu