Black Box
"Black box" refers to systems whose internal workings are opaque: only their inputs and outputs can be observed, which hinders understanding and analysis. Current research focuses on methods to analyze black-box models, particularly deep neural networks, and to mitigate their limitations across applications such as code generation, robot design, and autonomous systems. Key approaches include building surrogate models, applying novel optimization techniques, and designing explainable AI (XAI) methods that improve interpretability and trustworthiness. This research matters for the safety, reliability, and fairness of increasingly prevalent AI systems.
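As a rough illustration of the surrogate-model idea mentioned above, the following minimal sketch queries an opaque function near a point of interest and fits an interpretable linear model to the responses. Everything here is an assumption made for illustration: `black_box` is a hypothetical stand-in for a real system, `fit_linear_surrogate` is a hypothetical helper, and ordinary least squares is just one simple choice of surrogate, not the method of any paper listed below.

```python
import random


def black_box(x):
    # Hypothetical stand-in for an opaque system: in practice we could
    # only call it, not read its definition.
    return 3.0 * x + 2.0 + 0.5 * x * x


def fit_linear_surrogate(f, xs):
    """Fit y ~ a*x + b to queried outputs of f by ordinary least squares."""
    ys = [f(x) for x in xs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Query the black box in a small neighborhood of x0 (an assumed point of interest).
x0 = 1.0
queries = [x0 + rng.uniform(-0.1, 0.1) for _ in range(50)]

# The fitted slope approximates how the output responds to the input near x0.
slope, intercept = fit_linear_surrogate(black_box, queries)
surrogate_at_x0 = slope * x0 + intercept
```

The surrogate is only trusted near the queried region. Its slope gives a local, human-readable summary of the black box's behavior, which is the basic intuition behind many local explanation methods.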
Papers
BEACON: A Bayesian Optimization Strategy for Novelty Search in Expensive Black-Box Systems
Wei-Ting Tang, Ankush Chakrabarty, Joel A. Paulson
Alignment Calibration: Machine Unlearning for Contrastive Learning under Auditing
Yihan Wang, Yiwei Lu, Guojun Zhang, Franziska Boenisch, Adam Dziedzic, Yaoliang Yu, Xiao-Shan Gao
Is Algorithmic Stability Testable? A Unified Framework under Computational Constraints
Yuetian Luo, Rina Foygel Barber
Extracting Prompts by Inverting LLM Outputs
Collin Zhang, John X. Morris, Vitaly Shmatikov
Nearly Tight Black-Box Auditing of Differentially Private Machine Learning
Meenatchi Sundaram Muthu Selva Annamalai, Emiliano De Cristofaro