Human Safety
Human safety in the context of rapidly advancing AI systems, particularly large language models (LLMs) and autonomous vehicles, is a critical research area focused on mitigating risks from harmful outputs, unreliable predictions, and unforeseen interactions. Current research emphasizes robust safety mechanisms, including novel algorithms such as Precision Knowledge Editing for LLMs and Physics-Enhanced Residual Policy Learning for autonomous vehicle control, as well as multi-objective learning frameworks that balance safety against task performance. These efforts are crucial for the responsible deployment of AI across sectors and for improving the reliability and trustworthiness of these systems in real-world applications.
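One common way to realize the multi-objective balance mentioned above is weighted scalarization, where a single weight trades off a safety objective against a helpfulness objective. The sketch below is a minimal, hypothetical illustration of that idea; the toy loss functions, the score inputs, and the weight `w` are assumptions for exposition, not methods taken from any of the listed papers.

```python
# Minimal sketch of weighted scalarization for a safety/helpfulness
# tradeoff. The loss functions are toy stand-ins (hypothetical), not
# drawn from the papers listed below.

def safety_loss(response_risk: float) -> float:
    # Penalize risky responses; risk is an assumed score in [0, 1].
    return response_risk ** 2

def helpfulness_loss(refusal_rate: float) -> float:
    # Penalize over-refusal; refusal_rate is an assumed score in [0, 1].
    return refusal_rate ** 2

def combined_loss(risk: float, refusal: float, w: float) -> float:
    # w in [0, 1] controls the tradeoff:
    # w -> 1 prioritizes safety, w -> 0 prioritizes helpfulness.
    return w * safety_loss(risk) + (1.0 - w) * helpfulness_loss(refusal)

if __name__ == "__main__":
    # Sweeping w traces a simple tradeoff curve between the two objectives.
    for w in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(f"w={w:.2f}  loss={combined_loss(risk=0.3, refusal=0.6, w=w):.4f}")
```

In practice, frameworks of this kind tune the weight (or learn a Pareto front over many weights) so that a model refuses genuinely harmful requests without over-refusing benign ones.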
Papers
Towards Safety and Helpfulness Balanced Responses via Controllable Large Language Models
Yi-Lin Tuan, Xilun Chen, Eric Michael Smith, Louis Martin, Soumya Batra, Asli Celikyilmaz, William Yang Wang, Daniel M. Bikel
What's in Your "Safe" Data?: Identifying Benign Data that Breaks Safety
Luxi He, Mengzhou Xia, Peter Henderson
Constrained Passive Interaction Control: Leveraging Passivity and Safety for Robot Manipulators
Zhiquan Zhang, Tianyu Li, Nadia Figueroa
Eyes Closed, Safety On: Protecting Multimodal LLMs via Image-to-Text Transformation
Yunhao Gou, Kai Chen, Zhili Liu, Lanqing Hong, Hang Xu, Zhenguo Li, Dit-Yan Yeung, James T. Kwok, Yu Zhang