Comprehensive Trustworthiness

Comprehensive trustworthiness in artificial intelligence (AI) concerns developing and evaluating AI systems that are reliable, fair, robust, safe, and private. Current research emphasizes benchmarking and improving trustworthiness across model architectures ranging from large language models (LLMs) and multimodal LLMs to smaller on-device models, often using techniques such as reinforcement learning from human feedback (RLHF) and data-centric approaches to address biases and vulnerabilities. This work is crucial for building public trust in AI and for ensuring responsible deployment in high-stakes applications, and it drives the development of more reliable and ethical AI systems.

Papers