Safety Evaluation

Safety evaluation of AI systems, particularly large language models (LLMs) and autonomous driving systems (ADS), focuses on developing robust methods to assess and mitigate potential risks. Current research emphasizes creating comprehensive benchmark datasets and toolkits, employing techniques like red-teaming exercises, scenario-based analysis, and uncertainty quantification to evaluate various safety dimensions (e.g., bias, toxicity, adversarial attacks). These advancements are crucial for building trust and ensuring the responsible deployment of AI technologies across diverse applications, impacting both the scientific understanding of AI safety and the development of safer, more reliable systems.

Papers