Risk Taxonomy

Risk taxonomies are structured classifications of the potential hazards associated with emerging technologies, particularly large language models (LLMs) and other AI systems. Current research focuses on developing comprehensive taxonomies that span risk categories from bias and safety violations to security vulnerabilities and ethical concerns, often using natural language processing (NLP) techniques such as topic modeling to analyze large datasets of user interactions and incident reports. Such taxonomies are crucial for benchmarking model safety, informing mitigation strategies, and promoting the responsible design and deployment of AI systems across diverse applications.
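To make the classification step concrete, the sketch below maps free-text incident reports onto taxonomy categories. The category names and keyword stems are illustrative assumptions, not any published taxonomy, and simple keyword matching stands in for the topic-modeling pipelines used in practice:

```python
# Minimal sketch: routing incident reports to risk-taxonomy categories.
# Categories and keyword stems are hypothetical examples; real taxonomies
# are far more granular and are typically derived with topic modeling
# over large corpora rather than hand-written keyword lists.

RISK_TAXONOMY = {
    "bias": {"stereotype", "discriminat", "unfair"},
    "safety": {"self-harm", "violence", "dangerous"},
    "security": {"jailbreak", "prompt injection", "exfiltrat"},
    "privacy": {"pii", "personal data", "deanonymiz"},
}

def categorize(report: str) -> list[str]:
    """Return every taxonomy category whose keyword stems appear in the report."""
    text = report.lower()
    return [
        category
        for category, stems in RISK_TAXONOMY.items()
        if any(stem in text for stem in stems)
    ]

incidents = [
    "User reports a jailbreak that bypasses the refusal behavior.",
    "Model output contains a stereotype about a protected group.",
]
print([categorize(r) for r in incidents])
```

A report can land in several categories at once (e.g. a prompt injection that also leaks personal data), which is why the function returns a list rather than a single label.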

Papers