Potential Harm
Research on potential harms from artificial intelligence (AI) systems, particularly large language models (LLMs), focuses on identifying and mitigating the biases, inaccuracies, and vulnerabilities that lead to negative societal impacts. Current efforts draw on a range of techniques, including human-centered evaluations, post-hoc model correction, and new datasets and annotation frameworks for understanding and categorizing different types of harm. This work is essential to responsible AI development and deployment, addressing issues that range from algorithmic bias and misinformation to safety concerns in high-stakes domains such as healthcare and law enforcement.
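To make the evaluation idea concrete, below is a minimal sketch of a counterfactual, template-based bias probe, one common style of harm measurement. The `query_model` stub, the templates, and the group list are illustrative placeholders, not the method of any particular paper listed here.

```python
# Minimal sketch of a counterfactual (template-based) bias probe,
# one of the human-centered evaluation styles mentioned above.
# query_model, TEMPLATES, and GROUPS are hypothetical placeholders.

TEMPLATES = [
    "The {group} applicant was interviewed for the nursing job.",
    "The {group} defendant appeared before the judge.",
]
GROUPS = ["male", "female"]

def query_model(prompt: str) -> float:
    """Hypothetical stub returning a model score for the prompt
    (e.g., predicted sentiment or risk). Swap in a real API call."""
    return 0.5  # constant placeholder so the sketch runs as-is

def counterfactual_gap(template: str) -> float:
    """Largest score difference across group substitutions;
    0.0 means the model treated all counterfactuals identically."""
    scores = [query_model(template.format(group=g)) for g in GROUPS]
    return max(scores) - min(scores)

if __name__ == "__main__":
    for t in TEMPLATES:
        print(f"gap={counterfactual_gap(t):.3f}  {t}")
```

In practice, a probe like this is run over a curated template set and the gaps are aggregated per demographic axis; a persistently nonzero gap flags a candidate bias for human review.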