Stereotype Content
Stereotype content research investigates how biases and stereotypes are represented and perpetuated in large language models (LLMs) and other AI systems, with the aim of understanding and mitigating their harmful societal impact. Current research focuses on identifying and quantifying these biases across modalities (text, images), languages, and demographic groups, often employing techniques such as adversarial attacks and explainable AI methods to analyze model behavior and develop mitigation strategies. This work is crucial for ensuring fairness and equity in AI applications, from education and healthcare to hiring and criminal justice, by promoting the development of less biased and more responsible AI systems.
Papers
Othering and low status framing of immigrant cuisines in US restaurant reviews and large language models
Yiwei Luo, Kristina Gligorić, Dan Jurafsky
How Different Is Stereotypical Bias Across Languages?
Ibrahim Tolga Öztürk, Rostislav Nedelchev, Christian Heumann, Esteban Garces Arias, Marius Roger, Bernd Bischl, Matthias Aßenmacher