Biased Behavior

Biased behavior in artificial intelligence, particularly in large language models (LLMs) and other machine learning systems, is an active area of research focused on identifying such biases, understanding their sources, and mitigating them. Current efforts employ techniques including Bayesian methods for bias removal, multitask learning to disentangle dialect from bias, and detectors (guardrails) trained on synthetic data to flag problematic outputs, as illustrated in the sketch below. This research is crucial for ensuring fairness and equity in AI applications, impacting fields ranging from news consumption and social media to healthcare and loan applications, and for promoting the development of more trustworthy and responsible AI systems.
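To make the guardrail idea concrete, the following minimal sketch trains a shallow bias detector on a tiny hand-written synthetic corpus and uses it to screen a candidate model output. The corpus, the TF-IDF plus logistic-regression pipeline, and the `detector` name are illustrative assumptions, not any specific paper's method; published guardrails typically fine-tune larger encoder or LLM classifiers on much larger synthetic datasets.

```python
# Minimal sketch of a guardrail-style bias detector trained on synthetic data.
# The tiny corpus and shallow classifier below are illustrative assumptions.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical synthetic training examples: (text, label) with 1 = biased.
synthetic_data = [
    ("People from that region are naturally lazy.", 1),
    ("The applicant was denied because of her age.", 1),
    ("Members of that group can't be trusted with money.", 1),
    ("The loan decision was based on credit history and income.", 0),
    ("The article summarizes both candidates' policy positions.", 0),
    ("Patients were triaged according to symptom severity.", 0),
]
texts, labels = zip(*synthetic_data)

# Shallow classifier over TF-IDF n-gram features; a stand-in for the
# LLM-based detectors used in practice.
detector = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    LogisticRegression(),
)
detector.fit(texts, labels)

# Screen a model output before it reaches the user.
output = "Applicants from that neighborhood are probably unreliable."
if detector.predict([output])[0] == 1:
    print("Guardrail triggered: potentially biased output flagged for review.")
```

The same pattern scales to production guardrails: generate labeled synthetic examples, train a classifier on them, and run the classifier as a filter over model outputs.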

Papers