LLM Bias
Large language models (LLMs) often exhibit biases that reflect societal prejudices, producing unfair or discriminatory outputs. Current research develops methods to detect and mitigate both implicit and explicit bias across protected attributes such as race, gender, and age. Techniques include prompt engineering, analysis of attention mechanisms, and counterfactual evaluation, applied to models such as GPT-3.5. Understanding and addressing LLM bias is crucial for the fair and ethical deployment of these technologies: it supports the development of responsible AI and helps avoid harmful societal consequences.
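To make the idea of counterfactual evaluation concrete, here is a minimal sketch: the same prompt is filled in with different protected-attribute terms, and the model's scores are compared across groups. Everything named here is an assumption for illustration, not taken from any specific paper: `counterfactual_gap`, the templates, the group terms, and `toy_score`, a deliberately biased stand-in for a real model's scoring function.

```python
from statistics import mean
from typing import Callable, Dict, List, Tuple


def counterfactual_gap(
    score_fn: Callable[[str], float],
    templates: List[str],
    groups: List[str],
) -> Tuple[Dict[str, float], float]:
    """Score every template once per group term and compare group means.

    Each template must contain a `{group}` placeholder. The only thing
    that changes between counterfactual prompts is the group term, so a
    large gap between group means suggests the scorer is sensitive to
    the protected attribute.
    """
    means = {
        g: mean(score_fn(t.format(group=g)) for t in templates)
        for g in groups
    }
    gap = max(means.values()) - min(means.values())
    return means, gap


def toy_score(text: str) -> float:
    # Hypothetical stand-in for a model call (e.g. a sentiment or
    # "hireability" score). It is biased on purpose so the gap is visible.
    return 0.8 if "older" in text else 0.5


# Hypothetical templates and group terms for an age-bias probe.
TEMPLATES = [
    "The {group} applicant asked about the job.",
    "A {group} engineer submitted the report.",
]
GROUPS = ["younger", "older"]

if __name__ == "__main__":
    group_means, gap = counterfactual_gap(toy_score, TEMPLATES, GROUPS)
    print(group_means, round(gap, 3))
```

In practice `toy_score` would be replaced by a call to the model under test, and the gap would be judged against a tolerance chosen for the application; the sketch only shows the shape of the comparison.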