Counterfactual Fairness

Counterfactual fairness requires that a machine learning model's prediction for an individual would remain unchanged had their sensitive attributes (e.g., race, gender) been different. Current research focuses on methods for achieving this property, often employing data augmentation, generative adversarial networks, and causal inference, used either to pre-process data or to train fairer models directly, including transformer and graph neural network architectures. The field is crucial for mitigating bias in high-stakes decision-making across domains such as finance, healthcare, and criminal justice, promoting more equitable outcomes and improving the trustworthiness of AI systems.
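The core idea can be illustrated with a toy structural causal model. The sketch below is a simplified, hypothetical example (the SCM, the coefficient 1.5, and the threshold predictor are all invented for illustration): we simulate a feature influenced by a sensitive attribute, generate the counterfactual world by flipping the attribute while holding the exogenous noise fixed, and measure how often a predictor's decision changes.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Illustrative structural causal model (assumed, not from any real dataset):
#   A: binary sensitive attribute
#   U: exogenous noise, independent of A
#   X = 1.5 * A + U  (a feature causally influenced by A)
A = rng.integers(0, 2, size=n)
U = rng.normal(size=n)
X = 1.5 * A + U

def predict(x):
    # Hypothetical predictor that thresholds the raw feature X.
    return (x > 0.75).astype(int)

# Counterfactual world: flip A, keep the same exogenous U (abduction),
# and regenerate X through the same structural equation (action + prediction).
X_cf = 1.5 * (1 - A) + U

# Fraction of individuals whose decision changes when only A is flipped;
# a counterfactually fair predictor would make this zero.
flip_rate = np.mean(predict(X) != predict(X_cf))
print(f"naive predictor flip rate: {flip_rate:.2%}")

# One remedy (assuming the SCM is known): predict from the residual
# X - 1.5*A = U, which is independent of A, so the decision is invariant.
fair_pred = (X - 1.5 * A > 0.75).astype(int)
fair_pred_cf = (X_cf - 1.5 * (1 - A) > 0.75).astype(int)
print("fair predictor flip rate:", np.mean(fair_pred != fair_pred_cf))
```

The naive predictor's decisions flip for a substantial share of individuals, while the residual-based predictor is invariant by construction; real methods must instead estimate the causal model or its exogenous factors from data.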

Papers