Concept Intervention

Concept intervention improves the interpretability and performance of machine learning models by letting human users correct erroneous intermediate representations, called concepts, that the model predicts before making its final prediction. Current research emphasizes model architectures, such as Concept Bottleneck Models (CBMs) and their variants (e.g., stochastic, counterfactual, and energy-based CBMs), that incorporate these interventions efficiently, often by explicitly modeling relationships between concepts so that correcting one concept can also update related ones. The goals are to raise model accuracy, reduce the number of interventions needed, and improve the usability of these explainable AI systems, making AI applications more reliable and trustworthy.
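
To make the mechanism concrete, here is a minimal sketch of a test-time intervention on a toy concept bottleneck. It is an illustration under stated assumptions, not an implementation from any particular paper: the weights are hand-picked rather than learned, and the names `predict_concepts`, `predict_label`, and `intervene` are hypothetical.

```python
import math

# Hypothetical hand-picked weights; a real CBM would learn these from data.
CONCEPT_WEIGHTS = [[2.0, -1.0], [-1.5, 2.5]]  # input features -> concept logits
LABEL_WEIGHTS = [3.0, -3.0]                   # concept values -> label logit


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_concepts(x):
    """Map input features to concept probabilities (the bottleneck)."""
    return [sigmoid(sum(w * xi for w, xi in zip(row, x)))
            for row in CONCEPT_WEIGHTS]


def predict_label(concepts):
    """Map concept values to the probability of the positive class."""
    return sigmoid(sum(w * c for w, c in zip(LABEL_WEIGHTS, concepts)))


def intervene(concepts, corrections):
    """Return a copy of concepts where user-supplied values override predictions.

    corrections maps a concept index to the value the user asserts.
    """
    return [corrections.get(i, c) for i, c in enumerate(concepts)]


x = [0.2, 1.0]
concepts = predict_concepts(x)      # the model's own concept predictions
before = predict_label(concepts)    # label probability without intervention

# Assume a user knows concept 0 is present and concept 1 is absent.
fixed = intervene(concepts, {0: 1.0, 1: 0.0})
after = predict_label(fixed)        # label probability after intervention

print(f"before: {before:.3f}, after: {after:.3f}")
```

Because the label is computed only from the concept values, overwriting a mispredicted concept changes the final prediction directly. In this sketch the concepts are treated as independent; the variants that model relationships between concepts would additionally adjust the uncorrected concepts after each intervention.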

Papers