Model Agreement

Model agreement, the consistency of predictions across multiple models or across instances of the same model, is a central concern in research on the reliability and trustworthiness of machine learning systems, particularly large language models (LLMs). Current work uses model agreement as a metric for evaluating LLM performance, for assessing whether LLMs can replace human annotators, and for mitigating adversarial attacks in collaborative multi-LLM settings. Understanding and improving model agreement matters for building robust, dependable AI systems across applications ranging from software engineering to educational technology, and for improving the interpretability and fairness of machine learning outputs.
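
To make the metric concrete, the sketch below shows two common ways agreement is quantified when several models label the same items: the mean pairwise agreement rate, and Fleiss' kappa, which corrects raw agreement for chance. The model names and labels are purely illustrative assumptions, not drawn from any particular paper.

```python
# Minimal sketch: quantifying model agreement, assuming each model
# outputs one categorical label per item. Example data is hypothetical.
from itertools import combinations
from collections import Counter

def pairwise_agreement(predictions: dict[str, list[str]]) -> float:
    """Mean fraction of items on which each pair of models gives the same label."""
    pairs = list(combinations(predictions.values(), 2))
    rates = [sum(a == b for a, b in zip(p, q)) / len(p) for p, q in pairs]
    return sum(rates) / len(rates)

def fleiss_kappa(predictions: dict[str, list[str]]) -> float:
    """Chance-corrected agreement across all models (Fleiss' kappa)."""
    raters = list(predictions.values())
    n_items, n_raters = len(raters[0]), len(raters)
    counts = [Counter(model[i] for model in raters) for i in range(n_items)]
    # Observed agreement per item: proportion of agreeing model pairs.
    p_i = [
        (sum(c * c for c in cnt.values()) - n_raters) / (n_raters * (n_raters - 1))
        for cnt in counts
    ]
    p_bar = sum(p_i) / n_items
    # Expected agreement from the overall label distribution.
    totals = Counter()
    for cnt in counts:
        totals.update(cnt)
    p_e = sum((v / (n_items * n_raters)) ** 2 for v in totals.values())
    return (p_bar - p_e) / (1 - p_e)

if __name__ == "__main__":
    preds = {  # hypothetical labels from three models on five items
        "model_a": ["pos", "neg", "pos", "neu", "pos"],
        "model_b": ["pos", "neg", "pos", "pos", "pos"],
        "model_c": ["pos", "neg", "neu", "pos", "pos"],
    }
    print(f"pairwise agreement: {pairwise_agreement(preds):.2f}")  # 0.73
    print(f"Fleiss' kappa:      {fleiss_kappa(preds):.2f}")        # 0.46
```

The gap between the two numbers illustrates why chance correction is used: raw agreement looks high whenever one label dominates, while kappa discounts the agreement that would occur by chance under the observed label distribution.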

Papers