Monotone Adversary

Monotone adversaries are a challenging class of adversaries in machine learning. Rather than acting arbitrarily, their manipulations of data or models are constrained to a fixed direction (classically, only changes that appear consistent with the ground-truth structure), yet these seemingly benign modifications can still cause misclassification or degrade model performance by breaking algorithms tuned to clean distributional assumptions. Current research investigates defenses that remain robust under such adversaries across machine learning tasks including federated learning, reinforcement learning, and natural language processing, employing techniques such as list-decodable learning, activation-space analysis, and adaptive budget allocation in multi-agent systems. Understanding and mitigating the impact of monotone adversaries is crucial for building reliable and secure machine learning systems across diverse applications, from autonomous systems to large language models.
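
To make the monotonicity constraint concrete, consider the classical planted-clique setting: the adversary may only add edges inside the planted clique or delete edges elsewhere, so every change appears to "help" the planted structure, yet calibrated deletions can erase the degree signal that simple detection heuristics rely on. The NumPy simulation below is a minimal illustrative sketch, not drawn from any of the papers listed here; the function names (`planted_clique_graph`, `monotone_adversary`) and all parameter values are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

def planted_clique_graph(n, k, p):
    """Adjacency matrix of G(n, p) with a clique planted on vertices 0..k-1."""
    upper = np.triu(rng.random((n, n)) < p, 1).astype(int)
    A = upper + upper.T                   # symmetric, zero diagonal
    A[:k, :k] = 1 - np.eye(k, dtype=int)  # plant the clique
    return A

def monotone_adversary(A, k, removal_prob):
    """Delete each edge leaving the clique with probability removal_prob.

    Deleting edges outside the planted clique (and adding edges inside
    it) are the only moves a monotone adversary may make: every change
    is consistent with the planted solution, yet the clique vertices'
    telltale degree advantage disappears.
    """
    B = A.copy()
    block = B[:k, k:]                     # view of clique-to-outside edges
    block[rng.random(block.shape) < removal_prob] = 0
    B[k:, :k] = block.T                   # restore symmetry
    return B

n, k, p = 2000, 60, 0.5
A = planted_clique_graph(n, k, p)
# removal_prob ~ (k - 1) / ((n - k) * p) roughly cancels the clique's
# extra expected degree of (k - 1)
B = monotone_adversary(A, k, removal_prob=0.06)

print("mean clique degree, before:", A[:k].sum(axis=1).mean())  # ~1029
print("mean clique degree, after: ", B[:k].sum(axis=1).mean())  # ~971
print("mean non-clique degree:    ", B[k:].sum(axis=1).mean())  # ~998
```

With the removal rate roughly calibrated to the clique's (k - 1) extra expected edges, clique vertices no longer top the degree ranking, so a "take the k highest-degree vertices" heuristic fails even though the adversary never touched the clique itself.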

Papers