Defense Framework

Defense frameworks are being developed to protect machine learning models, particularly large language models and federated learning systems, from adversarial attacks such as data poisoning, model backdooring, and adversarial examples. Current research focuses on robust methods such as data curation, model merging, and homophily augmentation, often employing techniques like mixture-of-experts models and outlier detection to improve model resilience. These advances are crucial for mitigating harm from malicious manipulation and for ensuring the reliability and trustworthiness of AI systems across diverse applications, from natural language processing to autonomous driving and resource allocation.
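To make the outlier-detection idea concrete, the sketch below flags suspected poisoned training samples as those lying unusually far from their class centroid in feature space. This is a minimal illustrative example, not the method of any specific paper surveyed here; the function name, the centroid-distance criterion, and the z-score threshold are all assumptions chosen for simplicity.

```python
import numpy as np

def flag_outliers(features, labels, z_thresh=3.0):
    """Flag samples whose feature vector lies far from its class centroid.

    A simple outlier-detection defense against data poisoning
    (illustrative sketch only): per class, compute the mean feature
    vector, then mark samples whose distance from it exceeds
    mean + z_thresh * std of the within-class distances.
    """
    flagged = np.zeros(len(labels), dtype=bool)
    for c in np.unique(labels):
        idx = np.where(labels == c)[0]
        centroid = features[idx].mean(axis=0)
        dists = np.linalg.norm(features[idx] - centroid, axis=1)
        mu, sigma = dists.mean(), dists.std()
        if sigma > 0:
            flagged[idx] = dists > mu + z_thresh * sigma
    return flagged

# Usage: a tight cluster of clean points plus one injected far-away point.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.1, size=(50, 2)), [[10.0, 10.0]]])
y = np.zeros(51, dtype=int)
mask = flag_outliers(X, y)
print(mask.sum())  # the single injected point is flagged
```

Real defenses typically operate on learned representations (e.g., penultimate-layer activations) rather than raw inputs, but the filtering logic is the same: remove or down-weight samples that deviate sharply from their class distribution before training.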

Papers