Non-Robust Features

Non-robust features, aspects of input data that are genuinely predictive of labels on natural data yet highly sensitive to small, imperceptible perturbations, are a central concern in machine learning robustness research. Current work focuses on identifying and mitigating the influence of these features through techniques such as input density smoothing and information bottleneck methods, often in the context of adversarial training and across a range of deep learning architectures. Understanding and controlling the impact of non-robust features is crucial for building trustworthy and reliable machine learning systems, particularly in safety-critical applications where model predictions must resist adversarial manipulation.
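The brittleness described above can be illustrated with a toy example: a linear classifier that leans heavily on one small-magnitude feature is accurate on clean inputs, yet a tiny FGSM-style perturbation (stepping against the sign of the weight vector, which is the gradient of a linear score) flips its prediction. This is a minimal sketch with invented weights and inputs, not a setup drawn from any particular paper.

```python
import numpy as np

# Toy linear classifier: predict sign(w . x).
# Feature 2 is "non-robust": its weight is large, so the model relies on
# it heavily, but its value in the input is tiny and easily perturbed.
# (Hypothetical numbers chosen purely for illustration.)
w = np.array([0.1, 5.0])    # model leans heavily on feature 2
x = np.array([1.0, 0.05])   # clean input: small non-robust signal

def predict(x):
    return int(np.sign(w @ x))

clean_pred = predict(x)     # score = 0.1*1.0 + 5.0*0.05 = 0.35 > 0 -> +1

# FGSM-style perturbation with budget eps: for a linear score w . x,
# the gradient w.r.t. x is w, so the attack steps along -sign(w).
eps = 0.08
x_adv = x - eps * np.sign(w)

adv_pred = predict(x_adv)   # score = 0.1*0.92 + 5.0*(-0.03) = -0.058 -> -1

print(clean_pred, adv_pred)  # prints: 1 -1
```

An L-infinity perturbation of only 0.08 per coordinate flips the decision because the classifier's confidence is concentrated on the fragile feature; robustness methods like those mentioned above aim to reduce exactly this kind of dependence.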

Papers