Universal Perturbation

Universal adversarial perturbations (UAPs) are single, small perturbations that, when added to almost any input, consistently fool a machine learning model, in contrast to traditional adversarial attacks crafted for individual inputs. Current research focuses on training models that are robust to UAPs, applying them in diverse settings such as 3D object recognition and large language model security, and investigating their inherent properties, including potential beneficial uses such as improving classification accuracy. The ability to craft and defend against UAPs has significant implications for the trustworthiness and security of machine learning systems across applications, underscoring the need for further research into both attack and defense strategies.
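To make the definition concrete, the sketch below searches for one perturbation `v`, constrained to a small L-infinity ball, that flips a model's predictions on as many inputs as possible when added to all of them. The model (a fixed linear scorer) and the naive random search are illustrative stand-ins, not the procedure from any particular paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a trained classifier: a fixed linear scorer
# (weights are illustrative, not taken from any published model).
w = rng.normal(size=5)

def predict(X):
    return (X @ w > 0).astype(int)

X = rng.normal(size=(200, 5))
y = predict(X)          # treat the clean predictions as "correct" labels

eps = 0.3               # L-infinity budget: every entry of v stays in [-eps, eps]

def fooling_rate(v):
    """Fraction of inputs whose prediction changes when v is added to ALL of them."""
    return float(np.mean(predict(X + v) != y))

# Naive random search over the eps-ball for a single shared perturbation --
# a toy substitute for the gradient-based UAP construction in the literature.
best_v, best_rate = np.zeros(5), 0.0
for _ in range(500):
    v = rng.uniform(-eps, eps, size=5)
    r = fooling_rate(v)
    if r > best_rate:
        best_v, best_rate = v, r
```

The key point the sketch illustrates is that `best_v` is input-agnostic: one fixed vector, bounded by `eps`, degrades predictions across the whole batch rather than being recomputed per example.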

Papers