Paper ID: 2212.04014
Statistical and Computational Guarantees for Influence Diagnostics
Jillian Fisher, Lang Liu, Krishna Pillutla, Yejin Choi, Zaid Harchaoui
Influence diagnostics, such as influence functions and approximate maximum influence perturbations, are popular in machine learning and in AI domain applications. They are powerful statistical tools for identifying influential datapoints or subsets of datapoints. We establish finite-sample statistical bounds, as well as computational complexity bounds, for influence functions and approximate maximum influence perturbations using efficient inverse-Hessian-vector product implementations. We illustrate our results with generalized linear models and large attention-based models on synthetic and real data.
Submitted: Dec 8, 2022
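
The abstract refers to influence functions computed through inverse-Hessian-vector products. The following is a minimal sketch of that idea, not the paper's implementation. It assumes a ridge-regularized least-squares model (a simple generalized linear model), the common sign convention in which the influence of datapoint i on the parameters is -H^{-1} grad l_i(theta_hat), and conjugate gradient as the inverse-Hessian-vector product solver. The toy data and all function names (hvp, conjugate_gradient, influence) are hypothetical and chosen only for illustration.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def hvp(X, v, lam):
    """Hessian-vector product (X^T X / n + lam * I) v, without forming the Hessian."""
    n = len(X)
    Xv = [dot(row, v) for row in X]
    out = [lam * vj for vj in v]
    for row, s in zip(X, Xv):
        for j, xj in enumerate(row):
            out[j] += xj * s / n
    return out

def conjugate_gradient(matvec, b, iters=50, tol=1e-12):
    """Solve A x = b for symmetric positive definite A, given only v -> A v."""
    x = [0.0] * len(b)
    r = list(b)          # residual b - A x, with x = 0 initially
    p = list(r)          # search direction
    rs = dot(r, r)
    for _ in range(iters):
        if rs < tol:
            break
        Ap = matvec(p)
        alpha = rs / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x

def influence(X, y, theta, i, lam):
    """Influence of datapoint i on the parameters: -H^{-1} grad l_i(theta)."""
    resid = dot(X[i], theta) - y[i]
    grad = [resid * xj for xj in X[i]]   # gradient of 0.5 * (x_i . theta - y_i)^2
    ihvp = conjugate_gradient(lambda v: hvp(X, v, lam), grad)
    return [-u for u in ihvp]

# Toy data (hypothetical): intercept column plus one feature.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [0.1, 0.9, 2.2, 2.8]
lam = 0.1
n = len(X)

# Fit theta_hat by solving the normal equations H theta = X^T y / n with the same solver.
rhs = [sum(X[k][j] * y[k] for k in range(n)) / n for j in range(2)]
theta_hat = conjugate_gradient(lambda v: hvp(X, v, lam), rhs)

# Influence of the last datapoint on the fitted parameters.
infl = influence(X, y, theta_hat, 3, lam)
print(infl)
```

The solver only needs Hessian-vector products, so the Hessian is never formed or inverted explicitly. This is the property that makes inverse-Hessian-vector product methods attractive for larger models.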