Paper ID: 2212.04014

Statistical and Computational Guarantees for Influence Diagnostics

Jillian Fisher, Lang Liu, Krishna Pillutla, Yejin Choi, Zaid Harchaoui

Influence diagnostics such as influence functions and approximate maximum influence perturbations are popular in machine learning and in AI domain applications. Influence diagnostics are powerful statistical tools to identify influential datapoints or subsets of datapoints. We establish finite-sample statistical bounds, as well as computational complexity bounds, for influence functions and approximate maximum influence perturbations using efficient inverse-Hessian-vector product implementations. We illustrate our results with generalized linear models and large attention-based models on synthetic and real data.
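
As an illustration of what an inverse-Hessian-vector product computation can look like, the following minimal sketch computes the classical influence of one training point on the parameters of a ridge regression, -H^{-1} g_i, where H is the Hessian of the regularized objective and g_i is the gradient of the i-th loss. The ridge model, the conjugate gradient solver, and all function names (ridge_fit, hvp, conjugate_gradient, influence_on_params) are assumptions chosen for illustration; the abstract does not state which solver or model parameterization the paper uses.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Minimize (1/n) * sum_i 0.5*(x_i^T theta - y_i)^2 + (lam/2)*||theta||^2."""
    n, d = X.shape
    H = X.T @ X / n + lam * np.eye(d)
    return np.linalg.solve(H, X.T @ y / n)

def hvp(X, lam, v):
    """Hessian-vector product of the ridge objective, without forming H."""
    return X.T @ (X @ v) / X.shape[0] + lam * v

def conjugate_gradient(matvec, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive definite A given only v -> A v."""
    x = np.zeros_like(b)
    r = b.copy()          # residual b - A x, with x = 0 initially
    p = r.copy()          # search direction
    rs = r @ r
    for _ in range(max_iter):
        if np.sqrt(rs) < tol:
            break
        Ap = matvec(p)
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

def influence_on_params(X, y, theta, lam, i):
    """Classical influence of point i on the parameters: -H^{-1} grad_i."""
    grad_i = (X[i] @ theta - y[i]) * X[i]  # gradient of 0.5*(x_i^T theta - y_i)^2
    return -conjugate_gradient(lambda v: hvp(X, lam, v), grad_i)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(50, 5))
    y = X @ rng.normal(size=5) + 0.1 * rng.normal(size=50)
    theta = ridge_fit(X, y, lam=0.1)
    print(influence_on_params(X, y, theta, lam=0.1, i=3))
```

The solver touches the Hessian only through hvp, so the same pattern applies when the Hessian is too large to form explicitly; for small problems a direct solve gives the same vector and serves as a check.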

Submitted: Dec 8, 2022