The Mirrored Influence Hypothesis: Efficient Data Influence Estimation by Harnessing Forward Passes
Large-scale black-box models are ubiquitous, but understanding how individual training points influence their predictions remains computationally challenging. We introduce the Mirrored Influence Hypothesis, which posits a reciprocity between the influence of training data on test predictions and the influence of test data on training points, allowing the estimation task to be reformulated. Building on this hypothesis, we estimate training data influence by computing gradients only for selected test samples and using only forward passes for each training point; because the test samples of interest are typically far fewer than the training points, this yields substantial efficiency gains. We demonstrate broad applicability, including data attribution for diffusion models, data leakage detection, memorization analysis, mislabeled data detection, and tracing behavior in language models (LMs).
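To make the reformulation concrete, below is a minimal PyTorch sketch of one way the reciprocity could be exploited: the gradient is computed once, for the test sample only, and every training point is then scored using forward passes alone. The function name `influence_scores`, the single SGD step, and the loss-difference score are illustrative assumptions for this sketch, not the paper's exact procedure.

```python
# Hypothetical sketch (not the paper's exact algorithm): score each training
# point's influence on a test sample by (1) taking one gradient step on the
# test loss, then (2) comparing each training point's loss before vs. after
# that step, using forward passes only.
import copy
import torch
import torch.nn.functional as F

def influence_scores(model, test_x, test_y, train_loader, lr=1e-3):
    """Return one score per training example; a larger loss drop after
    adapting the model to the test sample is taken as a proxy for influence."""
    device = next(model.parameters()).device

    # Single gradient computation, restricted to the test sample.
    adapted = copy.deepcopy(model)
    adapted.train()
    opt = torch.optim.SGD(adapted.parameters(), lr=lr)
    opt.zero_grad()
    loss = F.cross_entropy(adapted(test_x.to(device)), test_y.to(device))
    loss.backward()
    opt.step()

    # Forward passes only over the training data: no per-training-point
    # gradients are ever computed.
    model.eval()
    adapted.eval()
    scores = []
    with torch.no_grad():
        for x, y in train_loader:
            x, y = x.to(device), y.to(device)
            before = F.cross_entropy(model(x), y, reduction="none")
            after = F.cross_entropy(adapted(x), y, reduction="none")
            scores.append(before - after)  # per-example loss decrease
    return torch.cat(scores)
```

The efficiency gain in this sketch comes from the asymmetry the abstract describes: one backward pass per test sample of interest, then only cheap forward passes over the (much larger) training set.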