The Mirrored Influence Hypothesis: Efficient Data Influence Estimation by Harnessing Forward Passes
Large-scale black-box models are ubiquitous, but understanding how individual training points influence their predictions remains computationally challenging. We introduce the Mirrored Influence Hypothesis, which posits a reciprocity between the influence of training data on test predictions and the influence of test data on training points, allowing the estimation task to be reformulated. Building on this hypothesis, we estimate training data influence by computing gradients only for selected test samples and using only forward passes for each training point; because the test samples of interest are typically far fewer than the training points, this yields substantial efficiency gains. We demonstrate broad applicability, including data attribution for diffusion models, data leakage detection, memorization analysis, mislabeled data detection, and tracing behavior in language models (LMs).
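To make the reformulation concrete, below is a minimal PyTorch sketch of one way the reciprocity could be exploited: the gradient is computed once, for the test sample only, and every training point is then scored using forward passes alone. The function name `influence_scores`, the single SGD step, and the loss-difference score are illustrative assumptions for this sketch, not the paper's exact procedure.

```python
# Hypothetical sketch (not the paper's exact algorithm): score each training
# point's influence on a test sample by (1) taking one gradient step on the
# test loss, then (2) comparing each training point's loss before vs. after
# that step, using forward passes only.
import copy
import torch
import torch.nn.functional as F

def influence_scores(model, test_x, test_y, train_loader, lr=1e-3):
    """Return one score per training example; a larger loss drop after
    adapting the model to the test sample is taken as a proxy for influence."""
    device = next(model.parameters()).device

    # Single gradient computation, restricted to the test sample.
    adapted = copy.deepcopy(model)
    adapted.train()
    opt = torch.optim.SGD(adapted.parameters(), lr=lr)
    opt.zero_grad()
    loss = F.cross_entropy(adapted(test_x.to(device)), test_y.to(device))
    loss.backward()
    opt.step()

    # Forward passes only over the training data: no per-training-point
    # gradients are ever computed.
    model.eval()
    adapted.eval()
    scores = []
    with torch.no_grad():
        for x, y in train_loader:
            x, y = x.to(device), y.to(device)
            before = F.cross_entropy(model(x), y, reduction="none")
            after = F.cross_entropy(adapted(x), y, reduction="none")
            scores.append(before - after)  # per-example loss decrease
    return torch.cat(scores)
```

The efficiency gain in this sketch comes from the asymmetry the abstract describes: one backward pass per test sample of interest, then only cheap forward passes over the (much larger) training set.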