LAVA: Data Valuation without Pre-Specified Learning Algorithms
We introduce LAVA, a framework for valuing data independently of any pre-specified learning algorithm. We derive a proxy for validation performance based on a class-wise Wasserstein distance between the training and validation sets, and show that under Lipschitz conditions this distance upper-bounds validation performance. Sensitivity analysis of the distance yields a value for each individual training point, and these values can be read directly from the output of the optimal transport solver, avoiding repeated training runs. We evaluate LAVA across diverse settings and find that it supports algorithm-agnostic data valuation for data acquisition and pricing.
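To make the idea of reading pointwise values off solver outputs concrete, the following is a minimal sketch and not the authors' implementation. It makes several assumptions that do not come from the text above: the class-wise distance is replaced by a simplified cost (feature distance plus a hypothetical `LABEL_PENALTY` when labels differ), the transport problem is solved with entropic regularization (log-domain Sinkhorn), and the dual potentials are calibrated by subtracting the mean of the other potentials to remove their additive ambiguity. All function and variable names are illustrative.

```python
import math


def sinkhorn_potentials(cost, a, b, eps=0.1, iters=500):
    """Log-domain Sinkhorn for entropic OT; returns dual potentials (f, g)."""
    n, m = len(a), len(b)
    f, g = [0.0] * n, [0.0] * m

    def logsumexp(vals):
        top = max(vals)
        return top + math.log(sum(math.exp(v - top) for v in vals))

    for _ in range(iters):
        # Alternate soft-min updates of the two dual potentials.
        f = [-eps * logsumexp([math.log(b[j]) + (g[j] - cost[i][j]) / eps
                               for j in range(m)]) for i in range(n)]
        g = [-eps * logsumexp([math.log(a[i]) + (f[i] - cost[i][j]) / eps
                               for i in range(n)]) for j in range(m)]
    return f, g


def calibrated_values(f):
    """Subtract from each potential the mean of the others.

    Assumed calibration: it removes the additive constant that dual
    potentials are only defined up to.
    """
    n, total = len(f), sum(f)
    return [fi - (total - fi) / (n - 1) for fi in f]


# Toy data as (feature, label). The last training point is deliberately
# mislabeled: its feature sits near class 0 but it carries label 1.
train = [(0.0, 0), (0.1, 0), (1.0, 1), (0.9, 1), (0.05, 1)]
val = [(0.0, 0), (0.1, 0), (1.0, 1), (0.9, 1)]

LABEL_PENALTY = 1.0  # hypothetical weight on label disagreement (assumption)
cost = [[abs(x - xv) + LABEL_PENALTY * (y != yv) for (xv, yv) in val]
        for (x, y) in train]
a = [1.0 / len(train)] * len(train)  # uniform mass on training points
b = [1.0 / len(val)] * len(val)      # uniform mass on validation points

# One solver call; no model is trained at any point.
f, _ = sinkhorn_potentials(cost, a, b)
values = calibrated_values(f)
for point, v in zip(train, values):
    print(point, round(v, 3))
```

Under this sign convention, a larger score means that adding mass at that training point increases the distance to the validation set, so the point is more likely to be harmful. In this toy setup the mislabeled point should receive the largest score.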