A theoretical analysis of using gradient data for Sobolev training in RKHS
Recent work has shown that incorporating target derivatives during training can improve accuracy and data efficiency. We provide a theoretical analysis of gradient-augmented (Sobolev) training in reproducing kernel Hilbert spaces (RKHS), highlighting (i) its benefits and limitations in low-data regimes, and (ii) how gradient information affects learning rates. We establish thresholds on the target's Lipschitz constant and the sample size under which Sobolev training is more sample-efficient than classical value-only training.
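For concreteness, the sketch below shows what gradient-augmented kernel regression can look like in one dimension: gradient observations are treated as additional linear constraints, so the estimator solves a ridge system over a joint Gram matrix built from the kernel and its derivatives. The RBF kernel, the toy target sin(x), and all hyperparameter values are illustrative assumptions, not the paper's actual construction.

```python
# Minimal 1-D sketch of gradient-augmented ("Sobolev") kernel ridge
# regression. Kernel choice, target, and hyperparameters are assumptions
# for illustration only.
import numpy as np

rng = np.random.default_rng(0)
ell, lam, n = 0.7, 1e-6, 8          # lengthscale, ridge parameter, sample size

x = np.sort(rng.uniform(0.0, 2 * np.pi, n))
y, g = np.sin(x), np.cos(x)          # target values and target derivatives

def rbf(a, b):
    """RBF kernel k(a, b) = exp(-(a-b)^2 / (2 ell^2)); also returns a - b."""
    d = a[:, None] - b[None, :]
    return np.exp(-d**2 / (2 * ell**2)), d

K, d = rbf(x, x)
K01 = d / ell**2 * K                      # dk/dx'  (derivative in 2nd argument)
K11 = (1 / ell**2 - d**2 / ell**4) * K    # d^2 k / (dx dx')

# Joint Gram matrix over value and gradient observations; the Sobolev
# estimator solves (G + lam * I) coef = [y; g].
G = np.block([[K, K01], [K01.T, K11]])
coef = np.linalg.solve(G + lam * np.eye(2 * n), np.concatenate([y, g]))

def predict(xt):
    """Evaluate the fitted function on test points xt."""
    Kt, dt = rbf(xt, x)
    return np.hstack([Kt, dt / ell**2 * Kt]) @ coef

xt = np.linspace(0.0, 2 * np.pi, 200)
print("max |f - f_hat|:", np.abs(np.sin(xt) - predict(xt)).max())
```

The classical value-only baseline is recovered by dropping the gradient block, i.e., solving (K + lam * I) coef = y, which makes the sample-efficiency comparison in the abstract concrete: both estimators see the same n inputs, but the Sobolev estimator uses 2n scalar observations.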