Towards Robustness Certification Against Universal Perturbations

We study certifying neural network robustness against universal perturbations (UPs), which are shared across all samples and arise in universal adversarial attacks and backdoor attacks. Sample-wise certification bounds are loose under the UP threat model because they ignore the constraint that the perturbation is shared across samples. We combine linear relaxation–based analysis with mixed-integer linear programming (MILP) to establish the first certification method for UPs, and we develop a framework for computing population error bounds from certification results on randomly sampled batches. The certifications further enable efficient comparison of the robustness of models and defense methods, as well as accurate detection of backdoor target classes.
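As a rough illustration of the batch-to-population step, the sketch below converts a batch-level certification count into a high-probability lower bound on population robust accuracy using a one-sided Hoeffding-style concentration bound for a fixed shared perturbation. The function name and the concentration argument are illustrative assumptions, not the paper's actual framework, which may use a different (and tighter) analysis.

```python
import math

def certified_population_lower_bound(k_certified, n_batch, delta=0.05):
    """Illustrative only: lower-bound the population robust accuracy from a
    randomly sampled batch, assuming the per-sample robustness indicators for
    a fixed shared perturbation are i.i.d. and applying a one-sided Hoeffding
    bound. The paper's actual framework may differ."""
    empirical_rate = k_certified / n_batch
    # With probability >= 1 - delta, the population rate is at least
    # empirical_rate - eps, where eps = sqrt(ln(1/delta) / (2 n)).
    eps = math.sqrt(math.log(1.0 / delta) / (2.0 * n_batch))
    return max(0.0, empirical_rate - eps)

# Example: 180 of 200 randomly drawn samples are certified robust under the
# shared-perturbation budget; the bound is roughly 0.81 at 95% confidence.
print(certified_population_lower_bound(180, 200))
```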

Authors

Yi Zeng

Zhouxing Shi

Ming Jin

Feiyang Kang

Lingjuan Lyu

Cho-Jui Hsieh

Ruoxi Jia

Published

January 1, 2023