Stability-certified Reinforcement Learning: A Control-theoretic Perspective
We study the problem of certifying the stability of reinforcement learning (RL) policies interconnected with nonlinear dynamical systems. By regulating the partial gradients of the policy, robust stability of the interconnection can be certified by solving a semidefinite programming (SDP) feasibility problem that exploits problem-specific structure. We analyze the (non)conservatism of the resulting certificate and empirically demonstrate, on multi-flight formation and power system frequency regulation tasks, that policies achieve high performance within the certified parameter space and that learning remains stable in the long run.
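To give a sense of what an SDP feasibility certificate can look like, the following is a minimal sketch for a simplified setting and is not the condition derived in this work. It assumes a linear plant with matrices $A$, $B$ and a linear state-feedback gain $K$ whose entries are confined to a box, a stand-in for bounds on partial policy gradients. The symbols $A$, $B$, $K$, $K_i$, $P$, and $m$ are introduced here for illustration only.

```latex
% Illustrative only: a linear closed loop with an entrywise-bounded gain.
% K_1, ..., K_m denote the vertices of the box of admissible gains (assumed).
\begin{aligned}
&\dot{x} = (A + B K)\,x, \qquad K \in \operatorname{conv}\{K_1, \dots, K_m\}, \\
&\text{find } P = P^{\top} \succ 0 \ \text{ such that } \
(A + B K_i)^{\top} P + P\,(A + B K_i) \prec 0, \quad i = 1, \dots, m.
\end{aligned}
```

Because the matrix inequality is affine in $K$, feasibility at the vertices implies feasibility for every gain in the convex hull. The quadratic function $V(x) = x^{\top} P x$ then decreases along all closed-loop trajectories, so a single feasible $P$ certifies stability for the whole set of admissible gains at once. The nonlinear case treated here needs a more refined condition than this sketch provides.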