Inverse Reinforcement Learning via Deep Gaussian Process

We propose an approach to inverse reinforcement learning (IRL) based on deep Gaussian processes (deep GPs), which learns complex reward structures from a small number of demonstrations. Stacking latent GP layers yields abstract representations of the state features, which are linked to the demonstrations through the maximum entropy IRL framework. Because the embedded IRL engine makes exact inference intractable, we develop a non-standard variational approximation that admits an approximate Bayesian treatment and guards against overfitting. By learning the representation and the reward jointly, the method outperforms state-of-the-art baselines on both standard and new benchmarks.
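To make the maximum entropy IRL loop that the deep GP plugs into concrete, here is a minimal, self-contained sketch on a toy chain MDP. Everything in it is an illustrative assumption rather than the authors' implementation: the chain MDP, the `soft_value_iteration` routine, and the two-layer tanh feature map, which merely stands in for the stacked latent GP layers and their variational inference.

```python
import numpy as np

# Toy MaxEnt IRL sketch (hypothetical setup, not the paper's code).
# A fixed two-layer tanh feature map stands in for the deep GP layers.

n_states, n_actions, gamma = 8, 2, 0.95

# P[a, s, s']: deterministic left/right moves on a chain, clamped at ends
P = np.zeros((n_actions, n_states, n_states))
for s in range(n_states):
    P[0, s, max(s - 1, 0)] = 1.0
    P[1, s, min(s + 1, n_states - 1)] = 1.0

phi = np.eye(n_states)                      # one-hot state features

rng = np.random.default_rng(0)
W1 = rng.normal(size=(n_states, 16))        # frozen "representation" layer
theta = np.zeros(16)                        # reward weights on top layer

def reward(theta):
    h = np.tanh(phi @ W1)                   # latent representation per state
    return h @ theta, h

def soft_value_iteration(r, n_iter=200):
    # MaxEnt (soft) Bellman backups yield a stochastic policy pi[s, a]
    V = np.zeros(n_states)
    for _ in range(n_iter):
        Q = r[:, None] + gamma * np.einsum('ast,t->sa', P, V)
        V = np.logaddexp.reduce(Q, axis=1)  # log-sum-exp over actions
    return np.exp(Q - V[:, None])

def state_visitation(pi, start, horizon=50):
    # expected state visitation frequencies under pi from a start state
    d = np.zeros(n_states); d[start] = 1.0
    total = np.zeros(n_states)
    for _ in range(horizon):
        total += d
        d = np.einsum('s,sa,ast->t', d, pi, P)
    return total / horizon

# hypothetical expert demonstrations: expert occupies the right end
expert_visits = np.zeros(n_states); expert_visits[-1] = 1.0

for step in range(300):
    r, h = reward(theta)
    pi = soft_value_iteration(r)
    learner_visits = state_visitation(pi, start=0)
    # MaxEnt IRL gradient: (expert - learner) visitation difference,
    # projected through the feature map onto the reward weights
    theta += 0.1 * h.T @ (expert_visits - learner_visits)

print("learned reward:", np.round(reward(theta)[0], 2))
```

In the paper's model, the frozen feature map above is replaced by learned latent GP layers, so the representation and the reward are inferred jointly under the variational approximation rather than hand-fixed as here.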

Authors

Ming Jin, Andreas Damianou, Pieter Abbeel, Costas Spanos

Published

January 1, 2017