What are the implications of a Curriculum Learning strategy for IRL methods?

Investigating Inverse Reinforcement Learning from Human Behavior


Abstract

Inverse Reinforcement Learning (IRL) is a subfield of Reinforcement Learning (RL) that focuses on recovering the reward function from expert demonstrations. Within IRL, Adversarial IRL (AIRL) is a promising algorithm that is postulated to recover non-linear rewards in environments with unknown dynamics. This study investigates the potential benefits of applying a Curriculum Learning (CL) strategy to the AIRL algorithm. For our experiments, we use a randomized partially observable Markov decision process in the form of a grid-world environment. Using only expert demonstrations obtained with an RL algorithm under the true reward function, we train AIRL in a variety of configurations and identify an effective curriculum. Our results show that a well-constructed curriculum can improve AIRL's performance twofold on both key metrics: the speed of convergence and the efficiency of expert-demonstration use. We thus conclude that CL can be a useful addition to an AIRL-based solution. The full code is available online in the supplementary material: https://github.com/mikhail-vlasenko/curriculum-learning-IRL.
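To make the curriculum idea concrete, the sketch below shows one common way a curriculum can be organized for a grid-world trainer: stages are ordered from easiest to hardest, and learned parameters are carried forward between stages. This is a minimal illustration under our own assumptions, not the paper's actual implementation; the names `make_curriculum`, `run_curriculum`, and the `train_stage` callback are hypothetical.

```python
# Illustrative sketch only: a generic curriculum schedule for a
# staged trainer. Names and structure are hypothetical, not taken
# from the AIRL codebase linked above.

def make_curriculum(grid_sizes, steps_per_stage):
    """Pair each environment difficulty (here, grid size) with a
    training-step budget, ordered from easiest to hardest."""
    return list(zip(sorted(grid_sizes), steps_per_stage))

def run_curriculum(curriculum, train_stage):
    """Run the trainer on each stage in order, carrying the learned
    parameters (represented here as a dict) forward between stages."""
    params = {}
    for size, steps in curriculum:
        # train_stage is a user-supplied callback that continues
        # training from the previous stage's parameters.
        params = train_stage(params, size, steps)
    return params
```

The key design choice a curriculum fixes is the ordering: by the time the learner faces the hardest environments, it has already extracted signal from the easier ones, which is the mechanism the abstract credits for faster convergence and better demonstration efficiency.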