A Framework for Reinforcement Learning and Planning

Extended Abstract


Abstract

Sequential decision making, commonly formalized as Markov Decision Process (MDP) optimization, is a key challenge in artificial intelligence. Two successful approaches to MDP optimization are planning and reinforcement learning. Both fields largely have their own research communities. However, if both fields solve the same problem, then we should be able to disentangle the common factors in their solution approaches. Therefore, this paper presents a unifying framework for reinforcement learning and planning (FRAP), which identifies the underlying dimensions on which any planning or learning algorithm has to decide. At the end of the paper, we compare, in a single table, a variety of well-known planning, model-free RL and model-based RL algorithms along the dimensions of our framework, illustrating the validity of the framework. Altogether, FRAP provides deeper insight into the algorithmic space of planning and reinforcement learning, and also suggests new approaches to the integration of both fields.