Residential demand response of thermostatically controlled loads using batch Reinforcement Learning

Ruelens, F.; Claessens, B.J.; Vandael, S.; De Schutter, B.; Babuska, R.; Belmans, R.

doi:10.1109/TSG.2016.2517211

Journal article (2017)

Authors

F. Ruelens, Katholieke Universiteit Leuven

B.J. Claessens, Flemish Institute for Technological Research

S. Vandael, Katholieke Universiteit Leuven

Bart De Schutter

R. Babuska, Learning & Autonomous Control - Mechanical, Maritime and Materials Engineering

R. Belmans, Katholieke Universiteit Leuven

Feature extraction; Load management; Water heating; Resistance heating; Atmospheric modeling; Load modeling; Learning (artificial intelligence)

To reference this document use:

http://resolver.tudelft.nl/uuid:2a6f9e25-d538-45d5-a8dd-2c2f9b453bdb

Published Date

2017

Language

English

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Driven by recent advances in batch Reinforcement Learning (RL), this paper contributes to the application of batch RL to demand response. In contrast to conventional model-based approaches, batch RL techniques do not require a system identification step, which makes them better suited to large-scale implementation. This paper extends fitted Q-iteration, a standard batch RL technique, to the case in which a forecast of the exogenous data is available. In general, batch RL techniques do not rely on expert knowledge about the system dynamics or the solution. However, if some expert knowledge is available, it can be incorporated using the proposed policy adjustment method. Finally, we tackle the challenge of finding the open-loop schedule required to participate in the day-ahead market. We propose a model-free Monte Carlo method that uses a metric based on the state-action value function (Q-function), and we illustrate this method by finding the day-ahead schedule of a heat-pump thermostat. Our experiments show that batch RL techniques provide a valuable alternative to model-based controllers and that they can be used to construct both closed-loop and open-loop policies.
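
For readers unfamiliar with fitted Q-iteration, the following is a minimal sketch of the generic technique only. It is not the forecast-aware extension, the policy adjustment method, or the Monte Carlo scheduler proposed in the paper. Everything specific here is an assumption made for illustration: the toy thermostat dynamics, the cost function, the quadratic features, the linear least-squares regressor, the discount factor, and all function names.

```python
import numpy as np

def features(temp):
    """Quadratic features of a 1-D array of indoor temperatures (assumed state)."""
    t = (np.asarray(temp, dtype=float) - 20.0) / 5.0  # rough normalisation
    return np.stack([np.ones_like(t), t, t ** 2], axis=1)

def q_values(weights, temp):
    """Q(temp, a) for every action a; shape (n_samples, n_actions)."""
    return features(temp) @ weights.T

def fitted_q_iteration(batch, n_actions=2, n_iter=20, gamma=0.9):
    """Generic fitted Q-iteration on a fixed batch of (x, u, cost, x_next)."""
    x, u, cost, x_next = batch
    phi = features(x)
    weights = np.zeros((n_actions, phi.shape[1]))  # start from Q_0 = 0
    for _ in range(n_iter):
        # Regression target: immediate cost plus discounted best next value.
        # Costs are minimised, so "best" means the smallest Q-value.
        target = cost + gamma * q_values(weights, x_next).min(axis=1)
        new_weights = np.zeros_like(weights)
        for a in range(n_actions):
            mask = u == a
            # One least-squares fit per action (hypothetical regressor choice).
            new_weights[a] = np.linalg.lstsq(phi[mask], target[mask], rcond=None)[0]
        weights = new_weights
    return weights

def make_toy_batch(n=500, seed=0):
    """Random transitions from an invented first-order thermal model."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(15.0, 25.0, size=n)        # indoor temperature [deg C]
    u = rng.integers(0, 2, size=n)             # 0 = heater off, 1 = heater on
    x_next = x + 0.1 * (5.0 - x) + 2.0 * u     # assumed outdoor temperature 5 deg C
    cost = 1.0 * u + 5.0 * np.abs(x_next - 21.0)  # energy cost plus discomfort
    return x, u, cost, x_next

batch = make_toy_batch()
weights = fitted_q_iteration(batch)
# Greedy (cost-minimising) action at two example temperatures.
greedy = q_values(weights, np.array([16.0, 24.0])).argmin(axis=1)
```

The learner only ever sees the recorded transitions in `batch`, never the dynamics inside `make_toy_batch`, which is what makes the approach model-free. A tree-based regressor is a common alternative to the linear fit used in this sketch.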
