A novel nested stochastic dynamic programming (nSDP) and nested reinforcement learning (nRL) algorithm for multipurpose reservoir optimization

Journal article (2017)

Authors

Blagoj Delipetrev University Goce Delcev

Andreja Jonoski IHE Delft Institute for Water Education

D.P. Solomatine Water Resources, IHE Delft Institute for Water Education

Research Group

Water Resources (TU Delft)

Algorithm; Reinforcement learning; Stochastic dynamic programming; Optimal reservoir operation

To reference this document use:

http://resolver.tudelft.nl/uuid:909f1640-0653-4d81-9fe3-4714ed3925a6

More Info

Published Date

01-01-2017

Language

English

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Faculty

Civil Engineering & Geosciences

Department

Water Management

Research Group

Water Resources

Abstract

In this article we present two novel multipurpose reservoir optimization algorithms: nested stochastic dynamic programming (nSDP) and nested reinforcement learning (nRL). Each is built as a combination of two algorithms: nSDP couples (1) stochastic dynamic programming (SDP) with (2) a nested optimal allocation algorithm (nOAA), and nRL couples (1) reinforcement learning (RL) with (2) the nOAA. The nOAA is implemented with linear and non-linear optimization. The main novel idea is to include an nOAA at each SDP and RL state transition, which decreases the starting problem dimension and alleviates the curse of dimensionality. Both nSDP and nRL can solve multi-objective optimization problems without significant computational expense or algorithmic complexity, and both can handle dense and irregular variable discretization. The two algorithms were coded in Java as a prototype application and applied to the Knezevo reservoir, located in the Republic of Macedonia. The nSDP and nRL optimal reservoir policies were compared with nested dynamic programming policies, and the overall conclusion is that nRL is more powerful, but significantly more complex, than nSDP.
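The nesting idea in the abstract can be illustrated with a minimal Java sketch. It is not the authors' implementation: the class name, the method names, the priority weights, and the weighted-deficit cost are all assumptions made for illustration. The sketch covers only the linear case, where serving users in order of descending weight minimizes the weighted deficit for a single shared release. The point it tries to show is that the outer SDP or RL step tracks a single storage variable, while the split among users is resolved inside each transition and returned as one scalar cost.

```java
import java.util.Arrays;

/** Illustrative sketch only; all names and the cost form are assumptions. */
final class NestedAllocationSketch {

    /**
     * Hypothetical nested allocation step (linear case): split one release
     * among users, serving the highest weight first, each up to its demand.
     */
    static double[] allocate(double release, double[] demands, double[] weights) {
        Integer[] order = new Integer[demands.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        // Sort user indices by descending priority weight.
        Arrays.sort(order, (a, b) -> Double.compare(weights[b], weights[a]));

        double[] allocation = new double[demands.length];
        double remaining = Math.max(0.0, release);
        for (int user : order) {
            allocation[user] = Math.min(demands[user], remaining);
            remaining -= allocation[user];
        }
        return allocation;
    }

    /** Weighted sum of unmet demand: the scalar cost seen by the outer step. */
    static double transitionCost(double release, double[] demands, double[] weights) {
        double[] allocation = allocate(release, demands, weights);
        double cost = 0.0;
        for (int i = 0; i < demands.length; i++) {
            cost += weights[i] * (demands[i] - allocation[i]);
        }
        return cost;
    }

    /** Mass balance on the single state variable (reservoir storage). */
    static double nextStorage(double storage, double inflow, double release) {
        return storage + inflow - release;
    }

    public static void main(String[] args) {
        // Made-up demands and weights for three hypothetical users.
        double[] demands = {4.0, 6.0, 5.0};
        double[] weights = {0.5, 0.3, 0.2};
        double release = 8.0;

        System.out.println(Arrays.toString(allocate(release, demands, weights)));
        System.out.println(transitionCost(release, demands, weights));
        System.out.println(nextStorage(10.0, 3.0, release));
    }
}
```

In this sketch an outer loop over storage states would call `transitionCost` once per candidate transition, so adding users lengthens the inner allocation but does not add state dimensions. A non-linear cost would need a proper optimizer in place of the greedy pass.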

Files

Jh0190047.pdf

(pdf | 0.679 Mb)

Unknown license

Download not available