Autonomous greenhouse climate control with Q-learning using ENMPC as a function approximator

Title: Autonomous greenhouse climate control with Q-learning using ENMPC as a function approximator
Author: Lubbers, Seymour (TU Delft Mechanical, Maritime and Materials Engineering; TU Delft Delft Center for Systems and Control)
Contributor: Dabiri, A. (mentor); Sun, Congcong (graduation committee); Airaldi, F. (graduation committee); McAllister, R.D. (graduation committee)
Degree granting institution: Delft University of Technology
Programme: Mechanical Engineering | Systems and Control
Date: 2023-08-28

Abstract:
Greenhouses allow the production of crops that would otherwise be impossible, permitting more local, fresher, and nutrient-richer crop production. Efforts are being made to minimize the societal harm caused by the energy and resource consumption of greenhouse production systems. One way to control such systems is model predictive control (MPC), which can, in theory, achieve optimal crop yield and resource efficiency. Unfortunately, a major drawback of MPC is that it is not well equipped to deal with parametric uncertainty: significant prediction errors can occur when a mismatch exists between the model and the real system, deteriorating performance. Strategies designed to handle uncertainty, such as robust MPC, exist, but they often result in conservative control policies. This thesis proposes using MPC as a function approximator for reinforcement learning (RL) in order to learn values for the model and MPC parameters that deliver optimal performance in the case of model mismatch. In this thesis, data-driven economic nonlinear model predictive control (ENMPC) using Q-learning is proposed as a method to adapt the model parameters. The performance of the system after learning is compared to approaches using robust and nominal model predictive control.
Three different goals are considered: maximizing economic profit, minimizing constraint violations, and maximizing economic performance while minimizing constraint violations. In this work, an ENMPC scheme is used as a function approximator in a Q-learning environment. The optimization solution from the ENMPC scheme is used as the input to the system, while the Q-learning agent optimizes the parameter values of the ENMPC scheme and the model of the environment. The simulation results show that the data-driven ENMPC using reinforcement learning is able to decrease constraint violations by up to 94%, but is unable to increase economic performance compared to nominal MPC; compared to robust MPC, the EPI is increased by almost 10% while keeping constraint violations at a similar level.

Subject: Model Predictive Control; Reinforcement Learning (RL); Greenhouse Climate control; function approximation; adaptive nonlinear model predictive control; economic NMPC

To reference this document use: http://resolver.tudelft.nl/uuid:c956b3db-c2b4-4762-8736-fa15ad43d4cf
Part of collection: Student theses
Document type: master thesis
Rights: © 2023 Seymour Lubbers
Files: Thesis_Seymour_Lubbers.pdf (PDF, 6.93 MB)
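The MPC-as-function-approximator idea described in the abstract can be illustrated with a toy Q-learning loop. In the sketch below, a simple quadratic surrogate `q_theta` stands in for the parametric ENMPC value function (in the thesis, evaluating the approximator means solving a full ENMPC optimization problem); the scalar dynamics, stage cost, and all names are illustrative assumptions, not taken from the thesis.

```python
import numpy as np

# Toy sketch of Q-learning with a parametric value-function approximator,
# conceptually mirroring MPC-as-function-approximator schemes. The quadratic
# surrogate below stands in for the ENMPC scheme; everything here is
# illustrative, not the thesis's actual controller or greenhouse model.

def q_theta(theta, s, a):
    """Parametric action-value surrogate, quadratic in state and action."""
    return theta[0] * s**2 + theta[1] * s * a + theta[2] * a**2

def grad_q_theta(s, a):
    """Gradient of q_theta w.r.t. theta (the surrogate is linear in theta)."""
    return np.array([s**2, s * a, a**2])

def td_update(theta, s, a, cost, s_next, actions, gamma=0.95, lr=1e-2):
    """One temporal-difference step: move theta to reduce the TD error."""
    v_next = min(q_theta(theta, s_next, an) for an in actions)  # greedy bootstrap
    td_error = cost + gamma * v_next - q_theta(theta, s, a)
    return theta + lr * td_error * grad_q_theta(s, a)

# Example: learn parameters on a scalar linear system s' = 0.9*s + a,
# minimizing an economic-style stage cost (a stand-in for negative profit).
rng = np.random.default_rng(0)
theta = np.zeros(3)
actions = np.linspace(-1.0, 1.0, 21)
s = 1.0
for _ in range(500):
    a = actions[np.argmin([q_theta(theta, s, an) for an in actions])]
    if rng.random() < 0.1:                 # epsilon-greedy exploration
        a = rng.choice(actions)
    s_next = 0.9 * s + a
    stage_cost = s**2 + 0.1 * a**2
    theta = td_update(theta, s, a, stage_cost, s_next, actions)
    s = s_next
```

In the thesis's setting, the greedy action would instead come from solving the ENMPC problem, and the same TD-style update would adjust the ENMPC and model parameters online, which is what allows the scheme to compensate for model mismatch.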