Both model predictive control (MPC) and reinforcement learning (RL) have shown promising results in the control of traffic signals in urban traffic networks. Both approaches, however, have drawbacks. MPC controllers are not adaptive and therefore perform suboptimally in the presence of the uncertainties that are always present in urban traffic systems. Although very advanced prediction models for urban traffic signal control exist, these models come at a price: the computational complexity of an MPC controller increases with the accuracy of its model. RL techniques involve a time-consuming and data-dependent offline computation, as the agent has to go through a training process. This training process is also the main reason why RL techniques have not been deployed in real-world urban traffic systems: through exploration during training, the controller may cause suboptimal and potentially unacceptably bad performance of the system. Moreover, most RL techniques do not come with stability or feasibility guarantees. To mitigate these drawbacks, the model-reference RL adaptive control framework is introduced, in which RL is used to obtain an adaptive law that adjusts a stable baseline controller so that the system follows a given reference. This thesis focuses on the design and analysis of this scheme, with MPC used to compute the baseline control input. The computed baseline control input, combined with the traffic model used, determines the reference state to be followed. In a case study, the training characteristics of the framework are compared to those of a conventional RL-based controller. In addition, the system performance of the framework is compared to that of a fixed-time controller, a conventional MPC controller, and a conventional RL-based controller. The simulation results show that the framework outperforms the RL-based controller in terms of performance during training and outperforms the MPC controller in terms of general simulation performance.
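
To make the control loop concrete, the following is a minimal Python sketch of the model-reference adaptive scheme described above. It is illustrative only: the linear nominal model, the stand-in baseline law (a simple stabilizing state feedback in place of a receding-horizon MPC optimization), and the fixed proportional correction (in place of a learned RL adaptive law) are all assumptions, not the models or controllers studied in the thesis.

import numpy as np

rng = np.random.default_rng(0)
A, B = 0.9 * np.eye(2), 0.1 * np.eye(2)

def nominal_model(x, u):
    # Nominal prediction model of the traffic dynamics (assumed linear here).
    return A @ x + B @ u

def true_plant(x, u):
    # "Real" system: nominal dynamics plus an unmodeled disturbance.
    return nominal_model(x, u) + 0.05 * rng.standard_normal(2)

def baseline_controller(x):
    # Stand-in for the stable MPC baseline: a stabilizing state feedback.
    return -2.0 * x

def adaptive_law(tracking_error):
    # Stand-in for the RL-derived adaptive law: a proportional correction.
    return 1.0 * tracking_error

x = np.array([1.0, -0.5])   # plant state, e.g. queue-length deviations
x_ref = x.copy()            # reference state
for _ in range(50):
    u_base = baseline_controller(x_ref)
    # The baseline input and the nominal traffic model together define
    # the reference state trajectory to be followed.
    x_ref_next = nominal_model(x_ref, u_base)
    # The adaptive term adjusts the baseline input so that the true plant
    # tracks the reference despite the unmodeled disturbance.
    u = u_base + adaptive_law(x_ref - x)
    x = true_plant(x, u)
    x_ref = x_ref_next
print("final tracking error:", x_ref - x)

In the framework itself, the baseline input would come from an MPC optimization and the correction term would be learned online by an RL agent; both are replaced here by fixed laws purely to keep the sketch self-contained.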