Traffic congestion remains a critical challenge with profound economic, safety, and environmental implications, exacerbated by the continuous growth of global population and vehicle densities. Expanding road infrastructure is often impractical due to excessive costs and spatial constraints. In response, intelligent traffic management strategies have gained prominence, leveraging advanced control methodologies to optimize flow and mitigate congestion. Among these, Model Predictive Control (MPC) has been extensively applied due to its capacity to handle complex, constrained systems. However, its effectiveness inherently depends on the accuracy of the underlying prediction model. Traditional system identification yields models with inherent uncertainty, accurate only up to a given confidence level, which can degrade control performance under dynamic traffic conditions. To overcome this limitation, learning-based MPC integrates machine learning techniques, particularly Reinforcement Learning (RL), to enhance adaptability and closed-loop performance. This integration can yield more responsive and robust traffic management systems capable of handling uncertainties and dynamic environments.
The integration of MPC and RL has emerged as a compelling approach to controlling complex nonlinear systems, capitalizing on the strengths of both methodologies: RL continually improves control policies based on observed system dynamics and performance feedback, while MPC computes optimal control actions over a finite prediction horizon subject to system constraints. This thesis investigates the integration of RL within the MPC framework for highway traffic flow optimization, focusing specifically on the dynamic regulation of Variable Speed Limits (VSL) and Ramp Metering (RM). The proposed approach is evaluated against three benchmarks: a theoretically optimal MPC controller with perfect system knowledge, assuming exact parameters; an MPC controller with an imperfect prediction model, whose parameters deviate from their true values and thus lead to suboptimal control decisions; and the MPC-RL controller, which adjusts the learnable parameters during training to improve overall closed-loop performance.
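The interplay between the three benchmarks can be illustrated with a deliberately minimal sketch. The toy model below is hypothetical (a single scalar density state, not the thesis's actual traffic model), and the names `TRUE_BETA`, `beta_hat`, `mpc_action`, and the gradient-style update are illustrative assumptions: the controller plans with an estimated outflow gain, and a simple prediction-error update stands in for the RL adjustment of the learnable parameters.

```python
import numpy as np

# Hypothetical toy setting: density rho of one highway segment evolves with an
# unknown outflow gain beta; u in [0, 1] plays the role of a VSL/RM action.
TRUE_BETA = 0.8   # plant parameter, unknown to the controller
DT, HORIZON = 1.0, 5

def step(rho, u, beta, inflow=0.3):
    """One-step density update of the toy model."""
    return rho + DT * (inflow - beta * u * rho)

def mpc_action(rho, beta_hat, target=0.5):
    """Enumerate candidate actions and return the one minimizing the predicted
    squared deviation from the target density over the horizon (a stand-in for
    a proper constrained MPC solve)."""
    best_u, best_cost = 0.0, float("inf")
    for u in np.linspace(0.0, 1.0, 21):
        r, cost = rho, 0.0
        for _ in range(HORIZON):
            r = step(r, u, beta_hat)
            cost += (r - target) ** 2
        if cost < best_cost:
            best_u, best_cost = u, cost
    return best_u

# Closed loop: the MPC starts from a poor estimate (the "imperfect model"
# benchmark) and adapts it online, mimicking the MPC-RL scheme.
rho, beta_hat, lr = 0.9, 0.3, 0.5
for _ in range(30):
    u = mpc_action(rho, beta_hat)
    rho_next = step(rho, u, TRUE_BETA)        # plant uses the true parameter
    err = step(rho, u, beta_hat) - rho_next   # one-step prediction error
    # Gradient step on err**2 w.r.t. beta_hat (d err / d beta_hat = -DT*u*rho).
    beta_hat += lr * err * DT * u * rho
    rho = rho_next

print(f"adapted estimate: {beta_hat:.2f}")  # should move from 0.3 toward TRUE_BETA
```

With `beta_hat` fixed at `TRUE_BETA` the loop reduces to the perfect-knowledge benchmark, and with the update line removed it reduces to the imperfect-model benchmark, so all three cases in the comparison differ only in how the prediction-model parameters are treated.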