Optimizing Matching Time Intervals in Ride-Hailing and Ride-Pooling Services Using Reinforcement Learning

Abstract

Efficiently matching passengers with drivers is crucial to the performance of ride-hailing and ride-pooling services, particularly the choice of matching time intervals. Traditional approaches rely on fixed-interval or real-time (instantaneous) matching and do not dynamically optimize these intervals. This study introduces a dynamic optimization strategy based on reinforcement learning, applying the Proximal Policy Optimization (PPO) algorithm to adjust matching intervals flexibly under varying supply and demand conditions, with the aim of reducing passenger waiting times and improving vehicle utilization.
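
The abstract itself contains no implementation detail; purely as an illustration of the approach it describes, the sketch below shows how a PPO agent could be trained to pick the next matching-interval length. Everything here is an assumption for illustration: the toy environment, its observation features (normalized counts of waiting requests and idle vehicles plus time of day), the candidate interval set, and the placeholder reward. It is not the simulator or the agent developed in the thesis.

```python
# Illustrative sketch only: a toy environment in which the agent chooses
# how long to accumulate requests before running a batch matching round.
# The dynamics below are placeholders, not the thesis simulator.
import numpy as np
import gymnasium as gym
from gymnasium import spaces
from stable_baselines3 import PPO

class MatchingIntervalEnv(gym.Env):
    """At each decision point the agent picks the next matching interval."""

    INTERVALS = [5, 10, 20, 30, 60]  # hypothetical candidate intervals (s)

    def __init__(self):
        super().__init__()
        self.action_space = spaces.Discrete(len(self.INTERVALS))
        # Observation: [waiting requests, idle vehicles, time of day], all scaled to [0, 1].
        self.observation_space = spaces.Box(0.0, 1.0, shape=(3,), dtype=np.float32)
        self.rng = np.random.default_rng(0)

    def _obs(self):
        return np.array(
            [self.waiting / 50.0, self.idle / 50.0, self.t / 3600.0],
            dtype=np.float32,
        )

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        self.t = 0.0
        self.waiting = int(self.rng.integers(0, 50))
        self.idle = int(self.rng.integers(0, 50))
        return self._obs(), {}

    def step(self, action):
        dt = self.INTERVALS[action]
        self.t += dt
        # Placeholder dynamics: demand arrives during the interval, then one
        # batch match clears min(waiting, idle) pairs. Longer intervals pool
        # more requests but make everyone in the batch wait longer.
        self.waiting += int(self.rng.poisson(0.5 * dt))
        matched = min(self.waiting, self.idle)
        self.waiting -= matched
        # Reward matches made, penalize requests still left waiting.
        reward = matched - 0.01 * dt * self.waiting
        self.idle = int(self.rng.integers(0, 50))
        terminated = self.t >= 3600.0
        return self._obs(), float(reward), terminated, False, {}

model = PPO("MlpPolicy", MatchingIntervalEnv(), verbose=0)
model.learn(total_timesteps=10_000)
```

At deployment time, `model.predict(obs)` would return the index of the interval to use for the next matching round.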

An efficient simulator is developed in this research to model the spatiotemporal matching process of ride-hailing and ride-pooling services. In experiments, the reinforcement learning-based matching strategy is compared against fixed-interval and real-time matching baselines; the RL-based strategy outperforms both on key metrics such as total passenger waiting time and detour delay. In addition, potential-based reward shaping effectively mitigates the sparse-reward problem, further improving the model's learning efficiency. The findings confirm that dynamically optimizing matching intervals through reinforcement learning can improve the overall efficiency of ride-hailing services in dynamic supply-demand environments, and that the approach is scalable and adaptable.
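
For context on the last point: potential-based reward shaping (Ng et al., 1999) adds to the environment reward a term of the form γΦ(s') − Φ(s) for some potential function Φ over states, which provably leaves the optimal policy unchanged while densifying the learning signal. The abstract does not say which potential the thesis uses; the sketch below is a generic version in which the potential (negative number of waiting requests) and the discount factor are illustrative assumptions.

```python
# Generic potential-based reward shaping: r' = r + gamma * phi(s') - phi(s).
GAMMA = 0.99  # discount factor (assumed)

def phi(state):
    """Potential function: fewer waiting requests -> higher potential.
    This choice is an illustrative assumption, not the thesis's potential."""
    waiting_requests, _idle_vehicles = state
    return -float(waiting_requests)

def shaped_reward(reward, state, next_state, done, gamma=GAMMA):
    # Conventionally the potential of a terminal state is taken to be 0,
    # so that the shaping term preserves the optimal policy.
    next_phi = 0.0 if done else phi(next_state)
    return reward + gamma * next_phi - phi(state)
```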
