Logistics and mobility services play a major role in our society, and efficient routing is a crucial part of them. However, even though routing problems have been widely researched, the solutions provided by algorithms do not always match drivers' expectations. The routing costs used by these algorithms are often based on one or a few parameters, whereas in real-world operations many factors, some of them hard to define, contribute to the actual cost of a route. Drivers can take these aspects into account, and some studies have found that experienced drivers often plan better delivery routes than optimization tools. In this research, we use expert decision data as examples for learning routing costs and for training a policy whose decisions are more in line with the expert's expectations. We formulate state-action representations for the traveling salesman problem (TSP) and the capacitated vehicle routing problem (CVRP), which we use to interpret these routing problems as inverse optimization and multiclass classification problems. Additionally, we propose multiple policy training approaches as well as state feature vector transformations that can be selected based on the characteristics of the routing problem. These training configurations are used to train several existing algorithms on training data sets consisting of example state-action pairs. The optimal solution acts as the expert example and is used to create the training data. The performance of the trained models is compared with each other and with the optimal solution. We demonstrate that both inverse optimization and multiclass classification algorithms are able to imitate expert decision-making for new problem instances from example data. However, we also show a large variation in performance depending on the problem, state features, algorithm formulation, and training configuration.
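To make the notion of example state-action pairs concrete, the following minimal sketch shows one way an expert TSP tour could be decomposed into training examples for a multiclass classifier. The state representation used here (current node plus the set of unvisited nodes) and the function name `tour_to_state_action_pairs` are illustrative assumptions, not the state feature vectors defined in this work.

```python
from typing import List, Tuple

# Assumed state representation: (current node, sorted tuple of unvisited nodes).
State = Tuple[int, Tuple[int, ...]]


def tour_to_state_action_pairs(tour: List[int]) -> List[Tuple[State, int]]:
    """Split an expert tour into (state, action) examples.

    The action paired with each state is the node the expert visited next,
    which serves as the class label in a multiclass classification setting.
    """
    pairs = []
    for step in range(len(tour) - 1):
        current = tour[step]
        unvisited = tuple(sorted(tour[step + 1:]))
        action = tour[step + 1]
        pairs.append(((current, unvisited), action))
    return pairs


# Hypothetical expert visiting order on four nodes, starting at node 0.
expert_tour = [0, 2, 3, 1]
pairs = tour_to_state_action_pairs(expert_tour)
for state, action in pairs:
    print(state, "->", action)
```

Each pair can then be mapped to a feature vector and a label, so that any standard classifier can be fitted to predict the expert's next node from the current state.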