The fundamental challenge of the origin-destination (OD) matrix estimation problem is that it is severely under-determined. In this paper we propose a new data-driven OD estimation method for cases where a supply pattern in the form of speeds and flows is available. We show that with these input data we need neither an iterative dynamic network loading procedure that results in an equilibrium assignment, nor an assumption on the kind of equilibrium that emerges from this process. The minimal ingredients needed are (a) a method to estimate or predict production and attraction time series; (b) a method to compute the N shortest paths between each pair of OD zones; and (c) two assumptions, possibly OD-specific: one on the magnitude of N, and one on the proportionality of path flows between these origins and destinations. The latter is the most important behavioral assumption in our method, since it encodes how we assume travelers have chosen their routes between OD pairs. We choose path-flow proportions that are inversely proportional to realized travel time and that incorporate a penalty for path overlap. For large networks, these ingredients may be insufficient to solve the resulting system of equations. We show how additional constraints can be derived directly from the data using principal component analysis, which exploits the fact that temporal patterns of production and attraction are similar across the network. Experimental results on a toy network and a large city network (Santander, Spain) show that our OD estimation method works satisfactorily, given a reasonable choice of N and the use of so-called 3D supply patterns, which provide a compact representation of the supply dynamics over the entire network. Inclusion of topological information makes the method scalable both in terms of network size and for different topologies.
Although we use a neural network to predict production and attraction in our experiments (which implies ground-truth OD data were needed), there are straightforward ways to improve the method using additional data, such as demographic data, household survey data, social media, and/or movement traces, which could support estimating such ground-truth baseline production and attraction patterns. The proposed framework would fit well in an online traffic modeling and control framework, and we see many paths to further refine and improve the method.
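
To make the route-split assumption concrete, the following minimal Python sketch computes path shares for a single OD pair that are inversely proportional to realized travel time and penalized for overlap. The abstract does not specify the form of the overlap penalty, so the path-size-style factor used here is an assumption, not the method of the paper. The function name `path_shares`, the toy links, and all numbers are likewise hypothetical.

```python
import numpy as np


def path_shares(travel_times, paths, link_lengths):
    """Split the demand of one OD pair over its N candidate paths.

    travel_times : realized travel time per path (same order as `paths`)
    paths        : list of paths, each a list of link ids
    link_lengths : dict mapping link id -> length

    ASSUMPTION: the overlap penalty is a path-size-style factor; the
    source only states that some penalty for path overlap is applied.
    """
    travel_times = np.asarray(travel_times, dtype=float)

    # Count how many of the candidate paths use each link.
    usage = {}
    for path in paths:
        for link in set(path):
            usage[link] = usage.get(link, 0) + 1

    # Path-size factor: 1.0 for a path that shares no links with the
    # others, and smaller the more of its length is shared.
    path_size = np.array([
        sum(link_lengths[a] / usage[a] for a in path)
        / sum(link_lengths[a] for a in path)
        for path in paths
    ])

    # Weight is inversely proportional to travel time, scaled by the penalty.
    weights = path_size / travel_times
    return weights / weights.sum()


# Toy example (hypothetical data): paths 0 and 1 share link "a",
# path 2 is fully independent; all three have the same travel time.
link_lengths = {"a": 1.0, "b": 1.0, "c": 1.0, "d": 2.0}
paths = [["a", "b"], ["a", "c"], ["d"]]
travel_times = [10.0, 10.0, 10.0]

shares = path_shares(travel_times, paths, link_lengths)
od_demand = 100.0                 # hypothetical demand for this OD pair
path_flows = od_demand * shares   # demand assigned to each path
print(shares, path_flows)
```

In this toy case the two overlapping paths each receive a smaller share than the independent path, even though all travel times are equal, which is the intended effect of an overlap penalty. With such shares fixed, path flows become linear in the unknown OD demands, which is consistent with the abstract's description of a system of equations that needs no iterative network loading.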