Graph Neural Networks Training Set Analysis

Effect of Training Data Size

Abstract

As graph neural networks (GNNs) rapidly gain popularity for traffic forecasting, understanding the inner workings of these complex models becomes increasingly important. This experiment aims to deepen our understanding of how much the training data matters for the ability of GNNs to predict traffic accurately. The same GNN model is trained repeatedly on training datasets spanning different time frames, and standard performance metrics computed on the model's predictions are compared. The paper concludes that while using less training data leads to a slight decrease in performance, the size of this effect depends heavily on the quality of the dataset. If the data gathering process is short and the sensors are not properly maintained, GNNs are not able to predict traffic accurately. On the other hand, if the data gathering process goes well and there are few missing values, GNNs perform well even when trained with smaller amounts of historical data.
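
The comparison described in the abstract can be illustrated with a minimal sketch. It is not the code used in the paper. It assumes a single sensor's readings in a flat list, uses `None` as a hypothetical marker for missing values, and replaces the GNN with a simple time-of-day average so that the example stays self-contained. All function names are illustrative. The point it shows is the experimental loop: hold the test period fixed, vary only the length of the training window, and score each run with metrics that ignore missing ground-truth values.

```python
import math

MISSING = None  # hypothetical marker for a sensor reading that was not recorded


def masked_metrics(y_true, y_pred):
    """MAE, RMSE and MAPE over the positions where the ground truth is present."""
    pairs = [(t, p) for t, p in zip(y_true, y_pred) if t is not MISSING]
    if not pairs:
        raise ValueError("no observed values to evaluate on")
    abs_err = [abs(t - p) for t, p in pairs]
    mae = sum(abs_err) / len(pairs)
    rmse = math.sqrt(sum(e * e for e in abs_err) / len(pairs))
    # MAPE is undefined for zero targets, so those positions are skipped.
    pct = [abs(t - p) / abs(t) for t, p in pairs if t != 0]
    mape = 100.0 * sum(pct) / len(pct) if pct else float("nan")
    return {"mae": mae, "rmse": rmse, "mape": mape}


def fit_time_of_day_mean(train, steps_per_day):
    """Stand-in for model training (NOT a GNN): mean observed value per daily slot."""
    sums = [0.0] * steps_per_day
    counts = [0] * steps_per_day
    for i, value in enumerate(train):
        if value is not MISSING:
            sums[i % steps_per_day] += value
            counts[i % steps_per_day] += 1
    observed = [s / c for s, c in zip(sums, counts) if c]
    fallback = sum(observed) / len(observed) if observed else 0.0
    # Slots with no observed readings fall back to the overall mean.
    return [s / c if c else fallback for s, c in zip(sums, counts)]


def compare_training_windows(series, steps_per_day, test_days, window_days_list):
    """Train on the most recent `days` of history for each window; test set is fixed."""
    if len(series) % steps_per_day != 0:
        raise ValueError("series must cover whole days")
    test_len = test_days * steps_per_day
    history, test = series[:-test_len], series[-test_len:]
    results = {}
    for days in window_days_list:
        train = history[-days * steps_per_day:]
        profile = fit_time_of_day_mean(train, steps_per_day)
        preds = [profile[i % steps_per_day] for i in range(len(test))]
        results[days] = masked_metrics(test, preds)
    return results
```

Calling `compare_training_windows(series, steps_per_day, test_days, [7, 30, 90])` returns one metrics dictionary per training window length, which makes the effect of shrinking the training set directly comparable. In a real study the stand-in predictor would be replaced by training the GNN from scratch on each window.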

Files