Influence of tracking duration on the privacy of individual mobility graphs

Abstract

Location graphs, compact representations of human mobility without geocoordinates, can be used to personalise location-based services. Although they are more privacy-preserving than raw tracking data, they have been shown to still carry a considerable risk of users being re-identified solely from the graph topology. However, it is unclear how this risk depends on the tracking duration. Here, we consider a scenario in which an attacker wants to match a user's new tracking data to a pool of previously recorded mobility profiles, and we analyse how the re-identification performance depends on the tracking duration. We find that the re-identification accuracy varies between 0.41% and 20.97% and is affected by both the pool duration and the test-user tracking duration. Accuracy is higher when both have the same duration. It is not significantly affected by socio-demographics such as age or gender, but can to some extent be explained by different mobility and graph features. Overall, the influence of tracking duration on user privacy has clear implications for data collection and storage strategies. We advise data collectors to limit the tracking duration or to reset user IDs regularly when storing long-term tracking data.
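
To make the attack scenario concrete, the following is a minimal sketch of matching a test user's location graph against a pool of stored graphs using topology alone. It is an illustration under stated assumptions, not the method evaluated in this work: the normalised degree signature, the L1 distance, and the names `degree_signature`, `reidentify` and `top1_accuracy` are all hypothetical choices made for this example.

```python
from collections import Counter


def degree_signature(edges, k=5):
    """Topology-only signature (assumed for illustration): the k largest
    node degrees as shares of the total degree, zero-padded to length k.

    `edges` is an iterable of (origin, destination) pairs between
    anonymous location IDs; no geocoordinates are involved.
    """
    degree = Counter()
    for origin, destination in edges:
        degree[origin] += 1
        degree[destination] += 1
    total = sum(degree.values()) or 1
    # Normalising by the total keeps graphs built from different
    # tracking durations on a comparable scale.
    shares = [d / total for d in sorted(degree.values(), reverse=True)[:k]]
    return shares + [0.0] * (k - len(shares))


def distance(sig_a, sig_b):
    """L1 distance between two signatures of equal length."""
    return sum(abs(a - b) for a, b in zip(sig_a, sig_b))


def reidentify(test_edges, pool):
    """Return the pool user ID whose graph signature is closest to the test graph."""
    test_sig = degree_signature(test_edges)
    return min(pool, key=lambda uid: distance(test_sig, degree_signature(pool[uid])))


def top1_accuracy(test_graphs, pool):
    """Fraction of test users whose closest pool graph carries their own ID."""
    hits = sum(reidentify(edges, pool) == uid for uid, edges in test_graphs.items())
    return hits / len(test_graphs)
```

In this sketch, `pool` and `test_graphs` both map a user ID to an edge list, with the pool built from earlier tracking data and the test graphs from later data. Varying how much tracking data goes into each side would be one way to probe the dependence on duration described above.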