In criminal investigations, individuals may be connected to illicit activities by linking their personal phone to an otherwise anonymous, crime-related phone. Several methods have been published that use cell tower registrations to differentiate between same-user and different-user scenarios for the two phones. However, criminals' movement patterns and phone usage may differ from those of the test subjects on which these methods were developed and evaluated, and it is unclear whether the methods are robust to such differences in behavior. The scarcity of readily available datasets on criminals' movements and phone usage further complicates this issue.
Lacking precise knowledge of the behavior of the population of interest, we propose a robustness analysis. We present a tool for generating synthetic datasets, based on well-established models of individual movement. We used the tool to generate data for a range of behavioral properties, covering variations in both underlying movement and phone usage, and evaluated three existing methods on these synthetic data. The first is a discriminatory approach that learns typical movement patterns and phone usage from a reference dataset. The second uses a model of cell tower behavior and makes minimal assumptions about user behavior by choosing pairs of registrations that are close in time. The third is a generic statistical method for comparing event data. Additionally, we present a fourth method that combines the latter two, since they conceptually draw on different aspects of the data.
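To illustrate what "choosing pairs of registrations close in time" can look like, the following is a minimal sketch and not the evaluated method itself. It assumes each registration is a `(timestamp, cell_id)` tuple, pairs each registration of one phone with the nearest-in-time registration of the other, and keeps the pair only if the gap is within a threshold. The function name `close_in_time_pairs`, the 10-minute default, and the example data are all hypothetical.

```python
from datetime import datetime, timedelta

def close_in_time_pairs(regs_a, regs_b, max_gap=timedelta(minutes=10)):
    """Pair registrations of phone A with the nearest-in-time registration of phone B.

    Each registration is assumed to be a (timestamp, cell_id) tuple.
    The pairing rule and the max_gap threshold are illustrative assumptions.
    """
    if not regs_b:
        return []
    pairs = []
    for t_a, cell_a in regs_a:
        # Find the registration of phone B closest in time to this one.
        t_b, cell_b = min(regs_b, key=lambda r: abs(r[0] - t_a))
        # Keep the pair only if the two registrations are close enough in time.
        if abs(t_b - t_a) <= max_gap:
            pairs.append(((t_a, cell_a), (t_b, cell_b)))
    return pairs

# Hypothetical example data: two registrations per phone.
regs_a = [(datetime(2024, 1, 1, 12, 0), "A1"), (datetime(2024, 1, 1, 15, 0), "A2")]
regs_b = [(datetime(2024, 1, 1, 12, 5), "B1"), (datetime(2024, 1, 1, 18, 0), "B2")]
pairs = close_in_time_pairs(regs_a, regs_b)
```

In this example only the 12:00 and 12:05 registrations form a pair; the 15:00 registration has no counterpart within the threshold. A scoring model would then compare the cell towers within each retained pair.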
Our analysis reveals that the discriminatory method performs best in a baseline scenario but is most sensitive to behavioral deviations. The cell tower method shows the lowest baseline performance yet is the most resilient to variations. The generic statistical method falls between the two in both performance and sensitivity. Given the importance of robustness in evaluating evidence, we recommend using the combined approach, which is both reliable and effective across the variations we defined.