Communicable diseases have burdened societal well-being throughout history. The global advent of the COVID-19 virus made the world painfully aware of the impact a pandemic can have on all levels of society. To combat the spread and impact of a pathogen, policymakers turn to specialized tools to craft policy. One of these tools is mathematical modeling and simulation. Using the agent-based modeling paradigm, it is possible to evaluate the impact of policy measures on viral spread and healthcare burden. An agent-based model always contains a set of agents. Agents are the digital representations of a studied entity, for example a person, but an agent can also be an organization or a vehicle. One of the characteristics of agent-based modeling is its reliance on data to create a valid agent population. However, data at the level of individual people, also known as microdata, can be difficult to acquire and work with due to the GDPR. Aggregated population data, also known as open data, may serve as a substitute for microdata. Both open data and microdata may be used to synthesize a population of agents; this process is known as synthetic population generation. Although both forms of data can be used to synthesize an agent population, it is currently understudied how the type of population data used affects the results of the synthesis process. This research project addresses this knowledge gap and systematically analyzes the differences between an open-data-based and a microdata-based agent population. The synthesized populations are part of the HERoS model, an epidemiological agent-based model that maps the spread of COVID-19 in the city of The Hague. The agents in the HERoS model are digital individuals, each characterized by a set of attributes. These attributes define the behavior of an agent within the model.
Examples of agent attributes in the HERoS model are age, household size, and social role. Two agent populations for the HERoS model are synthesized: one using a sample-free algorithm that uses open data, and one using an algorithm that converts microdata to an agent population. The differences between the agent populations are quantified by introducing two new metrics: the modified Freeman-Tukey statistic and agent matching. It is found that the agent populations show similar characteristics at the city level, but differ significantly at the neighborhood level. Furthermore, it is shown that the quality of an agent population depends on both the number and the type of attributes that are synthesized. Moreover, the relationship between the quality of an agent population and the outcome a model produces is also understudied. This relationship is investigated by using the open-data-based and microdata-based agent populations as input for the HERoS model. The model outcomes are then compared at the global and neighborhood levels, using phase comparison and a quantification of the precision of the model. It is found that the effect of using open data instead of microdata is multifaceted. The model outcome curves show similar characteristics, although they differ numerically. Using open data increases the number of hospitalizations, ICU occupants, and deaths in the HERoS model. Furthermore, the precision of the model decreases when open data is used. Finally, it is observed that neighborhoods that differ significantly in input population also differ significantly in model outcomes, showing the importance of the quality of an agent population at finer spatial resolutions. In conclusion, this research project shows that open data can serve as a viable alternative to microdata when it comes to synthesizing agent populations for epidemiological agent-based models.