Inferring the residential building type from 3DBAG

Abstract

Urban Energy Modeling (UEM) provides a comprehensive approach to urban planning, helping to create sustainable, resilient, and energy-efficient cities that meet the needs of current and future generations. The key inputs for UEM methodologies and tools are the geometry of the building stock and its thermophysical properties. In the Netherlands, the 3DBAG provides the building stock geometry, while the thermophysical properties can be approximated using energy consumption estimates specific to each residential building type from the IEE project TABULA. However, a challenge arises as open data on residential building types at a national level is currently not readily available, necessitating the development of a method to infer this information from other accessible data sources.

Following similar studies that applied it successfully, this thesis uses machine learning to address this challenge. Support Vector Machine (SVM) and Random Forest (RF) algorithms were tested and compared. These algorithms were trained using data on residential building types obtained from the Rijssen-Holten energy testbed and EP-online, with the latter requiring preprocessing to obtain the relevant information. Additionally, 25 features derived from cadastral data and building geometry underwent a selection process to identify the features essential for accurate classification of building types.
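The comparison described above can be illustrated with a minimal sketch. It assumes scikit-learn and uses synthetic data in place of the real training sets; the feature selection method (`SelectKBest` with `f_classif`), the choice of `k=10`, and all hyperparameters are illustrative assumptions, not the procedure used in the thesis.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for 25 cadastral and geometric features (not thesis data).
X, y = make_classification(
    n_samples=600,
    n_features=25,
    n_informative=8,
    n_classes=4,
    n_clusters_per_class=1,
    random_state=0,
)

# Each pipeline first keeps the 10 highest-scoring features (assumed value),
# then fits the classifier. The SVM pipeline also standardises the features,
# because SVMs are sensitive to feature scale.
models = {
    "RF": make_pipeline(
        SelectKBest(f_classif, k=10),
        RandomForestClassifier(n_estimators=100, random_state=0),
    ),
    "SVM": make_pipeline(
        StandardScaler(),
        SelectKBest(f_classif, k=10),
        SVC(kernel="rbf", C=1.0),
    ),
}

# Mean balanced accuracy over 5 stratified cross-validation folds per model.
scores = {
    name: cross_val_score(model, X, y, cv=5, scoring="balanced_accuracy").mean()
    for name, model in models.items()
}

for name, score in scores.items():
    print(f"{name}: balanced accuracy = {score:.3f}")
```

Placing the selection step inside each pipeline means it is refitted on every training fold, so the held-out fold does not influence which features are kept.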

Eight models were trained and applied across eight case studies, each covering a subset of the Netherlands chosen to be representative of the whole country. The combined results were analyzed to determine the necessary features, the required data, and the most suitable machine learning approach for this research.

The results revealed that features such as adjacency to other buildings, width, and volume in LoD2.2 correlated the most with Dutch residential building types. However, critical features like the number of storeys, the presence of an open porch, or galleries were not available as open data, even though they directly relate to the definition of certain residential building types.

The resulting models achieved accuracies between 61.1% and 98.5% and balanced accuracies between 51.6% and 94.2%. Performance differed between case studies, particularly in the accuracy for multi-dwelling versus single-dwelling houses. Despite their longer tuning and training times, the RF models were generally more suitable and more accurate than the SVM models.
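The gap between the two metrics can be illustrated with a small worked example. It assumes the common definition of balanced accuracy as the mean of per-class recall; the class names and labels below are hypothetical and are not taken from the thesis data.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of all samples that were classified correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so every class counts equally."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


# Hypothetical imbalanced sample: 8 terraced houses and 2 apartments.
y_true = ["terraced"] * 8 + ["apartment"] * 2
# The classifier gets every terraced house right but misses one apartment.
y_pred = ["terraced"] * 8 + ["terraced", "apartment"]

print(accuracy(y_true, y_pred))           # 0.9
print(balanced_accuracy(y_true, y_pred))  # 0.75
```

Under this definition, a model that favours the majority building type can score high on accuracy while its balanced accuracy stays noticeably lower.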

These findings highlight the capacity of machine learning to attain robust classification outcomes for the Dutch building stock when trained on representative datasets. Nevertheless, the accuracy of the results depends on data quality, and difficulties may arise for intricate buildings with multiple components and for building types with ambiguous classification rules.