Semantic understanding of urban scenes from textured meshes


Abstract

The thesis explores the semantic understanding of urban textured meshes derived from photogrammetric methods. It addresses three aspects of urban textured meshes: 1) semantic annotation and the creation of benchmark datasets, 2) semantic segmentation, and 3) the automation of lightweight 3D city modeling using semantic information.

The first focus of the thesis is the development of a benchmark dataset for evaluating advanced 3D semantic segmentation methods in urban settings. An interactive 3D annotation framework is proposed to assign ground-truth labels to the triangle faces and texture pixels of urban meshes. The framework achieves efficient and accurate semi-automatic annotation through segment classification and structure-aware interactive selection. In the center of Helsinki, Finland, object-level annotations were made over approximately 4 km\(^2\) (covering buildings, vegetation, vehicles, etc.) and part-level annotations over about 2.5 km\(^2\) (covering building parts such as doors and windows, as well as road markings). The design of the annotation tools streamlines user interaction and enables rapid annotation of large scenes, while the resulting datasets allow researchers to refine their deep learning models for urban analysis.
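The efficiency gain of segment-based annotation can be illustrated with a minimal sketch (a hypothetical data layout, not the thesis implementation): when the mesh is pre-partitioned into segments, labeling a single face propagates the label to the whole segment, so large planar regions need only one interaction.

```python
# Hypothetical over-segmentation result: each triangle face is assigned
# to a segment by a preprocessing step (assumed given here).
face_to_segment = {0: "seg_a", 1: "seg_a", 2: "seg_a", 3: "seg_b", 4: "seg_b"}

def annotate(face_id, label, face_to_segment, labels):
    """Assign `label` to every face sharing the clicked face's segment."""
    target = face_to_segment[face_id]
    for face, segment in face_to_segment.items():
        if segment == target:
            labels[face] = label
    return labels

labels = {}
annotate(0, "building", face_to_segment, labels)    # one click labels all of seg_a
annotate(3, "vegetation", face_to_segment, labels)  # one click labels all of seg_b
print(labels)
# {0: 'building', 1: 'building', 2: 'building', 3: 'vegetation', 4: 'vegetation'}
```

In practice the interactive selection would also let the user split or merge segments where the over-segmentation is imperfect; the sketch only shows the label-propagation idea.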

Another research focus is mesh segmentation. A novel semantic mesh segmentation algorithm is introduced for large-scale urban environments, combining plane-sensitive over-segmentation with graph-based integration of contextual information. Using graph convolutional networks for classification, this approach significantly outperforms traditional techniques on the proposed benchmark datasets.
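The graph-based classification step can be sketched as a single toy graph-convolution layer over a segment adjacency graph, in the spirit of the standard GCN update \(H' = \mathrm{ReLU}(\hat{D}^{-1/2}\hat{A}\hat{D}^{-1/2} H W)\). The graph, features, and weights below are illustrative stand-ins, not values from the thesis.

```python
import numpy as np

# Adjacency of three mesh segments (segment 1 touches segments 0 and 2).
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)

# Per-segment feature vectors (e.g. planarity, mean color) -- made up.
H = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])

# Learned weight matrix -- a random stand-in here.
W = np.array([[0.5, -0.2],
              [0.3,  0.8]])

A_hat = A + np.eye(3)                                 # add self-loops
D_inv_sqrt = np.diag(1.0 / np.sqrt(A_hat.sum(axis=1)))
H_next = np.maximum(0.0, D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W)

print(H_next.shape)  # (3, 2): each segment's features now mix in its neighbors'
```

Stacking such layers lets each segment's prediction depend on progressively larger neighborhoods, which is what injects scene context into the per-segment classification.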

Finally, leveraging this semantic information, a pipeline for reconstructing lightweight 3D city models is designed. It enables the automated reconstruction of CityGML-based LoD2 and LoD3 city models with high geometric fidelity and semantic accuracy. The resulting large-scale, lightweight, semantic city models significantly broaden applications in urban spatial intelligence, including automatic geometric measurement, interactive spatial computation, spatial analysis based on external data, and environment simulation using physics engines.
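A highly simplified sketch of semantics-driven reconstruction: a region labeled "building" contributes a 2D footprint that is extruded to a flat-roofed prism (a block model in the spirit of low-LoD CityGML buildings). The footprint coordinates and height are made-up inputs, and real LoD2/LoD3 reconstruction additionally fits roof planes and facade elements.

```python
def extrude_footprint(footprint, height):
    """Turn a 2D footprint polygon into the vertex list of a flat-roofed prism."""
    base = [(x, y, 0.0) for x, y in footprint]     # ground-level ring
    roof = [(x, y, height) for x, y in footprint]  # roof ring at `height`
    return base + roof

footprint = [(0, 0), (10, 0), (10, 6), (0, 6)]  # hypothetical footprint, meters
prism = extrude_footprint(footprint, height=12.0)
print(len(prism))  # 8 vertices: 4 on the ground, 4 on the roof
```

Because such a prism stores only a few vertices per building instead of thousands of textured triangles, the resulting model stays lightweight enough for interactive spatial computation at city scale.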

By semantically parsing urban textured meshes to generate lightweight 3D urban semantic models, this thesis enhances the practicality and usability of 3D data in real-world applications. It also lays a solid foundation for future progress in understanding, modeling, and analyzing 3D urban scenes.