nD-PointCloud Data Management
continuous levels, adaptive histograms, and diverse query geometries
More Info
expand_more
Abstract
In the Geomatics domain, a point cloud refers to a data set which records the coordinates and other attributes of a huge number of points. Conceptually, each of these attributes can be regarded as a dimension, representing a specific type of information. Apart from routinely concerned spatio-temporal dimensions for coordinates, other dimensions such as intensity and classification are also widely used in spatial applications. In fact, more dimensions can be involved. For instance, a point in the hydraulic modelling grid also records the flow direction, speed, sediment concentration, and other related attributes. As these point cloud data can be directly collected, computed, stored and analyzed, this thesis proposes the term – nD-PointCloud, as a general spatial data representation to cover them.
At present, drastically increasing production of nD-PointCloud data raises essential demand for smart and highly efficient data management and querying solutions. However, we lack effective tools. Prevalent software for nD-PointCloud processing, analyzing and rendering are built on file-based systems, requiring substantial development of data structures and algorithms. To make things worse, when other data types are involved, multiple formats, libraries and systems need enormous effort to be integrated. Aimed at generic support for diverse applications, DataBase Management Systems (DBMSs) on the other hand avoid these issues to a large extent. However, since they are initially developed to resolve 2D or 3D issues, they do not provide native support for nD data indexing and operations. Yet the 2D and 3D operators cannot be easily extended to nD.
This thesis aims at developing a generic yet efficient solution for managing and querying nD-PointCloud data. The work is based on an existing solution called PlainSFC, which maps nD data into 1D space. PlainSFC is implemented in the DBMS, adopting space filling curve based clustering and B+-tree indexing strategies. Besides, PlainSFC applies an advanced querying mechanism which recursively refines hypercubic nD spaces to 1D ranges to approach the query geometry for primary filtering. This achieves high querying efficiency. However, the solution still has drawbacks, and this research focuses on resolving them by developing and using novel methods:
• A continuous Level of Importance (cLoI) method for data organization to eliminate visual artifacts of density shocks in points' rendering, which is introduced by conventional tree structures such as Quadtree or Octree. The cLoI method computes an importance value for every point according to an ideal distribution generalized from the discrete distributions of those tree structures. This forms an additional cLoI dimension, and each point actually represents a level. By integrating the cLoI dimension into PlainSFC, smooth and efficient rendering is realized.
• An nD-histogram approach to improve querying efficiency on non-uniformly distributed data. PlainSFC decomposes the nD space into sub-spaces recursively to approach the query geometry without considering point distribution. This is not optimal when the distribution of points is severely skewed. To improve this, an nD-histogram which records the number of points inside each nD sub-space is established as a representation of data distribution. The developed solution called HistSFC decomposes and refines the nD space more smartly, which improves the accuracy and efficiency of primary filtering.
• A convex polytope querying function. Besides orthogonal window queries, the polytope query, which is the extension of the widely adopted polygonal query in 2D, also plays a critical role in many nD spatial applications. To address this type of query, an easy-to-use polytope formulation for querying is firstly proposed. Then, based on PlainSFC and HistSFC, efficient intersection algorithms are developed for convex polytope querying on nD point clouds. These algorithms are tested through experiments with up to 10D point data. Using this newly developed function, applications including perspective view selections and flood risk queries are resolved more efficiently, achieving sub-second performance.
Additionally, other optimization techniques such as parallelization are developed and experimented with, which also bring performance gain. To verify the whole framework, several benchmark tests devised by considering real applications are conducted, and comparisons with different state-of-the-art solutions are performed. The result shows that the newly developed solution outperforms the others, overall. In certain cases, the solution can be applied without further optimizations. However, this will not be the end. Rapidly arising high tech such as cloud computing platforms can boost the solution further to incorporate more data and users. Potential nD-PointCloud based applications still need to be explored, prototyped and tested to serve the society in practice.