Health monitoring: a machine learning approach for anomaly detection in multi-sensor networks
Abstract
Multi-sensor networks are increasingly used to assess the post-occupancy performance of smart buildings, since they enable continuous monitoring of occupancy, thermal comfort and indoor air quality at a high spatial resolution. An urgent but neglected topic in this field is the automated detection of sensor anomalies. For example, CO2 sensors can perform auto-calibration, during which their data is not reliable. If this poor reliability is not identified, any analysis based on the data may be misleading. Automated detection and diagnosis of multi-sensor anomalies is a challenging task due to the complex characteristics of each data point, the variety of data points and their sheer number. As a result, rule-based algorithms require an extensive set of expert-defined rules, which makes them sensitive to threshold values and case-specific exceptions. Machine learning algorithms can overcome these issues, but they require datasets with labelled sensor anomalies in order to perform diagnosis. Acquiring such labelled datasets is labour-intensive and therefore expensive. In this paper we show the potential of a transition from an unsupervised to a supervised machine learning approach. The unsupervised algorithm is used to detect anomalies and to identify anomaly classes of interest. This makes it possible to label such classes efficiently and to train classifiers for multiple classes of anomalies. The unsupervised and supervised algorithms are employed in parallel during the transition, allowing for the simultaneous detection of unknown anomaly classes and diagnosis of known anomaly classes. The improved performance of the combined classifier compared to unsupervised detection is shown by the precision-recall curve. Though the presented approach is rather generic, it does have some limitations. Because a window-based approach is used, only time windows can be flagged as anomalous, not the exact time at which an anomaly occurs. In addition, we focus on the detection of sudden anomalies; the approach does not allow for detecting stationary or trend anomalies.
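The window-based, unsupervised detection step can be illustrated with a minimal sketch. It is not the algorithm used in the paper: the feature (largest step between consecutive samples), the median/MAD scoring, the threshold of 3.5, the function names and the CO2 readings are all assumptions chosen for illustration. The sketch reflects the two limitations stated above: it reports anomalous windows rather than exact times, and it responds to sudden jumps but not to stationary offsets or slow trends.

```python
from statistics import median

def window_features(series, size):
    """Split a series into non-overlapping windows and return, per window,
    the largest absolute step between consecutive samples (assumed feature)."""
    feats = []
    for start in range(0, len(series) - size + 1, size):
        w = series[start:start + size]
        feats.append(max(abs(b - a) for a, b in zip(w, w[1:])))
    return feats

def robust_scores(feats):
    """Score each window by its distance from the median feature, in MAD units."""
    med = median(feats)
    # Fall back to a tiny value if the MAD is zero, to avoid division by zero.
    mad = median(abs(f - med) for f in feats) or 1e-9
    return [abs(f - med) / mad for f in feats]

def flag_windows(series, size, threshold=3.5):
    """Return indices of windows whose score exceeds the (assumed) threshold."""
    scores = robust_scores(window_features(series, size))
    return [i for i, s in enumerate(scores) if s > threshold]

# Hypothetical CO2 readings in ppm with a sudden drop inside the third window,
# loosely mimicking an auto-calibration event.
co2 = [600, 602, 601, 603, 602,
       604, 607, 605, 606, 608,
       607, 609, 420, 421, 423,
       422, 426, 423, 425, 424,
       425, 426, 425, 427, 426]

feats = window_features(co2, size=5)
flagged = flag_windows(co2, size=5)
print(feats)    # per-window largest step
print(flagged)  # indices of windows flagged as anomalous
```

Windows flagged in this way would be the candidates an expert inspects and labels, after which a supervised classifier could be trained on the labelled classes. Note that with non-overlapping windows a jump that falls exactly between two windows is not seen by this feature; overlapping windows would be one way to address that.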