Outlier detection is an essential part of modern systems. It is used to detect anomalies in the behaviour or performance of systems or subjects, such as fall detection in smartwatches or voltage irregularity detection in batteries. This provides early indications of potential problems.
An aspect of outlier detection that is rarely analysed is how algorithms perform in environments with data from a single subject versus environments with data from multiple subjects. This paper examines the performance of Gaussian Mixture Models (GMM) and DBSCAN in these different environments, focusing on time series data collected from consumer-grade wearables such as smartwatches. Because the data set used did not contain predefined outliers, the outliers are defined manually. This research considers both outliers defined within a subject's own data and the use of other subjects' data as outliers.
The results indicate that the number of subjects in the environment is not the sole factor in the performance of these algorithms. Rather, performance depends on the combination of the number of subjects and the type of outlier to be detected. A GMM has difficulty distinguishing similar subjects when another subject's data is used as outlier data. On average, DBSCAN outperforms a GMM in almost all cases and is considerably more consistent in its performance.