Deep Learning Architectures for PM2.5 and Visibility Predictions

Abstract

Severe air pollution in urban areas, and the low-visibility events that follow at airports, make it urgent to predict air quality and visibility so that their changing trends are better captured. However, variations in PM2.5 and visibility involve complicated physical and chemical processes, which make accurate prediction challenging.
In this thesis, methodologies for predicting PM2.5, PM10, and visibility using Long Short-Term Memory neural networks (LSTM NNs) were investigated. The first step of the proposed methodology was dataset analysis and preprocessing, an important step in almost all machine learning problems. Because missing, confusing, or incorrect data are common in large datasets, noise and errors were corrected first and missing rates were calculated. The datasets were then visualized to examine the missing-data patterns of different features. Because data exploration revealed strong spatiotemporal correlations, air quality features with high missing rates were imputed by linear interpolation when the gap was short and by k-Nearest Neighbor (kNN) imputation when the gap was long.
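The gap-dependent imputation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis implementation: the threshold `max_gap`, the similarity measure (mean absolute difference over co-observed time steps), and the function names are all hypothetical choices made for the example.

```python
import math

def gap_runs(values):
    """Return (start, end) index pairs, end exclusive, for each run of None."""
    runs, start = [], None
    for i, v in enumerate(values):
        if v is None:
            if start is None:
                start = i
        elif start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(values)))
    return runs

def fill_short_gaps(series, max_gap):
    """Linearly interpolate gaps of at most max_gap steps that have an
    observed value on both sides; longer gaps are left as None."""
    filled = list(series)
    for start, end in gap_runs(series):
        if end - start > max_gap or start == 0 or end == len(series):
            continue
        left, right = series[start - 1], series[end]
        span = end - start + 1
        for j in range(start, end):
            filled[j] = left + (j - start + 1) / span * (right - left)
    return filled

def knn_fill(target, neighbors, k):
    """Fill remaining gaps with the mean reading of the k neighbor stations
    most similar to the target (assumed similarity: mean absolute
    difference over time steps where both stations have data)."""
    def distance(other):
        pairs = [(a, b) for a, b in zip(target, other)
                 if a is not None and b is not None]
        if not pairs:
            return math.inf
        return sum(abs(a - b) for a, b in pairs) / len(pairs)

    ranked = sorted(neighbors, key=distance)
    filled = list(target)
    for i, v in enumerate(target):
        if v is None:
            candidates = [n[i] for n in ranked if n[i] is not None][:k]
            if candidates:
                filled[i] = sum(candidates) / len(candidates)
    return filled
```

Applying `fill_short_gaps` first and `knn_fill` second mirrors the order in the text: short gaps rely on temporal continuity at one station, while long gaps borrow information from correlated stations.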
Furthermore, PM2.5 or PM10 prediction is usually treated as a regression task that minimizes the mean squared error (MSE) between predicted and measured values. Visibility data, however, are highly variable and exhibit a 'class-imbalance' phenomenon: most of the data correspond to 'normal' situations, and extreme conditions are rare events. Visibility prediction is therefore better handled as a classification problem. Because the rare extreme events are the most interesting cases to predict, the training objective was changed to minimizing the weighted cross-entropy.
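To make the weighted cross-entropy objective concrete, here is a small sketch. It assumes a softmax output layer and averages the weighted loss over the number of samples; the thesis does not specify the normalization, and some libraries divide by the sum of the weights instead. The class weights in the usage note are illustrative, not values from the thesis.

```python
import math

def softmax(logits):
    """Numerically stable softmax over one sample's class scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def weighted_cross_entropy(logits_batch, labels, class_weights):
    """Mean over samples of -w[y] * log(p[y]), where y is the true class.

    A larger weight on a rare class makes errors on that class cost more.
    """
    total = 0.0
    for logits, y in zip(logits_batch, labels):
        p = softmax(logits)
        total += -class_weights[y] * math.log(p[y])
    return total / len(labels)
```

For example, with `class_weights = [1.0, 5.0]`, an uncertain prediction on a sample of the rare class 1 contributes five times the loss of the same prediction on a sample of the common class 0.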
The second step of the proposed methodology was to configure the frameworks. For PM2.5 predictions, feature engineering was employed to select appropriate features, and some model hyperparameters were set through grid searches and coordinate descent. A coarse-to-fine sampling scheme was used to determine the weights in the loss function for visibility predictions.
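As an illustration of coordinate descent over a hyperparameter grid, the sketch below tunes one hyperparameter at a time while holding the others fixed, and repeats until a full sweep brings no improvement. The hyperparameter names (`units`, `lookback`) and the `score` callback, which stands in for training a model and returning its validation error, are hypothetical and not taken from the thesis.

```python
def coordinate_descent(grid, score, start, max_sweeps=10):
    """Minimize score(config) by searching one hyperparameter at a time.

    grid:  dict mapping hyperparameter name -> list of candidate values
    score: callable returning a validation error (lower is better)
    start: dict with an initial value for every hyperparameter in grid
    """
    best = dict(start)
    best_score = score(best)
    for _ in range(max_sweeps):
        improved = False
        for name, values in grid.items():
            for value in values:
                if value == best[name]:
                    continue
                trial = dict(best)
                trial[name] = value
                s = score(trial)
                if s < best_score:
                    best, best_score, improved = trial, s, True
        if not improved:
            break  # a full sweep changed nothing: local optimum on the grid
    return best, best_score


def toy_validation_error(cfg):
    """Stand-in for a real training run; smallest at units=64, lookback=24."""
    return abs(cfg["units"] - 64) / 64 + abs(cfg["lookback"] - 24) / 24
```

Compared with an exhaustive grid search, this evaluates far fewer configurations, at the cost of possibly stopping at a configuration that is only optimal along each axis separately.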
The third step of our research was performance evaluation. For PM2.5 predictions, scatter plots and confusion matrices show that the proposed spatiotemporal LSTM framework can overcome the systematic underestimation generally produced by Lotos-Euros, a system based on chemical transport models (CTMs). Additionally, in terms of root mean square error (RMSE) and mean absolute error (MAE), it performs better than the LSTM-based prediction framework of Fan J et al. (2017) [9], which also considers spatial correlations among stations and performs a similar task in a similar region. Differences between the hyperparameters of these two frameworks were analyzed.
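The two error measures used in the comparison above are computed as in the following sketch. The `mean_bias` helper is an addition for illustration only: the thesis diagnoses underestimation with scatter plots and confusion matrices, and a negative mean bias is simply one numeric way to express the same tendency. The numbers in the usage note are made up.

```python
import math

def rmse(pred, obs):
    """Root mean square error between predictions and observations."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))

def mae(pred, obs):
    """Mean absolute error between predictions and observations."""
    return sum(abs(p - o) for p, o in zip(pred, obs)) / len(obs)

def mean_bias(pred, obs):
    """Average signed error; negative values indicate underestimation."""
    return sum(p - o for p, o in zip(pred, obs)) / len(obs)
```

For observations `[10, 20, 30, 40]` and predictions `[8, 18, 33, 36]`, the MAE is 2.75 and the mean bias is -1.25, so the predictions are too low on average. RMSE penalizes the largest error (here, -4) more heavily than MAE does.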
As for PM10 predictions, training efficiency can be improved significantly by transferring knowledge from PM2.5 predictions through model fine-tuning. The LSTM framework is also competitive with Lotos-Euros in PM10 predictions. As a first attempt at applying LSTM NNs to visibility prediction, the forecasts are acceptable in practice: the overall accuracy reaches 90.61%. For the normal situation (L1), the recall is 93% and the precision is 96%, indicating strong prediction performance in normal situations. Moreover, for each visibility level, the number of correct predictions is larger than the number of incorrect predictions.
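Overall accuracy and per-level precision and recall can all be read off a confusion matrix, as sketched below. The matrix layout (rows are true levels, columns are predicted levels) is an assumption, and the example matrix in the usage note is invented for illustration; it is not the thesis result.

```python
def per_class_metrics(cm):
    """Compute accuracy, per-class precision, and per-class recall.

    cm[i][j] is the number of samples whose true class is i and whose
    predicted class is j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    accuracy = sum(cm[i][i] for i in range(n)) / total

    # Recall: share of the true class i samples that were predicted as i.
    recall = [cm[i][i] / sum(cm[i]) if sum(cm[i]) else 0.0 for i in range(n)]

    # Precision: share of the samples predicted as j that truly are j.
    precision = []
    for j in range(n):
        col = sum(cm[i][j] for i in range(n))
        precision.append(cm[j][j] / col if col else 0.0)

    return accuracy, precision, recall
```

With the toy matrix `[[90, 10], [5, 15]]`, the accuracy is 0.875, the recall is 0.90 for the first class and 0.75 for the second, and the precision of the second class is 0.60. On imbalanced data, high overall accuracy can coexist with weak recall on rare classes, which is why the per-level figures are reported alongside it.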

Files

Xie_Yu_Thesis_0821_.pdf
(pdf | 4 Mb)
- Embargo expired in 27-08-2018
Unknown license