PM2.5 concentration prediction and early warning system of extreme conditions based on the LSTM
Abstract
This thesis project developed an alternative PM2.5 concentration prediction model and an early warning system for extreme air pollution, both based on long short-term memory (LSTM) networks, and achieved satisfactory performance. To study the problem in more depth, we divided the work into two tasks. The first task was predicting the PM2.5 concentration over the next 24 hours; the second was building an early warning system for extreme air pollution over the next 12 hours.
For the first task, we started from the 1-hour prediction problem, that is, predicting the PM2.5 concentration of the next hour from the data of the preceding hours. We optimized the parameters to find the best network architecture and obtained an RMSE of 19.7863. Building on the optimal 1-hour model, we then built a 24-hour prediction model that predicts the PM2.5 concentration over the next 24 hours. The proposed 24-hour model performed satisfactorily, including on the 13-24 h prediction task, which predicts the mean PM2.5 concentration over the next 13 to 24 hours (RMSE = 49.41).
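As a minimal sketch of the 1-hour setup described above, the following shows how an hourly series could be turned into (past hours, next hour) training pairs and how an RMSE is computed. The function names `make_windows` and `rmse` and the 3-hour lookback are assumptions for illustration; the abstract does not state the window length or the input features actually used.

```python
import numpy as np


def make_windows(series, lookback):
    """Build supervised pairs: `lookback` past hours -> value of the next hour.

    `lookback` is a hypothetical parameter; the abstract does not give it.
    """
    series = np.asarray(series, dtype=float)
    n = len(series) - lookback
    # Row i holds hours i .. i+lookback-1; the target is hour i+lookback.
    X = np.stack([series[i:i + lookback] for i in range(n)])
    y = series[lookback:]
    return X, y


def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


# Toy hourly PM2.5 values (made up, not thesis data).
pm25 = np.arange(10, dtype=float)
X, y = make_windows(pm25, lookback=3)
```

For an LSTM, `X` would typically be reshaped to `(samples, timesteps, features)` before training; that step depends on the framework and is left out here.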
Although the RMSE for the PM2.5 prediction problem was satisfactory, the predictions for extreme conditions were not accurate, which is why we went on to the second task. We took the highest PM2.5 value within a 12-hour period as the extreme air pollution of that period and divided the warning scale into 4 levels. We then built an LSTM-based early warning system that predicts the warning level of the highest PM2.5 value in the next 12 hours. As indicated by the accuracy (ACC) and the AUC, our LSTM model achieved sound performance (ACC = 86.7%, AUC = 0.837).
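The labelling step described above (highest PM2.5 value in the next 12 hours, mapped to one of 4 warning levels) could be sketched as follows. The function name `warning_levels` and the cut-off values 75, 150 and 250 are assumptions for illustration only; the abstract does not state the thresholds that define the 4 levels.

```python
import numpy as np


def warning_levels(pm25, horizon=12, thresholds=(75.0, 150.0, 250.0)):
    """Label each hour with the warning level of the peak over the next `horizon` hours.

    `thresholds` are hypothetical cut-offs, not the ones used in the thesis.
    Three cut-offs give four levels, numbered 0 (lowest) to 3 (highest).
    """
    pm25 = np.asarray(pm25, dtype=float)
    n = len(pm25) - horizon
    # Peak concentration in the window that starts one hour after time t.
    future_max = np.array([pm25[t + 1:t + 1 + horizon].max() for t in range(n)])
    # np.digitize returns how many thresholds each value has reached.
    return np.digitize(future_max, thresholds)


# Toy series (made up, not thesis data) with a 2-hour horizon for brevity.
levels = warning_levels([10, 20, 80, 30, 300, 40], horizon=2)
```

Hours at the end of the series that lack a full future window receive no label, so the output is `horizon` entries shorter than the input.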
To improve prediction performance, we applied several model optimization techniques to the 1-hour prediction model, and each technique improved the accuracy. Moreover, combining these techniques led to the lowest RMSE, 14.1937. The combined method performed better than any single technique, which suggests that effective optimization methods can be used together to improve the prediction accuracy of an LSTM model. In addition, we compared our model with a random forest (RF) model, and the comparison showed that the LSTM network worked better for both tasks.