Combating air pollution has proven difficult for countries with rapidly developing economies. Poor air quality is hazardous to people engaged in outdoor activities, so accurate short-term air quality predictions are very useful. Making these predictions is difficult, however, because many different physical and chemical processes are involved in the emission and transport of the various aerosols that contribute to air pollution. Instead of the more traditional Chemical Transport Models (CTMs), we use neural networks to predict the concentration of one of these aerosols, PM2.5. In particular, we use a Long Short-Term Memory (LSTM) network. In addition, we include the simulation results from a CTM, LOTOS-EUROS, as input data to the LSTM network to improve its performance. One of the main drawbacks of the LSTM approach is that whenever the PM2.5 concentration changes sharply, the predictions made by the LSTM network take some time to follow, causing a visible time delay when the measurements and predictions are shown in the same time series plot. We also try a simpler type of neural network, a Feedforward Neural Network (FNN), and compare its performance to that of the LSTM network. We found that using the simulation data does indeed improve the LSTM network, not only in terms of the loss function used by the neural network but in particular in the number of gross overestimations by the network, which we use to quantify the LSTM time delay problem. We also found that the FNN outperforms the LSTM approach, in particular on samples with high PM2.5 concentrations, which we argue is primarily caused by the small number of samples in our dataset.