Ozone exceedance forecasting with enhanced extreme instance augmentation

A case study in Germany

Abstract

Accurately forecasting ozone levels that exceed specific thresholds is pivotal for mitigating adverse effects on both the environment and public health. However, predicting such ozone exceedances remains challenging due to the infrequent occurrence of high-concentration ozone data. This research, leveraging data from 57 German monitoring stations from 1999 to 2018, introduces an Enhanced Extreme Instance Augmentation Random Forest (EEIA-RF) approach that significantly improves the prediction of days when the maximum daily 8-hour average ozone concentrations exceed 120 μg/m³. A pre-trained machine learning model is used to generate additional high-concentration data, which, combined with selectively reduced low-concentration data, forms a new dataset for training a refined model. This method achieved an improvement of at least 8% in the accuracy of predicting days with ozone exceedances across Germany. Our experiment underscores the approach's value in enhancing atmospheric modeling and supporting public health advisories and environmental policy-making related to ozone pollution.
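The rebalancing idea described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the pre-trained model generates the additional high-concentration data, so the sketch assumes one plausible scheme: the features of observed exceedance days are slightly perturbed, the pre-trained model supplies the target value, and only rows predicted above the threshold are kept. The names `rebalance`, `predict`, `jitter`, `keep_low_fraction`, and `toy_model` are hypothetical, and the final Random Forest training step is omitted.

```python
import random

THRESHOLD = 120.0  # exceedance threshold for the daily maximum 8-hour mean, in μg/m³


def rebalance(samples, predict, n_synthetic, keep_low_fraction, jitter=0.05, seed=0):
    """Return a training set with extra synthetic high-ozone rows and fewer low rows.

    samples: list of (features, mda8) pairs, where features is a list of floats.
    predict: stand-in for a pre-trained model; maps a feature list to an ozone estimate.
    """
    rng = random.Random(seed)
    high = [s for s in samples if s[1] > THRESHOLD]
    low = [s for s in samples if s[1] <= THRESHOLD]
    if not high:
        raise ValueError("need at least one observed exceedance to augment from")

    # Assumed generation scheme: jitter the features of a real exceedance day
    # and keep the row only if the model still predicts an exceedance.
    synthetic = []
    attempts = 0
    while len(synthetic) < n_synthetic and attempts < 100 * n_synthetic:
        attempts += 1
        base, _ = rng.choice(high)
        features = [x * (1.0 + rng.uniform(-jitter, jitter)) for x in base]
        target = predict(features)
        if target > THRESHOLD:
            synthetic.append((features, target))

    # Selectively reduce the low-concentration rows by random undersampling.
    kept_low = rng.sample(low, int(len(low) * keep_low_fraction))
    return high + synthetic + kept_low


def toy_model(features):
    # Hypothetical linear stand-in for the pre-trained model.
    temperature, radiation = features
    return 3.0 * temperature + 0.1 * radiation


observed = [
    ([20.0, 300.0], 90.0),
    ([22.0, 350.0], 101.0),
    ([18.0, 250.0], 79.0),
    ([24.0, 380.0], 110.0),
    ([35.0, 600.0], 165.0),  # the only observed exceedance day
]
balanced = rebalance(observed, toy_model, n_synthetic=3, keep_low_fraction=0.5)
n_exceed = sum(1 for _, value in balanced if value > THRESHOLD)
print(len(balanced), n_exceed)
```

In this toy run the share of exceedance rows rises from one in five to four in six, which is the kind of shift a refined model would then be trained on. The actual selection rules and augmentation ratios used in EEIA-RF may differ.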