The chemical transport model LOTOS-EUROS uses a volatility basis set (VBS) approach to represent the formation of secondary organic aerosol (SOA) in the atmosphere. Inclusion of the VBS approximately doubles the dimensionality of LOTOS-EUROS and slows computation of the advection operator by a factor of two. This complexity limits SOA representation in operational forecasts. We develop a mass-conserving dimensionality reduction method based on matrix factorization to find latent patterns in the VBS tracers that correspond to a smaller set of superspecies. Tracers are reversibly compressed to superspecies before transport, and the superspecies are subsequently decompressed to tracers for process-based SOA modeling. This physically interpretable data-driven method conserves the total concentration and phase of the tracers throughout the process. The superspecies approach is implemented in LOTOS-EUROS and found to accelerate the advection operator by a factor of 1.5–1.8. Concentrations remain numerically stable over model simulation times of 2 weeks, including simulations at higher spatial resolutions than the data-driven models were trained on. The reversible compression of VBS tracers enables detailed, process-based SOA representation in LOTOS-EUROS operational forecasts in a computationally efficient manner. Beyond this case study, the physically consistent data-driven approach developed in this work enforces conservation laws that are essential to other Earth system modeling applications, and generalizes to other processes where computational benefit can be gained from a two-way mapping between detailed process variables and their representation in a reduced-dimensional space.
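As an illustration of the compress, transport, decompress idea, the following minimal sketch shows one way a two-way mapping can conserve total concentration. It is not the LOTOS-EUROS implementation. It assumes a plain multiplicative-update non-negative matrix factorization, a hard assignment of each tracer to one superspecies for compression, and a column-normalized decompression matrix. The names `nmf`, `build_maps`, `C`, and `D` are hypothetical, and the random data stand in for tracer concentrations. Because every column of `C` and of `D` sums to one, the total over tracers in each grid cell is unchanged by both mappings; applying the same construction separately to gas-phase and aerosol-phase tracers would, under this assumption, also keep mass from crossing phases.

```python
import numpy as np

def nmf(X, k, n_iter=200, seed=0, eps=1e-12):
    """Multiplicative-update NMF: X (n_tracers x n_cells) ~= W (n x k) @ H (k x m)."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    W = rng.random((n, k)) + eps
    H = rng.random((k, m)) + eps
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

def build_maps(W):
    """Build compression C (k x n) and decompression D (n x k) from the factor W.

    Assumption for this sketch: each tracer is assigned to the superspecies
    with the largest loading, and D is W with columns rescaled to sum to one.
    """
    n, k = W.shape
    labels = W.argmax(axis=1)            # superspecies index per tracer
    C = np.zeros((k, n))
    C[labels, np.arange(n)] = 1.0        # every column of C sums to 1
    D = W / W.sum(axis=0, keepdims=True) # every column of D sums to 1
    return C, D

# Synthetic, strictly positive "concentrations": 9 tracers in 50 grid cells.
X = np.random.default_rng(1).random((9, 50)) + 0.1
W, H = nmf(X, k=3)
C, D = build_maps(W)

S = C @ X        # compress tracers to 3 superspecies (these would be advected)
X_rec = D @ S    # decompress back to 9 tracers for process-based chemistry

# Largest change in per-cell total concentration after the round trip.
max_total_error = np.abs(X_rec.sum(axis=0) - X.sum(axis=0)).max()
```

The round trip in this sketch preserves per-cell totals to floating-point precision but only approximates the individual tracer concentrations; how closely it does so depends on how well the latent patterns in `W` describe the data.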