🦬 Appa: Bending Weather Dynamics with Latent Diffusion Models for Global Data Assimilation

University of Liège, 2025
Architecture of Appa, composed of an autoencoder and a latent diffusion model.

Appa consists of (a) an autoencoder that maps atmospheric states to a latent space, and (b) a spatio-temporal diffusion transformer (DiT) that generates sequences of latent states. At generation time, auxiliary observations can be incorporated through posterior conditioning, without retraining.

Abstract

Deep learning has transformed weather forecasting by improving both its accuracy and computational efficiency. However, before any forecast can begin, weather centers must identify the current atmospheric state from vast amounts of observational data. To address this challenging problem, we introduce Appa, a score-based data assimilation model producing global atmospheric trajectories at 0.25-degree resolution and 1-hour intervals. Powered by a 1.5B-parameter spatio-temporal latent diffusion model trained on ERA5 reanalysis data, Appa can be conditioned on any type of observations to infer the posterior distribution of plausible state trajectories, without retraining. Our unified probabilistic framework flexibly tackles multiple inference tasks -- reanalysis, filtering, and forecasting -- using the same model, eliminating the need for task-specific architectures or training procedures. Experiments demonstrate physical consistency on a global scale and good reconstructions from observations, while showing competitive forecasting skills. Our results establish latent score-based data assimilation as a promising foundation for future global atmospheric modeling systems.

Reanalysis

Reanalysis assimilates weather observations to produce plausible full-state trajectories. We demonstrate that Appa can generate trajectories conditioned on a sequence of partial satellite and ground-station observations. Six key variables are displayed: surface temperature, the eastward and northward components of surface wind, total precipitation, and atmospheric potential and temperature.
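
Conditioning on observations without retraining is commonly expressed through Bayes' rule at the level of scores: the posterior score is the sum of a prior score (the part a diffusion model learns) and a likelihood score (derived from the observation process). The toy sketch below checks this identity on a linear-Gaussian problem where everything is available in closed form. The observation operator `A`, the noise level `sigma_y`, and all function names are illustrative assumptions, and the example ignores diffusion noise levels entirely; it is not Appa's implementation.

```python
import numpy as np

def prior_score(x):
    # Score of a standard normal prior N(0, I): grad_x log p(x) = -x.
    return -x

def likelihood_score(x, y, A, sigma_y):
    # Score of a Gaussian likelihood N(y | A x, sigma_y^2 I) with respect to x.
    return A.T @ (y - A @ x) / sigma_y**2

def posterior_score(x, y, A, sigma_y):
    # Bayes' rule on scores: grad log p(x | y) = grad log p(x) + grad log p(y | x).
    return prior_score(x) + likelihood_score(x, y, A, sigma_y)

# Toy setup (assumed): a 3-dimensional state of which only the first
# component is observed, with observation noise sigma_y.
A = np.array([[1.0, 0.0, 0.0]])
sigma_y = 0.5
y = np.array([2.0])

# Closed-form posterior mean of this linear-Gaussian problem.
precision = np.eye(3) + A.T @ A / sigma_y**2
posterior_mean = np.linalg.solve(precision, A.T @ y / sigma_y**2)

# The posterior score must vanish at the posterior mean (the mode of a Gaussian).
residual = posterior_score(posterior_mean, y, A, sigma_y)
```

The prior term never changes when the observation operator does, which is why a new type of observation only requires a new likelihood term rather than a new training run.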

Forecasting

Appa's flexibility allows forecasting to be framed in several ways, including autoregressively: blankets are generated sequentially, each conditioned on the initial ground truth and on previous estimates. Appa reaches performance comparable to state-of-the-art machine learning forecasting models, despite not being designed specifically for that task.
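
The autoregressive framing can be sketched as a simple rollout loop, assuming a blanket is a short window of consecutive states. Here `sample_blanket` is a hypothetical stand-in (a random walk) for a conditional sampler, and `blanket_len` and `overlap` are illustrative parameters; none of these names come from Appa's code.

```python
import numpy as np

def sample_blanket(context, blanket_len, rng):
    """Hypothetical stand-in for a conditional sampler.

    Returns `blanket_len` states whose first len(context) states equal
    `context`. The continuation is a random walk, used only as a placeholder
    for sampling from a generative model.
    """
    k = len(context)
    steps = rng.normal(scale=0.1, size=(blanket_len - k,) + context.shape[1:])
    new_states = context[-1] + np.cumsum(steps, axis=0)
    return np.concatenate([context, new_states], axis=0)

def autoregressive_forecast(initial_states, horizon, blanket_len=6, overlap=2, seed=0):
    """Extend `initial_states` by `horizon` steps, one blanket at a time."""
    rng = np.random.default_rng(seed)
    trajectory = np.asarray(initial_states, dtype=float)
    target_len = len(trajectory) + horizon
    while len(trajectory) < target_len:
        # Condition on the most recent states: ground truth at first,
        # previously generated estimates afterwards.
        context = trajectory[-overlap:]
        blanket = sample_blanket(context, blanket_len, rng)
        # Keep only the newly generated part of the blanket.
        trajectory = np.concatenate([trajectory, blanket[len(context):]], axis=0)
    return trajectory[:target_len]

initial = np.zeros((2, 4))  # two known states of dimension 4
forecast = autoregressive_forecast(initial, horizon=10)
```

The initial states pass through unchanged, and each iteration adds `blanket_len - overlap` new states, so a longer horizon only means more iterations of the same loop.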

Unconditional sampling

Thanks to its blanket mechanism, Appa can generate arbitrarily long sequences of weather states. Furthermore, generation is embarrassingly parallel, with sampling taking as little as about 20 seconds regardless of sequence length. We demonstrate this by generating a consistent trajectory spanning more than 3 months.
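
To illustrate why blanket-based generation parallelizes, the sketch below evaluates a local model on every overlapping window of a long trajectory in a single batched call, then stitches the per-window outputs into an output for the full trajectory. The stitching rule (centre element of each window, plus the edges of the first and last windows) is one composition scheme assumed here for illustration; `local_score` is a hypothetical placeholder for a trained denoiser, and the half-width `k` is an arbitrary choice.

```python
import numpy as np

def local_score(windows):
    """Hypothetical placeholder for a model applied to a batch of blankets.

    windows: array of shape (num_windows, blanket_len, dim).
    Returns an array of the same shape (here, the standard normal score).
    """
    return -windows

def composed_score(trajectory, k=2):
    """Stitch per-blanket outputs into an output for the whole trajectory."""
    length = len(trajectory)
    width = 2 * k + 1
    assert length >= width, "trajectory must contain at least one blanket"
    # All overlapping windows, shape (length - width + 1, width, dim).
    windows = np.stack([trajectory[i:i + width] for i in range(length - width + 1)])
    # One batched call: windows do not depend on each other, so they can be
    # split across devices.
    scores = local_score(windows)
    out = np.empty_like(trajectory)
    out[:k] = scores[0, :k]                # leading edge from the first window
    out[k:length - k] = scores[:, k]       # centre element of every window
    out[length - k:] = scores[-1, k + 1:]  # trailing edge from the last window
    return out

states = np.arange(20.0).reshape(10, 2)  # 10 time steps, 2 features
stitched = composed_score(states)
```

Because the number of windows grows with the sequence length while the work per window stays fixed, the cost of each step can in principle stay constant given enough parallel hardware.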

BibTeX

@misc{andry2025appabendingweatherdynamics,
        title={Appa: Bending Weather Dynamics with Latent Diffusion Models for Global Data Assimilation},
        author={Gérôme Andry and François Rozet and Sacha Lewin and Omer Rochman and Victor Mangeleer and Matthias Pirlet and Elise Faulx and Marilaure Grégoire and Gilles Louppe},
        year={2025},
        eprint={2504.18720},
        archivePrefix={arXiv},
        primaryClass={cs.LG},
        url={https://arxiv.org/abs/2504.18720},
}