Computing for Climate Science

Jack Atkinson

Senior Research Software Engineer
ICCS - University of Cambridge

The ICCS Team (see end)



Slides and Materials

To access links or follow on your own device these slides can be found at:


Except where otherwise noted, these presentation materials are licensed under the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) License.

Vectors and icons by SVG Repo used under CC0(1.0)

Weather and Climate

Weather and climate disclaimer

Where did we come from?

Arrhenius (1896) attempted to predict the impact of CO2 on global temperature using:

  • observations
  • experimental data
  • physical laws

and combining them in a series of tabulated calculations.

Where did we come from?

In ~1916 Lewis Fry Richardson attempted to compute a 1-day forecast by hand using partial differential equations.

He went on to publish Weather Prediction by Numerical Process (Richardson 1922)

Where are we now?

Since 1950 forecasting and modelling have advanced with computational power.

  • Accuracy has gone up
  • Speed has gone up
  • Ensemble forecasting has become possible
  • Like for Arrhenius, data assimilation has become important

Where are we now?

Climate models are large, complex, many-part systems.

Where are we now?

Subgrid processes are the largest source of uncertainty

Microphysics by Sisi Chen Public Domain
Staggered grid by NOAA under Public Domain
Globe grid with box by Caltech under Fair use

Where are we going?

  • The GPU revolution
    • Climate models are (mostly) written for CPU
      • Do many different calculations
    • GPU porting is not straightforward
      • Do same calculation many times
  • The move towards ExaScale
    • Simulations at the sub-kilometre scale will allow direct simulation of previously parameterised processes
    • The challenge is to compute efficiently with the lowest possible carbon dioxide emissions.
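The "same calculation many times" pattern can be sketched in a few lines. This is only an illustration, assuming NumPy array operations as a stand-in for data-parallel hardware and reusing the ideal gas law p = nkT; the function names and input values are hypothetical.

```python
import numpy as np

KB = 1.380649e-23  # Boltzmann constant [J K-1]


def pressure_loop(n, t):
    """Visit one grid cell at a time, as a serial CPU loop would."""
    out = np.empty_like(n)
    for i in range(n.size):
        out[i] = n[i] * KB * t[i]
    return out


def pressure_vectorised(n, t):
    """Apply the same operation to every cell at once (data-parallel pattern)."""
    return n * KB * t


# Illustrative inputs: 1000 cells with made-up number densities and temperatures
n = np.full(1000, 2.5e25)            # number density [m-3]
t = np.linspace(250.0, 300.0, 1000)  # temperature [K]

p_loop = pressure_loop(n, t)
p_vec = pressure_vectorised(n, t)
```

Both functions give the same result. The second form has no per-cell branching, which is the shape of work that maps well onto a GPU.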

Where are we going?

  • The rise of ML
    • It is impossible to avoid ML
    • Promising for forecasting
      • Pangu-Weather, GraphCast, FourCastNet etc.
    • Considered use in science
      • Explainable AI
      • Data-driven models
      • Emulation (for speedup or pseudo-subgrid modelling)
    • Many other processes for climate:
      • Data (post)processing
      • Downscaling

Machine Learning

We typically think of Deep Learning as an end-to-end process;
a black box with an input and an output.

Who’s that Pokémon?

\[\begin{bmatrix}\vdots\\a_{23}\\a_{24}\\a_{25}\\a_{26}\\a_{27}\\\vdots\end{bmatrix}=\begin{bmatrix}\vdots\\0\\0\\1\\0\\0\\\vdots\end{bmatrix}\]

It’s Pikachu!

Neural Net by 3Blue1Brown under fair dealing.
Pikachu © The Pokemon Company, used under fair dealing.
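The output step on this slide can be sketched as follows. This is a rough illustration only: the label list and the raw scores are made up, and a real classifier would produce the scores from an image.

```python
import numpy as np

# Hypothetical class labels; a real network would have one entry per class.
labels = ["Bulbasaur", "Charmander", "Pikachu", "Squirtle", "Eevee"]


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    shifted = scores - np.max(scores)  # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()


def one_hot(index, size):
    """Vector of zeros with a single 1 at the chosen class index."""
    vec = np.zeros(size, dtype=int)
    vec[index] = 1
    return vec


# Made-up raw outputs from the final layer of the network
scores = np.array([0.3, 1.1, 4.2, 0.5, 0.9])

probs = softmax(scores)
best = int(np.argmax(probs))
prediction = labels[best]
target = one_hot(best, len(labels))
```

The index of the largest activation selects the label, and the one-hot vector is the idealised form of that output.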

Machine Learning in Science





Research Software Engineering

Why does this matter?

def calc_p(n,t):
    return n*1.380649e-23*t
data = np.genfromtxt("mydata.csv")
p = calc_p(data[0,:],data[1,:]+273.15)

What does this code do?

import numpy as np

# Boltzmann constant [J K-1] and 0 degrees Celsius in Kelvin
Kb = 1.380649e-23
T0 = 273.15

def calc_pres(n, t):
    """
    Calculate pressure using ideal gas law p = nkT

    Parameters:
        n : array of number densities of molecules [N m-3]
        t : array of temperatures in [K]
    Returns:
        array of pressures [Pa]
    """
    return n * Kb * t

# Read in data from file and convert T from [oC] to [K]
data = np.genfromtxt("mydata.csv")
n = data[0, :]
temp = data[1, :] + T0

# Calculate pressure, average, and print
pres = calc_pres(n, temp)
pres_av = np.sum(pres) / len(pres)
print(pres_av)
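Clear names and documented units also make a function like this easy to test. Below is a minimal sketch, assuming hand-picked check values that are illustrative rather than taken from the slides. The function is repeated so the block runs on its own.

```python
import numpy as np

KB = 1.380649e-23  # Boltzmann constant [J K-1]


def calc_pres(n, t):
    """Ideal gas law p = nkT for number density n [m-3] and temperature t [K]."""
    return n * KB * t


def test_known_value():
    # 1e25 m-3 at 300 K: 1e25 * 1.380649e-23 * 300 = 41419.47 Pa
    n = np.array([1.0e25])
    t = np.array([300.0])
    assert np.allclose(calc_pres(n, t), np.array([41419.47]))


def test_zero_temperature_gives_zero_pressure():
    n = np.array([1.0e25, 2.0e25])
    t = np.zeros(2)
    assert np.all(calc_pres(n, t) == 0.0)


def test_pressure_scales_linearly_with_density():
    n = np.array([1.0e25])
    t = np.array([280.0])
    assert np.allclose(calc_pres(2 * n, t), 2 * calc_pres(n, t))
```

Functions named `test_*` are picked up automatically by a test runner such as pytest, or can be called directly.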

Our Work

Fortran-PyTorch coupling

Many large scientific models are written in Fortran (or C, or C++).
Much machine learning is conducted in Python.

Mathematical Bridge by cmglee used under CC BY-SA 3.0
PyTorch, the PyTorch logo and any related marks are trademarks of The Linux Foundation.
TensorFlow, the TensorFlow logo and any related marks are trademarks of Google Inc.

Model Profiling

  • Scientists have developed a plugin for ray-tracing of atmospheric waves.
  • Significantly more accurate,
  • but too slow to be deployed in forecasting

RSE expertise:

  • Profile code to locate issues
  • Combine physics and computer science to understand
  • Build-up of waves over some geographic areas leads to load imbalance
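One common way to quantify load imbalance of this kind is the ratio of the busiest rank's time to the mean time across ranks. Below is a minimal sketch, assuming made-up per-rank timings; the function names and numbers are illustrative and not taken from the profiled model.

```python
from statistics import mean


def load_imbalance(times):
    """Ratio of the slowest rank's time to the mean; 1.0 means perfectly balanced."""
    if not times:
        raise ValueError("need at least one timing")
    return max(times) / mean(times)


def idle_fraction(times):
    """Average fraction of the step that ranks spend waiting for the slowest one."""
    return 1.0 - mean(times) / max(times)


# Hypothetical timings [s] for four ranks; the last rank covers a region
# where waves build up and so has far more rays to trace.
rank_times = [1.0, 1.0, 1.0, 5.0]

imbalance = load_imbalance(rank_times)  # 5.0 / 2.0 = 2.5
idle = idle_fraction(rank_times)
```

A value well above 1.0 suggests that redistributing work between ranks may help more than speeding up the calculation itself.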

Software architecture

FAIR and best practice


  • Findable
  • Accessible
  • Interoperable
  • Reusable


  • Oceanographic Cruise software
  • Bring a distributed software stack under version control

Educating on:

  • Open science and data
  • Licensing etc.
  • Testing
  • Version Control
  • Documentation
  • CI and automation
  • Software project management

Bayesian ML for Climate

  • ML is typically used with large, high-quality datasets and little knowledge of the system
  • What if we have low-quality data but some physical knowledge about the system?


  • Equation discovery for prediction of precipitation in the tropics

    • A typically poorly modelled problem.

Bayesian Optimisation image from (Zhang, Apley, and Chen 2020)


Functional programming language

  • Used to enhance data visualisation
  • increase trustworthiness and transparency
  • uses a bidirectional dynamic analysis and Galois dependencies
  • equip data points with information about how their parts relate to specific parts of the input

Teaching and Outreach

  • Annual ICCS Summer School
  • Software Carpentries training
  • Climate Code Clinic
  • Monthly Journal Club
  • NERC Summer School on Bayesian ML
  • Convening conference sessions
    • Platform for Advanced Scientific Computing
    • European Geosciences Union
    • Ocean Sciences
    • Principles of Programming Languages

Career Pathways - Me

Radiation belts by BAS SWA under Fair Use
Volcano by Abet Llacer under CC
Hurricane Isabel from the ISS by NASA

Career Pathways

Engineering -> Fluid Mechanics -> Research -> RSE/HPC

Physics -> HPC SysAdmin -> RSE

CompSci -> Numerical Methods -> RSE

Physics -> SE/Industry -> Materials -> Research -> RSE

Mathematics -> ML -> Industry/Startup -> Research

CompSci -> Research

CompSci -> RSE -> Research



  • Colm-Cille Caulfield - Mathematics
  • Emily Shuckburgh - Computer Science
  • Chris Edsall - RSE/Research Computing
  • Dominic Orchard - Computer Science
  • Marla Fuchs

Early Career Fellows:

  • Laura Cimoli - Oceanography
  • Henry Moss - Bayesian ML
  • Roly Perera - Programming languages

Research Software Engineers:

  • Paul Richmond - Team lead
  • Marion Weinzierl - Senior
  • Jack Atkinson - Senior
  • Tom Meltzer - Senior
  • Surbhi Goel
  • Tianzhang Cai
  • +1 Senior +5 RSE


  • Christian Fernandez Perotti
  • Filipa Goncalves

Where can I learn more?

Get in touch:


Arrhenius, Svante. 1896. “XXXI. On the Influence of Carbonic Acid in the Air Upon the Temperature of the Ground.” The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science 41 (251): 237–76.
Richardson, Lewis F. 1922. Weather Prediction by Numerical Process. Cambridge University Press.
Zhang, Yichi, Daniel W Apley, and Wei Chen. 2020. “Bayesian Optimization for Materials Design with Mixed Quantitative and Qualitative Variables.” Scientific Reports 10 (1): 4924.