Reducing the Overhead of Coupled Machine Learning Models between Python and Fortran

RSECon23, Swansea

Jack Atkinson

ICCS/Cambridge

Simon Clifford

ICCS/Cambridge

Athena Elafrou

NVIDIA

Tom Meltzer

ICCS/Cambridge

Chris Edsall

ICCS/Cambridge

2023-09-05

Precursors

Licensing

Except where otherwise noted, these presentation materials are licensed under the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) License.

Vectors and icons by SVG Repo used under CC0 1.0

Slides

To access links or follow along on your own device, these slides can be found at
https://jackatkinson.net/slides

Introduction

The ICCS

The Institute of Computing for Climate Science

  • Domain-specific RSE group based at the University of Cambridge
  • Embedded support to several international climate science projects

Climate Modelling

Climate models are large, complex, many-part systems.

Machine Learning

We typically think of Deep Learning as an end-to-end process:
a black box with an input and an output.

Who’s that Pokémon?

\[\begin{bmatrix}\vdots\\a_{23}\\a_{24}\\a_{25}\\a_{26}\\a_{27}\\\vdots\\\end{bmatrix}=\begin{bmatrix}\vdots\\0\\0\\1\\0\\0\\\vdots\\\end{bmatrix}\]

It’s Pikachu!

Neural Net by 3Blue1Brown under fair dealing.
Pikachu © The Pokemon Company, used under fair dealing.

Machine Learning in Science


Replacing physics-based components

Two approaches:

  • emulation, or
  • data-driven.

Additional challenges:

  • Physical compatibility
    • Physics-based models have conservation laws
      Required for accuracy and stability
  • Language interoperation

Language interoperation

Many large scientific models are written in Fortran (or C, or C++).
Much machine learning is conducted in Python.

Mathematical Bridge by cmglee used under CC BY-SA 3.0
PyTorch, the PyTorch logo and any related marks are trademarks of The Linux Foundation.
TensorFlow, the TensorFlow logo and any related marks are trademarks of Google Inc.

Solutions

Considerations

There are two types of efficiency:

  • Computational

  • Developer

An ideal solution should:

  • not generate excessive additional work,
    • not require advanced computing skills,
    • have a minimal learning curve,
  • not add excessive dependencies,
  • be easy to maintain, and
  • maximise performance.

Possible solutions

  • Implement a NN in Fortran
  • Forpy/CFFI
  • SmartSim/Pipes
  • Fortran-Keras Bridge

Possible solutions

  • Implement a NN in Fortran
  • Forpy/CFFI
  • SmartSim/Pipes
  • Fortran-Keras Bridge
  • e.g. inference-engine, neural-fortran, or your own implementation (see the sketch below)

  • removes the two-language problem

  • how do you ensure you port the model correctly?
  • ML libraries are highly optimised, probably more so than your code.
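
As a rough sketch of what this option involves (using neural-fortran; the names nf, network, input, dense, and predict follow that project’s README and may differ between versions, and the trained weights would still need porting by hand):

program nf_sketch
  ! Sketch only: a small fully-connected net defined directly in Fortran
  ! using neural-fortran. API names are assumptions from its README.
  use nf, only: network, input, dense
  implicit none

  type(network) :: net
  real, allocatable :: x(:), y(:)

  ! the Fortran equivalent of a trained Python model; weights must be
  ! ported across from the original training framework
  net = network([input(4), dense(8), dense(1)])

  x = [0.1, 0.2, 0.3, 0.4]
  y = net % predict(x)
  print *, y
end program nf_sketch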

Possible solutions

  • Implement a NN in Fortran
  • Forpy/CFFI
  • SmartSim/Pipes
  • Fortran-Keras Bridge
  • brings Python types into Fortran

  • easy to add the forpy module file and compile

  • verbose, with a learning curve
  • need to manage and link a Python environment
  • increases dependencies

Possible solutions

  • Implement a NN in Fortran
  • Forpy/CFFI
  • SmartSim/Pipes
  • Fortran-Keras Bridge
  • pass data between workers through a network glue layer
  • may be necessary for certain architectures

  • steep learning curve
  • involves data copying

Possible solutions

  • Implement a NN in Fortran
  • Forpy/CFFI
  • SmartSim/Pipes
  • Fortran-Keras Bridge

  • pure Fortran

  • TensorFlow (Keras) only
  • inactive and incomplete

Possible solutions

[Figure: coupling via Python also means managing a Python environment and a Python runtime; illustrated by xkcd #1987.]

xkcd #1987 by Randall Munroe, used under CC BY-NC 2.5

Interfacing Libraries

Both PyTorch and TensorFlow have C++ backends and provide APIs to access them.
Binding Fortran to C has been straightforward since Fortran 2003 using the iso_c_binding intrinsic module.
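
As a minimal sketch of what iso_c_binding provides (the C function run_net here is hypothetical, for illustration only):

! Binding a hypothetical C function
!     void run_net(const float* input, int64_t n, float* output);
! from Fortran via the iso_c_binding intrinsic module.
interface
   subroutine run_net(input, n, output) bind(c, name="run_net")
      use, intrinsic :: iso_c_binding, only: c_float, c_int64_t
      real(c_float), intent(in) :: input(*)
      integer(c_int64_t), value :: n
      real(c_float), intent(out) :: output(*)
   end subroutine run_net
end interface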


PyTorch

  • C++ API
  • Archive model as TorchScript
    • Statically typed subset of Python
  • Read and run via any Torch API

TensorFlow

  • C++ and C APIs
  • Archive model as Keras SavedModel
  • process_model tool provided to extract the required opaque parameters and use the API

Performant - Computational

No-copy access to data in memory: tensors wrap existing Fortran arrays rather than duplicating them.

Indexing issues (Fortran is column-major, C/C++ row-major) and the associated reshape can be avoided with the Torch accessor.
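
A minimal sketch of the no-copy idea, using the same torch_tensor_from_blob call as the code examples later in these slides (torch_kFloat32 is assumed here to match a default real array):

real, dimension(:,:), target :: SST
type(torch_tensor) :: SST_T
integer(c_int), parameter :: dims_T = 2
integer(c_int64_t) :: shape_T(dims_T)

shape_T = shape(SST)
! The tensor aliases the Fortran array's memory: no data is copied.
SST_T = torch_tensor_from_blob(c_loc(SST), dims_T, shape_T, &
                               torch_kFloat32, torch_kCPU)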

Performant - Ease of use

Installation

CMake

  • Install libtorch, or the TensorFlow C API
  • Clone
  • Build using CMake (instructions provided)
  • Install
  • Link (see the sketch below)

CMake is a trademark of Kitware.
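
As a hedged sketch of the final link step from a downstream CMake project (the package and target names here are assumptions; consult the libraries’ own instructions for the exact spelling):

# Hypothetical downstream CMakeLists.txt fragment; package and target
# names depend on the library version.
find_package(FTorch REQUIRED)
add_executable(my_model my_model.f90)
target_link_libraries(my_model PRIVATE FTorch::ftorch)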

Tools

  • pt2ts.py script facilitates saving models to TorchScript.
  • process_model extracts TF model data.

Examples

  • Guide users through the process from saving a Python model to running it in Fortran.
  • User-defined and preloaded (ResNet-18) cases.

Support

  • Use frameworks’ implementations directly
    • feature support
    • future support
    • direct translation of Python models

Licensing and FOSS

The libraries are licensed under MIT and available as FOSS.

  • Highly permissive for use by all
  • Open-source development on GitHub using issues and PRs

Case Study

Gravity wave parameterisation in MiMA (Jucker and Gerber 2017)

  • Neural Net
    • Emulating Alexander and Dunkerton (1999) gravity wave parameterisation.
    • Fully-connected multi-layer net with identical PyTorch and TensorFlow versions
    • Initially interfaced (slowly) using forpy (Espinosa et al. 2022)

Coding example

Replace the forpy-coupled net with our directly coupled approach.

Test both PyTorch and TensorFlow.


Given a Fortran program with model inputs in arrays, the original coupling using forpy requires 67 lines of boilerplate code, whilst our library takes 39.


A fork of MiMA with these implementations of the interfaces is at:
https://github.com/DataWaveProject/MiMA-machine-learning

e.g. Loading a Torch model (forpy)

integer :: ie
type(module_py) :: run_emulator
type(list) :: paths
type(object) :: model
type(tuple) :: args
type(str) :: py_model_dir

ie = forpy_initialize()

ie = str_create(py_model_dir, trim('/path/to/saved/model'))
ie = get_sys_path(paths)
ie = paths%append(py_model_dir)

! import the Python module as `run_emulator`
ie = import_py(run_emulator, trim(model_name))
if (ie .ne. 0) then
    call err_print
    call error_mesg(__FILE__, __LINE__, "forpy model not loaded")
end if

! use the Python module `run_emulator` to load a trained model
ie = call_py(model, run_emulator, "name_of_init_function")
if (ie .ne. 0) then
    call err_print
    call error_mesg(__FILE__, __LINE__, "call to `initialize` failed")
end if

e.g. Loading a Torch model (our library)

type(torch_module) :: model

model = torch_module_load('/path/to/saved/model.pt'//c_null_char)

Conclusions

Take-away messages

  • Machine learning has many potential applications in scientific computing
  • Leveraging it effectively requires care
  • We have developed libraries allowing easy and efficient deployment of ML within Fortran models
  • For new projects we advise using PyTorch

Future work

  • Provide a tagged first release on GitHub
    • Publication through JOSS and Zenodo
  • Further test GPU functionalities
  • Implement functionalities beyond inference?
    • Online training is likely to become important

Get involved

  • Inform potential users
    • Further testing and feedback wanted!
  • Developers welcome

Closing slide, thanks, and questions

The ICCS is funded by Schmidt Futures.

References

Alexander, M. J., and T. J. Dunkerton. 1999. “A Spectral Parameterization of Mean-Flow Forcing Due to Breaking Gravity Waves.” Journal of the Atmospheric Sciences 56 (24): 4167–82.
Espinosa, Zachary I., Aditi Sheshadri, Gerald R. Cain, Edwin P. Gerber, and Kevin J. DallaSanta. 2022. “Machine Learning Gravity Wave Parameterization Generalizes to Capture the QBO and Response to Increased CO2.” Geophysical Research Letters 49 (8): e2022GL098174.
Jucker, Martin, and E. P. Gerber. 2017. “Untangling the Annual Cycle of the Tropical Tropopause Layer with an Idealized Moist Model.” Journal of Climate 30 (18): 7339–58.

Code

The libraries can be found at:
https://github.com/Cambridge-ICCS/fortran-pytorch-lib
https://github.com/Cambridge-ICCS/fortran-tf-lib

Their implementation in the MiMA model can be found at:
https://github.com/DataWaveProject/MiMA-machine-learning

Benchmarking of PyTorch can be found at:
https://github.com/Cambridge-ICCS/fortran-pytorch-lib-benchmark/

Bonus Content

Code Example - PyTorch

  • Take the trained model
  • Save as TorchScript

import torch
import my_ml_model

# initialise (or load) the trained PyTorch model
trained_model = my_ml_model.initialize()

# script the model and save it as a TorchScript archive
scripted_model = torch.jit.script(trained_model)
scripted_model.save("my_torchscript_model.pt")

Code Example - PyTorch

Necessary imports:

use, intrinsic :: iso_c_binding, only: c_int, c_int64_t, c_float, c_char, &
                                       c_null_char, c_ptr, c_loc
use ftorch

Loading a PyTorch model:

model = torch_module_load('/path/to/saved/model.pt'//c_null_char)

Code Example - PyTorch

Tensor creation from Fortran arrays:

! Fortran variables
real, dimension(:,:), target  :: SST, model_output
! C/Torch variables
integer(c_int), parameter :: dims_T = 2
integer(c_int64_t) :: shape_T(dims_T)
integer(c_int), parameter :: n_inputs = 1
type(torch_tensor), dimension(n_inputs), target :: model_inputs
type(torch_tensor) :: model_output_T

shape_T = shape(SST)

! wrap the input array as a Torch tensor; the dtype must match the
! Fortran kind (default real is 32-bit, hence torch_kFloat32)
model_inputs(1) = torch_tensor_from_blob(c_loc(SST), dims_T, shape_T, &
                                         torch_kFloat32, torch_kCPU)

! wrap the output array so Torch writes results straight into it
model_output_T = torch_tensor_from_blob(c_loc(model_output), dims_T, shape_T, &
                                        torch_kFloat32, torch_kCPU)

Code Example - PyTorch

Running the model:

call torch_module_forward(model, model_inputs, n_inputs, model_output_T)

Cleaning up:

call torch_tensor_delete(model_inputs(1))
call torch_tensor_delete(model_output_T)
call torch_module_delete(model)

Results

Timings (real seconds) for computing gravity wave drag in situ.

             Forpy       Direct      Direct/Forpy
PyTorch       94.43 s    134.81 s    142.8 %
TensorFlow   667.16 s    170.31 s     25.5 %

Timing data (real seconds) for benchmarking gravity wave drag with PyTorch on CSD3.

Intel   Forpy      Direct     Direct/Forpy
Mean    0.3126 s   0.3509 s   112.3 %
Std     0.0420 s   0.0547 s   -

GCC     Forpy      Direct     Direct/Forpy
Mean    0.3405 s   0.3669 s   107.7 %
Std     0.0449 s   0.0586 s   -