2026-06-10
To access links or follow on your own device these slides can be found at:
jackatkinson.net/slides
Except where otherwise noted, these presentation materials are licensed under the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) License.
Vectors and icons by SVG Repo under CC0(1.0) or FontAwesome under SIL OFL 1.1
Large, complex, many-part systems.
Replacing select component(s) with ML scheme(s).

Neural Net by 3Blue1Brown under fair dealing.
Many large scientific models are written in Fortran (or C, or C++).
Much machine learning is conducted in Python.






Mathematical Bridge by cmglee used under CC BY-SA 3.0
PyTorch, the PyTorch logo and any related marks are trademarks of The Linux Foundation.
TensorFlow, the TensorFlow logo and any related marks are trademarks of Google Inc.
We consider 2 types:
Computational
Developer
In research both have an effect on ‘time-to-science’.
Especially when extensive research software support is unavailable.
FTorch is published in JOSS!
Atkinson et al. (2025)
FTorch: a library for coupling PyTorch models to Fortran.
Journal of Open Source Software, 10(107), 7602,
DOI: 10.21105/joss.07602
Please cite if you use FTorch!

We also have a mailing list, FTorch-Announce on JISC, for future updates and usage examples!
layout
Previously, when calling torch_tensor_from_array, one had to specify the memory layout. This was used to stride correctly through memory and so avoid copying.
We now assume users want the [1, 2, ..., n] layout by default, so layout no longer needs to be specified.
Older code continues to work in v1.1, but in future layout will become an optional argument, which will require a change in the order of call arguments.
Advice: Use the default argument where possible.
The ICCS fork of the CESM FTorch interface has been updated to use the default.
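The link between layout and striding can be sketched in plain Fortran, without calling FTorch. The example below assumes the default [1, 2, ..., n] layout keeps Fortran's dimension order, so the element strides are those of ordinary column-major storage. The module and function names (stride_sketch, column_major_strides) are hypothetical and not part of FTorch.

```fortran
module stride_sketch
  implicit none
contains
  ! Hypothetical helper, not part of FTorch: element strides of a
  ! column-major array of rank >= 1 (first index varies fastest).
  pure function column_major_strides(shp) result(strides)
    integer, intent(in) :: shp(:)
    integer :: strides(size(shp))
    integer :: i
    strides(1) = 1
    do i = 2, size(shp)
      strides(i) = strides(i - 1) * shp(i - 1)
    end do
  end function column_major_strides
end module stride_sketch

program stride_demo
  use stride_sketch
  implicit none
  real :: a(2, 3, 5), flat(30)
  integer :: s(3), i

  s = column_major_strides(shape(a))
  print '(a, 3(1x, i0))', 'strides in elements:', s

  ! Fill in storage order, then check one element against its computed offset
  a = reshape([(real(i), i = 1, 30)], shape(a))
  flat = reshape(a, [30])
  print *, 'offset formula holds:', &
    a(2, 3, 4) == flat(1 + 1*s(1) + 2*s(2) + 3*s(3))
end program stride_demo
```

For a (2, 3, 5) array the strides are 1, 2 and 6 elements. A library that knows them can address the existing Fortran memory directly rather than copying it.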
FTorch creates Torch C++ objects for manipulation under the hood. As detailed in the documentation and examples, these need proper handling to prevent memory leaks, much like pairing allocate with deallocate.
Previously one had to explicitly delete torch objects to ensure that C++ memory was cleaned up, otherwise leakage could occur.
Now torch_delete is a finalizer, meaning it will be called whenever a tensor, model, or array of tensors goes out of scope.
Continuing to call torch_delete will still work, so old code remains valid.
Before: explicit cleanup required
use ftorch
implicit none
real, dimension(5), target :: in_data, out_data
type(torch_tensor), dimension(1) :: input_tensors, output_tensors
type(torch_model) :: torch_net
...
! Create Torch input/output tensors from the Fortran arrays
call torch_tensor_from_array(input_tensors(1), in_data, torch_kCPU)
call torch_tensor_from_array(output_tensors(1), out_data, torch_kCPU)
call torch_model_load(torch_net, 'path/to/saved/model.pt', torch_kCPU)
call torch_model_forward(torch_net, input_tensors, output_tensors)
...
! Cleanup
call torch_delete(torch_net)
call torch_delete(input_tensors)
call torch_delete(output_tensors)
After: finalizer handles cleanup
use ftorch
implicit none
real, dimension(5), target :: in_data, out_data
type(torch_tensor), dimension(1) :: input_tensors, output_tensors
type(torch_model) :: torch_net
...
! Create Torch input/output tensors from the Fortran arrays
call torch_tensor_from_array(input_tensors(1), in_data, torch_kCPU)
call torch_tensor_from_array(output_tensors(1), out_data, torch_kCPU)
call torch_model_load(torch_net, 'path/to/saved/model.pt', torch_kCPU)
call torch_model_forward(torch_net, input_tensors, output_tensors)
...
torch_tensor_delete has been made elemental, meaning that it applies to both tensors and arrays of tensors.
As such torch_tensor_array_delete has been removed.
This should not affect users, as the advice has always been to use the torch_delete interface rather than calling these routines directly.
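The finalizer and elemental mechanisms described above are standard Fortran features and can be sketched without FTorch. In the example below the handle type, the release routine and the n_released counter are all hypothetical stand-ins, not FTorch's implementation. A single impure elemental final procedure serves both a scalar and an array, and runs when the variables go out of scope at the end of do_work.

```fortran
module handle_mod
  implicit none
  integer :: n_released = 0   ! counts finalizer calls, for illustration only

  ! Hypothetical stand-in for a type that owns an external resource
  type :: handle
    integer :: id = 0
  contains
    final :: release
  end type handle

contains

  ! Impure elemental: one finalizer covers scalars and arrays of any rank
  impure elemental subroutine release(self)
    type(handle), intent(inout) :: self
    n_released = n_released + 1
    self%id = 0
  end subroutine release

  subroutine do_work()
    type(handle) :: single
    type(handle) :: many(3)
    single%id = 1
    many%id = [2, 3, 4]
    ! No explicit cleanup: locals are finalized when the subroutine returns
  end subroutine do_work

end module handle_mod

program finalizer_demo
  use handle_mod
  implicit none
  call do_work()
  print '(a, i0)', 'handles released: ', n_released
end program finalizer_demo
```

With a standard-conforming compiler this should report four releases: one for the scalar and one per array element. Older compilers with incomplete finalization support may behave differently. Variables declared in the main program itself are not finalized at program end, which is why the sketch does its work inside a subroutine.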
Batching works the same as in PyTorch — add a leading batch dimension to input arrays and FTorch applies the model independently to each element.
This has always been the case, but we have clarified this in the documentation and added a worked example - 04) Batching.
Key points:
! Single inference (1D input)
real(sp), dimension(5), target :: in_single, out_single
! Batched inference (3D input)
real(sp), dimension(2,3,5), target :: in_batch, out_batch
call torch_model_load(model, "model.pt", torch_kCPU)
! Single
call torch_tensor_from_array(in_tensors(1), in_single, torch_kCPU)
call torch_tensor_from_array(out_tensors(1), out_single, torch_kCPU)
call torch_model_forward(model, in_tensors, out_tensors)
! Batched
call torch_tensor_from_array(in_tensors(1), in_batch, torch_kCPU)
call torch_tensor_from_array(out_tensors(1), out_batch, torch_kCPU)
call torch_model_forward(model, in_tensors, out_tensors)
Previously, errors in Torch resulted in opaque error messages that referenced locations in the Torch library at the point of failure.
Now checks in Fortran catch some input errors, and raise them there, before they propagate to the C++ layer. C++ errors are caught and handled in the CTorch layer to give the user more information about the point of failure.
Before: opaque Torch C++ exception
terminate called after throwing
an instance of 'torch::Error'
what(): Expected all tensors to
be on the same device, but found
at least two devices (cuda:0
and cpu)!
After: caught at CTorch layer
[ERROR]: One of the inputs to torch_jit_module_forward is not a Tensor
Internally, FTorch assumes that data passed in is contiguous in memory. This is because FTorch shares memory with Torch, rather than copying, for efficiency.
If this is violated then data could be read incorrectly by Torch.
Validation is now applied in torch_tensor_from_array, which requires the input data to be pointer, contiguous rather than simply target.
This is considered a bugfix, but if you were passing in temporaries you will now need to create an array first.
The following calls are now forbidden:
What do you need to do?
Nothing, provided you always passed arrays into torch_tensor_from_array.
How shall ye know it?
The following error will be raised in compilation:
Error: There is no specific subroutine for the generic 'torch_tensor_from_array' at (1)
with the 1 identifying the input data argument.
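Whether a given argument is contiguous can be checked in plain Fortran with the standard is_contiguous intrinsic. The sketch below does not call FTorch, and the can_share helper is hypothetical. It contrasts a whole array, a block of full columns, and a strided section, then shows the workaround of copying the section into its own array first.

```fortran
module contiguity_check
  implicit none
contains
  ! Hypothetical helper, not part of FTorch: reports whether the data
  ! behind an assumed-shape argument is contiguous in memory.
  logical function can_share(x)
    real, intent(in) :: x(:, :)
    can_share = is_contiguous(x)
  end function can_share
end module contiguity_check

program contiguity_demo
  use contiguity_check
  implicit none
  real, target :: a(6, 4)
  real, target :: buf(3, 4)

  a = 1.0
  print *, 'whole array  :', can_share(a)
  print *, 'column block :', can_share(a(:, 2:3))
  print *, 'strided rows :', can_share(a(1:6:2, :))

  ! Workaround: copy the section into a dedicated array first
  buf = a(1:6:2, :)
  print *, 'copied array :', can_share(buf)
end program contiguity_demo
```

With common compilers the strided section is typically reported as non-contiguous while the other three cases are contiguous. Sharing the strided memory directly would interleave unwanted elements, which is the situation the new validation is meant to prevent.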
FTorch v1.0 allowed running inference on GPU:
! Load model onto GPU device 0
call torch_model_load(torch_net_0, 'model.pt', torch_kCUDA, device_index=0)
! Cast data to tensors on different devices
call torch_tensor_from_array(in_tensors_0(1), in_data, torch_kCUDA, device_index=0)
call torch_tensor_from_array(out_tensor_0(1), out_data, torch_kCPU)
! Inference as usual — Torch handles cross-device transfers
call torch_model_forward(torch_net_0, in_tensors_0, out_tensor_0)
Supported backends: NVIDIA CUDA, Intel XPU, Apple Silicon MPS
Multiple devices supported (e.g., device_index=0, device_index=1).
FTorch now supports AMD GPUs via the HIP backend.
PyTorch recommends reusing torch.cuda interfaces for HIP — FTorch builds against the CUDA backend, aliasing HIP at the CMake level.
From the early days, all FTorch source code lived in src/ftorch.F90. As features grew, this became unsustainable, so it is now distributed across several files:
src
├── ctorch.cpp
├── ctorch.h
├── ftorch_devices.F90
├── ftorch_model.f90
├── ftorch_optim.f90
├── ftorch_tensor.f90
├── ftorch_tensor.fypp
├── ftorch_types.f90
└── ftorch.f90
No change to usage — users still import from a parent ftorch module.
pkg-config --libs ftorch and pkg-config --cflags ftorch (PR #464)
libftorch.so now includes RUNPATH to Torch so downstream targets find it automatically (PR #437)
10 Total, 5 new!
ICCS ran a two-day ML Coupling Workshop in Cambridge, September 2025, bringing together researchers, RSEs, and modelling centres working on hybrid modelling challenges.
autograd functionality from Torch
requires_grad argument on tensor construction
torch_tensor_backward for reverse-mode differentiation
torch_tensor_get_gradient to extract computed gradients
torch_tensor_zero_grad to reset gradients between backward passes
Overloaded operators (=, +, -, *, /, **)
torch_optim derived type wrapping PyTorch optimizers
torch_optim%zero_grad: zeroes gradients at start of each step
torch_optim%step: takes one optimizer iteration
SGD, Adam, AdamW
Testing optimisation of a single tensor against PyTorch:

torch_loss_<...> subroutines create loss tensors.
MSE and CrossEntropy
torch_model_parameters interface added to access model weights
Training SimpleNet in Fortran:

utils is now ftorch_utils, installable via pip
pt2ts is now a command-line script
No longer required to copy and modify script by hand.
Provide the model definition file, the class name, and where to save:
fpm (Fortran Package Manager)
Automatic differentiation for Fortran, via Enzyme
Compiler plugin that differentiates LLVM IR, computing gradients of existing code without source-level rewriting. Currently supports C, C++, Julia, Rust. ICCS are working to add Fortran.
6-month (March-Aug) ICCS project led by Joe Wallwork.
Aiming to expose Enzyme through a Flang plugin and demonstrate differentiable Fortran on a UKCA atmospheric chemistry case study.
Eventual goal is to provide a simple tool for differentiable modelling of existing Fortran code.
Get in touch:
Thanks to Joe Wallwork, Tom Meltzer, Elliott Kasoar,
Niccolò Zanotti and the rest of the FTorch team.
The ICCS received support from 
FTorch has been supported by 

