Allow specification for GPU device index #96

Merged on Mar 28, 2024 (43 commits).

Commits
- 83ff00a Have get_device use torch::Device (jwallwork23, Mar 19, 2024)
- a392900 Add device_number arg for get_device (jwallwork23, Mar 19, 2024)
- 2552c91 Throw error if device_number used in CPU-only case (jwallwork23, Mar 19, 2024)
- 9b0b7dd Disallow negative device number (jwallwork23, Mar 19, 2024)
- e44e3e6 Actually use the device number (jwallwork23, Mar 19, 2024)
- cf39472 Use device number for torch_zeros (jwallwork23, Mar 19, 2024)
- 01b8063 Use device number for torch_ones (jwallwork23, Mar 19, 2024)
- 530fa19 Use device number for torch_empty (jwallwork23, Mar 19, 2024)
- af7a8af Use device number for torch_from_blob (jwallwork23, Mar 19, 2024)
- e2fe070 Device and device number args for torch_module_load (jwallwork23, Mar 19, 2024)
- fd729a3 Pass device and device number to torch_jit_load by value (jwallwork23, Mar 19, 2024)
- 3b3e62c Make device number argument to torch_module_load optional (jwallwork23, Mar 19, 2024)
- 5fe34b0 Make device number argument to torch_tensor_from_array optional (jwallwork23, Mar 19, 2024)
- 3fe5258 Make device number argument to other subroutines optional (jwallwork23, Mar 19, 2024)
- 9ed2452 Make device argument to torch_module_load optional (jwallwork23, Mar 19, 2024)
- fbc6a12 Add function for determining device_index (jwallwork23, Mar 20, 2024)
- 58d28ed Rename device number as index (jwallwork23, Mar 20, 2024)
- 682d887 Rename device as device type (jwallwork23, Mar 20, 2024)
- 2d9698c Device index defaults to -1 on CPU and 0 on GPU (jwallwork23, Mar 20, 2024)
- ca40777 Make device type and index optional on C++ side (jwallwork23, Mar 20, 2024)
- e37f743 Fix typo in torch_model_load (jwallwork23, Mar 20, 2024)
- 8b63dfe Fix typos in example 1 (jwallwork23, Mar 22, 2024)
- 8982129 Initial draft of example 3_MultiGPU (jwallwork23, Mar 22, 2024)
- 1eec646 Differentiate between errors and warnings in C++ code (jwallwork23, Mar 25, 2024)
- 2739c16 Formatting (jwallwork23, Mar 25, 2024)
- fc18b52 Add mpi4py to requirements for example 3 (jwallwork23, Mar 25, 2024)
- 2b0086a Use mpi4py to differ inputs in simplenet_infer_python (jwallwork23, Mar 25, 2024)
- fced4c1 Raise ValueError for Python inference with invalid device (jwallwork23, Mar 25, 2024)
- 188b305 Print rank in Python case; updates to README (jwallwork23, Mar 25, 2024)
- dcfb153 Setup MPI for simplenet_infer_fortran, too (jwallwork23, Mar 25, 2024)
- 392afb9 Write formatting for example 3 (jwallwork23, Mar 25, 2024)
- 9fd3040 Add note on building with Make (jwallwork23, Mar 25, 2024)
- 24d5b6a Print before and after; mpi_finalise; output on CPU; comments (jwallwork23, Mar 27, 2024)
- a44e262 Merge branch 'main' into 85_gpu_device_number (jwallwork23, Mar 27, 2024)
- 5ebe845 Docs: device->device_type for consistency (jwallwork23, Mar 27, 2024)
- 18fca7b Add docs on MultiGPU (jwallwork23, Mar 27, 2024)
- 475a859 Update warning text for defaulting to 0 (jwallwork23, Mar 28, 2024)
- 3f26457 Mention MPI in requirements (jwallwork23, Mar 28, 2024)
- 3dba29a Update outputs for example 3 (jwallwork23, Mar 28, 2024)
- 0e3272e Use NP rather than 4 GPUs (jwallwork23, Mar 28, 2024)
- 99d3b5b Implement SimpleNet in example 3 but with a twist (jwallwork23, Mar 28, 2024)
- 99002d5 Add code snippets for multi-GPU doc section (jwallwork23, Mar 28, 2024)
- e2b68bd Add note about multiple GPU support to README.md. (jatkinson1000, Mar 28, 2024)
3 changes: 2 additions & 1 deletion README.md
@@ -187,7 +187,8 @@ adaptations to the code:
2. When using FTorch in Fortran, set the device for the input
tensor(s) to `torch_kCUDA`, rather than `torch_kCPU`.

- For detailed guidance about running on GPU please see the
+ For detailed guidance about running on GPU, including instructions for using multiple
+ devices, please see the
[online GPU documentation](https://cambridge-iccs.github.io/FTorch/page/gpu.html).

## Examples
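As a concrete illustration of the GPU guidance above, here is a minimal Fortran sketch of placing an input tensor on a specific GPU using the interface this PR introduces. The function name and `device_index` keyword follow the commit history above; the exact signature may differ, so treat this as an assumption rather than the definitive API:
```
program gpu_tensor_sketch
   ! Minimal sketch assuming the FTorch interface as extended by this PR;
   ! the exact signature of torch_tensor_from_array may differ.
   use ftorch
   implicit none

   real, dimension(5), target :: input_data = [0.0, 1.0, 2.0, 3.0, 4.0]
   integer :: tensor_layout(1) = [1]
   type(torch_tensor) :: input_tensor

   ! Request the CUDA device type and, optionally, a specific device index.
   ! Per this PR, device_index defaults to 0 on GPU (and -1 on CPU).
   input_tensor = torch_tensor_from_array(input_data, tensor_layout, &
                                          torch_kCUDA, device_index=0)

   call torch_tensor_delete(input_tensor)
end program gpu_tensor_sketch
```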
6 changes: 3 additions & 3 deletions examples/1_SimpleNet/README.md
@@ -9,7 +9,7 @@ covered in later examples.

## Description

- A python file `simplenet.py` is provided that defines a very simple pytorch 'net' that takes an input
+ A python file `simplenet.py` is provided that defines a very simple PyTorch 'net' that takes an input
vector of length 5 and applies a single `Linear` layer to multiply it by 2.

A modified version of the `pt2ts.py` tool saves this simple net to TorchScript.
@@ -29,7 +29,7 @@ To run this example requires:
## Running

To run this example install FTorch as described in the main documentation.
- Then from this directory create a virtual environment an install the necessary python
+ Then from this directory create a virtual environment and install the necessary python
modules:
```
python3 -m venv venv
@@ -47,7 +47,7 @@ tensor([[0, 2, 4, 6, 8]])
```

To save the SimpleNet model to TorchScript run the modified version of the
- `pt2ts.py` tool :
+ `pt2ts.py` tool:
```
python3 pt2ts.py
```
4 changes: 3 additions & 1 deletion examples/1_SimpleNet/simplenet_infer_python.py
@@ -38,14 +38,16 @@ def deploy(saved_model: str, device: str, batch_size: int = 1) -> torch.Tensor:
        output_gpu = model.forward(input_tensor_gpu)
        output = output_gpu.to(torch.device("cpu"))

+    else:
+        raise ValueError(f"Device '{device}' not recognised.")

    return output


if __name__ == "__main__":
    saved_model_file = "saved_simplenet_model_cpu.pt"

    device_to_run = "cpu"
    # device = "cuda"

    batch_size_to_run = 1

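For context, the device-dispatch pattern that this hunk completes looks roughly like the following condensed sketch. The `cpu` branch and the model-loading calls are inferred from the surrounding example rather than shown verbatim in the diff:
```
import torch


def deploy(saved_model: str, device: str, batch_size: int = 1) -> torch.Tensor:
    """Load a TorchScript model and run it on the requested device (sketch)."""
    input_tensor = torch.ones(batch_size, 5)

    if device == "cpu":
        model = torch.jit.load(saved_model)
        output = model.forward(input_tensor)
    elif device == "cuda":
        model = torch.jit.load(saved_model)
        input_tensor_gpu = input_tensor.to(torch.device("cuda"))
        output_gpu = model.forward(input_tensor_gpu)
        output = output_gpu.to(torch.device("cpu"))
    else:
        # New in this PR: fail loudly on unrecognised devices.
        raise ValueError(f"Device '{device}' not recognised.")

    return output
```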
3 changes: 3 additions & 0 deletions examples/2_ResNet18/resnet_infer_python.py
@@ -47,6 +47,9 @@ def deploy(saved_model: str, device: str, batch_size: int = 1) -> torch.Tensor:
        output_gpu = model.forward(input_tensor_gpu)
        output = output_gpu.to(torch.device("cpu"))

+    else:
+        raise ValueError(f"Device '{device}' not recognised.")

    return output


21 changes: 21 additions & 0 deletions examples/3_MultiGPU/CMakeLists.txt
@@ -0,0 +1,21 @@
cmake_minimum_required(VERSION 3.1 FATAL_ERROR)
#policy CMP0076 - target_sources source files are relative to file where target_sources is run
cmake_policy (SET CMP0076 NEW)

set(PROJECT_NAME MultiGPUExample)

project(${PROJECT_NAME} LANGUAGES Fortran)

# Build in Debug mode if not specified
if(NOT CMAKE_BUILD_TYPE)
set(CMAKE_BUILD_TYPE Debug CACHE STRING "" FORCE)
endif()

find_package(FTorch)
find_package(MPI REQUIRED)
message(STATUS "Building with Fortran PyTorch coupling")

# Fortran example
add_executable(simplenet_infer_fortran simplenet_infer_fortran.f90)
target_link_libraries(simplenet_infer_fortran PRIVATE FTorch::ftorch)
target_link_libraries(simplenet_infer_fortran PRIVATE MPI::MPI_Fortran)
113 changes: 113 additions & 0 deletions examples/3_MultiGPU/README.md
@@ -0,0 +1,113 @@
# Example 3 - MultiGPU

This example revisits the SimpleNet example and demonstrates how to run it using
multiple GPU devices.


## Description

The same Python file `simplenet.py` from the earlier example is used. Recall that it
defines a very simple PyTorch network that takes an input of length 5 and applies a
single `Linear` layer to multiply it by 2.

The same `pt2ts.py` tool is used to save the simple network to TorchScript.

A series of files `simplenet_infer_<LANG>` then bind to the TorchScript model from
other languages to run it in inference mode.

## Dependencies

To run this example requires:

- cmake
- An MPI installation.
- mpif90
- FTorch (installed as described in main package)
- python3

## Running

To run this example install FTorch as described in the main documentation. Then from
this directory create a virtual environment and install the necessary python modules:
```
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```

You can check that everything is working by running `simplenet.py`:
```
python3 simplenet.py
```
As before, this defines the network and runs it with an input tensor
[0.0, 1.0, 2.0, 3.0, 4.0] to produce the result:
```
tensor([[0, 2, 4, 6, 8]])
```

To save the SimpleNet model to TorchScript run the modified version of the `pt2ts.py`
tool:
```
python3 pt2ts.py
```
which will generate `saved_simplenet_model_cuda.pt` - the TorchScript instance of the
network. The only difference from the earlier example is that the model is built to
be run using CUDA rather than on CPU.

You can check that everything is working by running the `simplenet_infer_python.py`
script. It's set up with MPI such that a different GPU device is associated with each
MPI rank. You should substitute `<NP>` with the number of GPUs you wish to run with:
```
mpiexec -np <NP> python3 simplenet_infer_python.py
```
This reads the model in from the TorchScript file and runs it with a different input
tensor on each GPU device: [0.0, 1.0, 2.0, 3.0, 4.0], plus the device index in each
entry. The result should be (some permutation of):
```
0: tensor([[0., 2., 4., 6., 8.]])
1: tensor([[ 2., 4., 6., 8., 10.]])
2: tensor([[ 4., 6., 8., 10., 12.]])
3: tensor([[ 6., 8., 10., 12., 14.]])
```
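
The mapping from MPI rank to GPU device used above looks roughly like the following
sketch (mpi4py and `map_location` usage assumed; `simplenet_infer_python.py` is the
authoritative version):
```
import torch
from mpi4py import MPI

# One GPU per MPI rank (assumes the rank count does not exceed visible devices).
rank = MPI.COMM_WORLD.rank
device = torch.device(f"cuda:{rank}")

# Offset the input tensor by the rank, as in the example.
input_tensor = torch.tensor([[0.0, 1.0, 2.0, 3.0, 4.0]]) + rank

model = torch.jit.load("saved_simplenet_model_cuda.pt", map_location=device)
output = model(input_tensor.to(device)).to("cpu")
print(f"{rank}: {output}")
```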

At this point we no longer require Python, so we can deactivate the virtual environment:
```
deactivate
```

To call the saved SimpleNet model from Fortran we need to compile the
`simplenet_infer` files. This can be done using the included `CMakeLists.txt` as
follows, noting that we need to use an MPI-enabled Fortran compiler:
```
mkdir build
cd build
cmake .. -DCMAKE_PREFIX_PATH=<path/to/your/installation/of/library/> -DCMAKE_BUILD_TYPE=Release
cmake --build .
```

To run the compiled code calling the saved SimpleNet TorchScript from Fortran, run the
executable with an argument of the saved model file. Again, specify the number of MPI
processes according to the desired number of GPUs:
```
mpiexec -np <NP> ./simplenet_infer_fortran ../saved_simplenet_model_cuda.pt
```

This runs the model with the same inputs as described above and should produce (some
permutation of) the output:
```
input on rank0: [ 0.0, 1.0, 2.0, 3.0, 4.0]
input on rank1: [ 1.0, 2.0, 3.0, 4.0, 5.0]
input on rank2: [ 2.0, 3.0, 4.0, 5.0, 6.0]
input on rank3: [ 3.0, 4.0, 5.0, 6.0, 7.0]
output on rank0: [ 0.0, 2.0, 4.0, 6.0, 8.0]
output on rank1: [ 2.0, 4.0, 6.0, 8.0, 10.0]
output on rank2: [ 4.0, 6.0, 8.0, 10.0, 12.0]
output on rank3: [ 6.0, 8.0, 10.0, 12.0, 14.0]
```
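
The per-rank device selection on the Fortran side looks roughly like this sketch
(FTorch routine names and the `device_index` argument are assumed from the commit
history; `simplenet_infer_fortran.f90` is the authoritative version):
```
program multigpu_sketch
   ! Hedged sketch of per-rank device selection; FTorch names and
   ! signatures assumed, see simplenet_infer_fortran.f90 for the real code.
   use mpi
   use ftorch
   implicit none

   integer :: rank, ierr
   type(torch_module) :: model

   call mpi_init(ierr)
   call mpi_comm_rank(mpi_comm_world, rank, ierr)

   ! Load the TorchScript model onto the GPU matching this MPI rank.
   model = torch_module_load("saved_simplenet_model_cuda.pt", &
                             device_type=torch_kCUDA, device_index=rank)

   ! ... build input tensors with device_index=rank and run inference here ...

   call torch_module_delete(model)
   call mpi_finalize(ierr)
end program multigpu_sketch
```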

Alternatively, we can use `make` instead of CMake, copying the Makefile over from the
first example:
```
cp ../1_SimpleNet/Makefile .
```
See the instructions in that example directory for further details.
158 changes: 158 additions & 0 deletions examples/3_MultiGPU/pt2ts.py
@@ -0,0 +1,158 @@
"""Load a PyTorch model and convert it to TorchScript."""

from typing import Optional
import torch

# FPTLIB-TODO
# Add a module import with your model here:
# This example assumes the model architecture is in an adjacent module `my_ml_model.py`
import simplenet


def script_to_torchscript(
model: torch.nn.Module, filename: Optional[str] = "scripted_model.pt"
) -> None:
"""
Save PyTorch model to TorchScript using scripting.

Parameters
----------
model : torch.NN.Module
a PyTorch model
filename : str
name of file to save to
"""
print("Saving model using scripting...", end="")
scripted_model = torch.jit.script(model)
# print(scripted_model.code)
scripted_model.save(filename)
print("done.")


def trace_to_torchscript(
model: torch.nn.Module,
dummy_input: torch.Tensor,
filename: Optional[str] = "traced_model.pt",
) -> None:
"""
Save PyTorch model to TorchScript using tracing.

Parameters
----------
model : torch.NN.Module
a PyTorch model
dummy_input : torch.Tensor
appropriate size Tensor to act as input to model
filename : str
name of file to save to
"""
print("Saving model using tracing...", end="")
traced_model = torch.jit.trace(model, dummy_input)
frozen_model = torch.jit.freeze(traced_model)
## print(frozen_model.graph)
## print(frozen_model.code)
frozen_model.save(filename)
print("done.")


def load_torchscript(filename: Optional[str] = "saved_model.pt") -> torch.nn.Module:
"""
Load a TorchScript from file.

Parameters
----------
filename : str
name of file containing TorchScript model
"""
model = torch.jit.load(filename)

return model


if __name__ == "__main__":
# =====================================================
# Load model and prepare for saving
# =====================================================

# FPTLIB-TODO
# Load a pre-trained PyTorch model
# Insert code here to load your model as `trained_model`.
# This example assumes my_ml_model has a method `initialize` to load
# architecture, weights, and place in inference mode
trained_model = simplenet.SimpleNet()

# Switch off specific layers/parts of the model that behave
# differently during training and inference.
# This may have been done by the user already, so just make sure here.
trained_model.eval()

# =====================================================
# Prepare dummy input and check model runs
# =====================================================

# FPTLIB-TODO
# Generate a dummy input Tensor `dummy_input` to the model of appropriate size.
# This example assumes one input of size (5)
trained_model_dummy_input = torch.ones(5)

# FPTLIB-TODO
# Uncomment the following lines to save for inference on GPU (rather than CPU):
device = torch.device("cuda")
trained_model = trained_model.to(device)
trained_model.eval()
trained_model_dummy_input = trained_model_dummy_input.to(device)

# FPTLIB-TODO
# Run model for dummy inputs
# If something isn't working This will generate an error
trained_model_dummy_output = trained_model(
trained_model_dummy_input,
)

# =====================================================
# Save model
# =====================================================

# FPTLIB-TODO
# Set the name of the file you want to save the torchscript model to:
saved_ts_filename = "saved_simplenet_model_cuda.pt"

# FPTLIB-TODO
# Save the PyTorch model using either scripting (recommended where possible) or tracing
# -----------
# Scripting
# -----------
script_to_torchscript(trained_model, filename=saved_ts_filename)

# -----------
# Tracing
# -----------
# trace_to_torchscript(trained_model, trained_model_dummy_input, filename=saved_ts_filename)

print(f"Saved model to TorchScript in '{saved_ts_filename}'.")

# =====================================================
# Check model saved OK
# =====================================================

# Load torchscript and run model as a test
# FPTLIB-TODO
# Scale inputs as above and, if required, move inputs and mode to GPU
trained_model_dummy_input = 2.0 * trained_model_dummy_input
trained_model_dummy_input = trained_model_dummy_input.to("cuda")
trained_model_testing_output = trained_model(
trained_model_dummy_input,
)
ts_model = load_torchscript(filename=saved_ts_filename)
ts_model_output = ts_model(
trained_model_dummy_input,
)

if torch.all(ts_model_output.eq(trained_model_testing_output)):
print("Saved TorchScript model working as expected in a basic test.")
print("Users should perform further validation as appropriate.")
else:
raise RuntimeError(
"Saved Torchscript model is not performing as expected.\n"
"Consider using scripting if you used tracing, or investigate further."
)
2 changes: 2 additions & 0 deletions examples/3_MultiGPU/requirements.txt
@@ -0,0 +1,2 @@
mpi4py
torch