-
-

torch.compile for GPU

+
+

torch.compile for GPU (Experimental)

Introduction

Intel® Extension for PyTorch* now lets users apply graph compilation to PyTorch models on Intel GPUs through the torch.compile API with the default “inductor” backend (TorchInductor). The Triton compiler is the core of Inductor code generation and supports various accelerator devices; Intel has extended TorchInductor by adding Intel GPU support to Triton. In addition, post-op fusions for convolution and matrix multiplication, implemented with oneDNN fusion kernels, improve the efficiency of computationally intensive operations. Using these features requires nothing more than selecting the default “inductor” backend.
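As a rough illustration of the workflow described above, the following minimal sketch compiles a small module with the default backend and runs it once on an Intel GPU. The helper name `run_compiled_inference`, the toy model, and the tensor shapes are hypothetical and chosen only for illustration. The sketch assumes that `torch` and `intel_extension_for_pytorch` are installed and that an `xpu` device is available.

```python
def run_compiled_inference(batch_size=8, features=16):
    """Compile a toy model with the default inductor backend and run it once on xpu."""
    # Imported lazily so this file can be loaded on machines without the GPU stack.
    import torch
    import intel_extension_for_pytorch  # noqa: F401  (assumed to register the "xpu" device)

    # Hypothetical toy model; substitute your own nn.Module.
    model = torch.nn.Sequential(
        torch.nn.Linear(features, 32),
        torch.nn.ReLU(),
        torch.nn.Linear(32, 4),
    ).to("xpu").eval()

    # backend="inductor" is the default and is spelled out only for clarity.
    compiled_model = torch.compile(model, backend="inductor")

    x = torch.rand(batch_size, features, device="xpu")
    with torch.no_grad():
        return compiled_model(x)
```

Calling `run_compiled_inference()` on a supported machine should return a tensor with one row per input sample. The first call is expected to be slower than later ones because compilation happens lazily on first use.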

-

Note: torch.compile for GPU is an experimental feature and available from 2.1.10. So far, the feature is functional on Intel® GPU Max Series.

+

Note: torch.compile for GPU is an experimental feature and available from 2.1.10. So far, the feature is functional on Intel® Data Center GPU Max Series.

+
+
+

Required Dependencies

+

Verified version:

+
    +
  • torch : v2.1.0

  • +
  • intel_extension_for_pytorch : v2.1.10

  • +
  • triton : v2.1.0 with Intel® XPU Backend for Triton* enabled.

  • +
+

Follow Intel® Extension for PyTorch* Installation to install torch and intel_extension_for_pytorch first.

+

Then install the triton package with the Intel® XPU Backend for Triton* enabled. You can install it from a prebuilt wheel package or build it from source. We recommend the prebuilt package:

+
    +
  • Download the wheel package from release page. Note that you don’t need to install the LLVM release manually.

  • +
  • Install the wheel package by pip install. Note that this wheel package is a triton package with Intel GPU support, so you don’t need to pip install triton again.

  • +
+
python -m pip install --force-reinstall  triton-2.1.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
+
+
+

Please follow the Intel® XPU Backend for Triton* Installation for more detailed installation steps.
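After installation, it can be useful to confirm that the installed packages match the verified versions listed above. The following is a minimal sketch using only the Python standard library. The helper name `check_versions` is hypothetical, and the sketch assumes that the distribution names match the package names given in this document and that a prefix match is sufficient (so that local suffixes such as `+xpu` are tolerated).

```python
from importlib import metadata

# Verified versions taken from the list in this document.
VERIFIED = {
    "torch": "2.1.0",
    "intel_extension_for_pytorch": "2.1.10",
    "triton": "2.1.0",
}


def check_versions(expected=VERIFIED):
    """Return {package: (installed_version_or_None, matches_expected)}."""
    report = {}
    for name, want in expected.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            found = None
        # Prefix match so that local version suffixes do not cause a mismatch.
        ok = found is not None and found.startswith(want)
        report[name] = (found, ok)
    return report


for pkg, (found, ok) in check_versions().items():
    status = "OK" if ok else "MISMATCH"
    print(f"{pkg}: installed={found} [{status}]")
```

A `MISMATCH` line does not necessarily mean the setup is broken; it only indicates that the installed version differs from the combination listed as verified.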

+

Note that if you install triton with the make triton command inside the PyTorch* repo, the installed triton is not compiled with Intel GPU support by default; you need to set TRITON_CODEGEN_INTEL_XPU_BACKEND=1 manually to enable it. In addition, when building from source via the triton repo, pin the commit to a tested triton commit. See the “build from the source” section of the Intel® XPU Backend for Triton* Installation guide for more information about building the triton package from source.

Inference with torch.compile

import torch
@@ -203,7 +224,7 @@ 

Training with torch.compile

Sphinx using a theme provided by Read the Docs. - +

diff --git a/xpu/2.1.10+xpu/tutorials/getting_started.html b/xpu/2.1.10+xpu/tutorials/getting_started.html index c7c4c8fc4..6c4d1e443 100644 --- a/xpu/2.1.10+xpu/tutorials/getting_started.html +++ b/xpu/2.1.10+xpu/tutorials/getting_started.html @@ -183,7 +183,7 @@

ExecutionSphinx using a theme provided by Read the Docs. - +

diff --git a/xpu/2.1.10+xpu/tutorials/installation.html b/xpu/2.1.10+xpu/tutorials/installation.html index c09d6e312..9c8cb5d01 100644 --- a/xpu/2.1.10+xpu/tutorials/installation.html +++ b/xpu/2.1.10+xpu/tutorials/installation.html @@ -131,7 +131,7 @@

InstallationSphinx using a theme provided by Read the Docs. - +

diff --git a/xpu/2.1.10+xpu/tutorials/introduction.html b/xpu/2.1.10+xpu/tutorials/introduction.html index 6aaefb996..dcd3b42e9 100644 --- a/xpu/2.1.10+xpu/tutorials/introduction.html +++ b/xpu/2.1.10+xpu/tutorials/introduction.html @@ -149,7 +149,7 @@

API DocumentationSphinx using a theme provided by Read the Docs. - +

diff --git a/xpu/2.1.10+xpu/tutorials/license.html b/xpu/2.1.10+xpu/tutorials/license.html index 55317571d..ba2708920 100644 --- a/xpu/2.1.10+xpu/tutorials/license.html +++ b/xpu/2.1.10+xpu/tutorials/license.html @@ -132,7 +132,7 @@

LicenseSphinx using a theme provided by Read the Docs. - +

diff --git a/xpu/2.1.10+xpu/tutorials/llm.html b/xpu/2.1.10+xpu/tutorials/llm.html index 30c5e8cb5..be1fcede3 100644 --- a/xpu/2.1.10+xpu/tutorials/llm.html +++ b/xpu/2.1.10+xpu/tutorials/llm.html @@ -239,7 +239,7 @@

Low Precision Data TypesSphinx using a theme provided by Read the Docs. - +

diff --git a/xpu/2.1.10+xpu/tutorials/llm/llm_optimize_transformers.html b/xpu/2.1.10+xpu/tutorials/llm/llm_optimize_transformers.html index 2d20a9d2d..ee56119df 100644 --- a/xpu/2.1.10+xpu/tutorials/llm/llm_optimize_transformers.html +++ b/xpu/2.1.10+xpu/tutorials/llm/llm_optimize_transformers.html @@ -278,7 +278,7 @@

Distributed Inference with DeepSpeedSphinx using a theme provided by Read the Docs. - +

diff --git a/xpu/2.1.10+xpu/tutorials/performance_tuning.html b/xpu/2.1.10+xpu/tutorials/performance_tuning.html index b4c22e4a6..b720bac86 100644 --- a/xpu/2.1.10+xpu/tutorials/performance_tuning.html +++ b/xpu/2.1.10+xpu/tutorials/performance_tuning.html @@ -133,7 +133,7 @@

Performance Tuning GuideSphinx using a theme provided by Read the Docs. - +

diff --git a/xpu/2.1.10+xpu/tutorials/performance_tuning/known_issues.html b/xpu/2.1.10+xpu/tutorials/performance_tuning/known_issues.html index 6ba435681..cef3ba928 100644 --- a/xpu/2.1.10+xpu/tutorials/performance_tuning/known_issues.html +++ b/xpu/2.1.10+xpu/tutorials/performance_tuning/known_issues.html @@ -483,7 +483,7 @@

Float32 TrainingSphinx using a theme provided by Read the Docs. - +

diff --git a/xpu/2.1.10+xpu/tutorials/performance_tuning/launch_script.html b/xpu/2.1.10+xpu/tutorials/performance_tuning/launch_script.html index 2a89b6287..38ca50938 100644 --- a/xpu/2.1.10+xpu/tutorials/performance_tuning/launch_script.html +++ b/xpu/2.1.10+xpu/tutorials/performance_tuning/launch_script.html @@ -829,7 +829,7 @@

GNU OpenMP LibrarySphinx using a theme provided by Read the Docs. - +

diff --git a/xpu/2.1.10+xpu/tutorials/performance_tuning/torchserve.html b/xpu/2.1.10+xpu/tutorials/performance_tuning/torchserve.html index 90f117d15..09d878160 100644 --- a/xpu/2.1.10+xpu/tutorials/performance_tuning/torchserve.html +++ b/xpu/2.1.10+xpu/tutorials/performance_tuning/torchserve.html @@ -462,7 +462,7 @@

Performance Boost with Intel® Extension for PyTorch* and LauncherSphinx using a theme provided by Read the Docs. - +

diff --git a/xpu/2.1.10+xpu/tutorials/performance_tuning/tuning_guide.html b/xpu/2.1.10+xpu/tutorials/performance_tuning/tuning_guide.html index 1bec21fd2..47dee8280 100644 --- a/xpu/2.1.10+xpu/tutorials/performance_tuning/tuning_guide.html +++ b/xpu/2.1.10+xpu/tutorials/performance_tuning/tuning_guide.html @@ -358,7 +358,7 @@

OneDNN primitive cacheSphinx using a theme provided by Read the Docs. - +

diff --git a/xpu/2.1.10+xpu/tutorials/releases.html b/xpu/2.1.10+xpu/tutorials/releases.html index aa2aff214..9d63b2617 100644 --- a/xpu/2.1.10+xpu/tutorials/releases.html +++ b/xpu/2.1.10+xpu/tutorials/releases.html @@ -417,7 +417,7 @@

Known IssuesSphinx using a theme provided by Read the Docs. - +

diff --git a/xpu/2.1.10+xpu/tutorials/technical_details.html b/xpu/2.1.10+xpu/tutorials/technical_details.html index 34e131818..e47259787 100644 --- a/xpu/2.1.10+xpu/tutorials/technical_details.html +++ b/xpu/2.1.10+xpu/tutorials/technical_details.html @@ -194,7 +194,7 @@

Ahead of Time Compilation (AOT) [GPU]Sphinx using a theme provided by Read the Docs. - +

diff --git a/xpu/2.1.10+xpu/tutorials/technical_details/AOT.html b/xpu/2.1.10+xpu/tutorials/technical_details/AOT.html index df3d0e27d..ff0f23418 100644 --- a/xpu/2.1.10+xpu/tutorials/technical_details/AOT.html +++ b/xpu/2.1.10+xpu/tutorials/technical_details/AOT.html @@ -181,7 +181,7 @@

RequirementSphinx using a theme provided by Read the Docs. - +

diff --git a/xpu/2.1.10+xpu/tutorials/technical_details/graph_optimization.html b/xpu/2.1.10+xpu/tutorials/technical_details/graph_optimization.html index 4a471e99e..3d2d94b9f 100644 --- a/xpu/2.1.10+xpu/tutorials/technical_details/graph_optimization.html +++ b/xpu/2.1.10+xpu/tutorials/technical_details/graph_optimization.html @@ -351,7 +351,7 @@

FoldingSphinx using a theme provided by Read the Docs. - +

diff --git a/xpu/2.1.10+xpu/tutorials/technical_details/isa_dynamic_dispatch.html b/xpu/2.1.10+xpu/tutorials/technical_details/isa_dynamic_dispatch.html index d4143f90d..063f495db 100644 --- a/xpu/2.1.10+xpu/tutorials/technical_details/isa_dynamic_dispatch.html +++ b/xpu/2.1.10+xpu/tutorials/technical_details/isa_dynamic_dispatch.html @@ -331,7 +331,7 @@

CPU feature checkSphinx using a theme provided by Read the Docs. - +

diff --git a/xpu/2.1.10+xpu/tutorials/technical_details/memory_management.html b/xpu/2.1.10+xpu/tutorials/technical_details/memory_management.html index 91124ea47..aa8cf7290 100644 --- a/xpu/2.1.10+xpu/tutorials/technical_details/memory_management.html +++ b/xpu/2.1.10+xpu/tutorials/technical_details/memory_management.html @@ -153,7 +153,7 @@

Memory ManagementSphinx using a theme provided by Read the Docs. - +

diff --git a/xpu/2.1.10+xpu/tutorials/technical_details/optimizer_fusion_cpu.html b/xpu/2.1.10+xpu/tutorials/technical_details/optimizer_fusion_cpu.html index aa9998572..fa377c88f 100644 --- a/xpu/2.1.10+xpu/tutorials/technical_details/optimizer_fusion_cpu.html +++ b/xpu/2.1.10+xpu/tutorials/technical_details/optimizer_fusion_cpu.html @@ -175,7 +175,7 @@

Operation FusionSphinx using a theme provided by Read the Docs. - +

diff --git a/xpu/2.1.10+xpu/tutorials/technical_details/optimizer_fusion_gpu.html b/xpu/2.1.10+xpu/tutorials/technical_details/optimizer_fusion_gpu.html index bb60010d2..ccc51c198 100644 --- a/xpu/2.1.10+xpu/tutorials/technical_details/optimizer_fusion_gpu.html +++ b/xpu/2.1.10+xpu/tutorials/technical_details/optimizer_fusion_gpu.html @@ -180,7 +180,7 @@

Operation FusionSphinx using a theme provided by Read the Docs. - +

diff --git a/xpu/2.1.10+xpu/tutorials/technical_details/split_sgd.html b/xpu/2.1.10+xpu/tutorials/technical_details/split_sgd.html index cb68236a1..edc431d7b 100644 --- a/xpu/2.1.10+xpu/tutorials/technical_details/split_sgd.html +++ b/xpu/2.1.10+xpu/tutorials/technical_details/split_sgd.html @@ -209,7 +209,7 @@

Split SGDSphinx using a theme provided by Read the Docs. - +