diff --git a/_static/img/python_extension_autoload_impl.png b/_static/img/python_extension_autoload_impl.png
new file mode 100644
index 0000000000..64e18fc7b4
Binary files /dev/null and b/_static/img/python_extension_autoload_impl.png differ
diff --git a/advanced_source/python_extension_autoload.rst b/advanced_source/python_extension_autoload.rst
new file mode 100644
index 0000000000..ee7af5d49e
--- /dev/null
+++ b/advanced_source/python_extension_autoload.rst
@@ -0,0 +1,184 @@
Autoloading Out-of-Tree Extension
=================================

**Author:** `Yuanhao Ji `__

The extension autoloading mechanism enables PyTorch to automatically
load out-of-tree backend extensions without explicit import statements.
It lets users follow the familiar PyTorch device programming model
without having to explicitly load or import device-specific extensions,
and it allows existing PyTorch applications to be adopted on out-of-tree
devices with zero code changes. For further details, refer to the
`[RFC] Autoload Device Extension `_.

.. grid:: 2

    .. grid-item-card:: :octicon:`mortar-board;1em;` What you will learn
       :class-card: card-prerequisites

       * How to use out-of-tree extension autoloading in PyTorch
       * Review examples with Intel Gaudi HPU and Huawei Ascend NPU

    .. grid-item-card:: :octicon:`list-unordered;1em;` Prerequisites
       :class-card: card-prerequisites

       * PyTorch v2.5 or later

.. note::

    This feature is enabled by default and can be disabled by using
    ``export TORCH_DEVICE_BACKEND_AUTOLOAD=0``.
    If you get an error such as "Failed to load the backend extension",
    the error is independent of PyTorch; disable this feature and ask the
    out-of-tree extension maintainer for help.

How to apply this mechanism to out-of-tree extensions?
------------------------------------------------------

For instance, suppose you have a backend named ``foo`` and a corresponding package named ``torch_foo``. Ensure that
your package is compatible with PyTorch 2.5 or later and includes the following snippet in its ``__init__.py`` file:

.. code-block:: python

    def _autoload():
        print("Check things are working with `torch.foo.is_available()`.")

Then, the only thing you need to do is define an entry point within your Python package:

.. code-block:: python

    from setuptools import setup

    setup(
        name="torch_foo",
        version="1.0",
        entry_points={
            "torch.backends": [
                "torch_foo = torch_foo:_autoload",
            ],
        }
    )

Now you can import the ``torch_foo`` module by simply adding the ``import torch`` statement, without the need to add ``import torch_foo``:

.. code-block:: python

    >>> import torch
    Check things are working with `torch.foo.is_available()`.
    >>> torch.foo.is_available()
    True

In some cases, you might encounter issues with circular imports. The examples below demonstrate how to address them.

Examples
^^^^^^^^

In these examples, we will use Intel Gaudi HPU and Huawei Ascend NPU to show how to
integrate an out-of-tree extension with PyTorch using the autoloading feature.

`habana_frameworks.torch`_ is a Python package that enables users to run
PyTorch programs on Intel Gaudi by using the PyTorch ``HPU`` device key.

.. _habana_frameworks.torch: https://docs.habana.ai/en/latest/PyTorch/Getting_Started_with_PyTorch_and_Gaudi/Getting_Started_with_PyTorch.html
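Once a package like this is installed, you can check whether its autoload hook is registered without importing anything, by querying the ``torch.backends`` entry point group through the standard library. The following is an illustrative sketch, assuming Python 3.10 or later for the ``group`` keyword:

.. code-block:: python

    from importlib.metadata import entry_points

    # List every autoload hook registered in the current environment.
    # Each entry maps a name to a "module:function" reference, for example
    # "device_backend = habana_frameworks:__autoload".
    backend_eps = entry_points(group="torch.backends")
    for ep in backend_eps:
        print(f"{ep.name} = {ep.value}")

If the package is installed correctly, its entry appears here even before ``import torch`` is run.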
Since ``habana_frameworks.torch`` is a submodule of ``habana_frameworks``, we add an entry point for
``__autoload()`` in ``habana_frameworks/setup.py``:

.. code-block:: diff

      setup(
          name="habana_frameworks",
          version="2.5",
    +     entry_points={
    +         'torch.backends': [
    +             "device_backend = habana_frameworks:__autoload",
    +         ],
    +     }
      )

In ``habana_frameworks/__init__.py``, we use a global variable to track if our module has been loaded:

.. code-block:: python

    is_loaded = False  # A member variable of the habana_frameworks module to track if it has been imported

    def __autoload():
        # This is an entry point for the PyTorch autoload mechanism.
        # If the following condition is true, our backend has already been loaded,
        # either explicitly or by the autoload mechanism, and importing it again
        # should be skipped to avoid circular imports.
        global is_loaded
        if is_loaded:
            return
        import habana_frameworks.torch

In ``habana_frameworks/torch/__init__.py``, we prevent circular imports by updating the state of the global variable:

.. code-block:: python

    # This is to prevent the torch autoload mechanism from causing circular imports.
    import habana_frameworks

    habana_frameworks.is_loaded = True

`torch_npu`_ enables users to run PyTorch programs on Huawei Ascend NPU. It
leverages the ``PrivateUse1`` device key and exposes the device name
as ``npu`` to the end users.

.. _torch_npu: https://github.com/Ascend/pytorch

We define an entry point in `torch_npu/setup.py`_:

.. _torch_npu/setup.py: https://github.com/Ascend/pytorch/blob/master/setup.py#L618

.. code-block:: diff

      setup(
          name="torch_npu",
          version="2.5",
    +     entry_points={
    +         'torch.backends': [
    +             'torch_npu = torch_npu:_autoload',
    +         ],
    +     }
      )
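The guard used by ``habana_frameworks`` above can be distilled into a small self-contained sketch. The names here (``_load_backend``, ``load_count``) are illustrative stand-ins and do not belong to any real package:

.. code-block:: python

    is_loaded = False  # Has the backend been imported yet?
    load_count = 0     # Illustration only: counts how many times the backend is loaded.

    def _load_backend():
        # Stands in for `import habana_frameworks.torch` in the real package.
        global load_count
        load_count += 1

    def _autoload():
        # Entry point called by the PyTorch autoload mechanism. A second call
        # (for example, triggered by an explicit user import) is a no-op,
        # which is what breaks the circular-import cycle.
        global is_loaded
        if is_loaded:
            return
        is_loaded = True
        _load_backend()

    _autoload()
    _autoload()  # no-op: the backend is loaded exactly once

Whether autoloading fires first or the user imports the backend explicitly first, the import runs only once.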
Unlike ``habana_frameworks``, ``torch_npu`` uses the environment variable ``TORCH_DEVICE_BACKEND_AUTOLOAD``
to control the autoloading process. For example, we set it to ``0`` to disable autoloading and prevent circular imports:

.. code-block:: python

    import os

    # Disable autoloading before running 'import torch'
    os.environ['TORCH_DEVICE_BACKEND_AUTOLOAD'] = '0'

    import torch

How it works
------------

.. image:: ../_static/img/python_extension_autoload_impl.png
   :alt: Autoloading implementation
   :align: center

Autoloading is implemented based on Python's `Entrypoints
`_
mechanism. In ``torch/__init__.py``, we discover and load all of the
entry points in this group that are defined by out-of-tree extensions.

As shown in the figure above, after ``torch_foo`` is installed, ``import torch``
loads the entry point that you have defined, which imports your module and lets
it perform any necessary initialization.

See the implementation in this pull request: `[RFC] Add support for device extension autoloading
`_.

Conclusion
----------

In this tutorial, we learned about the out-of-tree extension autoloading mechanism in PyTorch, which automatically
loads backend extensions, eliminating the need to add explicit import statements. We also learned how to apply
this mechanism to out-of-tree extensions by defining an entry point and how to prevent circular imports.
Finally, we reviewed examples of using the autoloading mechanism with Intel Gaudi HPU and Huawei Ascend NPU.
diff --git a/index.rst b/index.rst
index 95c4a8f3ef..d0287ef026 100644
--- a/index.rst
+++ b/index.rst
@@ -509,6 +509,13 @@ Welcome to PyTorch Tutorials
    :link: advanced/privateuseone.html
    :tags: Extending-PyTorch,Frontend-APIs,C++
 
+.. customcarditem::
+   :header: Out-of-tree extension autoloading in Python
+   :card_description: Learn how to integrate an out-of-tree extension with PyTorch seamlessly using the autoloading mechanism.
+   :image: _static/img/thumbnails/cropped/generic-pytorch-logo.png
+   :link: advanced/python_extension_autoload.html
+   :tags: Extending-PyTorch,Frontend-APIs
+
 .. customcarditem::
    :header: Custom Function Tutorial: Double Backward
    :card_description: Learn how to write a custom autograd Function that supports double backward.
@@ -1110,6 +1117,7 @@ Additional Resources
    advanced/dispatcher
    advanced/extend_dispatcher
    advanced/privateuseone
+   advanced/python_extension_autoload
 
 .. toctree::
    :maxdepth: 2