docs: improve docs for environment variables #4070

Merged · 4 commits · Aug 21, 2024
79 changes: 79 additions & 0 deletions doc/env.md
@@ -0,0 +1,79 @@
# Runtime environment variables

:::{note}
For build-time environment variables, see [Install from source code](./install/install-from-source.md).
:::

## All interfaces

:::{envvar} DP_INTER_OP_PARALLELISM_THREADS

**Alias**: `TF_INTER_OP_PARALLELISM_THREADS`
**Default**: `0`

Control inter-operator parallelism, i.e., how many independent TensorFlow (when TensorFlow is built against Eigen) and PyTorch native OPs run concurrently on CPU devices.
See [How to control the parallelism of a job](./troubleshooting/howtoset_num_nodes.md) for details.
:::

:::{envvar} DP_INTRA_OP_PARALLELISM_THREADS

**Alias**: `TF_INTRA_OP_PARALLELISM_THREADS`
**Default**: `0`

Control intra-operator parallelism, i.e., the number of threads used within an individual TensorFlow (when TensorFlow is built against Eigen) or PyTorch native OP.
See [How to control the parallelism of a job](./troubleshooting/howtoset_num_nodes.md) for details.
:::
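
For example, to allow two independent OPs to run concurrently with two threads each (a minimal sketch; `input.json` stands in for your actual training input):

```bash
# Threads used inside a single operator
export DP_INTRA_OP_PARALLELISM_THREADS=2
# Independent operators executed concurrently
export DP_INTER_OP_PARALLELISM_THREADS=2
dp train input.json
```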

## Environment variables of dependencies

- If OpenMP is used, [OpenMP environment variables](https://www.openmp.org/spec-html/5.0/openmpch6.html) can be used to control OpenMP threads, such as [`OMP_NUM_THREADS`](https://www.openmp.org/spec-html/5.0/openmpse50.html#x289-20540006.2); a combined example is shown after this list.
- If CUDA is used, [CUDA environment variables](https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#cuda-environment-variables) can be used to control CUDA devices, such as `CUDA_VISIBLE_DEVICES`.
- If ROCm is used, [ROCm environment variables](https://rocm.docs.amd.com/en/latest/conceptual/gpu-isolation.html#environment-variables) can be used to control ROCm devices.
- {{ tensorflow_icon }} If TensorFlow is used, TensorFlow environment variables can be used.
- {{ pytorch_icon }} If PyTorch is used, [PyTorch environment variables](https://pytorch.org/docs/stable/torch_environment_variables.html) can be used.
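
For example, one might combine these with the DeePMD-kit variables above (a hedged sketch; the thread count and GPU index should be adapted to your hardware):

```bash
# OpenMP threads for OpenMP-enabled OPs
export OMP_NUM_THREADS=4
# Expose only the first GPU to CUDA
export CUDA_VISIBLE_DEVICES=0
dp train input.json
```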

## Python interface only

:::{envvar} DP_INTERFACE_PREC

**Choices**: `high`, `low`; **Default**: `high`

Control high (double) or low (float) precision of training.
:::

:::{envvar} DP_AUTO_PARALLELIZATION

**Choices**: `0`, `1`; **Default**: `0`

{{ tensorflow_icon }} Enable auto parallelization for CPU operators.
:::

:::{envvar} DP_JIT

**Choices**: `0`, `1`; **Default**: `0`

{{ tensorflow_icon }} Enable JIT. Note that this option may either improve or degrade performance. Requires TensorFlow to support JIT.
:::
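
As an illustration, the following sketch trains with low (float) precision, auto parallelization, and JIT enabled under the TensorFlow backend (whether JIT helps is workload-dependent; `input.json` is illustrative):

```bash
export DP_INTERFACE_PREC=low     # float instead of double precision
export DP_AUTO_PARALLELIZATION=1 # auto parallelization for CPU operators
export DP_JIT=1                  # JIT; may improve or degrade performance
dp train input.json
```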

:::{envvar} DP_INFER_BATCH_SIZE

**Default**: `1024` on CPUs; on GPUs, as large as possible until out-of-memory

Inference batch size, counted as the number of frames multiplied by the number of atoms per frame.
:::
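
For example, for frames of 192 atoms each, the default CPU batch size of `1024` corresponds to `1024 / 192 ≈ 5` frames per batch. To bound memory usage one may set an explicit value (the number below is purely illustrative, as are the model and data paths):

```bash
export DP_INFER_BATCH_SIZE=8192  # at most 8192 / 192 ≈ 42 frames per batch
dp test -m frozen_model.pb -s ./data
```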

:::{envvar} DP_BACKEND

**Default**: `tensorflow`

Default backend.
:::

:::{envvar} NUM_WORKERS

**Default**: `8` or the number of CPU cores, whichever is smaller

{{ pytorch_icon }} Number of subprocesses to use for data loading in the PyTorch backend.
See [PyTorch documentation](https://pytorch.org/docs/stable/data.html) for details.

:::
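
For instance, to select the PyTorch backend and load data in the main process instead of subprocesses (sometimes convenient for debugging; a sketch, not a recommendation):

```bash
export DP_BACKEND=pytorch
export NUM_WORKERS=0  # disable data-loading subprocesses
dp train input.json
```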
1 change: 1 addition & 0 deletions doc/index.rst
@@ -45,6 +45,7 @@ DeePMD-kit is a package written in Python/C++, designed to minimize the effort r
cli
third-party/index
nvnmd/index
env
troubleshooting/index


4 changes: 4 additions & 0 deletions doc/inference/cxx.md
@@ -1,5 +1,9 @@
# C/C++ interface

:::{note}
See [Environment variables](../env.md) for the runtime environment variables.
:::

## C++ interface

The C++ interface of DeePMD-kit is also available for model inference and is considered faster than the Python interface. An example `infer_water.cpp` is given below:
4 changes: 4 additions & 0 deletions doc/inference/nodejs.md
@@ -1,5 +1,9 @@
# Node.js interface

:::{note}
See [Environment variables](../env.md) for the runtime environment variables.
:::

If [Node.js interface is installed](../install/install-nodejs.md), one can use the Node.js interface for model inference, which is a wrapper of [the header-only C++ API](./cxx.md).

A simple example is shown below.
4 changes: 4 additions & 0 deletions doc/inference/python.md
@@ -1,5 +1,9 @@
# Python interface

:::{note}
See [Environment variables](../env.md) for the runtime environment variables.
:::

One may use the Python interface of DeePMD-kit for model inference; an example is given as follows:

```python
85 changes: 73 additions & 12 deletions doc/install/install-from-source.md
@@ -136,18 +136,79 @@ pip install .

One may set the following environment variables before executing `pip`:

| Environment variables | Allowed value | Default value | Usage |
| --------------------------------------------------- | --------------------- | ---------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| DP_VARIANT | `cpu`, `cuda`, `rocm` | `cpu` | Build CPU variant or GPU variant with CUDA or ROCM support. |
| CUDAToolkit_ROOT | Path | Detected automatically | The path to the CUDA toolkit directory. CUDA 9.0 or later is supported. NVCC is required. |
| ROCM_ROOT | Path | Detected automatically | The path to the ROCM toolkit directory. |
| DP_ENABLE_TENSORFLOW | 0, 1 | 1 | {{ tensorflow_icon }} Enable the TensorFlow backend. |
| DP_ENABLE_PYTORCH | 0, 1 | 0 | {{ pytorch_icon }} Enable customized C++ OPs for the PyTorch backend. PyTorch can still run without customized C++ OPs, but features will be limited. |
| TENSORFLOW_ROOT | Path | Detected automatically | {{ tensorflow_icon }} The path to TensorFlow Python library. By default the installer only finds TensorFlow under user site-package directory (`site.getusersitepackages()`) or system site-package directory (`sysconfig.get_path("purelib")`) due to limitation of [PEP-517](https://peps.python.org/pep-0517/). If not found, the latest TensorFlow (or the environment variable `TENSORFLOW_VERSION` if given) from PyPI will be built against. |
| PYTORCH_ROOT | Path | Detected automatically | {{ pytorch_icon }} The path to PyTorch Python library. By default, the installer only finds PyTorch under the user site-package directory (`site.getusersitepackages()`) or the system site-package directory (`sysconfig.get_path("purelib")`) due to the limitation of [PEP-517](https://peps.python.org/pep-0517/). If not found, the latest PyTorch (or the environment variable `PYTORCH_VERSION` if given) from PyPI will be built against. |
| DP_ENABLE_NATIVE_OPTIMIZATION | 0, 1 | 0 | Enable compilation optimization for the native machine's CPU type. Do not enable it if generated code will run on different CPUs. |
| CMAKE_ARGS | str | - | Additional CMake arguments |
| &lt;LANG&gt;FLAGS (`<LANG>`=`CXX`, `CUDA` or `HIP`) | str | - | Default compilation flags to be used when compiling `<LANG>` files. See [CMake documentation](https://cmake.org/cmake/help/latest/variable/CMAKE_LANG_FLAGS.html). |
:::{envvar} DP_VARIANT

**Choices**: `cpu`, `cuda`, `rocm`; **Default**: `cpu`

Build the CPU variant, or a GPU variant with CUDA or ROCm support.
:::

:::{envvar} CUDAToolkit_ROOT

**Type**: Path; **Default**: Detected automatically

The path to the CUDA toolkit directory. CUDA 9.0 or later is supported. NVCC is required.
:::

:::{envvar} ROCM_ROOT

**Type**: Path; **Default**: Detected automatically

The path to the ROCm toolkit directory.
:::

:::{envvar} DP_ENABLE_TENSORFLOW

**Choices**: `0`, `1`; **Default**: `1`

{{ tensorflow_icon }} Enable the TensorFlow backend.
:::

:::{envvar} DP_ENABLE_PYTORCH

**Choices**: `0`, `1`; **Default**: `0`

{{ pytorch_icon }} Enable customized C++ OPs for the PyTorch backend. PyTorch can still run without customized C++ OPs, but features will be limited.
:::

:::{envvar} TENSORFLOW_ROOT

**Type**: Path; **Default**: Detected automatically

{{ tensorflow_icon }} The path to the TensorFlow Python library. If not given, the installer by default only finds TensorFlow under the user site-packages directory (`site.getusersitepackages()`) or the system site-packages directory (`sysconfig.get_path("purelib")`) due to the limitation of [PEP-517](https://peps.python.org/pep-0517/). If not found, the package will be built against the latest TensorFlow from PyPI (or the version given by the environment variable `TENSORFLOW_VERSION`).
:::

:::{envvar} PYTORCH_ROOT

**Type**: Path; **Default**: Detected automatically

{{ pytorch_icon }} The path to the PyTorch Python library. If not given, the installer by default only finds PyTorch under the user site-packages directory (`site.getusersitepackages()`) or the system site-packages directory (`sysconfig.get_path("purelib")`) due to the limitation of [PEP-517](https://peps.python.org/pep-0517/). If not found, the package will be built against the latest PyTorch from PyPI (or the version given by the environment variable `PYTORCH_VERSION`).
:::

:::{envvar} DP_ENABLE_NATIVE_OPTIMIZATION

**Choices**: `0`, `1`; **Default**: `0`

Enable compilation optimization for the native machine's CPU type. Do not enable it if generated code will run on different CPUs.
:::

:::{envvar} CMAKE_ARGS

**Type**: string

Additional CMake arguments.
:::

:::{envvar} <LANG>FLAGS

`<LANG>` is one of `CXX`, `CUDA`, or `HIP`.

**Type**: string

Default compilation flags to be used when compiling `<LANG>` files. See [CMake documentation](https://cmake.org/cmake/help/latest/variable/CMAKE_LANG_FLAGS.html) for details.
:::

Other [CMake environment variables](https://cmake.org/cmake/help/latest/manual/cmake-env-variables.7.html) may also be critical.
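
For example, a CUDA build from source might look as follows (a sketch only; the CUDA path and CMake arguments are illustrative and depend on your system):

```bash
export DP_VARIANT=cuda
export CUDAToolkit_ROOT=/usr/local/cuda         # adjust to your installation
export DP_ENABLE_TENSORFLOW=1
export DP_ENABLE_PYTORCH=1
export CMAKE_ARGS="-DCMAKE_BUILD_TYPE=Release"  # any extra CMake arguments
pip install .
```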

To test the installation, one should first jump out of the source directory

4 changes: 4 additions & 0 deletions doc/third-party/ase.md
@@ -1,5 +1,9 @@
# Use deep potential with ASE

:::{note}
See [Environment variables](../env.md) for the runtime environment variables.
:::

Deep potential can be set up as a calculator with ASE to obtain potential energies and forces.

```python
4 changes: 4 additions & 0 deletions doc/third-party/dpdata.md
@@ -1,5 +1,9 @@
# Use deep potential with dpdata

:::{note}
See [Environment variables](../env.md) for the runtime environment variables.
:::

DeePMD-kit provides a driver for [dpdata](https://github.com/deepmodeling/dpdata) >=0.2.7 via the plugin mechanism, making it possible to call the `predict` method of the `System` class:

```py
4 changes: 4 additions & 0 deletions doc/third-party/gromacs.md
@@ -1,5 +1,9 @@
# Running MD with GROMACS

:::{note}
See [Environment variables](../env.md) for the runtime environment variables.
:::

## DP/MM Simulation

This part gives a simple tutorial on how to run a DP/MM simulation for methane in water, which means using DP for methane and TIP3P for water. All relevant files can be found in `examples/methane`.
4 changes: 4 additions & 0 deletions doc/third-party/ipi.md
@@ -1,5 +1,9 @@
# Run path-integral MD with i-PI

:::{note}
See [Environment variables](../env.md) for the runtime environment variables.
:::

i-PI works in a client-server model: i-PI provides the server for integrating the replica positions of atoms, while DeePMD-kit provides a client named `dp_ipi` that computes the interactions (including energy, forces, and virials). The server and client communicate via a Unix domain socket or an Internet socket. Installation instructions for i-PI can be found [here](../install/install-ipi.md). The client can be started by

```bash
4 changes: 4 additions & 0 deletions doc/third-party/lammps-command.md
@@ -1,5 +1,9 @@
# Run MD with LAMMPS

:::{note}
See [Environment variables](../env.md) for the runtime environment variables.
:::

## units

All units in LAMMPS except `lj` are supported.
9 changes: 1 addition & 8 deletions doc/train/training-advanced.md
@@ -161,14 +161,7 @@ optional arguments:
**`--skip-neighbor-stat`** will skip calculating neighbor statistics if one is concerned about performance. Some features will be disabled.

To maximize the performance, one should follow [FAQ: How to control the parallelism of a job](../troubleshooting/howtoset_num_nodes.md) to control the number of threads.

One can set other environmental variables:

| Environment variables | Allowed value | Default value | Usage |
| ----------------------- | ------------- | ------------- | ------------------------------------------------------------------------------------------------------------------- |
| DP_INTERFACE_PREC | `high`, `low` | `high` | Control high (double) or low (float) precision of training. |
| DP_AUTO_PARALLELIZATION | 0, 1 | 0 | Enable auto parallelization for CPU operators. |
| DP_JIT | 0, 1 | 0 | Enable JIT. Note that this option may either improve or decrease the performance. Requires TensorFlow supports JIT. |
See [Runtime environment variables](../env.md) for all runtime environment variables.

## Adjust `sel` of a frozen model {{ tensorflow_icon }}

12 changes: 6 additions & 6 deletions doc/troubleshooting/howtoset_num_nodes.md
@@ -30,12 +30,12 @@ For CPU devices, TensorFlow and PyTorch use multiple streams to run independent
export DP_INTER_OP_PARALLELISM_THREADS=3
```

However, for GPU devices, TensorFlow uses only one compute stream and multiple copy streams.
Note that some of DeePMD-kit OPs do not have GPU support, so it is still encouraged to set environmental variables even if one has a GPU.
However, for GPU devices, TensorFlow and PyTorch use only one compute stream and multiple copy streams.
Note that some of DeePMD-kit OPs do not have GPU support, so it is still encouraged to set environment variables even if one has a GPU.

## Parallelism within an individual operators
## Parallelism within individual operators

For CPU devices, `DP_INTRA_OP_PARALLELISM_THREADS` controls parallelism within TensorFlow (when TensorFlow is built against Eigen) and PyTorch native OPs.
For CPU devices, {envvar}`DP_INTRA_OP_PARALLELISM_THREADS` controls parallelism within TensorFlow (when TensorFlow is built against Eigen) and PyTorch native OPs.

```bash
export DP_INTRA_OP_PARALLELISM_THREADS=2
@@ -49,7 +49,7 @@ It may also control parallelism for NumPy when NumPy is built against OpenMP, so
export OMP_NUM_THREADS=2
```

There are several other environmental variables for OpenMP, such as `KMP_BLOCKTIME`.
There are several other environment variables for OpenMP, such as `KMP_BLOCKTIME`.
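
For example, with Intel OpenMP one might set (an illustrative value; consult the OpenMP runtime documentation for its exact semantics):

```bash
export KMP_BLOCKTIME=0  # worker threads sleep immediately after a parallel region
```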

::::{tab-set}

@@ -70,7 +70,7 @@ See [PyTorch documentation](https://pytorch.org/tutorials/recipes/recipes/tuning
There is no one general parallel configuration that works for all situations, so you are encouraged to tune the configuration yourself after empirical testing.

Here are some empirical examples.
If you wish to use 3 cores of 2 CPUs on one node, you may set the environmental variables and run DeePMD-kit as follows:
If you wish to use 3 cores of 2 CPUs on one node, you may set the environment variables and run DeePMD-kit as follows:

::::{tab-set}

2 changes: 1 addition & 1 deletion source/api_cc/include/common.h
@@ -142,7 +142,7 @@ void select_map_inv(typename std::vector<VT>::iterator out,

/**
* @brief Get the number of threads from the environment variable.
* @details A warning will be thrown if environmental variables are not set.
* @details A warning will be thrown if environment variables are not set.
* @param[out] num_intra_nthreads The number of intra threads. Read from
*DP_INTRA_OP_PARALLELISM_THREADS.
* @param[out] num_inter_nthreads The number of inter threads. Read from