Bug: Error when loading a model with the se_a_mask descriptor #3928

wanghan-iapcm · 2024-06-29T02:30:15Z

Discussed in #3924

Originally posted by lukasbaldauf June 28, 2024
Hi All,
I want to evaluate a trained model that uses the se_a_mask descriptor, but loading it fails with an error. Training goes smoothly and I get good accuracy for my system; the only problem is loading the model. I get the same error when evaluating the trained zinc_protein example system (see the error message below). It seems to be related to "dfparam" and "daparam", whose tensors are missing from the graph.

I get the same error with deepmd versions 2.2.7 and 2.2.10, and with TensorFlow versions 2.9.0 and 2.15.0.

For the zinc example, I train and freeze the model as follows:

dp train zinc_se_a_mask.json --skip-neighbor-stat
dp freeze -o graph.mask.pb

The problem occurs when I try to load the model:

from deepmd.infer import DeepPot
model = DeepPot("graph.mask.pb")

Traceback (most recent call last):
File "", line 1, in
File "/home/lukasb/miniforge3/envs/deepmd_gpu_2.2.7/lib/python3.10/site-packages/deepmd/infer/deep_pot.py", line 156, in __init__
self._get_tensor(tensor_name, attr_name)
File "/home/lukasb/miniforge3/envs/deepmd_gpu_2.2.7/lib/python3.10/site-packages/deepmd/infer/deep_eval.py", line 165, in _get_tensor
tensor = self.graph.get_tensor_by_name(tensor_path)
File "/home/lukasb/miniforge3/envs/deepmd_gpu_2.2.7/lib/python3.10/site-packages/tensorflow/python/framework/ops.py", line 4128, in get_tensor_by_name
return self.as_graph_element(name, allow_tensor=True, allow_operation=False)
File "/home/lukasb/miniforge3/envs/deepmd_gpu_2.2.7/lib/python3.10/site-packages/tensorflow/python/framework/ops.py", line 3952, in as_graph_element
return self._as_graph_element_locked(obj, allow_tensor, allow_operation)
File "/home/lukasb/miniforge3/envs/deepmd_gpu_2.2.7/lib/python3.10/site-packages/tensorflow/python/framework/ops.py", line 3992, in _as_graph_element_locked
raise KeyError("The name %s refers to a Tensor which does not "
KeyError: "The name 'load/fitting_attr/dfparam:0' refers to a Tensor which does not exist. The operation, 'load/fitting_attr/dfparam', does not exist in the graph."
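The KeyError says that no operation named `fitting_attr/dfparam` exists in the loaded graph, and the fix notes later in this thread mention `fitting_attr` becoming `fitting_attr_1`. One way to check a frozen graph for this kind of renaming is to list its node names and group them by the exact spelling of the scope. The following is a minimal sketch, not part of deepmd-kit: the helper names `group_by_scope_variant` and `load_node_names` are hypothetical, and the sample node names are illustrative rather than taken from a real model.

```python
import re
from collections import defaultdict


def group_by_scope_variant(node_names, scope="fitting_attr"):
    """Group node names by the exact spelling of `scope`,
    including any numeric suffix such as `fitting_attr_1`."""
    pattern = re.compile(r"(?:^|/)(" + re.escape(scope) + r"(?:_\d+)?)/")
    groups = defaultdict(list)
    for name in node_names:
        match = pattern.search(name)
        if match:
            groups[match.group(1)].append(name)
    return dict(groups)


def load_node_names(pb_path):
    """Read node names from a frozen GraphDef file (hypothetical helper).

    TensorFlow is imported lazily so the grouping function above
    can be used without it.
    """
    import tensorflow as tf

    graph_def = tf.compat.v1.GraphDef()
    with open(pb_path, "rb") as fh:
        graph_def.ParseFromString(fh.read())
    return [node.name for node in graph_def.node]


# Illustrative node names only; a real list would come from
# load_node_names("graph.mask.pb").
sample_names = [
    "fitting_attr_1/dfparam",
    "fitting_attr_1/daparam",
    "fitting_attr/some_other_node",
    "descrpt_attr/rcut",
]
for variant, names in group_by_scope_variant(sample_names).items():
    print(variant, names)
```

If the nodes that the loader expects appear only under a suffixed scope such as `fitting_attr_1`, that would be consistent with the KeyError above.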


njzjz · 2024-06-29T04:58:37Z

Reproduced in v2.2.10, but I could not reproduce it on the devel branch.


…_attr_1` (#3930)

Fix #3928. Prevent `fitting_attr` from becoming `fitting_attr_1`.

## Summary by CodeRabbit

- **Refactor**
  - Improved TensorFlow variable scope management by switching to `tf.AUTO_REUSE` to streamline code and reduce the likelihood of variable reuse conflicts.

Signed-off-by: Jinzhe Zeng <jinzhe.zeng@rutgers.edu>

…_attr_1` (deepmodeling#3930) Fix deepmodeling#3928. Prevent `fitting_attr` from becoming `fitting_attr_1`. (cherry picked from commit e809e64)


wanghan-iapcm added the `reproduced` label (this bug has been reproduced by developers) Jun 29, 2024

njzjz added the `failed to reproduce`, `bug`, and `reproduced` labels and removed the `reproduced` and `failed to reproduce` labels Jun 29, 2024

njzjz added a commit to njzjz/deepmd-kit that referenced this issue Jun 29, 2024

fix(pt): change fitting_attr variable scope reuse to AUTO_REUSE

39bd6be

Fix deepmodeling#3928. Prevent `fitting_attr` from becoming `fitting_attr_1`. Signed-off-by: Jinzhe Zeng <jinzhe.zeng@rutgers.edu>

njzjz mentioned this issue Jun 29, 2024

fix(tf): prevent fitting_attr variable scope from becoming fitting_attr_1 #3930

Merged

njzjz linked a pull request Jun 29, 2024 that will close this issue

fix(tf): prevent fitting_attr variable scope from becoming fitting_attr_1 #3930

Merged

njzjz self-assigned this Jun 29, 2024

njzjz closed this as completed Jul 2, 2024
