Feat: add model format for dpa1 #3211

Closed
iProzd wants to merge 7 commits into devel from rf_dpa1
Conversation

@iProzd (Collaborator) commented Feb 1, 2024

This PR adds the model format for the DPA1 model:

  • Add a torch reformat implementation for the DPA1 model
  • Add a numpy implementation for the DPA1 model without the attention layer
  • Align the torch and numpy implementations (a consistency-check sketch follows the TODO list below)

TODO:

  • Add a numpy implementation for the DPA1 model with the attention layer
  • Align the TF and numpy implementations
  • Align the smoothness implementations
  • Make filter_layers._networks in torch accessible from outside
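As a rough illustration of the torch/numpy alignment mentioned above, the check below runs both descriptors on the same extended input and compares the first output. It is a minimal sketch, assuming the numpy side exposes a call method and the torch side is a regular nn.Module; the names and return conventions are illustrative, not the final API.

import numpy as np
import torch

def check_pt_np_consistency(pt_desc, np_desc, coord_ext, atype_ext, nlist,
                            rtol=1e-10, atol=1e-10):
    """Run the torch and numpy descriptors on the same input and compare."""
    # numpy reference result (assumed `call` signature)
    out_np = np_desc.call(coord_ext, atype_ext, nlist)[0]
    # torch result, moved back to numpy for the comparison
    out_pt = pt_desc(
        torch.from_numpy(coord_ext),
        torch.from_numpy(atype_ext),
        torch.from_numpy(nlist),
    )[0]
    np.testing.assert_allclose(out_np, out_pt.detach().cpu().numpy(),
                               rtol=rtol, atol=atol)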

deepmd/pt/model/descriptor/dpa1.py (fixed)
atype_embd = atype_embd_ext[:, :nloc, :]
# nf x nloc x nnei x tebd_dim
atype_embd_nnei = np.tile(atype_embd[:, :, np.newaxis, :], (1, 1, nnei, 1))
nlist_mask = nlist != -1

Check notice (Code scanning / CodeQL): Unused local variable
Variable nlist_mask is not used.
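For context, a minimal sketch (not from the PR) of how such a neighbor-list mask is typically consumed: padded entries (-1) are turned into safe gather indices and their contributions zeroed out. Shapes follow the snippet above; the data is random placeholder input.

import numpy as np

nf, nloc, nnei, tebd_dim = 2, 5, 10, 8
rng = np.random.default_rng(0)
nlist = rng.integers(-1, nloc, size=(nf, nloc, nnei))
atype_embd_nnei = rng.random((nf, nloc, nnei, tebd_dim))

nlist_mask = nlist != -1                     # True where a real neighbor exists
nlist_safe = np.where(nlist_mask, nlist, 0)  # indices that are safe to gather with
# zero the type-embedding contribution of padded neighbors
atype_embd_nnei = atype_embd_nnei * nlist_mask[..., np.newaxis]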
deepmd/model_format/dpa1.py (fixed)
source/tests/pt/test_dpa1.py (fixed)
):
dtype = PRECISION_DICT[prec]
rtol, atol = get_tols(prec)
err_msg = f"idt={idt} prec={prec}"

Check notice (Code scanning / CodeQL): Unused local variable (test)
Variable err_msg is not used.
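The flagged err_msg is presumably meant to be passed to the comparison so failures report which parameter combination broke. A sketch with placeholder tensors standing in for the real torch and numpy outputs:

import numpy as np
import torch

rd_pt = torch.zeros(3, 4, dtype=torch.float64)   # stand-in torch output
rd_np = np.zeros((3, 4), dtype=np.float64)       # stand-in numpy output
rtol, atol = 1e-12, 1e-12
err_msg = "idt=False prec=float64"

np.testing.assert_allclose(
    rd_pt.detach().cpu().numpy(),
    rd_np,
    rtol=rtol,
    atol=atol,
    err_msg=err_msg,  # included in the failure report
)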
dd0.se_atten.mean = torch.tensor(davg, dtype=dtype, device=env.DEVICE)
dd0.se_atten.dstd = torch.tensor(dstd, dtype=dtype, device=env.DEVICE)
# dd1 = DescrptDPA1.deserialize(dd0.serialize())
model = torch.jit.script(dd0)

Check notice (Code scanning / CodeQL): Unused local variable (test)
Variable model is not used.
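Here torch.jit.script(dd0) is presumably called only to verify that the module scripts cleanly. A sketch of exercising the result so the variable is actually used; the Toy module stands in for the descriptor and the file path is illustrative.

import torch

class Toy(torch.nn.Module):
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * 2.0

dd0 = Toy()                                # stand-in for the DPA1 descriptor
model = torch.jit.script(dd0)              # raises if dd0 is not scriptable
torch.jit.save(model, "dpa1_scripted.pt")  # round-trip through TorchScript
reloaded = torch.jit.load("dpa1_scripted.pt")
assert torch.allclose(reloaded(torch.ones(2)), torch.full((2,), 2.0))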
resnet=False,
precision=precision,
)
self.w = self.w.squeeze(0) # keep the weight shape to be [num_in]

Check warning (Code scanning / CodeQL): Overwriting attribute in super-class or sub-class
Assignment overwrites attribute w, which was previously defined in superclass NativeLayer.
)
self.w = self.w.squeeze(0) # keep the weight shape to be [num_in]
if self.uni_init:
self.w = 1.0

Check warning (Code scanning / CodeQL): Overwriting attribute in super-class or sub-class
Assignment overwrites attribute w, which was previously defined in superclass NativeLayer.
self.w = self.w.squeeze(0) # keep the weight shape to be [num_in]
if self.uni_init:
self.w = 1.0
self.b = 0.0

Check warning (Code scanning / CodeQL): Overwriting attribute in super-class or sub-class
Assignment overwrites attribute b, which was previously defined in superclass NativeLayer.
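The three warnings point at the same pattern: the subclass reassigns self.w and self.b with a different shape, and with uni_init even with a plain float. A minimal sketch of the pattern and a shape- and dtype-preserving alternative; NativeLayerBase is a stand-in for illustration, not the real NativeLayer.

import numpy as np

class NativeLayerBase:
    """Stand-in base: defines w as (num_in, num_out) and b as (num_out,)."""
    def __init__(self, num_in: int, num_out: int):
        self.w = np.zeros((num_in, num_out))
        self.b = np.zeros((num_out,))

class LayerNormLike(NativeLayerBase):
    def __init__(self, num_in: int, uni_init: bool = True):
        super().__init__(1, num_in)
        self.w = self.w.squeeze(0)  # keep the weight shape to be [num_in]
        if uni_init:
            # assign arrays rather than Python floats so the dtype and shape
            # stay consistent with what the base class (and its serializer) expect
            self.w = np.ones_like(self.w)
            self.b = np.zeros_like(self.b)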
codecov bot commented Feb 1, 2024

Codecov Report

Attention: 529 lines in your changes are missing coverage. Please review.

Comparison is base (afb440a) 74.39% compared to head (a96cab0) 20.72%.
Report is 2 commits behind head on devel.

Files Patch % Lines
deepmd/pt/model/descriptor/se_atten.py 0.00% 200 Missing ⚠️
deepmd/model_format/dpa1.py 0.00% 117 Missing ⚠️
deepmd/model_format/network.py 0.00% 109 Missing ⚠️
deepmd/pt/model/network/mlp.py 0.00% 64 Missing ⚠️
deepmd/pt/model/descriptor/dpa1.py 0.00% 36 Missing ⚠️
deepmd/model_format/__init__.py 0.00% 1 Missing ⚠️
deepmd/pt/model/descriptor/se_a.py 0.00% 1 Missing ⚠️
deepmd/pt/model/task/ener.py 0.00% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##            devel    #3211       +/-   ##
===========================================
- Coverage   74.39%   20.72%   -53.68%     
===========================================
  Files         345      346        +1     
  Lines       31981    32509      +528     
  Branches     1592     1594        +2     
===========================================
- Hits        23791     6736    -17055     
- Misses       7265    25075    +17810     
+ Partials      925      698      -227     


embeddings = data.pop("embeddings")
type_embedding = data.pop("type_embedding")
attention_layers = data.pop("attention_layers")
env_mat = data.pop("env_mat")

Check notice (Code scanning / CodeQL): Unused local variable
Variable env_mat is not used.
@wanghan-iapcm (Collaborator) left a comment

The serialize and de-serialize of the model_format/dpa1 should be tested.
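A sketch of the kind of round-trip test requested here, assuming the model_format descriptor exposes serialize / deserialize and a numpy call method; the names follow the diff context and may differ from the final API.

import numpy as np

def check_serialize_roundtrip(dd0, coord_ext, atype_ext, nlist):
    """Serialize, deserialize, and check that both objects give identical outputs."""
    dd1 = type(dd0).deserialize(dd0.serialize())
    out0 = dd0.call(coord_ext, atype_ext, nlist)[0]
    out1 = dd1.call(coord_ext, atype_ext, nlist)[0]
    np.testing.assert_allclose(out0, out1, rtol=1e-12, atol=1e-12)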

variables = data.pop("@variables")
embeddings = data.pop("embeddings")
type_embedding = data.pop("type_embedding")
attention_layers = data.pop("attention_layers", None)

Check notice (Code scanning / CodeQL): Unused local variable
Variable attention_layers is not used.
A Member left a comment: Why is it popped and not used?
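One plausible reason for popping a key without using it is to strip it from data before forwarding the remainder as plain keyword arguments; if the popped value is then ignored, CodeQL flags it. A sketch of that pattern (illustrative, not the actual DPA1 deserialize) and how to make the intent explicit:

import copy

def deserialize_sketch(data: dict):
    data = copy.deepcopy(data)
    variables = data.pop("@variables")
    embeddings = data.pop("embeddings")
    # popped only to remove it from `data`; assigning to "_" (or using del)
    # makes the intent clear and silences the unused-variable notice
    _ = data.pop("attention_layers", None)
    return {"kwargs": data, "@variables": variables, "embeddings": embeddings}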

dd0_state_dict = dd0.se_atten.state_dict()
dd4_state_dict = dd4.se_atten.state_dict()

dd0_state_dict_attn = dd0.se_atten.dpa1_attention.state_dict()

Check notice (Code scanning / CodeQL): Unused local variable (test)
Variable dd0_state_dict_attn is not used.
dd4_state_dict = dd4.se_atten.state_dict()

dd0_state_dict_attn = dd0.se_atten.dpa1_attention.state_dict()
dd4_state_dict_attn = dd4.se_atten.dpa1_attention.state_dict()

Check notice (Code scanning / CodeQL): Unused local variable (test)
Variable dd4_state_dict_attn is not used.
data = copy.deepcopy(data)
variables = data.pop("@variables")
embeddings = data.pop("embeddings")
type_embedding = data.pop("type_embedding")

Check failure (Code scanning / CodeQL): Modification of parameter with default
This expression mutates a default value.
variables = data.pop("@variables")
embeddings = data.pop("embeddings")
type_embedding = data.pop("type_embedding")
attention_layers = data.pop("attention_layers", None)

Check failure (Code scanning / CodeQL): Modification of parameter with default
This expression mutates a default value.
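This failure is the classic mutable-default pitfall: data.pop mutates a default dict that is shared across calls. A sketch of the defect and the usual fix, copying the argument first (as other deserialize methods in the diff already do) or avoiding the mutable default entirely:

import copy

def deserialize_bad(data: dict = {}):    # the default dict is shared across calls
    return data.pop("@variables", None)  # mutates that shared default in place

def deserialize_good(data: dict):
    data = copy.deepcopy(data)           # work on a private copy
    return data.pop("@variables", None)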
@njzjz njzjz added the Test CUDA Trigger test CUDA workflow label Feb 2, 2024
@github-actions github-actions bot removed the Test CUDA Trigger test CUDA workflow label Feb 2, 2024
Then the scaled dot-product attention method is adopted:

.. math::
    A(\mathcal{Q}^{i,l}, \mathcal{K}^{i,l}, \mathcal{V}^{i,l}, \mathcal{R}^{i,l})=\varphi\left(\mathcal{Q}^{i,l}, \mathcal{K}^{i,l},\mathcal{R}^{i,l}\right)\mathcal{V}^{i,l},
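For reference, a minimal numpy sketch of the plain scaled dot-product part, softmax(Q K^T / sqrt(d)) V, applied per local atom; the DPA-1 \varphi additionally incorporates the relative-coordinate term \mathcal{R}^{i,l}, which is left out of this sketch.

import numpy as np

def scaled_dot_product_attention(q, k, v):
    """softmax(q k^T / sqrt(d)) v over the neighbor axis.

    q, k, v: (nnei, d) arrays for one local atom and one attention layer.
    The gating by the relative coordinates R used in DPA-1 is omitted here.
    """
    d = q.shape[-1]
    logits = q @ k.T / np.sqrt(d)                   # (nnei, nnei)
    logits -= logits.max(axis=-1, keepdims=True)    # numerical stability
    weights = np.exp(logits)
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ v                              # (nnei, d)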

Comment on lines +330 to +331
w : np.ndarray, optional
The embedding weights of the layer.
A Member left a comment: This mismatches the actual parameters.

Comment on lines +444 to +447
w : np.ndarray, optional
The learnable weights of the normalization scale in the layer.
b : np.ndarray, optional
The learnable biases of the normalization shift in the layer.
A Member left a comment: This mismatches the actual parameters.

@njzjz njzjz linked an issue Mar 19, 2024 that may be closed by this pull request
@iProzd (Collaborator, Author) commented Apr 21, 2024

This PR has been merged into #3696.

@iProzd iProzd closed this Apr 21, 2024
@iProzd iProzd deleted the rf_dpa1 branch April 24, 2024 09:12
Development

Successfully merging this pull request may close these issues.

[Feature Request] pt: refactor DPA-1 in the PyTorch backend
3 participants