[BUG] _Error with Pytorch models #3738

Closed
thangckt opened this issue May 4, 2024 · 0 comments · Fixed by #3740
thangckt commented May 4, 2024

Bug summary

I tried to run a simulation using a model trained with the PyTorch backend and got this error:

....../deepmd_kit_dev/source/lmp/pair_deepmd.cpp:463: virtual void LAMMPS_NS::PairDeepMD::compute(int, int): Assertion `sizeof(MPI_Comm) == sizeof(int)' failed.
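
For context, a minimal sketch of why a check like this can fail. It assumes the build links against an MPI implementation whose `MPI_Comm` is a pointer type: Open MPI defines the handle as a pointer to a struct, while MPICH defines it as an `int`, and the signal output in the log below appears to be in Open MPI's format. The types `IntCommHandle` and `PtrCommHandle` and both helper functions are hypothetical stand-ins so the example compiles without MPI headers; they are not DeePMD-kit or MPI code.

```cpp
// Sketch only: stand-in types, not the real MPI headers or DeePMD-kit source.
#include <cassert>

// Hypothetical stand-ins for the two common MPI_Comm representations.
using IntCommHandle = int;          // integer handle (MPICH-style)
struct OpaqueComm;                  // intentionally left undefined; used only via pointer
using PtrCommHandle = OpaqueComm*;  // pointer handle (Open MPI-style)

// The condition from the failing assertion, written for any handle type.
template <typename Comm>
constexpr bool fits_in_int() {
  return sizeof(Comm) == sizeof(int);
}

// Same shape as the runtime check in the error message: the process
// aborts (SIGABRT, signal 6) when the handle is not int-sized.
template <typename Comm>
void require_int_sized_comm() {
  assert(fits_in_int<Comm>());
}

// An integer handle always satisfies the check.
static_assert(fits_in_int<IntCommHandle>(), "int-sized handle expected");
```

On a typical 64-bit Linux system a pointer is 8 bytes and an `int` is 4, so `require_int_sized_comm<PtrCommHandle>()` would abort there, which matches the `Aborted (6)` signal reported by every rank.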

DeePMD-kit Version

v3.0.0a0-134-gebd809b5

Backend and its version

PyTorch 2.3.0 and 2.2.0

How did you download the software?

Built from source

Input Files, Running Commands, Error Log, etc.

DeePMD-kit WARNING: Environmental variable DP_INTER_OP_PARALLELISM_THREADS is not set. Tune DP_INTER_OP_PARALLELISM_THREADS for the best performance. See https://deepmd.rtfd.io/parallelism/ for more information.
DeePMD-kit WARNING: Environmental variable OMP_NUM_THREADS is not set. Tune OMP_NUM_THREADS for the best performance. See https://deepmd.rtfd.io/parallelism/ for more information.
lmp_mpi: /home1/p001cao/0SourceCode/deepmd_kit_dev/source/lmp/pair_deepmd.cpp:463: virtual void LAMMPS_NS::PairDeepMD::compute(int, int): Assertion `sizeof(MPI_Comm) == sizeof(int)' failed.
[com312:10025] *** Process received signal ***
[com312:10025] Signal: Aborted (6)
[com312:10025] Signal code:  (-6)
lmp_mpi: /home1/p001cao/0SourceCode/deepmd_kit_dev/source/lmp/pair_deepmd.cpp:463: virtual void LAMMPS_NS::PairDeepMD::compute(int, int): Assertion `sizeof(MPI_Comm) == sizeof(int)' failed.
[com312:10026] *** Process received signal ***
[com312:10026] Signal: Aborted (6)
[com312:10026] Signal code:  (-6)
lmp_mpi: /home1/p001cao/0SourceCode/deepmd_kit_dev/source/lmp/pair_deepmd.cpp:463: virtual void LAMMPS_NS::PairDeepMD::compute(int, int): Assertion `sizeof(MPI_Comm) == sizeof(int)' failed.
[com312:10027] *** Process received signal ***
[com312:10027] Signal: Aborted (6)
[com312:10027] Signal code:  (-6)
lmp_mpi: /home1/p001cao/0SourceCode/deepmd_kit_dev/source/lmp/pair_deepmd.cpp:463: virtual void LAMMPS_NS::PairDeepMD::compute(int, int): Assertion `sizeof(MPI_Comm) == sizeof(int)' failed.
[com312:10028] *** Process received signal ***
[com312:10028] Signal: Aborted (6)
[com312:10028] Signal code:  (-6)
lmp_mpi: /home1/p001cao/0SourceCode/deepmd_kit_dev/source/lmp/pair_deepmd.cpp:463: virtual void LAMMPS_NS::PairDeepMD::compute(int, int): Assertion `sizeof(MPI_Comm) == sizeof(int)' failed.
lmp_mpi: /home1/p001cao/0SourceCode/deepmd_kit_dev/source/lmp/pair_deepmd.cpp:463: virtual void LAMMPS_NS::PairDeepMD::compute(int, int): Assertion `sizeof(MPI_Comm) == sizeof(int)' failed.
[com312:10024] *** Process received signal ***
[com312:10024] Signal: Aborted (6)
[com312:10024] Signal code:  (-6)
[com312:10029] *** Process received signal ***
[com312:10029] Signal: Aborted (6)
[com312:10029] Signal code:  (-6)
lmp_mpi: /home1/p001cao/0SourceCode/deepmd_kit_dev/source/lmp/pair_deepmd.cpp:463: virtual void LAMMPS_NS::PairDeepMD::compute(int, int): Assertion `sizeof(MPI_Comm) == sizeof(int)' failed.
lmp_mpi: /home1/p001cao/0SourceCode/deepmd_kit_dev/source/lmp/pair_deepmd.cpp:463: virtual void LAMMPS_NS::PairDeepMD::compute(int, int): Assertion `sizeof(MPI_Comm) == sizeof(int)' failed.
[com312:10030] *** Process received signal ***
[com312:10023] *** Process received signal ***

Steps to Reproduce

Install DeePMD-kit from this branch: https://github.com/deepmodeling/deepmd-kit/tree/devel

Train a model following this tutorial, using PyTorch as the backend.

Use the model in LAMMPS.

Further Information, Files, and Links

No response

@thangckt thangckt added the bug label May 4, 2024
@njzjz njzjz linked a pull request May 5, 2024 that will close this issue
github-merge-queue bot pushed a commit that referenced this issue May 5, 2024
#3738

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Refactor**
- Improved the efficiency of computation by removing unnecessary checks.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
@njzjz njzjz closed this as completed May 5, 2024
mtaillefumier pushed a commit to mtaillefumier/deepmd-kit that referenced this issue Sep 18, 2024
deepmodeling#3738
