add SevenNet to Matbench Discovery leaderboard #112
Conversation
Thanks for the submission! It looks like a great addition, and likely the open-source, open-data SOTA, assuming @janosh replicates your metrics when we look at updating the static figures. I have a few questions about how SevenNet-0 relates to NequIP for the purposes of MBD. Does the proposed algorithm for calculating energies and forces in a spatial-decomposition setup make no difference to the predictions of energies and forces? If so, are there no additional training hyperparameters to be included, and is the only difference this self-interaction embedding?
@YutackPark, could you check the hyperparameters listed before we merge? I took the values from the paper, but it appears the values used for this snapshot might be different: https://github.com/MDIL-SNU/SevenNet/blob/1c97a881bf52f86f985a030028f1044d93a549f4/pretrained_potentials/SevenNet_0__11July2024/pre_train.yaml. We're running some more analysis, so it may be a few more days before we merge, but this looks like a great contribution based on the preliminaries!
Thanks, @CompRhys! It looks like a lot of work has been done while I was sleeping. Here are my answers to your questions:
Yes, that is correct. However, for this benchmark, we did not use our parallel algorithm. Our proposed algorithm is beneficial when the target system has a sufficient number of atoms (typically more than 1000 atoms), which is not the case for this benchmark.
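The claim above, that spatial decomposition leaves energies and forces unchanged, can be illustrated with a toy 1D example. This is only a sketch with a hypothetical pair potential, not SevenNet's actual parallel algorithm: each domain owns its atoms, sees ghost copies of atoms near the boundary, and cross-domain pairs are counted exactly once, so the partitioned sum reproduces the serial sum term by term.

```python
import itertools

def pair_energy(r, cutoff=2.0):
    """Toy short-range pair potential with a hard cutoff (not SevenNet's)."""
    return (1.0 / r**12 - 1.0 / r**6) if r < cutoff else 0.0

def total_energy(xs, cutoff=2.0):
    """Serial reference: sum over all unique atom pairs."""
    return sum(pair_energy(abs(a - b), cutoff)
               for a, b in itertools.combinations(xs, 2))

def decomposed_energy(xs, split, cutoff=2.0):
    """Two spatial domains split at `split`; boundary ("ghost") pairs
    are evaluated once, so the result matches the serial sum exactly."""
    left = [x for x in xs if x < split]
    right = [x for x in xs if x >= split]
    e_left = sum(pair_energy(abs(a - b), cutoff)
                 for a, b in itertools.combinations(left, 2))
    e_right = sum(pair_energy(abs(a - b), cutoff)
                  for a, b in itertools.combinations(right, 2))
    # cross-domain pairs; the cutoff inside pair_energy zeroes distant ones
    e_cross = sum(pair_energy(abs(a - b), cutoff)
                  for a in left for b in right)
    return e_left + e_right + e_cross

xs = [0.0, 0.9, 1.7, 3.1, 4.0, 4.8]
print(total_energy(xs) - decomposed_energy(xs, split=2.5))  # exactly 0.0
```

The same bookkeeping argument is why the parallel scheme only pays off on large systems: the ghost-exchange overhead is fixed per boundary, so it is amortized only when each domain holds many atoms.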
Yes. You can refer to the third paragraph of Section 5.1 in our JCTC paper. This is the only notable change compared to NequIP. If you set
I apologize for the confusion. This is because there are two SevenNet-0 potentials. The first one, SevenNet-0 (22May2024), is the one introduced in the paper. The second one, SevenNet-0 (11July2024), is trained on the MPTrj dataset and differs from the one in our paper; it is the potential we are submitting. We had to modify the scheduler for use with the MPTrj dataset. The pre_train.yaml is what we prepared for reproducibility, as it is not introduced in the paper. I will update the sevennet.yml accordingly. Thank you for your efforts. If you have any further questions or need additional assistance, please let me know. P.S. If there are no concerns, may I link the official journal article in the
Thanks for helping to update the metadata in the
Our strong preference is for the paper field to link to a place where the manuscript or a preprint is open access so anyone can read it; we use the doi field to point to any journal publication, see
re MACE: I believe the readme is out of date and that the snapshot used for these predictions was trained with 2023-12-03-mace-128-L1.sh, which corresponds to the
Thank you for clarifying. I have updated
I agree; there’s no issue with copying
If that’s the case, I’m happy to link to arXiv instead of the journal. I will update the arXiv version to match the peer-reviewed one, as it is currently outdated.
If there's a CLI, a shell script plus the YAML would be great. Don't worry about a Python script or refactoring anything just for the purposes of merging your SevenNet results into the benchmark table. If in the future you make tweaks/refactor and train an improved model, we can always update the training metadata then to capture the best practices.
@YutackPark are you planning to make a PyPI release for the |
rename write_no_bad.py to filter_bad_preds.py
thanks @YutackPark for this excellent addition! 👍
congrats on having trained such a strong model
Dear Developers,
I hope this message finds you well. Thank you for your efforts in maintaining the leaderboard. I am writing to submit a pull request adding our model, SevenNet, to the leaderboard. I have carefully followed your instructions for submitting new models; if anything is missing or unclear, please let me know.
Our test code is very similar to test_mace.py. We found that the relaxation optimizer choices and hyperparameters used by MACE are also suitable for our model.

List of files

- 2024-07-11-sevennet-preds.csv.gz: Result of our model without outlier screening.
- 2024-07-11-sevennet-preds-no-bad.csv.gz: The main result we wish to submit to the leaderboard.
- 2024-07-11-sevennet-preds-bad.csv.gz: Screened outliers, using the same criteria as in join_mace_results.py. We have three missing predictions according to this screening.
- sevennet.yml: Metadata about our model.
- test_sevennet.py: Main benchmark code, similar to test_mace.py.
- join_results.py: Script for joining results into single .csv and .json files, similar to join_mace_results.py.
- write_no_bad.py: Script for screening outliers from model predictions, copied from join_mace_results.py.
Our code, training procedures, and model are all freely available via the SevenNet repository.
Note on our model
Our paper primarily discusses a parallel algorithm for GNN interatomic potentials, including the UIP model SevenNet-0 (22May2024) trained on the MPF dataset. However, the model we are submitting is not the same as the one introduced in the paper. We trained a new model, SevenNet-0 (11July2024), using the MPTrj dataset with the same model hyperparameters and architecture. SevenNet-0 (11July2024) is the model used to produce the results we are submitting.
Detailed training configurations to reproduce the potential SevenNet-0 (11July2024) are available in our repository.
Below are our internal results for the metrics on the leaderboard. Please inform us of any inconsistencies.
Thank you once more for your efforts in developing and maintaining this leaderboard.
Best regards,
Yutack Park
Materials Data & Informatics Laboratory
Seoul National University