ALIGNN test on WBM data #37
Conversation
Added ALIGNN test on WBM data
rename perturb->n_perturb in train_cgcnn.py
This is great, thank you @pbenner! 🎉 Sorry for the slow reply here. I was about to hit send this morning when the day suddenly turned super packed. Needless to say, I'm excited about this 1st external model submission to Matbench Discovery! I'd thought about running ALIGNN myself but expected it to perform similarly to MEGNet, so in the end I decided against it. But it looks like I was wrong: performance is quite a bit better than expected! I'll refactor the code here a bit and then start updating all the site metrics. You didn't by any chance log the training and test runs for ALIGNN with wandb?
Sorry, no. I had some minor trouble with wandb and excluded it, but it should be easy to add to the existing code.
@pbenner Just out of curiosity, did you try the pre-trained model?

```py
from alignn.pretrained import get_figshare_model

model = get_figshare_model("mp_e_form_alignn")
```

If not, I'll run it. Would be good to know how it compares to the ALIGNN specifically trained for this benchmark. Also, it looks like we can avoid writing all structures to disk as POSCAR and then reading them back in as Jarvis Atoms. We can convert them directly using

```py
from pymatgen.io.jarvis import JarvisAtomsAdaptor
```
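For reference, a minimal sketch of that direct conversion, assuming the WBM structures are available as pymatgen `Structure` objects (the POSCAR file name below is just a placeholder input):

```py
from pymatgen.core import Structure
from pymatgen.io.jarvis import JarvisAtomsAdaptor

structure = Structure.from_file("POSCAR")  # placeholder: any pymatgen Structure

# convert directly to jarvis.core.atoms.Atoms, skipping the disk round-trip
jarvis_atoms = JarvisAtomsAdaptor.get_atoms(structure)
```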
@pbenner Could you briefly explain the need for the ALIGNN patch?
remove now unneeded make_test_data.py
I didn't know about the JarvisAtomsAdaptor, that's a good hint! I tried the pre-trained model and got an MAE of 0.1095, so slightly worse. However, I'm not sure if I have to apply some corrections to the energies, so training the model myself seemed safer. In general, predictive performance is quite sensitive to the number of training samples, so it seems best to me to always use the exact same training data instead of pre-trained models (at least if the goal is to compare model architectures).

The ALIGNN patch has multiple purposes. First, it fixes some issues that arise when there is no test set (see ALIGNN issue #104). In this case we actually forked off a test set, but it might be better to use the entire data, as mentioned above, especially if the test set by chance contains some important outliers. Second, the Checkpoint() handler in ALIGNN does not define a score name (see ALIGNN train.py), so it just saves the last two models during training. With this patch, the best model in terms of validation accuracy is also saved, which is the one I use for predictions. This matters because I used a relatively large n_early_stopping in case the validation accuracy shows a double descent (see Figure 10). Maybe one also has to consider here how the individual models are trained, since longer training might be beneficial.
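To illustrate the second point, here is a rough sketch (not the actual patch) of how a pytorch-ignite `Checkpoint` handler can be given a score to track, so the best validation model is kept alongside the most recent ones. `model` and `evaluator` stand in for ALIGNN's network and validation engine, and the `"accuracy"` metric name is an assumption:

```py
from ignite.engine import Events
from ignite.handlers import Checkpoint, DiskSaver

# keep the single best model by validation accuracy, in addition to
# ALIGNN's default behavior of saving only the last two checkpoints
best_checkpoint = Checkpoint(
    {"model": model},
    DiskSaver("checkpoints", require_empty=False),
    filename_prefix="best",
    n_saved=1,
    score_name="accuracy",
    score_function=lambda engine: engine.state.metrics["accuracy"],
)
evaluator.add_event_handler(Events.COMPLETED, best_checkpoint)
```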
I also ran the pre-trained ALIGNN before merging this PR, just to check how it compares. The MAE doesn't tell the whole story: as a classifier, it significantly underperforms the ALIGNN trained from scratch for MBD.

Raw pre-trained ALIGNN: [metrics table not recovered]
Sure, the model is relatively small, so I just added it to the repo: Alignn best model
@pbenner do you know whether it's possible to run your ALIGNN model in a uip-gnn setup where it first relaxes structures according to its own predictions? In some of the metrics we see what we've been calling a characteristic fire-hook curve, which we believe is a function of the task mismatch seen in the IS2RE task that is mostly resolved with uip-relaxed-IS2RE.
@CompRhys The original ALIGNN doesn't predict forces. They released a force-field version of ALIGNN a year later in 2022, which looks similar to M3GNet. You could use the original ALIGNN to do repeated energy predictions to get finite-difference forces, but every time I've seen that tried, it worked poorly. The ALIGNN FF would be interesting to add though.
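For completeness, a sketch of what such finite-difference forces would look like. `energy_fn` is a hypothetical callable wrapping an energy-only model like ALIGNN, and the 6N energy evaluations per geometry step are exactly why this tends to be impractical:

```py
import numpy as np

def finite_diff_forces(atoms, energy_fn, eps=1e-3):
    """Central-difference forces F = -dE/dx from an energy-only model.

    atoms: ase.Atoms; energy_fn: hypothetical callable Atoms -> energy (eV).
    """
    forces = np.zeros((len(atoms), 3))
    for i in range(len(atoms)):
        for j in range(3):
            for sign in (+1, -1):
                shifted = atoms.copy()
                pos = shifted.get_positions()
                pos[i, j] += sign * eps
                shifted.set_positions(pos)
                # accumulate -(E(x + eps) - E(x - eps)) / (2 * eps)
                forces[i, j] -= sign * energy_fn(shifted) / (2 * eps)
    return forces
```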
2023/06/02 Add ALIGNN test on WBM data

* refactor ALIGNN train and test scripts
* add Gibson et al. ref link to perturb_structure() doc str, rename perturb->n_perturb in train_cgcnn.py
* ALIGNN readme: add git patch apply instructions
* add alignn/metadata.yml
* param-case output dirs
* convert test_alignn_result.json to 2023-06-02-alignn-wbm-IS2RE.csv
* add jarvis-tools, alignn to pyproject running-models optional deps
* test_alignn.py: inline load_data_directory() function
* add ALIGNN results to metrics tables and /models page
* fix module name clash with tests/test_data.py causing mypy error
* add test_pretrained_alignn.py
* test_pretrained_alignn.py: add wandb logging and slurm_submit()
* gzip 2023-06-02-alignn-wbm-IS2RE.csv
* scripts/readme.md: point to module doc strings for details on each script
* drop alignn/requirements.txt (main deps versions are in metadata.yml)
* merge test_pretrained_alignn.py into test_alignn.py, remove now unneeded make_test_data.py
* fix test_pred_files()
* add preds from pretrained mp_e_form_alignn
* change all model pred files from .csv to .csv.gz

---------

Co-authored-by: Philipp Benner <philipp.benner@bam.de>
Co-authored-by: Janosh Riebesell <janosh.riebesell@gmail.com>
Just a small update. I am currently testing the ALIGNN-FF model for computing relaxed structures.
Nice! 🎉 For CHGNet and M3GNet, I used the snippet at matbench-discovery/models/chgnet/test_chgnet.py, lines 67 to 69 in 4a90dee.
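The general pattern there (not the verbatim lines from test_chgnet.py) is an ASE relaxation driven by the model's own predictions; `calc` below is a stand-in for whichever ML force-field ASE calculator is used, e.g. one for ALIGNN-FF:

```py
from ase.constraints import ExpCellFilter
from ase.optimize import FIRE

# relax atomic positions and cell using the model's own forces/stresses;
# `atoms` is an ase.Atoms object, `calc` a stand-in ASE calculator
atoms.calc = calc
optimizer = FIRE(ExpCellFilter(atoms))
optimizer.run(fmax=0.05, steps=500)

energy = atoms.get_potential_energy()  # energy of the relaxed structure
```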
Thanks! It takes around 10 days on 4 A100 GPUs. Results should be done next week.
Sweet, looking forward to it! |
The relaxed predictions are finally done! You can find the results here:
I'm not filing a pull request, because my repository contains the full set of relaxed structures, which you might want to upload to figshare. I had a lot of fun computing the relaxed structures. Some of them converged very poorly, which is why this computation took so long. Also, it seems that the prediction error went up. Unfortunately, there is also a bug in ALIGNN or ASE, which forced me to turn off any logging/trajectories. To debug it, we would have to recompute some trajectories, but the ones that converge poorly take several days to finish.
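For reference, a sketch of that workaround, assuming an ASE-style optimizer drives the relaxation:

```py
from ase.optimize import FIRE

# run without a log file and without a trajectory writer,
# sidestepping the logging bug mentioned above
optimizer = FIRE(atoms, logfile=None)  # no trajectory=... passed
optimizer.run(fmax=0.05)
```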
Excellent. That's great news!
Sounds good. I'll start prepping a PR with your code and results then?
What?! That sounds problematic. You're saying some of the 100 test set batches (~2500 structures each) took several days? If so, that suggests a lack of robustness to data variety in the model.
Added ALIGNN test on WBM data, achieving an MAE of 0.0916. The code needs some double-checking to make sure that matbench-discovery is used correctly. coGN would be the next step.