Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not fetch GBW file by default #73

Merged
merged 3 commits into from
May 21, 2023

Conversation

danielhollas
Copy link
Collaborator

@danielhollas danielhollas commented May 12, 2023

ORCA stores molecular wavefunction in a binary GBW file, which can then be used to restart a calculation or used as a wavefunction guess. Currently, in OrcaCalculation we always fetch this file as part of the retrieved folder. However, this is not ideal since these files can get very big and it is not possible to delete them once the workflow finishes since they become part of the AiiDA provenance.

This PR is a subset of #72. Here I simply remove the GBW file from the default retrieve list. User can still add it when needed via inputs.metadata.options.additional_retrieve_list.

In #72, I proposed a new separate keyword that would instruct the calcjob to attach the GBW file as a SinglefileData output node, for easier manipulation. But as @mbercx explained in a detailed comment #72 (comment), we might want to instead have a more general input node parent_calc_folder, which is in line how other AiiDA plugins handle these big files and restarts.

I am postponing that design discussion to a future PR, since I am now in a rush to publish a first working version of my App.
Hence this PR does only the minimal thing to unblock me. Since this is a breaking change, it should be released as a version 0.7.

Since we're doing a breaking change anyway, I decided to do a little cleanup and remove the settings input Dict since it is not really needed (at least for now).

Step towards #68

@danielhollas danielhollas force-pushed the gbw-not-default branch 2 times, most recently from f1f1b7d to 1ed347b Compare May 13, 2023 10:28
@codecov-commenter
Copy link

codecov-commenter commented May 13, 2023

Codecov Report

Merging #73 (1ed347b) into develop (e267e8b) will decrease coverage by 0.06%.
The diff coverage is 100.00%.

❗ Your organization is not using the GitHub App Integration. As a result you may experience degraded service beginning May 15th. Please install the Github App Integration for your organization. Read more.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop      #73      +/-   ##
===========================================
- Coverage    57.33%   57.27%   -0.06%     
===========================================
  Files           15       15              
  Lines         1589     1587       -2     
===========================================
- Hits           911      909       -2     
  Misses         678      678              
Impacted Files Coverage Δ
aiida_orca/parsers/__init__.py 88.46% <ø> (-0.43%) ⬇️
aiida_orca/calculations/orca_orca.py 97.22% <100.00%> (ø)

@danielhollas danielhollas changed the title WIP: Do not fetch GBW file by default Do not fetch GBW file by default May 16, 2023
ORCA stores molecular wavefunction in a binary GBW file,
which can then be used to restart a calculation or used
as a wavefunction guess. Currently, in OrcaCalculation
we always fetch this file as part of the retrieved folder.
This is not ideal since these files can get very big and
it is not possible to delete them once the workflow finishes
since they become part of the AiiDA provenance.

I simply remove the GBW file from the default retrieve list.
User can still add it when needed via inputs.metadata.options.additional_retrieve_list
The additional_retrieve_list option is now available as
inputs.metadata.options.additional_retrieve_list since AiiDA 1.6.0.
aiidateam/aiida-core#4423
The only remaining supported keyword in settings Dict was 'cmdline'
for passing extra command line parameters.
However, I had a look at ORCA manual and orca binary actually does not support
any other params besides the input file. The only exception are
extra MPI params, that must be passed **after** the input file,
not before, as was done here, and needs to be escaped by quotes.
See section "Hints of the use of Parallel ORCA" in the manual.

If we ever need to support this,
it's better to give it a different and more specific name anyway.
Until then let's remove the settings input to keep the code clean.
@danielhollas danielhollas marked this pull request as ready for review May 16, 2023 15:50
@danielhollas danielhollas self-assigned this May 16, 2023
@danielhollas danielhollas requested a review from ezpzbz May 16, 2023 15:55
@danielhollas
Copy link
Collaborator Author

Paging @mbercx in case you have a bit of time to give this PR your blessing. (it's very annoying the Github does not allow me to add you as a reviewer, not sure if this can be tweaked in repo settings?)

@ezpzbz ezpzbz merged commit fc94ed1 into ezpzbz:develop May 21, 2023
@mbercx
Copy link

mbercx commented May 22, 2023

Ah, I seem to have missed this ping, apologies @danielhollas! As penance, I will review #72 in great detail once it's ready.

the additional_retrieve_list is now supported directly in aiida-core since 1.6.0. aiidateam/aiida-core#4423

I actually wasn't aware of this one. 😅 I guess we can remove it from our settings too then!

@danielhollas danielhollas deleted the gbw-not-default branch May 22, 2023 11:04
@danielhollas
Copy link
Collaborator Author

@mbercx no worries, no penance needed. 😄

@danielhollas
Copy link
Collaborator Author

@pzarabadip thanks for merging and publishing a new version! 👏

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants