
Delete VASP files when rerunning jobs? #197

Closed
QuantumChemist opened this issue Oct 17, 2024 · 7 comments · Fixed by #201

Comments

@QuantumChemist
Contributor

Hello everyone 😃 ,

I wanted to ask whether it is possible to delete all (VASP) files automatically when rerunning a job.

For now, when I want to rerun a VASP job, I have to manually delete the old VASP output files, or the job will fail because of a problem with overwriting the old files. Other job types don't have this problem.

Thank you for your help in advance!

@utf
Collaborator

utf commented Oct 17, 2024

One option would be to set `force_overwrite="force"` in `copy_vasp_outputs`. This is controlled in VASP jobs by the `copy_vasp_kwargs` option of the `BaseVaspMaker`. You can change this when submitting the job or by updating the jobflow-remote database before resubmitting.
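
For concreteness, a minimal sketch of that route (hedged: not verified against a specific atomate2 version; `RelaxMaker` stands in for any `BaseVaspMaker` subclass, and `structure`/`prev_dir` are placeholders you supply):

```python
from atomate2.vasp.jobs.core import RelaxMaker  # any BaseVaspMaker subclass

# copy_vasp_kwargs is forwarded to copy_vasp_outputs, which copies files
# from the previous calculation directory into the new run directory.
maker = RelaxMaker(copy_vasp_kwargs={"force_overwrite": "force"})

# prev_dir points at the earlier calculation whose outputs are copied in.
job = maker.make(structure, prev_dir=prev_dir)
```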

@gpetretto
Contributor

Hi @QuantumChemist, indeed this is a point that we have been discussing internally as well. As we have already added the option to delete the files when deleting a job or a flow, it would make sense to have this option when rerunning too.
I would take the chance to discuss a few points for the implementation:

  1. Should file deletion be the default, or should it be an option? In general it should not be a problem, but it will slow down the execution of the rerun.
  2. What should happen if the file deletion fails? Should the job be rerun anyway, or should the rerun stop with an error? When dealing with a single Job it may not be a big deal, but when rerunning a Job plus some of its children, a failure during file deletion may leave some jobs that have completed the rerun procedure and some that have not. Of course this could always happen in case of issues with the DB connection or a bug in the code, but adding the file deletion likely increases the chances.
  3. For Flow deletion I was afraid of accidental removal of additional files, so I added a security check to prevent it. See the PR for details: Minor updates #156. I would be inclined to preserve the same check here, but without a failure if the file is not identified.
  4. Following @utf's suggestion, there could also be an entirely alternative implementation: instead of deleting the files when rerunning, a job could be marked as rerun and jobflow-remote could delete all the files (except those related to jobflow-remote) in the folder just before calling Job.run() (see the sketch after this comment). Actually, this could in principle always be done, without the need to mark a job as rerun.

What do you think?
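
For concreteness, a hypothetical sketch of option 4 (the `KEEP` names are illustrative placeholders, not jobflow-remote's actual bookkeeping files):

```python
from pathlib import Path
import shutil

# Illustrative placeholders for jobflow-remote's own files, which must
# survive the cleanup; the real file names may differ.
KEEP = {"jfremote_in.json", "jfremote_out.json"}

def clean_workdir(workdir: Path) -> None:
    """Delete everything in the job folder except bookkeeping files,
    e.g. just before Job.run() is called on a rerun."""
    for path in workdir.iterdir():
        if path.name in KEEP:
            continue
        if path.is_dir():
            shutil.rmtree(path)
        else:
            path.unlink()
```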

@QuantumChemist
Contributor Author

> One option would be to set `force_overwrite="force"` in `copy_vasp_outputs`. This is controlled in VASP jobs by the `copy_vasp_kwargs` option of the `BaseVaspMaker`. You can change this when submitting the job or by updating the jobflow-remote database before resubmitting.

Thank you @utf! I wasn't aware that this option existed.

@JaGeo, what do you say? Should I hardcode `force_overwrite` into the VASP makers in the workflow so it is always used when the jobs are submitted? 👀

@utf
Collaborator

utf commented Oct 17, 2024

@gpetretto These are my personal preferences but I'm very open to other thoughts:

  1. I think that deleting the files should be the default. It will help to make reruns more deterministic.
  2. If deletion fails, it should stop with an error. If you've tried to rerun many jobs, the script should continue on to the next one and report the issues at the end.
  3. This makes sense.
  4. I suggested this as a temporary fix; I think what you're proposing is a better solution.

@QuantumChemist
Contributor Author

> Hi @QuantumChemist, indeed this is a point that we have been discussing internally as well. […] What do you think?

Hi @gpetretto :)

I would say the best option would be to make the deletion optional, and if the deletion fails, rerunning should stop with an error.

Regarding point 4: I would say that in general it might be a really good idea to mark the jobs as "RERUNNING" or similar (a rough sketch follows below).
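
Purely illustrative, a marker state of that kind (the states shown are a simplified subset, not jobflow-remote's actual JobState enum):

```python
from enum import Enum

class JobState(Enum):
    # Simplified subset for illustration; not the real jobflow-remote enum.
    READY = "READY"
    RUNNING = "RUNNING"
    COMPLETED = "COMPLETED"
    FAILED = "FAILED"
    RERUNNING = "RERUNNING"  # proposed: files get cleaned before Job.run()
```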

@JaGeo
Collaborator

JaGeo commented Oct 18, 2024

I personally think deletion should be the default to make it deterministic, as @utf says, but maybe users should be warned that files will be deleted during a standard rerun.

@QuantumChemist
Contributor Author

Thank you! :)
