Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[query] VDS combiner will not load a successfully completed plan causing combiner to be rerun when a successful run has been completed #14079

Closed
chrisvittal opened this issue Dec 7, 2023 · 2 comments · Fixed by #14087
Assignees

Comments

@chrisvittal
Copy link
Collaborator

The constructor to the VDS Combiner has this sanity check:

if not (vdses or gvcfs):
raise ValueError("one of 'vdses' or 'gvcfs' must be nonempty")

A complete combiner will not have any vdses or gvcfs present, so that sanity check will fail and the combiner will be rerun in its entirety.

It is a valid state for a VariantDatasetCombiner to have no vdses or gvcfs (when it is done), and so the fix is straightforward, remove the sanity check. A similar one already exists in new_combiner and VariantDatasetCombiner.__init__ isn't really part of the public interface.

I'm undecided if we should add a different sanity check to maybe_load_from_saved_path to see if the final file is present if the combiner is done. Though better logging will be added to that function so that the message from the exception is logged.

@chrisvittal chrisvittal added the bug label Dec 7, 2023
@chrisvittal chrisvittal self-assigned this Dec 7, 2023
@danking danking added the query label Dec 7, 2023
@danking danking changed the title VDS combiner will not load a successfully completed plan causing combiner to be rerun when a successful run has been completed [query] VDS combiner will not load a successfully completed plan causing combiner to be rerun when a successful run has been completed Dec 7, 2023
@danking
Copy link
Contributor

danking commented Dec 7, 2023

Either as part of this issue or in a new issue can we also add a check that prevents deletion of a VDS whose reference and variant MTs have _SUCCESS files?

@chrisvittal
Copy link
Collaborator Author

There's nothing in the code that currently can delete the final VDS, but I will add a check for existence to the various ways of obtaining a combiner object.

danking pushed a commit that referenced this issue Dec 9, 2023
…14087)

* Add assertion to `load_combiner` and `new_combiner` to fail if the
output vds exists
* Remove assertion that disallows empty `gvcfs` and `vdses` in
`VariantDatasetCombiner.__init__`

Resolves #14079
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants