add workflows to add to or create library and sample tables based on demultiplexing output #507

Merged 34 commits on Jan 29, 2024

@tomkinsc tomkinsc commented Jan 26, 2024

This PR adds one new workflow:

  • populate_library_and_sample_tables_from_flowcell: populates the library and sample tables with per-library-lane and per-sample entries (a sample being a named reference to one or more libraries), using existing demultiplexing output
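
To illustrate the two table granularities described above (one row per library-lane, and one row per sample that refers to one or more libraries), here is a minimal Python sketch. It is not the code from this PR: the function name `build_tables`, the input shape, and the column names `library_id`, `sample`, `sample_id`, and `libraries` are all assumptions made for the example.

```python
from collections import defaultdict


def build_tables(library_lanes):
    """Group library-lane IDs under their sample names.

    `library_lanes` is an iterable of (sample_name, library_lane_id) pairs.
    This input shape and all column names below are hypothetical.
    """
    library_rows = []
    sample_to_libraries = defaultdict(list)
    for sample_name, library_id in library_lanes:
        # One row per library-lane, pointing back at its sample.
        library_rows.append({"library_id": library_id, "sample": sample_name})
        sample_to_libraries[sample_name].append(library_id)

    # One row per sample, referencing every library that belongs to it.
    sample_rows = [
        {"sample_id": name, "libraries": libs}
        for name, libs in sorted(sample_to_libraries.items())
    ]
    return library_rows, sample_rows
```

A sample sequenced on two lanes would therefore yield two library rows but a single sample row listing both libraries.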

It also adds the same functionality as an optional step that the existing demux_deplete workflow runs after demux when the input insert_demux_outputs_into_terra_tables=true. In that case the step uses outputs passed directly from demultiplexing rather than live table data.

These workflows rely on a new task, also added by this PR:

  • tasks_terra.wdl::create_or_update_sample_tables

…tebook; add demux workflow calling it at the end
output method version (branch/tag if from Dockstore), source (dockstore, etc.), and path (source URL) from check_terra_env
refactor gcloud token fetch to a single call and reuse value (default expiration is 3600 seconds)
coerce demux bam File array to String array (so we have their GS paths during table population but do not try to localize the actual bam files)
see: https://github.com/openwdl/wdl/blob/main/versions/1.0/SPEC.md#type-coercion
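
The token-reuse idea in the commit notes above (fetch the gcloud token once and reuse it within its 3600-second default lifetime) can be sketched roughly as follows. This is an illustration only, not the code from this PR: the `CachedToken` class, its parameters, and the 60-second safety margin are assumptions. The only real external call is the `gcloud auth print-access-token` command.

```python
import subprocess
import time


def fetch_gcloud_token():
    """Ask the gcloud CLI for an access token (requires gcloud on PATH)."""
    result = subprocess.run(
        ["gcloud", "auth", "print-access-token"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


class CachedToken:
    """Hypothetical helper: fetch a token once and reuse it until near expiry."""

    def __init__(self, fetch=fetch_gcloud_token, lifetime_s=3600,
                 margin_s=60, clock=time.monotonic):
        self._fetch = fetch            # callable returning a fresh token
        self._lifetime_s = lifetime_s  # assumed token lifetime in seconds
        self._margin_s = margin_s      # refresh slightly early (assumption)
        self._clock = clock            # injectable clock, useful for testing
        self._token = None
        self._fetched_at = None

    def get(self):
        """Return the cached token, refetching only when it is missing or stale."""
        now = self._clock()
        stale = (
            self._token is None
            or now - self._fetched_at >= self._lifetime_s - self._margin_s
        )
        if stale:
            self._token = self._fetch()
            self._fetched_at = now
        return self._token
```

Passing the fetch function and clock as parameters keeps the sketch testable without a gcloud installation; repeated calls to `get()` within the lifetime trigger a single fetch.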
@tomkinsc tomkinsc requested a review from dpark01 January 29, 2024 17:03

@dpark01 dpark01 left a comment


So I looked through this and overall it looks great. The name populate_library_and_sample_tables_from_flowcell is a mouthful but I can't think of anything better.

My main thing is I'm not a fan of demux_deplete_and_table_insert as it currently stands. One minor thing is that it's got a ton of boilerplate workflow-level inputs that are only used once (to set subworkflow inputs that could have just been left unbound).

But the bigger thing is that I think it might be better if this whole thing were just an optional toggle-able behavior in demux_deplete and not a separate workflow. I.e. maybe a boolean workflow input to demux_deplete that says "do the table insert thing" and then add an if block to the end that calls the two terra-specific tasks (if both requested and possible).

…le insertion to demux_deplete

remove demux_deplete_and_table_insert workflow; add conditional post-demux table insertion to demux_deplete, which can be toggled via the added boolean 'insert_demux_outputs_into_terra_tables'
@tomkinsc

Good point RE the half-dozen or so workflow-level inputs of demux_deplete_and_table_insert.
If we're ready to include the table functionality as an optional step of demux_deplete, I'll add it to that workflow and remove the then-redundant demux_deplete_and_table_insert workflow.

@tomkinsc tomkinsc merged commit a019ed8 into master Jan 29, 2024
12 checks passed
@tomkinsc tomkinsc deleted the ct-setup-sample-tables branch January 29, 2024 22:37