-
Notifications
You must be signed in to change notification settings - Fork 28
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
guess instrument model in demux; pass to output #260
Conversation
@@ -164,6 +164,7 @@ workflow demux_deplete { | |||
File spikein_counts = spike_summary.count_summary | |||
|
|||
String run_date = illumina_demux.run_info[0]['run_start_date'] | |||
String instrument_model = select_first(flatten([illumina_demux.instrument_model,[""]])) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
miniwdl check
is erroring in travis because it doesn't like workflow output variables having the same name as workflow input variables -- perhaps change the name slightly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was residual, but I'll keep it as an override—much better. Thanks for taking a look at this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the continuing CI issue though is that the output variable name is the same as the input variable name. Maybe change the outputs to instrument_model_inferred
or something?
@@ -116,7 +116,7 @@ workflow demux_deplete { | |||
platform = "ILLUMINA", | |||
paired = (illumina_demux.run_info[0]['indexes'] == '2'), | |||
out_name = "sra_metadata-~{flowcell_id}.tsv", | |||
instrument_model = select_first([instrument_model]), | |||
instrument_model = select_first(flatten([illumina_demux.instrument_model,[""]])), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a little confused what the flatten
is for and why this can't be constructed a little simpler… (like select_first([illumina_demux.instrument_model,""])
) and, actually, if we're going to keep the instrument_model
) workflow level input, maybe the workflow level input should be optional (String?
) and this line could become:
instrument_model = select_first([instrument_model, illumina_demux.instrument_model])
Meaning that a user-specified value would override, but absent that, it defaults to auto-inferring it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
illumina_demux.instrument_model
is an array as returned by the scatter-gather of demux jobs, no? I was under the impression we'd need to flatten it since select_first()
would still return an Array[String]
in the proposed change. Or maybe I'm misunderstanding the WDL docs?
allow for override of instrument_model with instrument_model_user_specified
…assembly comment output instrument_model from demux_deplete in sarscov2_illumina_full assembly comment; pass in overrride to to demux_deplete via instrument_model_user_specified
…titute/viral-pipelines into ct-guess-instrument-model
rename instrument_model to instrument_model_inferred in outputs to avoid collision with task-level output naming
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is ready (presuming viral-core is ready, i've lost track)
This is currently pending including of viral-core as a required module in this repo 2.1.24 (#274), and following that manual check of the resulting CI demux job on DNAnexus. |
Thanks, I switched to use |
pipes/WDL/tasks/tasks_demux.wdl
Outdated
String instrument_model = read_string("MACHINE_MODEL") | ||
String flowcell_lane_count = read_string("LANE_COUNT") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
String instrument_model = read_string("MACHINE_MODEL") | |
String flowcell_lane_count = read_string("LANE_COUNT") | |
String instrument_model = read_json("~{out_base}-runinfo.json")["sequencer_model"] | |
String flowcell_lane_count = read_json("~{out_base}-runinfo.json")["lane_count"] |
pipes/WDL/tasks/tasks_demux.wdl
Outdated
illumina.py flowcell_metadata --inDir $FLOWCELL_DIR flowcellMetadataFile.tsv | ||
|
||
# output machine model and lane count | ||
grep "machine" flowcellMetadataFile.tsv | cut -f2 | tee MACHINE_MODEL | ||
grep "lane_count" flowcellMetadataFile.tsv | cut -f2 | tee LANE_COUNT | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you can drop these lines if you make the changes below in the output section
WIP: this depends on the unreleased viral-core:2.1.24 (or later) with the following merged in: