Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

guess instrument model in demux; pass to output #260

Merged
merged 25 commits into from
May 1, 2021

Conversation

tomkinsc
Copy link
Member

@tomkinsc tomkinsc commented Apr 14, 2021

WIP: this depends on the unreleased viral-core:2.1.24 (or later) with the following merged in:

@@ -164,6 +164,7 @@ workflow demux_deplete {
File spikein_counts = spike_summary.count_summary

String run_date = illumina_demux.run_info[0]['run_start_date']
String instrument_model = select_first(flatten([illumina_demux.instrument_model,[""]]))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

miniwdl check is erroring in travis because it doesn't like workflow output variables having the same name as workflow input variables -- perhaps change the name slightly?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was residual, but I'll keep it as an override—much better. Thanks for taking a look at this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the continuing CI issue though is that the output variable name is the same as the input variable name. Maybe change the outputs to instrument_model_inferred or something?

@@ -116,7 +116,7 @@ workflow demux_deplete {
platform = "ILLUMINA",
paired = (illumina_demux.run_info[0]['indexes'] == '2'),
out_name = "sra_metadata-~{flowcell_id}.tsv",
instrument_model = select_first([instrument_model]),
instrument_model = select_first(flatten([illumina_demux.instrument_model,[""]])),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little confused what the flatten is for and why this can't be constructed a little simpler… (like select_first([illumina_demux.instrument_model,""])) and, actually, if we're going to keep the instrument_model) workflow level input, maybe the workflow level input should be optional (String?) and this line could become:

instrument_model = select_first([instrument_model, illumina_demux.instrument_model])

Meaning that a user-specified value would override, but absent that, it defaults to auto-inferring it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

illumina_demux.instrument_model is an array as returned by the scatter-gather of demux jobs, no? I was under the impression we'd need to flatten it since select_first() would still return an Array[String] in the proposed change. Or maybe I'm misunderstanding the WDL docs?

tomkinsc and others added 12 commits April 24, 2021 23:58
allow for override of instrument_model with instrument_model_user_specified
…assembly comment

output instrument_model from demux_deplete in sarscov2_illumina_full assembly comment; pass in overrride to to demux_deplete via instrument_model_user_specified
…titute/viral-pipelines into ct-guess-instrument-model
rename instrument_model to instrument_model_inferred in outputs to avoid collision with task-level output naming
Copy link
Member

@dpark01 dpark01 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is ready (presuming viral-core is ready, i've lost track)

@tomkinsc
Copy link
Member Author

This is currently pending including of viral-core as a required module in this repo 2.1.24 (#274), and following that manual check of the resulting CI demux job on DNAnexus.

@tomkinsc tomkinsc marked this pull request as draft April 29, 2021 16:57
@dpark01
Copy link
Member

dpark01 commented Apr 30, 2021

Hm, I just merged #275 which now adds a run_info Map[String][String] to the illumina_demux task output. So you could just pull these out via illumina_demux.run_info['lane_count'] and illumina_demux.runinfo['sequencer_model'] (or whatever we load here)

@tomkinsc tomkinsc marked this pull request as ready for review April 30, 2021 17:48
@tomkinsc
Copy link
Member Author

Thanks, I switched to use illumina_demux.runinfo['sequencer_model'] where appropriate. The demux test on DNAnexus for this PR did output the machine name, so I guess this is ready to merge?

Comment on lines 362 to 363
String instrument_model = read_string("MACHINE_MODEL")
String flowcell_lane_count = read_string("LANE_COUNT")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
String instrument_model = read_string("MACHINE_MODEL")
String flowcell_lane_count = read_string("LANE_COUNT")
String instrument_model = read_json("~{out_base}-runinfo.json")["sequencer_model"]
String flowcell_lane_count = read_json("~{out_base}-runinfo.json")["lane_count"]

Comment on lines 295 to 300
illumina.py flowcell_metadata --inDir $FLOWCELL_DIR flowcellMetadataFile.tsv

# output machine model and lane count
grep "machine" flowcellMetadataFile.tsv | cut -f2 | tee MACHINE_MODEL
grep "lane_count" flowcellMetadataFile.tsv | cut -f2 | tee LANE_COUNT

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can drop these lines if you make the changes below in the output section

@dpark01 dpark01 merged commit e1b71c2 into master May 1, 2021
@dpark01 dpark01 deleted the ct-guess-instrument-model branch May 1, 2021 03:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants