Ewing cell typing part 3 #134

allyhawkins · 2025-03-07T15:46:56Z

Closes #129

This PR adds the cell type assignment script to the main cell-type-ewings workflow and should be the last PR needed to complete that module.

To do this, I had to use the output from the cell-type-consensus module as input. I chose to filter to the correct samples in main.nf in the root of this repo so that the Ewing module only takes in the correct samples rather than everything. Are we okay with this choice or would we prefer to filter within the Ewing module workflow?
While testing, I had to make a small change to the cell type assignment script to account for cases where no cells are assigned to any of the tumor cell states.
I had been struggling with how to match up the input files for the process when a sample has multiple libraries, but I ended up handling that in bash rather than in Groovy. I like my approach and think it's easy to follow, but let me know if there is an alternative you prefer.

jashapiro

This looks good, and overall I like your approach. I did suggest a change to get the library ids with Groovy instead, though, as it adds a bit more flexibility if file names change for some reason (like your change to no longer compress the files).

I also don't think you need to filter both input channels, but I do think maybe it makes sense to filter in the main workflow? I can't decide, to be honest.

jashapiro · 2025-03-07T15:50:49Z

config/module_params.config

@@ -18,6 +18,7 @@ params{
  cell_type_ewings_msigdb_list = 'https://raw.githubusercontent.com/AlexsLemonade/OpenScPCA-analysis/refs/tags/v0.2.2/analyses/cell-type-ewings/references/msigdb-gene-sets.tsv'
  cell_type_ewings_ews_high_list = 'https://raw.githubusercontent.com/AlexsLemonade/OpenScPCA-analysis/refs/tags/v0.2.2/analyses/cell-type-ewings/references/gene_signatures/aynaud-ews-targets.tsv'
  cell_type_ewings_ews_low_list = 'https://raw.githubusercontent.com/AlexsLemonade/OpenScPCA-analysis/refs/tags/v0.2.2/analyses/cell-type-ewings/references/gene_signatures/wrenn-nt5e-genes.tsv'
-  cell_type_ewings_marker_gene_file = 'https://raw.githubusercontent.com/AlexsLemonade/OpenScPCA-analysis/refs/tags/v0.2.2/analyses/cell-type-ewings/references/gene_signatures/tumor-cell-state-markers.tsv'
+  cell_type_ewings_marker_gene_file = 'https://raw.githubusercontent.com/AlexsLemonade/OpenScPCA-analysis/refs/tags/v0.2.2/analyses/cell-type-ewings/references/tumor-cell-state-markers.tsv'
+  cell_type_ewings_auc_thresholds_file = "${projectDir}/modules/cell-type-ewings/resources/auc-thresholds.tsv"


I can't quite decide if this file should be a parameter, but I think it is fine to leave it as one.

main.nf