Ewing cell typing part 3 #134

Open
wants to merge 18 commits into
base: main

allyhawkins

Closes #129

This PR adds the cell type assignment script to the main cell-type-ewings workflow and should be the last PR needed to complete that module.

  • To do this, I had to use the output from the cell-type-consensus module as input. I chose to filter to the correct samples in main.nf in the root of this repo so that the Ewing module only takes in the correct samples rather than everything. Are we okay with this choice or would we prefer to filter within the Ewing module workflow?
  • During testing, I had to make a small change to the cell type assignment script to handle the case where no cells are assigned to any of the tumor cell states.
  • I had been struggling with how to match the input files up for the process when accounting for multiple libraries per sample, but ended up dealing with that in bash rather than in groovy. I think I like my approach and it's easy to follow, but let me know if you have an alternative approach you prefer.

allyhawkins requested a review from jashapiro on March 7, 2025, 15:47

jashapiro left a comment

This looks good, and overall I like your approach. I did suggest a change to get the library IDs with Groovy instead, though, since it adds a bit more flexibility if file names change for some reason (like your change to stop compressing).

I also don't think you need to filter both input channels, but I do think maybe it makes sense to filter in the main workflow? I can't decide, to be honest.

@@ -18,6 +18,7 @@ params{
cell_type_ewings_msigdb_list = 'https://raw.githubusercontent.com/AlexsLemonade/OpenScPCA-analysis/refs/tags/v0.2.2/analyses/cell-type-ewings/references/msigdb-gene-sets.tsv'
cell_type_ewings_ews_high_list = 'https://raw.githubusercontent.com/AlexsLemonade/OpenScPCA-analysis/refs/tags/v0.2.2/analyses/cell-type-ewings/references/gene_signatures/aynaud-ews-targets.tsv'
cell_type_ewings_ews_low_list = 'https://raw.githubusercontent.com/AlexsLemonade/OpenScPCA-analysis/refs/tags/v0.2.2/analyses/cell-type-ewings/references/gene_signatures/wrenn-nt5e-genes.tsv'
- cell_type_ewings_marker_gene_file = 'https://raw.githubusercontent.com/AlexsLemonade/OpenScPCA-analysis/refs/tags/v0.2.2/analyses/cell-type-ewings/references/gene_signatures/tumor-cell-state-markers.tsv'
+ cell_type_ewings_marker_gene_file = 'https://raw.githubusercontent.com/AlexsLemonade/OpenScPCA-analysis/refs/tags/v0.2.2/analyses/cell-type-ewings/references/tumor-cell-state-markers.tsv'
cell_type_ewings_auc_thresholds_file = "${projectDir}/modules/cell-type-ewings/resources/auc-thresholds.tsv"

I can't quite decide if this file should be a parameter, but I think it is fine to leave it as one.

// combine aucell and gene set output with consensus cell types
assign_ch = ewing_aucell.out
// join by sample ID and project ID
.join(consensus_ch, by: [0, 1]) // sample id, project id, aucell, mean exp, consensus, consensus gene exp

Join drops non-matching items, which is why we shouldn't need to pre-filter consensus_ch.
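
To illustrate the point about join, the matching behavior can be emulated in plain Groovy as an inner join on the first two tuple elements. This is only a sketch of the semantics, not Nextflow's implementation, and all sample IDs, project IDs, and file names below are made-up placeholders.

```groovy
// Hypothetical tuples of [sample id, project id, file]; all values are placeholders.
def aucell = [
    ['SCPCS000001', 'SCPCP000001', 'aucell.tsv'],
    ['SCPCS000002', 'SCPCP000001', 'aucell-2.tsv']
]
def consensus = [
    ['SCPCS000001', 'SCPCP000001', 'consensus.tsv'],
    ['SCPCS000009', 'SCPCP000002', 'unrelated-consensus.tsv']
]

// Index the right-hand side by its key (elements 0 and 1).
def consensusByKey = consensus.collectEntries { [(it[0..1]): it.drop(2)] }

// Keep only left-hand tuples whose key also exists on the right,
// then append the right-hand values, mirroring an inner join.
def matched = aucell.findAll { consensusByKey.containsKey(it[0..1]) }
def joined = matched.collect { it + consensusByKey[it[0..1]] }

println joined
```

Only the tuple whose sample ID and project ID appear on both sides survives, so the unrelated consensus entry never reaches the downstream step even without a separate filter.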

@@ -4,7 +4,7 @@

process ewing_aucell {
container params.cell_type_ewing_container
- tag "${project_id}"
+ tag "${sample_id}"

Very minor, but can you also update this in cell-type-consensus?

allyhawkins and others added 3 commits March 11, 2025 16:19
Co-authored-by: Joshua Shapiro <josh.shapiro@ccdatalab.org>
path(celltype_assignment_output_files)
script:
library_ids = aucell_files.collect{(it.name =~ /SCPCL\d{6}/)[0]}
celltype_assignment_output_files = library_ids.collect{"${it}_ewing-aucell-results.tsv"}

My bad, again.

Suggested change
- celltype_assignment_output_files = library_ids.collect{"${it}_ewing-aucell-results.tsv"}
+ celltype_assignment_output_files = library_ids.collect{"${it}_ewing-celltype-assignments.tsv"}
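
The library ID extraction and output naming from the script block above can be tried on their own in plain Groovy. This is a minimal sketch, assuming each input file name contains an SCPCL library ID. The example file names are hypothetical, and plain strings stand in for the path objects, so the closure uses it rather than it.name.

```groovy
// Hypothetical input file names; only the SCPCL + six digits pattern
// and the output suffix come from the diff above.
def aucellFileNames = [
    'SCPCL000001_ewing-aucell-results.tsv',
    'SCPCL000002_ewing-aucell-results.tsv.gz'
]

// The find operator (=~) returns a Matcher; [0] picks the first match.
def libraryIds = aucellFileNames.collect { (it =~ /SCPCL\d{6}/)[0] }

// Build the expected output file name for each library.
def outputFiles = libraryIds.collect { "${it}_ewing-celltype-assignments.tsv".toString() }

println libraryIds
println outputFiles
```

Because the ID is matched by pattern rather than by stripping a fixed suffix, the compressed and uncompressed names yield the same kind of result, which is the flexibility mentioned in the review.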

Successfully merging this pull request may close these issues.

Add cell-type-ewings module
2 participants