Difference between gene_spectra_score.k_3.dt_0_1.txt and gene_spectra_tpm.k_3.dt_0_1.txt #98

Open
tujchl opened this issue Oct 8, 2024 · 0 comments

tujchl commented Oct 8, 2024

Hi,

I applied cNMF to analyze my scRNA-seq data, and I focused on two output files: gene_spectra_score.k_3.dt_0_1.txt and gene_spectra_tpm.k_3.dt_0_1.txt. According to the manual, these files should represent essentially the same information but on different scales (Z-score vs. TPM). However, I noticed a significant discrepancy between the top 50 genes from each file, with only 15 genes overlapping between them. I used the R code below to extract the top genes for each GEP. Could there be an issue with my approach?
##########################################
# Load one of the two spectra files (swap the file name to compare them):
#   "MC.Merge.gene_spectra_score.k_3.dt_0_1.txt"  or
#   "MC.Merge.gene_spectra_tpm.k_3.dt_0_1.txt"
exprSet = read.table(file = "MC.Merge.gene_spectra_score.k_3.dt_0_1.txt", header = TRUE, sep = "\t")
exprSet = exprSet[,-1]  # drop the first column
rownames(exprSet) = paste0("C", 1:3)  # label the three GEPs C1, C2, C3
# For each GEP (row), take the names of the 50 highest-valued genes
top_genes = apply(exprSet, 1, function(x){ names(sort(x, decreasing = TRUE))[1:50]})
###########################################
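To make the comparison concrete, the overlap count can be computed roughly as follows. This is only a minimal sketch on a made-up toy matrix, not the actual cNMF output; the helper `top_genes_per_gep` and all values are assumptions introduced for illustration.

```r
# Toy data: 2 GEPs x 6 genes. Values are invented for illustration only.
genes <- paste0("gene", 1:6)
score <- matrix(c(5, 4, 3, 2, 1, 0,
                  0, 1, 2, 3, 4, 5),
                nrow = 2, byrow = TRUE,
                dimnames = list(c("C1", "C2"), genes))
tpm <- matrix(c(1, 9, 8, 2, 7, 0,
                0, 1, 2, 3, 4, 5),
              nrow = 2, byrow = TRUE,
              dimnames = list(c("C1", "C2"), genes))

# Hypothetical helper: names of the n highest-valued genes in each row (GEP).
top_genes_per_gep <- function(mat, n) {
  lapply(seq_len(nrow(mat)), function(i) {
    names(sort(mat[i, ], decreasing = TRUE))[seq_len(n)]
  })
}

top_score <- top_genes_per_gep(score, 3)
top_tpm   <- top_genes_per_gep(tpm, 3)

# Number of top genes shared between the two rankings, per GEP.
overlap <- mapply(function(a, b) length(intersect(a, b)), top_score, top_tpm)
names(overlap) <- rownames(score)
print(overlap)
```

With the real files, the two matrices would be the loaded score and TPM tables and `n` would be 50.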
Additionally, I exported the counts file with only HVGs from Seurat before applying cNMF, assuming it would save computation time. Is this approach acceptable?
