Different results between absoluted and doabsoluted? #26
code is below |
Hi @xiasijian, thanks for reporting this comparison. Since DoAbsolute is just an open-source wrapper around ABSOLUTE, only two things could explain the minor difference.
I assume you are using the same version of ABSOLUTE in both cases, so point 2 would be the only reason. The related code is referenced below. Lines 18 to 55 in 8748cfb
Lines 136 to 141 in 8748cfb
Line 203 in 8748cfb
|
Thanks for your quick reply, but I found a problem in your example data. |
However, if I use only the original test data, the two give the same results. |
Thanks. |
I do not know why this is, can you tell me? Thanks. |
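I don't quite follow. Didn't you say the results were consistent? And what do you mean by "the original test data"? |
Teacher Wang, here is what I mean: I found that when the MAF and Seg files contain few rows (as in the figure above, where I subsampled them), the two tools give different results. By "the original test data" I mean the data without any subsampling.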
|
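Then I can only conclude that the two rounds of subsampling produced different data. Also, check whether you ran DoAbsolute in parallel; if you did, set.seed may not work. I have already listed the other possible causes above, and I have no other explanation. The simplest way to test reproducibility is to run up to the call to RunAbsolute and check whether the input files at that point match the input files you pass when running ABSOLUTE on its own. Lines 285 to 295 in 8748cfb (https://github.com/ShixiangWang/DoAbsolute/blob/8748cfb39eca5059f539a57829c49c83a869a31a/R/DoAbsolute.R#L285-L295)
You can set cores = 1, run debug(ABSOLUTE::RunAbsolute), then run DoAbsolute and wait until it stops just before the code above is executed. |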
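The input check suggested above can be sketched in base R. This is a minimal illustration, not code from DoAbsolute or ABSOLUTE: `compare_inputs` is a hypothetical helper, and the toy segment table and temporary files are made up. In practice the two paths would be the files handed to RunAbsolute in each workflow.

```r
# Hypothetical helper (not part of DoAbsolute or ABSOLUTE):
# compare two tab-delimited input files byte-wise and as parsed tables.
compare_inputs <- function(path_a, path_b) {
  a <- read.delim(path_a, stringsAsFactors = FALSE)
  b <- read.delim(path_b, stringsAsFactors = FALSE)
  list(
    same_bytes = unname(tools::md5sum(path_a) == tools::md5sum(path_b)),
    same_table = isTRUE(all.equal(a, b))
  )
}

# Toy segment table standing in for a real Seg input (made-up values).
seg <- data.frame(
  Sample = "S1",
  Chromosome = c(1, 2),
  Start = c(100, 200),
  End = c(500, 900),
  Num_Probes = c(10, 20),
  Segment_Mean = c(0.12, -0.30)
)

file_a <- tempfile(fileext = ".seg")
file_b <- tempfile(fileext = ".seg")
write.table(seg, file_a, sep = "\t", row.names = FALSE, quote = FALSE)
write.table(seg, file_b, sep = "\t", row.names = FALSE, quote = FALSE)
res_same <- compare_inputs(file_a, file_b)

# Change one value in the second file and compare again.
seg$Segment_Mean[2] <- -0.31
write.table(seg, file_b, sep = "\t", row.names = FALSE, quote = FALSE)
res_diff <- compare_inputs(file_a, file_b)

print(res_same)
print(res_diff)
```

Checking both the checksum and the parsed table helps separate harmless formatting differences (for example, number formatting) from real differences in the data.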
Teacher Wang, the inputs are identical.
|
OK, thank you. I will try the approach you suggested.
|
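Teacher Wang, here are my test script and its output. I loaded the final RData files saved by both tools and confirmed that the inputs are the same, but the model.res results are completely different, which is why the final results disagree. |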
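A comparison of two saved RData files like the one described can be sketched as follows. This is only an illustration under stated assumptions: `load_rdata` and `compare_rdata` are hypothetical helpers, and `toy_result` is a made-up object, not the actual structure that ABSOLUTE saves.

```r
# Hypothetical helper: load an .RData file into its own environment
# so that objects with the same name do not overwrite each other.
load_rdata <- function(path) {
  env <- new.env()
  load(path, envir = env)
  env
}

# Hypothetical helper: for every object name present in both files,
# return TRUE if the objects are equal, otherwise a description of
# the differences as reported by all.equal().
compare_rdata <- function(path_a, path_b) {
  env_a <- load_rdata(path_a)
  env_b <- load_rdata(path_b)
  shared <- intersect(ls(env_a), ls(env_b))
  lapply(setNames(shared, shared), function(nm) {
    all.equal(get(nm, envir = env_a), get(nm, envir = env_b))
  })
}

# Toy object standing in for a saved result (made-up structure and values).
toy_result <- list(input = c(0.1, 0.2), fit = list(purity = 0.60, ploidy = 2.1))
file_a <- tempfile(fileext = ".RData")
save(toy_result, file = file_a)

# Same input, different fitted value, saved under the same object name.
toy_result$fit$purity <- 0.45
file_b <- tempfile(fileext = ".RData")
save(toy_result, file = file_b)

report <- compare_rdata(file_a, file_b)
print(report$toy_result)
```

Because `all.equal()` names the component that differs, the report shows where two results start to diverge, which narrows down the step to inspect.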
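Teacher Wang, one more question: the final output of DoAbsolute is NA. What exactly does that mean? |
That is really strange. You should be using the same version of ABSOLUTE in both cases. |
Then the default arguments used in the two calls must differ. Please check again; there should be no other explanation, otherwise it would be black magic. |
Teacher Wang, it really is puzzling. I have read your script at least 10 times and found no code that would substantially affect the results. Yet when the Seg and MAF files have few rows, the two tools diverge, and the divergence shows up both in the tumor purity and in the final RData files. |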
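Then let's not dwell on this for now. If using ABSOLUTE directly works better for you, just call it in a for loop; I wrote this package only to make things a bit more convenient. I will leave the code below as a reference for future readers and mention it in the README. If you later find the real cause, we can discuss it further.
rm(list=ls())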
gc()
library(DoAbsolute)
library(dplyr)
example_path = system.file("extdata", package = "DoAbsolute", mustWork = T)
library(data.table)
library(ABSOLUTE)
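## set workspace
outdir="./output/"
# `workdir` is never defined in this script, so setwd(workdir) would fail;
# create the output directory used below instead
dir.create(outdir, showWarnings = FALSE, recursive = TRUE)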
# segmentation file
seg_normal = file.path(example_path, "SNP6_blood_normal.seg.txt")
seg_solid = file.path(example_path, "SNP6_solid_tumor.seg.txt")
seg_metastatic = file.path(example_path, "SNP6_metastatic_tumor.seg.txt")
# MAF file
maf_solid = file.path(example_path, "solid_tumor.maf.txt")
maf_metastatic = file.path(example_path, "metastatic_tumor.maf.txt")
# read data
seg_normal = fread(seg_normal)
seg_solid = fread(seg_solid)
seg_metastatic = fread(seg_metastatic)
maf_solid = fread(maf_solid)
maf_metastatic = fread(maf_metastatic)
# merge data
Seg = Reduce(rbind, list(seg_normal, seg_solid, seg_metastatic))
Maf = Reduce(rbind, list(maf_solid, maf_metastatic))
Seg$Sample = substr(Seg$Sample, 1, 15)
Maf$Tumor_Sample_Barcode = substr(Maf$Tumor_Sample_Barcode, 1, 15)
test_Seg=Seg %>% subset(Sample %in% c("TCGA-DK-A1A6-01"))
test_Seg=test_Seg[1:42,]
table(test_Seg$Chromosome)
test_Maf=Maf %>% subset(Tumor_Sample_Barcode %in% c("TCGA-DK-A1A6-01"))
table(test_Maf$Chromosome)
test_Maf$Chromosome=ifelse(test_Maf$Chromosome=="X",23,test_Maf$Chromosome)
test_Maf=test_Maf[1:28,]
############################ DoAbsolute #################################
# test function
DoAbsolute(Seg = test_Seg, Maf = test_Maf,
           platform = "SNP_6.0",
           copy.num.type = "total",
           results.dir = "output_doabsolute",
           nThread = 1,
           sigma.p = 0,
           max.sigma.h = 0.2,
           min.ploidy = 0.5,
           max.ploidy = 10,
           keepAllResult = FALSE,
           primary.disease = "Tumor",
           clean.temp = FALSE,
           max.as.seg.count = 1500,
           max.non.clonal = 0.05,
           max.neg.genome = 0.005,
           min.mut.af = 0.1,
           min.no.mut = 0,
           verbose = TRUE)
############################ Absolute #################################
for (patient in c("TCGA-DK-A1A6-01")) {
  print(patient)
  test_seg = test_Seg %>% subset(Sample == patient)
  test_maf = test_Maf %>% subset(Tumor_Sample_Barcode == patient)
  write.table(test_seg, file = paste0(outdir, patient, "_test_output.seg"), sep = "\t", row.names = FALSE, quote = FALSE)
  write.table(test_maf, file = paste0(outdir, patient, "_test_output.maf"), sep = "\t", row.names = FALSE, quote = FALSE)
  RunAbsolute(seg.dat.fn = paste0(outdir, patient, "_test_output.seg"),
              maf.fn = paste0(outdir, patient, "_test_output.maf"),
              platform = "SNP_6.0",
              copy_num_type = "total",
              sigma.p = 0,
              results.dir = "output_absolute",
              sample.name = patient,
              primary.disease = "Tumor",
              max.sigma.h = 0.5,
              min.ploidy = 0.5,
              max.ploidy = 10,
              max.as.seg.count = 1600,
              max.neg.genome = 0.005,
              max.non.clonal = 0.05,
              verbose = TRUE,
              min.mut.af = 0.1)
}
######################################
## merge absolute data
######################################
ab_outdir="./output_absolute/"
# keep only files ending in ".RData" (the dot is escaped so it matches literally)
rdata_files=list.files(ab_outdir, pattern = "\\.RData$")
absolute.files=paste0(ab_outdir,rdata_files)
results.dir <- file.path(ab_outdir, "abs_summary")
CreateReviewObject("DRAWS_summary", absolute.files, results.dir, "total", verbose=TRUE)
calls.path = file.path(results.dir, "DRAWS_summary.PP-calls_tab.txt")
modes.path = file.path(results.dir, "DRAWS_summary.PP-modes.data.RData")
# load via modes.path so the file name matches the one used below
# (the original load() call spelled the extension ".Rdata")
load(file = modes.path)
output.path = paste0(outdir, "abs_extract")
ExtractReviewedResults(calls.path, "test", modes.path, output.path, "absolute", "total")
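For readers reusing this script: the DoAbsolute and RunAbsolute calls above are not given identical values for every shared argument. The sketch below lists the mismatches. The argument values are copied from the script; `diff_args` is a hypothetical helper, and arguments that only DoAbsolute accepts in this script (such as nThread, keepAllResult, clean.temp and min.no.mut) are left out. Whether these mismatches account for the discrepancy discussed in this thread is not established here.

```r
# Values copied from the DoAbsolute() call in the script above.
doabsolute_args <- list(
  platform = "SNP_6.0", copy.num.type = "total", sigma.p = 0,
  max.sigma.h = 0.2, min.ploidy = 0.5, max.ploidy = 10,
  primary.disease = "Tumor", max.as.seg.count = 1500,
  max.non.clonal = 0.05, max.neg.genome = 0.005, min.mut.af = 0.1
)

# Values copied from the RunAbsolute() call in the script above.
runabsolute_args <- list(
  platform = "SNP_6.0", copy_num_type = "total", sigma.p = 0,
  max.sigma.h = 0.5, min.ploidy = 0.5, max.ploidy = 10,
  primary.disease = "Tumor", max.as.seg.count = 1600,
  max.non.clonal = 0.05, max.neg.genome = 0.005, min.mut.af = 0.1
)

# The two calls spell this argument differently; align the name first.
names(doabsolute_args)[names(doabsolute_args) == "copy.num.type"] <- "copy_num_type"

# Hypothetical helper: names of shared arguments whose values differ.
diff_args <- function(a, b) {
  shared <- intersect(names(a), names(b))
  differs <- !unname(mapply(identical, a[shared], b[shared]))
  shared[differs]
}

mismatched <- diff_args(doabsolute_args, runabsolute_args)
print(mismatched)
```

Aligning max.sigma.h and max.as.seg.count between the two calls before comparing outputs removes one possible source of difference.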
|
OK, thank you, Teacher Wang 🙏
|
The question is below