Why am I getting "Total number of sites analyzed: 0"? #474

Open
jingliulj opened this issue Apr 6, 2022 · 8 comments · Fixed by #533

Comments

@jingliulj

jingliulj commented Apr 6, 2022

I'm running ANGSD from a Singularity image and also from conda (angsd=0.937). I don't know why I keep getting no sites with the following command. Can anyone help? Thanks!
singularity exec angsd_0.933.sif angsd -b bamlist.txt -doVcf 1 -GL 1 -doPost 1 -doMajorMinor 1 -doMaf 1 -SNP_pval 1e-4 -P 4 -doGeno 4 -minMapQ 20 -minQ 20 -minInd 2 -doDepth 1 -doCounts 1 -out out -r chr1:
-> angsd version: 0.933 (htslib: 1.11) build(May 12 2021 15:07:08)
-> SNP-filter using a pvalue: 1.000000e-04 correspond to 15.136705 likelihood units
[bammer_main] 47 samples in 47 input files
-> Parsing 47 number of samples
-> Region lookup 1/1

    -> Done reading data waiting for calculations to finish
    -> Done waiting for threads
    -> Output filenames:
            ->"out.arg"
            ->"out.mafs.gz"
            ->"out.geno.gz"
            ->"out.vcf.gz"
            ->"out.depthSample"
            ->"out.depthGlobal"
    -> Wed Apr  6 16:47:47 2022
    -> Arguments and parameters for all analysis are located in .arg file
    -> Total number of sites analyzed: 0
    -> Number of sites retained after filtering: 0 
    [ALL done] cpu-time used =  101.79 sec
    [ALL done] walltime used =  101.00 sec
@TeresaPegan

I think the first place to look is at the parameter -r chr1:. Try to run ANGSD without that parameter and see if it will run on all the sites. Perhaps that parameter is not being read properly and is not matching the chromosome names in your data. If there is something wrong there, ANGSD might discard all of your data because it can't match it to the chromosome name.
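
One way to test the chromosome-name idea is to list the reference names recorded in a BAM header and compare them with the name passed to -r. The following is a minimal sketch, not part of ANGSD: `sq_names` is a hypothetical helper, `sample.bam` is a placeholder file name, and the usage line assumes samtools is installed.

```shell
# Print the reference sequence names (SN fields) found in SAM header text on stdin.
# Hypothetical helper; feed it the output of `samtools view -H`.
sq_names() {
    awk -F'\t' '$1 == "@SQ" {
        for (i = 2; i <= NF; i++)
            if ($i ~ /^SN:/) print substr($i, 4)
    }'
}

# Usage (assumes samtools is on PATH; sample.bam stands for a file from bamlist.txt):
#   samtools view -H sample.bam | sq_names | grep -Fx chr1
```

If the final grep prints nothing, the header has no sequence named exactly chr1, which would support the name-mismatch explanation.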

@jingliulj
Author

> I think the first place to look is at the parameter -r chr1:. Try to run ANGSD without that parameter and see if it will run on all the sites. Perhaps that parameter is not being read properly and is not matching the chromosome names in your data. If there is something wrong there, ANGSD might discard all of your data because it can't match it to the chromosome name.

This is what I get after removing the -r parameter:

        -> angsd version: 0.933 (htslib: 1.11) build(May 12 2021 15:07:08)
        -> SNP-filter using a pvalue: 1.000000e-04 correspond to 15.136705 likelihood units
[bammer_main] 3 samples in 3 input files
        -> Parsing 3 number of samples
No data for chromoId=0 chromoname=chr1
This could either indicate that there really is no data for this chromosome
Or it could be problem with this program regSize=0 notDone=0

        -> Done reading data waiting for calculations to finish
        -> Done waiting for threads
        -> Output filenames:
                ->"out.arg"
                ->"out.mafs.gz"
                ->"out.geno.gz"
                ->"out.vcf.gz"
                ->"out.depthSample"
                ->"out.depthGlobal"
        -> Fri May 20 12:43:41 2022
        -> Arguments and parameters for all analysis are located in .arg file
        -> Total number of sites analyzed: 0
        -> Number of sites retained after filtering: 0
        [ALL done] cpu-time used =  167.86 sec
        [ALL done] walltime used =  169.00 sec

This is how my bam file looks:
V350031435L1C001R00500558704 417 chr1 3003560 60 52H47M = 3003682 217 CTGCAGTCTGCATGCTGATCTGCGCAGACTGTTCTCAGAGGGATCTG EFF=FFEEFEBFFFFFF@E8FF=FDEF7FFFDDCF+GFDFCF)F<FF SA:Z:chr4,148454751,+,58M41S,60,1; MC:Z:95M MD:Z:47 RG:Z:WT NM:i:0 AS:i:47 XS:i:21

The "chr1" chromosome is in the data, but ANGSD just didn't recognize it. Any idea why?
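
One thing that can be checked directly from the record above is its FLAG field (the second column, 417). The sketch below decodes a FLAG value into the bit names defined in the SAM specification. `decode_flag` is a hypothetical helper, and `samtools flags 417` should report the same bits. Whether these bits matter here is an assumption to verify: ANGSD's BAM options appear to default to `-remove_bads 1` and `-only_proper_pairs 1`, so the values recorded in out.arg are worth comparing against the decoded flag.

```shell
# Decode a SAM FLAG value into a comma-separated list of its set bits.
# Bit meanings follow the SAM specification; the short names are illustrative.
decode_flag() {
    flag=$1
    out=""
    for pair in 1:paired 2:proper_pair 4:unmapped 8:mate_unmapped \
                16:reverse 32:mate_reverse 64:read1 128:read2 \
                256:secondary 512:qc_fail 1024:duplicate 2048:supplementary; do
        bit=${pair%%:*}    # numeric value of the bit
        name=${pair#*:}    # label for the bit
        if [ $(( flag & bit )) -ne 0 ]; then
            out="${out:+$out,}$name"
        fi
    done
    printf '%s\n' "$out"
}

decode_flag 417    # prints: paired,mate_reverse,read2,secondary
```

For 417 this reports a paired, secondary alignment without the proper-pair bit set. If most reads in the region look like this one, a test run with those two filters relaxed might show whether they explain the empty output.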

@TeresaPegan

TeresaPegan commented May 20, 2022

I wonder if the reason it's having a problem with the "-r chr1:" part of your command is that it doesn't like the ":" without any sites. This might be why it doesn't retain any, because there is nothing after the colon. Per this page, you should be able to select a specific region like chr1 without specifying any sites and in that case you'd just leave out the colon. You could try:

singularity exec angsd_0.933.sif angsd -b bamlist.txt -doVcf 1 -GL 1 -doPost 1 -doMajorMinor 1 -doMaf 1 -SNP_pval 1e-4 -P 4 -doGeno 4 -minMapQ 20 -minQ 20 -minInd 2 -doDepth 1 -doCounts 1 -out out -r chr1

Another idea: you are asking ANGSD to filter by a SNP p-value of 1e-4, and you also have only 3 samples. Especially if your data are very low coverage, maybe ANGSD never assigns a SNP p-value that low to any locus, even to variant sites, so you get no sites returned. In my experience, the SNP p-value that ANGSD assigns to a site is related to its minor allele frequency. When I use a stringent cutoff like 1e-6, I end up with sites where the minor allele frequency is pretty high, because when an allele is apparently present only once or a few times, it's hard to be sure it isn't just read errors. So it could be that you're discarding all your sites with that filter too. You could try without that filter and see what happens.

Is there a reason you are using version 0.933? Version 0.937 has been working for me for the GL and doGeno functions, although I got loads of segfaults with some of the prior versions. Maybe try with 0.937?

Good luck! p.s. I am just a user, not a developer, so I might not be able to fully solve this, but hopefully it's helpful!
-Teresa

@ANGSD
Owner

ANGSD commented Jun 26, 2022

Dear all, sorry for the late reply. Could you maybe make a single SAM/BAM file containing only one read and see if that works?

Best
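
A one-read test file of the kind requested here could be produced roughly as follows. This is a sketch under assumptions: `first_read_sam` is a hypothetical helper, the file names are placeholders, and the usage line assumes samtools is installed.

```shell
# Keep every header line plus the first alignment record from SAM text on stdin.
# Header lines start with '@' and always precede the records, so awk can stop
# as soon as the first non-header line has been printed.
first_read_sam() {
    awk '/^@/ { print; next } { print; exit }'
}

# Usage (file names are placeholders):
#   samtools view -h sample.bam | first_read_sam | samtools view -b -o one_read.bam -
```

Running the original command with a bamlist that points only at such files keeps the test case small enough to attach to the issue.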

@ANGSD
Owner

ANGSD commented Jul 10, 2022

I assume this has been resolved or is not relevant anymore, so I am closing this issue. Feel free to reopen if needed.

@ANGSD ANGSD closed this as completed Jul 10, 2022
isinaltinkaya added a commit that referenced this issue Oct 11, 2022
Add function aio::doAssert to replace asserts
Did not use aio::assert as name  since aio.h
namespace complains due to assert being a macro
Fixes the major bug explained in #527
Fixes issues #520 #474 #466 #420 #405 #396 #385
Possibly others; other issues should rerun the
commands using the latest version.
@jingliulj
Author

> Dear all, sorry for late reply. Could you maybe make a single sam/bam file containing only one read and see if that works.
>
> Best

I tried the latest version of ANGSD (angsd version: 0.940-dirty (htslib: 1.16)) with one read, but the output is the same: nothing, "Total number of sites analyzed: 0". Attached are the 3 one-read example files I used with the command.
example.zip

@jingliulj
Author

> I assume this has been resolved or is not relevant anymore, so I am closing this issue. Feel free to reopen if needed.

Please reopen this issue. Thanks!

@StuntsPT

I am experiencing the same issue here (v0.940-dirty). Maybe this should be reopened?

@isinaltinkaya isinaltinkaya reopened this Jan 20, 2023