5964 Segmentation fault #26
Comments
Well that's not good. Could you give me some more information?
Hopefully this will help identify the problem |
Thanks for your reply. Here are the answers to your questions.
What Linux distro are you running? I was trying to run the tool in a cluster environment, which uses the LSF batch system. The Linux distro:
Has this version of rnaseqc worked on this computer before? Yes, on mouse samples with a mouse reference.
Could you try running with the flag -vv and posting the output up to and including the error?
|
Unfortunately, I was unable to reproduce the crash, so I need some more information.
|
1. Could you show the command line arguments you gave to rnaseqc when it crashed?
rnaseqc.v2.3.1.linux -vv /annotation/gencode.v28lift37.annotation.gtf /BAM_files/EOC04_L1Aligned.sortedByCoord.out.bam /alignment/RNA-SeqC/EOC04_L1Aligned.sortedByCoord.out.bam
2. If the program produced a core dump, would you mind uploading it?
The program produced the above error. I am attaching all the errors I have. No, there was no core dump produced after the run. |
Okay. I still can't reproduce the crash. Would it be possible for you to provide a copy of the input gtf and bam? If there are protected access concerns, it may be helpful to use a
|
Thanks for the reply. Sorry, but we cannot share the data with a third party. However, I have tried replacing the read sequences with garbage sequence, which also threw the following error,
|
That's fine. I kinda expected that would be the case. Without being able to touch the data, my only option is to ask you to debug this for me. If possible could you try the following:
|
Also, if you need to get rnaseqc to run in the interim, you could use the docker image |
Hello Aaron,
However, when I try with the first 1000 lines of the BAM file, it does not throw the error. I can attach the GTF file used for the run. You can find the lines of the GTF file where the execution halts. https://drive.google.com/file/d/1OTox95OYdtqJ-pRSocePwN4r6j25wZUf/view?usp=sharing |
It looks like this is a full annotation, which is not recommended for RNA-SeQC. Reported counts may be inaccurate unless you're using a "collapsed" gene model, which you can generate by using the GTEx Annotation Collapsing Script.
Looking at the ValidateSamFile output, there's nothing that jumps out as being an immediate culprit. It's odd that there are so many reads with missing mates, but I can't think of a reason this would crash RNA-SeQC. If you've had a chance to run
Also, if you're able to identify the exact read which is causing the crash, it would be helpful if you could share that read's metadata (tags, position, cigar, flags). If you don't have time to identify the culprit read, or can't share the metadata, that's fine. |
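One possible way to narrow down a culprit read, given that a short prefix of the BAM runs cleanly while the full file crashes, is a binary search over the prefix length. The sketch below is only an illustration: `first_crashing_prefix` and the `crashes` callable are hypothetical names, and in practice `crashes(n)` would have to write the first n reads to a temporary BAM, run rnaseqc on it, and report whether the run failed. It also assumes the crash is deterministic and that any prefix containing the bad read crashes, which crashes caused by memory corruption do not always guarantee.

```python
def first_crashing_prefix(total_reads, crashes):
    """Return the smallest n such that the first n reads trigger the crash.

    Assumes crashes(total_reads) is True, the empty prefix does not crash,
    and that once a prefix crashes every longer prefix crashes as well.
    """
    # Invariant: the first `low` reads run cleanly, the first `high` reads crash.
    low, high = 0, total_reads
    while high - low > 1:
        mid = (low + high) // 2
        if crashes(mid):
            high = mid
        else:
            low = mid
    return high  # 1-based position of the read that first triggers the crash


# Stand-in predicate for demonstration only: pretend read number 1234 is the bad one.
def fake_crashes(n):
    return n >= 1234


culprit = first_crashing_prefix(5000, fake_crashes)
print(culprit)
```

Each step halves the search range, so even a BAM with millions of reads needs only a few dozen test runs. If more than one read is involved, the search finds only the first read that triggers the crash.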
Hello Aaron, Thanks for the reply. I have found the culprit reads leading to the fault. I am sharing the BAM file with those 45 reads here: https://drive.google.com/file/d/1-_N-A9CVCIUqs7KzOwYpidvdYyq-13G7/view?usp=sharing In addition, I have run the valgrind tool on that BAM file using the following command. The GTF file I used here is the one I shared with you previously (https://drive.google.com/file/d/1OTox95OYdtqJ-pRSocePwN4r6j25wZUf/view?usp=sharing)
which threw the following error,
Please let me know if you need any further information. |
Thank you so much for this information! I was able to reproduce the crash on my CentOS 7 VM. I'll let you know as soon as I have any updates |
I believe I have identified the problem. It appears that there are two exons with the same id:
The second exon overwrites the length stored for the first one, which causes problems later on. RNA-SeQC relies on the lengths when allocating memory for certain data structures, so the memory allocated for the first exon is ~500 bytes shorter than it should be and memory is written out of bounds. The exact circumstances of the crash vary from environment to environment because the state and layout of the program memory is highly dependent on the environment, which causes this problem to manifest in different ways (or not at all) on different platforms. I am going to release a patch to RNA-SeQC which should both avoid the memory issue and print an error if duplicate gene/exon ids are detected. |
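The mechanism described above can be illustrated with a small stand-alone sketch. This is not RNA-SeQC's source code; the exon IDs, the lengths, and the names `length_by_id` and `coverage` are all made up for illustration, with the lengths chosen so that the shortfall is 500 as in the explanation.

```python
# Illustrative only: this is NOT RNA-SeQC's source code.
exons = [
    # (exon_id, length); IDs and lengths are invented for this example
    ("EXON_A", 600),
    ("EXON_B", 250),
    ("EXON_A", 100),  # second record reusing the same ID with a different length
]

# Lengths are stored per ID, so the later record silently replaces the earlier one.
length_by_id = {}
for exon_id, length in exons:
    length_by_id[exon_id] = length

# A per-exon buffer for the first exon is sized from the table, which now holds
# the second exon's length instead.
first_id, true_length = exons[0]
coverage = [0] * length_by_id[first_id]

shortfall = true_length - len(coverage)
print(f"{first_id}: needs {true_length} slots, allocated {len(coverage)}, short by {shortfall}")

try:
    coverage[true_length - 1] += 1  # touch the last position of the real first exon
except IndexError:
    # Python checks bounds and raises; an unchecked language such as C++ would
    # write past the end of the buffer instead.
    print("out-of-bounds access")
```

In a language without bounds checking, that stray write lands in whatever happens to sit next to the buffer, which is consistent with the crash appearing in some environments and not in others.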
The fixes have been incorporated into RNA-SeQC 2.3.2. If this does not fix your issue, please feel free to comment and reopen this issue. |
Hello Aaron, I have issues installing from the new release version of the tool. I
Then when I check the installation it says command not found. Do you have Thanks |
If you're building from source, you don't need to cd into the seqlib directory. The rnaseqc makefile is supposed to build seqlib for you. That said, the Linux static binary is available on the releases page (https://github.com/broadinstitute/rnaseqc/releases/tag/v2.3.2). I apologize if it wasn't there when you first looked; it can take ~30 minutes after a new release is published before the automated build system attaches the binaries. |
Hello Aaron,
Thank you so much for your reply. I now have the new release of the tool on
my machine and tried it on my samples using the same GTF file. The
following is the command I tried.
# LSBATCH: User input
/cluster/home/user/rnaseqc.v2.3.2.linux /gencode.v28lift37.annotation.gtf
/RCC16_L1Aligned.sortedByCoord.out.bam /RCC16_L1
The above run, however, is throwing an error for the duplicated exon,
Failed to parse the GTF: Detected non-unique Exon ID: ENSE00001957285.1_1
--
Kind Regards,
Dr.Alva Rani James
|
That is correct. Ensembl IDs are always unique, and RNA-SeQC relies on this fact. Your GTF contains two exons (shown above) with the same exon ID. You must change them to have correct IDs. Additionally, you're using a full gencode annotation, which will yield incorrect results. You should use the GTEx Collapse Annotation Script (https://github.com/broadinstitute/gtex-pipeline/tree/master/gene_model) to convert your full annotation to a collapsed model which will work for RNA-SeQC. |
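To locate every affected exon before editing the GTF, a short script along these lines may help. It is a minimal sketch, not the check RNA-SeQC itself performs: the function name `find_duplicate_exon_ids` is hypothetical, and it assumes standard nine-column, tab-separated GTF lines with an `exon_id "..."` attribute. It reports only IDs that appear at more than one distinct location, on the assumption that the same exon repeated at identical coordinates is not the problem; RNA-SeQC's own rule may be stricter.

```python
import re
from collections import defaultdict

EXON_ID = re.compile(r'exon_id "([^"]+)"')


def find_duplicate_exon_ids(lines):
    """Map each exon_id found at more than one distinct location to those locations."""
    seen = defaultdict(set)
    for line in lines:
        if line.startswith("#"):
            continue  # header or comment line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue  # only exon features carry the IDs of interest
        match = EXON_ID.search(fields[8])
        if match:
            location = (fields[0], int(fields[3]), int(fields[4]), fields[6])
            seen[match.group(1)].add(location)
    return {eid: sorted(locs) for eid, locs in seen.items() if len(locs) > 1}


# Made-up records for demonstration; coordinates and IDs are not from a real annotation.
sample = [
    'chr1\tHAVANA\texon\t100\t700\t.\t+\t.\tgene_id "G1"; exon_id "E1";',
    'chr1\tHAVANA\texon\t100\t700\t.\t+\t.\tgene_id "G1"; exon_id "E1";',
    'chr1\tHAVANA\texon\t900\t1000\t.\t+\t.\tgene_id "G1"; exon_id "E1";',
    'chr1\tHAVANA\texon\t2000\t2100\t.\t+\t.\tgene_id "G2"; exon_id "E2";',
]
duplicates = find_duplicate_exon_ids(sample)
for exon_id, locations in duplicates.items():
    print(exon_id, locations)
```

To scan a real annotation, pass an open file handle instead of the sample list, since the function only needs an iterable of lines.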
Hello All,
Is it possible to install RNA-SeQC via a conda environment?
--
Kind Regards,
Dr.Alva Rani James
|
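No, but at some point it may become available (#19). In the meantime, you can download the static binaries (https://github.com/broadinstitute/rnaseqc/releases), use the docker image gcr.io/broad-cga-aarong-gtex/rnaseqc:latest, or build from source. |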
I am currently integrating some tools and building a Snakemake pipeline. I
have a list of tools and packages that I mention in my environment file. So if
RNA-SeQC could be downloaded via conda or pip, or via any dependencies, I could
mention that in my environment.yml file.
--
Kind Regards,
Dr.Alva Rani James
|
Dear All,
I am using the RNA-SeQC binary rnaseqc.v2.3.1.linux from GitHub Releases. When I try to run the tool on human samples, it throws the following error,
The total memory I requested from the cluster to run the tool was 80000.00 MB. Any help/suggestions would be much appreciated. I have successfully used the same tool on the mouse samples.