cds glyph producing incorrect translations #51

Open
scottcain opened this issue Mar 13, 2015 · 1 comment
@scottcain

In several GBrowse instances around the net (WormBase, Arabidopsis and Paramecium, for example), the cds glyph is producing incorrect translations for the second and subsequent exons in the display. That is, whichever exon appears first in the details panel, in order of translation, is translated correctly (it doesn't have to be the first exon in the transcript), but every exon after it is translated incorrectly. See these examples from WormBase:

[Screenshot 1: screen shot 2015-03-12 at 12 58 17 pm]
[Screenshot 2: screen shot 2015-03-12 at 12 55 57 pm]

Note that the incorrect translations that appear seem to come from other portions of the same transcript translated in an incorrect frame. (I determined this by transcribing the AA sequence and then doing a tblastn against the C. elegans genome.) So it seems that something is going wrong with coordinate mapping. Also note that in the second image, the translations in the second exons are exactly the same even though they are splicing to different places in the genome.
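To illustrate what "translated in an incorrect frame" means for a second exon, here is a minimal pure-Perl sketch. It is not a diagnosis of the cds glyph bug: the two exon sequences are made up, and `translate` and `$phase2` are hypothetical names used only for this example. It shows that when a codon is split across an exon boundary, the second exon only yields the right amino acids if translation starts at the correct offset (its phase).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Standard genetic code, built in TCAG order (first base varies slowest).
my @bases = qw(T C A G);
my $aas   = 'FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG';
my %codon_table;
my $i = 0;
for my $b1 (@bases) {
    for my $b2 (@bases) {
        for my $b3 (@bases) {
            $codon_table{"$b1$b2$b3"} = substr($aas, $i++, 1);
        }
    }
}

# Translate $dna starting at the 0-based $offset.
# An incomplete trailing codon is dropped; unknown codons become 'X'.
sub translate {
    my ($dna, $offset) = @_;
    $offset //= 0;
    my $protein = '';
    for (my $pos = $offset; $pos + 3 <= length($dna); $pos += 3) {
        my $codon = uc(substr($dna, $pos, 3));
        $protein .= $codon_table{$codon} // 'X';
    }
    return $protein;
}

# Hypothetical two-exon CDS; the codon GAA is split across the boundary.
my $exon1 = 'ATGGCTGA';      # ATG GCT GA
my $exon2 = 'ATTCAAGTAA';    #          A TTC AAG TAA

# Bases to skip at the start of exon 2 to reach the next complete codon.
my $phase2 = (3 - length($exon1) % 3) % 3;

print "spliced CDS:      ", translate($exon1 . $exon2), "\n";   # MAEFK*
print "exon 2, in phase: ", translate($exon2, $phase2), "\n";   # FK*
print "exon 2, offset 0: ", translate($exon2, 0), "\n";         # IQV
```

The last line is the kind of result a display would show if it translated each exon from its own first base, or from a wrongly mapped coordinate, instead of carrying the frame over from the preceding CDS segment.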

I tried to test the underlying BioPerl functionality by writing an "in silico" splicing and translation script, where the script pulled mRNA features from a SeqFeature::Store database, put the CDS pieces together and translated the result, but that worked fine.

Live examples of where this is happening elsewhere:

Paramecium:
http://paramecium.cgm.cnrs-gif.fr/gb2/gbrowse/ptetraurelia_mac/?start=11267;stop=11367;ref=scaffold_1;width=750;version=100;flip=0;grid=1;id=b9f0de601f367c68c264ebe77c96bae4;l=Translation%1ECDS%1EGenes

Arabidopsis:
http://tairvm17.tacc.utexas.edu/cgi-bin/gb2/gbrowse/arabidopsis/?start=1507795;stop=1507994;ref=Chr1;width=800;version=100;flip=0;grid=1;id=0a5edc2236a0c2faa054303a7a2510af;l=FrameTranslation%1ECDS%1EProteinCoding

@scottcain

Here's the simple test script I used:

#!/usr/bin/perl
use strict;
use warnings;
use Bio::DB::SeqFeature::Store;

# Connect to the SeqFeature::Store database
my $db = Bio::DB::SeqFeature::Store->new(
    -adaptor => 'DBI::mysql',
    -dsn     => 'dbi:mysql:c_elegans_PRJNA13758_WS247',
    -user    => 'wormbase',
    -pass    => '*********',
) or die "Could not open the database";

# Fetch the mRNA feature and its exon subfeatures
my ($f) = $db->features( -type => 'mRNA', -name => 'Y16B4A.2' );
die "mRNA feature not found" unless $f;
my @Exons  = $f->get_SeqFeatures('exon');
my $strand = $f->strand();

# just to make sure--I think they probably come from the database sorted
my @sortedexons;
if ($strand < 0) {
    # minus strand: transcript order runs from highest to lowest start
    @sortedexons = sort { $b->start <=> $a->start } @Exons;
}
else {
    @sortedexons = sort { $a->start <=> $b->start } @Exons;
}

# Concatenate the exon sequences in transcript order
my $mRNA = '';
for my $exon (@sortedexons) {
    my $seq = $exon->seq->seq;
    $mRNA .= $seq;
}

print ">mRNA\n";
print "$mRNA\n";
print ">primary_transcript\n";
print $f->seq->seq, "\n";
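
The same splicing idea can be sketched without a database, which may help when checking coordinate mapping by hand. This is only an illustration on plain strings: the toy genome, the segment coordinates, and the helpers `revcomp` and `splice_segments` are all hypothetical and are not part of BioPerl or of the script above. It joins 1-based, inclusive segments in genomic order and reverse-complements the result for minus-strand features, so the same gene placed on either strand should give the same spliced sequence.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Reverse-complement a DNA string, preserving case.
sub revcomp {
    my ($seq) = @_;
    my $rc = scalar reverse($seq);
    $rc =~ tr/ACGTacgt/TGCAtgca/;
    return $rc;
}

# Join segments ([start, end], 1-based inclusive) into one spliced sequence.
# Segments are taken in ascending genomic order; for a minus-strand feature
# the joined plus-strand sequence is reverse-complemented at the end.
sub splice_segments {
    my ($genome, $strand, @segments) = @_;
    my @sorted = sort { $a->[0] <=> $b->[0] } @segments;
    my $joined = join '',
        map { substr($genome, $_->[0] - 1, $_->[1] - $_->[0] + 1) } @sorted;
    return $strand < 0 ? revcomp($joined) : $joined;
}

# Toy plus-strand gene: two segments (uppercase) around a 5 bp intron.
my $genome = 'cc' . 'ATGGCTGA' . 'gtcag' . 'ATTCAAGTAA' . 'cc';
my $plus   = splice_segments($genome, +1, [3, 10], [16, 25]);

# The same gene on the minus strand of the reverse-complemented genome.
# A position p on the original maps to (length + 1 - p) = 28 - p.
my $minus_genome = revcomp($genome);
my $minus = splice_segments($minus_genome, -1, [18, 25], [3, 12]);

print "plus:  $plus\n";
print "minus: $minus\n";
print $plus eq $minus ? "match\n" : "MISMATCH\n";
```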
