Error: exon_lines[row.id]["tid"] = row.transcript KeyError: None #196

Closed
alelim-bio opened this issue Jul 18, 2019 · 10 comments

@alelim-bio

Hello Mikado,

I have been working through your pipeline and have run into an error. I noticed it has been reported previously, but I have been unable to solve the problem. I was wondering if I could get your help?

Some additional information:

  • Using the recent 2.0 version of Mikado in order to run the pipeline.
  • CentOS Linux 7
  • Running a conda environment
  • Python version 3.6.8
  • Mikado test passed.

This is an excerpt of what seems to be the main error:

File "/pylon5/mc5fr6p/alelim/CONDA/anaconda3/envs/bio/lib/python3.6/site-packages/Mikado/preparation/annotation_parser.py", line 548, in load_from_gtf
    exon_lines[row.id]["tid"] = row.transcript
KeyError: None
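
The traceback shows a dictionary lookup with the key `None`, which suggests that `row.id` was `None` for at least one parsed line. Below is a minimal sketch of how that can happen, assuming `exon_lines` behaves like a plain dictionary keyed by transcript ID. The `Row` class and the `register` function are hypothetical stand-ins for illustration, not Mikado's actual code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Row:
    """Hypothetical stand-in for a parsed GTF line (not Mikado's class)."""
    id: Optional[str]
    transcript: Optional[str]


def register(exon_lines: dict, row: Row) -> None:
    # Mirrors the failing statement from the traceback.
    exon_lines[row.id]["tid"] = row.transcript


exon_lines = {"tx1": {}}
register(exon_lines, Row(id="tx1", transcript="tx1"))  # known ID: works

try:
    # A line without a usable ID yields row.id == None.
    register(exon_lines, Row(id=None, transcript=None))
    error = None
except KeyError as exc:
    # The key None was never added to exon_lines, hence "KeyError: None".
    error = exc
```

If this reading is right, the input contains at least one line from which no transcript ID could be extracted.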

I have attached the .log file and the toy samples I have been using. If there is anything else I can provide, please let me know.

Kind Regards,

Alex

prepare.log
St.toy_sample.zip

@lucventurini
Collaborator

Dear Alex,
thank you for your bug report. I will try to solve it as quickly as I can. It looks like a parsing problem, so hopefully it will not take too long to fix.

Kind regards,

Luca

@lucventurini lucventurini self-assigned this Jul 19, 2019
@lucventurini lucventurini added this to the 2.0 milestone Jul 19, 2019
@lucventurini
Collaborator

Dear @AsclepiusDoc,
unfortunately I could not reproduce the bug with the toy data (as a reference genome, I used chromosome 1 of G. raimondii, having inferred from the log that you were analysing cotton - please let me know whether this is incorrect). However, I made a slight modification to the offending section of the code which could provide a fix.

May I also ask you to please check whether any line in the GTF files is missing a "transcript_id"? I think that Mikado might be crashing because of that.
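
One way to run this check is sketched below. It assumes standard nine-column, tab-separated GTF lines with attributes written as `transcript_id "value";`. The function name and the example lines are made up for illustration and are not part of Mikado.

```python
import re
from typing import Iterable, Iterator, Tuple

# Matches the attribute `transcript_id "some_value"` in column 9 of a GTF line.
TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]+)"')


def lines_missing_transcript_id(lines: Iterable[str]) -> Iterator[Tuple[int, str]]:
    """Yield (line_number, line) for feature lines without a transcript_id."""
    for number, line in enumerate(lines, start=1):
        stripped = line.rstrip("\n")
        # Skip blank lines and "#" comment/header lines.
        if not stripped or stripped.startswith("#"):
            continue
        fields = stripped.split("\t")
        # A GTF feature line has nine tab-separated columns.
        if len(fields) != 9 or TRANSCRIPT_ID.search(fields[8]) is None:
            yield number, stripped


# Made-up example input: one header, one complete line, one incomplete line.
example = [
    "# example header\n",
    'chr1\tsrc\ttranscript\t1\t100\t.\t+\t.\tgene_id "g1"; transcript_id "g1.1";\n',
    'chr1\tsrc\texon\t1\t100\t.\t+\t.\tgene_id "g1";\n',
]
missing = list(lines_missing_transcript_id(example))
```

To scan a real file, pass an open file handle instead of the list. Note that some GTF flavours include `gene` features that legitimately carry no `transcript_id`; this sketch would report those as well.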

Kind regards

@alelim-bio
Author

Hello @lucventurini ,

Thank you for your reply. You are correct that we are working with cotton.

Also, I have looked through the GTF files and didn't seem to find any lines that were missing a transcript_id except for the two header lines in the stringtie file. Would it be crashing because of these headers?

Kind Regards,

Alex

@lucventurini
Collaborator

Dear @AsclepiusDoc ,
the header should not pose a problem, but would you be able to send it here so that I can check?
Mikado should ignore lines that start with "#". If it does not, that is definitely the bug to be solved.
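
A quick way to inspect the header before sending it is sketched below. It collects every line that precedes the first tab-delimited record and flags any that do not begin exactly with "#", for example because of leading whitespace. The function name and sample lines are illustrative assumptions, and whether such lines are the cause here is not established.

```python
from typing import Iterable, List


def leading_header_lines(lines: Iterable[str]) -> List[str]:
    """Return every line that precedes the first tab-delimited record."""
    header = []
    for line in lines:
        # Stop at the first line that looks like a feature record.
        if "\t" in line and not line.lstrip().startswith("#"):
            break
        header.append(line)
    return header


# Made-up sample: the second header line starts with a space, not "#".
sample = [
    "# example header line 1\n",
    " # header with a leading space\n",
    'chr1\tsrc\texon\t1\t100\t.\t+\t.\ttranscript_id "t1";\n',
]
header = leading_header_lines(sample)
# Lines a strict startswith("#") test would NOT treat as comments.
suspicious = [line for line in header if not line.startswith("#")]
```

Printing `repr(line)` for each entry in `suspicious` makes hidden characters such as leading spaces visible.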

lucventurini added a commit that referenced this issue Jul 23, 2019
@lucventurini
Collaborator

Dear @AsclepiusDoc , any news? Have you managed to try the amended version? If you could send me another snippet of the GTF, I can have a go.

@alelim-bio
Author

Hello @lucventurini ,

Apologies for the late reply. To update you: I have tried using the header, and I now get a new error with the amended version. I have attached the prepare.log.

Additionally, I have attached another set of GTF files.

Kind Regards,

Alex

prepare.log
St.toy_sample (2).zip

@lucventurini
Collaborator

Dear @AsclepiusDoc ,
many thanks for the updated files. I am now able to reproduce the bug on my workstation, so I should be able to track the problem down and resolve it soon.

Thank you for your patience and collaboration.

Kind regards

@lucventurini
Collaborator

Dear @AsclepiusDoc , it should now be fixed in 3bcae56 (see previous commit, ffc6ec3, under Mikado/preparation/annotation_parser.py, for the actual bug fix). If you could pull from the branch and test on your data, we can close the issue and merge back into master.

Thank you again for reporting this bug, this was quite nasty! If you had not reported it, this would have required a hasty patch right after releasing the next version!

@alelim-bio
Author

Hello @lucventurini ,

I have updated my branch and tested with my full dataset, and it has gone through with no errors! I will continue running the pipeline to see if there are any other issues. Thank you so much for all your help!

Kind Regards,

Alex

@lucventurini
Collaborator

Dear @AsclepiusDoc ,
excellent news! I will merge the fix back into master and update to v2.0rc2.

Thank you again for reporting the bug!
