Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Updated the max isize to be Integer.MAX_VALUE. #1410

Merged
merged 1 commit into from
Aug 19, 2019

Conversation

tfenne
Copy link
Member

@tfenne tfenne commented Aug 19, 2019

Description

I'm processing some data with a reference sequence that has long chromosomes, over the BAI limit. I ran into this issue where one (likely chimeric) read-pair has an insert size of over 536,870,912 (aka 1<<29), which causes HTSJDK to throw validation errors.

I looked at the spec and can find nothing supporting a limit on insert size this low. The SAM section defines it as between [−2^31 + 1, 2^31 − 1], and the BAM section dedicates a full int32 t for tlen.

@tfenne tfenne requested review from yfarjoun and lbergelson August 19, 2019 19:58
@tfenne tfenne self-assigned this Aug 19, 2019
@codecov-io
Copy link

codecov-io commented Aug 19, 2019

Codecov Report

Merging #1410 into master will increase coverage by 0.006%.
The diff coverage is n/a.

@@               Coverage Diff               @@
##              master     #1410       +/-   ##
===============================================
+ Coverage     68.102%   68.108%   +0.006%     
- Complexity      8373      8374        +1     
===============================================
  Files            573       573               
  Lines          33965     33965               
  Branches        5668      5668               
===============================================
+ Hits           23131     23133        +2     
+ Misses          8644      8643        -1     
+ Partials        2190      2189        -1
Impacted Files Coverage Δ Complexity Δ
src/main/java/htsjdk/samtools/SAMRecord.java 68.323% <ø> (ø) 297 <0> (ø) ⬇️
...htsjdk/samtools/util/nio/DeleteOnExitPathHook.java 90.476% <0%> (+9.524%) 4% <0%> (+1%) ⬆️

@lbergelson lbergelson added long reference Bugs and issues related to long reference sequences. bug labels Aug 19, 2019
@lbergelson
Copy link
Member

Seems likely to be an artifact of the bai restrictions, but the history of its creation is lost to git history.

@lbergelson
Copy link
Member

Lets see if @yfarjoun can remember any objections too this. I can't think of any though.

@jmarshall
Copy link
Member

Historical specification note: this field was initially neglected and only lifted from 229 to 231 a couple of years after most of the other fields (see samtools/hts-specs@6f8dfe4). This may go some way to explaining the still-present HTSJDK limit on this field — but it was still lifted almost six years ago!

@lbergelson
Copy link
Member

lbergelson commented Aug 19, 2019

@jmarshall Thank you. Sometimes you just have to give things time, you don't want to rush anything.

@tfenne
Copy link
Member Author

tfenne commented Aug 19, 2019

Thanks @jmarshall - I figured it was something like that and I appreciate the archeological help!

@jmarshall
Copy link
Member

Tragically I still remember making that change 😄

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug long reference Bugs and issues related to long reference sequences.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants