-
Notifications
You must be signed in to change notification settings - Fork 245
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Updated the max isize to be Integer.MAX_VALUE. #1410
Conversation
Codecov Report
@@ Coverage Diff @@
## master #1410 +/- ##
===============================================
+ Coverage 68.102% 68.108% +0.006%
- Complexity 8373 8374 +1
===============================================
Files 573 573
Lines 33965 33965
Branches 5668 5668
===============================================
+ Hits 23131 23133 +2
+ Misses 8644 8643 -1
+ Partials 2190 2189 -1
|
Seems likely to be an artifact of the bai restrictions, but the history of its creation is lost to git history. |
Lets see if @yfarjoun can remember any objections too this. I can't think of any though. |
Historical specification note: this field was initially neglected and only lifted from 229 to 231 a couple of years after most of the other fields (see samtools/hts-specs@6f8dfe4). This may go some way to explaining the still-present HTSJDK limit on this field — but it was still lifted almost six years ago! |
@jmarshall Thank you. Sometimes you just have to give things time, you don't want to rush anything. |
Thanks @jmarshall - I figured it was something like that and I appreciate the archeological help! |
Tragically I still remember making that change 😄 |
Description
I'm processing some data with a reference sequence that has long chromosomes, over the BAI limit. I ran into this issue where one (likely chimeric) read-pair has an insert size of over 536,870,912 (aka
1<<29
), which causes HTSJDK to throw validation errors.I looked at the spec and can find nothing supporting a limit on insert size this low. The SAM section defines it as between
[−2^31 + 1, 2^31 − 1]
, and the BAM section dedicates a fullint32 t
for tlen.