-
Notifications
You must be signed in to change notification settings - Fork 245
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Misc CRAM cleanup #1253
Misc CRAM cleanup #1253
Conversation
Removed lots of Review with https://github.com/samtools/htsjdk/pull/1253/files?w=1 to ignore whitespace changes. |
@@ -80,11 +85,11 @@ public Boundary(final long start, final long end) { | |||
if (start >= end) throw new RuntimeException("Boundary start is greater than end."); | |||
} | |||
|
|||
boolean hasNext() throws IOException { | |||
private boolean hasNext() throws IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make these throwing methods explicitly private
{ | ||
def.setLevel(GZIP_COMPRESSION_LEVEL); | ||
} | ||
}; | ||
IOUtil.copyStream(new ByteArrayInputStream(data), gos); | ||
gos.close(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
now auto-closes
Codecov Report
@@ Coverage Diff @@
## master #1253 +/- ##
===============================================
+ Coverage 69.356% 69.893% +0.537%
- Complexity 8164 8628 +464
===============================================
Files 548 555 +7
Lines 32675 34440 +1765
Branches 5520 5980 +460
===============================================
+ Hits 22662 24071 +1409
- Misses 7795 8035 +240
- Partials 2218 2334 +116
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jmthibault79 Some comments. Looks good otherwise. I don't know if @cmnbroad has comments as well.
@@ -235,15 +242,15 @@ public static CramHeader readCramHeader(final InputStream inputStream) throws IO | |||
final ByteArrayOutputStream headerOS = new ByteArrayOutputStream(); | |||
try { | |||
headerOS.write(bytes); | |||
headerOS.write(headerBodyOS.getBuffer(), 0, headerBodyOS.size()); | |||
headerOS.write(headerBodyOS.toByteArray(), 0, headerBodyOS.size()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This introduces an additional redundant array copy, but it's the header so it's probably not a bottleneck at all. If it is a bottleneck then headerOs should probably be an exposed one as well...
@@ -28,15 +28,17 @@ | |||
* @param data byte array to compress | |||
* @return compressed blob | |||
*/ | |||
public static byte[] gzip(final byte[] data) throws IOException { | |||
public static byte[] gzip(final byte[] data) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All these operations that copy between byteArrayOutputStreams are doing a lot of unnecessary array copies which might be worth looking into as performance problems in future profiling.
src/main/java/htsjdk/samtools/cram/build/CramSpanContainerIterator.java
Outdated
Show resolved
Hide resolved
|
||
import java.io.ByteArrayOutputStream; | ||
|
||
public class ExposedByteArrayOutputStream extends ByteArrayOutputStream { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this class does server the purpose of reducing unnecessary array copies. I don't know if any of the places you removed if it from are bottlenecks, but if they are I would be sure that changing this isn't impacting performance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK - I've been removing it in other PRs over the last few months too, so we should also check those.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jmthibault79 One additional comment, you can choose what to do with it... Looks good to me. 👍
src/main/java/htsjdk/samtools/cram/build/CramSpanContainerIterator.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only one or two minor comments (in addition to sharing Louis' concern around perf when removing ExposedByteArray). Otherwise this looks good - I've wanted to do this for a long time.
- made GaveUpException a proper class - caller to findBasesByMD5 now catches IOException instead of Exception
- cleaner exception handling
b24ac12
to
62fa350
Compare
rebased to resolve conflict |
Checklist