Fix the multi-threaded CRAM multi-region iterator regression.
Fixes #1061

This appears to have been introduced during bfc9f0d and fixed for BAM only in 6149ea6. The effect on multi-threaded decoding for CRAM was significant.

This fix takes a totally different approach, which not only fixes the regression but improves on the pre-regression performance.

CRAM can do CRAM_OPT_RANGE queries, used by the single-region version of the iterator, which inform the decoder of both the start and end range positions. When threading, this means we don't preemptively decode the next container unless it'll actually be used. This same logic is now used in the multi-region iterator, although it's complex. The general strategy is as follows:

- Ensure we know the next container start from the index. This needs a small tweak to the cram_index struct, as the next container isn't quite the same as this container + slice offset + slice size (sadly). I think it's missing the size of the container struct itself.

- When given a start..end offset, step into the reg_list to find the corresponding chr:start-end range.

- Produce a new CRAM_OPT_RANGE_NOSEEK option that does the same job as the old CRAM_OPT_RANGE option but without the seek. This is necessary because hts.c does the seek for us, and pseek/ptell only work with each other and not within the cram subdir code itself.

- Identify neighbouring file offsets and merge together their corresponding ranges, so a block of adjacent offsets becomes a single CRAM_OPT_RANGE query. We cache the end offset (.v) in iter->end so we can avoid duplicating the seek / range request in subsequent intervals.

- Manage the EOF vs EOR (end of range) return values. For EOR we have to do the incrementing ourselves, as we need to restart the loop without triggering the end-of-file or end-of-multi-iterator logic below it.

- Tweak the region sorting a bit so ties are resolved by the max value. We want the index into reg_list to also be sorted, so we can accurately step through them. Note this logic is CRAM specific, as the sorting wouldn't work on BAI anyway due to the R-tree.
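The offset-merging step above can be sketched in isolation. This is a minimal illustration under stated assumptions, not the code from this commit: the types off_hit_t and off_block_t and the function merge_hits are hypothetical names, and the tie-break on the region index is a simplification of the "ties resolved by the max value" rule.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for one index hit: the file offsets [u, v] spanned
 * by a container and the position of the matching entry in a region list.
 * These names are illustrative, not the htslib structures. */
typedef struct {
    uint64_t u;   /* start file offset */
    uint64_t v;   /* end file offset, i.e. where the next container starts */
    int reg;      /* index into the (sorted) region list */
} off_hit_t;

/* One merged block: a single seek plus a single range query would cover it. */
typedef struct {
    uint64_t u, v;
    int first_reg, last_reg;
} off_block_t;

/* Order by start offset; break ties on the region index so that the
 * region indices come out sorted as well (an assumed simplification). */
static int cmp_hit(const void *a, const void *b) {
    const off_hit_t *x = a, *y = b;
    if (x->u != y->u) return x->u < y->u ? -1 : 1;
    return (x->reg > y->reg) - (x->reg < y->reg);
}

/* Sort the hits, then fold every hit that starts at or before the current
 * block's end into that block. `out` must have room for n entries.
 * Returns the number of merged blocks. */
static size_t merge_hits(off_hit_t *hits, size_t n, off_block_t *out) {
    if (n == 0) return 0;
    assert(hits != NULL && out != NULL);
    qsort(hits, n, sizeof(*hits), cmp_hit);

    size_t m = 0;
    out[0] = (off_block_t){ hits[0].u, hits[0].v, hits[0].reg, hits[0].reg };
    for (size_t i = 1; i < n; i++) {
        if (hits[i].u <= out[m].v) {
            /* Adjacent or overlapping: extend the current block. */
            if (hits[i].v > out[m].v) out[m].v = hits[i].v;
            if (hits[i].reg < out[m].first_reg) out[m].first_reg = hits[i].reg;
            if (hits[i].reg > out[m].last_reg) out[m].last_reg = hits[i].reg;
        } else {
            /* Gap in the file: start a new block. */
            m++;
            out[m] = (off_block_t){ hits[i].u, hits[i].v, hits[i].reg, hits[i].reg };
        }
    }
    return m + 1;
}
```

Each resulting block carries one start offset to seek to and one end offset, which plays the role of the cached .v value: later intervals that fall inside the same block need no further seek or range request.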
Some benchmarks on ~37,000 regions returning ~110 million seqs across NA06985.final.cram:

    CRAM-1.10 no threads:   real 4m37.361s   user 4m21.128s   sys 0m7.988s
    CRAM-1.10 -@16:         real 3m14.371s   user 28m48.872s  sys 4m52.442s
    CRAM-dev -@16:          real 5m55.670s   user 61m0.005s   sys 11m23.147s
    CRAM-current -@16:      real 1m55.701s   user 5m54.234s   sys 0m51.699s

The increase in user time between unthreaded and 16 threads is modest. System time increase is considerable, but this appears to primarily be down to glibc malloc inefficiencies. Using libtcmalloc.so instead gives:

    1M CRAM-current -@16:   real 1m30.882s   user 6m9.103s    sys 0m4.960s

We're still nowhere near using 16 threads, but this is because the average size of each region is significantly smaller than 16 containers' worth.
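One way to quantify "nowhere near using 16 threads" is to divide total CPU time (user + sys) by wall-clock time, which approximates the average number of busy cores. The helpers below are a hypothetical sketch for that arithmetic only; the names to_seconds and busy_cores are not part of htslib.

```c
#include <assert.h>

/* Convert a `time`-style "XmY.ZZZs" reading to seconds. */
static double to_seconds(int minutes, double seconds) {
    assert(minutes >= 0 && seconds >= 0.0);
    return minutes * 60.0 + seconds;
}

/* Average number of busy cores: total CPU time divided by wall-clock time. */
static double busy_cores(double real_s, double user_s, double sys_s) {
    assert(real_s > 0.0);
    return (user_s + sys_s) / real_s;
}
```

Fed with the CRAM-current -@16 figures, this gives roughly 3.5 busy cores with glibc malloc and roughly 4.1 with libtcmalloc.so, well short of the 16 threads requested.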
1 parent 9de45b7, commit d314715
Showing 5 changed files with 211 additions and 14 deletions.