Extra low-level CRAM manipulation functions. #1771
Merged
Provide extra CRAM container manipulations and index queries.
Added to support extra functionality in samtools cat.
Some internal cram functions are no longer static as they're called
from cram_external.c, but they don't have HTSLIB_EXPORT and aren't
an official part of the API.
These are cram_to_bam and cram_next_slice.
New public CRAM APIs:
These facilitate manipulation at the container level: seeking to specific byte offsets, and also specifying containers as the n^th container listed in the index.
cram_container_get_coords returns refid, start and span fields from the opaque cram_container struct.
cram_filter_container copies a container but applies region-based filtering, as already specified in the cram_fd with a range request. (Note we currently also provide cram_copy_slice, but may want to add a cram_copy_container for consistency.)
cram_index_extents queries an index to return byte offsets of the first and last container overlapping a specified region.
cram_num_containers_between queries an index to report the number of indexed containers and their container numbers (starting at 0 for the first) covering a range.
cram_num_containers is a simplified cram_num_containers_between doing only the counting operation and on the entire file.
cram_container_num2offset returns the byte offset for the n^th container. cram_container_offset2num does the reverse.
There is also a new cram_skip_container function, which is currently internal only but may be useful externally in the future. It's used by cram_filter_container when it detects it'll filter out everything.
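To illustrate the container-number and byte-offset queries described above, here is a minimal self-contained sketch in C. It does not call htslib: the toy_container struct, the toy_index table and every toy_* function are hypothetical stand-ins that only model the idea of an index listing one entry per container, and the real function signatures may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy index entry, one per container, in file order.
 * This is an illustration only, not htslib's internal index layout. */
typedef struct {
    int     refid;       /* reference id of the container */
    int64_t start, end;  /* inclusive coordinate range covered */
    int64_t offset;      /* byte offset of the container in the file */
} toy_container;

static const toy_container toy_index[] = {
    {0,    1, 1000,  100},
    {0, 1001, 2000,  900},
    {0, 2001, 3000, 1700},
    {1,    1,  500, 2500},
};
static const int64_t toy_n = sizeof(toy_index) / sizeof(toy_index[0]);

/* Total number of indexed containers (the "whole file" count). */
static int64_t toy_num_containers(void) {
    return toy_n;
}

/* n-th container (0-based) -> byte offset, or -1 if n is out of range. */
static int64_t toy_num2offset(int64_t n) {
    return (n >= 0 && n < toy_n) ? toy_index[n].offset : -1;
}

/* Byte offset -> container number, or -1 if no container starts there. */
static int64_t toy_offset2num(int64_t offset) {
    for (int64_t i = 0; i < toy_n; i++)
        if (toy_index[i].offset == offset)
            return i;
    return -1;
}

/* Count containers overlapping refid:start-end and report the numbers of
 * the first and last such container (both -1 when nothing overlaps). */
static int64_t toy_num_between(int refid, int64_t start, int64_t end,
                               int64_t *first, int64_t *last) {
    int64_t count = 0;
    *first = *last = -1;
    for (int64_t i = 0; i < toy_n; i++) {
        const toy_container *c = &toy_index[i];
        if (c->refid != refid || c->end < start || c->start > end)
            continue;                 /* different reference or no overlap */
        if (count++ == 0)
            *first = i;
        *last = i;
    }
    return count;
}
```

With this toy table, a query for refid 0, positions 1500 to 2500 overlaps containers 1 and 2, so toy_num_between returns 2; feeding those numbers to toy_num2offset then gives the byte range a tool would need to seek to and copy.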
cram_index_query now copes with HTS_IDX_NOCOOR (-2) and maps it over to refid -1.
Also improved cram_index_query so it works on region HTS_IDX_NOCOOR too, rather than requiring a remapping to CRAM's -1.
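The refid remapping mentioned above can be sketched as follows. This is a minimal illustration rather than the actual cram_index_query code: TOY_IDX_NOCOOR and toy_normalise_refid are hypothetical names, and the only facts taken from the description are that HTS_IDX_NOCOOR is -2 and that CRAM uses refid -1 for the same purpose.

```c
#include <assert.h>

/* Mirrors the value the description gives for HTS_IDX_NOCOOR.
 * Defined locally so the sketch compiles without htslib headers. */
#define TOY_IDX_NOCOOR (-2)

/* Map the generic "no coordinate" region id onto CRAM's own refid -1,
 * leaving every other refid untouched. Hypothetical helper. */
static int toy_normalise_refid(int refid) {
    return refid == TOY_IDX_NOCOOR ? -1 : refid;
}
```

Doing this mapping inside the query means callers can pass the generic region id straight through instead of translating it themselves.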