[MRG] add 'sourmash signature' signature manipulation utilities. #587
Codecov Report
@@            Coverage Diff             @@
##           master     #587      +/-   ##
==========================================
+ Coverage   88.64%   89.43%   +0.79%
==========================================
  Files          25       27       +2
  Lines        3786     4193     +407
  Branches       37       37
==========================================
+ Hits         3356     3750     +394
- Misses        430      443      +13
I think this is ready for a preliminary review by folks. @olgabot @taylorreiter @bluegenes any thoughts or comments welcome! @luizirber while the tests aren't that comprehensive I think the basic command line API is OK, and as these commands are mostly wrappers around MinHash functionality, I'm not too concerned about the tests. I'm sure we'll add some as bugs are discovered, too :).
I think this is a good place to put #121 too? Maybe not in this PR, but probably fits into
yes, #121 is an excellent addition - can provide import and export both! thx!
Hi @luizirber I think this is ready for code review! Your opinion on
all tests passed!!!!
LGTM!
Provides `merge`, `flatten`, `rename`, `intersect`, `extract`, `downsample`, `subtract`, `import`, `export`, and `overlap` utilities via `sourmash signature` subcommands.

PR internal link: sourmash signature documentation updates.
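As a rough illustration of what several of these subcommands mean, here is a minimal conceptual sketch that models a signature as a plain Python set of integer hashes. The helper names (`merge`, `intersect`, `subtract`, `overlap`) mirror the subcommand names but are hypothetical stand-ins written for this example; they are not sourmash functions and ignore details such as ksize, molecule type, and abundance tracking.

```python
# Conceptual sketch only: each "signature" is modelled as a set of integer hashes.
# These helpers are hypothetical illustrations, not part of the sourmash API.

def merge(*sketches):
    """Union: every hash that appears in at least one input sketch."""
    out = set()
    for sketch in sketches:
        out |= sketch
    return out


def intersect(*sketches):
    """Intersection: only hashes present in every input sketch."""
    if not sketches:
        return set()
    out = set(sketches[0])
    for sketch in sketches[1:]:
        out &= sketch
    return out


def subtract(first, *others):
    """Difference: hashes in the first sketch that appear in none of the others."""
    out = set(first)
    for sketch in others:
        out -= sketch
    return out


def overlap(a, b):
    """Summarise how two sketches relate to each other."""
    common = a & b
    union = a | b
    return {
        "common": len(common),
        "only_a": len(a - b),
        "only_b": len(b - a),
        "jaccard": len(common) / len(union) if union else 0.0,
    }


a = {1, 2, 3, 4}
b = {3, 4, 5}

print(merge(a, b))
print(intersect(a, b))
print(subtract(a, b))
print(overlap(a, b))
```

Treating the sketches as sets makes it clear why these commands can be thin wrappers: each one reduces to a single set operation once compatible signatures have been selected.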
Favorite internal review so far:
"it's grease for bioinformatics! (and all the functionality I never knew i needed in my life)"
Also renames `sourmash_lib` to `sourmash` in most of the existing code, as minor cleanup.

Fixes #196 #239 #121 #241.
TODO:
- `info` or `describe` command (see code in Add "describe" to output kmer sizes, molecules, scaled, and num hashes for each signature #561)
- `-o/--output` to specify output
- `--ksize` etc signature selection commands
- `extract` command to extract one particular signature
- `downsample` command (part of Add two utilities #149)
- `subtract` command (part of Add two utilities #149)
- `--output`
- `intersect` and `subtract` behavior wrt `--track-abundance`
- `--track-abundance`, esp wrt mixtures of these signatures with non-track-abundance.

Upon merge:
- `--output`, and signatures calculated with `--track-abundance`.
- `sig/__main__.py` and extract ideas for new functions for the `MinHash` class - `flatten`, `subtract`, downsample conversion, etc, into new issues.

Checklist:
- `make test` Did it pass the tests?
- `make coverage` Is the new code covered?
- without a major version increment. Changing file formats also requires a major version number increment.
- changes were made?
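For the `flatten` and downsample ideas mentioned in the "Upon merge" notes above, the following is a hedged sketch of what such operations could look like on plain Python data. It assumes an abundance-tracking sketch can be modelled as a dict mapping hash to count, that a "scaled" sketch keeps hashes at or below roughly 2**64 / scaled, and that a "num" sketch keeps the smallest `num` hashes. The function names are hypothetical and the exact boundary handling in sourmash may differ.

```python
# Conceptual sketch only; none of these functions are sourmash API calls.

HASH_SPACE = 2 ** 64  # assumption: hashes are 64-bit unsigned integers


def flatten(abundances):
    """Drop abundance counts and keep only the set of hashes."""
    return set(abundances)


def downsample_scaled(hashes, scaled):
    """Keep hashes in the lowest 1/scaled fraction of the hash space.

    Assumption: the cutoff is HASH_SPACE // scaled, inclusive.
    """
    if scaled < 1:
        raise ValueError("scaled must be >= 1")
    max_hash = HASH_SPACE // scaled
    return {h for h in hashes if h <= max_hash}


def downsample_num(hashes, num):
    """Keep only the `num` smallest hashes (bottom-k style sketch)."""
    if num < 0:
        raise ValueError("num must be >= 0")
    return set(sorted(hashes)[:num])


tracked = {10: 3, 20: 1}          # hash -> abundance
print(flatten(tracked))

wide = {1, 2 ** 63, 2 ** 64 - 1}
print(downsample_scaled(wide, 4))  # cutoff is 2**62, so only the hash 1 survives

print(downsample_num({5, 1, 9, 3}, 2))
```

Downsampling in this model only ever discards hashes, which is why a sketch can be converted to a coarser `scaled` or a smaller `num` but not back again.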