WIP: support RocksDB databases in sourmash proper through FFI #3545
base: latest
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## latest #3545 +/- ##
==========================================
- Coverage 88.03% 87.72% -0.31%
==========================================
Files 136 137 +1
Lines 22275 22588 +313
Branches 2225 2260 +35
==========================================
+ Hits 19609 19815 +206
- Misses 2353 2450 +97
- Partials 313 323 +10
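As a quick sanity check on the numbers above, here is a minimal sketch that recomputes the two coverage percentages from the hits/misses/partials counts. It assumes coverage is hits divided by total tracked lines; the helper name `coverage_pct` is made up for illustration and is not part of Codecov or sourmash.

```python
# Recompute the coverage figures from the Codecov diff above.
# Assumption: coverage = hits / (hits + misses + partials).

def coverage_pct(hits: int, lines: int) -> float:
    """Return coverage as a percentage rounded to two decimals."""
    return round(100 * hits / lines, 2)

# Counts copied from the report: base branch (latest) and this PR (#3545).
base = {"lines": 22275, "hits": 19609, "misses": 2353, "partials": 313}
head = {"lines": 22588, "hits": 19815, "misses": 2450, "partials": 323}

# Each report should be internally consistent.
for report in (base, head):
    total = report["hits"] + report["misses"] + report["partials"]
    assert total == report["lines"]

base_pct = coverage_pct(base["hits"], base["lines"])
head_pct = coverage_pct(head["hits"], head["lines"])
delta = round(head_pct - base_pct, 2)

print(f"base {base_pct}%  head {head_pct}%  delta {delta}%")
```

The counts add up, and the recomputed percentages and delta agree with the ones in the report.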
for more information, see https://pre-commit.ci
Interim benchmarks: 1234.2674s (20m 34s) for Python `gather --no-prefetch` against RocksDB on SRR1976948. With prefetch it's much worse, of course: 1886s and 8 GB of RAM. Compare with fmg x rocksdb at 120s and 476 MB of RAM 😭 Still, it's better than straight Python gather on both RAM and time!!
full set of benchmarks -
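To put the interim numbers in proportion, here is a small sketch that turns them into ratios against the fmg x rocksdb run. The variable names are illustrative only, and treating 8 GB as 8000 MB is an assumption (the comment does not say whether GB means 1000 or 1024 MB).

```python
# Ratios derived from the interim benchmark figures quoted above.
no_prefetch_s = 1234.2674   # Python gather --no-prefetch against RocksDB
prefetch_s = 1886.0         # Python gather with prefetch
fmg_s = 120.0               # fmg x rocksdb

prefetch_ram_mb = 8 * 1000  # assumption: 1 GB = 1000 MB
fmg_ram_mb = 476

# Confirm that 1234.2674 s really is 20m 34s.
minutes, seconds = divmod(int(no_prefetch_s), 60)

slowdown_no_prefetch = round(no_prefetch_s / fmg_s, 1)
slowdown_prefetch = round(prefetch_s / fmg_s, 1)
ram_ratio = round(prefetch_ram_mb / fmg_ram_mb, 1)

print(f"--no-prefetch: {minutes}m {seconds}s, {slowdown_no_prefetch}x slower than fmg")
print(f"with prefetch: {slowdown_prefetch}x slower, {ram_ratio}x more RAM than fmg")
```

So, under these assumptions, Python gather through FFI is roughly 10x slower than fmg without prefetch, and about 16x slower with about 17x the RAM when prefetch is on.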
This PR exposes the RocksDB-based `RevIndex` to Python via the FFI layer, supports the load/search/gather APIs, and adds command-line based creation via `sourmash index --rocksdb`.

Tackles #3558
Fixes #3570
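For readers new to the idea, here is a toy, pure-Python sketch of what a reverse index with gather and picklist support does conceptually. It is not the sourmash or RocksDB implementation: `ToyRevIndex` and all of its methods are hypothetical names, hashes are plain integers, and the index lives in memory rather than on disk.

```python
from collections import defaultdict

class ToyRevIndex:
    """Toy in-memory reverse index: hash value -> names of datasets containing it.
    Hypothetical illustration only; not the sourmash RevIndex API."""

    def __init__(self):
        self.hash_to_names = defaultdict(set)

    def insert(self, name, hashes):
        """Record every hash of a dataset under that dataset's name."""
        for h in set(hashes):
            self.hash_to_names[h].add(name)

    def counter(self, query, picklist=None):
        """Count, per dataset, how many query hashes it contains.
        If a picklist (set of names) is given, other datasets are ignored."""
        counts = {}
        for h in query:
            for name in self.hash_to_names.get(h, ()):
                if picklist is None or name in picklist:
                    counts[name] = counts.get(name, 0) + 1
        return counts

    def gather(self, query, threshold=1, picklist=None):
        """Greedy gather: repeatedly take the dataset with the most remaining
        query hashes, then remove those hashes from the query."""
        remaining = set(query)
        results = []
        while remaining:
            counts = self.counter(remaining, picklist)
            if not counts:
                break
            # Ties are broken alphabetically so the result is deterministic.
            best = max(sorted(counts), key=counts.get)
            if counts[best] < threshold:
                break
            matched = {h for h in remaining
                       if best in self.hash_to_names.get(h, ())}
            results.append((best, len(matched)))
            remaining -= matched
        return results

idx = ToyRevIndex()
idx.insert("A", [1, 2, 3, 4])
idx.insert("B", [3, 4, 5])
idx.insert("C", [6])

query = {1, 2, 3, 4, 5, 7}
print(idx.gather(query))
print(idx.gather(query, picklist={"B", "C"}))
```

The point of the reverse layout is that a query only touches the entries for its own hashes, so the best match can be found without loading every indexed sketch. The picklist acts as a filter on which datasets are allowed to match.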
Specifics:
- `sourmash::index::revindex::disk_revindex::RevIndex`, including creation with `sourmash index`, support for the `Index` search/gather protocol, and full manifest/picklist operations 🎉.
- `sourmash index`, as well as RocksDB, via `-F/--index-type`, viz "support more outputs from `index` command & add more on-disk index tests?" #3570;
- `tests/test_revindex.py` to test finicky details; adds branchwater-created RocksDB files for testing;
- `tests/test_sourmash.py` and moves them to `test_cmd_index.py`; expands the tests to cover SBT, zip, and RocksDB index types;
- `Idx` filter to the `disk_revindex.rs` code to support picklists on RocksDB databases (but, so far, only for the limited set of operations exposed via FFI);
- `test_cmd_index.py` viz "document and test skipmer moltypes" #3449

TODO:
- `revindex.py`
- `cli/index.py`;
- `index()` in `commands.py`
Build RocksDB:
run describe
run gather