MRG: change `sig_from_record` to use scaled from `Record` to downsample #3387
Codecov Report

All modified and coverable lines are covered by tests ✅

Additional details and impacted files

    @@           Coverage Diff           @@
    ##           latest    #3387   +/-   ##
    =======================================
      Coverage   86.46%   86.47%
    =======================================
      Files         137      137
      Lines       16092    16095    +3
      Branches     2219     2219
    =======================================
    + Hits        13914    13918    +4
    + Misses       1871     1870    -1
      Partials      307      307
Ready for review and merge, @luizirber!

aaaand as I should have expected: in the branchwater plugin, I get breakages around md5sum tests as well as discovering a few bugs/unhandled situations! Which is good. Just, you know, more work :)
Co-authored-by: Luiz Irber <luizirber@users.noreply.github.com>
Addresses #3386.

#3387 changed `select` to update `Record.scaled` to the desired scaled value. This PR changes `CollectionSet` to require that all `scaled` values be the same, which can now be achieved by running `select` ;). It also adds a new method, `Collection::min_max_scaled()`, which makes it easy to retrieve `scaled` for a `Collection`.
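The single-`scaled` requirement described above can be illustrated with a small toy model. This is only a sketch: the `Collection` and `CollectionSet` structs below are simplified stand-ins, not the sourmash types, and both the `try_new` constructor and the `Option<(u32, u32)>` return type of `min_max_scaled` are assumptions made for illustration.

```rust
// Toy model only: simplified stand-ins, not the real sourmash types.

/// A collection reduced to the `scaled` value of each of its records.
struct Collection {
    scaled_values: Vec<u32>,
}

impl Collection {
    /// Smallest and largest `scaled` in the collection, or `None` if it is empty.
    /// (Return type is an assumption for this sketch.)
    fn min_max_scaled(&self) -> Option<(u32, u32)> {
        let min = self.scaled_values.iter().copied().min()?;
        let max = self.scaled_values.iter().copied().max()?;
        Some((min, max))
    }
}

/// A wrapper that only accepts collections with one uniform `scaled`.
struct CollectionSet {
    collection: Collection,
}

impl CollectionSet {
    /// Hypothetical constructor: reject collections whose records disagree on `scaled`.
    fn try_new(collection: Collection) -> Result<Self, String> {
        match collection.min_max_scaled() {
            Some((min, max)) if min != max => Err(format!(
                "mixed scaled values ({} to {}); run select first",
                min, max
            )),
            // Empty collections and uniform collections are both accepted here.
            _ => Ok(CollectionSet { collection }),
        }
    }
}

fn main() {
    let uniform = Collection { scaled_values: vec![1000, 1000] };
    match CollectionSet::try_new(uniform) {
        Ok(set) => println!("accepted: {:?}", set.collection.min_max_scaled()),
        Err(e) => println!("rejected: {}", e),
    }

    let mixed = Collection { scaled_values: vec![1000, 10_000] };
    match CollectionSet::try_new(mixed) {
        Ok(_) => println!("accepted"),
        Err(e) => println!("rejected: {}", e),
    }
}
```

In this model, a mixed collection is rejected until every record reports the same `scaled`, which is the state `select` is described as producing.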
## [0.17.2] - 2024-11-15

MSRV: 1.66

Changes/additions:

* enforce a single scaled on a `CollectionSet` (#3397)
* change `sig_from_record` to use scaled from `Record` to downsample (#3387)

Updates:

* Upgrade rocksdb to 0.22.0, bump MSRV to 1.66 (#3383)
* Bump thiserror from 1.0.68 to 2.0.3 (#3389)
* Bump csv from 1.3.0 to 1.3.1 (#3390)
* Bump tempfile from 3.13.0 to 3.14.0 (#3391)
Developer updates:

* build: move ORCID to metadata in pyproject.toml, fix pixi (#3416)
* build: simplify Rust release (#3392)
* fix: Avoid re-calculating md5sum on clone and conversion to KmerMinHashBTree (#3385)
* r0.15.1 release (#3304)
* update sourmash core to r0.17.0 (#3381)
* Added union method to HLL (#3293)
* Build: upgrade to newer maturin (#3366)
* CI: use supported ubuntu for codspeed (#3350)
* Fix clippy lints from 1.83 beta (#3357)
* Implement resumability for revindex (#3275)
* add `Manifest::intersect_manifest` to Rust core (#3305)
* bump sourmash core to r0.17.2 (#3399)
* change `sig_from_record` to use scaled from `Record` to downsample (#3387)
* derive Hash for `HashFunctions` (#3344)
* enforce a single scaled on a `CollectionSet` (#3397)
* fix formatting from #3306 (#3307)
* have ruff ignore ipynb so as to avoid triggering an error during CI (#3325)
* improve downsampling behavior on `KmerMinHash`; fix `RevIndex::gather` bug around `scaled`. (#3342)
* panic when `FSStorage::load_sig` encounters more than one `Signature` in a JSON record (#3333)
* propagate error from `RocksDB::open` on bad directory (#3306)
* refactor `calculate_gather_stats` to disallow repeated downsampling (#3352)
* release core r0.17.1 (#3388)
* release sourmash rust core r0.16.0 (#3356)
* standardize on u32 for scaled, and introduce `ScaledType` (#3364)
* update plugin documentation for users (#3286)
* update sourmash core to r0.15.2 (#3338)
* when lingroups are provided, use them for `csv_summary` (#3311)
* Misc Rust updates to core (#3297)
* Resolve issue for high precision MLE estimation (#3296)

Dependabot and pre-commit CI updates:

* Bump DeterminateSystems/magic-nix-cache-action from 7 to 8 (#3319)
* Bump DeterminateSystems/nix-installer-action from 13 to 14 (#3320)
* Bump DeterminateSystems/nix-installer-action from 14 to 15 (#3374)
* Bump DeterminateSystems/nix-installer-action from 15 to 16 (#3401)
* Bump camino from 1.1.7 to 1.1.9 (#3301)
* Bump codspeed-criterion-compat from 2.6.0 to 2.7.2 (#3324)
* Bump conda-incubator/setup-miniconda from 3.0.4 to 3.1.0 (#3373)
* Bump csv from 1.3.0 to 1.3.1 (#3390)
* Bump getset from 0.1.2 to 0.1.3 (#3328)
* Bump histogram from 0.11.0 to 0.11.1 (#3377)
* Bump js-sys from 0.3.72 to 0.3.74 (#3412)
* Bump memmap2 from 0.9.4 to 0.9.5 (#3326)
* Bump myst-parser from 3.0.1 to 4.0.0 (#3277)
* Bump needletail from 0.5.1 to 0.6.0 (#3376)
* Bump pypa/cibuildwheel from 2.19.2 to 2.20.0 (#3278)
* Bump pypa/cibuildwheel from 2.20.0 to 2.21.1 (#3332)
* Bump pypa/cibuildwheel from 2.21.1 to 2.21.2 (#3345)
* Bump pypa/cibuildwheel from 2.21.2 to 2.21.3 (#3353)
* Bump pypa/cibuildwheel from 2.21.3 to 2.22.0 (#3408)
* Bump roaring from 0.10.6 to 0.10.7 (#3413)
* Bump serde from 1.0.204 to 1.0.207 (#3289)
* Bump serde from 1.0.207 to 1.0.208 (#3298)
* Bump serde from 1.0.208 to 1.0.209 (#3310)
* Bump serde from 1.0.209 to 1.0.210 (#3318)
* Bump serde from 1.0.210 to 1.0.214 (#3368)
* Bump serde from 1.0.214 to 1.0.215 (#3403)
* Bump serde_json from 1.0.120 to 1.0.121 (#3267)
* Bump serde_json from 1.0.121 to 1.0.122 (#3280)
* Bump serde_json from 1.0.122 to 1.0.124 (#3288)
* Bump serde_json from 1.0.124 to 1.0.125 (#3302)
* Bump serde_json from 1.0.125 to 1.0.127 (#3309)
* Bump serde_json from 1.0.127 to 1.0.128 (#3316)
* Bump serde_json from 1.0.128 to 1.0.132 (#3358)
* Bump serde_json from 1.0.132 to 1.0.133 (#3402)
* Bump sphinx-design from 0.5.0 to 0.6.0 (#3268)
* Bump sphinx-design from 0.6.0 to 0.6.1 (#3276)
* Bump tempfile from 3.10.1 to 3.11.0 (#3279)
* Bump tempfile from 3.11.0 to 3.12.0 (#3287)
* Bump tempfile from 3.12.0 to 3.13.0 (#3340)
* Bump tempfile from 3.13.0 to 3.14.0 (#3391)
* Bump thiserror from 1.0.63 to 1.0.64 (#3335)
* Bump thiserror from 1.0.64 to 1.0.65 (#3367)
* Bump thiserror from 1.0.65 to 1.0.68 (#3379)
* Bump thiserror from 1.0.68 to 2.0.3 (#3389)
* Bump web-sys from 0.3.69 to 0.3.70 (#3299)
* Bump web-sys from 0.3.70 to 0.3.72 (#3354)
* Bump web-sys from 0.3.72 to 0.3.74 (#3411)
* Update pytest-cov requirement from <6.0,>=4 to >=4,<7.0 (#3375)
* Update sphinx requirement from <8,>=6 to >=6,<9 (#3269)
* Upgrade rocksdb to 0.22.0, bump MSRV to 1.66 (#3383)
* [pre-commit.ci] pre-commit autoupdate (#3281)
* [pre-commit.ci] pre-commit autoupdate (#3290)
* [pre-commit.ci] pre-commit autoupdate (#3312)
* [pre-commit.ci] pre-commit autoupdate (#3330)
* [pre-commit.ci] pre-commit autoupdate (#3336)
* [pre-commit.ci] pre-commit autoupdate (#3341)
* [pre-commit.ci] pre-commit autoupdate (#3346)
* [pre-commit.ci] pre-commit autoupdate (#3360)
* [pre-commit.ci] pre-commit autoupdate (#3369)
* [pre-commit.ci] pre-commit autoupdate (#3380)
* [pre-commit.ci] pre-commit autoupdate (#3393)
* [pre-commit.ci] pre-commit autoupdate (#3404)
* [pre-commit.ci] pre-commit autoupdate (#3409)
* [pre-commit.ci] pre-commit autoupdate (#3414)
Fixes #3384.

This PR changes `Manifest::select` so that if `scaled` is set in the selection, all matching `Record`s have their scaled value updated. It also updates `Selection::from_record` to set `scaled` to match the `Record` scaled value. In turn, this allows `Collection::sig_from_record` to respect the specified `scaled` value when loading `Signature`.
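The flow described above (select updates the record, `from_record` carries `scaled` along, loading downsamples) can be sketched with a toy model. Everything below is a simplified stand-in rather than the sourmash API: the struct fields, the free functions `select`, `downsample`, and `sig_from_record`, and the rule that a record matches when its `scaled` is less than or equal to the requested one are all assumptions for illustration. Downsampling is modeled as keeping hashes at or below `u64::MAX / scaled`.

```rust
// Toy model only: simplified stand-ins, not the real sourmash types or signatures.

#[derive(Debug, Clone, PartialEq)]
struct Record {
    name: String,
    scaled: u32,
}

#[derive(Debug, Clone, Default)]
struct Selection {
    scaled: Option<u32>,
}

impl Selection {
    /// Build a selection whose `scaled` matches the record's `scaled`.
    fn from_record(record: &Record) -> Self {
        Selection { scaled: Some(record.scaled) }
    }
}

/// Keep records that can be downsampled to the requested `scaled`
/// and rewrite their `scaled` to that value (assumed matching rule).
fn select(records: &[Record], selection: &Selection) -> Vec<Record> {
    records
        .iter()
        .filter_map(|r| match selection.scaled {
            None => Some(r.clone()),
            Some(s) if r.scaled <= s => Some(Record { name: r.name.clone(), scaled: s }),
            Some(_) => None, // cannot "upsample" a coarser sketch
        })
        .collect()
}

/// Keep only hashes at or below the threshold implied by `scaled`.
fn downsample(hashes: &[u64], scaled: u32) -> Vec<u64> {
    if scaled == 0 {
        return hashes.to_vec(); // no scaled threshold in this toy model
    }
    let max_hash = u64::MAX / u64::from(scaled);
    hashes.iter().copied().filter(|h| *h <= max_hash).collect()
}

/// Load the stored hashes for a record, respecting the record's `scaled`.
fn sig_from_record(record: &Record, stored_hashes: &[u64]) -> Vec<u64> {
    match Selection::from_record(record).scaled {
        Some(s) => downsample(stored_hashes, s),
        None => stored_hashes.to_vec(),
    }
}

fn main() {
    let records = vec![
        Record { name: "a".to_string(), scaled: 1_000 },
        Record { name: "b".to_string(), scaled: 100_000 },
    ];
    let picked = select(&records, &Selection { scaled: Some(10_000) });

    // Hashes as stored on disk at the original, finer scaled.
    let stored = [10, u64::MAX / 10_000, u64::MAX / 10_000 + 1, u64::MAX / 1_000];
    for record in &picked {
        let hashes = sig_from_record(record, &stored);
        println!("{} scaled={} kept {} hashes", record.name, record.scaled, hashes.len());
    }
}
```

The point of the sketch is that the record, once selected, is the single source of truth for `scaled`, so the loading step needs no separate downsampling argument.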