-
Notifications
You must be signed in to change notification settings - Fork 80
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
relative path interpretation for manifests is a little fubar #3053
Comments
The problem in the code is here: https://github.com/sourmash-bio/sourmash/blob/latest/src/sourmash/save_load.py#L162-L171 here: https://github.com/sourmash-bio/sourmash/blob/latest/src/sourmash/index/__init__.py#L1157-L1158 and here: https://github.com/sourmash-bio/sourmash/blob/latest/src/sourmash/index/__init__.py#L1188-L1192 basically, the prefix is set from the dirname of the manifest, and then prepended wilhe-nilhe to the A workaround that @AnneliektH used successfully was to build the manifest with |
see discussion in #3048 (comment); once #3054 is merged, solution will be to use |
…` and `sig collect` (#3054) This PR updates `sig collect` and `sig check` so that they can produce standalone manifests that work properly with default sourmash loading behavior. The default behavior produces broken manifests in some situations and is not changed, but will be deprecated in v5. ## Details Currently, `sig collect` and `sig check` default to producing standalone manifests with internal path locations relative to the current working directory. This conflicts with the default `StandaloneManifest` behavior implemented in `save_load.py` that loads path locations relative to the manifest location. As a result, whenever the manifest was in a subdirectory, the standalone manifests output by `sig check` and `sig collect` were broken. The only way to make good manifests in this situation was to use `sig collect --abspath`, but `sig check` didn't support `--abspath`, and using absolute paths is brittle in situations where you want to distribute manifests. This PR adds `--relpath` to both `sig check` and `sig collect`, and adds `--abspath` to `sig check`. It also demonstrates the bad behavior in tests and annotates the tests appropriately. See #3008 (comment) for more detailed discussion of why I think `--relpath` is the right behavior for the future. - [x] adds `--abspath` and `--relpath` to `sig check`, to properly support relative paths; - [x] adds `--relpath` to `sig collect`, to properly support relative paths; - [x] documents this behavior properly for creating standalone manifests; - [ ] create issue to change default `sig check` and `sig collect` behavior for v4, and disable cwd behavior. Techie TODO: - [x] explicitly test `relpath` and `abspath` behavior in `sig check`; - [x] explicitly test `relpath` behavior in `sig collect` - [x] write some tests for `sig check` and `sig collect` to explore the relative path loading issue, with all three combinations of relpath: mf in cwd, sigs in subdir; mf in subdir, sigs in cwd; mf in subdir, sigs in subdir. Related issues: * Addresses #3008 * Addresses issues in #3048 by updating `sig check` to support `--relpath`; * Fixes #3053 - `--relpath` again --------- Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
This PR: * fixes a minor nit in `sourmash sig collect` output where it said "loaded 0 signatures" * updates a lot of the documentation around standalone manifests to encourage their use * in tandem, modifies docs to discourage loading from pathlists/from-files and directory hierarchies TODO: - [x] look at TODO item re directories in sig collect - [x] think about adding #3023 information into docs about lazy loading; maybe in the advanced databases document? - [x] update `sig manifest` docs to point out that they do not generate standalone manifests - [x] revisit branchwater plugin documentation to, to either make issues or make changes - [x] update `sig check` and `sig collect` to tell people to expand their paths ref #3039 - [x] update docs more to recommend against pathlists and directories per #3040 Related issues: * sourmash-bio/sourmash_plugin_branchwater#235 * Fixes #3048 * Fixes #3009 by recommending `sig collect` and `sig check` instead of `sig manifest` for making standalone manifests * #3053 * Fixes #3023 * Fixes #3039 * Fixes #3040 --------- Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Tessa Pierce Ward <bluegenes@users.noreply.github.com>
(this is what is triggering the problem in #3048 (comment))
with the following directory structure:
mf.csv built by doing:
then mf.csv looks like this:
and then
yields
more info soon
related to:
internal_location
- what do we recommend? #3008The text was updated successfully, but these errors were encountered: