In #1837, we change sourmash sig extract to identify signatures to extract using manifest rows, and then have to convert the manifest rows into a manifest and then from there into a picklist in order to actually extract the sketches. This seems circuitous.
It also means that sourmash sig extract --picklist does not work on certain database types that do not support multiple picklists - LCA DBs, SBTs, and zipfiles w/o a manifest, for example.
Two ideas, not mutually exclusive:
One, we could have Index classes provide a signature getter that works on internal locations in manifests.
Two, we could directly provide a method for retrieving many signatures, given a manifest (or, really, just a list of internal locations).
What I don't remember offhand is whether all Index classes support internal locations. If not, that would be a problem.
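To make the two ideas concrete, here is a rough sketch of what such an interface might look like. Everything in it is hypothetical: `ToySketch`, `ToyIndex`, `get_by_location` and `get_many` are illustrative names, not existing sourmash classes or methods, and the in-memory dict stands in for whatever storage a real Index would use.

```python
from typing import Dict, Iterable, Iterator, List


class ToySketch:
    """Hypothetical stand-in for a signature; it only carries an md5 string."""

    def __init__(self, md5: str):
        self.md5 = md5


class ToyIndex:
    """Hypothetical Index that stores sketches under internal locations."""

    def __init__(self, contents: Dict[str, List[ToySketch]]):
        # Maps internal_location -> sketches stored at that location.
        self._contents = contents

    def get_by_location(self, internal_location: str) -> List[ToySketch]:
        # Idea one: a getter keyed on a manifest's internal location.
        try:
            return list(self._contents[internal_location])
        except KeyError:
            raise ValueError(
                f"no such internal location: {internal_location}"
            ) from None

    def get_many(self, internal_locations: Iterable[str]) -> Iterator[ToySketch]:
        # Idea two: retrieve many sketches from a list of locations,
        # visiting each distinct location only once, in first-seen order.
        for location in dict.fromkeys(internal_locations):
            yield from self.get_by_location(location)
```

An Index class that has no notion of internal locations could not implement `get_by_location` in this form, which is the concern raised above.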
Some things going on in the SqliteIndex PR #1808 make me think that we should enable individual retrieval via manifest row. That gives storages the ability to figure out what collection of information is best, including off-label manifest row columns like primary keys in sqlite databases...
import sourmash

def zipfile_load_ss_from_row(db, row):
    data = db.storage.load(row['internal_location'])
    sigs = sourmash.signature.load_signatures(data)

    return_sig = None
    for ss in sigs:
        if ss.md5sum() == row['md5']:
            assert return_sig is None  # there can only be one!
            return_sig = ss

    if return_sig is None:
        raise ValueError("no match to requested row in db")
    return return_sig
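One way the single-row loader above might extend to many rows is to group the rows by internal location, so that each location is loaded only once. This is only a sketch under assumptions: `load_many_from_rows` and the `load_location` callable are hypothetical names, and the only things assumed about the data are that rows carry `internal_location` and `md5` keys and that loaded objects expose `md5sum()`, as in the snippet above.

```python
from collections import defaultdict


def load_many_from_rows(rows, load_location):
    """Return the sketches matching the given manifest rows.

    rows: iterable of dicts with 'internal_location' and 'md5' keys.
    load_location: hypothetical callable that takes an internal location
        and returns an iterable of objects with an md5sum() method.
    """
    # Group the requested md5s by the location they live in.
    wanted = defaultdict(set)
    for row in rows:
        wanted[row['internal_location']].add(row['md5'])

    found = []
    for location, md5s in wanted.items():
        remaining = set(md5s)
        # Load each location once and keep the first match per md5.
        for ss in load_location(location):
            md5 = ss.md5sum()
            if md5 in remaining:
                remaining.discard(md5)
                found.append(ss)
        if remaining:
            raise ValueError(
                f"no match in {location} for md5(s): {sorted(remaining)}"
            )
    return found
```

Unlike the single-row version, this sketch does not assert that an md5 is unique within a location; it simply takes the first match.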
Curious how this approach would generalize to all Index classes, and also how it would interact with the Rust Collection layer.