Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow aggregation of queryables across collections #505

Closed
gadomski opened this issue May 2, 2023 · 5 comments · Fixed by #511
Closed

Allow aggregation of queryables across collections #505

gadomski opened this issue May 2, 2023 · 5 comments · Fixed by #511
Assignees
Labels
enhancement New feature or request
Milestone

Comments

@gadomski
Copy link
Member

gadomski commented May 2, 2023

It can be useful to find the set of queryables available across collections. Here's one possible usage:

client = Client.open("https://pystac-client.test")

# GET /queryables
_: Set[str] = client.queryables()

# GET /queryables/collection-a/queryables
# GET /queryables/collection-b/queryables
# Put all queryables in a set and return
_: Set[str] = client.queryables(["collection-a", "collection-b"])

There's a question to resolve around whether queryables() should hit /queryables or do an aggregation of all collections.

@gadomski gadomski added the enhancement New feature or request label May 2, 2023
@jsignell
Copy link
Member

jsignell commented May 2, 2023

Is the response different from get_queryables? If we want allow aggregation maybe we could add a recursive option to get_queryables?

@gadomski
Copy link
Member Author

gadomski commented May 2, 2023

Ah no, thanks! I forgot that we just added that :-). The primary motivating use-case was around "I have this subset of collections that I want to query against", so just recursive wouldn't be enough -- we'll want to specify collection ids. But maybe:

client.get_queryables(collections=["a", "b"])

that would hit /collections/a/queryables and /collections/b/queryables and union the queryables together.

@jsignell
Copy link
Member

jsignell commented May 2, 2023

Yeah and that feels more ergonomic than something like this?

set(client.get_collection(c).get_queryables() for c in ["a", "b"])

@gadomski gadomski added this to the 0.7.0 milestone May 4, 2023
@ircwaves ircwaves self-assigned this May 5, 2023
@ircwaves
Copy link
Member

ircwaves commented May 5, 2023

The response from get_queryables has "$id", "$schema", "properties" as the keys.

  1. $schema seems straight forward, as that is always the same.
  2. $id is less clear to me. I default to wanting to drop it.
  3. properties seems like what @jsignell suggested could be implemented by successive .update calls, but this could be erroneous if a parameter has different values in separate collections, and only the last value would be in there.

@gadomski -- does that fit the bill for this?

@gadomski
Copy link
Member Author

gadomski commented May 8, 2023

$schema seems straight forward, as that is always the same.

Hopefully, right? Probably worth assert-ing just in case.

$id is less clear to me. I default to wanting to drop it.

Yeah, drop it.

if a parameter has different values in separate collections, and only the last value would be in there.

In this case I think we should drop the entry. If the $ref is different for the same queryable key, then we can't reliably query on that key IMO.

What to do about additionalProperties? I think we should assert that it's the same for every collection's queryables, and throw an error if it isn't.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants