Join queries #97

Open
mustafahsyed opened this issue Mar 17, 2023 · 8 comments

Comments

@mustafahsyed

Description

Perform a join query between “individual” and “biosample”, or between any other two or more entities.

Proposed solution

New API endpoint to support queries involving multiple entities.

Definition of Done

Availability of new API endpoint

@mustafahsyed
Author

Hi @costero-e

Please let me know if join queries are possible.

Cheers
Mustafa

@costero-e
Collaborator

Hi @mustafahsyed, by join queries do you mean aggregating data from two endpoints and returning it in a single record, or applying filters to multiple endpoints at once? We are planning to develop multi-endpoint filters, e.g. "show me variants at a specific position for individuals that are male" (the specification does accept this), but we are not planning to return data aggregated from different endpoints.

@mustafahsyed
Author

Hi @costero-e
Ideally both filtering and aggregating data, but we can begin with just the option to filter data across multiple endpoints. Please let me know when such multi-endpoint join filters will be available.
Excellent work!
Thanks

@costero-e
Collaborator

Hi @mustafahsyed,
today I finished implementing "cross queries", which let you apply a filter to a collection other than the one you want to retrieve documents from.
Here is one example you can try:

curl \
  -H 'Content-Type: application/json' \
  -X POST \
  -d '{
    "meta": {
        "apiVersion": "2.0"
    },
    "query": {
        "requestParameters": {},
        "filters": [
            {"id": "NCIT:C20197", "scope": "individual"}
        ],
        "includeResultsetResponses": "HIT",
        "pagination": {
            "skip": 0,
            "limit": 0
        },
        "testMode": false,
        "requestedGranularity": "count"
    }
}' \
  http://localhost:5050/api/g_variants

Aggregating data is not on our roadmap right now, though, as it is not part of the spec.
Let me know if this new implementation fits your needs or is what you expected.
Best,
Oriol
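
For readers who script this cross query instead of using curl, here is a minimal Python sketch that builds the same request body. The helper name `build_query` and its parameters are illustrative assumptions, not part of the API; the field names and the NCIT:C20197 filter are copied from the example above.

```python
import json


def build_query(filters, granularity="count", skip=0, limit=0):
    """Build a request body shaped like the curl example in this thread.

    `build_query` is a hypothetical helper; only the field names are
    taken from the payload shown above.
    """
    return {
        "meta": {"apiVersion": "2.0"},
        "query": {
            "requestParameters": {},
            "filters": list(filters),
            "includeResultsetResponses": "HIT",
            "pagination": {"skip": skip, "limit": limit},
            "testMode": False,
            "requestedGranularity": granularity,
        },
    }


# Cross query: the filter is scoped to "individual", while the body is
# meant to be POSTed to the g_variants endpoint.
payload = build_query([{"id": "NCIT:C20197", "scope": "individual"}])
body = json.dumps(payload, indent=2)
print(body)
```

The printed string can be sent as the POST body, with `Content-Type: application/json`, to the same URL used in the curl command.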

@mustafahsyed
Author

mustafahsyed commented May 14, 2024

@costero-e
Join query now works great! Appreciate adding this very useful update.

One more related question:
If I would like to pass multiple filter IDs for the same data element, e.g. to query diseases.diseaseCode for NCIT:C3270 or NCIT:C16576, what should my payload look like?

My payload for the POST query looks like this, but it does not work:

{"query": { "filters": [{"id":"NCIT:C3270,NCIT:C16576"}], "includeResultsetResponses": "HIT", "pagination": { "skip": 0, "limit": 10000 }, "testMode": "false", "requestedGranularity": "record" } }
Cheers
Mustafa

@costero-e
Collaborator

Hi @mustafahsyed ,
I'm glad to hear that, thanks.
For a payload with multiple filters, assuming both filters are listed in the filtering terms endpoint, it should look like this:

curl \
  -H 'Content-Type: application/json' \
  -X POST \
  -d '{
    "meta": {
        "apiVersion": "2.0"
    },
    "query": {
        "requestParameters": {},
        "filters": [
            {"id": "NCIT:C16576"},
            {"id": "NCIT:C3270"}
        ],
        "includeResultsetResponses": "HIT",
        "pagination": {
            "skip": 0,
            "limit": 1000
        },
        "testMode": false,
        "requestedGranularity": "record"
    }
}' \
  http://localhost:5050/api/individuals

If you want to specify more precisely which scope a filtering term applies to, or run a join query, you can do it like this:

curl \
  -H 'Content-Type: application/json' \
  -X POST \
  -d '{
    "meta": {
        "apiVersion": "2.0"
    },
    "query": {
        "requestParameters": {},
        "filters": [
            {"id": "NCIT:C16576", "scope": "individual"},
            {"id": "NCIT:C3270", "scope": "biosample"}
        ],
        "includeResultsetResponses": "HIT",
        "pagination": {
            "skip": 0,
            "limit": 1000
        },
        "testMode": false,
        "requestedGranularity": "record"
    }
}' \
  http://localhost:5050/api/analyses

Best,

Oriol
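
The difference between the two payloads above is only in the `filters` list: one object per term, with an optional `scope`. A minimal Python sketch of building that list follows. `make_filters` is a hypothetical helper introduced here for illustration; the ids and scopes are the ones used in this comment.

```python
import json


def make_filters(terms):
    """Turn ids or (id, scope) pairs into a list of filter objects.

    Hypothetical helper: each term becomes its own object, and "scope"
    is included only when one is given.
    """
    filters = []
    for term in terms:
        if isinstance(term, str):
            filters.append({"id": term})
        else:
            term_id, scope = term
            filters.append({"id": term_id, "scope": scope})
    return filters


# One object per term, rather than a single comma-joined id string.
unscoped = make_filters("NCIT:C16576,NCIT:C3270".split(","))

# Scoped variant, matching the join query sent to /api/analyses.
scoped = make_filters([("NCIT:C16576", "individual"), ("NCIT:C3270", "biosample")])

print(json.dumps(unscoped))
print(json.dumps(scoped))
```

Either list can be placed under `query.filters` in the request body shown in the curl examples.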

@mustafahsyed
Author

@costero-e
The payload above returns any record with both terms, NCIT:C3270 and NCIT:C16576. What I am looking for is all records with a diseaseCode of either NCIT:C3270 or NCIT:C16576.

The payload you suggested performs an "AND" operation; I am looking for an "OR" operation. Please let me know how I can run an OR query.

@costero-e
Collaborator

Hi @mustafahsyed, this is not in the standard yet, but I know there is a plan to include it in the near future.
Once it is accepted, I will develop "OR" operations for the RI as well and let you know.

Best,
Oriol
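
Until "OR" is supported server-side, one possible client-side workaround is to run one single-filter query per term and merge the results by record id. This is a sketch under assumptions, not something proposed by the maintainers: `union_by_id` and `fetch_records` are hypothetical names, it is assumed that each returned record carries an `id` field, and the records below are made up purely to show the merge.

```python
def union_by_id(term_ids, fetch_records):
    """Emulate OR client-side: one query per term, merged by record id.

    `fetch_records(term_id)` is a caller-supplied (hypothetical) function
    that runs a single-filter query and returns a list of dicts, each
    with an "id" key.
    """
    merged = {}
    for term_id in term_ids:
        for record in fetch_records(term_id):
            # setdefault keeps the first copy, so a record that matches
            # both terms appears only once in the result.
            merged.setdefault(record["id"], record)
    return list(merged.values())


# Toy stand-in for the server, with made-up records.
_FAKE_RESULTS = {
    "NCIT:C3270": [{"id": "ind-1"}, {"id": "ind-2"}],
    "NCIT:C16576": [{"id": "ind-2"}, {"id": "ind-3"}],
}


def fake_fetch(term_id):
    return _FAKE_RESULTS.get(term_id, [])


records = union_by_id(["NCIT:C3270", "NCIT:C16576"], fake_fetch)
print([r["id"] for r in records])
```

In real use, `fetch_records` would need to page through every result for its term, since a truncated page would silently drop records from the union.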
