Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Task 3 - API - add search models by ontology terms other than GO #291

Closed
3 tasks done
lpalbou opened this issue Mar 18, 2020 · 11 comments
Closed
3 tasks done

Task 3 - API - add search models by ontology terms other than GO #291

lpalbou opened this issue Mar 18, 2020 · 11 comments
Assignees

Comments

@lpalbou
Copy link
Contributor

lpalbou commented Mar 18, 2020

Task requirement from Noctua Landing Page Project

This will be a three steps task:

  • include ontologies such as Uberon, CL in blazegraph
  • make sure one can perform a closure search (eg retrieve models also containing the more specific/children terms of the specified term)
  • Provide the API route for NLP UI

Also linked to #230

@lpalbou lpalbou changed the title API - add search models by ontology terms other than GO Task 3 - API - add search models by ontology terms other than GO Mar 18, 2020
@goodb
Copy link
Contributor

goodb commented Mar 21, 2020

@lpalbou out of curiosity, what is the NLP UI ? Is the API route for this expected to be any different from the API used to drive the search?

if you have time, a list test cases and expected results are always handy.

@goodb
Copy link
Contributor

goodb commented Mar 21, 2020

@lpalbou the search as it stands has a parameter ?goterm that should probably have been named something more generic like ?term . That searches over all ontology terms used to provide an rdf:type used in the models, including anatomy etc.

I've just done some successful tests with an extension to this that does query expansion down the subclass tree.

  1. When I commit this, do you want this generic ontology parameter renamed to e.g. "term"
  2. Do you want query expansion to be optional or always on?
  3. Do we need query expansion to run down the "part of" children as well as the subclasses ? If yes, this needs further specification. e.g. we might not want to expand down from biological processes to molecular function parts etc.

@goodb
Copy link
Contributor

goodb commented Mar 22, 2020

For testing this week. Will deploy this with the parameter goterm changed to term , with closure search always on and only looking at subclass descendants, not part of.

@lpalbou
Copy link
Contributor Author

lpalbou commented Mar 26, 2020

@goodb sorry for the delay.

NLP stands for Noctua Landing Page, based on the Noctua Search of @tmushayahama . So it is the same project / search that we have been discussing.

A prototype is available on dev: http://noctua-dev.berkeleybop.org/workbench/noctua-landing-page and ping me on Gitter to setup a demo and give you more contexts.

?goterm that should probably have been named something more generic like ?term
When I commit this, do you want this generic ontology parameter renamed to e.g. "term"

Yes please. @tmushayahama you will have to adjust your queries.

Do you want query expansion to be optional or always on?

Optional at the moment. It could become permanent if we were to highlight/explain the matched results in the response.

on and only looking at subclass descendants, not part of.

Sounds good at the moment. @vanaukenk any preference ?

Thanks Ben

goodb added a commit that referenced this issue Mar 28, 2020
now need to add the parameter expand to a term search query to get it to include results from matches on the subclass closure.  The value of the parameter does not matter and can be empty.  e.g. http://127.0.0.1:6800/search/?term=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FGO_0140312&expand=
@goodb
Copy link
Contributor

goodb commented Mar 28, 2020

@lpalbou ahh.. in my previous worlds NLP stands for something a bit different. I thought maybe you were talking about one of the tools the worm base developers have for pre-processing text. Thanks for clarifying. I made query expansion optional and updated docs. I am working in this branch now in case anyone was keen to test anything before it gets PRd into dev. https://github.com/geneontology/minerva/tree/bugfixes-on-blazegraph-conversion.

@goodb
Copy link
Contributor

goodb commented Apr 1, 2020

RE: #230 (comment)

The most recent PR (not yet available on dev) makes expansion optional. By default it is off. Adding the &expand parameter will return results from all subclasses of the submitted term. If you want to search by root terms I will need to do some optimization.

Let me know about priority on that optimization.

@goodb
Copy link
Contributor

goodb commented Apr 1, 2020

^ @tmushayahama

@tmushayahama
Copy link
Contributor

tmushayahama commented Apr 1, 2020

referring to #230 (comment)

@goodb root term was only an example, but It is giving "Query Timeout Exception" on other most higher up terms as well, I also tried with
"catalytic activity"
http://barista-dev.berkeleybop.org/search?offset=0&limit=50&term=GO:0003824&count
or "biological regulation"
http://barista-dev.berkeleybop.org/search?offset=0&limit=50&term=GO:0065007
and same time out exception. I don't know where the threshold is.

Extra: I tried these big terms below, term = "localization" works
image

For later, it will be nice to send it as an error message for all exceptions. For now it is 200ok

tagging @vanaukenk @lpalbou

goodb added a commit that referenced this issue Apr 1, 2020
@tmushayahama this should help get results back for the short term, but will not speed it up.  Will optimize as needed.
If the UI could refrain from asking for all possible models via the ontology term search, that would probably be a good thing, but its also fair to say the server should figure out how to handle such requests.
@lpalbou
Copy link
Contributor Author

lpalbou commented Apr 1, 2020

Referring to #230 (comment).

Let's discuss it with @vanaukenk . I would not concern with root terms (MF/BP/CC) but one would want to be able to search over catalytic_activity indeed.

@tmushayahama ben has increased the timeout in its commit. It could be sufficient for now but we'll have to see where this comes from. @goodb is it in the SPARQL query ? Or in some subsequent queries you do further annotate the results before sending them back to tremayne ?

Thanks

@goodb
Copy link
Contributor

goodb commented Apr 1, 2020

@lpalbou its in the sparql query over the model collection. Kind of surprised it was such a problem. Will need to look at it to see what's up.

@tmushayahama
Copy link
Contributor

Thanks @goodb @lpalbou for the quick response. @vanaukenk @lpalbou should I make search by exact term default and subtype can be turned on? Or subtype is always on?

@goodb goodb closed this as completed Apr 18, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants