Imported ontologies should be queryable as separate graphs for robot query #158

Closed
cmungall opened this issue Apr 12, 2017 · 36 comments

@cmungall
Contributor

Robot converts an OWL ontology to Turtle for presentation to Jena for SPARQL queries. It appears the ontology goes into an (unnamed?) graph.

Would it not make sense to make the whole import chain queryable, and to preserve each ontology as its own graph?

As an aside, it would be great to have more published standards here. For example, OntoBee puts each ontology in its own graph, which is nice, but it does its own renaming of the URI for the graph.

I thought to implement this in Robot it would be a straightforward switch of this line to use Trig...

...unfortunately saving Trig from the OWLAPI does not have the effect I would expect. It only seems to save the parent ontology, not the closure... ...and it seems to place each class in its own unnamed graph, hmm.

It would be relatively straightforward to iterate through the imports closure and add each ontology separately, though there may be a cleaner way.

And finally (this may deserve a separate ticket), it is common to store inferences in a separate named graph. It would be straightforward to do this stepwise in robot (reason, save the results, and then combine them with the source ontology as an import). There may be a more elegant way to do this?

cc @balhoff @dougli1sqrd

@jamesaoverton
Member

Separate named graphs for imports and inferences would be very nice, as long as it's still easy to query everything at once. Many (but not all?) systems make the default graph the union of the named graphs, and I think that would be good behaviour in this case. I don't remember what Fuseki does off-hand.

The graph for each import can use the import IRI as its name. The graph for the uninferred ontology can use the ontology IRI as its name. The graph with inferences would need a new IRI, which could be ROBOT-specific.

Off the top of my head, I can't think of a better method than iterating through the imports, converting to Turtle, and inserting into a named graph.

Maybe the query command can accept a --reasoner option to indicate that reasoning should be done.
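
The naming scheme and the iterate-and-insert approach described above can be sketched roughly as follows. This is a toy Python model, not ROBOT, OWLAPI, or Jena code: the `Ontology` record, the `build_named_graphs` helper, and the `urn:example:robot:inferred` IRI are hypothetical placeholders, and triples are plain tuples rather than parsed Turtle.

```python
from dataclasses import dataclass, field

# Hypothetical placeholder: the thread only says the inferred graph
# "would need a new IRI, which could be ROBOT-specific".
INFERRED_GRAPH = "urn:example:robot:inferred"


@dataclass
class Ontology:
    """Toy stand-in for a loaded ontology (not an OWLAPI class)."""
    iri: str
    triples: set = field(default_factory=set)
    # Maps the IRI used in the imports statement to the loaded ontology.
    imports: dict = field(default_factory=dict)


def build_named_graphs(root, inferred=None):
    """Return {graph name: triples} for the root and its imports closure."""
    # Asserted triples of the main ontology, named by its ontology IRI.
    graphs = {root.iri: set(root.triples)}
    stack = list(root.imports.items())
    while stack:
        import_iri, ontology = stack.pop()
        if import_iri in graphs:
            continue  # already visited; also guards against import cycles
        graphs[import_iri] = set(ontology.triples)  # named by import IRI
        stack.extend(ontology.imports.items())
    if inferred:
        graphs[INFERRED_GRAPH] = set(inferred)
    return graphs
```

Keying the imports by import IRI (rather than by the imported document's own ontology IRI) is just one of the choices discussed here; swapping the key is a one-line change in this model.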

@cmungall
Contributor Author

cmungall commented Apr 12, 2017 via email

@dougli1sqrd dougli1sqrd self-assigned this May 1, 2017
@dougli1sqrd
Contributor

It looks like the way we load data from the OWL API into a Jena DatasetGraph might be part of the difficulty here. I tried to manually add some named graphs from the OWLOntology object, but SPARQL queries against the named graphs return nothing. I think it's possible Jena wants us to use Dataset as our primary way to run SPARQL, based on this article: https://www.ibm.com/developerworks/community/blogs/nlp/entry/an_introduction_to_the_jena_api?lang=en. I haven't had a chance to explore this completely yet, but this is what I've run into in my research.

@jamesaoverton
Member

I was working on something similar lately, so I think I know the solution. In order to make the default graph be (or just include?) the union of all named graphs, I switched to TDB for managing the dataset, with the settings described here: https://jena.apache.org/documentation/tdb/datasets.html

@dougli1sqrd
Contributor

Ah yeah. It looks like that's what they're using in the IBM article, too. Do you want to be assigned to this ticket then instead of (or in addition to) me?

@jamesaoverton
Member

No thanks, I have too many deadlines right now.

@dougli1sqrd
Contributor

Oh sorry, I guess I misunderstood. You were just saying you know how to do it in the robot case because of a different project, not that you have done it here already? ha, my mistake.

@zhengj2007

This is a very useful feature. We have several ontologies, built on OBO Foundry ontologies, that are used for data loading and search. If this feature is implemented in ROBOT, we can easily check whether OBO Foundry ontology terms are used consistently across those ontologies.

Looking forward to seeing the feature in ROBOT.
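
One way to picture the consistency check described above, assuming each ontology ends up in its own named graph: scan every graph for label triples and report terms that carry different labels in different graphs. This is only a Python sketch over plain tuples; the `terms_with_conflicting_labels` helper is hypothetical and is not how ROBOT evaluates queries.

```python
from collections import defaultdict

RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def terms_with_conflicting_labels(named_graphs):
    """Map each term with more than one distinct label to {label: graph names}.

    named_graphs is {graph name: set of (subject, predicate, object) tuples}.
    """
    seen = defaultdict(lambda: defaultdict(set))
    for graph_name, triples in named_graphs.items():
        for subject, predicate, obj in triples:
            if predicate == RDFS_LABEL:
                seen[subject][obj].add(graph_name)
    return {
        term: {label: sorted(graphs) for label, graphs in labels.items()}
        for term, labels in seen.items()
        if len(labels) > 1  # keep only terms labelled inconsistently
    }
```

Because the graph name travels with each label, the report says not only that a term is inconsistent but also which ontology contributed which label.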

@jamesaoverton
Member

This should not be hard to implement. The biggest questions in my mind are:

  1. what to name each import: the version IRI, the import IRI, the ontology IRI? I can imagine any of these choices causing some confusion. @cmungall @balhoff does the recent discussion of names for ontology parts/variants help clarify this?
  2. what to do with the default graph; it would be convenient to make it the union of all named graphs, but that would break backwards compatibility; if not, we need a name for the union

@cmungall
Contributor Author

Backwards compatibility is important. We could have a command line switch with 3+ possibilities

  • core (default) - use only main ontology
  • union - all in one graph
  • stratify - each ontology in a named graph named by ontology IRI (the versionIRI will be associated with that graph)
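
As a rough illustration of how the three modes above differ, here is a small Python model of the loading step. It is a sketch under assumptions, not ROBOT code: the `load` function is hypothetical, triples are plain tuples, and leaving the default graph empty in `stratify` mode is a choice made here because the proposal does not specify it.

```python
def load(closure, root_iri, mode="core"):
    """Return (default_graph, named_graphs) for the chosen mode.

    closure is {ontology IRI: set of triples} for the main ontology
    and everything in its imports closure.
    """
    if mode == "core":
        # Only the main ontology; imports are not loaded at all.
        return set(closure[root_iri]), {}
    if mode == "union":
        # Everything flattened into one graph; provenance is lost.
        return set().union(*closure.values()), {}
    if mode == "stratify":
        # One named graph per ontology, keyed by ontology IRI.
        return set(), {iri: set(triples) for iri, triples in closure.items()}
    raise ValueError(f"unknown mode: {mode!r}")
```

In this model, a query in `core` mode never sees triples that live only in an import, which is what keeps the existing behaviour unchanged when `core` is the default.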

Need to think how this interacts with reason command

@zhengj2007

@cmungall @jamesaoverton For our use case, we don't need to reason on the ontology. Any expected date on its implementation in robot? Thanks!

@jamesaoverton
Member

@zhengj2007: @rctauber is working on this. We have a lot to do before ICBO, so I'm not sure when it will be ready.

@zhengj2007

@jamesaoverton Thanks for the update.
@rctauber Thanks for working on it.

@beckyjackson
Contributor

I made some progress on this here: https://github.com/rctauber/robot/tree/graphs
This branch includes unit and integration tests to make sure the new features work and to support backwards compatibility.

It adds in a new --imports option to query:

  • --imports ignore: default behavior, does not load imports
  • --imports union: loads imports as named graphs and queries over the union of the graphs
  • --imports graphs: loads imports as named graphs and queries on the named graphs

This option is just a suggestion, if anybody has another idea on how to implement this I'd love to hear it!

@balhoff
Contributor

balhoff commented Jul 25, 2018

@rctauber looking at @jamesaoverton and @cmungall 's descriptions, I'm not sure what the difference would be in your union and graphs options. I think union according to @cmungall would just put everything into the default graph instead of loading as named graphs. It seems to me you may just need ignore and graphs. And then specify that the default graph queries the union of the named graphs in the graphs case (I believe you need to set this up on purpose in Jena, although it is the default behavior for many triplestores like Blazegraph). If the default graph works this way, I don't see why we need an option for loading all imports but not putting them into named graphs.
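
The "default graph queries the union of the named graphs" behaviour can be modelled in a few lines of Python. This is a conceptual sketch only, not the Jena API: the `Dataset` class and its `union_default` flag are hypothetical, and it assumes the union replaces (rather than adds to) whatever is stored directly in the default graph.

```python
class Dataset:
    """Toy dataset: named graphs plus a default graph."""

    def __init__(self, named_graphs, union_default=True):
        self.named = named_graphs       # {graph name: set of triples}
        self.union_default = union_default
        self.default = set()            # triples stored directly in the default graph

    def graph(self, name=None):
        """Return the triples visible in a named graph, or in the default graph."""
        if name is not None:
            return self.named.get(name, set())
        if self.union_default:
            # Computed on demand, so nothing is copied into the default graph.
            return set().union(*self.named.values())
        return self.default

    def match(self, pattern, name=None):
        """Return sorted triples matching pattern; None acts as a wildcard."""
        return sorted(
            triple
            for triple in self.graph(name)
            if all(p is None or p == value for p, value in zip(pattern, triple))
        )
```

With `union_default=True`, a query that names no graph sees every import, while naming a graph restricts the same query to one ontology, so a separate "load everything unnamed" option adds nothing in this model.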

@beckyjackson
Contributor

True, that probably makes more sense. Should it be --imports ignore and --imports graphs or maybe --use-graphs with true and false?

@beckyjackson
Contributor

I pushed a new update with the option --use-graphs true and --use-graphs false (default: false). If you set it to true, the default graph is the union of all imports; alternatively, you can query a specific import by its IRI.

A problem that @jamesaoverton pointed out is that the actual ontology IRIs of the import documents may collide (or be null). The import IRI may be different than the actual ontology IRI. Right now, the graph name is the ontology IRI from the ontology ID for an OWLOntology object. If that IRI is null, it will fail. If that IRI is the same as another import's ontology IRI, there will be a name collision.

As far as I know, OWLAPI doesn't provide a method for mapping the import IRIs to the actual OWLOntology objects that are returned when you run ontology.getImports(). The benefit of using --imports union is that you could load all the imports without worrying about their IRIs. --imports graphs would still run into the same problem as --use-graphs true, but at least users would have an alternative with the union option.

That said, the --use-graphs option may be a bit more user-friendly.

@balhoff
Contributor

balhoff commented Jul 27, 2018

If that IRI is the same as another import's ontology IRI, there will be a name collision.

I think this should cause an exception in the OWLOntologyManager anyway—it won't load two ontologies with the same ontology IRI. For anonymous ontologies, I would suggest autogenerating a graph IRI (something like urn:uuid:EF2F72A6-79DC-40C7-A5D2-0D00B9120F65). It won't matter that the user doesn't know what it is.
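
A minimal sketch of that naming rule, assuming ontology IRIs are available as strings with `None` standing in for an anonymous ontology. The `assign_graph_names` helper is hypothetical, and the explicit duplicate check is there only to make the collision case visible in case the ontology manager does not reject it first.

```python
import uuid


def assign_graph_names(ontology_iris):
    """Return one graph name per ontology, in input order.

    Anonymous ontologies (None) get a freshly generated urn:uuid: name;
    a repeated ontology IRI raises instead of silently merging two graphs.
    """
    names = []
    used = set()
    for iri in ontology_iris:
        name = iri if iri is not None else f"urn:uuid:{uuid.uuid4()}"
        if name in used:
            raise ValueError(f"duplicate graph name: {name}")
        used.add(name)
        names.append(name)
    return names
```

The generated names differ on every run, which is acceptable here because users reach anonymous ontologies through the union rather than by name.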

@cmungall
Contributor Author

cmungall commented Aug 2, 2018

New commits look good, instructions in the markdown seem clear

@zhengj2007

@rctauber Thanks for implementing the feature. When will it be available in a release version of ROBOT? Is it possible to include the feature in release 1.1.0, @jamesaoverton? Thanks!

@jamesaoverton
Member

@zhengj2007 I merged this yesterday, and it's included in the 1.2.0-alpha-1 release: https://github.com/ontodev/robot/releases/tag/v1.2.0-alpha-1

@zhengj2007

@jamesaoverton Thanks a lot!

@beckyjackson
Contributor

Implemented by 882a517 - please re-open if this requires more discussion.

@zhengj2007

@rctauber I downloaded the robot.jar that contains the feature from: https://github.com/ontodev/robot/releases/tag/v1.2.0-alpha-1

I tried the '--use-graphs true' option with query. I ran the command like this:
robot query --use-graphs true --input gates.owl --query QC_termWithMultipleLabels.rq output.csv
But I got an IllegalArgumentException error: Unknown command or option: --use-graphs

How should I use this option? Thanks!

@beckyjackson
Contributor

Hi @zhengj2007 - I just tried to replicate your problem with the jar from the pre-release, but I was able to use the --use-graphs option.

When you downloaded the jar, did you replace the jar in your system PATH?

@zhengj2007

@rctauber I replaced the old jar file with the newly downloaded one. So, it should be in my system PATH, right?

@beckyjackson
Contributor

Yes - it should be. Can you confirm that your PATH points to where you replaced that jar? If you're on MacOS, it should be in ~/.bash_profile. For Windows, go to System -> Advanced system settings -> Environment Variables.

@jamesaoverton
Member

And you can run robot version to check which version is actually being run.

@zhengj2007

@rctauber Thanks! I will check it.

@jamesaoverton I ran the command and got "ROBOT version null" message.

@jamesaoverton
Member

ROBOT version null means an old version, without --use-graphs. It should say "ROBOT version 1.2.0-alpha-1".

@zhengj2007

@jamesaoverton got it. Will check what's wrong.

@zhengj2007

@rctauber @jamesaoverton I found the issue. I forgot that I installed the robot under usr/local/bin but I updated the robot.jar in my downloaded folder. Now I am using the version of robot 1.2.0-alpha-1. Thanks for your help.

@beckyjackson
Contributor

Great!

@zhengj2007

@rctauber The queries that treat imported ontologies as separate graphs worked well (using --use-graphs true). Thanks for your efforts.
However, when I query the same ontology, which imports multiple OWL files, and want to treat them as a single union graph (using --use-graphs false), it does not work: the query always returns 0 rows. I have to merge the OWL files with the merge command first and then run the query. Did I miss anything? Thanks!

@beckyjackson
Contributor

beckyjackson commented Sep 11, 2018 via email

@zhengj2007

@rctauber Thanks for your explanation. It's very helpful. Now everything works fine.
