Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add exclude-tautologies option for reason command. #560

Merged
merged 7 commits into from
Sep 11, 2019

Conversation

balhoff
Copy link
Contributor

@balhoff balhoff commented Aug 21, 2019

This fixes #173. Two things to maybe discuss before merging:

  • The default for this arg is false. This is just to maintain current behavior. However I think most people consider the current behavior to be a bug. I would prefer to make true the default, but this is less conservative.
  • I implemented a manual check for three kinds of axioms (X SubClassOf owl:Thing, owl:Nothing SubClassOf X, X SubClassOf X). @matentzn had proposed a more principled approach where any axiom that is not a subclass axiom is checked by seeing if it is entailed by a reasoner on an empty ontology. I left this out for now, but it could be easily added. It occurred to me that we can always use HermiT for this, since it is reasoning on the empty ontology, not the input ontology. But I don't know how much of a performance hit it would be to run axioms through that query.

@matentzn
Copy link
Contributor

My two cents. I strongly suggest making true tautologies filtering a possibility but to also make it possible to stick to the faster syntactic heuristics you have already implemented. I lean towards this:

--exclude-tautologies all: do the syntactic filter first, and if they dont apply, do the proper check
--exclude-tautologies fast: do only syntactic filter
--exclude-tautologies false: dont remove anything

Note on syntactic filter: we should at least add A equivalentTo A, I think.

  • I would be happy with either 'fast' of 'false' to be the default; I personally lean towards 'fast' by 10%.

@cmungall
Copy link
Contributor

I also lean toward fast by default

I might give it a more informative name, structural?

@matentzn
Copy link
Contributor

+1 on structural

@balhoff
Copy link
Contributor Author

balhoff commented Aug 22, 2019

I added the all option to check using HermiT. For this case I think it is most consistent to solely query HermiT rather than first checking using the "structural" code (most axioms would not match anyway and these would all get passed to HermiT).

I beefed up the structural option. @matentzn please review:

if (tautologiesOption.equalsIgnoreCase("structural")) {
if (a instanceof OWLSubClassOfAxiom) {
OWLSubClassOfAxiom subClassOfAxiom = (OWLSubClassOfAxiom) a;
if (subClassOfAxiom.getSuperClass().isOWLThing()) {
continue;
} else if (subClassOfAxiom.getSubClass().isOWLNothing()) {
continue;
} else if (subClassOfAxiom.getSubClass().equals(subClassOfAxiom.getSuperClass())) {
continue;
}
} else if (a instanceof OWLEquivalentClassesAxiom) {
OWLEquivalentClassesAxiom equivAxiom = (OWLEquivalentClassesAxiom) a;
if (equivAxiom.getClassExpressions().size() < 2) {
continue;
}
} else if (a instanceof OWLClassAssertionAxiom) {
OWLClassAssertionAxiom classAssertion = (OWLClassAssertionAxiom) a;
if (classAssertion.getClassExpression().isOWLThing()) {
continue;
}
} else if (a instanceof OWLObjectPropertyAssertionAxiom) {
OWLObjectPropertyAssertionAxiom assertion = (OWLObjectPropertyAssertionAxiom) a;
if (assertion.getProperty().isOWLTopObjectProperty()) {
continue;
}
} else if (a instanceof OWLDataPropertyAssertionAxiom) {
OWLDataPropertyAssertionAxiom assertion = (OWLDataPropertyAssertionAxiom) a;
if (assertion.getProperty().isOWLTopDataProperty()) {
continue;
}
}

I've left the default as false. If I change the default to structural, some existing tests will need to be updated. @jamesaoverton what do you think about changing default behavior?

@matentzn
Copy link
Contributor

Awesome Jim, i am cool with the solution. I would add though that the syntactic check you currently have is guaranteed to be sound, so keeping it for the ‘all’ case is safe and may radically improve performance (think of all the crap that will be loaded in a reasoner constructor even if the input is zero; data structures allocated with memory etc; this means that doing this for 100K+ axioms can be very costly). Else: I am excited to have this awesome feature! Thanks!

@matentzn
Copy link
Contributor

Btw your structural code is very good and I have no problems. Smart thinking on equivalent classes test :P

@matentzn
Copy link
Contributor

Sorry correction: since you are reusing one empty reasoner: it’s not the many reasoner problem; problem is that the reasoner is likely to cache some trace of the axiom you are testing; that will grow in memory. I would definitely suggest to do the structural pass first no matter what. And in any case: only do the test for subclass equivalent class, and the three assertion axiom types; ignore all other axiom types!

@balhoff
Copy link
Contributor Author

balhoff commented Aug 22, 2019

@matentzn

Sorry correction: since you are reusing one empty reasoner: it’s not the many reasoner problem; problem is that the reasoner is likely to cache some trace of the axiom you are testing; that will grow in memory. I would definitely suggest to do the structural pass first no matter what. And in any case: only do the test for subclass equivalent class, and the three assertion axiom types; ignore all other axiom types!

My argument against this is that the subclass axioms that get filtered out in the structural pass would be a small percentage of all those checked. All the rest would get sent to HermiT anyway. I can also imagine some tautologies for other axiom types. I value the simplicity and consistency of the HermiT mode simply asking HermiT if the axiom is entailed. :-)

@matentzn
Copy link
Contributor

Well, I like the simplicity as well :) since stacking arguments against one another is futile, I recommend taking monarch.owl as a huge benchmark Ontology and try with or without structural pre-check; that should give us a good sense which of the two is right. My main argument is: the outcome of with or without structural check leads to the exact same answer in all cases; but the syntactic pre check consumes less memory and is faster than the simple is entailed by call. Any chance we can just make a test?

@matentzn
Copy link
Contributor

One last thing: I would really like this operation to be a filter as well, to weed out asserted tautologies; are there any arguments against having the exact same option on remove of filter?

@balhoff
Copy link
Contributor Author

balhoff commented Aug 22, 2019

I just timed it on a file I had convenient (uberon-go-cl-ro.ofn). The structural option adds negligible time over doing nothing. The all option takes 3 or 4 seconds. Doing the structural check first doesn't improve the time (to be clear, in this case I am still checking a lot of other subclass axioms with HermiT—which I think is the right thing to do).

@matentzn
Copy link
Contributor

Convinced. Thanks @balhoff for bearing with me here!

@balhoff
Copy link
Contributor Author

balhoff commented Aug 22, 2019

One last thing: I would really like this operation to be a filter as well, to weed out asserted tautologies; are there any arguments against having the exact same option on remove of filter?

I agree this could be handy, but let's open another ticket if you want to request it. Whoever implements could refactor what I've done here in order to share code.

@jamesaoverton
Copy link
Member

Thanks @balhoff. I've just skimmed this so far but it looks good. Let me know when you want me to review in detail.

One question at this point: Why hardcode HermiT for the tautologyChecker? Would it be better to make a new copy of whatever was specified by the --reasoner option?

@balhoff
Copy link
Contributor Author

balhoff commented Aug 23, 2019

Why hardcode HermiT for the tautologyChecker? Would it be better to make a new copy of whatever was specified by the --reasoner option?

Various reasoners don't support the isEntailed method (e.g. ELK), and HermiT does. Also, with different reasoners there is varying support for semantics of concepts like topObjectProperty. Since HermiT is the most complete OWL DL implementation, and it only has to reason over an ontology with zero axioms, we can use it always.

Please go ahead and review—I won't be making any more changes unless you conclude that you want to change the default mode from false to structural.

@matentzn
Copy link
Contributor

I think the Elk version will just fail if you try to use it (isEntaileBy method not implemented), and since you check against the empty ontology it makes sense to use the complete approach anyways.. so if you wanted to classify with elk and use hermit for tautology removal, it would make it quite complicated I think !

@matentzn
Copy link
Contributor

Whoops Jim beat me.

@jamesaoverton
Copy link
Member

@balhoff We're hoping to make a point release soonish. Other than the minor changes requested, is this ready to merge?

@balhoff
Copy link
Contributor Author

balhoff commented Sep 10, 2019

@jamesaoverton I addressed the change requests, so this is ready for you and @beckyjackson to check.

@jamesaoverton
Copy link
Member

Ok, all @beckyjackson and my comments have been addressed. I'll merge this and we'll make a point release soon.

Thanks @balhoff !

@jamesaoverton jamesaoverton merged commit 7840c72 into ontodev:master Sep 11, 2019
@beckyjackson
Copy link
Contributor

@jamesaoverton @balhoff I just bumped up ROBOT to 1.4.2 for DO in the Makefile. It seems like these changes now logs errors for all axioms with a deprecated subject (including annotation assertion axioms), of which DO has many. The Travis build fails with:

The job exceeded the maximum log length, and has been terminated.

I resolved this by using a custom logging.properties file for the DO build and turning logging off for the ReasonOperation class but this is just masking the problem. Other people may run into the same thing.

I'm sorry I didn't catch this sooner.

@jamesaoverton
Copy link
Member

When using this new option or not? I tested 1.4.2 on OBI and MRO and didn’t notice extra log noise.

@matentzn
Copy link
Contributor

Is this a confirmed issue? I am itching to upgrade, but I want to be sure my travis jobs wont be failing because of this.. Many release pipelines rely on travis+robot now and I cant test them all :/

@jamesaoverton
Copy link
Member

Now I do see the problem with OBI, so it's a confirmed issue.

@jamesaoverton
Copy link
Member

I don't think the problem is caused by this PR. I've opened a new issue: #567

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

reasoner behavior with Thing and Nothing
5 participants