robot outputs invalid turtle syntax #1129

Closed
balhoff opened this issue Jul 7, 2023 · 22 comments · Fixed by #1135

@balhoff
Contributor

balhoff commented Jul 7, 2023

This may be a problem in OWL API, but just tracking here for now. Run these commands on the latest MONDO release:

curl -L -O 'http://purl.obolibrary.org/obo/mondo/releases/2023-07-03/mondo.owl'
riot --validate mondo.owl
# outputs some warnings about URN format in SWRL variables, but not fatal
robot convert -i mondo.owl -o mondo.ttl
riot --validate mondo.ttl
# 15:17:40 ERROR riot            :: [line: 212732, col: 100] Unrecognized (expected an RDF Term): [SEMICOLON]

The problem section of the file looks like this:

###  http://purl.obolibrary.org/obo/MONDO_0000290
obo:MONDO_0000290 rdf:type owl:Class ;
                  owl:equivalentClass [ owl:intersectionOf ( obo:MONDO_0005550
                                                             _:genid24281 rdf:type owl:Restriction ;
                                                                          owl:onProperty obo:RO_0014001 ;
                                                                          owl:someValuesFrom obo:NCBITaxon_5763
                                                           ) ;
                                      rdf:type owl:Class
                                      ] ;
rdfs:subClassOf obo:MONDO_0002428 ,
                obo:MONDO_0020067 ,
                _:genid24281 ;
obo:IAO_0000115 "A infectious disease involving the Naegleria fowleri." ;
terms:conformsTo <http://purl.obolibrary.org/obo/mondo/patterns/infectious_disease_by_agent.yaml> ;
oboInOwl:hasDbXref "DOID:0050242" ,
                   "GARD:0009554" ,
                   "MESH:C535275" ,
                   "SCTID:721816008" ,
                   "UMLS:C0300934" ,
                   "UMLS:C4303098" ;
oboInOwl:hasExactSynonym "Naegleria fowleri infection" ;
oboInOwl:hasRelatedSynonym "infections, Naegleria fowleri" ;
oboInOwl:id "MONDO:0000290" ;
oboInOwl:inSubset mondo:mondo_rare ,
                  mondo:rare ;
rdfs:label "primary amebic meningoencephalitis" ;
skos:exactMatch <http://identifiers.org/mesh/C535275> ,
                <http://identifiers.org/snomedct/721816008> ,
                <http://linkedlifedata.com/resource/umls/id/C0300934> ,
                <http://linkedlifedata.com/resource/umls/id/C4303098> ,
                obo:DOID_0050242 .

_:genid24281 rdf:type owl:Restriction ;
              owl:onProperty obo:RO_0014001 ;
              owl:someValuesFrom obo:NCBITaxon_5763 .

In the intersection list you would expect the existential restriction to be enclosed in [ ], but instead there is a blank node ID, _:genid24281, followed by inline properties, which is invalid. The same blank node ID is then used twice more below, which it should not be (OWL is supposed to use a fresh blank node for each reference to a class expression). That reuse is valid Turtle, but it might point to the source of the error.
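
A quick way to see how widespread this pattern is in a converted file is to grep for it. This is only a heuristic sketch: it assumes the layout shown above, where top-level subjects start in column 0, so an indented blank node label followed directly by a predicate can only be one of these malformed inline descriptions. The function name and the sample file are made up for illustration.

```shell
#!/bin/sh
# Heuristic: print lines where an indented blank node label is followed
# directly by a predicate-like token (a prefixed name or an <IRI>).
# Assumes top-level subjects start in column 0, as in the output above.
find_inline_labelled_bnodes() {
    grep -nE '^[[:space:]]+_:[A-Za-z0-9_]+[[:space:]]+[A-Za-z<]' "$1"
}

# Small sample mimicking the problem section (not the real MONDO file).
sample=$(mktemp)
cat > "$sample" <<'EOF'
obo:MONDO_0000290 rdf:type owl:Class ;
                  owl:equivalentClass [ owl:intersectionOf ( obo:MONDO_0005550
                                                             _:genid24281 rdf:type owl:Restriction ;
                                                                          owl:onProperty obo:RO_0014001 ;
                                                                          owl:someValuesFrom obo:NCBITaxon_5763
                                                           ) ;
                                      rdf:type owl:Class
                                      ] ;
rdfs:subClassOf obo:MONDO_0002428 ,
                _:genid24281 .

_:genid24281 rdf:type owl:Restriction ;
              owl:onProperty obo:RO_0014001 ;
              owl:someValuesFrom obo:NCBITaxon_5763 .
EOF

# Only the inline description inside the list should be reported.
hits=$(find_inline_labelled_bnodes "$sample")
echo "$hits"
rm -f "$sample"
```

On a real file this would be called as `find_inline_labelled_bnodes mondo.ttl`. A validator such as riot remains the authoritative check; the grep only helps to locate and count the affected classes.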

Based on the type of expression, it's possible this is somehow related to robot relax.

@balhoff balhoff added the bug label Jul 7, 2023
@balhoff
Contributor Author

balhoff commented Jul 7, 2023

The input RDF/XML has the same weird problems with blank node IDs:

<!-- http://purl.obolibrary.org/obo/MONDO_0000290 -->

    <owl:Class rdf:about="http://purl.obolibrary.org/obo/MONDO_0000290">
        <owl:equivalentClass>
            <owl:Class>
                <owl:intersectionOf rdf:parseType="Collection">
                    <rdf:Description rdf:about="http://purl.obolibrary.org/obo/MONDO_0005550"/>
                    <owl:Restriction rdf:nodeID="genid24282">
                        <owl:onProperty rdf:resource="http://purl.obolibrary.org/obo/RO_0014001"/>
                        <owl:someValuesFrom rdf:resource="http://purl.obolibrary.org/obo/NCBITaxon_5763"/>
                    </owl:Restriction>
                </owl:intersectionOf>
            </owl:Class>
        </owl:equivalentClass>
        <rdfs:subClassOf rdf:resource="http://purl.obolibrary.org/obo/MONDO_0002428"/>
        <rdfs:subClassOf rdf:resource="http://purl.obolibrary.org/obo/MONDO_0020067"/>
        <rdfs:subClassOf rdf:nodeID="genid24282"/>
        <obo:IAO_0000115>A infectious disease involving the Naegleria fowleri.</obo:IAO_0000115>
        <terms:conformsTo rdf:resource="http://purl.obolibrary.org/obo/mondo/patterns/infectious_disease_by_agent.yaml"/>
        <oboInOwl:hasDbXref>DOID:0050242</oboInOwl:hasDbXref>
        <oboInOwl:hasDbXref>GARD:0009554</oboInOwl:hasDbXref>
        <oboInOwl:hasDbXref>MESH:C535275</oboInOwl:hasDbXref>
        <oboInOwl:hasDbXref>SCTID:721816008</oboInOwl:hasDbXref>
        <oboInOwl:hasDbXref>UMLS:C0300934</oboInOwl:hasDbXref>
        <oboInOwl:hasDbXref>UMLS:C4303098</oboInOwl:hasDbXref>
        <oboInOwl:hasExactSynonym>Naegleria fowleri infection</oboInOwl:hasExactSynonym>
        <oboInOwl:hasRelatedSynonym>infections, Naegleria fowleri</oboInOwl:hasRelatedSynonym>
        <oboInOwl:id>MONDO:0000290</oboInOwl:id>
        <oboInOwl:inSubset rdf:resource="http://purl.obolibrary.org/obo/mondo#mondo_rare"/>
        <oboInOwl:inSubset rdf:resource="http://purl.obolibrary.org/obo/mondo#rare"/>
        <rdfs:label>primary amebic meningoencephalitis</rdfs:label>
        <skos:exactMatch rdf:resource="http://identifiers.org/mesh/C535275"/>
        <skos:exactMatch rdf:resource="http://identifiers.org/snomedct/721816008"/>
        <skos:exactMatch rdf:resource="http://linkedlifedata.com/resource/umls/id/C0300934"/>
        <skos:exactMatch rdf:resource="http://linkedlifedata.com/resource/umls/id/C4303098"/>
        <skos:exactMatch rdf:resource="http://purl.obolibrary.org/obo/DOID_0050242"/>
    </owl:Class>
    <owl:Restriction rdf:nodeID="genid24282">
        <owl:onProperty rdf:resource="http://purl.obolibrary.org/obo/RO_0014001"/>
        <owl:someValuesFrom rdf:resource="http://purl.obolibrary.org/obo/NCBITaxon_5763"/>
    </owl:Restriction>

@balhoff
Contributor Author

balhoff commented Jul 7, 2023

Comparing a section from go.owl, I don't know why MONDO includes those node IDs (maybe this is a problem with the MONDO pipeline, @matentzn?). GO doesn't specify node IDs in RDF/XML:

<!-- http://purl.obolibrary.org/obo/GO_0000019 -->

    <owl:Class rdf:about="http://purl.obolibrary.org/obo/GO_0000019">
        <owl:equivalentClass>
            <owl:Class>
                <owl:intersectionOf rdf:parseType="Collection">
                    <rdf:Description rdf:about="http://purl.obolibrary.org/obo/GO_0065007"/>
                    <owl:Restriction>
                        <owl:onProperty rdf:resource="http://purl.obolibrary.org/obo/RO_0002211"/>
                        <owl:someValuesFrom rdf:resource="http://purl.obolibrary.org/obo/GO_0006312"/>
                    </owl:Restriction>
                </owl:intersectionOf>
            </owl:Class>
        </owl:equivalentClass>
        <rdfs:subClassOf rdf:resource="http://purl.obolibrary.org/obo/GO_0000018"/>
        <rdfs:subClassOf>
            <owl:Restriction>
                <owl:onProperty rdf:resource="http://purl.obolibrary.org/obo/RO_0002211"/>
                <owl:someValuesFrom rdf:resource="http://purl.obolibrary.org/obo/GO_0006312"/>
            </owl:Restriction>
        </rdfs:subClassOf>
        <obo1:IAO_0000115>Any process that modulates the frequency, rate or extent of DNA recombination during mitosis.</obo1:IAO_0000115>
        <oboInOwl:hasNarrowSynonym>regulation of recombination within rDNA repeats</oboInOwl:hasNarrowSynonym>
        <oboInOwl:hasOBONamespace>biological_process</oboInOwl:hasOBONamespace>
        <oboInOwl:id>GO:0000019</oboInOwl:id>
        <rdfs:label>regulation of mitotic recombination</rdfs:label>
    </owl:Class>
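
To compare files such as the MONDO and GO releases above, one option is to count the `rdf:nodeID` attributes in the RDF/XML. The following is a rough sketch; the function name and the cut-down sample are illustrative only, and `grep -o` is assumed to be available (it is in GNU and BSD grep).

```shell
#!/bin/sh
# Count how often each rdf:nodeID value occurs, most frequent first.
# An ID that shows up more than once is declared or referenced in
# several places in the file.
count_node_ids() {
    grep -oE 'rdf:nodeID="[^"]+"' "$1" | sort | uniq -c | sort -rn
}

# Cut-down sample shaped like the MONDO snippet above (not the real file).
sample=$(mktemp)
cat > "$sample" <<'EOF'
<owl:Class rdf:about="http://purl.obolibrary.org/obo/MONDO_0000290">
    <owl:equivalentClass>
        <owl:Class>
            <owl:intersectionOf rdf:parseType="Collection">
                <rdf:Description rdf:about="http://purl.obolibrary.org/obo/MONDO_0005550"/>
                <owl:Restriction rdf:nodeID="genid24282"/>
            </owl:intersectionOf>
        </owl:Class>
    </owl:equivalentClass>
    <rdfs:subClassOf rdf:nodeID="genid24282"/>
</owl:Class>
<owl:Restriction rdf:nodeID="genid24282"/>
EOF

counts=$(count_node_ids "$sample")
echo "$counts"
rm -f "$sample"
```

A file serialized like the GO snippet, with no explicit node IDs, would print nothing here.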

@matentzn
Contributor

matentzn commented Jul 7, 2023

I don't understand this ticket well, but some observations:

  1. https://www.w3.org/RDF/Validator/rdfval thinks the RDFXML is valid
  2. The Turtle is indeed nonsensical. It is:
obo:MONDO_0000290 rdf:type owl:Class ;
                  owl:equivalentClass [ owl:intersectionOf ( obo:MONDO_0005550 
                                                             _:genid3 rdf:type owl:Restriction ;
                                                                      owl:onProperty obo:RO_0014001 ;
                                                                      owl:someValuesFrom obo:NCBITaxon_5763
                                                           ) ;
                                      rdf:type owl:Class
                                      ] ;

and should be:

obo:MONDO_0000290 rdf:type owl:Class ;
                  owl:equivalentClass [ owl:intersectionOf ( obo:MONDO_0005550 
                                                             _:genid3 
                                                           ) ;
                                      rdf:type owl:Class
                                      ] ;

(This part needs to go:)

rdf:type owl:Restriction ;
                                                                      owl:onProperty obo:RO_0014001 ;
                                                                      owl:someValuesFrom obo:NCBITaxon_5763

It's not about the genids; these are fine. The problem is the Turtle serialiser, which produces this nonsensical blank node expansion inside the expression. Sounds like an OWL API bug.

@matentzn
Contributor

matentzn commented Jul 7, 2023

This is still fine imo (@balhoff I don't see why this is wrong):

<?xml version="1.0"?>
<rdf:RDF xmlns="http://purl.obolibrary.org/obo/mondo.owl#"
     xml:base="http://purl.obolibrary.org/obo/mondo.owl"
     xmlns:obo="http://purl.obolibrary.org/obo/"
     xmlns:owl="http://www.w3.org/2002/07/owl#"
     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
     xmlns:xml="http://www.w3.org/XML/1998/namespace"
     xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
     xmlns:foaf="http://xmlns.com/foaf/0.1/"
     xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
     xmlns:skos="http://www.w3.org/2004/02/skos/core#"
     xmlns:mondo="http://purl.obolibrary.org/obo/mondo#">
    <owl:Ontology rdf:about="http://purl.obolibrary.org/obo/mondo.owl">
        <owl:versionIRI rdf:resource="http://purl.obolibrary.org/obo/mondo/releases/2023-07-03/mondo.owl"/>
    </owl:Ontology>
    
    <owl:Class rdf:about="http://purl.obolibrary.org/obo/MONDO_0000290">
        <owl:equivalentClass>
            <owl:Class>
                <owl:intersectionOf rdf:parseType="Collection">
                    <rdf:Description rdf:about="http://purl.obolibrary.org/obo/MONDO_0005550"/>
                    <owl:Restriction rdf:nodeID="genid24282">
                        <owl:onProperty rdf:resource="http://purl.obolibrary.org/obo/RO_0014001"/>
                        <owl:someValuesFrom rdf:resource="http://purl.obolibrary.org/obo/NCBITaxon_5763"/>
                    </owl:Restriction>
                </owl:intersectionOf>
            </owl:Class>
        </owl:equivalentClass>
        <rdfs:subClassOf rdf:nodeID="genid24282"/>
        <rdfs:label>primary amebic meningoencephalitis</rdfs:label>
    </owl:Class>
    <owl:Restriction rdf:nodeID="genid24282">
        <owl:onProperty rdf:resource="http://purl.obolibrary.org/obo/RO_0014001"/>
        <owl:someValuesFrom rdf:resource="http://purl.obolibrary.org/obo/NCBITaxon_5763"/>
    </owl:Restriction>
</rdf:RDF>

But adding this:

    <owl:Axiom>
        <owl:annotatedSource rdf:resource="http://purl.obolibrary.org/obo/MONDO_0000290"/>
        <owl:annotatedProperty rdf:resource="http://www.w3.org/2000/01/rdf-schema#subClassOf"/>
        <owl:annotatedTarget rdf:nodeID="genid24282"/>
        <mondo:source>MONDO:Wikidata</mondo:source>
    </owl:Axiom>

makes the serialiser break.

@matentzn
Contributor

matentzn commented Jul 8, 2023

I thought I could at least fix it in Mondo by removing the axiom annotations, but there are too many of these.

@balhoff
Contributor Author

balhoff commented Jul 9, 2023

@matentzn your first example is not valid, since the equivalent class axiom and the subclass axiom should not share a blank node representing the existential restriction. The spec says:

In the mapping, each generated blank node (i.e., each blank node that does not correspond to an anonymous individual) is fresh in each application of a mapping rule.

When there is an annotated axiom, in RDF serializations the core axiom is represented by triples, as well as a reified version to which the annotations are attached. The OWL API changed at some point in how it handles anonymous expressions in the core axiom and in the reified version. In this input both the equivalent class axiom and the subclass axiom contain the expression r some B:

Ontology(<http://example.org/>
Declaration(Class(<http://example.org/A>))
Declaration(Class(<http://example.org/B>))
Declaration(Class(<http://example.org/C>))
Declaration(ObjectProperty(<http://example.org/r>))
EquivalentClasses(<http://example.org/A> ObjectIntersectionOf(<http://example.org/C> ObjectSomeValuesFrom(<http://example.org/r> <http://example.org/B>)))
SubClassOf(Annotation(rdfs:comment "This axiom is annotated.") <http://example.org/A> ObjectSomeValuesFrom(<http://example.org/r> <http://example.org/B>))
)

An older version of ROBOT (I tested 1.8.1) translates to this Turtle (robot convert -i test.ofn -o test.ttl):

@prefix : <http://example.org/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
<http://example.org/> rdf:type owl:Ontology .
:r rdf:type owl:ObjectProperty .
:A rdf:type owl:Class ;
   owl:equivalentClass [ owl:intersectionOf ( :C
                                              [ rdf:type owl:Restriction ;
                                                owl:onProperty :r ;
                                                owl:someValuesFrom :B
                                              ]
                                            ) ;
                         rdf:type owl:Class
                       ] ;
   rdfs:subClassOf [ rdf:type owl:Restriction ;
                     owl:onProperty :r ;
                     owl:someValuesFrom :B
                   ] .
[ rdf:type owl:Axiom ;
   owl:annotatedSource :A ;
   owl:annotatedProperty rdfs:subClassOf ;
   owl:annotatedTarget [ rdf:type owl:Restriction ;
                         owl:onProperty :r ;
                         owl:someValuesFrom :B
                       ] ;
   rdfs:comment "This axiom is annotated."
 ] .
:B rdf:type owl:Class .
:C rdf:type owl:Class .

A fresh blank node is used every time r some B is expressed:

[ rdf:type owl:Restriction ;
  owl:onProperty :r ;
  owl:someValuesFrom :B
]

But in ROBOT 1.9.4, the same conversion produces this Turtle:

@prefix : <http://example.org/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
<http://example.org/> rdf:type owl:Ontology .
:r rdf:type owl:ObjectProperty .
:A rdf:type owl:Class ;
   owl:equivalentClass [ owl:intersectionOf ( :C
                                              [ rdf:type owl:Restriction ;
                                                owl:onProperty :r ;
                                                owl:someValuesFrom :B
                                              ]
                                            ) ;
                         rdf:type owl:Class
                       ] ;
   rdfs:subClassOf _:genid5 .

_:genid5 rdf:type owl:Restriction ;
          owl:onProperty :r ;
          owl:someValuesFrom :B .

[ rdf:type owl:Axiom ;
   owl:annotatedSource :A ;
   owl:annotatedProperty rdfs:subClassOf ;
   owl:annotatedTarget _:genid5 ;
   rdfs:comment "This axiom is annotated."
 ] .
:B rdf:type owl:Class .
:C rdf:type owl:Class .

The blank node used for the expression in the subclass axiom is given an identifier so that it can be referenced in the reified annotated axiom. There was a long discussion of this in owlcs/owlapi#874. I agree with @ignazio1977 that the spec is really unclear about how to handle this, but in my interpretation I would prefer the previous approach.

The problem in this ticket is happening somehow within robot relax. The same blank node id for the class expression gets used in both the equivalent class axiom and in the subclass axiom, which I think is certainly incorrect with regard to the OWL-to-RDF spec. You can see this happening when you go from the original file to RDF/XML (robot relax -i test.ofn -o test-relaxed.owl); see genid3 in both axioms:

<?xml version="1.0"?>
<rdf:RDF xmlns="http://example.org/"
     xmlns:owl="http://www.w3.org/2002/07/owl#"
     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
     xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
    <owl:Ontology rdf:about="http://example.org/"/>
    <owl:ObjectProperty rdf:about="http://example.org/r"/>
    <owl:Class rdf:about="http://example.org/A">
        <owl:equivalentClass>
            <owl:Class>
                <owl:intersectionOf rdf:parseType="Collection">
                    <rdf:Description rdf:about="http://example.org/C"/>
                    <owl:Restriction rdf:nodeID="genid3">
                        <owl:onProperty rdf:resource="http://example.org/r"/>
                        <owl:someValuesFrom rdf:resource="http://example.org/B"/>
                    </owl:Restriction>
                </owl:intersectionOf>
            </owl:Class>
        </owl:equivalentClass>
        <rdfs:subClassOf rdf:resource="http://example.org/C"/>
        <rdfs:subClassOf rdf:nodeID="genid3"/>
    </owl:Class>
    <owl:Restriction rdf:nodeID="genid3">
        <owl:onProperty rdf:resource="http://example.org/r"/>
        <owl:someValuesFrom rdf:resource="http://example.org/B"/>
    </owl:Restriction>
    <owl:Axiom>
        <owl:annotatedSource rdf:resource="http://example.org/A"/>
        <owl:annotatedProperty rdf:resource="http://www.w3.org/2000/01/rdf-schema#subClassOf"/>
        <owl:annotatedTarget rdf:nodeID="genid3"/>
        <rdfs:comment>This axiom is annotated.</rdfs:comment>
    </owl:Axiom>
    <owl:Class rdf:about="http://example.org/B"/>
    <owl:Class rdf:about="http://example.org/C"/>
</rdf:RDF>

The RDF/XML serializer outputs valid XML, but the Turtle serializer gets confused and outputs invalid Turtle (robot relax -i test.ofn -o test-relaxed.ttl); here the intersection list is not parseable at all, which is the issue I ran into:

@prefix : <http://example.org/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
<http://example.org/> rdf:type owl:Ontology .
:r rdf:type owl:ObjectProperty .
:A rdf:type owl:Class ;
   owl:equivalentClass [ owl:intersectionOf ( :C
                                              _:genid3 rdf:type owl:Restriction ;
                                                       owl:onProperty :r ;
                                                       owl:someValuesFrom :B
                                            ) ;
                       rdf:type owl:Class
                       ] ;
rdfs:subClassOf :C ,
                _:genid3 .
_:genid3 rdf:type owl:Restriction ;
          owl:onProperty :r ;
          owl:someValuesFrom :B .
[ rdf:type owl:Axiom ;
   owl:annotatedSource :A ;
   owl:annotatedProperty rdfs:subClassOf ;
   owl:annotatedTarget _:genid3 ;
   rdfs:comment "This axiom is annotated."
 ] .
:B rdf:type owl:Class .
:C rdf:type owl:Class .

So I think there are at least two bugs:

  • robot relax somehow causes the OWL API to reuse a blank node ID for separate usages of a class expression when converting to RDF
  • the Turtle serializer produces malformed output when a blank node ID is used in the above (incorrect) way

I also personally think OWL API should revert to the previous approach using all fresh blank nodes, but I'm not 100% sure about this.

@matentzn
Contributor

matentzn commented Jul 9, 2023

This is not great. The main reason for moving to the latest OWL API was serialisation stability. How blank node IDs are named was a significant part of serialisation stability, if I remember correctly.

@ignazio1977
Contributor

I'll need to wrap my head around what's happening here. Fresh blank node id or reuse of previous id is something that can be controlled now (option was introduced a while ago to reuse existing id), but it's been a long time since I looked at that.

"Blank nodes, or, why is OWL not used everywhere yet?"

@matentzn
Contributor

Thank you @ignazio1977, really appreciated. This issue is extremely important for us, and I feel a lot less scared now that you have joined the debugging party!

@balhoff
Contributor Author

balhoff commented Jul 11, 2023

I can confirm this doesn't have anything particular to do with robot relax, except that it triggers a bug in OWL API when the same object for a class expression is used in two axioms and there is an axiom annotation involved. The malformed turtle output can be produced running this with scala-cli:

//> using scala 2.13
//> using dep "net.sourceforge.owlapi:owlapi-distribution:4.5.25"

import org.semanticweb.owlapi.model._
import org.semanticweb.owlapi.apibinding.OWLManager
import org.semanticweb.owlapi.formats.TurtleDocumentFormat
import org.semanticweb.owlapi.formats.RioTurtleDocumentFormat
import scala.jdk.CollectionConverters._
import java.io.File

val factory = OWLManager.getOWLDataFactory()
val manager = OWLManager.createOWLOntologyManager()
val r = factory.getOWLObjectProperty(IRI.create("http://example.org/r"))
val A = factory.getOWLClass(IRI.create("http://example.org/A"))
val B = factory.getOWLClass(IRI.create("http://example.org/B"))
val C = factory.getOWLClass(IRI.create("http://example.org/C"))
// One class expression object (r some B), shared by both axioms below
val restriction = factory.getOWLObjectSomeValuesFrom(r, B)
// A EquivalentTo (C and (r some B))
val equiv = factory.getOWLEquivalentClassesAxiom(A, factory.getOWLObjectIntersectionOf(C, restriction))
// A SubClassOf (r some B), with an axiom annotation (which triggers reification in RDF)
val subClassOf = factory.getOWLSubClassOfAxiom(
    A, restriction,
    Set(factory.getOWLAnnotation(factory.getRDFSComment(), factory.getOWLLiteral("comment"))).asJava)
val ontology = manager.createOntology(Set[OWLAxiom](equiv, subClassOf).asJava)
// Save the same ontology with the OWL API Turtle writer and with the RIO Turtle writer
manager.saveOntology(ontology, new TurtleDocumentFormat(), IRI.create(new File("test.ttl")))
manager.saveOntology(ontology, new RioTurtleDocumentFormat(), IRI.create(new File("test-rio.ttl")))

The RIO document format produces valid turtle, the other does not. So I need to open issues at OWL API.

@ignazio1977
Contributor

same object for a class expression is used in two axioms and there is an axiom annotation involved.

This rings a bell. There was a bug to do with exactly that: sharing of nodes is fine in memory but not in text syntaxes, where anonymous classes can't be reused (I say can't, probably 'shouldn't' is a better word) and the objects need to be duplicated. One symptom of this issue is ending up with spare RDF triples.

Annotations on axioms present the same problem when objects are reused.

I'm sure there was a fix for this - possibly two fixes at different times. But I'm not sure if the two were sorted and tested together.

Thanks for the code, saves me a job.

@ignazio1977
Contributor

Sticking the code in an OWL API test, it turns out this fails only with the Turtle syntax; the other syntaxes are fine.

@balhoff
Contributor Author

balhoff commented Jul 13, 2023

@ignazio1977 thanks for looking at it! Yes, the turtle writer is the only one that fails to output a valid file. But I think it is also incorrect that the equivalence axiom and the subclass axiom share blank nodes for the class expression (in all the RDF syntaxes), even if it is syntactically valid RDF.

@balhoff
Contributor Author

balhoff commented Jul 13, 2023

@ignazio1977 you said:

Fresh blank node id or reuse of previous id is something that can be controlled now (option was introduced a while ago to reuse existing id)

How can this be controlled? Is there any documentation for this?

@ignazio1977
Contributor

I think it is also incorrect that the equivalence axiom and the subclass axiom share blank nodes for the class expression (in all the RDF syntaxes), even if it is syntactically valid RDF.

Normally they wouldn't, but annotations make everything worse.

@ignazio1977
Contributor

So, the shared node is referenced in three places in the output - two axioms, one of them with annotations. The annotation triggers reification, which is where the third reference appears.

The ontology looks like this:

@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix xml: <http://www.w3.org/XML/1998/namespace> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@base <http://www.w3.org/2002/07/owl#> .
[ rdf:type owl:Ontology ] .
<http://www.semanticweb.org/owlapi/test#r> rdf:type owl:ObjectProperty .
<http://www.semanticweb.org/owlapi/test#A> rdf:type owl:Class ;
    owl:equivalentClass [ owl:intersectionOf ( <http://www.semanticweb.org/owlapi/test#C>
      1)   _:genid4 rdf:type owl:Restriction ;
      2)      owl:onProperty <http://www.semanticweb.org/owlapi/test#r> ;
      3)      owl:someValuesFrom <http://www.semanticweb.org/owlapi/test#B>
        ) ;
        rdf:type owl:Class
        ] ;
rdfs:subClassOf _:genid4 .
_:genid4 rdf:type owl:Restriction ; owl:onProperty <http://www.semanticweb.org/owlapi/test#r> ; owl:someValuesFrom <http://www.semanticweb.org/owlapi/test#B> .
[ rdf:type owl:Axiom ;
   owl:annotatedSource <http://www.semanticweb.org/owlapi/test#A> ;
   owl:annotatedProperty rdfs:subClassOf ;
   owl:annotatedTarget _:genid4 ;
   rdfs:comment "comment"
 ] .
<http://www.semanticweb.org/owlapi/test#B> rdf:type owl:Class .
<http://www.semanticweb.org/owlapi/test#C> rdf:type owl:Class .

The node triples are outputted twice - same in turtle as in rdf/xml.
This is redundant but, at the RDF level, it's just a repetition of the same triples (setting aside for a moment the impact on serialized histories).

It shouldn't make a difference to the parser. In fact, the rdf/xml parser copes with it. However, the Turtle parser doesn't like it - it doesn't expect an inline description with an id. Hard to change, as it's a JavaCC generated parser.

However, the solution needs to be that the extra triples aren't outputted. The fact that they're outputted twice, and not three times, suggests that the mechanism for deciding if the id gets generated and the one for deciding if the triples are outputted are getting tangled.

@ignazio1977
Contributor

//> using scala 2.13
//> using dep "net.sourceforge.owlapi:owlapi-distribution:4.5.25"

import org.semanticweb.owlapi.model._
import org.semanticweb.owlapi.apibinding.OWLManager
import org.semanticweb.owlapi.formats.TurtleDocumentFormat
import org.semanticweb.owlapi.formats.RioTurtleDocumentFormat
import scala.jdk.CollectionConverters._
import java.io.File

val factory = OWLManager.getOWLDataFactory()
val manager = OWLManager.createOWLOntologyManager()
val r = factory.getOWLObjectProperty(IRI.create("http://example.org/r"))
val A = factory.getOWLClass(IRI.create("http://example.org/A"))
val B = factory.getOWLClass(IRI.create("http://example.org/B"))
val C = factory.getOWLClass(IRI.create("http://example.org/C"))
val restriction = factory.getOWLObjectSomeValuesFrom(r, B)
val equiv = factory.getOWLEquivalentClassesAxiom(A, factory.getOWLObjectIntersectionOf(C, restriction))
val subClassOf = factory.getOWLSubClassOfAxiom(
    A, restriction,
    Set(factory.getOWLAnnotation(factory.getRDFSComment(), factory.getOWLLiteral("comment"))).asJava)
val ontology = manager.createOntology(Set[OWLAxiom](equiv, subClassOf).asJava)
manager.saveOntology(ontology, new TurtleDocumentFormat(), IRI.create(new File("test.ttl")))
manager.saveOntology(ontology, new RioTurtleDocumentFormat(), IRI.create(new File("test-rio.ttl")))

There are two options in ConfigurationOptions that control settings related to blank node ids (changing the values is described in the javadoc for this class).

// Save options
/** True if ids for blank nodes should always be written (axioms and anonymous individuals only). */
SAVE_IDS                            (Boolean.FALSE),
/** True if all anonymous individuals should have their ids remapped after parsing. */
REMAP_IDS                           (Boolean.TRUE),

I think REMAP_IDS might be of use for ROBOT in some circumstances. If it's set to false, blank node ids after parsing will have the same values they had when the ontology was saved (if they were written out, of course). So, if an ontology is saved with all ids written out, and parsed without remapping the ids, you get all blank nodes with the same ids they had in their previous life; if only some nodes had their ids written out, those nodes will have the same ids they originally had.

This can have side effects if there happens to be a clash between blank nodes in the ontology and blank nodes in imported ontologies. That's (one of) the reasons for remapping on parse.

@ignazio1977
Contributor

The problem was not the desharing of nodes; rather, when a node is referred to in multiple places but should be output only once, the renderer 'defers' it. The node in question, _:genid4, was being deshared and deferred correctly. However, the renderer was not checking for deferment when rendering list objects, i.e., the list of arguments to the intersectionOf expression.

So, we had a test covering this already, but it didn't cover shared nodes as part of lists.

Same issue in RDF/XML, however the same fix doesn't seem to work. The two renderers are almost structurally identical. Almost.

@ignazio1977
Contributor

Fix pushed to version4 branch, let me know if you can test on your side as is or if I should make a release candidate for 4.5.26

@ignazio1977
Contributor

4.5.26 released

@balhoff
Contributor Author

balhoff commented Jul 18, 2023

Thanks @ignazio1977! I was a bit too slow—I'll give that version a try.

@balhoff
Contributor Author

balhoff commented Aug 1, 2023

OWL API 4.5.26 is outputting valid Turtle for me; thanks @ignazio1977.

So the fix for this ROBOT issue would be to upgrade our dependency to 4.5.26.
