-
Notifications
You must be signed in to change notification settings - Fork 74
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Template Rework #403
Template Rework #403
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code is much better than my original code.
Minor problems: comments inline.
Major problems:
- japicmp says this breaks backwards compatibility
- I think the
>C
and>P
are just wrong. We should only care about the type of the current column (>A comment
,>AI see also
,>AT foo:count^^xsd:integer
, etc.), and the type of the column to the left shouldn't matter. We need tests to cover this.
Please address these and then I'll take another look.
robot-core/src/main/java/org/obolibrary/robot/TemplateOperation.java
Outdated
Show resolved
Hide resolved
robot-core/src/main/java/org/obolibrary/robot/TemplateOperation.java
Outdated
Show resolved
Hide resolved
robot-core/src/main/java/org/obolibrary/robot/template/Template.java
Outdated
Show resolved
Hide resolved
robot-core/src/main/java/org/obolibrary/robot/template/Template.java
Outdated
Show resolved
Hide resolved
robot-core/src/main/java/org/obolibrary/robot/template/Template.java
Outdated
Show resolved
Hide resolved
robot-core/src/main/java/org/obolibrary/robot/template/TemplateHelper.java
Outdated
Show resolved
Hide resolved
|
I'd prefer to switch everything to For backwards compatibility, I guess we could allow |
As rows are parsed, exception messages will be logged as
If there were any exception messages after parsing all rows, the operation will fail. You can override this with The operation will fail immediately if there are any errors with the headers. See #300 |
For many users, I think that it's easier to fail on the first error, then fix and try again. That was the old behaviour. It's better to start fixing errors at the top than the bottom, but by printing all errors to the terminal you'll see the last error first. I can see how advanced users might want all errors at once, but that requires a more complex mental model of the operation. |
I can see how that can be confusing. I find it frustrating when I have a template and I need to keep running it after fixing one thing at a time. Maybe the default behavior, instead of showing all errors, is to fail on the first error. Then we can add another option that lets you print all errors? I'm not sure what to call it though, some ideas (I don't particularly like any of these, though):
Or the behavior could just be if |
I like your final suggestion. |
@ontodev/robot-admin While working on GAZ, I found that this new template rework handles very large template files MUCH better than the old template. It's faster, and doesn't fail if the file is too large. I ran template on a 20.5MB template CSV with this new code, and it processed in 28 seconds. If I run the same command using I think it would be good to get this merged into
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've looked through the code and it's good. I made comments about a few minor things.
The NI
option is not documented, and I'm not sure that I understand it.
The last big thing I would like is an example file / integration test that exercises a wide range of the new features. The inline tables in the Markdown file are good but the aren't integration tests. The current integration tests only show the old way of doing things. We want to keep them -- maybe they could be moved to unit tests. But we need at least one good new example that shows the new "right way" to work with ROBOT templates.
robot-core/src/main/java/org/obolibrary/robot/TemplateHelper.java
Outdated
Show resolved
Hide resolved
This |
It's fine to remove |
OK - I made the requested changes and added a new integration test with the new template strings. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Almost there. A couple more small things.
docs/examples/template.csv
Outdated
ID,A rdfs:label,A IAO:0000115,>A IAO:0000119,AI rdfs:seeAlso,A IAO:0000117,TYPE,SC %,DC %,DOMAIN,RANGE,I weight in kilograms | ||
ex:F344N,F 344/N,An inbred strain of rat used in many scientific investigations.,James A. Overton,http://www.informatics.jax.org/external/festing/rat/docs/F344.shtml,James A. Overton,class,NCBITaxon:10116,,,, | ||
ex:B6C3F1,B6C3F1,An inbred strain of mouse used in many scientific investigations.,James A. Overton,http://jaxmice.jax.org/strain/100010.html,James A. Overton,class,NCBITaxon:10090,F 344/N,F 344/N' or B6C3F1,, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The third row is a class with domain "'F 344/N' or B6C3F1". That can't be right.
docs/examples/template-old.csv
Outdated
@@ -0,0 +1,5 @@ | |||
ID,Label,Definition,Definition Source,See Also,Editor,RDF Type,Class Type,Parent IRI |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renaming this file is fine, but we need the test to run somewhere. Now I think it is not.
🎉 🎂 🎈 !!! |
This PR completely reworks the logic of
template
. Instead of using a staticTemplateOperation
class, it now uses aTemplate
class. This allows for a cleaner way to add more features to template.This rework also supports new template strings. I've included the documentation and examples below. I apologize in advance about the length 😅 TLDR:
object property
can be used instead ofowl:ObjectProperty
@ontodev/robot-team This is an experimental update, so I would appreciate tests from anyone. The old template strings should produce the same results as before, even though the logic is different.
Property Template Strings
PROPERTY_TYPE
: ROBOT creates a property for each row of data that has aTYPE
of either an object or data property. The property type can be (any type followed by a * can ONLY be used for object properties):subproperty
)subproperty
: the created property will be a subproperty of each templated property expression (default)equivalent
: the created property will be equivalent to all of the templated property expressionsdisjoint
: the created property will be disjoint from each templated property expression and the values cannot be the sameinverse
*: the created object property will be the inverse of each templated property expressionfunctional
: the created property will be functional, meaning each entity (subject) can have at most one valueinverse functional
*: the created object property will be inverse functional, meaning each value can have at most one subjectirreflexive
*: the created object property will be irreflexive, meaning the subject cannot also be the valuereflexive
*: the created object property will be reflexive, meaning each subject is also a valuesymmetric
*: the created object property will be symmetric, meaning the subject and value can be reversedasymmetric
*: the created object property will be asymmetric, meaning the subject and value cannot be reversedtransitive
*: the created object property will be transitive, meaning the property can be chainedP
property expression: If the template string starts with aP
and a space then it will be interpreted as a property expression. The value of the current cell will be substituted into the template, replacing all occurrences of the%
character. Then the result will be parsed into an OWL property expression. ROBOT uses the same syntax for property expressions as Protégé: Manchester Syntax. If it does not recognize a name, ROBOT will assume that you're trying to refer to an entity by its IRI or CURIE. This can lead to unexpected behavior, but it allows you to refer to entities without loading them into the input ontology.P inverse(%)
. A single object property for a value can be specified byP %
.P %
.P %
.DOMAIN
: The domain to a property is a class expression in Manchester Syntax (for object and data properties). For annotation properties, the domain must be a single class specified by label, CURIE, or IRI.RANGE
: The range to a property is either a class expression in Manchester Syntax (for object properties) or the name, CURIE, or IRI of a datatype (for annotation and data properties).Example of Property Template Strings
The
functional
data property will still default to asubproperty
logical axiom for theP %
template string, unless a different logical property type (equivalent
,disjoint
) is provided. Property type can be split, e.g.PROPERTY_TYPE SPLIT=|
.Individual Template Strings
INDIVIDUAL_TYPE
: ROBOT creates an individual for each or of data that has aTYPE
of another class. The individual type can be:named
: the created individual will be a default named individual. When theINDIVIDUAL_TYPE
is left blank, this is the default. This should be used when adding object property or data property assertionssame
: the created individual will be asserted to be the same individual as each templated individual in the rowdifferent
: the created individual will be asserted to be a different individual than any of the templated individuals in the rowI
individual assertion:I property %
: when creating anamed
individual, replace property with an object property or data property to add assertions. The%
will be replaced by the template cell value or values. For object property assertions, this is another individual. For data property assertions, this is a literal value.I %
: when creating asame
ordifferent
individual, this template string is used to specify which individual will be the value of the same or different individual axiom.Example of Individual Template Strings
Datatype Template Strings
Datatypes can currently be added (
TYPE
=owl:Datatype
), but I have not yet added support for datatype definitions. I'm still figuring out the best way to do this that wouldn't make the template strings too complex. There's nothing in the docs for this yet. Suggestions would be appreciated.TYPE declarations
TYPE
: this is therdf:type
for the row. Because ROBOT is focused on ontology development, the default value isowl:Class
and this column is optional. When creating an OWLIndividual, specify the class to which it belongs in this column.class
orowl:Class
object property
orowl:ObjectProperty
data property
orowl:DataProperty
annotation property
orowl:AnnotationProperty
datatype
orowl:Datatype