Oort Extracting Modules

Shahim Essaid edited this page Feb 2, 2015 · 1 revision

Extracting modules from external ontologies

Introduction

This page describes how to extract subsets (modules) of external ontologies with OWLTools, which uses the OWLAPI SyntacticLocalityModuleExtractor.

This functionality is not yet available in Oort; owltools must be run from the command line.

Extracting a module for a single class

To fetch a class (or classes) and all descendants:

owltools http://purl.obolibrary.org/obo/cl.owl --extract-module -d CL:0000540 -o neuron.owl

Examples

Bootstrapping

Assume as a starting point you have an ontology my-edit.owl (or in obo format). Assume this has axioms that point to an ontology 'foo':

E.g.

Ontology: my.owl

Class: FOO:1

Class: MY:1
  SubClassOf: part_of some FOO:1

Note that external classes must be declared if they are not imported.

In obo format you may have something like:

ontology: my

[Term]
id: MY:1
relationship: part_of FOO:1

In obo-format parlance, this is known as a 'dangling' relationship, as FOO:1 is not in the file.

To generate a module "foo_import.owl" do this:

export OBO=http://purl.obolibrary.org/obo

owltools my-edit.owl $OBO/foo.owl --add-imports-from-supports --extract-module -c -s $OBO/foo.owl --set-ontology-id $OBO/my/foo_import.owl -o foo_import.owl

Remember, in owltools, commands are handled sequentially - a single line such as this can be an entire pipeline chain. The first part of the above chain attaches foo.owl as an import to my-edit.owl.

The second part uses the OWLAPI to extract a module (using the BOT strategy by default), taking all classes in the signature of my-edit (but not its import closure) as the seed. This module becomes the new source ontology.

The final part renames the ontology and saves a local copy.
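The three stages described above can be written one per line, which makes the pipeline easier to read and edit. This is a minimal sketch that reuses the exact command from this section; the function name bootstrap_foo_import is hypothetical and exists only for illustration.

```shell
# Base URL for OBO library PURLs, as in the command above.
OBO=http://purl.obolibrary.org/obo

# Hypothetical wrapper around the bootstrap command shown above.
# Stage 1: load my-edit.owl, load foo.owl as a support ontology,
#          and attach foo.owl to my-edit.owl as an import.
# Stage 2: extract a module from foo.owl (-s), seeded with the classes
#          in the signature of my-edit.owl (-c).
# Stage 3: set the ontology IRI of the module and save a local copy.
bootstrap_foo_import() {
  owltools my-edit.owl "$OBO/foo.owl" --add-imports-from-supports \
    --extract-module -c -s "$OBO/foo.owl" \
    --set-ontology-id "$OBO/my/foo_import.owl" \
    -o foo_import.owl
}
```

Running bootstrap_foo_import in the directory that contains my-edit.owl executes the pipeline. It needs owltools on the PATH and network access to fetch foo.owl.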

You can now add an imports directive to my-edit.owl. You may also want to manually add an entry to your catalog.

Refreshing

Assume now your ontology looks like this:

Ontology: my.owl
Import: http://..../my/foo_import.owl

Class: MY:1
  SubClassOf: part_of some FOO:1

Assume also you have a catalog entry like this:

  <uri id="User Entered Import Resolution" name="http://purl.obolibrary.org/obo/my/foo_import.owl" uri="foo_import.owl"/>

Remember, with owltools you can specify this using --catalog-xml FILE or simply --use-catalog to use the default.
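For reference, a complete catalog file containing that entry might be created as sketched below. The surrounding catalog element follows the OASIS XML catalog format that Protege writes. The file name catalog-v001.xml (Protege's default) and the helper name write_foo_catalog are assumptions, not something this page specifies.

```shell
# Hypothetical helper: write a minimal XML catalog to the given path.
# The <uri> entry is the one shown above; it redirects the import IRI
# to the local file foo_import.owl.
write_foo_catalog() {
  cat > "$1" <<'EOF'
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<catalog prefer="public" xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <uri id="User Entered Import Resolution" name="http://purl.obolibrary.org/obo/my/foo_import.owl" uri="foo_import.owl"/>
</catalog>
EOF
}
```

For example, write_foo_catalog catalog-v001.xml creates the file in the current directory, overwriting any existing catalog of that name. A catalog saved under a different name can be passed explicitly with --catalog-xml FILE.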

What happens if you add new FOO classes to MY, or you simply want to regenerate foo_import based on changes made to FOO-central?

The challenge here is that you are already importing a possibly stale subset of FOO - you don't want this to be included. You could simply re-bootstrap. However, an alternate method is to map the IRI of your import to the central location of the ontology:

owltools --use-catalog --map-ontology-iri $OBO/my/foo_import.owl $OBO/foo.owl my-edit.owl --extract-module -c -s $OBO/foo.owl --set-ontology-id $OBO/my/foo_import.owl -o foo_import.owl

Adding new terms to an import

One method is to include the IRI in my-edit.owl, but this is awkward.

Classes can be hacked in to foo_import.owl, but this also is less than ideal.

Another method is to keep a file foo_seed.owl in a hackable syntax such as Manchester or obo:

Class: FOO:5
Class: FOO:6

Then just merge this in to make your seed set and regenerate the import:

owltools --use-catalog --map-ontology-iri $OBO/my/foo_import.owl $OBO/foo.owl my-edit.owl foo_seed.owl --merge-support-ontologies --extract-module -c -s $OBO/foo.owl --set-ontology-id $OBO/my/foo_import.owl -o foo_import.owl

Of course, the ideal way to do this would be via a Protege plugin (TODO: investigate existing plugins).

Makefiles

The above commands can be included in a Makefile to allow for easier execution and dependency management.
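Where make is not in use, the same dependency idea can be approximated in plain shell: regenerate the import only when a local input is newer than the existing module. This is a rough sketch rather than a replacement for a Makefile. The function names needs_rebuild and refresh_foo_import are hypothetical, the command is the seed-file variant from the section above, and changes to the remote foo.owl are not detected.

```shell
OBO=http://purl.obolibrary.org/obo

# Succeed (exit status 0) if TARGET is missing or older than any dependency.
# Usage: needs_rebuild TARGET DEP...
needs_rebuild() {
  nr_target=$1
  shift
  [ -e "$nr_target" ] || return 0
  for nr_dep in "$@"; do
    if [ "$nr_dep" -nt "$nr_target" ]; then
      return 0
    fi
  done
  return 1
}

# Regenerate foo_import.owl only when my-edit.owl or foo_seed.owl has changed.
refresh_foo_import() {
  if needs_rebuild foo_import.owl my-edit.owl foo_seed.owl; then
    owltools --use-catalog \
      --map-ontology-iri "$OBO/my/foo_import.owl" "$OBO/foo.owl" \
      my-edit.owl foo_seed.owl --merge-support-ontologies \
      --extract-module -c -s "$OBO/foo.owl" \
      --set-ontology-id "$OBO/my/foo_import.owl" \
      -o foo_import.owl
  fi
}
```

Calling refresh_foo_import does nothing when foo_import.owl is already newer than both inputs, which mirrors how a Makefile rule with my-edit.owl and foo_seed.owl as prerequisites would behave.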

Filtering

It is recommended that import modules be saved as RDF/XML OWL in order to maximize interoperability. This can be very verbose, especially where axiom annotations are concerned. These may not be required in the importing ontology.

Additional commands can be added to the chain above to trim down the axioms included. For example, before the "--set-ontology-id" command, you can add

--extract-mingraph

to get a minimal set of axioms (labels and SubClassOf)

or

--remove-annotation-assertions -l

to remove all annotation assertions, preserving labels
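To make the placement concrete, here is the bootstrap pipeline from earlier on this page with the annotation filter added immediately before --set-ontology-id. This is a sketch; the function name bootstrap_filtered_import is hypothetical, and --extract-mingraph could be substituted on the same line.

```shell
OBO=http://purl.obolibrary.org/obo

# Hypothetical wrapper: bootstrap pipeline with a filtering stage.
# The filter runs after module extraction, so it trims the module itself,
# and before --set-ontology-id, as described above.
bootstrap_filtered_import() {
  owltools my-edit.owl "$OBO/foo.owl" --add-imports-from-supports \
    --extract-module -c -s "$OBO/foo.owl" \
    --remove-annotation-assertions -l \
    --set-ontology-id "$OBO/my/foo_import.owl" \
    -o foo_import.owl
}
```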

Naming Conventions

Naming/URI conventions for import modules have yet to be standardized.

Sometimes the import modules go in a subdir called "imports", so the URI looks like ".../obo/my/imports/foo_import.owl".

In general, all ontology IRIs should use lowercase as specified in id-policy - e.g. "foo_import" rather than "FOO_import".

Reciprocal imports

Sometimes ontologies will have mutual dependencies - e.g. uberon, go, cl. TODO docs

Oort

Oort should be extended to handle the full cycle.

Module Extraction Strategy

The module type can be controlled by adding

-m MODULE-TYPE

to the arguments. See ModuleType (OWLAPI docs)

The default is BOT (upper modules). In practice this works best for most Bio-ontologies supported by Oort at this time. Please consult the relevant papers for a formal treatment, but for practical purposes, this strategy includes everything reachable via the seed set across EL-type constructs. Experience shows this is generally sufficient for classification purposes for GO, CL, Uberon and other ontologies with similar levels of axiomatization.
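As an illustration, the bootstrap pipeline could take the module type as a parameter. This sketch assumes that -m is given among the --extract-module options (alongside -c and -s) and that STAR is a valid ModuleType value in the OWLAPI version bundled with your owltools; check the ModuleType documentation before relying on either. The function name extract_foo_module is hypothetical.

```shell
OBO=http://purl.obolibrary.org/obo

# Hypothetical wrapper: bootstrap pipeline with a selectable module type.
# Usage: extract_foo_module [MODULE-TYPE]   (defaults to BOT)
# ASSUMPTION: -m belongs with the other --extract-module options.
extract_foo_module() {
  efm_type=${1:-BOT}
  owltools my-edit.owl "$OBO/foo.owl" --add-imports-from-supports \
    --extract-module -c -m "$efm_type" -s "$OBO/foo.owl" \
    --set-ontology-id "$OBO/my/foo_import.owl" \
    -o foo_import.owl
}
```

For example, extract_foo_module STAR requests a STAR module, while extract_foo_module with no argument passes BOT explicitly, matching the default.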

See Also