[Ontologies] Find a reliable approach to manage ontology terms ID consistency #16
Comments
@tgbugs Do you have any insights on the structure of NIFSTD or the scigraph client that could help us with respect to this issue?
I think this is the result of the fact that we transitioned the ontology away from the ontology.neuinfo.org identifiers to the uri.neuinfo.org identifiers. See SciCrunch/NIF-Ontology@b268a6b for details. I do not load the mapping file into SciGraph to avoid confusion, though in this case it seems to have caused some. I also have not tested whether SciGraph treats owl:sameAs correctly with regard to issuing queries against the graph, so there is a possibility that you would have to issue two SciGraph queries even if I did.

I would suggest switching to the new URI scheme, but I totally understand the human readability needs. I therefore suggest loading the mapping file into a Python dict to do the translation; it will be performant. I might insert a translation shim that switches the representation of those identifiers whenever a call is made in or out of SciGraph. We use an equivalent implementation to do the translations in nginx for the resolver. One note: you should not do this computationally by trying to replace prefixes, because there are exceptions.

Also, the endpoint you have here https://github.com/BlueBrain/nat/blob/master/nat/treeData.py#L107 is no longer accessible, so I'm not entirely sure where that data is coming from. If you have hardcoded the IP to old matrix in your hosts file or something like that, then you are almost certainly getting stale data. If you want to switch to our maintained endpoint (which is now finally up), see the (newly added) note in the readme https://github.com/SciCrunch/NIF-Ontology#using-nifstd and switch your query to

    import os
    import requests

    # root_id, direction, maxDepth and relationshipType are defined by the calling code.
    api_key = os.environ['SCICRUNCH_API_KEY']
    baseKS = "http://scicrunch.org/api/1"  # no trailing slash, to avoid "//" in the URL
    response = requests.get(baseKS + "/scigraph/graph/neighbors/" +
                            root_id + "?direction=" + direction +
                            "&depth=" + str(maxDepth) +
                            "&project=%2A&blankNodes=false&" + relationshipType +
                            "&key=" + api_key)

Please let me know if this addresses the issue. Best!
NB: For the endpoint, we have an open issue (#11). Due to several things, it has not yet been fixed.
Thanks @tgbugs for the info. It is very useful. I'll use the mapping file you pointed us to in order to implement explicit equivalences, and avoid defining general rules on prefix equivalences because of the exceptions.
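A minimal sketch of the dict-based translation discussed above, with the shim applied on the way in and out of a query. The two-column sample text, the placeholder IDs, and the names `load_mapping`, `translate` and `query_with_shim` are all hypothetical; the real mapping file's format is not shown in this thread and may well need a proper parser instead of `str.split`.

```python
# Hypothetical sample: two whitespace-separated columns, old ID then new ID.
# The real mapping file's format is not shown in this thread.
SAMPLE_MAPPING = """\
old:term_a new:0001
old:term_b new:0002
"""


def load_mapping(text):
    """Build old->new and new->old lookup dicts from two-column text."""
    old_to_new = {}
    new_to_old = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        old_id, new_id = line.split()
        old_to_new[old_id] = new_id
        new_to_old[new_id] = old_id
    return old_to_new, new_to_old


def translate(term_id, table):
    """Return the mapped ID, or the input unchanged when it has no entry."""
    return table.get(term_id, term_id)


OLD_TO_NEW, NEW_TO_OLD = load_mapping(SAMPLE_MAPPING)


def query_with_shim(term_id, query):
    """Translate the ID on the way in, call `query`, translate results back.

    `query` is any callable taking one ID and returning a list of IDs,
    e.g. a wrapper around the REST call.
    """
    results = query(translate(term_id, OLD_TO_NEW))
    return [translate(r, NEW_TO_OLD) for r in results]
```

Because every pair is listed explicitly, no prefix-rewriting rule is involved, so exceptions are handled by whatever the mapping says. Each lookup is a single dict access, with no extra network call.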
@tgbugs I'm back working on things related to this issue. I went to https://github.com/SciCrunch/NIF-Ontology#using-nifstd and tried to create a key for the API, but both https://scicrunch.org/register and https://scicrunch.org/account/developer are currently empty pages. Were you aware of that? Is that normal? When is the situation expected to be resolved?
Definitely not normal. It looks like the UCSD data center went down sometime overnight. It should be back up sometime later today PDT. I will take a look at it when I get in later today and let you know.
OK, so I'll resume this work on Monday then. Thanks for the feedback @tgbugs
produces
and
produces
The terms identified by "BIRNLEX:160" and "NIFORG:birnlex_160" are identical. These alternative ways of formatting the IDs of ontological terms cause difficulties for basic operations like asking "Is this model organism (e.g., Wistar rat) a subclass of another model organism (e.g., rodent)?" When checking whether "NIFORG:birnlex_211" is a subclass of "BIRNLEX:160", we presently get False when we would expect True. This is because a given ID is compared against a list of subclass IDs, and this list does not contain all the alternative ways of writing each ID. We need to find a consistent and reliable way to check these equivalences. Most importantly, it has to be relatively efficient: for example, a systematic REST call to check for equivalences could quickly result in poor performance.
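One way to picture the failing comparison, and a possible fix that needs no extra REST calls, is to normalize both sides to a single canonical form before testing membership. This is only a sketch: the `CANONICAL` table, the `canonical` and `is_subclass` helpers, and the subclass list are hypothetical, and the second ID pair is assumed by analogy with the first instead of being confirmed anywhere in this issue.

```python
# Alternative ID -> canonical ID. The first pair is stated in this issue;
# the second is assumed by analogy, purely for illustration.
CANONICAL = {
    "NIFORG:birnlex_160": "BIRNLEX:160",
    "NIFORG:birnlex_211": "BIRNLEX:211",
}


def canonical(term_id):
    """Return the canonical form of an ID, or the ID itself if unmapped."""
    return CANONICAL.get(term_id, term_id)


def is_subclass(term_id, subclass_ids):
    """Compare IDs only after normalizing both sides to canonical form."""
    canonical_subclasses = {canonical(s) for s in subclass_ids}
    return canonical(term_id) in canonical_subclasses


# Hypothetical subclass list, as it might come back for "BIRNLEX:160".
rodent_subclasses = ["BIRNLEX:211"]

print("NIFORG:birnlex_211" in rodent_subclasses)             # naive check: False
print(is_subclass("NIFORG:birnlex_211", rodent_subclasses))  # normalized check: True
```

The table would be built once from explicit equivalences, so each check costs a few dict and set lookups.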