Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Report queries not handling empty labels on ERROR level #1017

Closed
psiotwo opened this issue Jun 24, 2022 · 8 comments
Closed

Report queries not handling empty labels on ERROR level #1017

psiotwo opened this issue Jun 24, 2022 · 8 comments

Comments

@psiotwo
Copy link
Contributor

psiotwo commented Jun 24, 2022

Standard set of report queries for the ERROR level does not seem to handle cases when a label is empty "".

There are rules e.g. to detect

But there does not seem to be check that a label is an empty string "" or an empty lang literal, e.g. ""@en. Should not this be covered on the ERROR level?

Also, this causes confusion in combination with the duplicate label query because it reports all concepts with empty labels as potential duplicates.

@matentzn
Copy link
Contributor

There are maaaany checks we could add to ROBOT. The problem is that with every check, runtime increases. So we basically always sit together and decide:

  1. How prevalent is this error
  2. How damaging is this error
  3. How long is the runtime of the check

Only if 1 is high or 2 is very high while 2 is reasonable, we start adding checks.. I have never seen this happen! https://api.triplydb.com/s/txFJGiVyt

@matentzn
Copy link
Contributor

That said, http://robot.obolibrary.org/report_queries/missing_label could probably be extended to support "" and ""@en, if you like to give it a shot!

@psiotwo
Copy link
Contributor Author

psiotwo commented Jun 24, 2022

Clear @matentzn - I think this check would make sense because not having a label and having an empty label is often just a matter of inconsistent serialisation of the particular tool and does not bear any semantics (although it could, of course).

So having a check for missing_label not including empty labels seems a bug to me w.r.t. this check.

But I understand your reasoning and your proposal for missing_label was one option I was considering. I will prepare some PR for this and we can discuss it through there ...

@jamesaoverton
Copy link
Member

Thanks @psiotwo! I think this will be a good addition, but yes let's ensure that there isn't a big performance hit.

@psiotwo
Copy link
Contributor Author

psiotwo commented Jun 24, 2022

Actually, is there any established way to test (both unit/performance tests) report queries? I must be overlooking st. in the codebase ...

@jamesaoverton
Copy link
Member

No, you're not missing anything, there's no code here for that.

@matentzn runs a million ontology projects and the OBO Dashboard, so maybe he can suggest a project that takes 5-10 minutes to run robot report, and we can test against that.

@matentzn
Copy link
Contributor

Actually, it would not hurt to add a few unit tests here as well. But yeah, we develop a query, make sure it works on a corpus, then push it to robot.

psiotwo added a commit to psiotwo/robot that referenced this issue Jun 25, 2022
psiotwo added a commit to psiotwo/robot that referenced this issue Jun 25, 2022
psiotwo added a commit to psiotwo/robot that referenced this issue Jun 25, 2022
jamesaoverton added a commit that referenced this issue Jul 28, 2022
[#1017] updating missing_label query to support empty labels
@psiotwo
Copy link
Contributor Author

psiotwo commented Jul 28, 2022

Closing via #1018 .

@psiotwo psiotwo closed this as completed Jul 28, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants