loadvoc command should take a vocabulary id, not project id #602

Closed
osma opened this issue Aug 4, 2022 · 1 comment · Fixed by #614

osma commented Aug 4, 2022

The annif loadvoc command currently takes a project ID as a parameter, like this:

annif loadvoc yso-tfidf-fi path/to/yso.ttl

But this can be a bit misleading, because the same vocabulary could be used by many other projects, and thus their vocabulary will be loaded/updated as well. The problem will be even more prominent after implementing #559 / #600, which makes vocabularies multilingual, so that even projects in different languages may share the same vocabulary (and vocabulary ID).

I suggest changing this so that the loadvoc command instead takes a vocabulary ID, like this:

annif loadvoc yso path/to/yso.ttl

This would align better with current reality, but of course it's a potentially disruptive change: for example, scripts that perform loadvoc operations would have to be modified and all the relevant documentation updated, including the Annif tutorial. There could perhaps be a transition period where loadvoc with a project ID keeps working but prints a deprecation warning...
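
To make the transition-period idea concrete, here is a minimal sketch in plain Python. It is not Annif's actual code: the function name `resolve_vocab_id`, the two lookup tables and the project `yso-tfidf-sv` are all hypothetical, made up for illustration. It accepts either kind of ID, and warns when a project ID is given.

```python
import warnings

# Hypothetical lookup tables (assumptions, not Annif's real data structures).
# Two projects share the same vocabulary.
PROJECT_VOCABS = {
    "yso-tfidf-fi": "yso",
    "yso-tfidf-sv": "yso",  # made-up second project for illustration
}
VOCAB_IDS = {"yso"}


def resolve_vocab_id(identifier):
    """Return a vocabulary ID, still accepting a project ID for now."""
    # Preferred form: the identifier already is a vocabulary ID.
    if identifier in VOCAB_IDS:
        return identifier
    # Legacy form: a project ID, mapped to its vocabulary with a warning.
    if identifier in PROJECT_VOCABS:
        vocab_id = PROJECT_VOCABS[identifier]
        warnings.warn(
            f"passing project ID '{identifier}' to loadvoc is deprecated; "
            f"use vocabulary ID '{vocab_id}' instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return vocab_id
    raise ValueError(f"unknown vocabulary or project ID: {identifier}")
```

In this sketch a vocabulary ID takes precedence if the same string happened to be both a project ID and a vocabulary ID; whether that is the right rule is an open design choice.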

There's also the question of how to deal with languages, especially when loading a vocabulary from a TSV file. Currently the language of a TSV vocabulary is inferred from the project configuration. But if the vocabulary is loaded directly, there is no project configuration, so the language may need to be specified directly, for example with a --language option (shortened to -L, which is not otherwise used in current CLI commands). This would also be useful for SKOS vocabularies that lack language tags, as discussed in #556.
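
As a rough illustration of the language option, the sketch below uses the standard library's argparse, not Annif's actual CLI code. The names `build_parser` and `check_language`, the `load-vocab` program name and the file `path/to/yso.tsv` are assumptions for the example. It only covers the TSV case; a SKOS file without language tags would need a similar check after parsing.

```python
import argparse


def build_parser():
    """Parser for a hypothetical command that takes a vocabulary ID."""
    parser = argparse.ArgumentParser(prog="load-vocab")
    parser.add_argument("vocab_id")
    parser.add_argument("path")
    # -L / --language as proposed above; optional at the parser level.
    parser.add_argument("-L", "--language", default=None,
                        help="language of the vocabulary file, e.g. fi")
    return parser


def check_language(path, language):
    """A TSV file carries no language information, so require it explicitly."""
    if path.lower().endswith(".tsv") and language is None:
        raise ValueError("a TSV vocabulary needs --language / -L")
    return language
```

With this sketch, `build_parser().parse_args(["yso", "path/to/yso.tsv", "-L", "fi"])` yields a namespace whose `language` is `"fi"`, while omitting `-L` for a `.tsv` path makes `check_language` raise an error.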

osma added this to the Short term milestone Aug 4, 2022

osma commented Aug 22, 2022

Based on a suggestion by @juhoinkinen, I think this would be a better approach:

  1. Keep the loadvoc command around for a while (one release cycle?) in its current form, but mark it as deprecated and print a warning message when it is used.
  2. Add new commands load-vocab, list-vocabs, show-vocab and possibly other similar ones as well (not necessarily all of them implemented at once). These would all work on vocabulary IDs instead of project IDs and would be analogous to the list-projects and show-project commands.
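
To give an idea of what a vocabulary-oriented listing could report, here is a small sketch that groups projects by the vocabulary they use. It is an illustration only: the function `list_vocabs` and the project-to-vocabulary mapping are hypothetical, and nothing here reflects how the actual commands are or will be implemented.

```python
from collections import defaultdict


def list_vocabs(projects):
    """Group project IDs by vocabulary ID.

    `projects` is a hypothetical mapping of project ID -> vocabulary ID.
    """
    by_vocab = defaultdict(list)
    # Sort so that the project lists come out in a stable order.
    for project_id, vocab_id in sorted(projects.items()):
        by_vocab[vocab_id].append(project_id)
    return dict(by_vocab)
```

Such a grouping would also make it visible which projects are affected when a shared vocabulary is reloaded.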
