[PROPOSAL] Reimagine SWEET as a compilation of textual definitions #211
Comments
I am +1 for this proposal. #208 provides a semi-automated mechanism for achieving this.
I recommend further examination and discussion of the topic. This is an opportunity to determine what each direction would look like, and the potential downstream benefits or harms of each. The question and brief discussion during the 11 Aug meeting weren't quite clear. For example, as a junction, there may be a risk of SWEET becoming ineffectual, merely a listing of others' resources, or replaced by some other resource. That is a concern, so for that particular direction, care and steps should be taken to prevent those situations. Another direction is to treat this as an opportunity to develop SWEET further. A team can be set up (if distinct from the current group) with tasks such as creating SWEET-specific descriptions/definitions for its terms in a neutral manner, and adding logical axiomatizations. It would need to be decided whether to have both. An interesting direction may be a hybrid of the two, etc.
Starting out with a disjunction of definitions at least gives you a place to start, and provides SWEET users with some (albeit not perfect) guidance on the meaning of terms.
Please note that the proposals re SWEET were at the end of the talk, and were just some musings building on work earlier that week by @lewismc on adding textual defs into SWEET. The background is that SWEET is an important community resource with a long-standing reputation, but its role going forward is a little unclear. It is a large ontology covering a large scope, but its axiomatization is limited and that aspect has been overtaken by more recent work elsewhere. This proposal provides an updated role for SWEET, reflecting its widespread adoption for tagging. The core requirement is that textual definitions are available locally to SWEET users, so while there are several options for the mechanics, the expectation is that textual definitions are cached with SWEET. It was recognised that the provenance of the incorporated textual definitions must be made very explicit. And if you are doing that for one source then there is no reason not to add text from other sources alongside. Wikidata provides something of an exemplar for merging content, with provenance. However, the scope of Wikidata is much broader, and its content is not curated by a community or discipline.
The 'junction of textual definitions' is a direction SWEET could take, and getting some clarity that that is what SWEET is would be progress. This would essentially establish SWEET as what I would call a dictionary. The use case is 'I see this term as a tag for some resource, what might it mean?' Matching the term with a SWEET label would yield one or more definitions that are in use. Assuming that SWEET provides the concept URI associated with each definition, this gives a user the potential to semantically enhance the tag by associating it with a URI that identifies a single, clearly defined concept.
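The lookup described above can be sketched roughly as follows. This is only an illustration of the "dictionary" use case, not how SWEET is actually served: the `Definition` record, the `INDEX` contents, the example.org URIs, the source names and the placeholder definition texts are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Definition:
    concept_uri: str  # URI identifying a single, clearly defined concept
    source: str       # provenance: where the definition text came from
    text: str         # the textual definition itself


# Hypothetical index from lower-cased labels to the definitions cached for them.
INDEX = {
    "glacier": [
        Definition(
            "http://example.org/vocab-a/Glacier",
            "Vocabulary A",
            "Placeholder definition text from vocabulary A.",
        ),
        Definition(
            "http://example.org/vocab-b/glacier",
            "Vocabulary B",
            "Placeholder definition text from vocabulary B.",
        ),
    ],
}


def lookup(tag, index=INDEX):
    """Return every cached definition whose label matches the tag (case-insensitive)."""
    return index.get(tag.strip().lower(), [])


for definition in lookup("Glacier"):
    print(definition.concept_uri, "|", definition.source, "|", definition.text)
```

A user who recognises the intended meaning among the results can then replace the bare tag with the corresponding `concept_uri`.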
So what I am hearing from above (along with some thoughts of my own) is:
corollary - dramatically changing the structure of SWEET as would be necessary to create a logically coherent well-axiomized ontology would
corollary - picking a single definition for any term in SWEET would:
corollary - allowing multiple definitions from other semantic resources to be used for a single term in SWEET would
I am sure this rather hasty analysis is not complete or, for that matter, completely accurate (please point out flaws where you find them); but from the above I'd have to say that I like the "dictionary" version of SWEET as a path forward.
In this scenario, adding local SWEET definitions, or an approximate intended meaning, can be done as well.
Good analysis @rduerr |
Nice overview. Some input below.
Good points. In any case, though, we have the capability and freedom to develop SWEET further in those aspects, which may or may not have been the original intent of SWEET, and it can also be progress.
The structure of SWEET, e.g., at least the subsumption hierarchy, is there. A logical formalization may not be as dramatic as one may think.
It may be more work, but how much is TBD, and may not be as much as one may think. But even if it is, who said hard work was easy. :-) In general, questions of SWEET development include:
Well, there's a possibility for change on all sides whenever any semantic resource makes a change. It happens all the time.
We shouldn't use 'duplication' in a pejorative sense, or as a reason to either not create, or to point to a particular product.
Any resources with their own axiomatizations will already take work to harmonize or map with one another. I think we agree that a resource, including SWEET, shouldn't be forced to have the same axiomatization as some other resource. In general, each resource with axiomatizations can differ in the subtleties within that formalization, just as within a natural language definition. It need not be a full harmonization (whatever that would look like); it could be specified as an approximate harmonization: one resource with a natural language definition of a term may be said to be approximately similar, in the most immediate sense, to that of another resource which has additional semantic baggage. But the two may be determined and asserted to be different at, say, an abstract level which the latter may have in its additional semantic baggage. So it is always a good amount of work. The perceived workload should not deter.
Possibly but not necessarily.
This applies to all semantic resources. Any semantic resource that has one definition imposes a restriction or constraint. It may make assumptions on semantic and other dimensions that may or may not be seen in the definition itself. This is especially true for semantic resources that use other, more abstract resources (e.g. upper ontologies), which actually restrict and impose more on the user than the user may want.
The idea of users being able to add their own definition is interesting, definitions they may or may not have explicitly stated in their work.
This might all be an issue of use cases... my immediate use case is to enable a large number of existing repositories to federate their discovery mechanisms without having to jump through lots of semantic and other technical hoops just to get their results to be comparable and useful to users, which is, by the way, not the case today. In other words, to be Findable! Beyond that, the semantic harmonization work should allow the datasets found to also actually be Accessed, Interoperable, and Reusable. In other words, entirely utilitarian use cases. No semantic resource of academic quality or academic interest is of interest to me! I want stuff that works in the real world, with real data and real repositories. That's actually what the polar repositories community that I am working with has wanted for decades now! So, having said that, I vote that we
Hopefully this will encourage repositories to contribute to this work, build community and start thinking outside of the box of their data when building systems, capabilities and tools. Does this make sense?
Whew. Can open. Worms everywhere. :) I have a few follow up questions.
Yes, a waste, especially when not uploaded to COR (or similar) either. :) In the past there were domain ontologies developed for, or subsequently added to, SWEET; Hydrogeology and Volcanology (IIRC) come to mind (i.e. not mapped to). I understand ownership, credit and the like can be a tricky thing, but is this sort of approach worth re-visiting?
Is it? I've heard this ...more than thrice. Is this information gathered mostly from personal interaction or is there evidence somewhere one could point toward? Has there been any recent investigation? Are there any web/link analytics available from current (or any) SWEET URIs? (And the knock-on questions around how that might be leveraged alongside or within COR...) Without allowing anyone time to answer :), would it be worth sending out a survey to ask questions in this regard? We did something similar for the Agrisemantics WG and the results were actually pretty interesting (output 2 is most relevant here). I'm wondering if a survey, obviously with a focus on geoscience semantic resources, would make sense? Questions around what is used, why it is used, etc. Perhaps even loaded/leading questions around why SWEET is or isn't used, and so on. At minimum it would provide some much needed insight into the overall "what is SWEET and where is it going" lingering question(s).
Sure. I agree with @rrovetto. It's the way of things. ;) To me, it does signal the governance of SWEET needs to be clear both in natural language terms and how PROV (I assume) is used.
Yes, and ideally the effort would be both ways. I don't think it an unreasonable ask for a thesaurus editor at NASA, NOAA, or the USGS to perform a quick search of SWEET when making edits or additions. If the links already exist it's a quick verification step. What's really interesting in this proposal is the analog between SWEET, for Earth Sciences, and GACS, for Ag and Food, or similar lines at least (paper has explanation and links). One of the reasons GACS has stalled is politics (to be polite): they have no ESIP equivalent (well, nothing that would garner any type of agreement). As ESIP is already in place, SWEET as a "semantic hub" could be quite powerful. It's personally disappointing, but if that's what makes SWEET useful then so be it. In reference to the actual issue, I think 2 and 3 seem quite usable. I would advocate for keeping them in separate graphs, at least for the time being. Modularity is an undervalued design technique and it would be nice to only import what is needed, even if it turns out to be in theory only. :)
SWEET is accessed a lot through COR. We can easily generate some numbers from, say, a month's worth of HTTPD server logs.
I would say yes. I think this is a good idea. It may go a long way toward clearing up a lot of the hypothesizing above.
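As a rough idea of what generating numbers from HTTPD logs could look like, here is a minimal sketch that counts requests per term path. It assumes the logs are in the Common Log Format; the `/sweet/` path prefix, the sample lines and the function name `count_term_requests` are hypothetical and would need to be adapted to the actual COR setup.

```python
import re
from collections import Counter

# Captures the requested path from the quoted request field, e.g. "GET /path HTTP/1.1".
REQUEST_RE = re.compile(r'"(?:GET|HEAD) (\S+) HTTP/[^"]*"')


def count_term_requests(log_lines, prefix="/sweet/"):
    """Count GET/HEAD requests per path, keeping only paths under the given prefix."""
    counts = Counter()
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if match and match.group(1).startswith(prefix):
            counts[match.group(1)] += 1
    return counts


# Hypothetical sample lines; real input would be read from the server's access log.
sample = [
    '192.0.2.1 - - [01/Oct/2020:10:00:00 +0000] "GET /sweet/Glacier HTTP/1.1" 200 512',
    '192.0.2.2 - - [01/Oct/2020:10:00:05 +0000] "GET /sweet/Glacier HTTP/1.1" 200 512',
    '192.0.2.3 - - [01/Oct/2020:10:01:00 +0000] "GET /sweet/Ocean HTTP/1.1" 200 498',
    '192.0.2.4 - - [01/Oct/2020:10:02:00 +0000] "GET /other/page HTTP/1.1" 404 120',
]

for path, hits in count_term_requests(sample).most_common():
    print(path, hits)
```

Counts like these only show which URIs are resolved, not why; a survey would still be needed for the "why is it used" questions.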
Discussed in our Semantic Tech call today:
This will have significant consequences on what SWEET looks like and how technology interacts with it. This should be considered - those who have technology interacting with SWEET should be asked to comment. Also, which definition will be used to build the hierarchy? Some will identify different super classes. Previous discussions have suggested not having the structure driven by definitions, but this has consequences for the semantic model. The Semantic Harmonisation definitions do, however, consider the hierarchy and suggested changes to it.
Re the last point in the preceding comment ("Which definition will be used to build the hierarchy?"), I had a direct reaction in the meeting, which led to interesting questions about governance. As noted, my reaction was that no one in the discussion about definitions (either this one or issue #125) had any thought that the definitions had any implication for the structure. I think we were assuming that a definition that did not fit the current structure (e.g., the realm) would obviously not fit, and so would not be included. I know my expectation was that since SWEET is not a seriously structured semantic resource, and opinions about structure can vary (especially as expressed in the definitions), I would not expect us to take on additional structuring just because a definition said it was so. And so, there is no need to answer the question at the top of this comment, because we are not using definitions to build a hierarchy. All definitions are equally (un)important. That does not mean the SWEET semantic model/structure can not be changed, just that the process of changing the structure should not be driven by the definitions. If independently everyone decides on a structural change, that should change what definitions are considered appropriate from then on. But these are all my opinions. The other question this led to was "Who decided that we would start collecting definitions for SWEET? How are decisions for SWEET made?" I guess this is progress, because until that moment I'd say SWEET decisions were made by Lewis based on agreement from the "interested community"—that is, people who responded to the tickets where the tasks were proposed. The advantage of that approach is that it is responsive to people who are most willing to invest the time to propose and implement a change. 
I'm not yet 100% convinced that the interest level will stay high enough to adopt a "strategic governance" model as opposed to a "task-driven governance" model (strategic is much more expensive!), but it's great if the interest and value is strong enough to support more strategic discussion. (We'd likely want a SWEET-specific call each month so it doesn't dominate SemTech calls.)
Note, I've changed this proposal title to accommodate growing consensus from the attendees at the 2020-10-13 meeting. Also, I'm going to cross-post all of the commentary from that meeting here
For those interested in doing more high-powered semantics, it might be worthwhile to form groups to work on specific versions of SWEET. E.g., if we want a version of SWEET that makes heavy use of the OBO Foundry, we can create an OBO-SWEET ontology. I say this with caution b/c it might entail a lot of work that is difficult to get done without funding.
This topic was discussed/debated at today's ESIP Winter working session.
The RESULTS were 92% Yes and 8% No, meaning that the VOTE passes and that we resolve this issue and move forward. Update (9 Feb): A few things should be noted for fairness and clarity. (1) The vote was abrupt and ad hoc, introduced after some members mentioned alternatives (not necessarily mutually exclusive) and expressed potential negative consequences of the junction approach. (2) Conducting a vote was not agreed to ahead of time. (3) Alternatives (including this proposal) were previously described in a document provided by a member, but were never discussed prior to the session. (4) Newcomers to the Winter meeting session were thereby not sufficiently informed (they were not in a position to make an informed decision). In all, this makes the vote questionable; it should not have occurred (certainly not in that manner). In any case, a critically important point is having a shared desire to develop SWEET itself, not to spotlight or bias other resources.
If SWEET classes represent a collection of definitions that might each have a different lexical representation (label, title, as suggested in #125), there should be some way to identify a specific definition if someone wants to reference one of the specific concepts (represented by a definition string). This is not possible if blank nodes are used for the definitions.
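One way to make individual definitions referenceable, instead of using blank nodes, would be to mint a stable identifier for each one. The sketch below derives such an identifier from the definition's source and text. The `#def-` naming scheme, the function name `definition_uri` and the example.org URI are assumptions made for illustration, not an agreed SWEET convention.

```python
import hashlib


def definition_uri(concept_uri, source, text):
    """Derive a stable, referenceable URI for one definition of a concept.

    The same (source, text) pair always yields the same URI, so the
    identifier survives re-serialisation, unlike a blank node label.
    """
    digest = hashlib.sha256(f"{source}\n{text}".encode("utf-8")).hexdigest()[:12]
    return f"{concept_uri}#def-{digest}"


# Hypothetical concept with two definitions from different sources.
concept = "http://example.org/sweet/Glacier"
uri_a = definition_uri(concept, "Vocabulary A", "Placeholder definition text A.")
uri_b = definition_uri(concept, "Vocabulary B", "Placeholder definition text B.")

print(uri_a)
print(uri_b)
```

A content-derived identifier changes whenever the definition text is edited; if definitions are expected to be revised in place, a manually assigned or sequential identifier may be the better choice.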
Summary
The SWEET ontology suite should fulfill the role of a junction for textual definitions. An example can be found below.
Context
During the ESIP Summer 2020 meeting, as part of the Plenary: Proliferation of Vocabularies in Solid Earth, Space and Environmental sciences: Which one should I use and which ones can I trust?, @dr-shorthair's presentation titled Proliferation of vocabularies and services - feature or bug? asked the fundamental question of How to manage the diversity? (across vocabularies).
@dr-shorthair's narrative presented the following three scenarios.
Current semantics
skos:definition's are currently stored in SWEET as follows
At the Aug 11th, 2020 ESIP SemTech Committee meeting this topic was discussed, with the driving commentary being that although SWEET is well known in the ESSI space, other resources present appealing alternatives due to SWEET's limited axiomatization. One such example is ENVO, which clearly offers an improved option from a curation perspective.
At its core, this proposal seeks to acknowledge the above situation by strategically re-positioning SWEET as a junction for textual definitions. Several attendees at the Aug 11th, 2020 ESIP SemTech Committee meeting stated explicit support for this motion.
Request for comment(s)
This thread will act as a means of gauging community opinion. Please comment below.