Profile Composition and Languages #162
Vladimir, you have already provided some examples of "profile definition languages", whose level of formalization determines which operations, such as composition (plus conflict resolution) and referencing, are possible. There are UCs supporting the usage of profiles, mainly ID2, ID30, ID41, and ID46, but none deals specifically with the formalization of profiles. I'd suggest coordinating with @RubenVerborgh, @larsgsvensson, and @rob-metalinkage on the creation of a new, dedicated use case on "Profile specification(s)". ID3 implies the usage/application of multiple profiles at once, where profiles are implicitly merged/combined, which your UC seems to address. Here, based on "Profile specification(s)", the issues of conflict resolution etc. come into play. |
@jpullmann thanks for the good overview! Awaiting a reaction from @RubenVerborgh, @larsgsvensson, and @rob-metalinkage. I think these are complex issues: how to compose profiles, described with different technologies, that can be used to validate data or to understand what data to expect from a dataset. |
Hi, I have already created a "straw man" for a lightweight description of profiles, independent of the profile specification language and potentially part of DCAT, since this is a concern for describing data sets. It extends dct:conformsTo and dc:Standard just enough to declare relationships and bind to profile description resources. This is intended to support existing profile hierarchies (described in documents) as well as profile descriptions in OWL, SHACL or other choices. It's at https://github.com/w3c/dxwg/tree/roba-profile/dcat/rdf and is agnostic about the composition method, but the definition of profile adoption does require constraints to be transitive: a profile cannot relax or change the sense of an inherited constraint. I think it's a good point to decide if this is in scope and implied by current requirements, or if we want a more explicit UC and requirements. I don't believe we are in a position to require or develop a specific profile description vocabulary, but I would expect that DCAT profile guidance would be able to recommend use of W3C vocabularies such as SHACL and RDF-Datacube |
@rob-metalinkage Could you make some documentation of the strawman so it doesn't depend on us inferencing (mind-reading) from peering into the RDF? Maybe in the Wiki here https://github.com/w3c/dxwg/wiki/DCAT-Profiles-Topics |
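To make the "constraints must be transitive" rule above concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the `Profile` and `Cardinality` classes, the cardinality-only constraint model, and the example profile names are all hypothetical and are not part of the straw man vocabulary or of any DXWG deliverable.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Cardinality:
    """Hypothetical constraint: how many values a property may have."""
    min_count: int = 0
    max_count: Optional[int] = None  # None means "unbounded"

    def is_at_least_as_strict_as(self, other: "Cardinality") -> bool:
        # A stricter constraint never lowers the minimum...
        min_ok = self.min_count >= other.min_count
        # ...and never raises (or removes) the maximum.
        if other.max_count is None:
            max_ok = True
        else:
            max_ok = self.max_count is not None and self.max_count <= other.max_count
        return min_ok and max_ok


@dataclass
class Profile:
    """Hypothetical profile: a name, its own constraints, and one parent."""
    name: str
    constraints: Dict[str, Cardinality] = field(default_factory=dict)
    parent: Optional["Profile"] = None

    def effective_constraints(self) -> Dict[str, Cardinality]:
        # Inherited constraints first, then this profile's own declarations.
        inherited = self.parent.effective_constraints() if self.parent else {}
        return {**inherited, **self.constraints}

    def relaxed_properties(self) -> List[str]:
        """Properties where this profile loosens an inherited constraint."""
        if self.parent is None:
            return []
        inherited = self.parent.effective_constraints()
        return sorted(
            prop
            for prop, card in self.constraints.items()
            if prop in inherited and not card.is_at_least_as_strict_as(inherited[prop])
        )


# Illustrative profiles only; the names are made up.
base = Profile("base", {"dct:title": Cardinality(1, 1)})
narrower = Profile("narrower", {"dct:description": Cardinality(1)}, parent=base)
relaxing = Profile("relaxing", {"dct:title": Cardinality(0, None)}, parent=base)

print(narrower.relaxed_properties())  # no inherited constraint is loosened
print(relaxing.relaxed_properties())  # dct:title is loosened, so the rule is violated
```

A profile that only adds or tightens constraints passes the check; one that makes an inherited mandatory property optional does not. Real constraint languages express far more than cardinality, so this only shows the shape of the rule.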
@VladimirAlexiev
This draft is definitely not state-of-the-art any more. It sort of mixes profiles and schemas and doesn't really cater for the media type independence we're aiming at when discussing profiles. |
Well the WG isn't only about DCAT but we do have profiles on our agenda, too. Glad to hear that you think of that as a technological advancement! I don't have an answer on how to do profile composition (yet) but I hope we'll find a solution. What we need is a (technology-independent?) meta-model of profiles including composition mechanisms. |
I followed @dr-shorthair 's suggestion and pasted one of your examples into the Wiki with some comments from me for further discussion
I think we can, perhaps not as a recommendation, but at least as a WG note |
Sure.
But are there successful languages for universal data description?
Ah but this is a very narrow example. MARC XML uses the same MARC lingo (tags, subtags etc) and is a mirror image.
I doubt you can do it with XML Schema, I think you have to also use Schematron for the cross-field rules. I can give another narrow example:
I think we do: even if we can't develop a technology-independent language for defining (implementing) profiles, we can develop one for composing profiles. |
I don't know. But a profile description language should be as media-type independent as possible.
I'll have a look at those
OK, might be, and that was not the point I was aiming at. The point is that there are several flavours of MARC 21 (not counting all flavours of MARC). In his MARC validator, Péter Király mentions six different ones that all ought to be proper MARC 21 but where there is a choice of where you can put the information, and some suppliers do it like this and some like that. To me those are different profiles of MARC 21.
OK, then let's do it! |
I agree with Vlad here. There seems to be confusion about whether the profile is a set of validation rules, or if it's something else; for example
I think this issue is clear that the Profile and schema/validation language are separate resources, and thus profiles can only "link to" those machine processable constraint descriptions. That a profile could then link to many such descriptions, for different formats (JSON-schema, SHACL/ShEx, xml schema / schematron, etc.), thereby resolving Lars' definition that the profile is technology independent. |
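The separation described above, where a profile is its own resource that only links to format-specific constraint descriptions, could be sketched as follows. This is an illustration, not a proposed model: the class names, the fields, and every `example.org` URL are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class ConstraintResource:
    """Hypothetical link from a profile to one machine-processable description."""
    url: str
    language: str    # constraint language, e.g. "SHACL" or "JSON Schema"
    media_type: str  # media type of the data this resource can validate


@dataclass
class Profile:
    """Hypothetical technology-independent profile that only links to resources."""
    uri: str
    resources: List[ConstraintResource] = field(default_factory=list)

    def validators_for(self, media_type: str) -> List[ConstraintResource]:
        # Pick the constraint descriptions applicable to a given serialization.
        return [r for r in self.resources if r.media_type == media_type]


# All URLs below are placeholders.
profile = Profile(
    uri="https://example.org/profiles/my-profile",
    resources=[
        ConstraintResource("https://example.org/my-profile.shacl.ttl", "SHACL", "text/turtle"),
        ConstraintResource("https://example.org/my-profile.shex", "ShEx", "text/turtle"),
        ConstraintResource("https://example.org/my-profile.schema.json", "JSON Schema", "application/json"),
        ConstraintResource("https://example.org/my-profile.xsd", "XML Schema", "application/xml"),
    ],
)

for resource in profile.validators_for("text/turtle"):
    print(resource.language, resource.url)
```

The profile URI stays the same regardless of serialization, and a client picks whichever linked description matches the format it received.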
Housekeeping: Can someone add |
Hmmm. We seem to have "profile-negotiation" as a tag but not "content-negotiation". I added "content-negotiation" but I'm not sure it's a good idea (we've suffered label proliferation in the recent past). I'll leave it there and see what folks think. Meanwhile, I added the "profile-negotiation" label. |
Sorry, I meant |
@azaroth42 "I think this issue is clear that the Profile and schema/validation language are separate resources, and thus profiles can only "link to" those machine processable constraint descriptions. That a profile could then link to many such descriptions, for different formats (JSON-schema, SHACL/ShEx, xml schema / schematron, etc.), thereby resolving Lars' definition that the profile is technology independent." @larsgsvensson " I think we do: even if we can't develop a technology-independent language for defining (implementing) profiles, we can develop one for composing profiles. OK, then let's do it!" this is exactly the motivation for the ProfileDesc vocabulary :-) Please identify where it succeeds or fails to meet these goals and also make sure the Use Cases and Requirements adequately drive this if there is any doubt. |
@azaroth42 As to whether profiles and schema languages are necessarily separate "things" - there are folks who wish to use SHACL or ShEx as profile languages, or at least as the basis for a profile. I haven't seen an example so I don't know what that looks like in comparison to, say, DCAT-AP. Presumably you could add properties for instructions, examples, definitions, etc. to a validation language and that may be sufficient in some cases. I have doubts about the human-friendliness of that solution, but for sure we should allow for profiles as human-readable documents and coded validation as separate, with all of the downsides of expressing some of the same things in two different places. |
@rob-metalinkage It is incumbent on you to show how/if profileDesc meets these requirements if you believe that it does. Note that as of yet there is no documentation that gives the goals of PD, the scope, definitions of terms, etc., and no documentation that links it to specific requirements. We have talked about this before. Also, all work must follow the W3C process, which moves from use cases to requirements to problem statements and then solutions, all done through open meetings of group members, agendas, minutes, assigned actions and consensus on solutions. We cannot work backward from solutions that have been developed outside of this process. The sooner we confirm the use cases and requirements the sooner we can begin to work on solutions as a group, as required by W3C procedure. If you are aware of missing use cases, please suggest them. Also, as we complete the discussion of requirements a gathering of requirements that are needed to "describe" profiles (perhaps as a github issue for discussion) would be useful. That could help us scope that particular function. |
@kcoyle wrote:
Certainly this is a common and good practice, but I disagree that strictly "We cannot work backward from solutions that have been developed outside of this process." Sometimes prototype solutions are the most efficient way to document requirements. Looking at the W3C process document [1] I see no strict rule about how a WG operates. For example, there is no mention of use cases. The DXWG charter [2] doesn't seem to specify a strict process either. Perhaps there is another document you are leaning on? I certainly respect a process that starts with use cases and derives requirements before engineering a solution, but the waterfall method is not the only way to get results, and has its own well-known risks. My understanding of the W3C process is that it is more permissive and flexible than this. Yes, all significant decisions should be documented. And Use Cases and derived Requirements can be important reference points, and are a common artefact in W3C work. But I don't even think that a formal UCR is strictly mandatory. As I understand it, each working group is at liberty to decide on its own internal mode of operation. Other groups that I was involved in used the UCR to trigger the work and to provide a checklist for the outputs, but the outputs also included material that couldn't be strictly traced back to a specific use case, and during the process those groups were also willing to consider ready-made solutions outside the UCR process. Perhaps you are suggesting that the DXWG has agreed to follow a strict process? Can you point to where this is recorded? Otherwise, I would suggest that there is no impediment to at least considering proposals that arise during the development of our work, even if they came from some parallel or external stream. [1] https://www.w3.org/Consortium/Process/ |
Simon, a number of things.
|
I doubt that such a process is required by the W3C. It might be required by a WG, but then this requirement should be documented. Otherwise the process is not transparent. |
Please see the charter[1] with our timeline (3.3) that includes a deadline for the use cases and requirements. Although the creation of a formal UCR document is optional, the use cases and requirements are not. The purpose of the UCR document is mainly to solicit wide agreement on the scope of the work. Also note that this group has existed for over a year and we have been openly pursuing these goals all of that time. The work itself is documented in detail in our agendas and minutes, the UCR document was published as a FPWD (which makes it official in W3C terms). It seems somewhat odd to be questioning the process now. However, if there are concrete suggestions for change we can run them past the W3C representatives we have and also past the group as a whole. That said, there is nothing keeping the group from considering the profileDesc other than its own will. Using the process we have been engaged in is a suggestion to increase the chances that the group will consider that proposal because it puts it in the context of our work. |
@kcoyle I'm keen to improve the documentation around profileDesc, including further detailing Use Cases, for many reasons, one of which is to increase this group's understanding of it and thus the group's engagement with it. However, can we do a status check before I do that please? This is a long-running thread (first posts back in March) and much has been done in the profiling space in this group since. Can I please check these assumptions:
We have a growing list of profiling requirements confirmed as in scope and use cases, like #239 that describe the motivation for particular approaches. We also have well discussed issues such as profileDesc and the Guidance document that indicate great engagement with profileDesc in particular and in profile guidance and implementations of profileneg in general. So what then is missing? Do we perhaps need to better indicate which parts of the things we are working on (guidance doc, specific implementations) have Use Cases and Requirements clearly detailed? I gather you think there is a Use Case or two for profileDesc missing? We have 239 for the Alternates View approach but do you think we are missing such for profileDesc? |
|
https://github.com/w3c/dxwg/tree/gh-pages/profiledesc "full" and "simplified" seem to be more distinct frame-based profiles of a common profile that determines meaning (yet another example of a hierarchy) |
I also think the level of fields is more granular than we need to worry about; that's more of an API concern. Describing the availability (and usage) of sets of fields is more relevant to making statements about interoperability of data (which is why it's more a DCAT issue than a conneg issue, in spite of the word profile appearing...) |
@nicholascar I think your questions are questions for the group. We can either add this to an upcoming agenda, or perhaps a separate github issue with an email asking folks to weigh in. I don't know if there are missing use cases - and we haven't finished going through the profile/conneg requirements. Ideally those of you who know profileDesc would do that analysis based on the requirements listed as of today[1]. [1] https://docs.google.com/document/d/13hV2tJ6Kg2Hfe7e1BowY5QfCIweH9GxSCFQV1aWtOPg/edit |
Rob A:
Yes, a semantic/data model profile and a materialization profile. I would go one level deeper to say there's a conceptual profile behind the data model, which is then mapped to various ontological models. And perhaps one level further out as well for strict API concerns. Conceptual: The human understanding of the world, without any RDF notions at all eg:
Then there are the ontologies that the Semantic Profile also uses... (a) uses SKOS and SKOSXL, (b) uses CIDOC-CRM, RDF, RDFS, and linked.art. I think in schemaDesc these are BaseSpecifications, as distinct from Profiles (but the name could be improved, IMO) |
@azaroth42 - your exercise in separation of concerns is probably also applicable to datasets and distributions. I don't think there is much appetite for a very deep model of profiles however, and everything needs to be grounded in a specific requirement. That said, the "role" qualifier on a profile definition (in Word, SHACL, UML or whatever) could perhaps be used to discriminate between these levels of abstraction. PDF (text) forms of profile definitions tend to bundle multiple levels of abstraction in a single artefact, but machine-readable artefacts such as the "semantic profile" could have a specific role. Materialisations could be sub-profiles, and the use of production rules, for example to control JSON-LD serialisation, could be included as a requirement. Such profiles could inherit from an additional base specification which includes such constraints. (A profile is an interoperability contract, so predictability of serialisation is definitely a profile concern.) So currently, I think the profileDesc proposal can handle these concerns, and it is derived from the original proposed requirements in the UCR (the current exercise in discussing these in plenary has not yet had significant impact by adding, deleting or changing any of these). So procedure-wise we can NB. An open question for profileDesc is to what extent we define a canonical set of "roles" to describe the form and function of constraint expressions. |
Yes, agreed. I think that between the hierarchical approach and the use of roles, it will work nicely to allow various permutations of the above, or completely different thinking :) |
@nicholascar I was hoping to find an example of what I mean by basic documentation, but I didn't find something analogous in a very short search. Just looking at a bunch of readme's kind of sets the stage. What I imagine is really quite simple, which is adding some paragraphs to the readme or a short introduction of profileDesc that says:
I'm thinking this ends up being a screen or a screen and a half of stuff. Kind of an "elevator talk" level that promotes the idea. Then we'll ask folks to read it and they may have questions or suggestions. BTW if this already exists in another file then I missed it while poking around and I apologize, so let's make it more visible. |
Removing tag profile-negotiation since this issue is all about profile-description |
Unlinking this issue from UCR, as decided in https://www.w3.org/2021/04/20-dxwg-minutes#r03 |
@nicholascar @rob-metalinkage should this be linked to PROF (and moved to its github repo?) |
Submitting a new USE CASE, strongly related to ID3
Profile Composition and Languages
Status:
Identifier: ID52 (proposed)
Creator: Vladimir Alexiev, Ontotext
Deliverable(s): AP Guidelines, Content Negotiation
Tags
profile_negotiation profile representation composition
Stakeholders
data consumer, data producer
Problem statement
As stated by ID3, the response of a server can conform to multiple profiles.
However, I believe it is unclear how these profiles should compose.
Different languages for defining profiles have different "composition" capabilities.
Possible profile languages and composition mechanisms should be described.
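For import-based languages, one way to make the composition question concrete is to compute which profiles are in force once imports are followed transitively. The Python sketch below assumes a hypothetical `imports` mapping from profile name to the profiles it imports, and treats the names `sh1` and `sh2` purely as placeholders; it takes no position on what the composed profile should be called.

```python
from typing import Dict, List


def import_closure(profile: str, imports: Dict[str, List[str]]) -> List[str]:
    """Return `profile` plus everything it imports, directly or indirectly.

    Profiles are listed in depth-first discovery order; cycles are tolerated.
    """
    seen: List[str] = []
    stack = [profile]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # already visited, which also guards against import cycles
        seen.append(current)
        # Reverse so the first declared import is visited first.
        stack.extend(reversed(imports.get(current, [])))
    return seen


# Hypothetical import graph: sh1 imports sh2.
imports = {"sh1": ["sh2"]}

closure = import_closure("sh1", imports)
explicit_name = "&".join(closure)  # one naming option: list every profile in force
implicit_name = closure[0]         # the other option: name only the importing profile

print(closure, explicit_name, implicit_name)
```

Either naming option can be derived from the closure, so the open question is which one a server should advertise, not whether the set of applicable profiles can be computed.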
Examples:
owl:imports), and ShEx has importing. If sh1 imports sh2, should the resulting profile be called sh1&sh2 or just sh1?
Existing approaches
The Expected RFC states one such mechanism for XML schemas:
Requirements
Related use cases
ID3, RPFDF, RPFN