Convert specification to schema format #540
Comments
A small clarification: AFAIK we do not really need to "use JSON" while working on the schema (in our "sources"); we can convert to JSON-LD for the "published" document. Our "sources" could remain as modular and free of duplication as we like; we just need to introduce "provisions" for encoding the graph structure of LD and to ensure that we can produce proper JSON-LD from them. Having said all that, @satra et al. are releasing JSON-LD (and other serializations) while, I believe, working directly within "json-ld" sources that are also modularized into files (but without hierarchy); e.g., see https://github.com/ReproNim/reproschema/blob/master/terms/multipleChoice . To my very personal liking, that involves a bit of duplication which we could avoid... ;-) |
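A minimal sketch of the "modular sources, published JSON-LD" idea from the comment above, assuming nothing about the actual BIDS or reproschema layout. The `SOURCES` records stand in for small per-term files (for example YAML); the `https://example.org/bids/` namespace, the `Term` type, and the field names are hypothetical placeholders, not anything defined in this thread.

```python
import json

# Hypothetical modular "sources": one small record per term, as they might be
# loaded from separate files. All names and texts here are placeholders.
SOURCES = {
    "run": {"name": "run", "description": "placeholder description of run"},
    "session": {"name": "session", "description": "placeholder description of session"},
}

BASE_IRI = "https://example.org/bids/"  # assumed placeholder namespace


def to_jsonld(sources):
    """Assemble one compact JSON-LD document from the modular records."""
    context = {
        "@vocab": BASE_IRI,
        "name": "http://schema.org/name",
        "description": "http://schema.org/description",
    }
    graph = []
    for key in sorted(sources):
        # Each record becomes a node with its own identifier in the graph.
        node = {"@id": BASE_IRI + key, "@type": "Term"}
        node.update(sources[key])
        graph.append(node)
    return {"@context": context, "@graph": graph}


published = to_jsonld(SOURCES)
print(json.dumps(published, indent=2))
```

The sources stay free of any LD syntax; only the build step knows about `@context` and `@id`, which is the "provision" the comment asks for.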
i should clarify this is only true for the compact form of JSON-LD. the expanded form has its own specific syntax. |
it is no longer a valid JSON? |
it is valid JSON, since JSON is always the underlying format, but the validation schema for the compact form and the expanded forms would be different. valid JSON doesn't mean valid data :) |
sure, good... I was simply pointing out that any (compact or not) JSON-LD is a valid JSON (and thus a valid YAML). What we would need to do for validation of the schema itself is another aspect. |
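To illustrate the exchange above, here is a small sketch showing the same statement in compact and expanded JSON-LD form. Both serialize as plain JSON, but a structural check written for the compact form rejects the expanded one. The `https://example.org/bids/run` identifier and the toy check `looks_like_compact_term` are assumptions made for illustration; a real project would use a proper validation schema.

```python
import json

# Compact form: short keys, with a @context mapping them to IRIs.
compact = {
    "@context": {"name": "http://schema.org/name"},
    "@id": "https://example.org/bids/run",  # hypothetical identifier
    "name": "run",
}

# Expanded form of the same statement: no @context, full IRIs as keys,
# values wrapped in arrays of value objects.
expanded = [
    {
        "@id": "https://example.org/bids/run",
        "http://schema.org/name": [{"@value": "run"}],
    }
]


def looks_like_compact_term(doc):
    """Toy structural check that only understands the compact form."""
    return (
        isinstance(doc, dict)
        and "@context" in doc
        and isinstance(doc.get("name"), str)
    )


for label, doc in (("compact", compact), ("expanded", expanded)):
    is_valid_json = json.loads(json.dumps(doc)) == doc  # both round-trip as JSON
    print(label, is_valid_json, looks_like_compact_term(doc))
```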
I guess I really don't know enough about JSON-LD (or even just LD). The decision I was trying to describe is generally between an established, but highly technical specification with a steep learning curve (i.e., JSON-LD) and a customized, readable, but highly idiosyncratic specification (e.g., something like what's being done in #475). If we do choose to follow established ontology structures like JSON-LD, we'll need some sort of translation layer between contributors to the standard and the schema, such as maintainers with specific training. On the other hand, if we use ad hoc structures, then they'll be much more readable, but also not easily interoperable with other ontologies. Given that I have very limited experience with JSON-LD (mostly looking at NIMADS and NIDM and being confused), I hope that I'm not misrepresenting the tradeoff. Please correct me if either of you sees a problem with it. @sappelhoff - On an unrelated note, since there are currently several issues and one PR dedicated to the conversion of the specification to a schema, could we create a new Project board to manage them? |
@tsalo - i think there are perhaps several components being conflated together, so instead of directly thinking about json/jsonld/ld, let's consider the components and why we may want to represent them in some schema or other structured format. I am also writing this partly to think this through a bit more openly. It may also be helpful for anyone new to bids to consider the potential places for contribution and ramifications that any change should consider. This builds on the very nice set of areas that @tsalo lays out in the original post (so please read: #540 (comment) before this).
|
Hi @tsalo I just increased your scope of permissions for this repository. Can you go ahead and try to make a project now? If it doesn't work, I'll look into the permissions again. I think it's a great idea to create a project board for this! |
It worked. Thanks! |
@satra You expressed all of my goals better than I did! As well as noted several more that I hadn't thought of. Do you have any thoughts on where we should go from here? It seems like we need to decide on the schema of choice for each of the elements you describe in "Structural and value constraints". Should we open a separate issue for each, or try to find one solution that works for all of them? I assume we could commit to a technology (e.g., JSON-LD or SHACL) and then break it down from there? Perhaps we could ping some of the other folks who work with schemata more than I do for their thoughts on the best technology for this case. Most everything under "Implementation technologies" went over my head, so I doubt I'll be able to contribute to that decision. |
@tsalo - it may be good to have an online chat with interested parties to discuss some of these things? |
@satra that sounds like a good plan! |
@satra Who would be interested? Based on a recent BIDS maintainers call, I was thinking @yarikoptic, @dbkeator, @dnkennedy, @rwblair, and possibly Maryann Martone (not sure if Dr. Martone has a GitHub account). EDIT: Oh and @nqueder! Apologies for the oversight. |
I'm interested.... |
@yarikoptic and I joined a couple of NIDM-Terms calls with most of the interested folks where we've made solid progress on the format of the schema, and since there is now a project board for the conversion I think we can close this issue in favor of more focused issues. Any objections? |
+1 for smaller, targeted, actionable issues that are connected into a coherent whole via the project board |
A long-term goal of the specification could be to make almost all of its content into a machine-readable schema, to facilitate automated use of the specification in other packages (e.g., pybids and bids-validator), as well as to propagate small changes across the specification.
This is related to #423, #466, and #475. In #423, @dbkeator discusses work in extracting schema-like information from the specification, converting the terms to JSON-LD format, and linking BIDS terms to similar terms in other ontologies (see BIDS_Terms). Many elements of this work should also be incorporated into the actual specification, as it explicitly defines associations between terms within the specification. This should, in turn, make extracting relevant information from BIDS for other efforts, like BIDS_Terms, much easier. Initial work toward doing this conversion with the YAML format and specifically limited to the entity table has been done in #475.
Here are some initial goals for the conversion:
sourcedata/).
run entity is currently defined once, under Anatomy imaging data under Magnetic Resonance Imaging, even though the entity applies to many other datatypes, and is generally referenced or briefly defined in those other datatypes' sections, without a link to the main definition.
I think that the only sections of the specification that wouldn't be described in the schema would be the Appendix pages, introduction pages/sections, Common Principles, and specific examples.
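The run example could be sketched roughly as follows: each entity is defined exactly once, and datatypes refer to it by name instead of restating it. This is only an illustration under stated assumptions; the field names (`key`, `format`, `description`), the datatype labels, and the helper functions are hypothetical and are not taken from #475 or from the specification.

```python
# Hypothetical schema fragment: every entity has a single definition.
ENTITIES = {
    "run": {
        "key": "run",
        "format": "index",  # assumed field, for illustration only
        "description": "placeholder for the single main definition",
    },
}

# Datatypes only reference entities by name; they never restate them.
DATATYPES = {
    "anat": {"entities": ["run"]},
    "func": {"entities": ["run"]},
}


def resolve(datatype):
    """Return the full entity definitions a datatype refers to."""
    return [ENTITIES[name] for name in DATATYPES[datatype]["entities"]]


def undefined_references():
    """List entity names that some datatype uses but nothing defines."""
    return sorted(
        {
            name
            for spec in DATATYPES.values()
            for name in spec["entities"]
            if name not in ENTITIES
        }
    )


# Both datatypes resolve to the very same definition object, so a change
# to the definition propagates everywhere it is referenced.
print(resolve("anat")[0] is resolve("func")[0])
print(undefined_references())
```

A check like `undefined_references` is the kind of thing a machine-readable schema makes cheap: a dangling reference becomes a build error rather than an unlinked mention in prose.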
Open questions: