[SCHEMA] Add metadata term files #762

Merged
merged 92 commits into from
Apr 13, 2021
92 commits
d8697ec
Draft a handful of metadata term files.
tsalo Mar 17, 2021
3d14053
Add example with specific possible values.
tsalo Mar 17, 2021
8c113fa
Match the validator schemas better.
tsalo Mar 17, 2021
a8d4043
Fix formatting.
tsalo Mar 17, 2021
9912848
Fix formatting again!
tsalo Mar 17, 2021
1e56975
Draft semi-functional rendering functions.
tsalo Mar 17, 2021
bfa99eb
Use unit abbreviations.
tsalo Mar 17, 2021
4240cad
Add more fields.
tsalo Mar 17, 2021
9627c46
Get macro working.
tsalo Mar 17, 2021
89ce1e6
Add terms from first table.
tsalo Mar 17, 2021
23c5915
Add AnatomicalLandmarkCoordinateSystem.
tsalo Mar 17, 2021
f3f3970
fMRI task information table.
tsalo Mar 17, 2021
8ac50f9
More terms.
tsalo Mar 17, 2021
11d6bcb
More terms.
tsalo Mar 17, 2021
098ba08
More terms.
tsalo Mar 17, 2021
28e8ef0
EchoTime and FlipAngle
tsalo Mar 17, 2021
6822962
Add tables.
tsalo Mar 18, 2021
61fef2c
More tables.
tsalo Mar 18, 2021
7b9717a
More terms.
tsalo Mar 18, 2021
897769c
More terms.
tsalo Mar 18, 2021
9dcc186
More terms.
tsalo Mar 18, 2021
1c10067
Fix spacing.
tsalo Mar 18, 2021
e9433d0
Clean things up.
tsalo Mar 18, 2021
24d2ca0
More terms.
tsalo Mar 18, 2021
80b98f8
More terms!
tsalo Mar 18, 2021
ae5c240
More terms.
tsalo Mar 18, 2021
4581f76
Some iEEG terms.
tsalo Mar 21, 2021
333519b
More iEEG terms.
tsalo Mar 21, 2021
823755b
Add ASL labeling terms.
tsalo Mar 21, 2021
b749f9b
Next batch.
tsalo Mar 21, 2021
340e2ae
More terms.
tsalo Mar 21, 2021
c96f7ee
More terms.
tsalo Mar 21, 2021
375f26e
More terms.
tsalo Mar 21, 2021
c00f9bb
Fix mistakes.
tsalo Mar 21, 2021
283b854
Last terms.
tsalo Mar 21, 2021
3624e7b
Reference yamls.
tsalo Mar 21, 2021
49f9c99
Fix mistakes.
tsalo Mar 21, 2021
0ada768
Change format of coordinate system files.
tsalo Mar 21, 2021
c78f0e3
Use degree in associated files.
tsalo Mar 21, 2021
610bcc7
Some of the missing terms.
tsalo Mar 21, 2021
7208960
A few more terms.
tsalo Mar 21, 2021
992957e
Fix typos.
tsalo Mar 21, 2021
89abc4f
More terms.
tsalo Mar 21, 2021
3616940
More terms.
tsalo Mar 21, 2021
630d852
More terms.
tsalo Mar 21, 2021
971a669
More terms.
tsalo Mar 21, 2021
31d2e33
More terms.
tsalo Mar 21, 2021
c489ff7
More terms.
tsalo Mar 21, 2021
83b7f47
Last terms.
tsalo Mar 21, 2021
3b17764
Fix link.
tsalo Mar 21, 2021
f7fa1ec
Fix internal links.
tsalo Mar 21, 2021
33a5413
Fix links for real.
tsalo Mar 21, 2021
feb6441
Derivative terms.
tsalo Mar 21, 2021
32a4c1b
Fix up code link.
tsalo Mar 21, 2021
f20943f
Merge branch 'master' into metadata-schema
tsalo Mar 30, 2021
27fcc3d
Use backslashes for continued strings.
tsalo Apr 3, 2021
94f6aaa
Replace $ref with file contents.
tsalo Apr 3, 2021
4281da8
Fix genetics.
tsalo Apr 3, 2021
6b0d762
Merge branch 'master' into metadata-schema
tsalo Apr 6, 2021
dc0c2af
Describe the structure of metadata YAML files.
tsalo Apr 6, 2021
3346942
Make metadatatype function recursive.
tsalo Apr 6, 2021
b19dc6d
Improve search function.
tsalo Apr 6, 2021
35005fe
Merge branch 'master' into metadata-schema
tsalo Apr 7, 2021
a5b361f
Start adding PET fields.
tsalo Apr 7, 2021
936b545
Add some fields.
tsalo Apr 7, 2021
3353f05
More terms.
tsalo Apr 7, 2021
38b79c3
More terms.
tsalo Apr 7, 2021
bd5fbad
More terms.
tsalo Apr 7, 2021
353d2fe
Fix mistakes.
tsalo Apr 7, 2021
5c44b13
More terms.
tsalo Apr 7, 2021
9d6183e
Replace InstitutionDepartmentName with existing InstitutionalDepartme…
tsalo Apr 7, 2021
3bbc26a
More terms.
tsalo Apr 7, 2021
175d3cd
More terms.
tsalo Apr 7, 2021
cbc46c4
More terms.
tsalo Apr 7, 2021
8f5feeb
More terms.
tsalo Apr 7, 2021
103e86a
More terms.
tsalo Apr 7, 2021
ee349c7
More terms.
tsalo Apr 7, 2021
82b1a1a
More terms.
tsalo Apr 7, 2021
5c7df7d
Last terms.
tsalo Apr 7, 2021
7f0f7b8
Add unit format for strings.
tsalo Apr 7, 2021
652c018
Add dataset_relative and participant_relative string formats.
tsalo Apr 7, 2021
ae351e7
Update READMEs.
tsalo Apr 7, 2021
79fddbe
Fix formats in README.
tsalo Apr 7, 2021
b4c07aa
Support table-specific metadata description extensions.
tsalo Apr 9, 2021
eeaafb5
Employ description extensions with IntendedFor.
tsalo Apr 9, 2021
04120de
Remove explicit defaults from YAML files.
tsalo Apr 9, 2021
7132399
Replace Minimum with minimum.
tsalo Apr 9, 2021
a4cbcc2
Replace inclusiveMaximum with maximum.
tsalo Apr 9, 2021
1b1ef6b
Replace implicit links with explicit ones.
tsalo Apr 9, 2021
0973723
Rename key_name to name.
tsalo Apr 10, 2021
16ed42a
Merge branch 'master' into metadata-schema
tsalo Apr 13, 2021
34fceb6
Rename "Unit" to "Units"
tsalo Apr 13, 2021
10 changes: 10 additions & 0 deletions src/02-common-principles.md
@@ -508,6 +508,16 @@ Note that if a field name included in the data dictionary matches a column name
then that field MUST contain a description of the corresponding column,
using an object containing the following fields:

{{ MACROS___make_metadata_table(
{
"LongName": "OPTIONAL",
"Description": "RECOMMENDED",
"Levels": "RECOMMENDED",
"Units": "RECOMMENDED",
"TermURL": "RECOMMENDED",
}
) }}

| **Key name** | **Requirement level** | **Data type** | **Description** |
| ------------ | --------------------- | ------------------------- | --------------------------------------------------------------------------------------------------------------- |
| LongName | OPTIONAL | [string][] | Long (unabbreviated) name of the column. |
28 changes: 26 additions & 2 deletions src/03-modality-agnostic-files.md
@@ -14,6 +14,23 @@ Templates:
The file `dataset_description.json` is a JSON file describing the dataset.
Every dataset MUST include this file with the following fields:

{{ MACROS___make_metadata_table(
{
"Name": "REQUIRED",
"BIDSVersion": "REQUIRED",
"HEDVersion": "RECOMMENDED",
"DatasetType": "RECOMMENDED",
"License": "RECOMMENDED",
"Authors": "OPTIONAL",
"Acknowledgements": "OPTIONAL",
"HowToAcknowledge": "OPTIONAL",
"Funding": "OPTIONAL",
"EthicsApprovals": "OPTIONAL",
"ReferencesAndLinks": "OPTIONAL",
"DatasetDOI": "OPTIONAL",
}
Collaborator

Having such specifications in the document text would be a big step back from the target of having a "machine readable" schema.

Why not add a metadata key to the corresponding records, e.g. in this case to https://github.com/bids-standard/bids-specification/blob/master/src/schema/top_level_files.yaml#L14, which, similar to the entities in https://github.com/bids-standard/bids-specification/tree/master/src/schema/datatypes, would list the 'requirements' for each one of those? Then the macros could just point to which metadata to render here (e.g. that of top_level_files.dataset_description). Or can you immediately see that this would not work?

Member Author

> Having such specifications in the document text would be a big step back from the target of having a "machine readable" schema.

I see it as an incremental move forward. It definitely doesn't solve a lot of the problems (e.g., requirement levels, conditional relationships between metadata fields), but to be fair those are the hard problems so I don't feel too bad about it.

> Why not add a metadata key to the corresponding records, e.g. in this case to https://github.com/bids-standard/bids-specification/blob/master/src/schema/top_level_files.yaml#L14, which, similar to the entities in https://github.com/bids-standard/bids-specification/tree/master/src/schema/datatypes, would list the 'requirements' for each one of those? Then the macros could just point to which metadata to render here (e.g. that of top_level_files.dataset_description). Or can you immediately see that this would not work?

I'll have to try it out, but that sounds like a good plan for the PR after this one. I figure that, since we want to separate the core definitions from the rules of the schema, we can tackle those two pieces in separate stages.

Member Author

I think the best next step for the schema (after this) is to try to identify the types of rules we can expect from the schema, along with any really weird edge cases. That's what I've tried to do in #620, although I worry that my current list isn't exhaustive. Without a list of necessary rules, I'm not sure if we can define them in YAML files in a way that will satisfy everyone.

Collaborator

Sure, this could and should be a multistage process. I am OK with your proposal. I just thought it might be nice to defer rule identification to a later PR, keeping the descriptions free-form while already allowing a machine-readable specification of possible requirement levels and keeping everything under src/schema. But such a move could indeed be yet another follow-up PR, so as not to block this one.
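
For readers following the thread, the proposal could look roughly like the sketch below. It is only an illustration: the `metadata` key, the dotted path `top_level_files.dataset_description`, and the helper `requirements_for` are hypothetical, and the dictionary stands in for a parsed YAML file. None of it is the actual schema layout or macro code in this PR.

```python
# Hypothetical stand-in for parsed schema YAML; the "metadata" key is an
# assumption made for this sketch, not part of the real schema files.
SCHEMA = {
    "top_level_files": {
        "dataset_description": {
            "metadata": {
                "Name": "REQUIRED",
                "BIDSVersion": "REQUIRED",
                "License": "RECOMMENDED",
                "Authors": "OPTIONAL",
            },
        },
    },
}

VALID_LEVELS = {"REQUIRED", "RECOMMENDED", "OPTIONAL"}


def requirements_for(schema, dotted_path):
    """Return {field name: requirement level} for a record such as
    'top_level_files.dataset_description'."""
    node = schema
    for key in dotted_path.split("."):
        node = node[key]  # raises KeyError if the record does not exist
    levels = node.get("metadata", {})
    unknown = {name: lvl for name, lvl in levels.items() if lvl not in VALID_LEVELS}
    if unknown:
        raise ValueError(f"Unrecognized requirement levels: {unknown}")
    return dict(levels)


if __name__ == "__main__":
    print(requirements_for(SCHEMA, "top_level_files.dataset_description"))
```

A macro call in the Markdown source would then only need to name the record, and the requirement levels would live next to the file definition under `src/schema`.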

) }}

| **Key name** | **Requirement level** | **Data type** | **Description** |
|--------------------|-----------------------|--------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Name | REQUIRED | [string][] | Name of the dataset. |
@@ -69,6 +86,13 @@ In addition to the keys for raw BIDS datasets,
derived BIDS datasets include the following REQUIRED and RECOMMENDED
`dataset_description.json` keys:

{{ MACROS___make_metadata_table(
{
"GeneratedBy": "REQUIRED",
"SourceDatasets": "RECOMMENDED",
Collaborator

SourceDatasets is missing: https://bids-specification--762.org.readthedocs.build/en/762/03-modality-agnostic-files.html#derived-dataset-and-pipeline-description

I guess the script should be adjusted to error out if some field is not "handled"?

Member Author

Ultimately yes, but since I'm still in the process of converting all of the metadata terms, I don't want everything to error out just yet. Right now it just filters out any missing terms and renders the available ones.
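
The lenient behavior described here, and the stricter variant suggested above, can be sketched as follows. This is not the macro code from the PR: `render_metadata_table`, its `strict` flag, and the shape of `known_terms` are assumptions chosen for illustration.

```python
def render_metadata_table(field_levels, known_terms, strict=False):
    """Build a Markdown table for the requested metadata fields.

    field_levels: {field name: requirement level}, as passed to the macro.
    known_terms:  {field name: {"type": ..., "description": ...}} (assumed shape).
    Returns (table_text, missing_field_names).
    """
    missing = [name for name in field_levels if name not in known_terms]
    if strict and missing:
        # Stricter variant: fail the build when a term has no definition yet.
        raise KeyError(f"No term definition for: {', '.join(missing)}")

    lines = [
        "| **Key name** | **Requirement level** | **Data type** | **Description** |",
        "| --- | --- | --- | --- |",
    ]
    for name, level in field_levels.items():
        if name in missing:
            continue  # lenient mode: skip terms that are not converted yet
        term = known_terms[name]
        lines.append(f"| {name} | {level} | {term['type']} | {term['description']} |")
    return "\n".join(lines), missing


if __name__ == "__main__":
    terms = {
        "GeneratedBy": {
            "type": "array of objects",
            "description": "Used to specify provenance of the derived dataset.",
        }
    }
    requested = {"GeneratedBy": "REQUIRED", "SourceDatasets": "RECOMMENDED"}
    table, skipped = render_metadata_table(requested, terms)
    print(table)
    print("Skipped:", skipped)
```

Returning the skipped names alongside the table would let a build script log them now and switch to `strict=True` once every term has been converted.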

}
) }}

| **Key name** | **Requirement level** | **Data type** | **Description** |
|----------------|-----------------------|--------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| GeneratedBy | REQUIRED | [array][] of [objects][] | Used to specify provenance of the derived dataset. See table below for contents of each object. |
@@ -339,15 +363,15 @@ The purpose of this file is to describe timing and other properties of each
imaging acquisition sequence (each *run* file) within one session.

Each neural recording *file* SHOULD be described by exactly one row.
Some recordings consist of multiple parts, that span several files,
Some recordings consist of multiple parts, that span several files,
for example through `echo-`, `part-`, or `split-` entities.
Such recordings MUST be documented with one row per file.

Relative paths to files should be used under a compulsory `filename` header.

If acquisition time is included it should be listed under the `acq_time` header.
Acquisition time refers to when the first data point in each run was acquired.
Furthermore, if this header is provided, the acquisition times of all files that
Furthermore, if this header is provided, the acquisition times of all files that
belong to a recording MUST be identical.

Datetime should be expressed as described in [Units](./02-common-principles.md#units).
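
As an illustration of the rule that all files of one recording share a single acquisition time, the sketch below groups `filename` values by recording and flags groups whose `acq_time` values differ. The helper names and the grouping rule (dropping the `echo-`, `part-`, and `split-` entities from the file name) are assumptions made for this example, not something defined by the specification or the validator.

```python
import csv
import io
import re
from collections import defaultdict

# Entities that, per the text above, can spread one recording over several files.
_MULTIPART = re.compile(r"_(?:echo|part|split)-[A-Za-z0-9]+")


def recording_key(filename):
    """Collapse the files of one recording onto a shared key (assumed rule)."""
    return _MULTIPART.sub("", filename)


def inconsistent_acq_times(scans_tsv_text):
    """Return {recording key: sorted distinct acq_time values} for violations."""
    times = defaultdict(set)
    reader = csv.DictReader(io.StringIO(scans_tsv_text), delimiter="\t")
    for row in reader:
        if row.get("acq_time"):  # acq_time is optional; skip empty cells
            times[recording_key(row["filename"])].add(row["acq_time"])
    return {key: sorted(vals) for key, vals in times.items() if len(vals) > 1}


if __name__ == "__main__":
    # Made-up rows for demonstration only.
    example = (
        "filename\tacq_time\n"
        "func/sub-01_task-rest_echo-1_bold.nii.gz\t2021-03-17T10:00:00\n"
        "func/sub-01_task-rest_echo-2_bold.nii.gz\t2021-03-17T10:00:05\n"
        "anat/sub-01_T1w.nii.gz\t2021-03-17T09:00:00\n"
    )
    print(inconsistent_acq_times(example))
```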