BIDS validator and datalad. "issue" to mention in the FAQ? #560

Closed
Remi-Gau opened this issue Jul 31, 2020 · 6 comments · Fixed by #562

Comments

@Remi-Gau
Contributor

Is your feature request related to a problem? Please describe.

Not sure if this is a datalad "issue" or a bids-validator one.

TL;DR: Adding something to the FAQ mentioning that a BIDS valid dataset will fail to pass the validator if all its annexed content is dropped.

After a datalad get, the validator will only count the subjects that have some data with no broken link.

This is not a very intuitive behavior and I think that if I did get confused maybe others will too.
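To make the mechanism concrete, here is a minimal sketch of what a dropped annexed file looks like to any program that tries to read it. It assumes a POSIX filesystem where symlinks can be created freely; the file name and the target path are made up for illustration and are not taken from a real annex.

```python
import os
import tempfile
from pathlib import Path


def make_dangling_link(directory: Path) -> Path:
    """Create a symlink whose target does not exist, mimicking a dropped annexed file."""
    link = directory / "sub-01_T1w.nii"
    # Hypothetical target: nothing is ever written at this location.
    link.symlink_to(directory / ".git" / "annex" / "objects" / "missing-key.nii")
    return link


with tempfile.TemporaryDirectory() as tmp:
    link = make_dangling_link(Path(tmp))
    print(link.name in os.listdir(tmp))  # True: the entry shows up in a listing
    print(link.is_symlink())             # True: it is a symlink
    print(link.exists())                 # False: following the link fails
```

The entry is listed like any other file, but a tool that follows the link finds nothing there, which is presumably why a fully dropped dataset looks empty to the validator.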

Additional context

I had made a valid BIDS dataset a long time ago. Look how beautiful it is.
(no comment on the fact that I can't count when it comes to subject number) 😄

~/BIDS/McGurk/rawdata$ tree -d
.
├── stimuli
├── sub-01
│   ├── anat
│   └── func
├── sub-13
│   ├── anat
│   └── func
├── sub-14
│   ├── anat
│   └── func
├── sub-15
│   ├── anat
│   └── func
├── sub-24
│   ├── anat
│   └── func
├── sub-28
│   └── func
├── sub-32
│   ├── anat
│   └── func
├── sub-41
│   ├── anat
│   └── func
├── sub-48
│   ├── anat
│   └── func
├── sub-61
│   ├── anat
│   └── func
├── sub-66
│   ├── anat
│   └── func
├── sub-69
│   ├── anat
│   └── func
├── sub-73
│   ├── anat
│   └── func
├── sub-74
│   ├── anat
│   └── func
├── sub-82
│   ├── anat
│   └── func
└── sub-98
    ├── anat
    └── func

I decided to bring it under version control with datalad to also help with publishing and future work.

So now all the big boys have been annexed.

~/BIDS/McGurk/rawdata$ tree sub-01
sub-01
├── anat
│   ├── sub-01_T1w.json
│   └── sub-01_T1w.nii -> ../../.git/annex/objects/XQ/Qq/MD5E-s21627856--617772f8e9e9aa020a216bd7954fd20b.nii/MD5E-s21627856--617772f8e9e9aa020a216bd7954fd20b.nii
└── func
    ├── sub-01_task-contextmcgurk_run-01_bold.nii -> ../../.git/annex/objects/pZ/kF/MD5E-s109756768--8f1660222186a790936717f17898fd9f.nii/MD5E-s109756768--8f1660222186a790936717f17898fd9f.nii
    ├── sub-01_task-contextmcgurk_run-01_events.tsv
    ├── sub-01_task-contextmcgurk_run-02_bold.nii -> ../../.git/annex/objects/j6/wm/MD5E-s109756768--856ed3a5e494e016a59f18be3a7cc738.nii/MD5E-s109756768--856ed3a5e494e016a59f18be3a7cc738.nii
    ├── sub-01_task-contextmcgurk_run-02_events.tsv

I have pushed it all to GIN

~/BIDS/McGurk/rawdata$ datalad siblings
.: here(+) [git]
.: gin(+) [git@gin.g-node.org:/RemiGau/mc_gurk-raw.git (git)]

~/BIDS/McGurk/rawdata$ datalad status --annex all
183 annex'd files (0.0 B/13.2 GB present/total size)
nothing to save, working tree clean

But now if I run the validator I get this HORROR error.

~/BIDS/McGurk/rawdata$ bids-validator .
bids-validator@1.5.4

	1: [ERR] Quick validation failed - the general folder structure does not resemble a BIDS dataset. Have you chosen the right folder (with "sub-*/" subfolders)? Check for structural/naming issues and presence of at least one subject. (code: 61 - QUICK_VALIDATION_FAILED)
		..
	Please visit https://neurostars.org/search?q=QUICK_VALIDATION_FAILED for existing conversations about this issue.

After recovering from the shock I tried to get some data back.

~/BIDS/McGurk/rawdata$ datalad get sub-01/anat/sub-01_T1w.nii
get(ok): /home/remi/BIDS/McGurk/rawdata/sub-01/anat/sub-01_T1w.nii (file) [from gin...]

And this happens.

~/BIDS/McGurk/rawdata$ bids-validator .
bids-validator@1.5.4

This dataset appears to be BIDS compatible.

        Summary:                Available Tasks:        Available Modalities: 
        9 Files, 20.63MB        context Mc Gurk         T1w                   
        1 - Subject                                                           
        1 - Session                                                           

Same if I get another anat

~/BIDS/McGurk/rawdata$ datalad get sub-13/anat
get(ok): /home/remi/BIDS/McGurk/rawdata/sub-13/anat/sub-13_T1w.nii (file) [from gin...]                                       
get(ok): /home/remi/BIDS/McGurk/rawdata/sub-13/anat (directory)
action summary:
  get (ok: 2)
(base) remi@remi-XPS-15-9570:~/BIDS/McGurk/rawdata$ bids-validator .
bids-validator@1.5.4

This dataset appears to be BIDS compatible.

        Summary:                 Available Tasks:        Available Modalities: 
        11 Files, 41.26MB        context Mc Gurk         T1w                   
        2 - Subjects                                                           
        1 - Session                                                            

But nothing changes if I get one bold func file from another subject.

~/BIDS/McGurk/rawdata$ datalad get sub-14/func/sub-14_task-contextmcgurk_run-01_*
get(ok): /home/remi/BIDS/McGurk/rawdata/sub-14/func/sub-14_task-contextmcgurk_run-01_bold.nii (file) [from gin...]            
action summary:
  get (notneeded: 1, ok: 1)
(base) remi@remi-XPS-15-9570:~/BIDS/McGurk/rawdata$ bids-validator .
bids-validator@1.5.4

This dataset appears to be BIDS compatible.

        Summary:                 Available Tasks:        Available Modalities: 
        11 Files, 41.26MB        context Mc Gurk         T1w                   
        2 - Subjects                                                           
        1 - Session                                                            
@adswa
Contributor

adswa commented Jul 31, 2020

I had made a valid BIDS dataset a long time ago. Look how beautiful it is.

Magnificent. The dataset name brings back joyous memories to old psychology lectures about sensory illusions 😍

(no comment on the fact that I can't count when it comes to subject number) 😄

I'd say it is another level of anonymity, by obfuscating the total subject number with non-sequential IDs 😉

But now if I run the validator I get this HORROR error.
This is not a very intuitive behavior and I think that if I did get confused maybe others will too.

I completely agree. It's frighteningly uninformative ("nooo, my BIDS data") and it's not obvious that this stems from broken symlinks. I agree that it deserves a place in the FAQ. And potentially also a warning or note in chapter three, in the section on broken symlinks.

The issue makes sense but is potentially not obvious at all at the same time. With all data dropped, all symlinks in the dataset are broken, and few tools would be able to make any use of them at all. But for someone just looking at a dataset it may not be clear that the data is dropped and symlinks are broken, unless their shell does helpful highlighting or they know DataLad well enough. There's nothing that can be done on DataLad's side about this IMO, though. Personally, I'd find it helpful if the bids validator would fail a bit more informatively.
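For anyone unsure whether a dataset's content has been dropped, a quick check outside of DataLad is to list the symlinks that do not resolve. The following is a minimal sketch using only the Python standard library; the function name `find_broken_symlinks` is made up for this example, and skipping `.git` is an assumption about what a reader would want to see.

```python
import os
from typing import List


def find_broken_symlinks(dataset_root: str) -> List[str]:
    """Return paths (relative to dataset_root) of symlinks whose target is missing."""
    broken = []
    for dirpath, dirnames, filenames in os.walk(dataset_root):
        # Do not descend into git's own storage.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            full = os.path.join(dirpath, name)
            # islink() looks at the link itself, exists() follows it.
            if os.path.islink(full) and not os.path.exists(full):
                broken.append(os.path.relpath(full, dataset_root))
    return sorted(broken)
```

Calling `find_broken_symlinks(".")` from the dataset root returns one entry per file whose content is not locally present; an empty list means every symlink resolves.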

But nothing changes if I get one bold func file from another subject.

I'm not familiar with the internals of the BIDS validator, but I was under the assumption that it starts finding subjects via the available anatomical files, so I assume it needs to find a resolvable symlink to an anat file to count that subject. (this is me just guessing)
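One way to probe that guess without reading the validator's source is to tabulate, per subject and datatype folder, how many annexed files are present and how many are missing, and then compare that with the subject count the validator reports. This is only a sketch: it assumes the `sub-*/<datatype>/` layout shown in the tree above (no session folders), treats every symlink as an annexed file, and says nothing about how the validator actually works. The function name `annex_presence` is hypothetical.

```python
import os
from collections import defaultdict
from typing import Dict, Tuple


def annex_presence(dataset_root: str) -> Dict[Tuple[str, str], Dict[str, int]]:
    """Count present vs. missing symlinked files per (subject, datatype) folder."""
    counts = defaultdict(lambda: {"present": 0, "missing": 0})
    for subject in sorted(os.listdir(dataset_root)):
        subject_dir = os.path.join(dataset_root, subject)
        if not (subject.startswith("sub-") and os.path.isdir(subject_dir)):
            continue
        for datatype in sorted(os.listdir(subject_dir)):
            datatype_dir = os.path.join(subject_dir, datatype)
            if not os.path.isdir(datatype_dir):
                continue
            for name in os.listdir(datatype_dir):
                path = os.path.join(datatype_dir, name)
                if not os.path.islink(path):
                    continue  # regular files (json, tsv) are stored in git directly
                key = "present" if os.path.exists(path) else "missing"
                counts[(subject, datatype)][key] += 1
    return dict(counts)
```

Running `annex_presence(".")` after each `datalad get` gives a table to set against the validator summary. Folders that contain no symlinks at all do not appear in the result.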

I'll write something up on that and add it in later today. Thanks much for the suggestion! :-)
@all-contributors please add @Remi-Gau for ideas, maintenance

@allcontributors
Contributor

@adswa

I could not determine your intention.

Basic usage: @all-contributors please add @Someone for code, doc and infra

For other usages see the documentation

@adswa
Contributor

adswa commented Jul 31, 2020

@all-contributors please add @Remi-Gau for ideas and maintenance

@allcontributors
Contributor

@adswa

I've put up a pull request to add @Remi-Gau! 🎉

@Remi-Gau
Contributor Author

There's nothing that can be done on DataLad's side about this IMO, though. Personally, I'd find it helpful if the bids validator would fail a bit more informatively.

But nothing changes if I get one bold func file from another subject.

I'm not familiar with the internals of the BIDS validator, but I was under the assumption that it starts finding subjects via the available anatomical files, so I assume it needs to find a resolvable symlink to an anat file to count that subject. (this is me just guessing)

Yeah, I don't necessarily think there is much to do on the datalad side, but I will raise an issue on the bids validator side to see if they have any idea / suggestion. :-)

@Remi-Gau
Contributor Author

Will close this now as the PR seems to have more relevant information.
