read_raw_bids() cannot infer recording kind, filename extension, leading to awkward workarounds #408
Comments
We should have a video session to discuss this, but my first reaction is that read_raw_bids should just work with the basename and use glob internally to find a matching extension for the raw data. |
My schedule is flexible. Let me know when you would be available – who else would like to chime in? @jasmainak @sappelhoff (just randomly pinging people here :D) I can also set up a Doodle to find a date that would suit us all. Thanks! |
I don't have a particular opinion about this, but I am happy to listen and chime in when I think I have something important to say. My schedule is also flexible (except Mondays). |
I would suggest going back and forth a little on GitHub first so we can think about the problem. I like the idea of globbing. But one problem with globbing is that you might have two files which match the basename. For example, if someone placed a derivative file with the same basename in the same folder, how do you choose the right file? It would be nice if the validator guaranteed that this cannot happen. Could you check? |
> But one problem with globbing is that you might have two files which match the basename. For example, if someone placed a derivative file with the same basename in the same folder, how do you choose the right file? It would be nice if the validator guaranteed that this cannot happen. Could you check?
Would it be valid to do this? If not, I would say we detect this and raise an exception. |
I think nothing stops you from adding non-compliant extra files in the BIDS folders. But I seem to recall that the validator throws warnings/errors if you do that. Maybe @sappelhoff will know. I'm fine with detecting this case and raising an exception until there is a compelling reason to do otherwise. |
The validator should (and to my knowledge DOES) stop you if you do that. However:
That said, the validator does not go into the
Perhaps we can check whether globbing is ambiguous (several matches returned), and if it is, warn and demand extra params, or ask the user to check their BIDS data? If it's not ambiguous (just a single match), we just proceed? |
@agramfort and I had a brief phone call yesterday to discuss the issue. Here's an idea we came up with:
Scenario
Assume a dataset with simultaneous MEG and EEG recordings. MEG data is in FIFF format, and EEG data is stored as BrainVision.
Current solution
bids_root = '/bids'
bids_basename = make_bids_basename(subject='foo', session='bar', task='baz')
# Load MEG data.
kind = 'meg'
fname_ext = 'fif'
bids_fname = f'{bids_basename}_{kind}.{fname_ext}'
raw_meg = read_raw_bids(bids_fname=bids_fname, bids_root=bids_root)
# Load EEG data.
kind = 'eeg'
fname_ext = 'vhdr'
bids_fname = f'{bids_basename}_{kind}.{fname_ext}'
raw_eeg = read_raw_bids(bids_fname=bids_fname, bids_root=bids_root)
How it could be better
bids_root = '/bids'
bids_basename = make_bids_basename(subject='foo', session='bar', task='baz')
# Load MEG data.
kind = 'meg'
raw_meg = read_raw_bids(bids_basename=bids_basename, bids_root=bids_root, kind=kind)
# Load EEG data.
kind = 'eeg'
raw_eeg = read_raw_bids(bids_basename=bids_basename, bids_root=bids_root, kind=kind)
Notice how
➔ We don't require users to create a
Issues with this approach
I'd be in favor of raising an exception in an ambiguous situation :) |
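A minimal sketch of the glob-and-raise idea from this comment, under stated assumptions: `find_raw_file` is a hypothetical helper and not part of mne-bids, and `RAW_EXTENSIONS` is an illustrative, incomplete set of extensions. The whitelist matters because a BrainVision recording consists of three files (`.vhdr`, `.vmrk`, `.eeg`) that share one basename, so a plain glob that only excludes `.json` sidecars would always look ambiguous for such data.

```python
from pathlib import Path

# Illustrative, incomplete set of "entry point" extensions (an assumption,
# not the list mne-bids uses).
RAW_EXTENSIONS = {'.fif', '.vhdr', '.edf', '.bdf', '.set', '.ds'}


def find_raw_file(bids_root, bids_basename, kind):
    """Return the single raw data file for a basename and kind.

    Hypothetical helper for illustration; not an mne-bids function.
    """
    pattern = f'{bids_basename}_{kind}.*'
    # Search the whole tree so the caller need not know the folder hierarchy.
    candidates = Path(bids_root).rglob(pattern)
    # Keep only known raw data extensions; this drops sidecars such as
    # .json as well as companion files such as .vmrk and .eeg.
    matches = sorted(p for p in candidates if p.suffix in RAW_EXTENSIONS)
    if not matches:
        raise FileNotFoundError(f'No raw file matching {pattern!r}')
    if len(matches) > 1:
        names = ', '.join(p.name for p in matches)
        raise RuntimeError(f'Ambiguous match for {pattern!r}: {names}')
    return matches[0]
```

With such a helper, `find_raw_file('/bids', 'sub-foo_ses-bar_task-baz', 'eeg')` would return the `.vhdr` header without the caller spelling out the extension, and it would raise as soon as a second candidate file appears.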
I think you raise an interesting point though. mne-bids should also be bidsignore-aware when reading the file with globbing.
This is by the way an unlikely scenario in the sense that there are long discussions going on around this. So it's not fully fleshed out in BIDS yet. There might be other mechanisms to disambiguate in this scenario. This is just to say that the |
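To make the bidsignore point concrete, here is a rough sketch of filtering glob results against a `.bidsignore` file. It assumes the file holds one `.gitignore`-style pattern per line and approximates matching with `fnmatch`, which does not implement full gitignore semantics. `load_bidsignore` and `is_ignored` are hypothetical names, not mne-bids or validator APIs.

```python
from fnmatch import fnmatch
from pathlib import Path


def load_bidsignore(bids_root):
    """Read ignore patterns from <bids_root>/.bidsignore, if present."""
    ignore_file = Path(bids_root) / '.bidsignore'
    if not ignore_file.exists():
        return []
    patterns = []
    for line in ignore_file.read_text().splitlines():
        line = line.strip()
        # Skip blank lines and comments.
        if line and not line.startswith('#'):
            patterns.append(line)
    return patterns


def is_ignored(path, bids_root, patterns):
    """Approximate check: match the relative path or the bare file name."""
    path = Path(path)
    rel = path.relative_to(bids_root).as_posix()
    return any(
        fnmatch(rel, pat.lstrip('/')) or fnmatch(path.name, pat)
        for pat in patterns
    )
```

A globbing reader could then drop every candidate for which `is_ignored(...)` is true before checking whether the remaining matches are ambiguous.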
An alternative is to do:
bids_basename = make_bids_basename(subject=subject, kind='meg')
raw = read_raw_bids(bids_basename=bids_basename, bids_root=bids_root)
and then to glob internally in read_raw_bids. |
Yes, so the difference between this proposal and the one I had made is whether the
Briefly skimming over the discussion that @jasmainak linked to, it seems possible that at some point in the future we'll have a piece of metadata in our BIDS datasets that tells us whether multiple modalities were recorded simultaneously, e.g. MEG & EEG: something of the kind
bids_basename = make_bids_basename(subject='foo')
raw_meg = read_raw_bids(bids_basename=bids_basename, bids_root=bids_root, kind='meg')
raw_eeg = read_raw_bids(bids_basename=bids_basename, bids_root=bids_root, kind='eeg')
Thoughts? |
I'm working on this now and will propose a PR shortly. |
Describe the problem
make_bids_folders() accepts a kind kwarg:
mne-bids/mne_bids/utils.py
Lines 200 to 201 in 1ce1717
write_raw_bids() infers the kind from the Raw object:
mne-bids/mne_bids/write.py
Line 1152 in 1ce1717
write_raw_bids() infers the filename extension from the Raw object:
mne-bids/mne_bids/write.py
Lines 1132 to 1138 in 1ce1717
read_raw_bids(), in contrast, requires the "full name" of the input file:
mne-bids/mne_bids/read.py
Lines 331 to 341 in 1ce1717
This makes using read_raw_bids() awkward
This is demonstrated in the example gallery:
mne-bids/examples/convert_mne_sample.py
Lines 78 to 79 in 1ce1717
Since read_raw_bids() knows neither the kind nor the extension of a file, both have to be manually appended to the bids_basename. This leads to code that isn't data-format agnostic.
Describe your solution
kind kwarg to read_raw_bids() to allow specification of the requested recording modality.
mne-study-template is to concat bids_basename + '_' + kind, and then glob the (hopefully correctly assembled!) directory, excluding the .json sidecar files. This doesn't seem like a good solution.
mne-bids-based pipelines to continue working without any modifications.
Additional context
I think this touches the discussion we've had in #407 (comment); namely, that it would not only be nice, but it seems increasingly necessary to have a central registry of all the files in a dataset. In the concrete example, I should be able to tell read_raw_bids(): "Please load raw data from: sub-01, ses-test, meg" and shouldn't be bothered with the filename extension or folder hierarchy.
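One way to picture the "central registry" idea is the sketch below. It assumes file names follow the `key-value` entity pattern joined by underscores, with the kind as the last underscore-separated part. `parse_bids_name` and `build_registry` are hypothetical helpers for illustration, not existing mne-bids functions.

```python
from collections import defaultdict
from pathlib import Path


def parse_bids_name(fname):
    """Split 'sub-01_ses-test_task-x_meg.fif' into entities, kind, extension."""
    stem, _, ext = fname.partition('.')
    *pairs, kind = stem.split('_')
    # Raises ValueError if a part is not of the form key-value.
    entities = dict(pair.split('-', 1) for pair in pairs)
    return entities, kind, '.' + ext


def build_registry(bids_root):
    """Index files by (subject, session, kind); session may be None."""
    registry = defaultdict(list)
    for path in sorted(Path(bids_root).rglob('sub-*_*.*')):
        try:
            entities, kind, _ = parse_bids_name(path.name)
        except ValueError:
            continue  # name does not follow the assumed pattern
        key = (entities.get('sub'), entities.get('ses'), kind)
        registry[key].append(path)
    return registry
```

A lookup such as `build_registry('/bids')[('01', 'test', 'meg')]` then answers "sub-01, ses-test, meg" without the caller knowing the folder hierarchy. Sidecar files appear in the same list, so a reader would still need to pick the raw data file among them.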