Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MRG] Fixing quickstart doc bug, write_raw_bids doc bug, and coordsystem.json overwriting error #675

Merged
merged 16 commits into from
Jan 13, 2021
2 changes: 1 addition & 1 deletion doc/use.rst
Original file line number Diff line number Diff line change
Expand Up @@ -15,7 +15,7 @@ Python
>>> from mne_bids import BIDSPath, write_raw_bids
>>> raw = mne.io.read_raw_fif('my_old_file.fif')
>>> bids_path = BIDSPath(subject='01', session='01, run='05',
datatype='meg', bids_root='./bids_dataset')
datatype='meg', root='./bids_dataset')
>>> write_raw_bids(raw, bids_path=bids_path)

Command Line Interface
Expand Down
1 change: 1 addition & 0 deletions doc/whats_new.rst
Original file line number Diff line number Diff line change
Expand Up @@ -48,6 +48,7 @@ Bug fixes

- Fix writing MEGIN Triux files, by `Alexandre Gramfort`_ (:gh:`674`)
- Anonymization of EDF files in :func:`write_raw_bids` will now convert recording date to ``01-01-1985 00:00:00`` if anonymization takes place, while setting the recording date in the ``scans.tsv`` file to the anonymized date, thus making the file EDF/EDFBrowser compliant, by `Adam Li`_ (:gh:`669`)
- Add fix to MEG ``coordsystem.json`` file writing, where an error will be thrown if ``coordsystem.json`` file path exists, but contents differ, by `Adam Li`_ (:gh:`675`)
adam2392 marked this conversation as resolved.
Show resolved Hide resolved

:doc:`Find out what was new in previous releases <whats_new_previous_releases>`

Expand Down
18 changes: 18 additions & 0 deletions mne_bids/dig.py
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,7 @@
# License: BSD (3-clause)
import json
from collections import OrderedDict
from pathlib import Path

import mne
import numpy as np
Expand Down Expand Up @@ -290,6 +291,23 @@ def _coordsystem_json(*, raw, unit, hpi_coord_system, sensor_coord_system,
'iEEGCoordinateUnits': unit, # m (MNE), mm, cm , or pixels
}

# note that any coordsystem.json file shared within sessions
# will be the same across all runs (currently). So
# overwrite is set to True always
# XXX: improve later when BIDS is updated
# check that there already exists a coordsystem.json
if datatype == 'meg' and Path(fname).exists() and not overwrite:
adam2392 marked this conversation as resolved.
Show resolved Hide resolved
with open(fname, 'r') as fin:
adam2392 marked this conversation as resolved.
Show resolved Hide resolved
coordsystem_dict = json.load(fin)
if fid_json != coordsystem_dict:
raise RuntimeError(f'Trying to write coordsystem.json, but '
f'it already exists at {fname} and the '
f'contents do not match. You must '
f'differentiate this coordsystem.json '
f'file from the existing one, or set '
f'"overwrite" to True.')
adam2392 marked this conversation as resolved.
Show resolved Hide resolved
# set overwrite to True
overwrite = True
adam2392 marked this conversation as resolved.
Show resolved Hide resolved
_write_json(fname, fid_json, overwrite, verbose)
adam2392 marked this conversation as resolved.
Show resolved Hide resolved


Expand Down
119 changes: 118 additions & 1 deletion mne_bids/tests/test_write.py
Original file line number Diff line number Diff line change
Expand Up @@ -43,7 +43,7 @@
mark_bad_channels, write_meg_calibration,
write_meg_crosstalk, get_entities_from_fname)
from mne_bids.utils import (_stamp_to_dt, _get_anonymization_daysback,
get_anonymization_daysback)
get_anonymization_daysback, _write_json)
from mne_bids.tsv_handler import _from_tsv, _to_tsv
from mne_bids.sidecar_updates import _update_sidecar
from mne_bids.path import _find_matching_sidecar
Expand Down Expand Up @@ -2127,6 +2127,123 @@ def test_undescribed_events(_bids_validate, drop_undescribed_events):
_bids_validate(bids_root)


@pytest.mark.parametrize(
'dir_name, fname, reader, datatype, coord_frame', [
('EDF', 'test_reduced.edf', _read_raw_edf, 'ieeg', 'mni_tal'),
('EDF', 'test_reduced.edf', _read_raw_edf, 'ieeg', 'fs_tal'),
('EDF', 'test_reduced.edf', _read_raw_edf, 'ieeg', 'unknown'),
('EDF', 'test_reduced.edf', _read_raw_edf, 'eeg', 'head'),
('EDF', 'test_reduced.edf', _read_raw_edf, 'eeg', 'mri'),
('EDF', 'test_reduced.edf', _read_raw_edf, 'eeg', 'unknown'),
('CTF', 'testdata_ctf.ds', _read_raw_ctf, 'meg', ''),
('MEG', 'sample/sample_audvis_trunc_raw.fif', _read_raw_fif, 'meg', ''), # noqa
]
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@adam2392 I don't have time to review in details but please monitor test time when you add tests. Add such big parametrize has a cost in CI time and needs to be fully justified. It's a general statement. I'd to avoid having mne-bids having a test time lasting tens of minutes. I find it already quite long actually

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ran these locally and it's actually quite fast because all these files are very very small. The chunk using write_raw_bids as a result is very fast.

The tests that generally seem to be taking long periods of time are actually the ones where we test for split of fif files correctly (need a humongous chunk to split it up > 2GB?), and the ones where we test the inspector. This is due to qualitative observation though, not an explicit profiling.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can take these out, but I added these cases for the sake of covering a multiple coordsystems per datatype. Do we want to limit the parametrizations?

Copy link
Member Author

@adam2392 adam2392 Jan 7, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually realized I could do profiling with a one liner cuz I have worked with pytest-profiler before:

Profiling (from /Users/adam2392/Documents/mne-bids/prof/combined.prof):
Thu Jan  7 16:19:56 2021    /Users/adam2392/Documents/mne-bids/prof/combined.prof

         100182265 function calls (96919567 primitive calls) in 218.726 seconds

   Ordered by: cumulative time
   List reduced from 6828 to 20 due to restriction <20>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      179    0.002    0.000  218.719    1.222 runner.py:106(pytest_runtest_protocol)
      179    0.002    0.000  218.650    1.222 runner.py:114(runtestprotocol)
      537    0.005    0.000  218.647    0.407 runner.py:212(call_and_report)
3506/1979    0.018    0.000  218.642    0.110 hooks.py:272(__call__)
3506/1979    0.003    0.000  218.627    0.110 manager.py:90(_hookexec)
3506/1979    0.008    0.000  218.625    0.110 manager.py:84(<lambda>)
3506/1979    0.046    0.000  218.619    0.110 callers.py:157(_multicall)
      537    0.003    0.000  216.582    0.403 runner.py:240(call_runtest_hook)
      537    0.006    0.000  216.573    0.403 runner.py:298(from_call)
      537    0.002    0.000  216.564    0.403 runner.py:255(<lambda>)
      179    0.001    0.000  215.183    1.202 runner.py:153(pytest_runtest_call)
      179    0.001    0.000  215.179    1.202 python.py:1639(runtest)
      179    0.272    0.002  215.169    1.202 python.py:176(pytest_pyfunc_call)
       56    0.001    0.000   81.284    1.451 conftest.py:29(_validate)
       56    0.007    0.000   81.283    1.451 <decorator-gen-2>:1(run_subprocess)
       56    0.073    0.001   81.274    1.451 misc.py:90(run_subprocess)
     7964    0.142    0.000   80.728    0.010 queue.py:153(get)
     7265    0.110    0.000   80.532    0.011 threading.py:270(wait)
    33516   80.368    0.002   80.368    0.002 {method 'acquire' of '_thread.lock' objects}
      203    0.053    0.000   42.583    0.210 write.py:904(write_raw_bids)


======================================================================== slowest 20 durations =========================================================================
35.23s call     mne_bids/tests/test_read.py::test_write_read_fif_split_file
19.52s call     mne_bids/tests/test_write.py::test_write_anat
8.63s call     mne_bids/tests/test_write.py::test_fif
6.81s call     mne_bids/tests/test_inspect.py::test_inspect_annotations
6.54s call     mne_bids/tests/test_write.py::test_eegieeg[Persyst-sub-pt1_ses-02_task-monitor_acq-ecog_run-01_clip2.lay-fn]
6.37s call     mne_bids/tests/test_inspect.py::test_inspect_set_and_unset_bads
6.27s call     mne_bids/tests/test_write.py::test_eegieeg[NihonKohden-MB0400FU.EEG-fn]
6.23s call     mne_bids/tests/test_inspect.py::test_inspect_annotations_remove_all
6.15s call     mne_bids/tests/test_write.py::test_eegieeg[EDF-test_reduced.edf-fn]
5.64s call     mne_bids/tests/test_write.py::test_set
5.56s call     mne_bids/tests/test_inspect.py::test_inspect_bads_and_annotations
3.99s call     mne_bids/tests/test_write.py::test_ctf
3.87s call     mne_bids/tests/test_update.py::test_update_sidecar_jsons
3.84s call     mne_bids/tests/test_inspect.py::test_inspect_single_file[True]
3.29s call     mne_bids/tests/test_inspect.py::test_inspect_single_file[False]
3.28s call     mne_bids/tests/test_write.py::test_mark_bad_channels[MEG 0112-Really bad!-True-True-existing_ch_names7-existing_descriptions7-None-False]
3.19s call     mne_bids/tests/test_write.py::test_mark_bad_channels[ch_names0-None-False-False-existing_ch_names0-existing_descriptions0-None-False]
3.19s call     mne_bids/tests/test_write.py::test_write_meg_calibration
3.18s call     mne_bids/tests/test_write.py::test_mark_bad_channels[ch_names8-descriptions8-False-False-existing_ch_names8-existing_descriptions8-None-False]
3.17s call     mne_bids/tests/test_write.py::test_mark_bad_channels[ch_names6-descriptions6-False-False-existing_ch_names6-existing_descriptions6-meg-False]

If anything, I think the test_eegieeg I parametrized could be widdled down (if possible). Otw, these other tests are at least for now the longest 20.

@pytest.mark.filterwarnings(warning_str['channel_unit_changed'])
@pytest.mark.filterwarnings(warning_str['encountered_data_in'])
@pytest.mark.filterwarnings(warning_str['nasion_not_found'])
def test_coordsystem_json(dir_name, fname, reader, datatype, coord_frame):
"""Tests that coordsystem.json contents are written correctly.

Tests multiple manufacturer data formats and MEG, EEG, and iEEG.

TODO: Fix coordinatesystemdescription for iEEG.
Currently, iEEG coordinate system descriptions are not written
correctly.
"""
bids_root = _TempDir()
data_path = op.join(testing.data_path(), dir_name)
raw_fname = op.join(data_path, fname)

# the BIDS path for test datasets to get written to
bids_path = _bids_path.copy().update(root=bids_root,
datatype=datatype)

raw = reader(raw_fname)
kwargs = dict(raw=raw, bids_path=bids_path, overwrite=True,
verbose=False)

if datatype == 'eeg':
raw.set_channel_types({ch: 'eeg' for ch in raw.ch_names})
landmarks = dict(nasion=[1, 0, 0],
lpa=[0, 1, 0],
rpa=[0, 0, 1])
elif datatype == 'ieeg':
raw.set_channel_types({ch: 'seeg' for ch in raw.ch_names})

# XXX: setting landmarks for ieeg montage for some reason
# transforms coord_frame -> 'CapTrak'. Possible issue in mne
landmarks = dict()

if datatype != 'meg':
# alter some channels manually with electrodes to write
ch_names = raw.ch_names
elec_locs = np.random.random((len(ch_names), 3)).tolist()
ch_pos = dict(zip(ch_names, elec_locs))
montage = mne.channels.make_dig_montage(ch_pos=ch_pos,
coord_frame=coord_frame,
**landmarks)
raw.set_montage(montage)

# write to BIDS and then check the coordsystem files
bids_output_path = write_raw_bids(**kwargs)
coordsystem_fname = _find_matching_sidecar(bids_output_path,
suffix='coordsystem',
extension='.json')
with open(coordsystem_fname, 'r', encoding='utf-8') as fin:
coordsystem_json = json.load(fin)

# if datatype is MEG and there is a change in the underlying
# coordsystem.json file, then an error will occur
if datatype == 'meg':
adam2392 marked this conversation as resolved.
Show resolved Hide resolved
# writing twice should work as long as the coordsystem
# contents have not changed
write_raw_bids(raw=raw, bids_path=bids_path.copy().update(run='02'),
overwrite=False, verbose=False)

# upon changing coordsystem contents, and overwrite not True
# this will fail
new_coordsystem_json = coordsystem_json.copy()
new_coordsystem_json['MEGCoordinateSystem'] = 'Other'
_write_json(coordsystem_fname, new_coordsystem_json, overwrite=True)
with pytest.raises(RuntimeError,
match='Trying to write coordsystem.json, '
'but it already exists'):
write_raw_bids(raw=raw, bids_path=bids_path, overwrite=False,
verbose=False)

# perform checks on the coordsystem.json file itself
if datatype == 'eeg' and coord_frame == 'head':
assert coordsystem_json['EEGCoordinateSystem'] == 'CapTrak'
assert coordsystem_json['EEGCoordinateSystemDescription'] == \
COORD_FRAME_DESCRIPTIONS['captrak']
elif datatype == 'eeg' and coord_frame == 'unknown':
assert coordsystem_json['EEGCoordinateSystem'] == 'CapTrak'
assert coordsystem_json['EEGCoordinateSystemDescription'] == \
COORD_FRAME_DESCRIPTIONS['captrak']
elif datatype == 'ieeg' and coord_frame == 'mni_tal':
assert 'space-mni' in coordsystem_fname
assert coordsystem_json['iEEGCoordinateSystem'] == 'Other'
# assert coordsystem_json['iEEGCoordinateSystemDescription'] == \
# COORD_FRAME_DESCRIPTIONS['mni_tal']
elif datatype == 'ieeg' and coord_frame == 'fs_tal':
assert 'space-fs' in coordsystem_fname
assert coordsystem_json['iEEGCoordinateSystem'] == 'Other'
# assert coordsystem_json['iEEGCoordinateSystemDescription'] == \
# COORD_FRAME_DESCRIPTIONS['fs_tal']
elif datatype == 'ieeg' and coord_frame == 'unknown':
assert coordsystem_json['iEEGCoordinateSystem'] == 'Other'
# assert coordsystem_json['iEEGCoordinateSystemDescription'] == 'n/a'
elif datatype == 'meg' and dir_name == 'CTF':
assert coordsystem_json['MEGCoordinateSystem'] == 'CTF'
assert coordsystem_json['MEGCoordinateSystemDescription'] == \
COORD_FRAME_DESCRIPTIONS['ctf']
elif datatype == 'meg' and dir_name == 'MEG':
assert coordsystem_json['MEGCoordinateSystem'] == 'ElektaNeuromag'
assert coordsystem_json['MEGCoordinateSystemDescription'] == \
COORD_FRAME_DESCRIPTIONS['elektaneuromag']


@pytest.mark.parametrize(
'subject, dir_name, fname, reader', [
('01', 'EDF', 'test_reduced.edf', _read_raw_edf),
Expand Down
5 changes: 3 additions & 2 deletions mne_bids/write.py
Original file line number Diff line number Diff line change
Expand Up @@ -933,14 +933,15 @@ def write_raw_bids(raw, bids_path, events_data=None,
Example::

bids_path = BIDSPath(subject='01', session='01', task='testing',
acquisition='01', run='01', root='/data/BIDS')
acquisition='01', run='01', datatype='meg',
root='/data/BIDS')

This will write the following files in the correct subfolder ``root``::

sub-01_ses-01_task-testing_acq-01_run-01_meg.fif
sub-01_ses-01_task-testing_acq-01_run-01_meg.json
sub-01_ses-01_task-testing_acq-01_run-01_channels.tsv
sub-01_ses-01_task-testing_acq-01_run-01_coordsystem.json
sub-01_ses-01_acq-01_coordsystem.json

and the following one if ``events_data`` is not ``None``::

Expand Down