[ENH] Add reference volumes to common imaging derivatives #1533

Open

effigies
Collaborator

It was discovered that fmriprep, aslprep and qsiprep all generate <suffix>ref volumes for their respective input modalities. Here is an initial proposal. I do not think we need any additional metadata over the universally RECOMMENDED Description and OPTIONAL Sources.

I kept space-<label> and res-<label> but not den-<label>: it makes sense to resample these volumes into target spaces, possibly at resolutions differing from the input images, but there doesn't seem to be a clear analog for surface meshes. I suppose you could theoretically sample the boldref to a surface as a diagnostic, but I figure we should let the use case come up before specifying it.
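As a rough illustration of the naming described above, here is a minimal Python sketch that assembles a reference-volume filename from a `<suffix>ref` suffix plus optional space-<label> and res-<label> entities. The helper name `ref_filename`, the subset of entities it accepts, and all label values are assumptions made for this example; they are not taken from the PR diff.

```python
# Hypothetical helper: build a BIDS-style derivative filename for a
# reference volume. Only a small subset of entities is handled here,
# listed in the usual BIDS order; every value below is made up.
ENTITY_ORDER = ("sub", "ses", "task", "run", "space", "res")


def ref_filename(suffix, entities, extension=".nii.gz"):
    """Return a filename such as 'sub-01_task-rest_boldref.nii.gz'."""
    unknown = set(entities) - set(ENTITY_ORDER)
    if unknown:
        # den-<label> is deliberately rejected, mirroring the choice above.
        raise ValueError(f"unsupported entities: {sorted(unknown)}")
    parts = [f"{key}-{entities[key]}" for key in ENTITY_ORDER if key in entities]
    return "_".join(parts + [f"{suffix}ref"]) + extension


name = ref_filename(
    "bold",
    {"sub": "01", "task": "rest", "space": "MNI152NLin2009cAsym", "res": "2"},
)
print(name)  # sub-01_task-rest_space-MNI152NLin2009cAsym_res-2_boldref.nii.gz
```

Passing a `den` entity to this sketch raises a `ValueError`, which is only one way of expressing that den-<label> is left unspecified for now.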

Contradicting that, I threw cbvref in there, since it's functional data like bold. I have no clue how that's processed, but it seems reasonable to say "if you want a reference file, call it cbvref". In any case, there's nothing stopping modalities from adding another suffix that in practice they use as a reference volume. It is just useful to have a name for the thing that's not an actual average image but is some attempt to make a good registration target from what we are given.

Closes #1532.

@codecov

codecov bot commented Jun 28, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (6d7eb0f) 87.83% compared to head (ae392b9) 87.83%.
Report is 134 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #1533   +/-   ##
=======================================
  Coverage   87.83%   87.83%           
=======================================
  Files          16       16           
  Lines        1356     1356           
=======================================
  Hits         1191     1191           
  Misses        165      165           


@Lestropie
Collaborator

In the current proposal, the filename suffix is the concatenation of two items: some other pre-existing suffix indicating the contrast present in some other image, and "ref". I'm going to throw a spanner in the works and question whether this should be the case, despite some limited precedent.

  1. Not all cases of "3D reference images generated from a > 3D dataset for the purpose of registration" are faithful to the suffix from which they may have originated. For instance:

    1. A mean b=0 from DWI explicitly has no diffusion-weighting; it's just a T2-weighted image.
    2. The reference volume used for registration of DWI data could look nothing like raw DWI data. Could be an FA image (or anything else that could have its own more appropriate filename), could be a pseudo-T1w image like here or here, could be anything else hypothetically.
  2. This is mixing up image content with processing. Pre-BIDS, if I were to point within a dataset at a particular file, where that file were generated by taking the mean of an fMRI time series, and I were to describe those data to a third party, I would not say "that's the fMRI reference volume used for registration".

    1. If describing the data themselves, which is IMO the purview of BIDS, I'd say that the best descriptor of that image is the mean statistic taken along the 4th axis of the BOLD timeseries. See eg. "stat" entity bids-bep016#61 (even though I've not convinced myself that that's the right way to go).

    2. If describing how the data were used, ie. specifically as the reference volume for registration in a previously completed processing pipeline, that is IMO provenance.

    3. If describing what the data are intended to be used for, ie. here's an image that this pipeline thinks would serve well as a reference volume for registration in some subsequent pipeline that uses this derivative dataset as an input, then:

      1. It should be up to that pipeline to decide what of the derivatives available it wants to use for that purpose;
      2. That falls in the domain of intended utilisation in processing. I spoke about this in "Change handling of field mapping information" (bids-2-devel#53), as I think it should be separated from the data as much as possible.
  3. In the worst case scenario, this could lead to a doubling of the number of suffixes, despite a desire to minimize such.

@effigies
Collaborator Author

The purpose of these volumes is for performing, applying and evaluating registrations. They do not map any particular physical quantity, they cannot necessarily be described as a voxel-wise summary statistic of the original series, and two tools generating references do not need to generate the same reference. They are useful for provenance and post-pipeline reuse. For example, when attempting to find a minimal set of derivatives needed to regenerate the rest, this is a critical one.

I'm not sure if your position is that tools should not create these reference volumes, should have heterogeneous names that more closely reflect the contents, or should shove these files in .bidsignore. I think you may be trying to achieve a purity in BIDS that is not really possible.

Not all cases of "3D reference images generated from a > 3D dataset for the purpose of registration" are faithful to the suffix from which they may have originated.

I wrote "A reference volume is a 3D image that is used to represent a 4D series," which is specifically not intended to indicate that the file was generated from the original 4D dataset or that the resulting contrast matches that suffix.

This is mixing up image content with processing.

I think this derivative is both valuable and inextricable from processing.

I'd say that the best descriptor of that image is the mean statistic taken along the 4th axis of the BOLD timeseries

That's not what boldref is. I don't know if it is what aslref or dwiref is. I would say you are welcome to use stat-mean for images where that applies.

In the worst case scenario, this could lead to a doubling of the number of suffixes, despite a desire to minimize such.

This is hyperbolic. There are currently 5 explicitly 4D suffixes (asl, bold, cbv, dwi, pet) where this would clearly apply, and three of them have tools that already do this. For sensor-based modalities, I don't know if there's an equivalent. If so, I would honestly be willing to move this to common derivatives as a principle, because it is so useful.
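To make the count concrete, a small Python sketch of the naming rule applied to the five 4D suffixes listed above. The function name `reference_suffix` is hypothetical, and `petref` in particular is only the mechanical result of the rule, not something this thread confirms any tool produces.

```python
# The five suffixes named above as explicitly 4D.
SERIES_SUFFIXES = ("asl", "bold", "cbv", "dwi", "pet")


def reference_suffix(series_suffix):
    """Map a 4D series suffix to its '<suffix>ref' counterpart."""
    if series_suffix not in SERIES_SUFFIXES:
        raise ValueError(f"{series_suffix!r} is not one of the 4D suffixes")
    return series_suffix + "ref"


ref_suffixes = [reference_suffix(s) for s in SERIES_SUFFIXES]
print(ref_suffixes)  # ['aslref', 'boldref', 'cbvref', 'dwiref', 'petref']
```

Under this rule the upper bound is five new suffixes, one per 4D suffix, rather than one per existing suffix.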

I also don't see minimizing suffixes as an explicit goal so much as a heuristic for finding cases where a few suffixes and one or two entities might replace many suffixes. I have very little concern about boldref taking a suffix that might be used for something else in another context, while something like mean could apply in many cases and it would be good not to claim it for a very narrow case.

Member

@tsalo tsalo left a comment

The addition of reference volume suffixes makes sense to me.

Should Sources be added as a RECOMMENDED metadata field? I know it's OPTIONAL for derivatives, but, as with masks, it seems particularly relevant for reference volumes.

With regard to concerns about adding suffixes, I do think that we generally want to avoid adding unnecessary suffixes when combinations of existing suffixes and entities would be sufficient, but in this case I think these reference volume suffixes are both useful and already widely in use. The only alternative I can think of would be a general reference suffix, but that would just complicate things for a lot of tools.

@jhlegarreta
Contributor

Not entirely sure what the most precise terms are in all this.

  • I agree with what Robert says about the "reference" for DWI registration potentially being a non-diffusion volume. So naming, e.g., a population T1w average (a structural derivative) a DWI reference ("dwiref") seems misleading to me.
  • However, I grant that the description in https://github.com/bids-standard/bids-specification/pull/1533/files#diff-83476cde0b0492fc6c54a092949c7f9bf89f2cf71343ee5f400322aaeab3254cR156 does not imply that the dwiref is necessarily used ("(...) often used (...)") for registration of DWI volumes.
  • To me, the precise terms for registration purposes would be "static" and "moving" images, regardless of the modality (of the "source" and "target" terms, the first may be misleading within BIDS (?), or less clear (?)). Defining what the "static" image should be, or how it should be computed, maybe belongs to "what the data are intended to be used for", following the terms Robert used.
  • Strictly speaking, I'd say that the reference in DWI would be the $S_{0}$ volume (i.e. "b0"/"b=0"; I am not sure about Matt's comment in "Derivative reference volumes for 4D series" #1532 (comment) about these being different). So when using the term "reference" to name, e.g., a T1w image for a DWI volume, maybe we are being too "generous" with the terms.

Sorry for chiming in/if the above comment does not help in reaching an agreement.

@effigies
Collaborator Author

effigies commented Jul 6, 2023

Should Sources be added as a RECOMMENDED metadata field? I know it's OPTIONAL for derivatives, but, as with masks, it seems particularly relevant for reference volumes.

I would be fine with promoting it to RECOMMENDED.
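For illustration only, a minimal Python sketch of what a JSON sidecar for a reference volume might contain if it carried just Description and Sources. The description text, the file path, and the dataset name `raw` in the BIDS URI are all assumptions for this example; a real dataset would need `raw` to be declared under DatasetLinks in its dataset_description.json.

```python
import json

# Hypothetical sidecar for sub-01_task-rest_boldref.nii.gz.
# "Sources" lists the file(s) the reference volume stands in for,
# written as BIDS URIs; the dataset name "raw" is an assumption.
sidecar = {
    "Description": "Reference volume for registration of the BOLD series",
    "Sources": ["bids:raw:sub-01/func/sub-01_task-rest_bold.nii.gz"],
}

text = json.dumps(sidecar, indent=2)
print(text)
```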

the precise terms for registration purposes would be "static" and "moving" images, regardless of the modality

Not sure that that's very useful here, as one process's static image is another process's moving. In the case of a boldref, it is static for motion correction and moving for coregistration. I really think "reference" is a useful term, as it is an image with a definite affine and grid that stands in for any other images that are aligned with it. Its contents need have no meaning apart from their usefulness in registration. If there's a more precise term for this type of image, I'm happy to use it, but it should not be dependent on the direction of registration.

For what it's worth, the original inspiration for boldref was the single-band reference (SBRef) from the Human Connectome Project, which has the property of being a useful stand-in for the BOLD series while not being derived from the series.

@Lestropie
Collaborator

I think you may be trying to achieve a purity in BIDS that is not really possible.

I do that. You may have noticed more generally. :-P I've had plenty of experience of blurring of logical concepts leading to problems in software design / communication, so will advocate for the cleanest separation even if the consequences of failing to do so aren't clear and won't manifest for years. But I have no expectation of always getting my way. Just trying to provide insights based on that experience and seeing which of them those of authority agree with.

I wrote "A reference volume is a 3D image that is used to represent a 4D series," which is specifically not intended to indicate that the file was generated from the original 4D dataset or that the resulting contrast matches that suffix.

I think refining this might provide some guidance / consensus. "Used to represent a series" is very vague.

  1. Which of the following is the case?

    1. "The reference image" is "the image that is used for registration".
    2. "The reference image" is something like "a dataset possessing high contrast, potentially of reduced dimensionality with respect to the original dataset, that best localises the spatial position and internal structure of the data content", for which potential applications include registration but also things like visualisation.

    This might influence choice of suffix / description.

  2. With respect to the choice of suffixes, I had mentioned the (unofficial?) policy of minimising new suffixes. The current proposal I suppose I would place in the intermediate range in this respect: it's greater than one, and less than the total possible number of different image contrasts. But the generation of those new proposed suffixes introduces two potential problems:

    1. The image content may look nothing like what the suffix describes (eg. dwiref; as mentioned earlier above)
    2. In most of these cases, the imaging modality (eg. "dwi") will already be indicated by the directory in which the file resides, and so being a part of the suffix also will be redundant.

    Reason I re-raise i. and add ii. is that there's an alternative option: introduce only one new suffix. The modality of such would be inferred from its directory location. Looks like @tsalo mentioned and discarded it, but I think it's worth considering.
    The data from which it was generated would ideally be encoded via provenance, but it might be appropriate to also define a metadata field that is mandatory for data files with this suffix that lists those data files for which it provides such a spatial reference.
    The stumbling block I see for this approach is where there are two such references generated for a single imaging modality. On first consideration, these could be disambiguated at the file level using _desc-<label> and more precisely from the metadata field above. But this is beyond my personal experience, so I'm curious to know if someone thinks this would be more fundamentally broken.

    And of course it would not be compatible with some BIDS Apps that have already made their own decisions on how to export such images; but as you say, I'm a purist, so I'd prefer to contemplate the decision rather than relying on precedent that may itself have come from just satisfying an immediate requirement. The relationship to SBRef makes sense, and I'd kind of guessed as such already. There "single-band" conveys a lot more about the expected image content, relating to 2.i. above.
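A minimal Python sketch of the single-suffix alternative described above, assuming a hypothetical suffix named `reference` and hypothetical desc labels (`b0`, `fa`); neither is defined by BIDS or proposed verbatim in this thread. It only shows how the modality could be read from the directory and how desc-<label> would separate two references within one modality.

```python
from pathlib import PurePosixPath


def infer_modality(path):
    """Return the datatype directory (e.g. 'dwi' or 'func') holding a file."""
    return PurePosixPath(path).parent.name


# Hypothetical filenames: the 'reference' suffix and desc labels are made up.
paths = [
    "sub-01/dwi/sub-01_desc-b0_reference.nii.gz",
    "sub-01/dwi/sub-01_desc-fa_reference.nii.gz",
    "sub-01/func/sub-01_task-rest_reference.nii.gz",
]

modalities = [infer_modality(p) for p in paths]
print(modalities)  # ['dwi', 'dwi', 'func']
```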

the precise terms for registration purposes would be "static" and "moving" images, regardless of the modality

"Static" and "moving" are best avoided here, since they imply an asymmetric registration, which is not always the case. That's also tying even more strongly to actual processing, as opposed to data content or even intent of processing, which I've expressed my objection to above.

@Remi-Gau Remi-Gau changed the title from "ENH: Add reference volumes to common imaging derivatives" to "[ENH] Add reference volumes to common imaging derivatives" Dec 22, 2023
@Remi-Gau Remi-Gau added the "derivatives" and "MRI" (For things that affect all MRI datatypes) labels Dec 22, 2023