[Voting closed] Proposal: General conventions for spatial derivatives #1602
Comments
Tagging a few people who I hope can pitch in: |
Personally, I am against it, as this would create a plethora of Xmap suffixes of all sorts, when meas- can take care of a lot of that; I'll let @mnoergaard elaborate for BEP23. |
@CPernet Can you give an example? It seems to produce a maximum of 5 suffixes (boldmap, cbfmap, dwimap, aslmap, petmap) that I can think of, and this would not be a requirement that they be used. It would say that, if a somewhat generic image derivative is needed for that modality, that would be an acceptable suffix. In fact, this currently has reduced the number of suffixes for bold derivatives, as many have been moved into the stat- entity. (Which could be changed to meas- if that turns out to be the consensus.) |
If the goal is to produce those five suffixes then all good, but that is not how it was phrased in the BEP23 section of your proposal, mentioning T1map, T2map, RDmap, BPmap and GEmap (suggesting a plethora of suffixes with these suggestions)? |
Those are examples of either existing or proposed Xmaps in various other BEPs, cited as precedent. Are you saying that RDmap etc are no longer proposed by bep23? |
For BEP23, we now put the outcome measure e.g. binding potential (BP) into meas-BP, so there would be no need for it in the suffix. Furthermore, because this data belongs to PET data, it would always be placed inside a pet directory in the derivatives (i.e. derivatives/pipelinex/sub-XX/ses-XX/pet/), thus making the pet in map slightly redundant. Currently for BEP23, the notion of maps has not been implemented yet (we previously talked about molmap, mimap, etc), mostly because it was solved by other entities. |
What suffix are you using for |
|
@effigies right now we don't have a suffix in line with the PET examples in BEP38 (https://docs.google.com/document/d/1RxW4cARr3-EiBEcXjLpSIVidvnUSHE7yJCUY91i5TfM/edit#heading=h.4k1noo90gelw). If a suffix is required, then I think the use of petmap is the most reasonable one. |
some PET examples:
we just do not see the value in adding _petmap or _mimap |
Suffixes have been a required part of BIDS files up to this point. I guess I'll go comment on that thread; I don't understand what's being gained by removing suffixes. |
' I don't understand what's being gained by removing suffixes' I'm not suggesting removing those that have been useful, I'm saying that it is not always needed, like we don't see the utility in PET to have eg petmap, see above _pet as usual works just fine |
@effigies was confused because the examples didn't have suffixes before the edit in the comment. That said, the rationale behind In a strict sense, Re: datatype folder - some other folders have been proposed, and it is not unimaginable we'll hit a use case where you may need a PETmap outside the |
reasonable argument |
Regarding the proposal of "
to provide an argument against. Those existing suffices serve to distinguish eg. a non-quantitative image that is merely weighted by T1, ie. " Strawmanning the philosophy for a moment, a BIDS App quantifying relaxation tissue properties would be expected to produce " RE distinction between "model fit" and "model-derived" (regardless of what particular terms or suffices might be involved):
Perhaps not currently, but it depends on how forward-thinking we want to be about it. In part, this statement may be expressed from the perspective of "I am a human, and I am reading filesystem names and trying to understand content". Personally I think that with derivatives, especially complex intermediate ones, machine interpretability and robust definition for utilisation in downstream analysis take precedence, whereas for analysis endpoints obviously human readability is again important. Where I saw a prospect for benefit in the distinction:
I would be interested if anyone can provide an example of a software that performs a direct optimisation of the eigenvalues and eigenvectors, rather than optimising the tensor coefficients and then performing an eigendecomposition of the result. In the absence of such I would state that these are unambiguously model-derived. It is the six coefficients of the symmetric rank-2 tensor that are the model fit parameters. I would hope (perhaps naively) that the presentation of such data in this way would be a net insight over time. Maybe part of the problem here is a lack of consensus of the criteria by which that distinction is made. In my own formulation, this is not to do with what may be yielded by any given existing software tools, but with:
Depends on what you mean by "does". It would not be fair to base such an argument on eg. BIDS Apps; while there's insight to be gained from what BIDS Apps developers have encountered and the solutions they've come up with, BIDS Derivatives should IMO not be determined entirely based on the non-conformant derivatives currently being generated by BIDS Apps. Outside of BIDS Apps, in MRtrix3 land this is exactly what we do.
I had previously considered simply "parameter" / " As I've stated previously, I'm not married to that proposal; I'm just yet to figure out / be presented with anything that I see fewer downsides to. |
Except for the final comment about confusion, this is a statement of a fact. Regarding the confusion, I find this proposal less confusing (since these Re: model derived/fit The following discussion should not be here, but rather in the main location where this is being discussed. Please let me know what is the target and I'll happily move the discussion there.
BIDS has been (to great success) pushed by the 80/20 principle. Attempting to capture such complexity with the filesystem organization falls in the 20% of the rule. If being able to represent the difference between derived and fit parameters is necessary, a better solution to BIDS such as NIDM should be considered.
BIDS should not suggest/support/indicate how things are implemented. Probably such a software doesn't exist but the point does not support a more favorable view of the proposal. The relationship between BIDS and practice should be the opposite - if there is a software that explicitly differentiates derived/fit in the naming at the output (therefore disallowing the user to choose their preference) because allowing the flexibility in naming would endanger the safe consumption of the results in further processing, then BIDS should maybe consider it. This is what is meant by no software "does" that (hereby the definition of "does" is complete). |
Hi folks, thanks for the comments! This is a critical discussion that perhaps boils down to the following question: "What level of detail (parameters, models, output, etc.) will we need in the future when generating derivatives?" If we avoid answering this important question now, we should be able to accept the current proposal. MRtrix3 should be able to generate derivative maps compatible with the current proposal. @Lestropie is correct in saying that currently, we are not extremely modular in BIDS. My suggestion here is to avoid discussing modularity and the type and quantity of parameters, and instead discuss those in the context of BIDS 2.0 and the BIDS Provenance BEP (which should be the one providing provenance of the parameters used to generate a dataset). In other words, I am suggesting separating (A) the data generated (derivative) [to be addressed here] from (B) the model and parameters used [more appropriate to be discussed in the context of provenance and/or even BIDS 2.0]. |
I have been convinced by @oesteban's argument above: a _*map is no longer that modality (it is a transform). The issue is all the current _map suffixes, which kind of conflict. |
Hello! @robertoostenveld asked that I point out this document: https://docs.google.com/document/d/1JtTu5u7XTkWxxnCIH6sxGajGn1qG_syJ-p14aejpk3E/edit, and asked that we consider how the principles laid out in that document relate to this proposal. In particular, one principle (not very explicitly laid out in the document, imho) is to try to use entities more and suffixes less. @robertoostenveld identified that as a potential issue with the present proposal being discussed in this issue. Maybe this principle should also be made more explicit in the document. I will just weigh in that this is an objection similar to the one raised by @CPernet above, but as pointed out in previous comments (e.g., #1602 (comment)) it creates a rather limited number of additional suffixes, so might be in line with the principles. |
So if I were to do a 2nd-level statistical analysis on 1st-level contrasts from bold data, would I get a |
Why not |
First, on the specific proposal:
GLM outputs are not currently in anybody's proposal. As noted by @tsalo, The goal of this proposal is more along the lines of first-order (or close enough to make little difference), voxelwise derivatives from a variety of modalities. These could be MD or FA in diffusion, cortical thickness in structural, regional homogeneity in functional. There just isn't a good term for all of these, so we could split them into a large number of suffixes to say "These must be treated as different measures", but that's excessive. The things they have in common are 1) being voxelwise derivatives; 2) generally being an input to some further processing, not an end goal (although they could be in some contexts!). I am not married to On the proposal process: I'm concerned though that we are letting the perfect be the enemy of the good, and the desire many of us have to establish some principles that will apply in all cases is halting progress. This proposal is specifically worded not to be a binding decision on all future BEPs, but as a good-enough solution to a proximal problem seen in multiple BEPs. What I would like to see, and what I heard from Franco and Peer that they would like to see, is some decision that we can go back to our BEPs and implement and not have to relitigate this separately in each BEP and again when it's time for community review. And I want to stress that we are not asking the steering group to join the debate and then decide for us. The steering group as the final authority might need to decide on contentious spec-level decisions, but the thing that needs steering is the community, not the individual decisions in each BEP. What we need is a way to establish community consensus without waiting to the end of the BEP process, and I think the steering group can play a role in creating that process and giving it legitimacy. 
If, at the end of that process the community consensus is clear, I would hope the steering group would announce that consensus, regardless of the consensus (or lack thereof) within the steering group itself. |
This fairly nicely seeds a different perspective by which I wanted to look at this one. Firstly, not sure if it's intentional, but to me, for a broad-reaching proposal like this I wouldn't be constraining to specifically voxel-wise maps. Any format that defines a spatial embedding of data can have stored within it data corresponding to any parameter. I don't see that "map" applies to voxels but not to eg. vertices or electrodes. On one hand that could be a good thing; but conversely it leads into my concern. It's not clear what the scope / limiting principle of the proposal is; where "map" is no longer applicable and something else is either preferable or required. If the result of computing some statistic across volumes is a "map", and a voxel-wise parameter estimated from a biophysical model is a "map", then what are the criteria by which something is not a "map"? I'll derive an example of this from BEP016, since it's where most of my thinking on the topic has been. For things such as MD or FA calculated from the tensor model fit, I can absolutely see the appeal of those being broadcast to users as "maps". Indeed even something like NODDI's Linking back to the whole MDP / MFP thing, my observation was that often times we'll fit a complex orientationally-dependent model and then derive some scalar parameter from it since it's better for visualisation / analysis than the whole model fit result, but conversely, there are some models for which there is a scalar parameter that is genuinely a component of the model fitting procedure (eg. NODDI's |
Extending your list by one would be " IN BEP016-land, this would involve giving each "data representation" its own suffix (eg. " |
I do not have much of a position about what to do with non-scalar "maps". The closest that BEP12 comes to that is per-voxel regressors or spatiotemporal decompositions, and we have given them different suffixes ( And I do agree that I would extend I'm not 100% sure you're proposing |
Probably moreso " Doesn't resolve the issues with "map" regarding applicability / scope (including to data for which more specific suffices already exist), and it'd be a sizeable strategic sidestep in terms of what attributes determine suffices. So I'm not a massive fan myself either. But it's nevertheless an option that was historically considered in this context. Alternatively, we could try massaging "map" a little. I mentioned I'd contemplated "parameter". In the context of statistical outputs, " |
It may seem trivial discussing even the name, but we don't want to reiterate the 'atlas' naming issue. @effigies is however right, I think we should 1. Agree on some naming now (map, parmap, whatever) 2. Let the BEPs work with that naming for a while so we are moving forward 3. Re-evaluate and see how that works for everyone. |
Yes, but this is not written in the proposal. I asked about the proposal. My elaboration later on with So the question remains: where in the proposal is this suggested?
I believe the proposal does not give space for the user to arbitrarily create new suffixes.
BIDS should try to offer a compromise between flexibility and user-friendliness. If flexibility is afforded by means of The suffix and the prefix are traditionally the most relevant parts of names in practice. Making unspecific suffixes will inevitably lead to discussing a strong ordering criteria for entities, because humans will automatically drop the unspecific suffix from the filename and focus on the next bit. And again, let me repeat that the spirit aspect of the proposal is also very relevant. |
Here it states: "Introduce a new suffix pattern: _<suffix>map, where <suffix> is a BIDS suffix used in the raw data (e.g., dwi or bold). For example, the proposed pattern produces the suffixes _dwimap or _boldmap." This proposal makes |
Although I disagree that recursion can be implied, further down it says:
So, the only situation where you could get a This is probably something to consider when developing the derivatives of those particular suffixes, and this proposal is also made as a recommendation. It could have some language anticipating this case and excluding existing raw suffixes ending in map from it. Finally, as a redux, this proposal could have brought into the light that currently existing "map" suffixes may require some discussion and allow alternatives that allow the "map" extension in derivatives, if "mapmap" is sufficiently annoying (it is a little to me, tbh). |
I would like to see a clearer definition of _*map.ext added to the proposal so that I can better understand the implications. In my current understanding, the suffix determines what type of metadata should be present with a file and restricts which extensions are allowed. For example, the suffixes _bold.ext, _eeg.ext, _meg.ext and _ieeg.ext indicate which .json and .tsv files MUST be present. I don't yet understand how that will be applied to the general _*map.ext. Say I create an _ieegmap.ext: I want to know which metadata I should save for interpretation and how I should, for example, save the type of units contained in this file.
|
@dorahermes I think finding counter-examples that would discourage applying this proposal would be very useful. We did not spend so much time on that, so I agree it's worth attempting. @francopestilli, @arokem, @PeerHerholz and @effigies please correct me if I say something imprecise about the proposal:
This particular proposal does not prescribe extensions for
It's not restricted to NIfTI (e.g., I can see it used with GIFTI and with CIFTI), nor does it impose any other metadata, such as units. Yes, this proposal wants to remain sufficiently flexible so other BEPs can easily adopt or ignore it if it doesn't apply. |
Just as a note, I am not trying to discourage this proposal in general, but some edits and clarifications would be very helpful. Specifically, I would like to see more clearly described how each BEP using |
Apologies if my response can be interpreted in this way. I meant that finding counter-examples is indeed a good idea in this particular case.
I believe the proposal is very open-ended, so I don't think the goal was to limit what metadata and extensions should/must be used. As said above, I agree the proposal could be more explicit about what may be used. Perhaps the most direct examples could come from what should not be done with this proposal:
|
@dorahermes do you see specific issues? Do you think you can provide an example? |
The specific issue I have is that the proposal section is too vague. I would suggest adding something like what I put between []: "Introduce a new suffix pattern: _<suffix>map, where <suffix> is a BIDS suffix used in the raw data (e.g., dwi or bold). For example, the proposed pattern produces the suffixes _dwimap or _boldmap. [BEPs may use this suffix pattern under the conditions specified below and MUST specify the extension and metadata that are required with the suffix.]" @francopestilli would it be possible to add a section with 'Conditions under which BEPs may and may not use the suffix pattern'? For example, would tmap or betamap be allowed under this pattern? Betamap seems particularly underspecified and could refer to a statistic OR a beta oscillation map of some kind without further specification. |
Hi @dorahermes what about something like the following? Introduce a new suffix pattern: (1) The file descriptor does fall under one of the generic derivatives descriptors. @oesteban @effigies @arokem @robertoostenveld @CPernet I asked @PeerHerholz to edit the original post |
FYI 'we' rather not use |
@francopestilli That sounds good to me. @CPernet I suppose I consider Speaking as someone with no electrophys experience, I did not really anticipate that they would use it, as it doesn't really seem to fit the shape of their data. That said, I could see a route to a |
@francopestilli that sounds good, thank you! |
I am fine with this as a proposal to the BIDS extensions guidelines as documented on https://github.com/bids-standard/bids-extensions. |
The originally-posited discussion period is now over, and we would like to move to a vote of support for this proposal, as indicated in the OP. Please indicate your support by adding a 👍 to this comment or your objection by adding a 👎. If the yays exceed the nays, we can close this and move on to create a PR that will add the language that has been settled through this discussion on to the bids extensions guidelines. The voting on this will close in 2 weeks, October 12th at 9am PT. |
A reminder that voting on this issue will close in two days. |
OK. Voting is now closed and the proposal is accepted. @PeerHerholz : when you get a chance, could you please work to incorporate this in https://github.com/bids-standard/bids-extensions? I'll post an issue linking back here, so we can keep track of this. |
This PR and the respective commits aim to introduce a first draft of a new page/section concerning general conventions for BEP development. This was discussed [here](bids-standard/bids-specification#1602) and addresses [this issue](bids-standard#24). To this end, a new page called "General conventions" is introduced within/from which specific conventions are included/linked. In the current form, parts of the original discussion/proposal were copy-pasted and adapted within a section called "general conventions for spatial derivatives".
Your idea
Hello @bids-standard/maintainers, @bids-standard/steering & everyone,
we, @francopestilli, @arokem, @effigies, @oesteban and @PeerHerholz, would like to submit a proposal concerning spatial derivatives. We hope to engage in fruitful discussions with you all and further refine our proposal. If you have any questions, please don't hesitate to post them as well.
Abstract
In this issue, we propose a general principle for developing BIDS extension proposals for derivative data. The goal is to establish consensus so that parts of BEPs that propose terms in line with this proposal will be considered accepted in principle. The proposal is to ask for feedback from the community, provide a timeline for the discussion, and settle on a decision-making process. At the end of the timeline, we request a decision be reached. The proposal is RECOMMENDED, not REQUIRED, in that BEPs would be allowed to deviate when deemed necessary.
Problem statement
In working through BEPs 12 and 16, we have identified a repeated pattern in generating derivatives within several imaging modalities' workflows where:
We require a reference map that is used to encode spatial features and parameters. There is an antecedent of this in BIDS with BEP23 (see below). In that BEP, the proposed naming takes the pattern _<suffix>ref (e.g., _boldref, _dwiref, etc.), and that solution has been suggested as a possibility in issue #1532 of the spec repository.
We have derived data that are no longer of the same type as the original, but for which we would like to keep the notion of the modality from which this was derived while also signaling that it is derived (i.e., non-raw).
Proposal
Introduce a new suffix pattern: _<suffix>map, where <suffix> is a BIDS suffix used in the raw data (e.g., dwi or bold). For example, the proposed pattern produces the suffixes _dwimap or _boldmap. BEPs may use this suffix pattern under the conditions specified below and MUST specify the extension and metadata that are required with the suffix.
BIDS spec. For example, statmap cannot be used, because it is already being used, or soon to be, for a different specification.
Motivation
Many users are not equipped to understand fine distinctions between different classes of derivatives (e.g., those that are produced by a model fit and a direct computation)
This suffix pattern provides context through the concatenation of a raw data suffix and the word "map", which implies that the file still contains spatially contiguous information (in contrast to tabular/"tidy" data, with each row representing a brain region, for example).
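As a minimal sketch of the concatenation rule described above, assuming a small illustrative set of raw suffixes and a reserved list that contains only the statmap case named in the proposal (the function name `map_suffix` and both sets are hypothetical, not part of BIDS or any BIDS tooling):

```python
# Illustrative only: these sets are assumptions, not lists taken from the BIDS spec.
KNOWN_RAW_SUFFIXES = frozenset({"bold", "dwi"})
# The proposal names statmap as unavailable; any other entry would be an assumption.
RESERVED_SUFFIXES = frozenset({"statmap"})


def map_suffix(raw_suffix, known_raw=KNOWN_RAW_SUFFIXES, reserved=RESERVED_SUFFIXES):
    """Return '<raw_suffix>map', or raise ValueError if the pattern does not apply."""
    if raw_suffix not in known_raw:
        raise ValueError(f"{raw_suffix!r} is not a known raw-data suffix")
    candidate = f"{raw_suffix}map"
    if candidate in reserved:
        raise ValueError(f"{candidate!r} is already claimed by another specification")
    return candidate


print(map_suffix("dwi"))
print(map_suffix("bold"))
```

The check against a reserved set is one possible way to express the statmap restriction; a BEP adopting the pattern would still need to specify the extension and metadata itself.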
Precedents and interactions with other BEPs
BEP 23: PET Derivatives
BEP 23 has introduced "maps" that correspond to the conventions introduced by BEP 001 (qMRI), such as T1map, T2map, etc. The following maps were introduced:
RDmap (receptor density map)
BPmap (binding potential map)
GEmap (genetic expression map)
These generally will be distributed as mean/standard-deviation pairs, for example: sub-01_stat-mean_desc-5HT_RDmap.nii.gz / sub-01_stat-std_desc-5HT_RDmap.nii.gz.
BEP 12: Functional MRI derivatives
BEP 12 proposes a collection of summary statistics, including mean, standard deviation, temporal SNR, regional homogeneity, etc. Following the example of BEP 23, it has adopted the proposal.
<source_entities>_stat-<mean|std|...>_boldmap.nii.gz
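The filename pattern above could be assembled roughly as follows. This is a sketch under stated assumptions: the helper name `boldmap_filename` is hypothetical, the entity keys in the usage line (sub, task) are only examples, and no ordering rule for entities is enforced beyond the order in which they are passed in.

```python
from typing import Mapping


def boldmap_filename(source_entities: Mapping[str, str], stat: str,
                     extension: str = ".nii.gz") -> str:
    """Build '<source_entities>_stat-<stat>_boldmap<extension>'."""
    # Entities are emitted in the order given; the caller is responsible for that order.
    parts = [f"{key}-{value}" for key, value in source_entities.items()]
    parts.append(f"stat-{stat}")
    return "_".join(parts) + "_boldmap" + extension


print(boldmap_filename({"sub": "01", "task": "rest"}, "mean"))
```

Keeping the statistic in the stat- entity rather than in the suffix is what lets a single boldmap suffix cover mean, std, and the other summary statistics listed for BEP 12.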
BEP 16: diffusion-weighted imaging derivatives
The current writing of the proposal follows the alternative listed below, where model fit and model-derived parameters are described:
<source_entities>_model.<extension>
<source_entities>_mdp.<extension>
This pattern is, in principle, more generalizable across the other ongoing BEPs and Derivatives in general:
A data process might have generated primary parameters that are either 3D (x,y,z) or 4D (x,y,z,v). These parameters might be of help for further data analysis or data interpretation, and ultimately the data end user. Examples include "statistics" such as mean, std, etc., or model derivatives, such as DTI FA.
At the same time, the process might have generated secondary parameters. These are not strictly necessary for further processing or data interpretation, but they can be potentially useful to interpret the outputs of the data process, to track history of the processing, for reproducibility and ultimately for debugging purposes of the developer/modeler of the code.
BEP 39: dimensionality reduction-based networks
The current version of the proposal uses a comparable pattern as outlined for BEP16:
<source_entities>_mdp.<extension>
<source_entities>_mfp.<extension>
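To make the contrast between the two naming schemes concrete, the sketch below reads the suffix from a filename and reports which scheme it follows. It is purely lexical and rests on assumptions: the raw-suffix set is an illustrative subset, and the function names `suffix_of` and `naming_scheme` are hypothetical.

```python
# Assumed subset of raw suffixes, for illustration only.
RAW_SUFFIXES = frozenset({"bold", "dwi"})
# Suffixes used by the model-fit / model-derived alternative in BEP16 and BEP39.
MODEL_SUFFIXES = frozenset({"model", "mdp", "mfp"})


def suffix_of(filename: str) -> str:
    """Return the text between the last underscore and the first following dot."""
    last_chunk = filename.rsplit("_", 1)[-1]
    return last_chunk.partition(".")[0]


def naming_scheme(filename: str) -> str:
    """Classify a filename as 'model-based', 'map-pattern', or 'other'."""
    suffix = suffix_of(filename)
    if suffix in MODEL_SUFFIXES:
        return "model-based"
    if suffix.endswith("map") and suffix[:-3] in RAW_SUFFIXES:
        return "map-pattern"
    return "other"


print(naming_scheme("sub-01_stat-mean_boldmap.nii.gz"))
print(naming_scheme("sub-01_mdp.nii.gz"))
```

Because the check requires the part before "map" to be a raw suffix, pre-existing suffixes such as T1map fall into "other" rather than being mistaken for instances of the proposed pattern.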
Alternatives Considered
Suffixes that distinguish between model-fit and model-derived parameters. This alternative is implemented in the current state of BEP16 and BEP39. We assess this option should be deemed rejectable for the following reasons:
For the word that modifies the <suffix>, the following options have been considered:
tensor: this was deemed rejectable because while the fancy Google branding has run with it, it still means something in physics.
array: all non-scalar data may be considered an array, but it lacks the association with spatial meaning.
image: this was deemed rejectable because in its common usage in neuroimaging software, it implies raw data (e.g., boldimage would most likely be read as an image containing BOLD data).
Allowing each BEP to create separate suffixes that provide a good match to the use-case in that BEP. This is the status quo; it was deemed rejectable because a shared reference rule for future implementations makes both decision making and technical implementation simpler and avoids the proliferation of suffixes.
Decision making
As outlined above, we propose a two-stage decision-making process within a set timeline to reach a consensus. Furthermore, we aim to evaluate the feasibility of this process for other BIDS-related discussions, i.e., community-driven/guided decision-making.
Stage 1
In the first stage, comments from the entire community are solicited and discussed. We suggest a time period of 2 weeks, starting the day after the proposal was initially circulated/posted.
Stage 2
In the second stage, voting on the provided/proposed options (based on the Stage 1 outcomes) will take place. Here, we also suggest a time period of 2 weeks, starting the day after Stage 1 was finished.
After this time, this proposal will become part of the standard operating procedures of BIDS and be referenced in BEP development guidelines.