AFNI and mixed datatypes #338

Closed
neurolabusc opened this issue Sep 16, 2019 · 16 comments

@neurolabusc
Collaborator

neurolabusc commented Sep 16, 2019

For historical reasons, AFNI natively supports the datatypes signed SHORT (INT16, -32768..32767) and FLOAT (SINGLE32). This matches original 12-bit MRI ADC (0..4095) and the Analyze format datatypes. In contrast, NIfTI includes additional datatypes including Unsigned SHORT (UINT16, 0..65535). Likewise, modern scanners use 16-bit ADC that can also generate unsigned shorts.

By default, dcm2niix will convert a 16-bit DICOM series to INT16 if the raw intensity of every voxel is in the range -32768..32767, but uses UINT16 if any voxel exceeds 32767. This ensures lossless data conversion while supporting the historically more popular INT16 when possible.
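The range-based rule described above can be sketched in a few lines. This is only an illustration of the decision, not dcm2niix's actual code: the function name `choose_datatype` and the use of plain Python lists for voxel data are assumptions made for the example.

```python
# Illustrative sketch of a range-based datatype choice (hypothetical helper,
# not taken from dcm2niix).
INT16_MIN, INT16_MAX = -32768, 32767
UINT16_MAX = 65535


def choose_datatype(voxels):
    """Return the name of the smallest 16-bit NIfTI datatype that holds all voxels."""
    lo, hi = min(voxels), max(voxels)
    if INT16_MIN <= lo and hi <= INT16_MAX:
        return "INT16"   # preferred when the observed range allows it
    if 0 <= lo and hi <= UINT16_MAX:
        return "UINT16"  # needed as soon as one voxel exceeds 32767
    raise ValueError("voxel range does not fit a 16-bit integer type")


# Two runs of the same sequence that differ by a single bright voxel
run1 = [0, 1200, 32767]
run2 = [0, 1200, 32768]
print(choose_datatype(run1), choose_datatype(run2))
```

Because the choice depends on the observed intensities, two otherwise identical runs can end up with different datatypes.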

However, there was an unintended consequence of this decision. Consider an fMRI study where each individual completes more than one run of the tasks. In this case, one series might be saved as INT16 while the other might be saved as UINT16. Since AFNI does not natively support UINT16, AFNI's 3dcalc will convert that run to FLOAT (requiring twice the disk space) while retaining the INT16 datatype for the other run. While AFNI can process each run, it seems that some AFNI tools such as 3dDeconvolve are unable to handle mixed datatypes and generate stats images that are gibberish (see attached).

Hopefully, AFNI can be extended to support mixed data types. In the meantime, AFNI users may consider the following:
1.) Choose -datum float when using 3dcalc. This will force INT16 data to be saved as FLOAT32. While this doubles the disk space, it will avoid this problem.
2.) Build the development branch of dcm2niix (Unix commands below). The new option -l o will retain the original datatype of the DICOM images, thus DICOM UINT16 data will always be saved as UINT16 regardless of the voxel range. AFNI's 3dcalc always converts UINT16 to FLOAT32, so every run ends up with the same datatype. While this doubles the disk space, it will avoid this problem.

git clone --branch development https://github.com/rordenlab/dcm2niix.git
cd dcm2niix/console
make

[Attached image: Stats]

@neurolabusc
Collaborator Author

neurolabusc commented Sep 17, 2019

Does the community have a preference for how UINT16 DICOMs should be converted to NIfTI? Off the top of my head, I think @hanayik, @poldrack, @gllmflndn, @afni-rickr and @satra are members of large teams. As I note below, different conversion tools behave differently. For datasets like openfmri or tools like AFNI, FSL and SPM one wonders if this inconsistent behavior may have unintended consequences (e.g. as I note above, while dcm2niix and Dimon are included with AFNI, both behave in a way that can disrupt AFNI processing). I have no strong views on this, but am happy to implement any consensus.

Consider a DICOM image with the following tags:

(0028,0100) US 16	BitsAllocated
(0028,0101) US 16	BitsStored
(0028,0102) US 15	HighBit
(0028,0103) US 0	[PixelRepresentation = unsigned](http://dicomlookup.com/lookup.asp?sw=Tnumber&q=(0028,0103))

UINT16 DICOM with brightest voxel <32768

  1. Retain original DICOM datatype (UINT16): behavior of dicomtonifti, mrconvert, MRIConvert, mri_convert, SPM12
    • Advantage: this is the format specified by DICOM.
    • Disadvantage: this is not a native datatype for some NIfTI tools, in particular those with an Analyze format heritage. Some tools convert these to FLOAT32, doubling disk requirements.
  2. Convert to short (INT16): behavior of dcm2niix, dicm2nii, dinifti, Dimon, DWIconvert
    • Advantage: this change is lossless, and INT16 is a supported output datatype of fslmaths and AFNI 3dcalc.
    • Disadvantage: in a session with multiple series, it is possible for some series to exceed 32767 (requiring UINT16) while others may not (so can be stored as INT16). Therefore, the same sequence run in the same session could lead to NIfTI images that use different datatypes. This could disrupt some tools (e.g. AFNI).

UINT16 DICOM with brightest voxel >32767

  1. Retain original DICOM datatype (UINT16): behavior of dcm2niix, dicm2nii, dicomtonifti, MRIConvert, mrconvert, SPM12
  2. Convert to FLOAT32: behavior of Dimon, mri_convert
  3. Convert to INT16 (clearly wrong), such that very bright voxels wrap around and are reported as exceptionally dark: behavior of DWIconvert, dinifti 2.33.1
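To see why the third option is wrong, consider what happens when the 16 bits of an unsigned value are simply reinterpreted as signed. The sketch below uses Python's standard `struct` module; the helper name `reinterpret_uint16_as_int16` is made up for this illustration and does not correspond to any converter's code.

```python
import struct


def reinterpret_uint16_as_int16(value):
    """Reinterpret the 16 bits of an unsigned value as a two's-complement signed value."""
    return struct.unpack("<h", struct.pack("<H", value))[0]


# Values up to 32767 survive; anything brighter wraps to a negative number.
for v in (1000, 32767, 32768, 65535):
    print(v, "->", reinterpret_uint16_as_int16(v))
# 1000 -> 1000
# 32767 -> 32767
# 32768 -> -32768
# 65535 -> -1
```

The brightest possible voxel (65535) ends up as -1, and the first value beyond the INT16 range becomes the darkest possible value.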

@poldrack

looping in @oesteban and @effigies

@effigies

I don't have a strong opinion, either, as we're working on the edge where tools are incorrectly interpreting data, which means it's hard to make assumptions about how any data will be treated.

You could reduce the range of invalid interpretations to a single point of failure by using data scaling factors: scl_slope and scl_inter. This would make the algorithm:

img.data = int16(img.data - 32767)
img.header.scl_slope = 1
img.header.scl_inter = 32767
  • Advantages: Deterministic, lossless, high support for INT16, disk space stays constant
  • Disadvantage: Interpretation will probably be float32, increasing memory requirements, possibly disk requirements for derived data.

@neurolabusc
Collaborator Author

@effigies your proposal has clear advantages (though use "32768" instead of "32767" as INT16 is -32768..+32767 while UINT16 is 0..65535). While the benefits might outweigh limitations, potential unintended consequences would be:

  • Legacy software that treats NIfTI as Analyze or simply ignores slope/intercept might be disrupted.

By the way, your method is precisely how MRIcroGL treats 16-bit UINTs internally. It internally stores data as UINT8, INT16 or FLOAT32. When you load a UINT16 volume, it is converted to INT16 as you describe, requiring half the RAM of FLOAT32 and allowing fast INT operations instead of slower FLOAT operations.
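As a rough illustration of the shift-and-scale idea with the corrected offset of 32768, the sketch below stores UINT16 values as INT16 and recovers them with slope 1 and intercept 32768. It uses only the standard library; the function names are hypothetical, and the tuple returned merely stands in for the `scl_slope` and `scl_inter` header fields rather than writing a real NIfTI file.

```python
from array import array

OFFSET = 32768  # maps 0..65535 onto -32768..32767


def uint16_to_shifted_int16(values):
    """Shift unsigned values into signed 16-bit storage; return (stored, slope, inter)."""
    # array("h") holds signed shorts and raises OverflowError if a value does not fit
    stored = array("h", (v - OFFSET for v in values))
    return stored, 1.0, float(OFFSET)


def apply_scaling(stored, slope, inter):
    """Recover the real values the way a reader honoring slope/intercept would."""
    return [s * slope + inter for s in stored]


original = [0, 1, 32767, 32768, 65535]
stored, slope, inter = uint16_to_shifted_int16(original)
restored = apply_scaling(stored, slope, inter)

print(stored.tolist())       # [-32768, -32767, -1, 0, 32767]
print(restored == original)  # True
```

A reader that ignores the intercept would see the stored values on the first printed line, which is the legacy-software risk raised in this comment.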

@effigies

I'm not sure that there's much you can do on the data end to help tools that don't distinguish between NIfTI and Analyze.

Can you distinguish between DICOMs generated by scanners using a 12- and 16-bit ADC, and cast the former to INT16 without scale factors, and the latter using scale factors? That way both are lossless and deterministic, and only new scanner data breaks Analyze?

And good catch on the intercept...

@satra

satra commented Sep 17, 2019

@neurolabusc - i think we could optimize, but it is really the tool's responsibility to support valid NIfTI data. i would also consider that in general it is the tool developers' responsibility to raise errors as necessary for aspects of the format they do not support.

in the context of your original question, these would be my preferred sequence:

  • convert to int16 if feasible
  • leave as native uint16 if not (do not try to compress to float) independent of bad consequences
    • we could add a note during conversion (if anybody reads such things) that certain tools may not accept this data type.
  • offer an option in dcm2niix to allow users to change datatype (personally, i don't think this should be done)

@xiangruili

+1 @satra

@neurolabusc
Collaborator Author

@effigies 12-bit ADC is easy to detect (0028,0101 = 12). DICOM allows either signed or unsigned data (0028,0103), though in my experience most vendors use 12 or 16 bit UNSIGNED and then include a negative intercept. For reference, below is a Siemens Phase map with negative and positive values. It is saved as 12-bit unsigned but adds an intercept.

(0028,0100) US 16                                       #   2, 1 BitsAllocated
(0028,0101) US 12                                       #   2, 1 BitsStored
(0028,0102) US 11                                       #   2, 1 HighBit
(0028,0103) US 0                                        #   2, 1 PixelRepresentation
...
(0028,1052) DS [-4096]                                  #   6, 1 RescaleIntercept
(0028,1053) DS [2]                                      #   2, 1 RescaleSlope
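For the tags above, the real-world value range follows from the usual DICOM rescale relation (real value = stored value × RescaleSlope + RescaleIntercept). A small worked example, with a hypothetical helper name `rescale`:

```python
def rescale(stored, slope, intercept):
    """Apply the DICOM rescale relation: real = stored * slope + intercept."""
    return stored * slope + intercept


BITS_STORED = 12          # (0028,0101)
SLOPE, INTERCEPT = 2, -4096  # (0028,1053), (0028,1052)

max_stored = 2**BITS_STORED - 1  # 4095 for 12-bit unsigned data
print(rescale(0, SLOPE, INTERCEPT))           # -4096
print(rescale(max_stored, SLOPE, INTERCEPT))  # 4094
```

So 12-bit unsigned storage (0..4095) covers -4096..4094 after rescaling, which is how an unsigned image can represent a phase map with both negative and positive values.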

@afni-rickr

The suggestion by @effigies of a shift/scale option sounds good. And having the ability to convert to FLOAT32 as backup behavior would be kind. To me it seems preferable to leave the data in the original format, and only modify it as an optional request by the user.
Note that while supporting mixed data types would be feasible, I would be more inclined to post an error in such a condition, requiring the types to be made consistent up front. It seems like begging for trouble otherwise. Similarly, I imagine many packages/programs would have trouble with mixed data orientations, and would prefer up front consistency.
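An up-front consistency check of this kind could look roughly like the sketch below. It is not AFNI code: the function names are invented for the example, and it assumes uncompressed NIfTI-1 headers are already in memory (a .nii.gz file would need to be decompressed first). In a NIfTI-1 header, `sizeof_hdr` (which must equal 348) is at byte 0 and the 16-bit `datatype` code is at byte 70; codes 4, 16 and 512 denote INT16, FLOAT32 and UINT16.

```python
import struct

NIFTI_DATATYPES = {2: "UINT8", 4: "INT16", 8: "INT32", 16: "FLOAT32", 512: "UINT16"}


def datatype_code(header):
    """Read the datatype code from the first 348 bytes of a NIfTI-1 header."""
    for endian in ("<", ">"):  # sizeof_hdr == 348 reveals the byte order
        if struct.unpack_from(endian + "i", header, 0)[0] == 348:
            return struct.unpack_from(endian + "h", header, 70)[0]
    raise ValueError("not a NIfTI-1 header")


def require_consistent_datatypes(headers):
    """Raise ValueError if the runs (name -> header bytes) do not share one datatype."""
    codes = {name: datatype_code(h) for name, h in headers.items()}
    if len(set(codes.values())) > 1:
        detail = ", ".join(
            f"{name}={NIFTI_DATATYPES.get(code, code)}" for name, code in sorted(codes.items())
        )
        raise ValueError("mixed datatypes across runs: " + detail)
    return next(iter(codes.values()))


def fake_header(code):
    """Build a minimal little-endian header for demonstration only."""
    h = bytearray(348)
    struct.pack_into("<i", h, 0, 348)
    struct.pack_into("<h", h, 70, code)
    return bytes(h)


try:
    require_consistent_datatypes({"run1": fake_header(4), "run2": fake_header(512)})
except ValueError as err:
    print(err)  # mixed datatypes across runs: run1=INT16, run2=UINT16
```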

@satra

satra commented Sep 18, 2019

> To me it seems preferable to leave the data in the original format, and only modify it as an optional request by the user.

+1 to that

@neurolabusc
Collaborator Author

@satra can you clarify your position? Yesterday your preference was "convert to int16 if feasible" (e.g. dcm2niix), whereas today you seem to support "leave the data in the original format" (e.g. SPM12).

Consider the common situation where DICOM stores UINT16 (0028,0100 = 16; 0028,0103 = 0) where the brightest voxel is <32768. In this case it is feasible to losslessly store as a signed INT16, but the original DICOM is explicitly unsigned UINT16 (even in the case where 0028,0101=12, suggesting 12-bit ADC where brightest value is 4095).

I am really not trying to be pedantic here or call you out. Just trying to understand your preference.

So for voting:

  1. UINT16 DICOM saved as INT16 if feasible, else saved as UINT16 (current dcm2niix, dicm2nii)
  2. UINT16 DICOM saved as INT16 if feasible, else saved as FLOAT32 (current Dimon, mri_convert)
  3. UINT16 DICOM data always saved as UINT16 NIfTI (current dicomtonifti, mrconvert, MRIConvert, mri_convert, SPM12)
  4. UINT16 DICOM data always saved as INT16, with intercept adjusted if required (@effigies suggestion)
  5. All the above solutions are valid NIfTI conversions. There is no need for consensus in conversion tools. Any tool that reads NIfTI should be expected to either handle these or generate an error with unsupported or mixed datatypes.

@effigies

I think there's one more option in here:

1a. UINT16 DICOM saved as INT16 if feasible, else saved as UINT16, where "feasible" is determined by the metadata indicating the possible value range is in 0-32767 (such as (0028,0101) US 12) rather than depending on the observed value range.
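Option 1a can be sketched as a decision that looks only at header metadata, so every series from the same sequence gets the same answer regardless of its intensities. The function name and signature below are assumptions for illustration; the sketch assumes BitsAllocated is 16 and ignores any RescaleSlope/RescaleIntercept.

```python
INT16_MAX = 32767


def choose_datatype_from_header(bits_stored, pixel_representation):
    """Decide the NIfTI datatype from (0028,0101) BitsStored and (0028,0103) PixelRepresentation.

    pixel_representation: 0 = unsigned, 1 = signed (two's complement).
    """
    if pixel_representation == 1:
        return "INT16"  # already signed in the DICOM
    largest_possible = 2**bits_stored - 1  # largest value an unsigned pixel can take
    return "INT16" if largest_possible <= INT16_MAX else "UINT16"


print(choose_datatype_from_header(12, 0))  # INT16: possible range is 0..4095
print(choose_datatype_from_header(16, 0))  # UINT16: possible range is 0..65535
```

Unlike the observed-range rule, this never changes between runs that share the same acquisition settings.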

@satra

satra commented Sep 18, 2019

@neurolabusc - i know i dug myself into the ground there!

my personal preference generally is to keep things close to original. so 3 in your list, or more generally maintain the dicom datatype in nifti, and convert/rescale if requested by the user.

the reason i suggested the int16 conversion as an initial middle ground was that it won't really change the operating characteristics. and since nifti is not dicom, we are not really beholden to dicom. there are many things we discard from the dicom metadata.

@mharms
Collaborator

mharms commented Sep 18, 2019

I favor 1, which just preserves the behavior of dcm2niix going back years at this point (to when, IIRC, the HCP consortium requested that @neurolabusc add support for UINT16's that made use of the full dynamic range (i.e., max values > 32767) to support that capability in the CMRR sequences).

@neurolabusc
Collaborator Author

@effigies, your option 1a seems reasonable. Be aware that options 1a, 2 and 3 each influence memory (disk and RAM) requirements and speed. AFNI and FSL tools natively support INT16, but will convert UINT16 to FLOAT32 (AFNI) or INT32 (FSL). In theory, the disk requirements of INT32 and INT16 files with identical data would be similar with a modern compression method (BLOSC+zstd), but for the gz compression used in NIfTI the INT32 files are ~20% larger than INT16 images with the same data.

@neurolabusc
Collaborator Author

After discussion with @effigies and @satra, in the distant future the default behavior of dcm2niix will be changed so that DICOM files that are UINT16 will not be losslessly converted to INT16. The user will have to explicitly request this modification. For the next stable release, dcm2niix will warn users if UINT16 is converted to INT16. This will provide users a chance to prepare for this change. Both methods are lossless, but some projects like dcm_qa will note that the data type has changed.


7 participants