Recompress compressed DICOM images after redaction #1040

niwilso · 2023-02-23T17:55:45Z

Describe the bug
When running redaction on compressed pixel data, the returned pixel data is uncompressed. This is because when adding boxes via DicomImageRedactorEngine._add_redact_box, we use the loaded DICOM instance's .pixel_array values, which are uncompressed, unlike its .PixelData.

We are still able to redact correctly, but we are then unable to save the redacted instance as a .dcm file.

Side note: If an error occurs while trying to write out the pixel data post-redaction, then gdcm may need to be installed.

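Whether the pixel data has been lossy-compressed can be checked via the DICOM tag (0028, 2110), Lossy Image Compression. A value of '01' means lossy compression has been applied to the image at some point; the Transfer Syntax UID in the file meta identifies the encoding actually in use.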

# `instance` is the dataset loaded with pydicom.dcmread() (see the repro steps)
if instance[0x0028, 0x2110].value == '01':
    compression_method = instance.file_meta.TransferSyntaxUID
    print(f'Pixel data is compressed with Transfer Syntax UID: {compression_method}')
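
The Transfer Syntax UID alone is enough to tell whether Pixel Data is stored encapsulated (compressed), which also covers lossless codecs that tag (0028, 2110) does not flag. Below is a minimal sketch of that check using only the standard library; the function name is made up for illustration, and pydicom's own UID.is_compressed property should give the same answer for standard transfer syntaxes.

```python
# Transfer syntaxes whose Pixel Data is stored natively (not encapsulated).
UNCOMPRESSED_TRANSFER_SYNTAXES = {
    "1.2.840.10008.1.2",       # Implicit VR Little Endian
    "1.2.840.10008.1.2.1",     # Explicit VR Little Endian
    "1.2.840.10008.1.2.1.99",  # Deflated Explicit VR Little Endian
    "1.2.840.10008.1.2.2",     # Explicit VR Big Endian (retired)
}


def is_pixel_data_compressed(transfer_syntax_uid) -> bool:
    """Return True if the transfer syntax implies encapsulated Pixel Data.

    Hypothetical helper: any UID outside the native set is treated as compressed.
    """
    return str(transfer_syntax_uid) not in UNCOMPRESSED_TRANSFER_SYNTAXES


# Usage with a loaded dataset:
# is_pixel_data_compressed(instance.file_meta.TransferSyntaxUID)
```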

To Reproduce
Steps to reproduce the behavior:

import pydicom
from presidio_image_redactor import DicomImageRedactorEngine

# Redact text PHI
engine = DicomImageRedactorEngine()
instance = pydicom.dcmread(PATH_TO_DICOM_FILE)
redacted_instance = engine.redact(instance)

# Calculate bytes
rows = instance[0x0028, 0x0010].value
columns = instance[0x0028, 0x0011].value
samples_per_pixel = instance[0x0028, 0x0002].value
bits_allocated = instance[0x0028, 0x0100].value
try:
    number_of_frames = int(instance[0x0028, 0x0008].value)
except KeyError:
    # Number of Frames is absent for single-frame images
    number_of_frames = 1
expected_num_bytes = rows * columns * number_of_frames * samples_per_pixel * (bits_allocated/8)

print(f"Expected (no compression): {int(expected_num_bytes)}")
print(f"Actual, pre-redaction: {len(instance[0x7fe0, 0x0010].value)}")
print(f"Actual, post-redaction: {len(redacted_instance[0x7fe0, 0x0010].value)}")

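Note that native support for compressing to most transfer syntaxes is not implemented in pydicom yet (at the time of writing, Dataset.compress() handles RLE Lossless only). The following line would be ideal, but it throws an error because no encoder is available for the original transfer syntax.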

redacted_instance.compress(transfer_syntax_uid=compression_method, encoding_plugin='gdcm')
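
One possible stopgap, assuming pydicom 2.2 or later and assuming that changing the transfer syntax to RLE Lossless is acceptable, is to pass the redacted pixel array straight to Dataset.compress(). The sketch below is duck-typed so it has no import of its own; recompress_as_rle and redacted_pixels are hypothetical names, not part of presidio_image_redactor, and the array is assumed to match the dataset's Rows, Columns, Samples per Pixel and Bits Allocated.

```python
RLE_LOSSLESS_UID = "1.2.840.10008.1.2.5"  # RLE Lossless transfer syntax


def recompress_as_rle(ds, redacted_pixels):
    """Re-encode redacted pixels as RLE Lossless on a pydicom Dataset.

    Assumption: `ds` is a pydicom Dataset and `redacted_pixels` is a NumPy
    array holding the redacted image. Dataset.compress() encapsulates the
    encoded frames and updates the Transfer Syntax UID on `ds` in place.
    """
    ds.compress(RLE_LOSSLESS_UID, redacted_pixels)
    return ds


# Usage (hypothetical variables):
# recompress_as_rle(redacted_instance, redacted_pixels)
# redacted_instance.save_as('FILE_NAME_HERE.dcm')
```

This does not restore the original codec (for example JPEG), so the saved file's size will generally differ from the pre-redaction file.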

Expected behavior
With the above, the pixel data would ideally have the same number of bytes before and after redaction. Because compression is not re-applied to previously compressed pixel data, the post-redaction byte count instead equals the size expected with no compression.
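
The "expected with no compression" size is plain arithmetic on the image attributes, so the comparison can be wrapped in a small check. This is a sketch under stated assumptions: both function names are made up for illustration, chroma subsampling is ignored, and one extra byte is allowed because DICOM values are padded to an even length.

```python
def expected_uncompressed_bytes(rows, columns, samples_per_pixel,
                                bits_allocated, number_of_frames=1):
    """Size in bytes of native (uncompressed) Pixel Data, before padding."""
    total_bits = (rows * columns * samples_per_pixel
                  * bits_allocated * number_of_frames)
    return (total_bits + 7) // 8  # round up to whole bytes


def looks_uncompressed(actual_len, expected_len):
    """True if the length matches the native size, allowing one pad byte."""
    return actual_len in (expected_len, expected_len + expected_len % 2)


# Worked example: one 512 x 512 frame, 1 sample per pixel, 16 bits allocated.
expected = expected_uncompressed_bytes(512, 512, 1, 16)
print(expected)  # 524288
```

If looks_uncompressed(len(redacted_instance.PixelData), expected) is true while the Transfer Syntax UID still names a compressed encoding, the dataset is in the inconsistent state described in this issue.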

If we run redacted_instance.save_as('FILE_NAME_HERE.dcm'), then we get the following error (which we want to avoid):

ValueError: With tag (7fe0, 0010) got exception: (7FE0,0010) Pixel Data has an undefined length indicating that it's compressed, but the data isn't encapsulated as required. See pydicom.encaps.encapsulate() for more information
Traceback (most recent call last):
  File "/anaconda/envs/feasibility-study/lib/python3.8/site-packages/pydicom/tag.py", line 28, in tag_in_exception
    yield
  File "/anaconda/envs/feasibility-study/lib/python3.8/site-packages/pydicom/filewriter.py", line 662, in write_dataset
    write_data_element(fp, dataset.get_item(tag), dataset_encoding)
  File "/anaconda/envs/feasibility-study/lib/python3.8/site-packages/pydicom/filewriter.py", line 579, in write_data_element
    raise ValueError(
ValueError: (7FE0,0010) Pixel Data has an undefined length indicating that it's compressed, but the data isn't encapsulated as required. See pydicom.encaps.encapsulate() for more information

Additional context
Potentially helpful resources:

niwilso added the bug and image-anonymization labels Feb 23, 2023

niwilso mentioned this issue Jul 3, 2023: DICOM redactor improvement: Enabling compatibility with compressed images #1105 (Merged)
