AttributeError: module 'PIL.Image' has no attribute 'ExifTags' #6881

albertvillanova · 2024-05-08T06:33:57Z

When trying to load an image dataset in an old Python environment (with Pillow-8.4.0), an error is raised:

AttributeError: module 'PIL.Image' has no attribute 'ExifTags'

The error traceback:

~/huggingface/datasets/src/datasets/iterable_dataset.py in __iter__(self)
   1391                 # `IterableDataset` automatically fills missing columns with None.
   1392                 # This is done with `_apply_feature_types_on_example`.
-> 1393                 example = _apply_feature_types_on_example(
   1394                     example, self.features, token_per_repo_id=self._token_per_repo_id
   1395                 )

~/huggingface/datasets/src/datasets/iterable_dataset.py in _apply_feature_types_on_example(example, features, token_per_repo_id)
   1080     encoded_example = features.encode_example(example)
   1081     # Decode example for Audio feature, e.g.
-> 1082     decoded_example = features.decode_example(encoded_example, token_per_repo_id=token_per_repo_id)
   1083     return decoded_example
   1084 

~/huggingface/datasets/src/datasets/features/features.py in decode_example(self, example, token_per_repo_id)
   1974 
-> 1975         return {
   1976             column_name: decode_nested_example(feature, value, token_per_repo_id=token_per_repo_id)
   1977             if self._column_requires_decoding[column_name]

~/huggingface/datasets/src/datasets/features/features.py in <dictcomp>(.0)
   1974 
   1975         return {
-> 1976             column_name: decode_nested_example(feature, value, token_per_repo_id=token_per_repo_id)
   1977             if self._column_requires_decoding[column_name]
   1978             else value

~/huggingface/datasets/src/datasets/features/features.py in decode_nested_example(schema, obj, token_per_repo_id)
   1339         # we pass the token to read and decode files from private repositories in streaming mode
   1340         if obj is not None and schema.decode:
-> 1341             return schema.decode_example(obj, token_per_repo_id=token_per_repo_id)
   1342     return obj
   1343 

~/huggingface/datasets/src/datasets/features/image.py in decode_example(self, value, token_per_repo_id)
    187             image = PIL.Image.open(BytesIO(bytes_))
    188         image.load()  # to avoid "Too many open files" errors
--> 189         if image.getexif().get(PIL.Image.ExifTags.Base.Orientation) is not None:
    190             image = PIL.ImageOps.exif_transpose(image)
    191         if self.mode and self.mode != image.mode:

~/huggingface/datasets/venv/lib/python3.9/site-packages/PIL/Image.py in __getattr__(name)
     75                 )
     76                 return categories[name]
---> 77         raise AttributeError(f"module '{__name__}' has no attribute '{name}'")
     78 
     79 

AttributeError: module 'PIL.Image' has no attribute 'ExifTags'

Environment info

The error occurs since datasets 2.19.0.

rwightman · 2024-07-18T03:18:47Z

@albertvillanova @lhoestq I just ran into this. Requiring a newer Pillow isn't a solution, because it breaks Pillow-SIMD, which is quite a few versions behind Pillow but necessary for training with reasonable throughput.

A couple things here...

This can be done with a method that isn't an issue for any somewhat recent Pillow:

    image = ImageOps.exif_transpose(image)

I'd rather this not be done for me automatically. Sometimes exif data is correct, sometimes it's not. Sometimes I might want to correct the orientation, sometimes I might not.

In any case, if I've preprocessed the images properly myself, I don't want to incur the overhead (possibly further file-pointer seeks and parsing) of loading exif data that isn't loaded or parsed when you just open and decode the image.
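One way to skip the automatic EXIF handling is to take over decoding yourself. The sketch below assumes the column has been cast with `datasets.Image(decode=False)`, so that each example carries a dict with `"bytes"` and/or `"path"` keys; the helper name `decode_without_exif` is hypothetical and only uses Pillow.

```python
from io import BytesIO

from PIL import Image


def decode_without_exif(value):
    """Hypothetical helper: open and decode an image without reading EXIF data.

    `value` is assumed to be a dict with "bytes" and/or "path" keys, the shape
    an undecoded image column is expected to have.
    """
    if value.get("bytes") is not None:
        image = Image.open(BytesIO(value["bytes"]))
    else:
        image = Image.open(value["path"])
    image.load()  # decode the pixel data now so the file handle can be released
    return image
```

Because `getexif()` and `exif_transpose` are never called, the image keeps its stored orientation, and any correction is left to your own preprocessing.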

albertvillanova · 2024-07-18T06:15:57Z

Hi @rwightman, thanks for your feedback.

First, as a side note, please be aware that you are depending on Pillow-SIMD, and that library no longer seems to be maintained:

It has not been updated for more than a year: the last commit to main was on June 20, 2023 (uploadcare/pillow-simd@faae977).
On PyPI, the last release was more than 2 years ago, on January 4, 2022: https://pypi.org/project/Pillow-SIMD/#history

Regarding your suggestions for the datasets library, the changes were introduced by this PR:

Transpose images with EXIF Orientation tag #6739

I agree that maybe we should have offered an option to choose whether or not to perform this operation.

rwightman · 2024-07-18T06:49:29Z

@albertvillanova

Huh, thought I'd just installed the current datasets when I ran into this, maybe it was behind...

I'm aware the support for SIMD is a problem, but it's up to 8x faster than non-SIMD Pillow and really necessary in many training situations; otherwise you end up with lots of idle GPUs. The current situation is unfortunate, but most changes since 9.0 aren't all that important for 'decoding jpegs and resizing'.

albertvillanova added the "bug" (Something isn't working) label May 8, 2024

albertvillanova self-assigned this May 8, 2024

albertvillanova mentioned this issue May 8, 2024

Require Pillow >= 9.4.0 to avoid AttributeError when loading image dataset #6883

Merged

albertvillanova closed this as completed in #6883 May 16, 2024

danieljanes mentioned this issue Jun 28, 2024

AttributeError: module 'PIL.Image' has no attribute 'ExifTags' adap/flower#3687

Closed
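To see whether an environment meets the "Pillow >= 9.4.0" requirement named in #6883, a small version check can help. This is a rough sketch: the names `parse_release` and `meets_minimum` are hypothetical, and only the leading numeric part of the version string is compared, so suffixes such as `.post1` are ignored.

```python
import re

MINIMUM_PILLOW = (9, 4, 0)  # taken from the title of #6883


def parse_release(version):
    """Return the leading numeric components of a version string as a tuple of ints."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"cannot parse version: {version!r}")
    return tuple(int(part) for part in match.group().split("."))


def meets_minimum(version, minimum=MINIMUM_PILLOW):
    """Hypothetical helper: True if `version` is at least `minimum`."""
    release = parse_release(version)
    # Pad short versions such as "9.4" with zeros before comparing.
    release += (0,) * (len(minimum) - len(release))
    return release >= minimum
```

For the installed library, the check would be `meets_minimum(PIL.__version__)` after `import PIL`; the Pillow 8.4.0 from the original report fails it.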
