Continue from and log metadata reading errors on import #116

damianmoore · 2020-05-08T14:07:38Z

Issue #115 highlighted an exception happening in photos/utils/metadata.py when communicating with exiftool. This should be logged and not block the rest of the import process.
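A minimal sketch of the requested behaviour, assuming a per-file reader function is available. The names `read_metadata_safely`, `import_photos` and `reader` are hypothetical and not taken from the Photonix code base; the idea is only that a failure on one file is logged with its traceback and the loop carries on.

```python
import logging

logger = logging.getLogger(__name__)


def read_metadata_safely(path, reader):
    """Return reader(path), or an empty dict if the reader raises.

    `reader` is a stand-in for whatever function talks to exiftool.
    """
    try:
        return reader(path)
    except Exception:
        # logger.exception records the message plus the full traceback.
        logger.exception("Could not read metadata for %s", path)
        return {}


def import_photos(paths, reader):
    """Read metadata for every path; one bad file does not stop the rest."""
    return {path: read_metadata_safely(path, reader) for path in paths}
```

Returning an empty dict is one possible choice; callers then have to cope with missing attributes such as the image width.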


damianmoore · 2020-11-22T18:50:46Z

This situation has been improved as of 988a63b. Still need to put more effort and tests into this area.

Red-F · 2021-08-06T11:19:55Z

photonix | 2021-08-06 10:33:09,177 ERROR Error processing task: classify.face - 092e5cfe-0da0-45ad-91a3-7f5242d8f1d2
photonix | Traceback (most recent call last):
photonix |   File "/srv/photonix/photos/utils/classification.py", line 70, in __process_task
photonix |     self.runner(task.subject_id)
photonix |   File "/srv/photonix/classifiers/face/model.py", line 279, in run_on_photo
photonix |     x = (result['box'][0] + (result['box'][2] / 2)) / photo.base_file.width
photonix | TypeError: unsupported operand type(s) for /: 'float' and 'NoneType'

This error is a knock-on effect of an earlier UnicodeDecodeError raised during metadata import from the photo. That exception leaves every metadata attribute undefined except 'MIME Type'. As a result photo.base_file.width is None, which causes the TypeError above when the code tries to divide by it.
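Independently of the decoding fix, the division itself could be guarded. The following is only a sketch: `relative_center_x` is a hypothetical helper, not a function in classifiers/face/model.py, and it assumes the box is laid out as (x, y, width, height), as the expression in the traceback suggests.

```python
def relative_center_x(box, image_width):
    """Return the horizontal centre of `box` as a fraction of the image width.

    Returns None when the width is unknown (None or 0) instead of raising.
    """
    if not image_width:
        return None
    return (box[0] + box[2] / 2) / image_width
```

A caller would then skip the face, or log it, when the result is None.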

The UnicodeDecodeError is thrown because of an invalid UTF-8 byte sequence in one of the metadata fields returned by exiftool, in this case the byte 0xF8, which is not a valid start of a UTF-8 sequence.

PR #303 tries to solve this by passing the 'ignore' error handler to the decode call. This simply skips the invalid bytes, so all metadata is returned as expected. Strictly speaking, at least one of the attributes will then not contain the data 'as it was' in the photo, but since that data is not readable anyway, not much is lost.
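To illustrate the difference between the error handlers, here is a small standalone example. The byte string is made up for the demonstration and is not actual exiftool output; only `bytes.decode` and its standard 'ignore' and 'replace' handlers are used.

```python
# Made-up sample containing a lone 0xF8 byte, which cannot start a UTF-8 sequence.
raw = b"Artist : J\xf8rgen\nMIME Type : image/jpeg\n"

try:
    raw.decode("utf-8")  # the default error handler is 'strict'
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# 'ignore' drops the undecodable byte and keeps everything else.
ignored = raw.decode("utf-8", "ignore")

# 'replace' is an alternative that substitutes U+FFFD, so the loss stays visible.
replaced = raw.decode("utf-8", "replace")
```

With 'ignore' the artist name loses one character but the 'MIME Type' line and all other lines survive intact.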

If ignoring the Unicode errors leaves a line that no longer matches the 'text':'text' pattern, your commit above will catch that. This may again lead to an attribute being dropped, but that is still much better than returning only the 'MIME Type' attribute.
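A rough sketch of such tolerant parsing, assuming each line has the form "name : value". `parse_metadata_lines` is a hypothetical name and this is not the code from the commit mentioned above; it only shows how a malformed line can be dropped on its own without losing the rest.

```python
import logging

logger = logging.getLogger(__name__)


def parse_metadata_lines(output):
    """Parse 'name : value' lines into a dict, skipping lines that do not fit."""
    metadata = {}
    for line in output.splitlines():
        # Split on the first colon only, since values may contain colons too.
        key, sep, value = line.partition(":")
        if not sep or not key.strip():
            logger.warning("Skipping unparseable metadata line: %r", line)
            continue
        metadata[key.strip()] = value.strip()
    return metadata
```

Only the offending attribute is lost; every well-formed line still ends up in the result.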

damianmoore added the backend, bug and python labels May 8, 2020

damianmoore added this to the 1.0 milestone May 8, 2020

Red-F mentioned this issue Aug 6, 2021

ignore (skipping) invalid utf-8 sequences from exiftool #303

Merged
