Avoid crashing the event processing thread on non-utf8 filenames (#812)
* Avoid putting the process in a zombie state when encountering non-unicode filenames

This patch matches the behavior of the python3 branch on python2:
If a file's name is not a valid string in the filesystem's character encoding, it is processed with a filename string in which each undecodable byte is escaped as a lone Unicode surrogate code point (the surrogateescape error handler).

This matches the behavior of os.fsdecode which is used on python 3

https://docs.python.org/3/library/os.html#os.fsdecode

* Update changelog.rst

Co-authored-by: Mickaël Schoentgen <contact@tiger-222.fr>
lovasoa and BoboTiG authored Jul 1, 2021
1 parent e7f29d1 commit 17dd0d7
Showing 2 changed files with 8 additions and 4 deletions.
3 changes: 2 additions & 1 deletion changelog.rst
@@ -8,6 +8,7 @@ Changelog

 202x-xx-xx • `full history <https://github.com/gorakhargosh/watchdog/compare/v0.10.6...python-2.7>`__

+- Avoid crashing the event processing thread on non-utf8 filenames (`#811 <https://github.com/gorakhargosh/watchdog/pull/811>`_)
 - [backport 1.0.0] [mac] Regression fixes for native ``fsevents`` (`#717 <https://github.com/gorakhargosh/watchdog/pull/717>`_)
 - [backport 1.0.0] [windows] ``winapi.BUFFER_SIZE`` now defaults to ``64000`` (instead of ``2048``) (`#700 <https://github.com/gorakhargosh/watchdog/pull/700>`_)
 - [backport 1.0.0] [windows] Introduced ``winapi.PATH_BUFFER_SIZE`` (defaults to ``2048``) to keep the old behavior with path-realted functions (`#700 <https://github.com/gorakhargosh/watchdog/pull/700>`_)
@@ -21,7 +22,7 @@ Changelog
 - [backport 2.0.0] [mac] Support coalesced filesystem events (`#734 <https://github.com/gorakhargosh/watchdog/pull/734>`_)
 - [backport 2.0.0] [mac] Drop support for OSX 10.12 and earlier (`#750 <https://github.com/gorakhargosh/watchdog/pull/750>`_)
 - [backport 2.0.0] [mac] Fix an issue when renaming an item changes only the casing (`#750 <https://github.com/gorakhargosh/watchdog/pull/750>`_)
-- Thanks to our beloved contributors: @SamSchott, @bstaletic, @BoboTiG, @CCP-Aporia, @di, @lukassup, @ysard
+- Thanks to our beloved contributors: @SamSchott, @bstaletic, @BoboTiG, @CCP-Aporia, @di, @lukassup, @ysard, @lovasoa


 0.10.6
9 changes: 6 additions & 3 deletions src/watchdog/utils/unicode_paths.py
@@ -55,10 +55,13 @@ def encode(path):

 def decode(path):
     if isinstance(path, bytes_cls):
+        # Try the filesystem encoding and the fallback encoding.
+        # If all fails, encode invalid characters using surrogate pairs
         try:
             path = path.decode(fs_encoding, 'strict')
         except UnicodeDecodeError:
-            if not platform.is_linux():
-                raise
-            path = path.decode(fs_fallback_encoding, 'strict')
+            try:
+                path = path.decode(fs_fallback_encoding, 'strict')
+            except UnicodeDecodeError:
+                path = path.decode(fs_encoding, 'surrogateescape')
     return path
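
For illustration only, the fallback chain in this hunk can be sketched as a standalone Python 3 snippet. This is not the library's code: FS_ENCODING and FS_FALLBACK_ENCODING are hypothetical stand-ins for the module's fs_encoding and fs_fallback_encoding, and their values here are assumptions chosen so the example is deterministic. It shows that a name which fails both strict decodes still yields a string, and that the original bytes can be recovered from it.

```python
# Hypothetical stand-ins for the module-level names used in the diff;
# the real values in watchdog may differ.
FS_ENCODING = "utf-8"
FS_FALLBACK_ENCODING = "ascii"


def decode(path, fs_encoding=FS_ENCODING, fallback=FS_FALLBACK_ENCODING):
    """Decode a bytes path, never raising UnicodeDecodeError."""
    if isinstance(path, bytes):
        try:
            # First attempt: strict decode with the filesystem encoding.
            return path.decode(fs_encoding, "strict")
        except UnicodeDecodeError:
            try:
                # Second attempt: strict decode with the fallback encoding.
                return path.decode(fallback, "strict")
            except UnicodeDecodeError:
                # Last resort: map each undecodable byte to a lone
                # surrogate in the range U+DC80..U+DCFF.
                return path.decode(fs_encoding, "surrogateescape")
    return path


raw = b"caf\xe9.txt"  # Latin-1 bytes, not valid UTF-8 or ASCII
name = decode(raw)

# The escaped string round-trips back to the original bytes.
restored = name.encode(FS_ENCODING, "surrogateescape")
```

The round trip is the reason this handler is a reasonable last resort: the event still carries a usable path string, and code that needs the on-disk name can re-encode it with the same handler.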
