[utils] Sanitize look-alike Unicode glyphs in non-ID filename fields when --restrict-filenames

Implements ytdl-org#31216 (comment), which has a test.
dirkf authored and alxlive committed Feb 27, 2023
1 parent dbb7776 commit 46de920
Showing 1 changed file with 4 additions and 0 deletions.
4 changes: 4 additions & 0 deletions youtube_dl/utils.py
@@ -33,6 +33,7 @@
 import tempfile
 import time
 import traceback
+import unicodedata
 import xml.etree.ElementTree
 import zlib

@@ -2118,6 +2119,9 @@ def replace_insane(char):
             return '_'
         return char

+    # Replace look-alike Unicode glyphs
+    if restricted and not is_id:
+        s = unicodedata.normalize('NFKC', s)
     # Handle timestamps
     s = re.sub(r'[0-9]+(?::[0-9]+)+', lambda m: m.group(0).replace(':', '_'), s)
     result = ''.join(map(replace_insane, s))
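
To illustrate what the added lines do, here is a minimal Python 3 sketch. The helper name `normalize_lookalikes` is hypothetical and not part of youtube_dl; it copies only the NFKC step and the timestamp substitution from the hunk above, and leaves out the later `replace_insane` pass.

```python
import re
import unicodedata


def normalize_lookalikes(s, restricted=True, is_id=False):
    """Hypothetical stand-alone helper mirroring only the lines in this hunk."""
    # Fold compatibility characters (fullwidth forms, ligatures, ...) into
    # their plain equivalents, but only for restricted, non-ID fields.
    if restricted and not is_id:
        s = unicodedata.normalize('NFKC', s)
    # Same timestamp handling as the context lines: 12:34 becomes 12_34.
    s = re.sub(r'[0-9]+(?::[0-9]+)+', lambda m: m.group(0).replace(':', '_'), s)
    return s


print(normalize_lookalikes('\uff21\uff22\uff23'))  # fullwidth "ABC" -> ABC
print(normalize_lookalikes('12\uff1a34'))          # fullwidth colon -> 12_34
```

Because normalization runs before the timestamp substitution, a fullwidth colon is first folded to `:` and then handled like any other timestamp separator. Note that NFKC only folds compatibility characters; visually similar letters from other scripts (for example Cyrillic `а`) are left unchanged by this step.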
