Crunchyroll - Can't Download - "latin-1 codec can't encode character \u2019" #26128

Closed
5 of 6 tasks
werewolf004 opened this issue Jul 27, 2020 · 7 comments

Comments

@werewolf004

Checklist

  • I'm reporting a broken site support issue
  • I've verified that I'm running youtube-dl version 2020.06.16.1
  • I've checked that all provided URLs are alive and playable in a browser
  • I've checked that all URLs and arguments with special characters are properly quoted or escaped
  • I've searched the bugtracker for similar bug reports including closed ones
  • I've read bugs section in FAQ

Verbose log

C:\youtube-dl\youtube_dl>youtube-dl -v -i "https://www.crunchyroll.com/fr/shadow
verse/episode-15-super-rich-miyabi-zaizenji-795760" --user-agent "Mozilla/5.0 (W
indows NT 10.0; Win64; x64) Chrome/83.0.4103.61 Safari/537.36" --cookies cookies
.txt -f best --write-sub --sub-lang frFR --sub-format ass --no-check-certificate

[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['-v', '-i', 'https://www.crunchyroll.com/fr/shadowve
rse/episode-15-super-rich-miyabi-zaizenji-795760', '--user-agent', 'Mozilla/5.0
(Windows NT 10.0; Win64; x64) Chrome/83.0.4103.61 Safari/537.36', '--cookies', '
cookies.txt', '-f', 'best', '--write-sub', '--sub-lang', 'frFR', '--sub-format',
 'ass', '--no-check-certificate']
[debug] Encodings: locale cp1252, fs mbcs, out cp850, pref cp1252
[debug] youtube-dl version 2020.06.16.1
[debug] Python version 3.4.4 (CPython) - Windows-7-6.1.7601-SP1
[debug] exe versions: ffmpeg 4.0.2
[debug] Proxy map: {}
[crunchyroll] 795760: Downloading webpage
ERROR: 'latin-1' codec can't encode character '\u2019' in position 1206: ordinal
 not in range(256)
Traceback (most recent call last):
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpspbsbt
mq\build\youtube_dl\YoutubeDL.py", line 797, in extract_info
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpspbsbt
mq\build\youtube_dl\extractor\common.py", line 530, in extract
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpspbsbt
mq\build\youtube_dl\extractor\crunchyroll.py", line 426, in _real_extract
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpspbsbt
mq\build\youtube_dl\extractor\crunchyroll.py", line 277, in _download_webpage
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpspbsbt
mq\build\youtube_dl\extractor\common.py", line 794, in _download_webpage
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpspbsbt
mq\build\youtube_dl\extractor\common.py", line 660, in _download_webpage_handle
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpspbsbt
mq\build\youtube_dl\extractor\common.py", line 627, in _request_webpage
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpspbsbt
mq\build\youtube_dl\YoutubeDL.py", line 2238, in urlopen
  File "C:\Python\Python34\lib\urllib\request.py", line 464, in open
  File "C:\Python\Python34\lib\urllib\request.py", line 482, in _open
  File "C:\Python\Python34\lib\urllib\request.py", line 442, in _call_chain
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpspbsbt
mq\build\youtube_dl\utils.py", line 2580, in http_open
  File "C:\Python\Python34\lib\urllib\request.py", line 1183, in do_open
  File "C:\Python\Python34\lib\http\client.py", line 1137, in request
  File "C:\Python\Python34\lib\http\client.py", line 1177, in _send_request
  File "C:\Python\Python34\lib\http\client.py", line 1109, in putheader
UnicodeEncodeError: 'latin-1' codec can't encode character '\u2019' in position
1206: ordinal not in range(256)

Description

For the past week I have been unable to download with this command. It always fails with an error about the latin-1 codec and the \u2019 character. I tested on another Windows machine and got the same error.
Does anyone have a solution for this?

@werewolf004
Author

werewolf004 commented Jul 27, 2020

I think the problem comes from this: [debug] Encodings: locale cp1252, fs mbcs, out cp850, pref cp1252
On a friend's machine where it works, the line reads [debug] Encodings: locale cp1252, fs utf-8, out utf-8, pref cp1252. But how can I change mbcs / cp850 to utf-8?

Edit: If I run the youtube_dl installed with pip (python youtube_dl blablabla), FS and OUT are UTF-8, but I still get the same error. It seems to come from the cookie file.

@ImVantexHD

Yep, the problem comes from the cookie file. Did you change your browser language recently?
You can try to save your cookie file as ANSI, that should fix the problem.

@Isis45

Isis45 commented Aug 12, 2020

I have had the same problem since today.
Saving the cookies as ANSI does not solve the problem; the error is the same.

UnicodeEncodeError: 'latin-1' codec can't encode character '\u2019' in position 153: ordinal not in range(256)

@ImVantexHD

ImVantexHD commented Aug 12, 2020

Saving cookies in ANSI does not solve the problem, the error is the same.

oh boy... it should work.
ok, try to edit the cookie file manually and then save it as ANSI, something like this:
[attached images, not reproduced here: b12JoXC13T, cmd_pejxZiGAlG]

@Isis45

Isis45 commented Aug 12, 2020

Thanks for your reply, but I get a different error now:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0x92 in position 1338: invalid start byte

Edit: An important point: if I save the cookies with Chrome, it doesn't work;
with Firefox it works.
So for me the problem is solved.
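
The two errors in this thread look like two views of the same character. Here is a minimal Python sketch, assuming that "ANSI" means Windows code page 1252 (the locale shown in the original verbose log; other machines may differ):

```python
# U+2019 RIGHT SINGLE QUOTATION MARK, the character named in the errors.
quote = "\u2019"

# Assumption: "ANSI" is cp1252, which stores U+2019 as the single byte 0x92.
ansi_bytes = quote.encode("cp1252")
print(ansi_bytes)  # b'\x92'

# A lone 0x92 is not valid UTF-8, so reading the ANSI file as UTF-8 fails.
try:
    ansi_bytes.decode("utf-8")
    decode_reason = None
except UnicodeDecodeError as exc:
    decode_reason = exc.reason
print(decode_reason)  # invalid start byte

# Latin-1 only covers code points 0..255, so encoding U+2019 fails.
try:
    quote.encode("latin-1")
    encode_reason = None
except UnicodeEncodeError as exc:
    encode_reason = exc.reason
print(encode_reason)  # ordinal not in range(256)
```

So re-saving the file in a different encoding only moves the failure from the encode step to the decode step; the character itself is still in the cookie data.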

@Haevens

Haevens commented Aug 20, 2020

Solution:
In the Python install, open Python38-32\Lib\http\client.py.
On line 1228, change values[i] = one_value.encode('latin-1') to values[i] = one_value.encode('utf-8')
It works for me.
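
For context, the line quoted above is where http.client turns a str header value into bytes. The sketch below only compares what the two encodings do with U+2019. The cookie value is made up for illustration, and the thread does not establish whether a server accepts UTF-8 bytes in a header.

```python
# Hypothetical cookie header value containing U+2019 (made up for illustration).
header_value = "session=abc; note=l\u2019episode"

# Stock behaviour: latin-1 cannot represent U+2019, so encoding raises.
try:
    header_value.encode("latin-1")
    latin1_ok = True
except UnicodeEncodeError:
    latin1_ok = False

# Patched behaviour: UTF-8 encodes the character as three bytes instead of raising.
utf8_bytes = header_value.encode("utf-8")

print(latin1_ok)                      # False
print(b"\xe2\x80\x99" in utf8_bytes)  # True
```

Note that editing a standard library file changes behaviour for every program that uses that Python install, not only youtube-dl.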

@dirkf
Contributor

dirkf commented Jun 13, 2022

U+2019 RIGHT SINGLE QUOTATION MARK

This character shouldn't be in the cookie file, and shouldn't be set in a header value. The cookie file is treated as UTF-8 and so yt-dl can read such a character from a cookie file and try to set it in the header.

A fix was proposed in response to #6769 but the fix code has been commented out.

However the problem URL can be downloaded just by omitting the cookie option.
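
To locate the offending entry rather than dropping the cookies altogether, something like the following could be used. It is a sketch under assumptions: the helper name find_non_latin1 is hypothetical and not part of youtube-dl, the sample cookie line is made up, and the file is read as UTF-8 as described above.

```python
import io


def find_non_latin1(lines):
    """Yield (line_number, column, character) for characters outside Latin-1.

    `lines` is any iterable of str, such as an open cookies.txt.
    Hypothetical helper, not part of youtube-dl.
    """
    for line_number, line in enumerate(lines, start=1):
        for column, char in enumerate(line, start=1):
            if ord(char) > 0xFF:  # Latin-1 covers code points 0..255 only
                yield line_number, column, char


# Demo on an in-memory stand-in for a cookie file (contents are made up).
sample = io.StringIO(
    "# Netscape HTTP Cookie File\n"
    ".example.com\tTRUE\t/\tFALSE\t0\tname\tl\u2019valeur\n"
)
hits = list(find_non_latin1(sample))
print(ascii(hits))  # [(2, 35, '\u2019')]
```

For a real file, pass open("cookies.txt", encoding="utf-8") instead of the in-memory sample, then edit or delete the reported cookie.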

@dirkf dirkf closed this as completed Jun 13, 2022