At https://www.marxists.org/espanol/justo/suvida.htm, the charset declared in the HTML header sits unfortunately far into the document: we need 1028 bytes to read the declaration in full, while the default is 1024 bytes.
Currently, we only look at the first 1024 bytes of the content when searching for the charset, a limit that was chosen arbitrarily. This default makes sense as a compromise between the ability to detect every charset and performance / memory footprint. Still, it would help a lot to be able to customize it for rare cases like this one, where the content-type is specified, is somewhat unusual (`windows-1252` here), and we don't mind scanning more bytes of every document.
I suggest adding an option to customize this "magic number".
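To make the request concrete, here is a minimal Python sketch of what a configurable limit could look like. It is not this project's code: `sniff_charset`, `DEFAULT_SNIFF_BYTES`, and the `max_bytes` parameter are hypothetical names chosen for illustration, and the regular expression is a simplified stand-in for whatever detection the project really performs.

```python
import re
from typing import Optional

# Hypothetical default, mirroring the 1024-byte limit described in this issue.
DEFAULT_SNIFF_BYTES = 1024

# Simplified pattern: a charset value inside a <meta> tag. The lookahead
# requires a terminator after the value, so a declaration cut in half by the
# byte limit is rejected instead of yielding a truncated name.
_META_CHARSET = re.compile(
    rb"""<meta[^>]+charset\s*=\s*["']?([A-Za-z0-9._:-]+)(?=["'\s/>;])""",
    re.IGNORECASE,
)


def sniff_charset(content: bytes, max_bytes: int = DEFAULT_SNIFF_BYTES) -> Optional[str]:
    """Return the charset declared in the first `max_bytes` bytes, or None."""
    head = content[:max_bytes]  # only this prefix is ever scanned
    match = _META_CHARSET.search(head)
    if match is None:
        return None
    return match.group(1).decode("ascii").lower()
```

With a page whose declaration ends at byte 1028, `sniff_charset(page)` would find nothing under the default, while `sniff_charset(page, max_bytes=2048)` would return `windows-1252`. Keeping 1024 as the default preserves the current behavior for everyone who does not opt in.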