Low effort port to charset_normalizer #1690

Closed
wants to merge 1 commit into from

hroncok commented Jul 16, 2022

See PyYoshi/cChardet#77

Note that I have not updated setup/arch-ci.sh and bypy/sources.json, as I don't really understand the files.

@kovidgoyal
Owner

That's a pure python implementation. The whole point of using cchardet
is that it's a lot faster than pure python implementations (calibre used
to use the python chardet package). If cchardet is not being maintained,
I will simply fork it and maintain it myself as I do for many other
python packages.

@kovidgoyal kovidgoyal closed this Jul 16, 2022
@hroncok hroncok deleted the charset_normalizer branch July 16, 2022 07:13
@kovidgoyal
Owner

I have now switched calibre to directly using uchardet, which is what cchardet is an overly elaborate wrapper of.

kovidgoyal added a commit that referenced this pull request Jul 16, 2022
cchardet is not maintained anymore: PyYoshi/cChardet#77

cchardet is based on uchardet with the addition of reporting encoding
detection confidence. We don't really need that, so moving to uchardet is
simplest.

See #1690 (Low effort port to charset_normalizer)
@hroncok
Author

hroncok commented Jul 16, 2022

Thank you, that gets the job done!
