UnicodeDecodeError occurs when reading settings.ini file containing CJK characters on Windows, due to missing encoding parameter #483

Closed
YIsoda opened this issue Sep 7, 2022 · 1 comment
Labels
bug Something isn't working

Comments

YIsoda commented Sep 7, 2022

When I tried to preview, test, or prepare an nbdev project whose settings.ini (UTF-8 encoded) contains CJK (or possibly other non-ASCII) characters, an error such as UnicodeDecodeError: 'cp932' codec can't decode byte 0x82 in position 725: illegal multibyte sequence occurred.

Example of settings and full error message

When the settings file contains a line like

description = サンプル プロジェクト (sample project)

and an nbdev_* command is executed, the output looks like this:

$ nbdev_preview.exe
Traceback (most recent call last):
  File "C:\Users\<user_home>\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\<user_home>\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "<path_to_venv>\Scripts\nbdev_preview.exe\__main__.py", line 7, in <module>
  File "<path_to_venv>\lib\site-packages\fastcore\script.py", line 119, in _f
    return tfunc(**merge(args, args_from_prog(func, xtra)))
  File "<path_to_venv>\lib\site-packages\nbdev\quarto.py", line 278, in preview
    nbdev_quarto.__wrapped__(path, preview=True, **kwargs)
  File "<path_to_venv>\lib\site-packages\nbdev\quarto.py", line 256, in nbdev_quarto
    nbdev.doclinks._build_modidx(skip_exists=True)
  File "<path_to_venv>\lib\site-packages\nbdev\doclinks.py", line 74, in _build_modidx
    if dest is None: dest = get_config().lib_path
  File "<path_to_venv>\lib\site-packages\nbdev\config.py", line 199, in get_config
    cfg = Config(cfg_file.parent, cfg_file.name, extra_files=extra_files, types=_types)
  File "<path_to_venv>\lib\site-packages\fastcore\foundation.py", line 258, in __init__
    found = [Path(o) for o in self._cfg.read(L(extra_files)+[self.config_file])]#, encoding='utf-8')]
  File "C:\Users\<user_home>\AppData\Local\Programs\Python\Python310\lib\configparser.py", line 698, in read
    self._read(fp, filename)
  File "C:\Users\<user_home>\AppData\Local\Programs\Python\Python310\lib\configparser.py", line 1021, in _read
    for lineno, line in enumerate(fp, start=1):
UnicodeDecodeError: 'cp932' codec can't decode byte 0x82 in position 725: illegal multibyte sequence

Version info:
Operating system: Windows 11 Pro (Japanese)
Python 3.10.6
nbdev 2.1.7

This error is likely caused by no encoding being specified here, so the file is decoded with the locale default (cp932 on this machine) instead of UTF-8:
https://github.com/fastai/fastcore/blob/894bf94a3fcab91c85f05bc9a974a747533e9040/fastcore/foundation.py#L258

The error seems to be resolved by passing encoding='utf-8' to the ConfigParser.read() method.
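The proposed fix can be illustrated with a minimal standalone sketch that uses only the standard library `configparser`, not fastcore's actual `Config` class. The helper name `read_settings` and the temporary file are hypothetical and exist only for this illustration. Because the result of reading with the locale default depends on the machine, the sketch uses `encoding="ascii"` as a stand-in for a mismatched codec such as cp932.

```python
import configparser
import tempfile
from pathlib import Path


def read_settings(path, encoding="utf-8"):
    """Hypothetical helper: parse an INI file with an explicit encoding
    rather than relying on the platform's locale default."""
    parser = configparser.ConfigParser()
    parser.read(path, encoding=encoding)
    return parser


with tempfile.TemporaryDirectory() as tmp:
    # Write a UTF-8 settings file containing the line from the report.
    ini_path = Path(tmp) / "settings.ini"
    ini_path.write_text(
        "[DEFAULT]\ndescription = サンプル プロジェクト (sample project)\n",
        encoding="utf-8",
    )

    # Reading with the matching encoding succeeds on any platform.
    cfg = read_settings(ini_path)
    description = cfg["DEFAULT"]["description"]

    # Reading with a mismatched codec fails. "ascii" is used here as a
    # stand-in for a locale default (such as cp932) that does not match.
    try:
        read_settings(ini_path, encoding="ascii")
        mismatch_failed = False
    except UnicodeDecodeError:
        mismatch_failed = True

print(description)
print("mismatched codec raised UnicodeDecodeError:", mismatch_failed)
```

Passing the encoding explicitly makes the result independent of the operating system's locale, which is why the same settings.ini can load on one machine and fail on a Japanese Windows installation.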

YIsoda changed the title from "UnicodeDecodeError occurs when reading settings.ini file containing CJK characters due to missing encoding specification" to "UnicodeDecodeError occurs when reading settings.ini file containing CJK characters due to missing encoding paraneter" Sep 7, 2022
YIsoda changed the title from "UnicodeDecodeError occurs when reading settings.ini file containing CJK characters due to missing encoding paraneter" to "UnicodeDecodeError occurs when reading settings.ini file containing CJK characters due to missing encoding parameter" Sep 7, 2022
YIsoda changed the title from "UnicodeDecodeError occurs when reading settings.ini file containing CJK characters due to missing encoding parameter" to "UnicodeDecodeError occurs when reading settings.ini file containing CJK characters on Windows, due to missing encoding parameter" Sep 7, 2022
seeM added the bug label Sep 8, 2022
jph00 closed this as completed in e1c85ca Sep 8, 2022
Contributor

seeM commented Sep 8, 2022

Thanks for the great issue write-up :D you practically fixed it for us! It should work in latest master
