Dump JSON containing multibyte characters #1999

Closed
elh-zbeloki opened this issue Mar 19, 2020 · 1 comment

elh-zbeloki commented Mar 19, 2020

I want to be able to dump JSON containing multibyte characters.

My std::string contains 3 characters (or glyphs) in UTF-8: Año.

In UTF-8, 'ñ' is represented as two bytes: \xC3 \xB1.

So, my string contains 4 bytes: [\x41, \xC3, \xB1, \x6F]
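The byte count above can be checked with a small standard-library sketch. It is not part of the JSON library; the helper names `hex_bytes` and `count_code_points` are made up for illustration, and the code-point counter assumes the input is valid UTF-8.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: render every byte of s as two lowercase hex digits,
// separated by single spaces.
std::string hex_bytes(const std::string& s) {
    std::string out;
    for (unsigned char byte : s) {
        char buf[3];
        std::snprintf(buf, sizeof buf, "%02x", static_cast<unsigned>(byte));
        if (!out.empty()) {
            out += ' ';
        }
        out += buf;
    }
    return out;
}

// Hypothetical helper: count code points by skipping UTF-8 continuation
// bytes (bit pattern 10xxxxxx). Assumes s is valid UTF-8.
std::size_t count_code_points(const std::string& s) {
    std::size_t n = 0;
    for (unsigned char byte : s) {
        if ((byte & 0xC0) != 0x80) {
            ++n;
        }
    }
    return n;
}
```

For the string `"A\xC3\xB1" "o"`, `size()` reports 4 bytes, `hex_bytes` gives `41 c3 b1 6f`, and `count_code_points` gives 3. The literal is split in two so that the `o` is clearly separate from the preceding hex escape.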

The problem appears when I dump the JSON containing this string. It seems that the library dumps each byte as a separate Latin-1 character: "{"text": "Año"}"

I'm expecting the following string in the dump: "{"text": "A\u00f1o"}"

But I read in other issues of the library that it assumes all strings are UTF-8. Shouldn't this mean that it must be able to know that the "\xC3 \xB1" bytes represent a single character? Am I missing something here?

I normally develop on Linux with Emacs and GCC, but for this project I need to use Visual Studio 2017.

elh-zbeloki (Author) commented

I solved the problem.

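I discovered the ensure_ascii option in dump(). It is the third parameter and defaults to false, so by default the UTF-8 bytes are written through unchanged. If I call dump() with ensure_ascii=true, I get the desired output: "{"text": "A\u00f1o"}"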

Thank you.
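To make the effect of ensure_ascii concrete, here is a rough standard-library sketch of the transformation: decode each UTF-8 sequence to a code point and emit it as a `\uXXXX` escape. This is an illustration only, not the library's implementation. The function name `escape_non_ascii` is hypothetical, the input is assumed to be valid UTF-8, and the other JSON escapes (quotes, backslashes, control characters) are deliberately left out.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Append one 16-bit unit as a lowercase \uXXXX escape.
static void append_u_escape(std::string& out, std::uint32_t unit) {
    char buf[7];  // backslash, 'u', four hex digits, terminating NUL
    std::snprintf(buf, sizeof buf, "\\u%04x", static_cast<unsigned>(unit));
    out += buf;
}

// Hypothetical helper: copy ASCII bytes unchanged and replace every other
// code point with \uXXXX (or a surrogate pair above U+FFFF).
// Assumes utf8 is valid UTF-8; no validation is performed.
std::string escape_non_ascii(const std::string& utf8) {
    std::string out;
    std::size_t i = 0;
    while (i < utf8.size()) {
        const unsigned char lead = static_cast<unsigned char>(utf8[i]);
        if (lead < 0x80) {  // plain ASCII
            out += static_cast<char>(lead);
            ++i;
            continue;
        }
        // The lead byte determines the sequence length and the first payload bits.
        std::size_t len = 0;
        std::uint32_t cp = 0;
        if ((lead & 0xE0) == 0xC0) {
            len = 2; cp = lead & 0x1F;
        } else if ((lead & 0xF0) == 0xE0) {
            len = 3; cp = lead & 0x0F;
        } else {
            len = 4; cp = lead & 0x07;
        }
        // Each continuation byte contributes six more bits.
        for (std::size_t k = 1; k < len && i + k < utf8.size(); ++k) {
            cp = (cp << 6) | (static_cast<unsigned char>(utf8[i + k]) & 0x3F);
        }
        i += len;
        if (cp > 0xFFFF) {  // outside the BMP: encode as a UTF-16 surrogate pair
            cp -= 0x10000;
            append_u_escape(out, 0xD800 + (cp >> 10));
            append_u_escape(out, 0xDC00 + (cp & 0x3FF));
        } else {
            append_u_escape(out, cp);
        }
    }
    return out;
}
```

Applied to the four bytes `41 c3 b1 6f`, the sketch produces `A\u00f1o`, matching the output reported above. With the library itself, the corresponding call is `j.dump(-1, ' ', true)`, where the third argument is ensure_ascii.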
