JSON: unsigned long are base64 encoded #72
This isn't intended to catch anything that isn't as big as a long long. We can change this so that it serializes the same way as the XML archive, that is, as a string-based representation of the number instead of the base64-encoded version. rapidjson doesn't provide a native way to serialize numeric types larger than 64 bits, which is why we went with base64 originally. It might make more sense for us to switch, since these are text-based archives and are supposed to be human readable. Also, looking up the specification for long double, I just realized that there's really no standard for it, so a binary serialization would not be portable. |
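For illustration, a minimal sketch of what the string-based representation could look like. The helper name `toBase10String` is hypothetical and is not part of cereal; it only shows the idea of writing the number as base-10 text, using enough digits that a floating-point value should survive a round trip.

```cpp
#include <iomanip>
#include <limits>
#include <sstream>
#include <string>

// Hypothetical helper (not cereal's API): render a numeric value as a
// base-10 string so it can be stored as a JSON string value.
template <class T>
std::string toBase10String(T value)
{
  std::ostringstream os;
  // max_digits10 is the number of decimal digits needed to round-trip a
  // floating-point type; it is 0 for integers, where precision is ignored.
  os << std::setprecision(std::numeric_limits<T>::max_digits10) << value;
  return os.str();
}
```

Called as `toBase10String(18446744073709551615ULL)`, this yields a readable string rather than a base64 blob, at the cost of the value being a JSON string instead of a JSON number.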
Hello, (and sorry for the triple-post…) The result is the same using clang (Apple LLVM version 5.0 (clang-500.2.79) (based on LLVM 3.3svn)) and gcc (gcc (MacPorts gcc48 4.8.2_0+universal) 4.8.2), on Mac OS X 10.9.2, using the following code.
Result:
Furthermore, when compiling with -m32, the compiler cannot select an overload because the call is ambiguous. |
You are right about the overloads on 32-bit compilation. I haven't been able to reproduce the base64 encoding for a long on 64-bit Ubuntu using g++ 4.8, 4.7, or clang 3.3, but I'll keep looking into that. long should definitely be serialized as a base 10 number. What are your thoughts on the format for long long and long double? Do you prefer base 10 or base 64 for those? Also, reading base64 in Python should be quite easy (http://docs.python.org/2/library/base64.html). |
3 ways seem possible/sensible:
|
@AzothAmmo I ran the test on CentOS 6, and long types are serialized as base 10 there. I still have to find out why the behavior is different on Mac OS X. Speaking of Python, I tried decoding the base64 using the standard library, but I only obtain a byte string like b',\x00\x00\x00\x00\x00\x00\x00'. I still have to work out how to decode that.
As you see, it's not very portable, as we have to know the endianness. @DrAWolf In the short term, I prefer the first solution, as it seems more natural when using the JSON file, which is more or less 'human-readable'. But in the distant future, it would be nice to be able to choose how to encode. |
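To make the endianness point concrete, here is a rough sketch of decoding such a payload. The byte string quoted above starts with ',' (0x2C, i.e. 44) followed by seven zero bytes, so read as a little-endian 64-bit integer it is 44. The function names `decodeBase64` and `fromLittleEndian` are made up for this example, and the input "LAAAAAAAAAA=" is those eight bytes re-encoded by hand for illustration, not output captured from cereal.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Minimal base64 decoder for the standard alphabet (illustrative only:
// stops at '=' padding and skips characters outside the alphabet).
std::vector<std::uint8_t> decodeBase64(const std::string& text)
{
  static const std::string alphabet =
      "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
  std::vector<std::uint8_t> bytes;
  std::uint32_t buffer = 0; // accumulates 6 bits per input character
  int bits = 0;             // number of not-yet-emitted bits in buffer
  for (char c : text)
  {
    if (c == '=') break;
    const std::size_t index = alphabet.find(c);
    if (index == std::string::npos) continue;
    buffer = (buffer << 6) | static_cast<std::uint32_t>(index);
    bits += 6;
    if (bits >= 8)
    {
      bits -= 8;
      bytes.push_back(static_cast<std::uint8_t>((buffer >> bits) & 0xFF));
    }
  }
  return bytes;
}

// Assemble up to 8 bytes into an integer, assuming the writer was
// little-endian. A big-endian writer would need the reverse order.
std::uint64_t fromLittleEndian(const std::vector<std::uint8_t>& bytes)
{
  std::uint64_t value = 0;
  for (std::size_t i = 0; i < bytes.size() && i < sizeof(value); ++i)
    value |= static_cast<std::uint64_t>(bytes[i]) << (8 * i);
  return value;
}
```

The reader has to know both the byte order and the width of the original type, which is exactly the information a base-10 representation would not require.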
We have a commit (ba2ca7c) that I think fixes the long being serialized incorrectly. If you can try out the develop branch, it should fix that portion of the issue. We haven't yet addressed serializing long long or long double as something other than a base64 string. rapidjson's number parser would need to be updated to handle these kinds of numbers if we want to write them directly as "numbers" in the JSON. |
It works for 32 bits, but not 64 bits. The comments at https://github.com/USCiLab/cereal/blob/develop/include/cereal/archives/json.hpp#L221 show that this is indeed the case. Maybe there should be these overloads for 64 bits:
The problem is that it causes an ambiguous call with the following overload ( https://github.com/USCiLab/cereal/blob/develop/include/cereal/archives/json.hpp#L231 ):
Modifying
to
removed the ambiguity, but I'm not sure if it's a good idea or not. |
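As a point of comparison, one general way to avoid this kind of ambiguity is to constrain templates by size so that exactly one candidate is viable for each type, instead of adding plain overloads for long next to the fixed-width ones. The sketch below is not cereal's code and not the modification described above; `Writer`, `writeInt32`, `writeInt64`, and `save` are hypothetical names, and only signed 32-bit and 64-bit types are handled to keep it short.

```cpp
#include <cstdint>
#include <string>
#include <type_traits>

// Hypothetical backend exposing only fixed-width entry points.
struct Writer
{
  std::string last; // records which entry point was used
  void writeInt32(std::int32_t) { last = "int32"; }
  void writeInt64(std::int64_t) { last = "int64"; }
};

// Chosen for any signed integer type that is 32 bits wide
// (int everywhere common, long on 32-bit targets).
template <class T>
typename std::enable_if<std::is_integral<T>::value &&
                        std::is_signed<T>::value &&
                        sizeof(T) == sizeof(std::int32_t)>::type
save(Writer& w, T value)
{
  w.writeInt32(static_cast<std::int32_t>(value));
}

// Chosen for any signed integer type that is 64 bits wide
// (long long, and long on LP64 targets such as 64-bit Linux and OS X).
template <class T>
typename std::enable_if<std::is_integral<T>::value &&
                        std::is_signed<T>::value &&
                        sizeof(T) == sizeof(std::int64_t)>::type
save(Writer& w, T value)
{
  w.writeInt64(static_cast<std::int64_t>(value));
}
```

Because the two conditions are mutually exclusive, `save(w, 1L)` resolves to a single function whether long is 32 or 64 bits, and whether or not it is the same type as int64_t on that platform.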
Can you post the output of this on your various OS/compiler mixes?
#include <cstdint>
#include <iostream>
#include <type_traits>

// Prints the size of each built-in integer type and whether long and
// long long are the same types as the fixed-width typedefs.
int main()
{
  std::cout << std::boolalpha;
  std::cout << "int " << sizeof(int) << std::endl;
  std::cout << "uint " << sizeof(unsigned int) << std::endl;
  std::cout << "long " << sizeof(long) << std::endl;
  std::cout << "unsigned long " << sizeof(unsigned long) << std::endl;
  std::cout << "long long " << sizeof(long long) << std::endl;
  std::cout << "unsigned long long " << sizeof(unsigned long long) << std::endl;
  std::cout << "long int32 " << std::is_same<long, int32_t>::value << std::endl;
  std::cout << "long int64 " << std::is_same<long, int64_t>::value << std::endl;
  std::cout << "ulong uint32 " << std::is_same<unsigned long, uint32_t>::value << std::endl;
  std::cout << "ulong uint64 " << std::is_same<unsigned long, uint64_t>::value << std::endl;
  std::cout << "long long int32 " << std::is_same<long long, int32_t>::value << std::endl;
  std::cout << "long long int64 " << std::is_same<long long, int64_t>::value << std::endl;
  std::cout << "ulong long uint32 " << std::is_same<unsigned long long, uint32_t>::value << std::endl;
  std::cout << "ulong long uint64 " << std::is_same<unsigned long long, uint64_t>::value << std::endl;
  return 0;
}
|
I used clang on OS X (the results are the same with gcc) and gcc on CentOS. I am only posting the 64-bit results, as there are no differences between Linux and OS X in 32-bit mode.
GCC version on CentOS:
clang/OS X:
gcc/CentOS:
|
longs should now properly serialize under 32 or 64 bit machines. long long, unsigned long long, and long double now serialize as base10 strings instead of base64. see issue #72
Ready to close this if the issue is gone for you. I think what I've committed should cover all the cases, but I don't have a Mac to test on. Now using base10 instead of base64. |
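For anyone reading these values back outside of cereal, a base-10 string can be parsed with the standard library alone. This is only a sketch of the reading side; the function names are hypothetical, and it assumes the archive stores each value as a plain decimal string.

```cpp
#include <string>

// Hypothetical readers for values stored as base-10 strings.
// Each std::sto* call throws std::invalid_argument or std::out_of_range
// if the text is not a number or does not fit the target type.
inline long long readLongLong(const std::string& text)
{
  return std::stoll(text);
}

inline unsigned long long readUnsignedLongLong(const std::string& text)
{
  return std::stoull(text);
}

inline long double readLongDouble(const std::string& text)
{
  return std::stold(text);
}
```

Unlike the base64 form, this needs no knowledge of the writer's endianness or of how wide the type was on the machine that produced the file.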
Sorry for the delay. Alas, there's still an ambiguous call (the error is similar with GCC 4.8):
|
Let me know how this one does. |
Everything works fine. Thank you! |
When serializing an unsigned long to JSON, cereal outputs some non-human readable data which seems to be in base64. Indeed, looking at the source code (https://github.com/USCiLab/cereal/blob/master/include/cereal/archives/json.hpp#L110), any type with a size >= sizeof(long long) is considered 'exotic' and thus base64-encoded.
Furthermore, the XML output is perfectly fine (by that, I mean readable).
I fail to see why longs are base64 encoded. Is this intended? It also means that the output differs depending on whether the code is compiled for 32 bits or 64 bits. Finally, it makes it unnecessarily difficult to read the JSON output in another language (Python in my case).
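To show why the output depends on the target, here is a small sketch of a size test like the one described above. `IsExotic` is a hypothetical trait written for this example; it mirrors the `>= sizeof(long long)` comparison but is not copied from cereal.

```cpp
#include <type_traits>

// Hypothetical trait: true for any type at least as large as long long.
template <class T>
struct IsExotic
  : std::integral_constant<bool, (sizeof(T) >= sizeof(long long))>
{
};

// On LP64 targets (64-bit Linux, OS X) unsigned long is 8 bytes, so it
// passes the test; on 32-bit targets it is 4 bytes and does not.
static_assert(IsExotic<unsigned long long>::value,
              "unsigned long long always meets the threshold");
```

With a test of this shape, the same `unsigned long` member takes the plain-number path in a 32-bit build and the 'exotic' path in a 64-bit build, which matches the difference in output reported here.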