-
Notifications
You must be signed in to change notification settings - Fork 455
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make Numeric read/write locale independent #1322
Comments
A slightly "heretic" idea, but as I understand things SPI1D and SPI3D are Sony inventions which more or less are only used by OCIO, correct? Developing or using a binary format would solve both performance (loading a bunch of LUTs is still kinda slow, even with the sscanf code; initializing OCIO takes a second or so in my app, even though it's effectively only handling a few MB) and interoperability issues, as well as approximately halving file size. This of course doesn't solve the problem for existing deployments and files. I briefly looked into using one of the "faster-than-libc" float parsing libraries that are out there (and generally completely ignore locale for performance) for #1089 but in part due to the CLA stuff concluded using such a thing would be infeasible in OCIO. That's pretty much were I got off the train. |
We had also seen some slow startup times (yes, we should load on a background thread, etc.) and I solved it two ways:
|
FWIW, I had investigated using hex float representation (e.g., "0x1.921FB4D12D84Ap+1"), but oddly they parsed slower than decimal. |
Hexfloats aren't used a lot, so I wouldn't be surprised if they're not well optimized in standard libraries. I mean... you can't even read them using C++ streams. |
The most obvious problem with outputting binary/binary equivalents is that you are adding a requirement for equivalence of representation across platforms - now this may not be a problem as most platforms probably use IEEE 754 but there is at least an endian related problem. At which point I'd certainly look for a library solution for handling this serialisation e.g protobuf, thrift, or maybe just look for some other lightweight bson library rather than rolling our own routines. Kevin |
Thanks for all the great ideas but the focus for short-term is to support existing LUT file formats independently from the locale and without any performance hit. On the other hand developing a fast binary LUT file format is a great idea. But I think that it should not be an OCIO specific format. I would suggest to bring that discussion to the ACES CLF working group. If accepted, that approach would then provide standardisation (i.e. under the ACES CLF umbrella with OCIO as a reference implementation) and performance. |
The pull request #1351 contains a discussion about the locale agnostic feature plus some hints from OpenImageIO on how to make it happen. |
Hi @hodoulp, I just closed bug 407921 in Krita which is a direct repro of #297. The only way I could get rid of it (short of employing a sstream-based parser) is by NatronGitHub/openfx-io@27a13e0. Is there any news regarding this? |
@amyspark no significant progress on that front. It would be great if you can add make a pull request with the |
I can't do it against OCIO, as it's BSD3 and Natron is GPL2. |
Ugh, this is why I can't stand GPL. But I digress. There are two ways to skin this cat: First, unless the technique is patented, you (or somebody else) ought to be able to rewrite the code from scratch, an independent implementation of the idea of saving/restoring the locale in any OCIO function that does I/O. It's just not that complicated. Second, whoever wrote this code originally still owns it. They may have submitted it to Natron to be redistributed under GPL2, but the original author still owns the copyright and so I believe they are free to submit it to a different project under a different license. (Unless Natron requires copyright assignment of all submissions, not merely licensing. VERY few projects require copyright assignment. But if that is the case, then forget what I said about relicensing, you would need to re-implement.) Though personally, although I do appreciate this clever approach and like how unobtrusive it is to the surrounding code, I think it's a little silly to constantly get, save, change, and restore the locale EVERY time you enter and exit certain functions. It seems like a better approach is to just use locale-independent I/O, and then it just doesn't matter what locale the surrounding app is using. Furthermore, locale-independent I/O will tend to be much faster. My favourite choice these days is to base all text I/O on the fmt library. It's locale-independent, uses the new C++20 std::format syntax (but also has printf-like versions), has hands-down the fastest int and float <-> string implementations, and a bunch of other goodies. |
I wrote a C++ RAII wrapper that saves and restores locale on Windows, Mac, and Linux. I'd be happy to contribute it. We use it to wrap our calls into OCIO v1, so it's OCIO-proven already. |
I'd refrained from suggesting charconv's from_chars because OCIO requires C++11, but if fmt can be used, 🎉 ! |
fmtlib seems to only require C++11: "The library is highly portable and relies only on a small set of C++11 features:" Which would be great news! By far the best solution to this problem. Maybe that can even boost performance noticeably over scanf (unlike C++ iostream cough cough) |
Upon further review, If |
Um, you may be right, it may be that fmt has great float->string, but maybe not the other way around. That would make sense now, since fmt is about output, not input. OIIO (in its strutil.cpp) has its own locale-independent stoi/stof, but I don't know off the top of my head how their performance is, because in the context of OIIO it's just not important. Wait, that's not true, I have a unit test that does benchmarks. This is all on Mac with clang, YMMV on Linux or gcc. The "std" is C++ std. The "Strutil" versions are all using fmt or our own rolled underneath.
So the stoi I'm using is significantly faster than atoi, but the stof is the same speed or maybe just a tad slower than std strtod or atof (but for my needs here, it didn't need to be faster, it just needed to be reliably locale-independent). I'm sure there are faster implementations you can find if you look. Note also that I time how long it takes to get a reference to the C locale (only 1.2ns) and the cost of switching locales (263ns -- an order of magnitude more expensive than a string->float conversion). And also, below, I have output timing comparisons of std::sprintf vs the Strutil::format (based on fmt, and thus much faster than std C++):
If you have a current local build of OIIO, you can just run |
This commit brings in the amalgamated header of the fast_float library, and applies it to our uses of sscanf with %f. This library is locale-free, and does not incur the performance penalty of using a thread locale RAII class. Fixes AcademySoftwareFoundation#297 Fixes AcademySoftwareFoundation#379 Fixes AcademySoftwareFoundation#1322 Signed-off-by: L. E. Segovia <13498015+amyspark@users.noreply.github.com>
This commit adds Daniel Lemire's fast_float library as an external dependency, and applies it to our uses of sscanf with %f as well as istringstream. This library is locale-free, and does not incur the performance penalty of using a thread locale RAII class or imbuing a locale into a stringstream. See: Daniel Lemire, Number Parsing at a Gigabyte per Second, Software: Pratice and Experience 51 (8), 2021. <https://arxiv.org/abs/2101.11408> Fixes AcademySoftwareFoundation#297 Fixes AcademySoftwareFoundation#379 Fixes AcademySoftwareFoundation#1322 Signed-off-by: L. E. Segovia <13498015+amyspark@users.noreply.github.com>
This commit adds Daniel Lemire's fast_float library as an external dependency, and applies it to our uses of sscanf with %f as well as istringstream. This library is locale-free, and does not incur the performance penalty of using a thread locale RAII class or imbuing a locale into a stringstream. See: Daniel Lemire, Number Parsing at a Gigabyte per Second, Software: Pratice and Experience 51 (8), 2021. <https://arxiv.org/abs/2101.11408> Fixes AcademySoftwareFoundation#297 Fixes AcademySoftwareFoundation#379 Fixes AcademySoftwareFoundation#1322 Signed-off-by: L. E. Segovia <13498015+amyspark@users.noreply.github.com>
This commit adds Daniel Lemire's fast_float library as an external dependency, and applies it to our uses of sscanf with %f, strtod, and istringstream. Additionally, for parsing integer numbers, we implement a from_chars shim that forwards the call to strtol_l along with a statically initialized locale constant. The usage of fast_float is warded by a new OCIO_USE_FAST_FLOAT configuration variable; if disabled, an identical approach to integers is followed. See: Daniel Lemire, Number Parsing at a Gigabyte per Second, Software: Pratice and Experience 51 (8), 2021. <https://arxiv.org/abs/2101.11408> Fixes AcademySoftwareFoundation#297 Fixes AcademySoftwareFoundation#379 Fixes AcademySoftwareFoundation#1322 Co-Authored-By: Patrick Hodoul <Patrick.Hodoul@autodesk.com> Signed-off-by: L. E. Segovia <13498015+amyspark@users.noreply.github.com>
This commit adds Daniel Lemire's fast_float library as an external dependency, and applies it to our uses of sscanf with %f, strtod, and istringstream. Additionally, for parsing integer numbers, we implement a from_chars shim that forwards the call to strtol_l along with a statically initialized locale constant. The usage of fast_float is warded by a new OCIO_USE_FAST_FLOAT configuration variable; if disabled, an identical approach to integers is followed. See: Daniel Lemire, Number Parsing at a Gigabyte per Second, Software: Pratice and Experience 51 (8), 2021. <https://arxiv.org/abs/2101.11408> Fixes AcademySoftwareFoundation#297 Fixes AcademySoftwareFoundation#379 Fixes AcademySoftwareFoundation#1322 Co-Authored-By: Patrick Hodoul <Patrick.Hodoul@autodesk.com> Signed-off-by: L. E. Segovia <13498015+amyspark@users.noreply.github.com>
This commit adds Daniel Lemire's fast_float library as an external dependency, and applies it to our uses of sscanf with %f, strtod, and istringstream. Additionally, for parsing integer numbers, we implement a from_chars shim that forwards the call to strtol_l along with a statically initialized locale constant. The usage of fast_float is warded by a new OCIO_USE_FAST_FLOAT configuration variable; if disabled, an identical approach to integers is followed. See: Daniel Lemire, Number Parsing at a Gigabyte per Second, Software: Pratice and Experience 51 (8), 2021. <https://arxiv.org/abs/2101.11408> Fixes AcademySoftwareFoundation#297 Fixes AcademySoftwareFoundation#379 Fixes AcademySoftwareFoundation#1322 Co-Authored-By: Patrick Hodoul <Patrick.Hodoul@autodesk.com> Signed-off-by: L. E. Segovia <13498015+amyspark@users.noreply.github.com>
This commit adds Daniel Lemire's fast_float library as an external dependency, and applies it to our uses of sscanf with %f, strtod, and istringstream. Additionally, for parsing integer numbers, we implement a from_chars shim that forwards the call to strtol_l along with a statically initialized locale constant. The usage of fast_float is warded by a new OCIO_USE_FAST_FLOAT configuration variable; if disabled, an identical approach to integers is followed. See: Daniel Lemire, Number Parsing at a Gigabyte per Second, Software: Pratice and Experience 51 (8), 2021. <https://arxiv.org/abs/2101.11408> Fixes AcademySoftwareFoundation#297 Fixes AcademySoftwareFoundation#379 Fixes AcademySoftwareFoundation#1322 Co-Authored-By: Patrick Hodoul <Patrick.Hodoul@autodesk.com> Signed-off-by: L. E. Segovia <13498015+amyspark@users.noreply.github.com>
This commit adds Daniel Lemire's fast_float library as an external dependency, and applies it to our uses of sscanf with %f as well as istringstream. This library is locale-free, and does not incur the performance penalty of using a thread locale RAII class or imbuing a locale into a stringstream. See: Daniel Lemire, Number Parsing at a Gigabyte per Second, Software: Pratice and Experience 51 (8), 2021. <https://arxiv.org/abs/2101.11408> Fixes AcademySoftwareFoundation#297 Fixes AcademySoftwareFoundation#379 Fixes AcademySoftwareFoundation#1322 Signed-off-by: L. E. Segovia <13498015+amyspark@users.noreply.github.com>
This commit adds Daniel Lemire's fast_float library as an external dependency, and applies it to our uses of sscanf with %f, strtod, and istringstream. Additionally, for parsing integer numbers, we implement a from_chars shim that forwards the call to strtol_l along with a statically initialized locale constant. Unfortunately, to preserve consistency with from_chars the character range must be copied to a temporary std::string. The test suite has been adapted to account for this new behaviour. The usage of fast_float is warded by a new OCIO_USE_FAST_FLOAT configuration variable; if disabled, an identical approach to integers is followed. See: Daniel Lemire, Number Parsing at a Gigabyte per Second, Software: Pratice and Experience 51 (8), 2021. <https://arxiv.org/abs/2101.11408> Fixes AcademySoftwareFoundation#297 Fixes AcademySoftwareFoundation#379 Fixes AcademySoftwareFoundation#1322 Co-Authored-By: Patrick Hodoul <Patrick.Hodoul@autodesk.com> Signed-off-by: L. E. Segovia <13498015+amyspark@users.noreply.github.com>
This commit adds Daniel Lemire's fast_float library as an external dependency, and applies it to our uses of sscanf with %f, strtod, and istringstream. Additionally, for parsing integer numbers, we implement a from_chars shim that forwards the call to strtol_l along with a statically initialized locale constant. Unfortunately, to preserve consistency with from_chars the character range must be copied to a temporary std::string. The test suite has been adapted to account for this new behaviour. The usage of fast_float is warded by a new OCIO_USE_FAST_FLOAT configuration variable; if disabled, an identical approach to integers is followed. See: Daniel Lemire, Number Parsing at a Gigabyte per Second, Software: Pratice and Experience 51 (8), 2021. <https://arxiv.org/abs/2101.11408> Fixes AcademySoftwareFoundation#297 Fixes AcademySoftwareFoundation#379 Fixes AcademySoftwareFoundation#1322 Co-Authored-By: Patrick Hodoul <Patrick.Hodoul@autodesk.com> Signed-off-by: L. E. Segovia <13498015+amyspark@users.noreply.github.com>
This commit adds support for parsing numbers without being influenced by the current system locale. We implement a from_chars shim that forwards the call to strto*_l along with a statically initialized locale constant. Fixes AcademySoftwareFoundation#297 Fixes AcademySoftwareFoundation#379 Fixes AcademySoftwareFoundation#1322 Co-Authored-By: Patrick Hodoul <Patrick.Hodoul@autodesk.com> Signed-off-by: L. E. Segovia <13498015+amyspark@users.noreply.github.com>
This commit adds support for parsing numbers without being influenced by the current system locale. We implement a from_chars shim that forwards the call to strto*_l along with a statically initialized locale constant. Fixes AcademySoftwareFoundation#297 Fixes AcademySoftwareFoundation#379 Fixes AcademySoftwareFoundation#1322 Co-Authored-By: Patrick Hodoul <Patrick.Hodoul@autodesk.com> Signed-off-by: L. E. Segovia <13498015+amyspark@users.noreply.github.com>
This commit adds support for parsing numbers without being influenced by the current system locale. We implement a from_chars shim that forwards the call to strto*_l along with a statically initialized locale constant. Fixes #297 Fixes #379 Fixes #1322 Co-Authored-By: Patrick Hodoul <Patrick.Hodoul@autodesk.com> Signed-off-by: L. E. Segovia <13498015+amyspark@users.noreply.github.com>
This commit adds support for parsing numbers without being influenced by the current system locale. We implement a from_chars shim that forwards the call to strto*_l along with a statically initialized locale constant. Fixes #297 Fixes #379 Fixes #1322 Co-Authored-By: Patrick Hodoul <Patrick.Hodoul@autodesk.com> Signed-off-by: L. E. Segovia <13498015+amyspark@users.noreply.github.com>
This commit adds support for parsing numbers without being influenced by the current system locale. We implement a from_chars shim that forwards the call to strto*_l along with a statically initialized locale constant. Fixes #297 Fixes #379 Fixes #1322 Co-Authored-By: Patrick Hodoul <Patrick.Hodoul@autodesk.com> Signed-off-by: amyspark <13498015+amyspark@users.noreply.github.com>
This commit adds support for parsing numbers without being influenced by the current system locale. We implement a from_chars shim that forwards the call to strto*_l along with a statically initialized locale constant. Fixes #297 Fixes #379 Fixes #1322 Co-Authored-By: Patrick Hodoul <Patrick.Hodoul@autodesk.com> Signed-off-by: amyspark <13498015+amyspark@users.noreply.github.com> Co-authored-by: amyspark <13498015+amyspark@users.noreply.github.com>
For the last few years, the locale introduced some challenges to any OpenColorIO integration as the library does not recommend or enforce any locale requirement(s). So, the float number read could fail if the floats are saved like
10,0
instead of10.0
(i.e. French versus English notations). For example using a French localized DCC to save a LUT file and/or a config file, and later use an English localized DCC to read one of them is the use case to correctly handle.Following the discussion at a TSC meeting, the proposal is then to at least enforce an English float notation independently from the host locale, to read/write all numerics.
Over the years, several pull requests were created around that problem: #1089, #297, etc.
The text was updated successfully, but these errors were encountered: