Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: Switch from strftime to internal implementation for locale independent formats #2541

Closed
phprus opened this issue Oct 9, 2021 · 16 comments
Labels

Comments

@phprus
Copy link
Contributor

phprus commented Oct 9, 2021

Implementation of the idea:

I'm not sure if we should add more special cases but it would be nice to get rid of strftime completely in formatter::format by parsing the format string and using the optimization when possible. It would be not as fast as special casing but would be more generally applicable.

Originally posted by @vitaut in #2500 (comment)

My implementation

https://github.com/phprus/fmt/tree/optimize-tm-formatting-3

Functions https://github.com/phprus/fmt/blob/6c13ec65d06c5c0bba7806a01252179ae76cb1c9/include/fmt/chrono.h#L1545-L1576
can be optimized.

Performance

Not tested yet.

Problem

I don't know why, but on Windows the detail::write ( https://github.com/phprus/fmt/blob/6c13ec65d06c5c0bba7806a01252179ae76cb1c9/include/fmt/chrono.h#L364-L369 ) function works with a different locale than strftime.

 3/19 Test  #3: chrono-test ......................***Failed    0.06 sec
[ RUN      ] chrono_test.time_point
D:\a\fmt\fmt\test\chrono-test.cc(179): error: Expected equality of these values:
  output
    Which is: "Sat Oct  9 19:37:48 2021"
  fmt::format(fmt::runtime(fmt_spec), t1)
    Which is: "10/09/21 19:37:48"
D:\a\fmt\fmt\test\chrono-test.cc(180): error: Expected equality of these values:
  output
    Which is: "Sat Oct  9 19:37:48 2021"
  fmt::format(fmt::runtime(fmt_spec), tm)
    Which is: "10/09/21 19:37:48"
D:\a\fmt\fmt\test\chrono-test.cc(179): error: Expected equality of these values:
  output
    Which is: "Sat Oct  9 19:37:48 2021"
  fmt::format(fmt::runtime(fmt_spec), t1)
    Which is: "10/09/21 19:37:48"
D:\a\fmt\fmt\test\chrono-test.cc(180): error: Expected equality of these values:
  output
    Which is: "Sat Oct  9 19:37:48 2021"
  fmt::format(fmt::runtime(fmt_spec), tm)
    Which is: "10/09/21 19:37:48"
D:\a\fmt\fmt\test\chrono-test.cc(179): error: Expected equality of these values:
  output
    Which is: "07:37:48 PM"
  fmt::format(fmt::runtime(fmt_spec), t1)
    Which is: "19:37:48"
D:\a\fmt\fmt\test\chrono-test.cc(180): error: Expected equality of these values:
  output
    Which is: "07:37:48 PM"
  fmt::format(fmt::runtime(fmt_spec), tm)
    Which is: "19:37:48"
[  FAILED  ] chrono_test.time_point (3 ms)


16/19 Test #16: xchar-test .......................***Failed    0.03 sec
[ RUN      ] chrono_test.time_point
D:\a\fmt\fmt\test\xchar-test.cc(289): error: Expected equality of these values:
  output
    Which is: L"Sat Oct  9 19:37:49 2021"
  fmt::format(fmt_spec, t1)
    Which is: L"10/09/21 19:37:49"
D:\a\fmt\fmt\test\xchar-test.cc(290): error: Expected equality of these values:
  output
    Which is: L"Sat Oct  9 19:37:49 2021"
  fmt::format(fmt_spec, tm)
    Which is: L"10/09/21 19:37:49"
D:\a\fmt\fmt\test\xchar-test.cc(289): error: Expected equality of these values:
  output
    Which is: L"Sat Oct  9 19:37:49 2021"
  fmt::format(fmt_spec, t1)
    Which is: L"10/09/21 19:37:49"
D:\a\fmt\fmt\test\xchar-test.cc(290): error: Expected equality of these values:
  output
    Which is: L"Sat Oct  9 19:37:49 2021"
  fmt::format(fmt_spec, tm)
    Which is: L"10/09/21 19:37:49"
D:\a\fmt\fmt\test\xchar-test.cc(289): error: Expected equality of these values:
  output
    Which is: L"07:37:49 PM"
  fmt::format(fmt_spec, t1)
    Which is: L"19:37:49"
D:\a\fmt\fmt\test\xchar-test.cc(290): error: Expected equality of these values:
  output
    Which is: L"07:37:49 PM"
  fmt::format(fmt_spec, tm)
    Which is: L"19:37:49"
[  FAILED  ] chrono_test.time_point (3 ms)

@vitaut
Copy link
Contributor

vitaut commented Oct 10, 2021

strftime uses the C locale while detail::write uses the C++ locale (std::locale). We should use the latter and update the tests.

@phprus
Copy link
Contributor Author

phprus commented Oct 10, 2021

C and C++ locales are not compatible in Windows? I was unable to get the same output for strftime and detail::write.
I think I need help with work with std::locale...

I plan to write performance tests in the next few days.

Now all locale independent formats are implemented without strftime.

@phprus
Copy link
Contributor Author

phprus commented Oct 10, 2021

Other prototype for fast time formatting from @toughengineer (https://github.com/toughengineer/fmt, toughengineer@3ae5d81)

@phprus
Copy link
Contributor Author

phprus commented Oct 11, 2021

Perfomance test:
https://github.com/phprus/fmt-bench/tree/optimize-tm-formatting-3

@vitaut,
Can I create PR with implementation on strftime?
Or do I need to implement formatting using C ++ features?

@toughengineer
Copy link
Contributor

If you plan to stop using strftime,

  • you'll have to remove mentions of strftime (to set users' expectations straight) and fully specify supported formatting features, and
  • fmtlib will stop being a drop-in replacement for strftime in some cases, e.g. see here: https://godbolt.org/z/8Wz79YeaT.

What's the policy regarding compatibility with strftime?

@phprus
Copy link
Contributor Author

phprus commented Oct 12, 2021

These cases are GNU extensions. POSIX and MSVC strftime doesn't support them (https://pubs.opengroup.org/onlinepubs/9699919799/functions/strftime.html, https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/strftime-wcsftime-strftime-l-wcsftime-l?view=msvc-140)

The C and C++ standard defines a limited set of supported formats.

@toughengineer
Copy link
Contributor

toughengineer commented Oct 12, 2021

To clarify my earlier comment, currently the documentation says:

std::tm uses the system’s strftime so refer to its documentation for details on supported conversion specifiers.

which means that I should be able to use whatever format features supported by "the system’s strftime".

@phprus
Copy link
Contributor Author

phprus commented Oct 12, 2021

@vitaut, what do you think about this?
Need to maintain compatibility with C/C++ standard or POSIX or GNU or MS?

@vitaut
Copy link
Contributor

vitaut commented Oct 14, 2021

Ideally we shouldn't use strftime internally or in tests because it depends on global C locale. For locale-specific formatting you can use the time_put facet instead: https://github.com/fmtlib/fmt/blob/master/include/fmt/chrono.h#L296.

@phprus
Copy link
Contributor Author

phprus commented Oct 14, 2021

Ideally, but has a problem

On Windows strftime is slow, because using memory allocate and char <-> wchar_t encoding conversion (#2544 (comment)).

time_put (msvc stl, libstdc++ and libc++) call locale-specific strftime_* function on every call. Because of this, formatting with multiply call strftime/time_put works slower in Windows than a single strftime call.

My possible suggestion in PR #2544 description: 2 way processing with failback to strftime for locale dependent or non-standard formatter.

Questions for working on PR #2544:

  1. What about non C++11 standard formatters: failback to strftime/time_put or error?
  2. If the format contains locale dependent parts, then failback to a single strftime/time_put call or not? (for windows)?
  3. Migrate to time_put in this PR or separated?

If need, failback to a single strftime/time_put is easier to implement in this PR.

Questions about migrate to time_put:
Function do_write

fmt/include/fmt/chrono.h

Lines 291 to 292 in dcd282b

inline auto do_write(const std::tm& time, const std::locale& loc, char format,
char modifier) -> std::string {

format time always to UTF-8 string.
do_write function is used in detail::chrono_formatter which can work with wchar_t. wchar_t tests for this case are missing.
Maybe there is an error in this code and conversion to UTF-8 should be performed only for char?

@vitaut
Copy link
Contributor

vitaut commented Oct 16, 2021

  1. We don't need to support non-standard format specs. Giving an error is perfectly fine.
  2. No, we shouldn't fallback to a single call because it may give inconsistent results. Each format spec should be processed individually.
  3. Separate please.

Maybe there is an error in this code and conversion to UTF-8 should be performed only for char?

wchar_t is not implemented there yet, feel free to change but in a separate PR, not in #2544.

@toughengineer
Copy link
Contributor

If non-standard specs are not needed to be supported and each format spec should be processed individually, why not use locale facet instead of strftime and kill two birds with one stone also supporting format() calls with locale?

@toughengineer
Copy link
Contributor

toughengineer commented Oct 17, 2021

Is speed in FMT_COMPILE case as important as in the runtime case? I sped up formatting with arbitrary compile time format string more than twice, should it be added in this library? If yes, should the speed be achieved by any means, or is there a limit to complexity of the code?

@vitaut
Copy link
Contributor

vitaut commented Oct 17, 2021

If non-standard specs are not needed to be supported and each format spec should be processed individually, why not use locale facet instead of strftime and kill two birds with one stone also supporting format() calls with locale?

That's the plan.

Is speed in FMT_COMPILE case as important as in the runtime case?

Yes.

I sped up formatting with arbitrary compile time format string more than twice, should it be added in this library?

Definitely and thank you for doing this. Are you referring to #2543?

If yes, should the speed be achieved by any means, or is there a limit to complexity of the code?

There is a limit, we probably don't want to go very platform-specific.

@toughengineer
Copy link
Contributor

@vitaut

Definitely and thank you for doing this. Are you referring to #2543?

No, I made a PoC and was wondering if that's something of interest for this library.

There is a limit, we probably don't want to go very platform-specific.

Not platform specific, no JIT, just pure C++ (hopefully even without UB).

@vitaut
Copy link
Contributor

vitaut commented Oct 17, 2021

PR is welcome!

phprus referenced this issue Nov 11, 2021
* Fix unicode test

* Add xchar support to chrono formatter

* Replace strftime with std::time_put

* Add std::locale support to std::tm formatter

* Use predefined names and formats for C-locale

* Performance improvement

* Make locale-independent and C locale formats consistent among platforms
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants