fix time_since_epoch different in different os default return precision #4288
5577bbb to 33c7e57
It looks like the abseil time library causes AddressSanitizer to fail.
test/test_common/utility.cc
Outdated
static const int32_t SEED = std::chrono::duration_cast<std::chrono::nanoseconds>(
std::chrono::system_clock::now().time_since_epoch())
.count();
static const int32_t SEED = absl::ToUnixNanos(absl::Now());
This looks like the spot where the static initialization fiasco is getting introduced. I'm actually surprised it wasn't an issue before - maybe it was, and the effect is benign if it just perturbs the random seed.
I would try the technique described in https://isocpp.org/wiki/faq/ctors#static-init-order-on-first-use to see if the ASAN error goes away.
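A minimal sketch of the construct-on-first-use idiom from that FAQ entry, applied to a time-derived seed. The function name `seedOnFirstUse` is hypothetical, and the sketch uses `std::int64_t` rather than the `int32_t` in the diff so that the nanosecond count is not narrowed. The function-local static is initialized the first time the function is called, not during pre-main initialization, so it does not depend on the initialization order of statics in other translation units.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical stand-in for the SEED constant in test/test_common/utility.cc.
// The local static is initialized on the first call (thread-safe since C++11),
// so nothing here runs during pre-main static initialization.
std::int64_t seedOnFirstUse() {
  static const std::int64_t seed =
      std::chrono::duration_cast<std::chrono::nanoseconds>(
          std::chrono::system_clock::now().time_since_epoch())
          .count();
  return seed;
}
```

Every call after the first returns the same value, which is the "static for the binary" behavior questioned later in this thread.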
Thanks, you are right
I think this direction (introducing absl::Time) is right. Another thing to consider, though, is integrating this better with Envoy::TimeSource, @jmarantz?
test/test_common/utility.cc
Outdated
std::chrono::system_clock::now().time_since_epoch())
.count();
int32_t getSeed() {
static int32_t seed = absl::ToUnixNanos(absl::Now());
const?
Is it necessary to add a const to the return value that is a value type?
Is this what we want, to have the seed be static for the binary, as opposed to static for an individual test? Moreover, this may be related to the AddressSanitizer problem you have above.
More good reading about the static initialization "fiasco" referenced in the AddressSanitizer report is here:
https://isocpp.org/wiki/faq/ctors#static-init-order-on-first-use
My personal rule of thumb: avoid class statics and pre-main function-calls wherever possible and programs will be easier to reason about and more portable. Compile-time initialized scalars, arrays, and structs are fine.
Re the "const" question; that's covered in https://google.github.io/styleguide/cppguide.html#Use_of_const
Can you explain more about why you want to move from std::chrono to absl::time? From the description I don't fully understand what problem this solves.
If we do that I think it should be done pretty broadly over all of Envoy, and might be a bit easier in conjunction with follow-ups to #4257 , with check_format enhancements extending what's in #4248 to ensure we are working with a single coherent time system.
@jmarantz Can you drive this PR to some conclusion that is compatible with your WIP? Thanks.
a721b2e to e248078
Would it be possible to solve #4278 by using std::chrono differently, rather than switching to a different time library?
test/test_common/utility.cc
Outdated
std::chrono::system_clock::now().time_since_epoch())
.count();
int32_t getSeed() {
static const int32_t seed = absl::ToUnixNanos(absl::Now());
I see. The old code was probably only working because it got 'lucky', despite its static-init fiasco. With absl::time you are getting less lucky.
Can you try just removing the 'static' keyword here? It will only get called once (per test) when TestRandomGenerator is constructed, and that should be fine.
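A rough sketch of that suggestion, assuming the seed is read once in the generator's constructor. The `sketch` namespace, the member names, and the use of `std::mt19937_64` are assumptions for illustration and are not taken from Envoy's actual `TestRandomGenerator`. With no `static`, the clock is read at construction time, so each test that builds a generator gets its own seed and there is no static initialization to order.

```cpp
#include <chrono>
#include <cstdint>
#include <random>

namespace sketch {

// No static storage: the clock is read each time this is called.
std::uint32_t getSeed() {
  const auto nanos = std::chrono::duration_cast<std::chrono::nanoseconds>(
                         std::chrono::system_clock::now().time_since_epoch())
                         .count();
  // Explicit narrowing: only the low 32 bits are kept for the seed.
  return static_cast<std::uint32_t>(nanos);
}

// Hypothetical generator; the seed is captured once per instance.
class TestRandomGenerator {
public:
  TestRandomGenerator() : seed_(getSeed()), generator_(seed_) {}

  std::uint32_t seed() const { return seed_; }
  std::uint64_t random() { return generator_(); }

private:
  const std::uint32_t seed_; // declared first so it is initialized first
  std::mt19937_64 generator_;
};

} // namespace sketch
```

Exposing `seed()` makes it possible to log the value and replay a failing test with the same sequence.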
Actually I followed up in #4314 as that's where the code change is being proposed; hopefully there's enough context there now.
@jmarantz of course, but I still think that replacing |
c6fad8e to 44b68e9
I agree absl::time is the right way to go, but not at just one point in the code. I think you should just fix the bug without introducing an alternate mechanism to measure time. Once there is a strong abstraction in place a global pivot to absl::time should be straightforward and safe.
In particular I would worry about subtle issues of skew.
Of course for a random seed it's probably ok, but seems unrelated to the bug fix and maybe should be split out?
@jmarantz Agreed, I am reworking the code. Regarding the static initialization "fiasco": do I need to create a separate issue and PR?
RE 'fiasco' -- I think just putting that getSeed() change into a separate PR would be fine; I'm not sure it needs to fix an issue, since the old code worked (even if just by luck).
metric->set_timestamp_ms(std::chrono::system_clock::now().time_since_epoch().count());
metric->set_timestamp_ms(std::chrono::duration_cast<std::chrono::milliseconds>(
std::chrono::system_clock::now().time_since_epoch())
.count());
Looks great; can you factor out this verbose function call, repeated 3x, into a helper function?
OK. Should I put the helper function in an anonymous namespace in the current file, or in common/common/utility?
Easiest to put it in a local anon namespace; I will be moving it into TimeSource anyway.
There's a bit of debt here -- we should be using TimeSource::systemTime() rather than directly reading the system-clock, so that this code can have time-injection, but I'll fix that after you merge.
And actually, can you add (near your helper method):
// TODO(#4160): use TimeSource::systemTime() rather than directly reading the system-clock
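For illustration, the helper requested here could look roughly like the following. The names `toUnixMillis` and `unixMillisForNow` are hypothetical, and in the PR the helper would sit in the file's anonymous namespace. The explicit `duration_cast` matters because the tick period of `std::chrono::system_clock::duration` is implementation-defined, so a bare `time_since_epoch().count()` yields different units on different standard libraries.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical helper: converts a system_clock time point to Unix milliseconds,
// whatever the platform's native tick period is.
std::int64_t toUnixMillis(std::chrono::system_clock::time_point t) {
  return std::chrono::duration_cast<std::chrono::milliseconds>(
             t.time_since_epoch())
      .count();
}

// TODO(#4160): use TimeSource::systemTime() rather than directly reading the system-clock.
std::int64_t unixMillisForNow() {
  return toUnixMillis(std::chrono::system_clock::now());
}
```

Each of the three call sites would then reduce to something like `metric->set_timestamp_ms(unixMillisForNow());`. Splitting out `toUnixMillis` keeps the conversion testable with a fixed time point.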
OK
Thanks; looks great.
@dnoe Do you need to continue reviewing?
@lizan Is this ready to go from your perspective?
I didn't actually intend to approve, just to remove the "changes requested" in UI. Defer to @lizan for initial maintainer review.
fix static initialization fiasco problem in #4288 Risk Level: low Signed-off-by: tianqian.zyf <tianqian.zyf@alibaba-inc.com>
@@ -16,6 +16,15 @@ namespace Extensions {
namespace StatSinks {
namespace MetricsService {

namespace {
// TODO(#4160): use TimeSource::systemTime() rather than directly reading the system-clock.
int64_t getUnixMicrosForNow() {
The function name suggests that it returns microseconds, but the body is dealing with milliseconds, seems like one of these should change.
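One way to keep the name and the unit from drifting apart, shown here only as a sketch and not as what the PR did, is to return a `std::chrono` duration instead of a raw `int64_t`. The function names below are hypothetical. The unit then lives in the return type, and the caller extracts the count only at the point where the protobuf setter needs an integer.

```cpp
#include <chrono>

// Hypothetical alternative: the return type states the unit, so the name
// does not have to.
std::chrono::milliseconds unixTimeForNow() {
  return std::chrono::duration_cast<std::chrono::milliseconds>(
      std::chrono::system_clock::now().time_since_epoch());
}

// Converting to a finer unit is implicit and lossless; going the other way
// would require an explicit duration_cast.
std::chrono::microseconds toMicros(std::chrono::milliseconds ms) { return ms; }
```

A call site would then read `metric->set_timestamp_ms(unixTimeForNow().count());`.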
@@ -16,6 +16,15 @@ namespace Extensions {
namespace StatSinks {
namespace MetricsService {

namespace {
// TODO(#4160): use TimeSource::systemTime() rather than directly reading the system-clock.
@jmarantz what's stopping us switching to this right away? Seems best to avoid exception cases like this if we can given how regular time is now treated in Envoy.
nothing, now that you merged the TimeSystem PR; this TODO is addressable.
FYI the path to doing this is to plumb Event::TimeSystem& into constructors as needed, up to MainCommon. Note that both ServerImpl and Dispatcher support timeSystem() or timeSource() APIs, so once you go up the constructor stack till you have one of those, you can stop. #4341 is an example of this refactor.
Once we have all the std::chrono removed from everywhere in Envoy we could decide to switch to absl::Time, though I'd want to socialize the benefits of that to envoy-dev on slack first.
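To make the constructor-plumbing idea concrete, here is a simplified sketch of time injection. Everything in the `sketch` namespace is hypothetical: the interface is a cut-down stand-in and not Envoy's actual `TimeSource` or `Event::TimeSystem`, and `MetricsFlusher` is an invented consumer. The point is only that a class which receives its time source by reference can be driven by a fake clock in tests.

```cpp
#include <chrono>
#include <cstdint>

namespace sketch {

using SystemTime = std::chrono::system_clock::time_point;

// Hypothetical minimal interface; the real one has more methods.
class TimeSource {
public:
  virtual ~TimeSource() = default;
  virtual SystemTime systemTime() = 0;
};

// Production implementation: reads the real system clock.
class RealTimeSource : public TimeSource {
public:
  SystemTime systemTime() override { return std::chrono::system_clock::now(); }
};

// Test implementation: time only moves when the test says so.
class FakeTimeSource : public TimeSource {
public:
  explicit FakeTimeSource(SystemTime now) : now_(now) {}
  SystemTime systemTime() override { return now_; }
  void advance(std::chrono::milliseconds delta) { now_ += delta; }

private:
  SystemTime now_;
};

// Invented consumer: takes the time source in its constructor instead of
// calling std::chrono::system_clock::now() directly.
class MetricsFlusher {
public:
  explicit MetricsFlusher(TimeSource& time_source) : time_source_(time_source) {}

  std::int64_t timestampMs() {
    return std::chrono::duration_cast<std::chrono::milliseconds>(
               time_source_.systemTime().time_since_epoch())
        .count();
  }

private:
  TimeSource& time_source_;
};

} // namespace sketch
```

In a test, constructing `MetricsFlusher` with a `FakeTimeSource` makes the emitted timestamp fully deterministic, which is the time-injection benefit mentioned above.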
…ecision Signed-off-by: tianqian.zyf <tianqian.zyf@alibaba-inc.com>
82d1695 to a4e61f9
Looks great, thanks for iterating on this. I've kicked the failing CI jobs, will merge when they pass.
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: tianqian.zyf tianqian.zyf@alibaba-inc.com
Description: the default precision returned by time_since_epoch() differs across operating systems; use the abseil time library to get a unified timestamp
Risk Level: low
Testing: N/A
Docs Changes:
Release Notes:
[Optional Fixes #Issue] #4278
[Optional Deprecated:]