-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
TSAN error triggered by std::string race in FakeRawConnection::write #944
Comments
See envoyproxy#944. I don't think we properly understand the root cause yet, but this at least unblocks CI.
See #944. I don't think we properly understand the root cause yet, but this at least unblocks CI.
AFAIK this repros using GCC 5.4 stdlib on 16.04 in CI, so it's definitely not fixed by GCC 5 unless i'm missing something. I haven't looked into this in detail, but I would be utterly shocked if this bug exists even in 4.9 on normally compiled code. I strongly suspect that this is somehow related to TSAN clang compilation. We probably need to compare the generated unoptimized code from TSAN to both normal clang as well as GCC 5.4. |
Did you set |
See envoyproxy#944. I don't think we properly understand the root cause yet, but this at least unblocks CI.
I see. Wow. Learned something new today. Kind of amazed that I never encountered this before. I did a bunch of reading and I concur this is busted before GCC 5. My suggestion is that we drop GCC 4.9 and move to 5.4. On trusty we already make people install the PPA to get 4.9. 5.4 is available. On Xenial it's the default. RH is so far behind that people have to compile the toolchain anyway to get 4.9 so I think it doesn't matter. Thoughts? |
Additional data point: This issue did not reproduce in repeated runs of 1000 simultaneous tests with "-c opt" build mode. It only occurs in debug builds. |
We are moving to GCC > 5 because std::string in GCC < 5 is not thread safe and we were seeing issues on TSAN builds. Fixes #944
We are moving to GCC > 5 because std::string in GCC < 5 is not thread safe and we were seeing issues on TSAN builds. Fixes #944
See envoyproxy#944. I don't think we properly understand the root cause yet, but this at least unblocks CI.
Description: Switches CI workflows to be based off our new default main branch. Risk Level: Low Testing: CI Signed-off-by: Mike Schore <mike.schore@gmail.com> Signed-off-by: JP Simard <jp@jpsim.com>
Description: Switches CI workflows to be based off our new default main branch. Risk Level: Low Testing: CI Signed-off-by: Mike Schore <mike.schore@gmail.com> Signed-off-by: JP Simard <jp@jpsim.com>
malloc_fast_path now receives oom function instead of full allocation function and windows/patch_function.cc wasn't updated until now. It caused assertion failures as reported in issue envoyproxy#944.
From debugging prints etc. it seems that
memcpy
fromdata.c_str()
in the closure inFakeRawConnection::write()
is racing withdelete
onstd::string
in the main thread execution action closure.In gcc-4.9, the
std::basic_string
implementation is using reference counting and CoW. It's unclear exactly why the reference counting is not working correctly across threads, but there are some known issues (e.g. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=21334).The workaround is to ensure the underlying
std::string
that is shared across threads stays alive for the test.This may just be an artifact of TSAN, or affect other parts of Envoy. This issue should stay open until we understand better what's going on.
The text was updated successfully, but these errors were encountered: