IpVersions/EchoIntegrationTest.AddRemoveListener/IPv6 is flaky #3997

lizan · 2018-07-31T06:40:53Z

Description:
On master 028387a, running:
bazel test --runs_per_test=100 //test/integration:echo_integration_test

results in 2 out of 100 runs hitting TIMEOUT; with "-l trace" I got 8 out of 100.

mattklein123 · 2018-07-31T16:17:15Z

I've seen this also on my own machine.

zuercher · 2018-08-14T18:26:17Z

I poked around at this a bit yesterday evening, and it seems to be a race in the AddRemoveListenerTest between the RawConnectionDriver making a connect attempt and the actual listener socket being closed.

It seems like sometimes the listener socket is closed concurrently with the RawConnectionDriver's connect attempt. The RawConnectionDriver's ConnectionImpl sees a successful connection (write event triggers onWriteReady and getsockopt returns no error) and then writes the initial data. No further events occur and the RawConnectionDriver waits in Dispatcher::run until the test times out.

When the test passes, the connect happens either before or after the socket close, which leads to either an immediate connect failure or a deferred one; in both cases the test terminates successfully.
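
For readers unfamiliar with the "write event, then getsockopt" pattern described above, here is a minimal sketch of it in Python. It is not Envoy's ConnectionImpl code; the helper name `so_error_after_connect` is made up for illustration, and it assumes a POSIX system and uses IPv4 loopback rather than the IPv6 variant of the test.

```python
import errno
import select
import socket


def so_error_after_connect(port, timeout=2.0):
    """Non-blocking connect to 127.0.0.1:port; return the resulting error code.

    0 means the client believes the connection succeeded. This helper is
    hypothetical and only mimics the general pattern, not Envoy's code.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setblocking(False)
    try:
        rc = sock.connect_ex(("127.0.0.1", port))
        if rc not in (0, errno.EINPROGRESS, errno.EWOULDBLOCK):
            return rc  # immediate connect failure
        # Wait for the "write ready" event that signals connect completion.
        _, writable, _ = select.select([], [sock], [], timeout)
        if not writable:
            return errno.ETIMEDOUT
        # A deferred failure shows up here as a non-zero SO_ERROR.
        return sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
    finally:
        sock.close()
```

Against a live listener this should report 0, and against a port nobody listens on it should report ECONNREFUSED, either immediately or via SO_ERROR. The hang described above is the case where the client sees 0 and then never receives another event.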

alyssawilk · 2018-08-14T21:00:50Z

This was failing often enough today that I'd back disabling it first and debugging later, if anyone is willing to own the debugging.

At least one failure mode is that when the listener was released, some other test would yoink the released port, and the "make sure we can not connect to a removed listener" check would unexpectedly result in a connection. Running the test as exclusive should fix that particular failure mode, and allow us to see if others exist. I believe the reason the test was flaking more often when run in parallel with -l trace is because the test ran more slowly, the lag between the listener releasing the port and the raw connection driver increased, so the likelihood that another test would snag the port also increased.

Risk Level: Low (test only)
Testing: 1000 runs with "exclusive"
Docs Changes: n/a
Release Notes: n/a
Fixes #3997
Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
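
The port-stealing failure mode can be reproduced in isolation. The sketch below is an illustration only, not the Envoy test: the helper names (`listen_on`, `can_connect`, `removed_listener_check`) are hypothetical, and it uses IPv4 loopback for portability. The `steal_port` flag stands in for another test binding the port in the window between the listener's removal and the connect attempt.

```python
import socket


def listen_on(port=0):
    """Open a listening socket on 127.0.0.1 (port 0 = let the OS choose)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("127.0.0.1", port))
    sock.listen(1)
    return sock


def can_connect(port):
    """Return True if a TCP connection to 127.0.0.1:port succeeds."""
    try:
        with socket.create_connection(("127.0.0.1", port), timeout=2.0):
            return True
    except OSError:
        return False


def removed_listener_check(steal_port):
    """Mimic the check that a removed listener can no longer be reached."""
    listener = listen_on()
    port = listener.getsockname()[1]
    listener.close()  # the listener is "removed" and its port released
    # Simulate another test grabbing the freed port before we connect.
    thief = listen_on(port) if steal_port else None
    try:
        return can_connect(port)
    finally:
        if thief is not None:
            thief.close()
```

With `steal_port=False` the connect is refused, as the test expects. With `steal_port=True` the connect succeeds even though the original listener is gone, which is the unexpected connection described above. Running the test exclusively removes the other tests that could take the port.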

lizan added the area/test flakes label Jul 31, 2018

mattklein123 added the help wanted Needs help! label Jul 31, 2018

alyssawilk mentioned this issue Aug 30, 2018

test: hopefully deflaking echo integration test #4304

Merged

alyssawilk self-assigned this Aug 30, 2018

htuch closed this as completed in #4304 Aug 30, 2018
