Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bandaid for #44072 #47285

Merged
merged 1 commit into from
Jan 25, 2021
Merged

Conversation

josalem
Copy link
Contributor

@josalem josalem commented Jan 21, 2021

  • real fix needs to happen in TraceEvent

As noted in #44072, the intermittent failure is caused by the reader (EventPipeEventSource) indefinitely pausing while trying to read bytes to reach 8-byte alignment. EventBlocks are not required to be aligned lengths. There are only 2 events associated with the keyword EEStartupStart belongs to, so once those events have been received there won't be more data sent. If the EventBlock containing those events is not an 8-byte aligned size, the reader will read the EventBlock, but not dispatch the events until it receives enough data to fill the alignment requirement. That data won't arrive, so the test fails with the reader never reporting that the EEStartupStart event fired.

The bandaid fix is to put another provider on the test that guarantees there will be more data to read and the EventBlock containing EEStartupStart will get dispatched by EventPipeEventSource. In this case, I'm adding Microsoft-DotNETCore-SampleProfiler, but theoretically any other provider that generates an event would be sufficient.

I'm hacking up a patch for the reader code in TraceEvent that will prevent this corner case, but that will take time to flow into the runtime repo, so this bandaid will do for now.

I'll open the PR in draft mode to run the CI a few times through and then mark it for review. I want these tests turned back on before we complete #46079.

CC @tommcdon @noahfalk @sywhang @lateralusX @brianrob

* real fix needs to happen in TraceEvent
@josalem josalem added test-enhancement Improvements of test source code EventPipe labels Jan 21, 2021
@josalem josalem added this to the 6.0.0 milestone Jan 21, 2021
@josalem josalem self-assigned this Jan 21, 2021
@ghost
Copy link

ghost commented Jan 21, 2021

Tagging subscribers to this area: @tarekgh, @tommcdon, @pjanotti
See info in area-owners.md if you want to be subscribed.

Issue Details
  • real fix needs to happen in TraceEvent

As noted in #44072, the intermittent failure is caused by the reader (EventPipeEventSource) indefinitely pausing while trying to read bytes to reach 8-byte alignment. EventBlocks are not required to be aligned lengths. There are only 2 events associated with the keyword EEStartupStart belongs to, so once those events have been received there won't be more data sent. If the EventBlock containing those events is not an 8-byte aligned size, the reader will read the EventBlock, but not dispatch the events until it receives enough data to fill the alignment requirement. That data won't arrive, so the test fails with the reader never reporting that the EEStartupStart event fired.

The bandaid fix is to put another provider on the test that guarantees there will be more data to read and the EventBlock containing EEStartupStart will get dispatched by EventPipeEventSource. In this case, I'm adding Microsoft-DotNETCore-SampleProfiler, but theoretically any other provider that generates an event would be sufficient.

I'm hacking up a patch for the reader code in TraceEvent that will prevent this corner case, but that will take time to flow into the runtime repo, so this bandaid will do for now.

I'll open the PR in draft mode to run the CI a few times through and then mark it for review. I want these tests turned back on before we complete #46079.

CC @tommcdon @noahfalk @sywhang @lateralusX @brianrob

Author: josalem
Assignees: josalem
Labels:

EventPipe, area-System.Diagnostics.Tracing, test enhancement

Milestone: 6.0.0

@josalem
Copy link
Contributor Author

josalem commented Jan 21, 2021

/azp run runtime

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@josalem josalem requested review from sywhang and noahfalk and removed request for sywhang January 22, 2021 18:45
@josalem josalem marked this pull request as ready for review January 22, 2021 18:45
@josalem josalem requested a review from sywhang January 22, 2021 18:45
@josalem
Copy link
Contributor Author

josalem commented Jan 22, 2021

/azp run runtime

@josalem
Copy link
Contributor Author

josalem commented Jan 22, 2021

2 successful runs and I'm running a 3rd for more confidence.

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

Copy link
Contributor

@sywhang sywhang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM as long as you remove this when we have the new TraceEvent with the actual fix.

Copy link
Member

@noahfalk noahfalk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent bandaiding : )

@josalem josalem merged commit 6959d98 into dotnet:master Jan 25, 2021
@josalem josalem deleted the dev/josalem/diagport-test-bandaid branch January 25, 2021 18:44
@ghost ghost locked as resolved and limited conversation to collaborators Feb 24, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants