
[SPARK-27071][CORE] Expose additional metrics in status.api.v1.StageData #24011

Closed
wants to merge 5 commits

Conversation

tomvanbussel
Contributor

tomvanbussel commented Mar 7, 2019

What changes were proposed in this pull request?

This PR exposes additional metrics in status.api.v1.StageData. These metrics were already computed for LiveStage, but they were never exposed to the user. This includes extra metrics about the JVM GC, executor (de)serialization times, shuffle reads and writes, and more.
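The stage-level values described above are per-task metrics summed over all tasks in the stage. The following is a minimal sketch of that roll-up, with assumed, simplified type and field names (not the exact fields merged in this PR):

```scala
// Sketch (names assumed) of rolling per-task metrics up into the
// stage-level aggregates that LiveStage tracks and StageData exposes.
case class TaskMetricsSketch(
  executorDeserializeTime: Long, // ms spent deserializing the task
  jvmGcTime: Long,               // ms of JVM GC while the task ran
  shuffleWriteTime: Long)        // time spent writing shuffle output

case class StageMetricsSketch(
  executorDeserializeTime: Long = 0L,
  jvmGcTime: Long = 0L,
  shuffleWriteTime: Long = 0L) {
  // Stage-level values are simple sums over all completed tasks.
  def add(t: TaskMetricsSketch): StageMetricsSketch =
    StageMetricsSketch(
      executorDeserializeTime + t.executorDeserializeTime,
      jvmGcTime + t.jvmGcTime,
      shuffleWriteTime + t.shuffleWriteTime)
}

val tasks = Seq(TaskMetricsSketch(5L, 10L, 2L), TaskMetricsSketch(7L, 20L, 3L))
val stage = tasks.foldLeft(StageMetricsSketch())(_ add _)
```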

How was this patch tested?

Existing tests.

cc @hvanhovell

@hvanhovell
Contributor

@vanzin can you take a look?

@hvanhovell
Contributor

ok to test

@attilapiros
Contributor

attilapiros commented Mar 7, 2019

Unfortunately, this change also increases the event log size, and we already have problems caused by overly large event logs (think of a long-running streaming app).

As these metrics are already available via the monitoring subsystem and can be sent to an external system, I have some doubts about whether this is really needed.

@vanzin
Contributor

vanzin commented Mar 7, 2019

@attilapiros I don't see any change that would affect the event log at all here?

(Haven't looked at everything in detail yet.)

@attilapiros
Contributor

attilapiros commented Mar 8, 2019

@vanzin Oh, I see: SparkListenerStageCompleted already contains the TaskMetrics.

@hvanhovell
Contributor

Any more feedback here? Otherwise I am going to pull the trigger on this one.

@vanzin
Contributor

vanzin commented Mar 21, 2019

I most probably won't get to reviewing this. But tests haven't run.

This is also technically an API breakage, because you're removing fields from the public REST types.

@tomvanbussel
Contributor Author

If we cannot break the API, then I see two options:

  1. We leave the old fields, and we also add the TaskMetrics. This would mean that some fields are duplicated.
  2. We inline all the fields from TaskMetrics into StageData. This is not as nice (in my opinion), but there will be no duplicates.

Which option is better?
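The two options above can be contrasted with a small sketch; all type and field names here are illustrative, not the actual REST types:

```scala
// Illustrative-only types (names assumed) contrasting the two options.
case class TaskMetricsV1Sketch(executorRunTime: Long, jvmGcTime: Long)

// Option 1: keep the existing flat fields and also nest the TaskMetrics,
// so executorRunTime would appear twice in the JSON payload.
case class StageDataOption1(
  executorRunTime: Long,                    // pre-existing field
  taskMetrics: Option[TaskMetricsV1Sketch]) // duplicates it

// Option 2: inline every TaskMetrics field directly on StageData: flat,
// no duplicated values, and no existing field removed from the REST type.
case class StageDataOption2(executorRunTime: Long, jvmGcTime: Long)

val nested  = StageDataOption1(100L, Some(TaskMetricsV1Sketch(100L, 7L)))
val inlined = StageDataOption2(100L, 7L)
```

Option 2 trades a flatter, arguably less tidy type for a payload with no redundant values, which is why it was chosen below.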

@hvanhovell
Contributor

@tomvanbussel let's go with option 2 for now.

@hvanhovell
Contributor

ok to test

@SparkQA

SparkQA commented Apr 1, 2019

Test build #104163 has finished for PR 24011 at commit 4974fa0.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@hvanhovell
Contributor

LGTM. I will defer to @vanzin for final sign-off.

@hvanhovell
Contributor

retest this please

@SparkQA

SparkQA commented May 15, 2019

Test build #105429 has finished for PR 24011 at commit 4974fa0.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@hvanhovell
Contributor

retest this please


@SparkQA

SparkQA commented May 27, 2019

Test build #105829 has finished for PR 24011 at commit 4974fa0.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@hvanhovell
Contributor

Merging to master. Thanks!
