[SPARK-33906][WEBUI] Fix the bug of UI Executor page stuck due to undefined peakMemoryMetrics

### What changes were proposed in this pull request?
Check whether executorSummary.peakMemoryMetrics is defined before accessing it. Without this check, the UI can get stuck at the Executors page.
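
As a rough illustration of why the check matters, the sketch below contrasts unguarded and guarded access on a row that lacks the field. The `rows` data and the function names `unguardedHeap`, `guardedHeap`, and `throwsTypeError` are hypothetical and only serve this example; the guard itself follows the `typeof` check used in this change.

```javascript
// Hypothetical executor rows: the second one has no peakMemoryMetrics field.
var rows = [
  {id: 'driver', peakMemoryMetrics: {JVMHeapMemory: 1024, JVMOffHeapMemory: 512}},
  {id: '0'}
];

// Unguarded access, in the style of the code before this change.
function unguardedHeap(row) {
  return row.peakMemoryMetrics.JVMHeapMemory;
}

// Guarded access, following the check added by this change.
function guardedHeap(row) {
  var peakMemoryMetrics = row.peakMemoryMetrics;
  if (typeof peakMemoryMetrics !== 'undefined') {
    return peakMemoryMetrics.JVMHeapMemory;
  }
  return 0;
}

// Helper for the demonstration: reports whether calling fn(row) throws a TypeError.
function throwsTypeError(fn, row) {
  try {
    fn(row);
    return false;
  } catch (e) {
    return e instanceof TypeError;
  }
}
```

Reading a property of `undefined` throws a `TypeError`, so a single row without the field is enough to abort the table rendering, while the guarded version falls back to 0.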

### Why are the changes needed?
The live application UI may get stuck at the Executors page without this fix.
Steps to reproduce (with the master branch):
On macOS in standalone mode, open a spark-shell:
$SPARK_HOME/bin/spark-shell --master spark://localhost:7077

val x = sc.makeRDD(1 to 100000, 5)
x.count()

Then open the application UI in the browser and click the Executors page. The UI gets stuck at this page:
![image](https://user-images.githubusercontent.com/26694233/103105677-ca1a7380-45f4-11eb-9245-c69f4a4e816b.png)

Also, the JSON returned by the API endpoint http://localhost:4040/api/v1/applications/app-20201224134418-0003/executors is missing "peakMemoryMetrics" for executor objects. I attached the full JSON text to https://issues.apache.org/jira/browse/SPARK-33906.
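
To spot the affected executors in such a response, a small filter over the parsed JSON array can help. This is a minimal sketch: the function name `executorsMissingPeakMetrics` and the trimmed `sample` array are hypothetical, and it assumes each executor object carries an `id` field.

```javascript
// Returns the ids of executor objects that have no peakMemoryMetrics field.
// `executors` is assumed to be the parsed JSON array from the executors endpoint.
function executorsMissingPeakMetrics(executors) {
  return executors
    .filter(function (e) { return typeof e.peakMemoryMetrics === 'undefined'; })
    .map(function (e) { return e.id; });
}

// Hypothetical, heavily trimmed sample; real responses contain many more fields.
var sample = [
  {id: 'driver', peakMemoryMetrics: {JVMHeapMemory: 1, JVMOffHeapMemory: 2}},
  {id: '0'},
  {id: '1'}
];

var missing = executorsMissingPeakMetrics(sample);
```

Against a running application, the same function could be applied to the result of fetching the endpoint above and parsing it as JSON, for example from the browser console.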

I debugged it and observed that ExecutorMetricsPoller.getExecutorUpdates returns an empty map, which causes peakExecutorMetrics to be None in https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/status/LiveEntity.scala#L345. A possible reason for the empty map is that the stage completes in less time than the heartbeat interval, so the stage entry in stageTCMP has already been removed before reportHeartbeat is called.
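
The suspected timing issue can be pictured with a toy model. The sketch below is not Spark's Scala implementation: `ToyPoller` and all of its methods are hypothetical, and it only assumes what the paragraph above describes, namely that a stage's entry is dropped once its tasks have finished and that a heartbeat reports whatever entries remain.

```javascript
// Toy model (NOT Spark's code): per-stage entries are dropped as soon as
// the last running task of the stage completes.
function ToyPoller() {
  this.stageEntries = {};  // stage key -> {count: running tasks, peak: bytes}
}

ToyPoller.prototype.onTaskStart = function (stage) {
  var entry = this.stageEntries[stage] || {count: 0, peak: 0};
  entry.count += 1;
  this.stageEntries[stage] = entry;
};

ToyPoller.prototype.recordPeak = function (stage, bytes) {
  var entry = this.stageEntries[stage];
  if (entry) {
    entry.peak = Math.max(entry.peak, bytes);
  }
};

ToyPoller.prototype.onTaskCompletion = function (stage) {
  var entry = this.stageEntries[stage];
  if (entry && --entry.count === 0) {
    delete this.stageEntries[stage];
  }
};

// What a heartbeat would report: a snapshot of the entries still present.
ToyPoller.prototype.getUpdates = function () {
  var updates = {};
  for (var stage in this.stageEntries) {
    updates[stage] = this.stageEntries[stage].peak;
  }
  return updates;
};

// Short stage: the only task starts and completes before the first heartbeat.
var shortStage = new ToyPoller();
shortStage.onTaskStart('stage-0');
shortStage.recordPeak('stage-0', 4096);
shortStage.onTaskCompletion('stage-0');
var shortStageHeartbeat = shortStage.getUpdates();

// Long stage: the task is still running when the heartbeat fires.
var longStage = new ToyPoller();
longStage.onTaskStart('stage-0');
longStage.recordPeak('stage-0', 4096);
var longStageHeartbeat = longStage.getUpdates();
```

In this model the short stage's heartbeat carries no entries even though a peak was recorded, which mirrors the empty map described above; the long stage's heartbeat still carries its entry.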

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Manual test: reran the reproduction steps and confirmed that the bug is gone.

Closes #30920 from baohe-zhang/SPARK-33906.

Authored-by: Baohe Zhang <baohe.zhang@verizonmedia.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
Baohe Zhang authored and dongjoon-hyun committed Dec 31, 2020
1 parent ed9f728 commit 45df6db
Showing 1 changed file with 56 additions and 20 deletions.
76 changes: 56 additions & 20 deletions core/src/main/resources/org/apache/spark/ui/static/executorspage.js
@@ -414,38 +414,74 @@ $(document).ready(function () {
   },
   {
     data: function (row, type) {
-      if (type !== 'display')
-        return row.peakMemoryMetrics.JVMHeapMemory;
-      else
-        return (formatBytes(row.peakMemoryMetrics.JVMHeapMemory, type) + ' / ' +
-          formatBytes(row.peakMemoryMetrics.JVMOffHeapMemory, type));
+      var peakMemoryMetrics = row.peakMemoryMetrics;
+      if (typeof peakMemoryMetrics !== 'undefined') {
+        if (type !== 'display')
+          return peakMemoryMetrics.JVMHeapMemory;
+        else
+          return (formatBytes(peakMemoryMetrics.JVMHeapMemory, type) + ' / ' +
+            formatBytes(peakMemoryMetrics.JVMOffHeapMemory, type));
+      } else {
+        if (type !== 'display') {
+          return 0;
+        } else {
+          return '0.0 B / 0.0 B';
+        }
+      }
     }
   },
   {
     data: function (row, type) {
-      if (type !== 'display')
-        return row.peakMemoryMetrics.OnHeapExecutionMemory;
-      else
-        return (formatBytes(row.peakMemoryMetrics.OnHeapExecutionMemory, type) + ' / ' +
-          formatBytes(row.peakMemoryMetrics.OffHeapExecutionMemory, type));
+      var peakMemoryMetrics = row.peakMemoryMetrics;
+      if (typeof peakMemoryMetrics !== 'undefined') {
+        if (type !== 'display')
+          return peakMemoryMetrics.OnHeapExecutionMemory;
+        else
+          return (formatBytes(peakMemoryMetrics.OnHeapExecutionMemory, type) + ' / ' +
+            formatBytes(peakMemoryMetrics.OffHeapExecutionMemory, type));
+      } else {
+        if (type !== 'display') {
+          return 0;
+        } else {
+          return '0.0 B / 0.0 B';
+        }
+      }
     }
   },
   {
     data: function (row, type) {
-      if (type !== 'display')
-        return row.peakMemoryMetrics.OnHeapStorageMemory;
-      else
-        return (formatBytes(row.peakMemoryMetrics.OnHeapStorageMemory, type) + ' / ' +
-          formatBytes(row.peakMemoryMetrics.OffHeapStorageMemory, type));
+      var peakMemoryMetrics = row.peakMemoryMetrics;
+      if (typeof peakMemoryMetrics !== 'undefined') {
+        if (type !== 'display')
+          return peakMemoryMetrics.OnHeapStorageMemory;
+        else
+          return (formatBytes(peakMemoryMetrics.OnHeapStorageMemory, type) + ' / ' +
+            formatBytes(peakMemoryMetrics.OffHeapStorageMemory, type));
+      } else {
+        if (type !== 'display') {
+          return 0;
+        } else {
+          return '0.0 B / 0.0 B';
+        }
+      }
     }
   },
   {
     data: function (row, type) {
-      if (type !== 'display')
-        return row.peakMemoryMetrics.DirectPoolMemory;
-      else
-        return (formatBytes(row.peakMemoryMetrics.DirectPoolMemory, type) + ' / ' +
-          formatBytes(row.peakMemoryMetrics.MappedPoolMemory, type));
+      var peakMemoryMetrics = row.peakMemoryMetrics;
+      if (typeof peakMemoryMetrics !== 'undefined') {
+        if (type !== 'display')
+          return peakMemoryMetrics.DirectPoolMemory;
+        else
+          return (formatBytes(peakMemoryMetrics.DirectPoolMemory, type) + ' / ' +
+            formatBytes(peakMemoryMetrics.MappedPoolMemory, type));
+      } else {
+        if (type !== 'display') {
+          return 0;
+        } else {
+          return '0.0 B / 0.0 B';
+        }
+      }
     }
   },
   {data: 'diskUsed', render: formatBytes},
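
The four column functions in this diff repeat the same guard and differ only in the pair of metric names. One possible way to express that once is a small factory, sketched below. This is not part of the commit: `peakMemoryColumn` is a hypothetical helper, and the `formatBytes` shown here is a simplified stand-in for the UI's own helper so that the example runs on its own.

```javascript
// Simplified stand-in for the UI's formatBytes helper, for this sketch only.
function formatBytes(bytes, type) {
  if (type !== 'display') return bytes;
  if (bytes <= 0) return '0.0 B';
  var units = ['B', 'KiB', 'MiB', 'GiB', 'TiB'];
  var i = Math.min(Math.floor(Math.log(bytes) / Math.log(1024)), units.length - 1);
  return (bytes / Math.pow(1024, i)).toFixed(1) + ' ' + units[i];
}

// Hypothetical helper: builds a column `data` function for one pair of metrics,
// applying the same undefined check as the change above.
function peakMemoryColumn(firstKey, secondKey) {
  return function (row, type) {
    var peakMemoryMetrics = row.peakMemoryMetrics;
    if (typeof peakMemoryMetrics === 'undefined') {
      return type !== 'display' ? 0 : '0.0 B / 0.0 B';
    }
    if (type !== 'display') {
      return peakMemoryMetrics[firstKey];
    }
    return formatBytes(peakMemoryMetrics[firstKey], type) + ' / ' +
      formatBytes(peakMemoryMetrics[secondKey], type);
  };
}

var jvmColumn = peakMemoryColumn('JVMHeapMemory', 'JVMOffHeapMemory');
```

Each column definition could then read `{data: peakMemoryColumn('JVMHeapMemory', 'JVMOffHeapMemory')}` and so on for the other three pairs. Whether that trade-off is preferable to the explicit form in the commit is a judgment call.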