-
Notifications
You must be signed in to change notification settings - Fork 28.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-8593] [core] Sort app attempts by start time. #7253
Conversation
Pulling functionality from apache spark
pull latest from apache spark
Pulling functionality from apache spark
add to whitelist |
Test build #36709 has finished for PR 7253 at commit
|
@@ -168,6 +174,10 @@ private[history] class HistoryPage(parent: HistoryServer) extends WebUIPage("") | |||
"-" | |||
} | |||
val lastUpdated = UIUtils.formatDate(attempt.lastUpdated) | |||
var someAttemptCompleted = false | |||
info.attempts.foreach{ attempt => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this can be val someAttemptCompleted = info.attempts.exists(_.completed)
My interpretation of the bug is different. Basically, the bug says that when an old attempt that shows up as "running" exists, the newer attempt, even if finished, is ignored, meaning the application is actually finished but does not show up on the finished list. I think the bug is actually in
That is wrong given what the bug describes; perhaps the correct sort order should be based solely on the attempt's start time. |
@vanzin interpretation of the bug is correct. If an early attempt is left around a .inprogress file but then the next attempt completed successfully then it should be considered completed and at least show the last attempt that completed successfully. |
thanks @vanzin @tgravescs @srowen @andrewor14 for your comments. My first line of attack was to remove early filtering in HistoryPage which entirely discards if first attempt was not completed(.filter(_.attempts.head.completed != requestedIncomplete)) but updating order of multiple attempts seems perfect. New order: Completed attempts before running attempts, running attempts sorted by start time showing whichever started first, completed attempts sorted by end time showing whichever ended first. |
@@ -64,61 +64,61 @@ private[history] class HistoryPage(parent: HistoryServer) extends WebUIPage("") | |||
{providerConfig.map { case (k, v) => <li><strong>{k}:</strong> {v}</li> }} | |||
</ul> | |||
{ | |||
// This displays the indices of pages that are within `plusOrMinus` pages of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All the changes in this file are unnecessary, please revert.
You'll also need to write a unit test for this (see |
Test build #36817 has finished for PR 7253 at commit
|
Test build #36821 has finished for PR 7253 at commit
|
Test build #36829 has finished for PR 7253 at commit
|
Test build #36846 has finished for PR 7253 at commit
|
* application. Order is: running attempts before complete attempts, running attempts sorted | ||
* by start time, completed attempts sorted by end time. | ||
* application. Order is: completed attempts before running attempts, running attempts sorted | ||
* by start time showing whichever started first, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wording is weird. What does "showing" mean?
I'd say: "sorted by descending start time", "sorted by descending end time".
Note, also, that your current code sorts in ascending order, while I think it should be the opposite.
Just a minor thing about the ordering (which will probably also affect the test code), otherwise looks ok. Since this is not really exclusive to yarn, I'd remove it from the PR title. |
thanks @vanzin . unless i m missing something i intentionally kept ascending order for both start and end time. Users want to see which ended first (ascending endtime) on top, and which started first, which also has highest probability of getting finished first as well on top (ascending start time) The updated order as is below: |
No, users want to see the most recent one on top (i.e. the one that ended last). |
Ack @vanzin updating, start time also descending order - users want to see which started last first?? |
Correct. It should be very rare (if not impossible) for two attempts to overlap in time (at least on YARN), so an attempt that starts later will also end later. |
I think you are confusing me. As I said in my first comment, descending start time should cover all cases. |
imo that would not be correct @vanzin for the case when a1 is completed, and a2 is incomplete if a1 start times before a2, doing descending start time sort would imply sort returning a2 before a1, and thus showing incompleted attempt before completed.not right as per intention of the issue.we always want to show completed first.hence i do not think descending start time can cover all cases. The code would look as below when you say - "As I said in my first comment, descending start time should cover all cases."
For clear illustration, that would lead to 3rd block on all states of a1 and a2.('After' block is what is in my commit presently) |
Hi. updated commit for having order of descending start time if both attempts are completed or running, else completed attempts before running attempts.maybe that is what you meant earlier? @vanzin ? |
Test build #37101 has finished for PR 7253 at commit
|
That's the correct behavior. |
hmm..I see your point @vanzin .According to your prescription then, later attempts before earlier attempts is the correct order.Did you consider all possibilities?speculative execution?? |
There's no such thing as speculative execution of app attempts.
That should never happen (see above, attempts are only started after a previous attempt has finished).
No it's not. The case he described is when the first attempt finished executing (i.e. there's no live SparkContext anymore) but the log file where the attempt was writing its logs wasn't properly closed. There are never two attempts executing at the same time, and it's a safe assumption that if there is more than one attempt, all attempts except the most recent one (i.e. the one with the most recent start time) are not running. |
…time works everytime)
…time works everytime)
Test build #37161 has finished for PR 7253 at commit
|
Test build #37162 has finished for PR 7253 at commit
|
Jenkins retest this please. |
Test build #15 has finished for PR 7253 at commit
|
Test build #37236 has finished for PR 7253 at commit
|
Test build #37262 has finished for PR 7253 at commit
|
done @vanzin thanks |
Code looks good now, thanks! Could you update the PR title and description to match the actual change? e.g.
|
done @vanzin thanks |
ping @andrewor14 @srowen |
LGTM, will merge shortly if there are no objections into master and 1.4. |
Looks fine, feel free to merge it. |
This makes sure attempts are listed in the order they were executed, and that the app's state matches the state of the most current attempt. Author: Joshi <rekhajoshm@gmail.com> Author: Rekha Joshi <rekhajoshm@gmail.com> Closes #7253 from rekhajoshm/SPARK-8593 and squashes the following commits: 874dd80 [Joshi] History Server: updated order for multiple attempts(logcleaner) 716e0b1 [Joshi] History Server: updated order for multiple attempts(descending start time works everytime) 548c753 [Joshi] History Server: updated order for multiple attempts(descending start time works everytime) 83306a8 [Joshi] History Server: updated order for multiple attempts(descending start time) b0fc922 [Joshi] History Server: updated order for multiple attempts(updated comment) cc0fda7 [Joshi] History Server: updated order for multiple attempts(updated test) 304cb0b [Joshi] History Server: updated order for multiple attempts(reverted HistoryPage) 85024e8 [Joshi] History Server: updated order for multiple attempts a41ac4b [Joshi] History Server: updated order for multiple attempts ab65fa1 [Joshi] History Server: some attempt completed to work with showIncomplete 0be142d [Rekha Joshi] Merge pull request #3 from apache/master 106fd8e [Rekha Joshi] Merge pull request #2 from apache/master e3677c9 [Rekha Joshi] Merge pull request #1 from apache/master (cherry picked from commit 42d8a01) Signed-off-by: Sean Owen <sowen@cloudera.com>
thanks @vanzin @srowen @andrewor14 ! |
This makes sure attempts are listed in the order they were executed, and that the
app's state matches the state of the most current attempt.