Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-3140] Clarify confusing PySpark exception message #2067

Closed
wants to merge 2 commits into from

Conversation

andrewor14
Copy link
Contributor

We read the py4j port from the stdout of the bin/spark-submit subprocess. If there is interference in stdout (e.g. a random echo in spark-submit), we throw an exception with a warning message. We do not, however, distinguish between this case from the case where no stdout is produced at all.

I wasted a non-trivial amount of time being baffled by this exception in search of places where I print random whitespace (in vain, of course). A clearer exception message that distinguishes between these cases will prevent similar headaches that I have gone through.

@andrewor14
Copy link
Contributor Author

@kayousterhout @JoshRosen

@@ -54,12 +54,18 @@ def preexec_func():
gateway_port = proc.stdout.readline()
gateway_port = int(gateway_port)
except ValueError:
# Grab the remaining lines of stdout
(stdout, _) = proc.communicate()
exit_code = proc.poll()
error_msg = "Launching GatewayServer failed"
error_msg += " with exit code %d! " % exit_code if exit_code else "! "
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unrelated to your change but can you fix this too while you're at it? Looks like there's an extra "!" when there's no exit code

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wait, when there's no exit code it looks like this, no?

Launching GatewayServer failed! (Warning: ...)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OH sorry misread this -- thought the "if" was for what to substitute. NVM.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I misread it too the first time after reading your comment :)

@kayousterhout
Copy link
Contributor

This is great...there's been quite a lot of pain associated with these lines of code and I think this will help a lot.

@andrewor14
Copy link
Contributor Author

With the latest commit, this is what it looks like with extra output:

Exception: Launching GatewayServer failed!
Warning: Expected GatewayServer to output a port, but found the following:

--------------------------------------------------------------
Hello
Second line
Third line
56306
--------------------------------------------------------------

and no output:

Exception: Launching GatewayServer failed!
Warning: Expected GatewayServer to output a port, but found no output.

@kayousterhout
Copy link
Contributor

Cool this looks great!

@JoshRosen
Copy link
Contributor

This looks great. Surprised Jenkins hasn't chimed in to test this, but I don't think it would (intentionally) exercise this code path anyways, so I think it's safe to merge this since you've tested it locally.

@kayousterhout
Copy link
Contributor

Given the number of accidental-build-breaks that have happened recently, I think it would be good to let Jenkins have at this unless there's a rush to get it in

@kayousterhout
Copy link
Contributor

Jenkins, test this please

@SparkQA
Copy link

SparkQA commented Aug 20, 2014

QA tests have started for PR 2067 at commit 742f823.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Aug 20, 2014

QA tests have finished for PR 2067 at commit 742f823.

  • This patch passes unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • shift # Ignore main class (org.apache.spark.deploy.SparkSubmit) and use our own
    • case class SparkListenerTaskStart(stageId: Int, stageAttemptId: Int, taskInfo: TaskInfo)

@andrewor14
Copy link
Contributor Author

Thanks, merged into master and 1.1

asfgit pushed a commit that referenced this pull request Aug 21, 2014
We read the py4j port from the stdout of the `bin/spark-submit` subprocess. If there is interference in stdout (e.g. a random echo in `spark-submit`), we throw an exception with a warning message. We do not, however, distinguish between this case from the case where no stdout is produced at all.

I wasted a non-trivial amount of time being baffled by this exception in search of places where I print random whitespace (in vain, of course). A clearer exception message that distinguishes between these cases will prevent similar headaches that I have gone through.

Author: Andrew Or <andrewor14@gmail.com>

Closes #2067 from andrewor14/python-exception and squashes the following commits:

742f823 [Andrew Or] Further clarify warning messages
e96a7a0 [Andrew Or] Distinguish between unexpected output and no output at all

(cherry picked from commit ba3c730)
Signed-off-by: Andrew Or <andrewor14@gmail.com>
@asfgit asfgit closed this in ba3c730 Aug 21, 2014
@andrewor14 andrewor14 deleted the python-exception branch August 25, 2014 22:48
xiliu82 pushed a commit to xiliu82/spark that referenced this pull request Sep 4, 2014
We read the py4j port from the stdout of the `bin/spark-submit` subprocess. If there is interference in stdout (e.g. a random echo in `spark-submit`), we throw an exception with a warning message. We do not, however, distinguish between this case from the case where no stdout is produced at all.

I wasted a non-trivial amount of time being baffled by this exception in search of places where I print random whitespace (in vain, of course). A clearer exception message that distinguishes between these cases will prevent similar headaches that I have gone through.

Author: Andrew Or <andrewor14@gmail.com>

Closes apache#2067 from andrewor14/python-exception and squashes the following commits:

742f823 [Andrew Or] Further clarify warning messages
e96a7a0 [Andrew Or] Distinguish between unexpected output and no output at all
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants