[SPARK-3140] Clarify confusing PySpark exception message
We read the py4j port from the stdout of the `bin/spark-submit` subprocess. If there is interference in stdout (e.g. a random echo in `spark-submit`), we throw an exception with a warning message. However, we do not distinguish this case from the case where no stdout is produced at all.

I wasted a non-trivial amount of time baffled by this exception, searching in vain for places where I print random whitespace. A clearer exception message that distinguishes between these cases will spare others the headaches I went through.
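
Both failure modes end up in the same `except ValueError` branch because `int()` rejects an empty string just as it rejects stray text. The following is a minimal sketch of that behavior; `parse_port` is a hypothetical helper written for illustration and is not part of PySpark.

```python
def parse_port(first_line):
    """Hypothetical helper: classify the first line read from a child's stdout."""
    try:
        # int() tolerates surrounding whitespace, so "45678\n" parses fine.
        return ("port", int(first_line))
    except ValueError:
        # int("") and int("some echo") both raise ValueError, so the
        # exception alone cannot tell "no output" from "unexpected output".
        kind = "no output" if first_line == "" else "unexpected output"
        return (kind, None)


if __name__ == "__main__":
    for line in ("45678\n", "", "hello from a stray echo\n"):
        print(repr(line), "->", parse_port(line))
```

Checking for the empty string explicitly is what lets the caller report the two situations differently.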

Author: Andrew Or <andrewor14@gmail.com>

Closes apache#2067 from andrewor14/python-exception and squashes the following commits:

742f823 [Andrew Or] Further clarify warning messages
e96a7a0 [Andrew Or] Distinguish between unexpected output and no output at all
andrewor14 authored and conviva-zz committed Sep 4, 2014
1 parent fb196c0 commit 541c1eb
Showing 1 changed file with 10 additions and 3 deletions.
13 changes: 10 additions & 3 deletions python/pyspark/java_gateway.py
@@ -54,12 +54,19 @@ def preexec_func():
             gateway_port = proc.stdout.readline()
             gateway_port = int(gateway_port)
         except ValueError:
+            # Grab the remaining lines of stdout
             (stdout, _) = proc.communicate()
             exit_code = proc.poll()
             error_msg = "Launching GatewayServer failed"
-            error_msg += " with exit code %d! " % exit_code if exit_code else "! "
-            error_msg += "(Warning: unexpected output detected.)\n\n"
-            error_msg += gateway_port + stdout
+            error_msg += " with exit code %d!\n" % exit_code if exit_code else "!\n"
+            error_msg += "Warning: Expected GatewayServer to output a port, but found "
+            if gateway_port == "" and stdout == "":
+                error_msg += "no output.\n"
+            else:
+                error_msg += "the following:\n\n"
+                error_msg += "--------------------------------------------------------------\n"
+                error_msg += gateway_port + stdout
+                error_msg += "--------------------------------------------------------------\n"
             raise Exception(error_msg)

         # Create a thread to echo output from the GatewayServer, which is required
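
The message-building logic in the hunk above can be tried in isolation. The sketch below is an illustration under stated assumptions, not the code in `java_gateway.py`: the function name `build_gateway_error`, its parameters, and the separator length are hypothetical, and the inputs are plain strings rather than a live subprocess.

```python
# Assumption: the exact separator length is arbitrary here.
SEPARATOR = "-" * 60 + "\n"


def build_gateway_error(first_line, remaining, exit_code):
    """Hypothetical helper mirroring the new message logic from the diff."""
    msg = "Launching GatewayServer failed"
    # Parenthesized to make explicit that the conditional picks the suffix;
    # a zero or missing exit code falls through to the plain "!".
    msg += (" with exit code %d!\n" % exit_code) if exit_code else "!\n"
    msg += "Warning: Expected GatewayServer to output a port, but found "
    if first_line == "" and remaining == "":
        # Case 1: the subprocess printed nothing at all.
        msg += "no output.\n"
    else:
        # Case 2: something was printed, but the first line was not a port.
        msg += "the following:\n\n"
        msg += SEPARATOR + first_line + remaining + SEPARATOR
    return msg


if __name__ == "__main__":
    print(build_gateway_error("", "", 1))
    print(build_gateway_error("stray echo\n", "more output\n", 0))
```

Keeping the builder free of subprocess handling makes both branches easy to exercise without launching `spark-submit`.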