Occasionally, after we've run some Spark jobs via Zeppelin notebooks, attempting to run the same notebooks a few days or weeks later raises an exception, and every Spark job fails until the Zeppelin instance is restarted.
Logs
```
Fail to execute line 7: count = sc.parallelize(xrange(0, NUM_SAMPLES)) \
Traceback (most recent call last):
  File "/tmp/zeppelin_pyspark-237895005518745498.py", line 375, in <module>
  File "<stdin>", line 7, in <module>
  File "/home/fedora/spark/python/lib/pyspark.zip/pyspark/context.py", line 513, in parallelize
    return self.parallelize([], numSlices).mapPartitionsWithIndex(f)
  File "/home/fedora/spark/python/lib/pyspark.zip/pyspark/context.py", line 527, in parallelize
    jrdd = self._serialize_to_jvm(c, serializer, reader_func, createRDDServer)
  File "/home/fedora/spark/python/lib/pyspark.zip/pyspark/context.py", line 556, in _serialize_to_jvm
    tempFile = NamedTemporaryFile(delete=False, dir=self._temp_dir)
  File "/usr/lib64/python2.7/tempfile.py", line 475, in NamedTemporaryFile
    (fd, name) = _mkstemp_inner(dir, prefix, suffix, flags)
  File "/usr/lib64/python2.7/tempfile.py", line 244, in _mkstemp_inner
    fd = _os.open(file, flags, 0600)
OSError: [Errno 2] No such file or directory: '/tmp/spark-temp/spark-f8efede5-5ffe-4b5d-a916-3197bd1f1232/pyspark-4a1de947-7798-4aae-9d81-91baa6ee2a1f/tmpoeV_tI'
```
How to Reproduce
Run a Spark job on the current Zeppelin/Hadoop prototype, leave the instance idle for more than two weeks, then attempt to run the same job again.
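For reference, the paragraph that fails (line 7 in the traceback) is presumably a variant of the standard PySpark Pi-estimation example. A minimal sketch, assuming Zeppelin's `%pyspark` interpreter provides `sc` and that `NUM_SAMPLES` is defined in the notebook (only the `sc.parallelize(xrange(0, NUM_SAMPLES))` call is taken from the traceback; the rest is an assumption):

```python
%pyspark
# Sketch of the kind of paragraph that triggers the error (Python 2, per the trace).
import random

NUM_SAMPLES = 100000

def inside(_):
    x, y = random.random(), random.random()
    return x * x + y * y < 1.0

# parallelize() first serialises the partition data to a temporary file under
# spark.local.dir on the driver; that is where the OSError above is raised
# once the scratch directory has been cleaned away.
count = sc.parallelize(xrange(0, NUM_SAMPLES)) \
          .filter(inside) \
          .count()
print("Pi is roughly %f" % (4.0 * count / NUM_SAMPLES))
```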
This looks like an issue with Zeppelin/Spark jobs storing their scratch files under /tmp, which gets cleaned up after some time; when the next job runs, the Zeppelin interpreter expects to find a directory that no longer exists.
The commit above describes a potential fix: configuring a different directory, outside /tmp, as the Spark local dir for Zeppelin.
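A minimal sketch of that kind of change, assuming the standard config locations and a placeholder path (`/data/spark-temp`) that would need to exist and be writable by the interpreter user; the actual directory chosen in the commit may differ:

```properties
# spark-defaults.conf for the Spark installation used by Zeppelin
# (assumed to live under /home/fedora/spark/conf/, based on the trace).
# Move Spark's scratch space out of /tmp so periodic cleanup cannot remove it.
spark.local.dir    /data/spark-temp
```

Alternatively, the same property could be passed through Zeppelin's `conf/zeppelin-env.sh`:

```bash
export SPARK_SUBMIT_OPTIONS="--conf spark.local.dir=/data/spark-temp"
```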
stvoutsin changed the title from "Exception when running any Notebook example after long period of inactivity" to "Exception when running any Spark Notebook after long period of inactivity" on May 28, 2020.
Do we close this one (we have successfully identified the bug) and create some new issues: one for making it easier to flush the stale state of a notebook, one for placing temp files in a separate directory managed by us, and one for creating the tools for managing a user's temp files?
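For the temp-file management tooling, a rough sketch of what such a helper might look like (everything here is hypothetical: the managed location, the retention period, and the script itself; only the `spark-*` directory layout is taken from the error message above):

```python
#!/usr/bin/env python
"""Hypothetical cleanup helper: remove Spark scratch directories under the
managed local dir that have not been modified for MAX_AGE_DAYS."""
import os
import shutil
import time

SPARK_LOCAL_DIR = "/data/spark-temp"   # assumed managed location, not the real path
MAX_AGE_DAYS = 14


def prune(root, max_age_days):
    cutoff = time.time() - max_age_days * 86400
    if not os.path.isdir(root):
        return
    for name in os.listdir(root):
        path = os.path.join(root, name)
        # Only touch Spark's own scratch dirs, never arbitrary user files.
        if not (os.path.isdir(path) and name.startswith("spark-")):
            continue
        if os.path.getmtime(path) < cutoff:
            shutil.rmtree(path, ignore_errors=True)


if __name__ == "__main__":
    prune(SPARK_LOCAL_DIR, MAX_AGE_DAYS)
```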
This may be related to issue #83