
support for multi-line log messages #455

Closed
soxofaan opened this issue Jun 8, 2022 · 5 comments · Fixed by #457

soxofaan (Member) commented Jun 8, 2022

When a batch job fails in the VITO back-end, the error logs typically contain a multi-line stack trace.
We currently encode the newlines (JSON style) as \n in the message string.
This is an ad-hoc solution on our side, because the openEO API does not specify how things like multi-line log messages should be encoded.
Can we standardize this in some way, so that all clients can build on it?

E.g. the web editor component collapses all whitespace (newlines and indentation), which makes user support painful.
And in the Python client you also have to be careful to get a useful render of the log message.
[Screenshot from 2022-06-08 17-28-12: a multi-line log message rendered with collapsed whitespace]
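
For context, a minimal sketch (not from this issue) of how such logs are consumed with the Python client today; it assumes connection.job(...).logs() returns dict-like entries with "level" and "message" fields, and the back-end URL and job id below are placeholders:

import openeo

# Placeholder connection and job id, for illustration only.
connection = openeo.connect("https://openeo.vito.be").authenticate_oidc()
job = connection.job("j-1234abcd")

for entry in job.logs():
    # "message" may contain real newlines (decoded from the \n escapes in the
    # JSON response). print() preserves them; a renderer that collapses
    # whitespace flattens the whole stack trace into a single line.
    print(f"[{entry.get('level')}] {entry['message']}")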

soxofaan changed the title from "log messages: support for encoding or multi-line messages" to "support for multi-line log messages" on Jun 8, 2022
soxofaan (Member, Author) commented Jun 8, 2022

Possible solutions:

  • Clarify that \n can be used for encoding newlines in multi-line messages, and that subsequent indentation should be preserved when rendering the message in the client
  • Allow message to be an array of strings (see the sketch after this list for how a client could handle both shapes)
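
Neither option is part of the API today; purely as an illustration (hypothetical helper, not an openEO client API), a client could normalize both shapes like this:

from typing import List, Union

def normalize_message(message: Union[str, List[str]]) -> str:
    """Hypothetical client-side helper: return a single multi-line string,
    whether `message` arrives as a plain string with embedded newlines
    (option 1) or as an array of strings, one per line (option 2)."""
    if isinstance(message, list):
        return "\n".join(message)
    return message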

m-mohr (Member) commented Jun 8, 2022

This might be more of a client issue, because the API only says "string", which implicitly allows newlines, I'd say.

On the other hand, I did not expect people to put excessively long stack traces into the message. What would be more user-friendly is to pass only the actual error message to message and to pass the stack trace e.g. to "data"; that would massively improve the user experience. We could probably add a "formatting rule" for stack traces in the implementation guide, similar to how we have it for inspect. The log component can already offer much more if fed correctly: https://open-eo.github.io/openeo-vue-components/ (see logs -> example ...)

m-mohr (Member) commented Jun 8, 2022

@soxofaan Could you copy an example of such an error in Python for me, so that I can see how it is structured and how we could format it? I think we can already make this much more pleasing without a lot of rewriting.

Thinking about something like:

{
  "id": "132",
  "level": "error",
  "message": "error processing batch job due to ...",
  "data": {
    "type": "Stacktrace",
    "stacktrace": [
      {"file": "batch_job.py", "line": 319, "text": "in main"},
      {"file": "batch_job.py", "line": 292, "text": "in run_driver"},
      ...
    ]
  }
}

Something like that should already render much nicer in the component.
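
As an illustration of how a Python back-end could fill such a structure from a caught exception (a sketch only; the "type"/"stacktrace" field names just mirror the example above and are not defined by the openEO API):

import traceback

def error_log_entry(log_id: str, exc: BaseException) -> dict:
    """Sketch: build a log entry in the structure proposed above."""
    frames = traceback.extract_tb(exc.__traceback__)
    return {
        "id": log_id,
        "level": "error",
        "message": str(exc),  # only the actual error message, no embedded trace
        "data": {
            "type": "Stacktrace",
            "stacktrace": [
                {"file": f.filename, "line": f.lineno, "text": f"in {f.name}"}
                for f in frames
            ],
        },
    }

Calling this from the top-level exception handler of a batch job would produce a payload that, as noted above, should render much more nicely in the component than one string with embedded \n.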

m-mohr added the "question" label on Jun 8, 2022
soxofaan (Member, Author) commented Jun 8, 2022

This is an example /logs response (JSON dump) from the VITO back-end:

{"logs":[{"id": "error", "level": "error", "message": "error processing batch job\nTraceback (most recent call last):\n  File \"batch_job.py\", line 319, in main\n    run_driver()\n  File \"batch_job.py\", line 292, in run_driver\n    run_job(\n  File \"/data2/hadoop/yarn/local/usercache/johndoe/appcache/application_1652795411773_14281/container_e5028_1652795411773_14281_01_000002/venv/lib/python3.8/site-packages/openeogeotrellis/utils.py\", line 41, in memory_logging_wrapper\n    return function(*args, **kwargs)\n  File \"batch_job.py\", line 388, in run_job\n    assets_metadata = result.write_assets(str(output_file))\n  File \"/data2/hadoop/yarn/local/usercache/johndoe/appcache/application_1652795411773_14281/container_e5028_1652795411773_14281_01_000002/venv/lib/python3.8/site-packages/openeo_driver/save_result.py\", line 110, in write_assets\n    return self.cube.write_assets(filename=directory, format=self.format, format_options=self.options)\n  File \"/data2/hadoop/yarn/local/usercache/johndoe/appcache/application_1652795411773_14281/container_e5028_1652795411773_14281_01_000002/venv/lib/python3.8/site-packages/openeogeotrellis/geopysparkdatacube.py\", line 1542, in write_assets\n    timestamped_paths = self._get_jvm().org.openeo.geotrellis.geotiff.package.saveRDDTemporal(\n  File \"/opt/spark3_2_0/python/lib/py4j-0.10.9.2-src.zip/py4j/java_gateway.py\", line 1309, in __call__\n    return_value = get_return_value(\n  File \"/opt/spark3_2_0/python/lib/py4j-0.10.9.2-src.zip/py4j/protocol.py\", line 326, in get_return_value\n    raise Py4JJavaError(\npy4j.protocol.Py4JJavaError: An error occurred while calling z:org.openeo.geotrellis.geotiff.package.saveRDDTemporal.\n: org.apache.spark.SparkException: Job aborted due to stage failure: Task 3652 in stage 14.0 failed 4 times, most recent failure: Lost task 3652.3 in stage 14.0 (TID 3949) (epod130.vgt.vito.be executor 37): net.jodah.failsafe.FailsafeException: java.net.SocketTimeoutException: connect timed out\n\tat net.jodah.failsafe.FailsafeExecutor.call(FailsafeExecutor.java:385)\n\tat net.jodah.failsafe.FailsafeExecutor.get(FailsafeExecutor.java:68)\n\tat org.openeo.geotrellissentinelhub.package$.withRetries(package.scala:59)\n\tat org.openeo.geotrellissentinelhub.DefaultProcessApi.getTile(ProcessApi.scala:119)\n\tat org.openeo.geotrellissentinelhub.PyramidFactory.$anonfun$datacube_seq$1(PyramidFactory.scala:193)\n\tat org.openeo.geotrellissentinelhub.MemoizedRlGuardAdapterCachedAccessTokenWithAuthApiFallbackAuthorizer.authorized(Authorizer.scala:46)\n\tat org.openeo.geotrellissentinelhub.PyramidFactory.authorized(PyramidFactory.scala:56)\n\tat org.openeo.geotrellissentinelhub.PyramidFactory.org$openeo$geotrellissentinelhub$PyramidFactory$$getTile$1(PyramidFactory.scala:191)\n\tat org.openeo.geotrellissentinelhub.PyramidFactory.org$openeo$geotrellissentinelhub$PyramidFactory$$dataTile$1(PyramidFactory.scala:201)\n\tat org.openeo.geotrellissentinelhub.PyramidFactory.loadMasked$1(PyramidFactory.scala:226)\n\tat org.openeo.geotrellissentinelhub.PyramidFactory.$anonfun$datacube_seq$16(PyramidFactory.scala:283)\n\tat scala.collection.Iterator$$anon$10.next(Iterator.scala:459)\n\tat scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:512)\n\tat scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:458)\n\tat scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:458)\n\tat scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:458)\n\tat scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:458)\n\tat 
scala.collection.Iterator.foreach(Iterator.scala:941)\n\tat scala.collection.Iterator.foreach$(Iterator.scala:941)\n\tat scala.collection.AbstractIterator.foreach(Iterator.scala:1429)\n\tat org.apache.spark.api.python.PythonRDD$.writeIteratorToStream(PythonRDD.scala:307)\n\tat org.apache.spark.api.python.PythonRunner$$anon$2.writeIteratorToStream(PythonRunner.scala:670)\n\tat org.apache.spark.api.python.BasePythonRunner$WriterThread.$anonfun$run$1(PythonRunner.scala:424)\n\tat org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:2019)\n\tat org.apache.spark.api.python.BasePythonRunner$WriterThread.run(PythonRunner.scala:259)\nCaused by: java.net.SocketTimeoutException: connect timed out\n\tat java.base/java.net.PlainSocketImpl.socketConnect(Native Method)\n\tat java.base/java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:412)\n\tat java.base/java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:255)\n\tat java.base/java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:237)\n\tat java.base/java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)\n\tat java.base/java.net.Socket.connect(Socket.java:609)\n\tat java.base/sun.security.ssl.SSLSocketImpl.connect(SSLSocketImpl.java:300)\n\tat java.base/sun.net.NetworkClient.doConnect(NetworkClient.java:177)\n\tat java.base/sun.net.www.http.HttpClient.openServer(HttpClient.java:474)\n\tat java.base/sun.net.www.http.HttpClient.openServer(HttpClient.java:569)\n\tat java.base/sun.net.www.protocol.https.HttpsClient.<init>(HttpsClient.java:266)\n\tat java.base/sun.net.www.protocol.https.HttpsClient.New(HttpsClient.java:373)\n\tat java.base/sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.getNewHttpClient(AbstractDelegateHttpsURLConnection.java:203)\n\tat java.base/sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1187)\n\tat java.base/sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1081)\n\tat java.base/sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:189)\n\tat java.base/sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1592)\n\tat java.base/sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1520)\n\tat java.base/java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:527)\n\tat 
java.base/sun.net.www.protocol.https.HttpsURLConnectionImpl.getResponseCode(HttpsURLConnectionImpl.java:334)\n\tat scalaj.http.HttpRequest.doConnection(Http.scala:367)\n\tat scalaj.http.HttpRequest.exec(Http.scala:343)\n\tat org.openeo.geotrellissentinelhub.DefaultProcessApi.$anonfun$getTile$7(ProcessApi.scala:120)\n\tat org.openeo.geotrellissentinelhub.package$$anon$1.get(package.scala:60)\n\tat net.jodah.failsafe.Functions.lambda$get$0(Functions.java:46)\n\tat net.jodah.failsafe.RetryPolicyExecutor.lambda$supply$0(RetryPolicyExecutor.java:65)\n\tat net.jodah.failsafe.Execution.executeSync(Execution.java:128)\n\tat net.jodah.failsafe.FailsafeExecutor.call(FailsafeExecutor.java:378)\n\t... 24 more\n\nDriver stacktrace:\n\tat org.apache.spark.scheduler.DAGScheduler.failJobAndIndependentStages(DAGScheduler.scala:2403)\n\tat org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2(DAGScheduler.scala:2352)\n\tat org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2$adapted(DAGScheduler.scala:2351)\n\tat scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)\n\tat scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)\n\tat scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)\n\tat org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:2351)\n\tat org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1(DAGScheduler.scala:1109)\n\tat org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1$adapted(DAGScheduler.scala:1109)\n\tat scala.Option.foreach(Option.scala:407)\n\tat org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:1109)\n\tat org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:2591)\n\tat org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2533)\n\tat org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2522)\n\tat org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:49)\n\tat org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:898)\n\tat org.apache.spark.SparkContext.runJob(SparkContext.scala:2214)\n\tat org.apache.spark.SparkContext.runJob(SparkContext.scala:2235)\n\tat org.apache.spark.SparkContext.runJob(SparkContext.scala:2254)\n\tat org.apache.spark.SparkContext.runJob(SparkContext.scala:2279)\n\tat org.apache.spark.rdd.RDD.$anonfun$collect$1(RDD.scala:1030)\n\tat org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)\n\tat org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)\n\tat org.apache.spark.rdd.RDD.withScope(RDD.scala:414)\n\tat org.apache.spark.rdd.RDD.collect(RDD.scala:1029)\n\tat org.openeo.geotrellis.geotiff.package$.saveRDDTemporal(package.scala:136)\n\tat org.openeo.geotrellis.geotiff.package.saveRDDTemporal(package.scala)\n\tat java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)\n\tat java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)\n\tat java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)\n\tat java.base/java.lang.reflect.Method.invoke(Method.java:566)\n\tat py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)\n\tat py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)\n\tat 
py4j.Gateway.invoke(Gateway.java:282)\n\tat py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)\n\tat py4j.commands.CallCommand.execute(CallCommand.java:79)\n\tat py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:182)\n\tat py4j.ClientServerConnection.run(ClientServerConnection.java:106)\n\tat java.base/java.lang.Thread.run(Thread.java:829)\nCaused by: net.jodah.failsafe.FailsafeException: java.net.SocketTimeoutException: connect timed out\n\tat net.jodah.failsafe.FailsafeExecutor.call(FailsafeExecutor.java:385)\n\tat net.jodah.failsafe.FailsafeExecutor.get(FailsafeExecutor.java:68)\n\tat org.openeo.geotrellissentinelhub.package$.withRetries(package.scala:59)\n\tat org.openeo.geotrellissentinelhub.DefaultProcessApi.getTile(ProcessApi.scala:119)\n\tat org.openeo.geotrellissentinelhub.PyramidFactory.$anonfun$datacube_seq$1(PyramidFactory.scala:193)\n\tat org.openeo.geotrellissentinelhub.MemoizedRlGuardAdapterCachedAccessTokenWithAuthApiFallbackAuthorizer.authorized(Authorizer.scala:46)\n\tat org.openeo.geotrellissentinelhub.PyramidFactory.authorized(PyramidFactory.scala:56)\n\tat org.openeo.geotrellissentinelhub.PyramidFactory.org$openeo$geotrellissentinelhub$PyramidFactory$$getTile$1(PyramidFactory.scala:191)\n\tat org.openeo.geotrellissentinelhub.PyramidFactory.org$openeo$geotrellissentinelhub$PyramidFactory$$dataTile$1(PyramidFactory.scala:201)\n\tat org.openeo.geotrellissentinelhub.PyramidFactory.loadMasked$1(PyramidFactory.scala:226)\n\tat org.openeo.geotrellissentinelhub.PyramidFactory.$anonfun$datacube_seq$16(PyramidFactory.scala:283)\n\tat scala.collection.Iterator$$anon$10.next(Iterator.scala:459)\n\tat scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:512)\n\tat scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:458)\n\tat scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:458)\n\tat scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:458)\n\tat scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:458)\n\tat scala.collection.Iterator.foreach(Iterator.scala:941)\n\tat scala.collection.Iterator.foreach$(Iterator.scala:941)\n\tat scala.collection.AbstractIterator.foreach(Iterator.scala:1429)\n\tat org.apache.spark.api.python.PythonRDD$.writeIteratorToStream(PythonRDD.scala:307)\n\tat org.apache.spark.api.python.PythonRunner$$anon$2.writeIteratorToStream(PythonRunner.scala:670)\n\tat org.apache.spark.api.python.BasePythonRunner$WriterThread.$anonfun$run$1(PythonRunner.scala:424)\n\tat org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:2019)\n\tat org.apache.spark.api.python.BasePythonRunner$WriterThread.run(PythonRunner.scala:259)\nCaused by: java.net.SocketTimeoutException: connect timed out\n\tat java.base/java.net.PlainSocketImpl.socketConnect(Native Method)\n\tat java.base/java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:412)\n\tat java.base/java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:255)\n\tat java.base/java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:237)\n\tat java.base/java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)\n\tat java.base/java.net.Socket.connect(Socket.java:609)\n\tat java.base/sun.security.ssl.SSLSocketImpl.connect(SSLSocketImpl.java:300)\n\tat java.base/sun.net.NetworkClient.doConnect(NetworkClient.java:177)\n\tat java.base/sun.net.www.http.HttpClient.openServer(HttpClient.java:474)\n\tat 
java.base/sun.net.www.http.HttpClient.openServer(HttpClient.java:569)\n\tat java.base/sun.net.www.protocol.https.HttpsClient.<init>(HttpsClient.java:266)\n\tat java.base/sun.net.www.protocol.https.HttpsClient.New(HttpsClient.java:373)\n\tat java.base/sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.getNewHttpClient(AbstractDelegateHttpsURLConnection.java:203)\n\tat java.base/sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1187)\n\tat java.base/sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1081)\n\tat java.base/sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:189)\n\tat java.base/sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1592)\n\tat java.base/sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1520)\n\tat java.base/java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:527)\n\tat java.base/sun.net.www.protocol.https.HttpsURLConnectionImpl.getResponseCode(HttpsURLConnectionImpl.java:334)\n\tat scalaj.http.HttpRequest.doConnection(Http.scala:367)\n\tat scalaj.http.HttpRequest.exec(Http.scala:343)\n\tat org.openeo.geotrellissentinelhub.DefaultProcessApi.$anonfun$getTile$7(ProcessApi.scala:120)\n\tat org.openeo.geotrellissentinelhub.package$$anon$1.get(package.scala:60)\n\tat net.jodah.failsafe.Functions.lambda$get$0(Functions.java:46)\n\tat net.jodah.failsafe.RetryPolicyExecutor.lambda$supply$0(RetryPolicyExecutor.java:65)\n\tat net.jodah.failsafe.Execution.executeSync(Execution.java:128)\n\tat net.jodah.failsafe.FailsafeExecutor.call(FailsafeExecutor.java:378)\n\t... 24 more\n\n"}],"links":[]}

Also note that, because of our processing stack, this is a pretty complex stack trace: part of it is a Python stack trace and part of it is a Java/Scala stack trace, and both can have multiple phases (e.g. an exception handler that raises another exception):

Traceback (most recent call last):
  File "batch_job.py", line 319, in main
    run_driver()
  File "batch_job.py", line 292, in run_driver
    run_job(
...
File "/opt/spark3_2_0/python/lib/py4j-0.10.9.2-src.zip/py4j/protocol.py", line 326, in get_return_value
    raise Py4JJavaError(
py4j.protocol.Py4JJavaError: An error occurred while calling z:org.openeo.geotrellis.geotiff.package.saveRDDTemporal.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 3652 in stage 14.0 failed 4 times, most recent failure: Lost task 3652.3 in stage 14.0 (TID 3949) (epod130.vgt.vito.be executor 37): net.jodah.failsafe.FailsafeException: java.net.SocketTimeoutException: connect timed out
	at net.jodah.failsafe.FailsafeExecutor.call(FailsafeExecutor.java:385)
	at net.jodah.failsafe.FailsafeExecutor.get(FailsafeExecutor.java:68)
...
Caused by: java.net.SocketTimeoutException: connect timed out
	at java.base/java.net.PlainSocketImpl.socketConnect(Native Method)
...
	... 24 more

Driver stacktrace:
	at org.apache.spark.scheduler.DAGScheduler.failJobAndIndependentStages(DAGScheduler.scala:2403)
	at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2(DAGScheduler.scala:2352)
...
Caused by: net.jodah.failsafe.FailsafeException: java.net.SocketTimeoutException: connect timed out
	at net.jodah.failsafe.FailsafeExecutor.call(FailsafeExecutor.java:385)

That makes it pretty hard to come up with a useful "type": "Stacktrace" standardization.

m-mohr (Member) commented Jun 8, 2022

It would already help to extract the actual error message and put the stack trace in an array/object-style structure, so that the component can render it in a more structured way.
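
A rough sketch of that idea (hypothetical helper, field names illustrative only): keep the first line as the human-readable message and ship the rest of the mixed Python/Java/Scala trace as an array of lines under "data":

def split_error_message(raw: str) -> dict:
    """Hypothetical back-end helper: split a multi-line error string into a
    one-line `message` plus the remaining trace as a list of lines, so a
    client can render it line by line instead of as one collapsed string."""
    summary, _, rest = raw.partition("\n")
    entry = {"level": "error", "message": summary}
    if rest:
        entry["data"] = {"type": "Stacktrace", "lines": rest.splitlines()}
    return entry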
