On the driver (though this could happen anywhere), a read call failed in GoogleStorageFS, specifically at line 205 of GoogleStorageFS.scala:
if (reader != null) {
reader.read(bb)
} else {
We don't retry transient errors here or below in the other call to read. We only retry on the initial creation of the stream.
I think the concern is that, after a failed read, the stream is in a bad state, possibly advanced by a few bytes. If we simply read from it again, we might drop some data. The safe thing to do is to seek to the correct position before retrying. This will likely initiate a new HTTP request to GCS, which is fine, because we almost certainly lost the old connection due to the transient error.
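To make the seek-then-retry idea concrete, here is a minimal Python sketch of the pattern. It is not the Scala code in GoogleStorageFS: `TransientError`, `FlakyStream`, `RetryingReader`, and `read_all` are hypothetical names invented for illustration, and the flaky stream just simulates a connection reset that advances the underlying position before failing.

```python
import io


class TransientError(Exception):
    """Stand-in (assumption) for a retryable failure such as 'Connection reset'."""


class FlakyStream:
    """Hypothetical remote stream that fails once, after silently advancing."""

    def __init__(self, data, fail_at_read):
        self._buf = io.BytesIO(data)
        self._reads = 0
        self._fail_at_read = fail_at_read

    def seek(self, pos):
        self._buf.seek(pos)

    def read(self, n):
        self._reads += 1
        if self._reads == self._fail_at_read:
            self._buf.read(2)  # the stream moved forward before the error surfaced
            raise TransientError("Connection reset")
        return self._buf.read(n)


class RetryingReader:
    """Tracks the last known-good position and seeks back to it before a retry."""

    def __init__(self, stream, max_attempts=3):
        self._stream = stream
        self._pos = 0  # bytes successfully handed to the caller so far
        self._max_attempts = max_attempts

    def read(self, n):
        for attempt in range(self._max_attempts):
            try:
                if attempt > 0:
                    # Do not trust the stream's own position after a failure.
                    self._stream.seek(self._pos)
                chunk = self._stream.read(n)
                self._pos += len(chunk)
                return chunk
            except TransientError:
                if attempt == self._max_attempts - 1:
                    raise


def read_all(reader, chunk_size=4):
    """Drain the reader in fixed-size chunks until it returns no more bytes."""
    out = bytearray()
    while True:
        chunk = reader.read(chunk_size)
        if not chunk:
            return bytes(out)
        out += chunk
```

In this sketch the position is only advanced after a read succeeds, so a retry always resumes from the last byte the caller actually received. Retrying without the seek would resume from wherever the broken stream stopped and skip the bytes it consumed before failing.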
I also think we need to remove lazyPosition. I think we can handle the requester-pays logic by just relying on the pos from the parent class (see FS.scala).
Version
0.2.115-71fc978b5c22
Relevant log output
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/reanalysis/summarise_clinvar_entries.py", line 531, in <module>
main(subs=args.s, date=processed_date, variants=args.v, out=args.o)
File "/usr/local/lib/python3.10/site-packages/reanalysis/summarise_clinvar_entries.py", line 505, in main
parse_into_table(json_path=temp_output, out_path=out)
File "/usr/local/lib/python3.10/site-packages/reanalysis/summarise_clinvar_entries.py", line 439, in parse_into_table
ht.write(out_path, overwrite=True)
File "<decorator-gen-1106>", line 2, in write
File "/usr/local/lib/python3.10/site-packages/hail/typecheck/check.py", line 584, in wrapper
return __original_func(*args_, **kwargs_)
File "/usr/local/lib/python3.10/site-packages/hail/table.py", line 1392, in write
Env.backend().execute(ir.TableWrite(self._tir, ir.TableNativeWriter(output, overwrite, stage_locally, _codec_spec)))
File "/usr/local/lib/python3.10/site-packages/hail/backend/service_backend.py", line 490, in execute
return self._cancel_on_ctrl_c(self._async_execute(ir, timed=timed))
File "/usr/local/lib/python3.10/site-packages/hail/backend/service_backend.py", line 481, in _cancel_on_ctrl_c
return async_to_blocking(coro)
File "/usr/local/lib/python3.10/site-packages/hailtop/utils/utils.py", line 152, in async_to_blocking
return loop.run_until_complete(task)
File "/usr/local/lib/python3.10/site-packages/nest_asyncio.py", line 90, in run_until_complete
return f.result()
File "/usr/local/lib/python3.10/asyncio/futures.py", line 201, in result
raise self._exception.with_traceback(self._exception_tb)
File "/usr/local/lib/python3.10/asyncio/tasks.py", line 232, in __step
result = coro.send(None)
File "/usr/local/lib/python3.10/site-packages/hail/backend/service_backend.py", line 509, in _async_execute
_, resp, timings = await self._rpc('execute(...)', inputs, ir=ir, progress=progress)
File "/usr/local/lib/python3.10/site-packages/hail/backend/service_backend.py", line 451, in _rpc
result_bytes = await retry_transient_errors(self._read_output, ir, iodir + '/out')
File "/usr/local/lib/python3.10/site-packages/hailtop/utils/utils.py", line 779, in retry_transient_errors
return await retry_transient_errors_with_debug_string('', 0, f, *args, **kwargs)
File "/usr/local/lib/python3.10/site-packages/hailtop/utils/utils.py", line 792, in retry_transient_errors_with_debug_string
return await f(*args, **kwargs)
File "/usr/local/lib/python3.10/site-packages/hail/backend/service_backend.py", line 477, in _read_output
raise reconstructed_error.maybe_user_error(ir)
hail.utils.java.FatalError: SocketException: Connection reset
Java stack trace:
javax.net.ssl.SSLException: Connection reset
at sun.security.ssl.Alert.createSSLException(Alert.java:127)
at sun.security.ssl.TransportContext.fatal(TransportContext.java:324)
at sun.security.ssl.TransportContext.fatal(TransportContext.java:267)
at sun.security.ssl.TransportContext.fatal(TransportContext.java:262)
at sun.security.ssl.SSLTransport.decode(SSLTransport.java:138)
at sun.security.ssl.SSLSocketImpl.decode(SSLSocketImpl.java:1400)
at sun.security.ssl.SSLSocketImpl.readApplicationRecord(SSLSocketImpl.java:1368)
at sun.security.ssl.SSLSocketImpl.access$300(SSLSocketImpl.java:73)
at sun.security.ssl.SSLSocketImpl$AppInputStream.read(SSLSocketImpl.java:962)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:284)
at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
at sun.net.www.MeteredStream.read(MeteredStream.java:134)
at java.io.FilterInputStream.read(FilterInputStream.java:133)
at sun.net.www.protocol.http.HttpURLConnection$HttpInputStream.read(HttpURLConnection.java:3456)
at com.google.api.client.http.javanet.NetHttpResponse$SizeValidatingInputStream.read(NetHttpResponse.java:164)
at java.nio.channels.Channels$ReadableByteChannelImpl.read(Channels.java:385)
at is.hail.relocated.com.google.cloud.storage.StorageByteChannels$ScatteringByteChannelFacade.read(StorageByteChannels.java:226)
at is.hail.relocated.com.google.cloud.storage.ApiaryUnbufferedReadableByteChannel.read(ApiaryUnbufferedReadableByteChannel.java:104)
at is.hail.relocated.com.google.cloud.storage.UnbufferedReadableByteChannelSession$UnbufferedReadableByteChannel.read(UnbufferedReadableByteChannelSession.java:36)
at is.hail.relocated.com.google.cloud.storage.DefaultBufferedReadableByteChannel.read(DefaultBufferedReadableByteChannel.java:106)
at is.hail.relocated.com.google.cloud.storage.StorageByteChannels$SynchronizedBufferedReadableByteChannel.read(StorageByteChannels.java:84)
at is.hail.relocated.com.google.cloud.storage.BaseStorageReadChannel.read(BaseStorageReadChannel.java:91)
at is.hail.io.fs.GoogleStorageFS$$anon$1.readHandlingRequesterPays(GoogleStorageFS.scala:205)
at is.hail.io.fs.GoogleStorageFS$$anon$1.fill(GoogleStorageFS.scala:242)
at is.hail.io.fs.FSSeekableInputStream.read(FS.scala:164)
at java.io.DataInputStream.read(DataInputStream.java:100)
at is.hail.expr.ir.GenericLines$$anon$2.loadBuffer(GenericLines.scala:84)
at is.hail.expr.ir.GenericLines$$anon$2.readLine(GenericLines.scala:194)
at is.hail.expr.ir.GenericLines$$anon$2.hasNext(GenericLines.scala:214)
at __C18collect_distributed_array_shuffle_initial_write.apply_region1_42(Unknown Source)
at __C18collect_distributed_array_shuffle_initial_write.apply(Unknown Source)
at __C18collect_distributed_array_shuffle_initial_write.apply(Unknown Source)
at is.hail.backend.BackendUtils.$anonfun$collectDArray$2(BackendUtils.scala:38)
at is.hail.utils.package$.using(package.scala:635)
at is.hail.backend.BackendUtils.$anonfun$collectDArray$1(BackendUtils.scala:37)
at is.hail.utils.package$.using(package.scala:635)
at is.hail.backend.BackendUtils.collectDArray(BackendUtils.scala:36)
at __C5Compiled.__m7split_Let(Emit.scala)
at __C5Compiled.apply(Emit.scala)
at is.hail.expr.ir.CompileAndEvaluate$.$anonfun$_apply$7(CompileAndEvaluate.scala:74)
at scala.runtime.java8.JFunction0$mcJ$sp.apply(JFunction0$mcJ$sp.java:23)
at is.hail.utils.ExecutionTimer.time(ExecutionTimer.scala:81)
at is.hail.expr.ir.CompileAndEvaluate$._apply(CompileAndEvaluate.scala:74)
at is.hail.expr.ir.CompileAndEvaluate$.$anonfun$apply$1(CompileAndEvaluate.scala:19)
at is.hail.utils.ExecutionTimer.time(ExecutionTimer.scala:81)
at is.hail.expr.ir.CompileAndEvaluate$.apply(CompileAndEvaluate.scala:19)
at is.hail.expr.ir.lowering.LowerDistributedSort$.distributedSort(LowerDistributedSort.scala:163)
at is.hail.backend.service.ServiceBackend.lowerDistributedSort(ServiceBackend.scala:356)
at is.hail.backend.Backend.lowerDistributedSort(Backend.scala:100)
at is.hail.expr.ir.lowering.LowerAndExecuteShuffles$.$anonfun$apply$1(LowerAndExecuteShuffles.scala:23)
at is.hail.expr.ir.RewriteBottomUp$.$anonfun$apply$4(RewriteBottomUp.scala:26)
at is.hail.utils.StackSafe$More.advance(StackSafe.scala:60)
at is.hail.utils.StackSafe$.run(StackSafe.scala:16)
at is.hail.utils.StackSafe$StackFrame.run(StackSafe.scala:32)
at is.hail.expr.ir.RewriteBottomUp$.apply(RewriteBottomUp.scala:36)
at is.hail.expr.ir.lowering.LowerAndExecuteShuffles$.apply(LowerAndExecuteShuffles.scala:20)
at is.hail.expr.ir.lowering.LowerAndExecuteShufflesPass.transform(LoweringPass.scala:157)
at is.hail.expr.ir.lowering.LoweringPass.$anonfun$apply$3(LoweringPass.scala:16)
at is.hail.utils.ExecutionTimer.time(ExecutionTimer.scala:81)
at is.hail.expr.ir.lowering.LoweringPass.$anonfun$apply$1(LoweringPass.scala:16)
at is.hail.utils.ExecutionTimer.time(ExecutionTimer.scala:81)
at is.hail.expr.ir.lowering.LoweringPass.apply(LoweringPass.scala:14)
at is.hail.expr.ir.lowering.LoweringPass.apply$(LoweringPass.scala:13)
at is.hail.expr.ir.lowering.LowerAndExecuteShufflesPass.apply(LoweringPass.scala:151)
at is.hail.expr.ir.lowering.LoweringPipeline.$anonfun$apply$1(LoweringPipeline.scala:22)
at is.hail.expr.ir.lowering.LoweringPipeline.$anonfun$apply$1$adapted(LoweringPipeline.scala:20)
at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
at is.hail.expr.ir.lowering.LoweringPipeline.apply(LoweringPipeline.scala:20)
at is.hail.backend.service.ServiceBackend.execute(ServiceBackend.scala:312)
at is.hail.backend.service.ServiceBackend.execute(ServiceBackend.scala:348)
at is.hail.backend.service.ServiceBackendSocketAPI2.$anonfun$executeOneCommand$12(ServiceBackend.scala:700)
at is.hail.backend.service.ServiceBackendSocketAPI2.withIRFunctionsReadFromInput(ServiceBackend.scala:803)
at is.hail.backend.service.ServiceBackendSocketAPI2.$anonfun$executeOneCommand$11(ServiceBackend.scala:698)
at is.hail.backend.service.ServiceBackendSocketAPI2.$anonfun$executeOneCommand$2(ServiceBackend.scala:656)
at is.hail.backend.ExecuteContext$.$anonfun$scoped$3(ExecuteContext.scala:75)
at is.hail.utils.package$.using(package.scala:635)
at is.hail.backend.ExecuteContext$.$anonfun$scoped$2(ExecuteContext.scala:75)
at is.hail.utils.package$.using(package.scala:635)
at is.hail.annotations.RegionPool$.scoped(RegionPool.scala:17)
at is.hail.backend.ExecuteContext$.scoped(ExecuteContext.scala:63)
at is.hail.backend.service.ServiceBackendSocketAPI2.$anonfun$executeOneCommand$1(ServiceBackend.scala:646)
at is.hail.utils.ExecutionTimer$.time(ExecutionTimer.scala:52)
at is.hail.utils.ExecutionTimer$.logTime(ExecutionTimer.scala:59)
at is.hail.backend.service.ServiceBackendSocketAPI2.withExecuteContext$1(ServiceBackend.scala:633)
at is.hail.backend.service.ServiceBackendSocketAPI2.executeOneCommand(ServiceBackend.scala:695)
at is.hail.backend.service.ServiceBackendSocketAPI2$.$anonfun$main$6(ServiceBackend.scala:461)
at is.hail.backend.service.ServiceBackendSocketAPI2$.$anonfun$main$6$adapted(ServiceBackend.scala:460)
at is.hail.utils.package$.using(package.scala:635)
at is.hail.backend.service.ServiceBackendSocketAPI2$.$anonfun$main$5(ServiceBackend.scala:460)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at is.hail.services.package$.retryTransientErrors(package.scala:124)
at is.hail.backend.service.ServiceBackendSocketAPI2$.$anonfun$main$4(ServiceBackend.scala:460)
at is.hail.backend.service.ServiceBackendSocketAPI2$.$anonfun$main$4$adapted(ServiceBackend.scala:458)
at is.hail.utils.package$.using(package.scala:635)
at is.hail.backend.service.ServiceBackendSocketAPI2$.$anonfun$main$3(ServiceBackend.scala:458)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at is.hail.services.package$.retryTransientErrors(package.scala:124)
at is.hail.backend.service.ServiceBackendSocketAPI2$.main(ServiceBackend.scala:458)
at is.hail.backend.service.Main$.main(Main.scala:33)
at is.hail.backend.service.Main.main(Main.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at is.hail.JVMEntryway$1.run(JVMEntryway.java:105)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
java.net.SocketException: Connection reset
at java.net.SocketInputStream.read(SocketInputStream.java:210)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at sun.security.ssl.SSLSocketInputRecord.read(SSLSocketInputRecord.java:464)
at sun.security.ssl.SSLSocketInputRecord.decodeInputRecord(SSLSocketInputRecord.java:237)
at sun.security.ssl.SSLSocketInputRecord.decode(SSLSocketInputRecord.java:190)
at sun.security.ssl.SSLTransport.decode(SSLTransport.java:109)
at sun.security.ssl.SSLSocketImpl.decode(SSLSocketImpl.java:1400)
at sun.security.ssl.SSLSocketImpl.readApplicationRecord(SSLSocketImpl.java:1368)
at sun.security.ssl.SSLSocketImpl.access$300(SSLSocketImpl.java:73)
at sun.security.ssl.SSLSocketImpl$AppInputStream.read(SSLSocketImpl.java:962)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:284)
at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
at sun.net.www.MeteredStream.read(MeteredStream.java:134)
at java.io.FilterInputStream.read(FilterInputStream.java:133)
at sun.net.www.protocol.http.HttpURLConnection$HttpInputStream.read(HttpURLConnection.java:3456)
at com.google.api.client.http.javanet.NetHttpResponse$SizeValidatingInputStream.read(NetHttpResponse.java:164)
at java.nio.channels.Channels$ReadableByteChannelImpl.read(Channels.java:385)
at is.hail.relocated.com.google.cloud.storage.StorageByteChannels$ScatteringByteChannelFacade.read(StorageByteChannels.java:226)
at is.hail.relocated.com.google.cloud.storage.ApiaryUnbufferedReadableByteChannel.read(ApiaryUnbufferedReadableByteChannel.java:104)
at is.hail.relocated.com.google.cloud.storage.UnbufferedReadableByteChannelSession$UnbufferedReadableByteChannel.read(UnbufferedReadableByteChannelSession.java:36)
at is.hail.relocated.com.google.cloud.storage.DefaultBufferedReadableByteChannel.read(DefaultBufferedReadableByteChannel.java:106)
at is.hail.relocated.com.google.cloud.storage.StorageByteChannels$SynchronizedBufferedReadableByteChannel.read(StorageByteChannels.java:84)
at is.hail.relocated.com.google.cloud.storage.BaseStorageReadChannel.read(BaseStorageReadChannel.java:91)
at is.hail.io.fs.GoogleStorageFS$$anon$1.readHandlingRequesterPays(GoogleStorageFS.scala:205)
at is.hail.io.fs.GoogleStorageFS$$anon$1.fill(GoogleStorageFS.scala:242)
at is.hail.io.fs.FSSeekableInputStream.read(FS.scala:164)
at java.io.DataInputStream.read(DataInputStream.java:100)
at is.hail.expr.ir.GenericLines$$anon$2.loadBuffer(GenericLines.scala:84)
at is.hail.expr.ir.GenericLines$$anon$2.readLine(GenericLines.scala:194)
at is.hail.expr.ir.GenericLines$$anon$2.hasNext(GenericLines.scala:214)
at __C18collect_distributed_array_shuffle_initial_write.apply_region1_42(Unknown Source)
at __C18collect_distributed_array_shuffle_initial_write.apply(Unknown Source)
at __C18collect_distributed_array_shuffle_initial_write.apply(Unknown Source)
at is.hail.backend.BackendUtils.$anonfun$collectDArray$2(BackendUtils.scala:38)
at is.hail.utils.package$.using(package.scala:635)
at is.hail.backend.BackendUtils.$anonfun$collectDArray$1(BackendUtils.scala:37)
at is.hail.utils.package$.using(package.scala:635)
at is.hail.backend.BackendUtils.collectDArray(BackendUtils.scala:36)
at __C5Compiled.__m7split_Let(Emit.scala)
at __C5Compiled.apply(Emit.scala)
at is.hail.expr.ir.CompileAndEvaluate$.$anonfun$_apply$7(CompileAndEvaluate.scala:74)
at scala.runtime.java8.JFunction0$mcJ$sp.apply(JFunction0$mcJ$sp.java:23)
at is.hail.utils.ExecutionTimer.time(ExecutionTimer.scala:81)
at is.hail.expr.ir.CompileAndEvaluate$._apply(CompileAndEvaluate.scala:74)
at is.hail.expr.ir.CompileAndEvaluate$.$anonfun$apply$1(CompileAndEvaluate.scala:19)
at is.hail.utils.ExecutionTimer.time(ExecutionTimer.scala:81)
at is.hail.expr.ir.CompileAndEvaluate$.apply(CompileAndEvaluate.scala:19)
at is.hail.expr.ir.lowering.LowerDistributedSort$.distributedSort(LowerDistributedSort.scala:163)
at is.hail.backend.service.ServiceBackend.lowerDistributedSort(ServiceBackend.scala:356)
at is.hail.backend.Backend.lowerDistributedSort(Backend.scala:100)
at is.hail.expr.ir.lowering.LowerAndExecuteShuffles$.$anonfun$apply$1(LowerAndExecuteShuffles.scala:23)
at is.hail.expr.ir.RewriteBottomUp$.$anonfun$apply$4(RewriteBottomUp.scala:26)
at is.hail.utils.StackSafe$More.advance(StackSafe.scala:60)
at is.hail.utils.StackSafe$.run(StackSafe.scala:16)
at is.hail.utils.StackSafe$StackFrame.run(StackSafe.scala:32)
at is.hail.expr.ir.RewriteBottomUp$.apply(RewriteBottomUp.scala:36)
at is.hail.expr.ir.lowering.LowerAndExecuteShuffles$.apply(LowerAndExecuteShuffles.scala:20)
at is.hail.expr.ir.lowering.LowerAndExecuteShufflesPass.transform(LoweringPass.scala:157)
at is.hail.expr.ir.lowering.LoweringPass.$anonfun$apply$3(LoweringPass.scala:16)
at is.hail.utils.ExecutionTimer.time(ExecutionTimer.scala:81)
at is.hail.expr.ir.lowering.LoweringPass.$anonfun$apply$1(LoweringPass.scala:16)
at is.hail.utils.ExecutionTimer.time(ExecutionTimer.scala:81)
at is.hail.expr.ir.lowering.LoweringPass.apply(LoweringPass.scala:14)
at is.hail.expr.ir.lowering.LoweringPass.apply$(LoweringPass.scala:13)
at is.hail.expr.ir.lowering.LowerAndExecuteShufflesPass.apply(LoweringPass.scala:151)
at is.hail.expr.ir.lowering.LoweringPipeline.$anonfun$apply$1(LoweringPipeline.scala:22)
at is.hail.expr.ir.lowering.LoweringPipeline.$anonfun$apply$1$adapted(LoweringPipeline.scala:20)
at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
at is.hail.expr.ir.lowering.LoweringPipeline.apply(LoweringPipeline.scala:20)
at is.hail.backend.service.ServiceBackend.execute(ServiceBackend.scala:312)
at is.hail.backend.service.ServiceBackend.execute(ServiceBackend.scala:348)
at is.hail.backend.service.ServiceBackendSocketAPI2.$anonfun$executeOneCommand$12(ServiceBackend.scala:700)
at is.hail.backend.service.ServiceBackendSocketAPI2.withIRFunctionsReadFromInput(ServiceBackend.scala:803)
at is.hail.backend.service.ServiceBackendSocketAPI2.$anonfun$executeOneCommand$11(ServiceBackend.scala:698)
at is.hail.backend.service.ServiceBackendSocketAPI2.$anonfun$executeOneCommand$2(ServiceBackend.scala:656)
at is.hail.backend.ExecuteContext$.$anonfun$scoped$3(ExecuteContext.scala:75)
at is.hail.utils.package$.using(package.scala:635)
at is.hail.backend.ExecuteContext$.$anonfun$scoped$2(ExecuteContext.scala:75)
at is.hail.utils.package$.using(package.scala:635)
at is.hail.annotations.RegionPool$.scoped(RegionPool.scala:17)
at is.hail.backend.ExecuteContext$.scoped(ExecuteContext.scala:63)
at is.hail.backend.service.ServiceBackendSocketAPI2.$anonfun$executeOneCommand$1(ServiceBackend.scala:646)
at is.hail.utils.ExecutionTimer$.time(ExecutionTimer.scala:52)
at is.hail.utils.ExecutionTimer$.logTime(ExecutionTimer.scala:59)
at is.hail.backend.service.ServiceBackendSocketAPI2.withExecuteContext$1(ServiceBackend.scala:633)
at is.hail.backend.service.ServiceBackendSocketAPI2.executeOneCommand(ServiceBackend.scala:695)
at is.hail.backend.service.ServiceBackendSocketAPI2$.$anonfun$main$6(ServiceBackend.scala:461)
at is.hail.backend.service.ServiceBackendSocketAPI2$.$anonfun$main$6$adapted(ServiceBackend.scala:460)
at is.hail.utils.package$.using(package.scala:635)
at is.hail.backend.service.ServiceBackendSocketAPI2$.$anonfun$main$5(ServiceBackend.scala:460)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at is.hail.services.package$.retryTransientErrors(package.scala:124)
at is.hail.backend.service.ServiceBackendSocketAPI2$.$anonfun$main$4(ServiceBackend.scala:460)
at is.hail.backend.service.ServiceBackendSocketAPI2$.$anonfun$main$4$adapted(ServiceBackend.scala:458)
at is.hail.utils.package$.using(package.scala:635)
at is.hail.backend.service.ServiceBackendSocketAPI2$.$anonfun$main$3(ServiceBackend.scala:458)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at is.hail.services.package$.retryTransientErrors(package.scala:124)
at is.hail.backend.service.ServiceBackendSocketAPI2$.main(ServiceBackend.scala:458)
at is.hail.backend.service.Main$.main(Main.scala:33)
at is.hail.backend.service.Main.main(Main.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at is.hail.JVMEntryway$1.run(JVMEntryway.java:105)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
Hail version: 0.2.115-71fc978b5c22
Error summary: SocketException: Connection reset
-------------------
Some more content from the failing worker job:
...
2023-05-04 01:04:35.959 : INFO: executing D-Array [shuffle_initial_write] with 1 tasks
2023-05-04 01:04:35.960 : INFO: RegionPool: initialized for thread 8: pool-1-thread-1
2023-05-04 01:04:35.965 GoogleStorageFS$: INFO: createNoCompression: gs://cpg-acute-care-hail/batch-tmp/tmp/hail/pV2Mgy4FVKSGKMwZGafyTh/hail_shuffle_temp_initial-ktRgTs8RfA9fHie5JKHmUy0e020450-e61c-4fa9-9419-2278528f3c86
2023-05-04 01:04:37.559 : INFO: TaskReport: stage=0, partition=0, attempt=0, peakBytes=132096, peakBytesReadable=129.00 KiB, chunks requested=0, cache hits=0
2023-05-04 01:04:37.560 : INFO: RegionPool: FREE: 129.0K allocated (129.0K blocks / 0 chunks), regions.size = 3, 0 current java objects, thread 8: pool-1-thread-1
2023-05-04 01:04:37.561 : ERROR: error while applying lowering 'LowerAndExecuteShuffles'
2023-05-04 01:04:37.600 : INFO: RegionPool: initialized for thread 8: pool-1-thread-1
2023-05-04 01:04:37.601 : INFO: TaskReport: stage=0, partition=0, attempt=0, peakBytes=0, peakBytesReadable=0.00 B, chunks requested=0, cache hits=0
2023-05-04 01:04:37.601 : INFO: RegionPool: FREE: 0 allocated (0 blocks / 0 chunks), regions.size = 0, 0 current java objects, thread 8: pool-1-thread-1
2023-05-04 01:04:37.601 : INFO: RegionPool: FREE: 128.0K allocated (128.0K blocks / 0 chunks), regions.size = 2, 0 current java objects, thread 8: pool-1-thread-1
2023-05-04 01:04:37.603 : ERROR: SocketException: Connection reset
From javax.net.ssl.SSLException: Connection reset
at sun.security.ssl.Alert.createSSLException(Alert.java:127)
at sun.security.ssl.TransportContext.fatal(TransportContext.java:324)
at sun.security.ssl.TransportContext.fatal(TransportContext.java:267)
at sun.security.ssl.TransportContext.fatal(TransportContext.java:262)
at sun.security.ssl.SSLTransport.decode(SSLTransport.java:138)
at sun.security.ssl.SSLSocketImpl.decode(SSLSocketImpl.java:1400)
at sun.security.ssl.SSLSocketImpl.readApplicationRecord(SSLSocketImpl.java:1368)
at sun.security.ssl.SSLSocketImpl.access$300(SSLSocketImpl.java:73)
at sun.security.ssl.SSLSocketImpl$AppInputStream.read(SSLSocketImpl.java:962)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:284)
at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
at sun.net.www.MeteredStream.read(MeteredStream.java:134)
at java.io.FilterInputStream.read(FilterInputStream.java:133)
at sun.net.www.protocol.http.HttpURLConnection$HttpInputStream.read(HttpURLConnection.java:3456)
at com.google.api.client.http.javanet.NetHttpResponse$SizeValidatingInputStream.read(NetHttpResponse.java:164)
at java.nio.channels.Channels$ReadableByteChannelImpl.read(Channels.java:385)
at is.hail.relocated.com.google.cloud.storage.StorageByteChannels$ScatteringByteChannelFacade.read(StorageByteChannels.java:226)
at is.hail.relocated.com.google.cloud.storage.ApiaryUnbufferedReadableByteChannel.read(ApiaryUnbufferedReadableByteChannel.java:104)
at is.hail.relocated.com.google.cloud.storage.UnbufferedReadableByteChannelSession$UnbufferedReadableByteChannel.read(UnbufferedReadableByteChannelSession.java:36)
at is.hail.relocated.com.google.cloud.storage.DefaultBufferedReadableByteChannel.read(DefaultBufferedReadableByteChannel.java:106)
at is.hail.relocated.com.google.cloud.storage.StorageByteChannels$SynchronizedBufferedReadableByteChannel.read(StorageByteChannels.java:84)
at is.hail.relocated.com.google.cloud.storage.BaseStorageReadChannel.read(BaseStorageReadChannel.java:91)
at is.hail.io.fs.GoogleStorageFS$$anon$1.readHandlingRequesterPays(GoogleStorageFS.scala:205)
at is.hail.io.fs.GoogleStorageFS$$anon$1.fill(GoogleStorageFS.scala:242)
at is.hail.io.fs.FSSeekableInputStream.read(FS.scala:164)
at java.io.DataInputStream.read(DataInputStream.java:100)
at is.hail.expr.ir.GenericLines$$anon$2.loadBuffer(GenericLines.scala:84)
at is.hail.expr.ir.GenericLines$$anon$2.readLine(GenericLines.scala:194)
at is.hail.expr.ir.GenericLines$$anon$2.hasNext(GenericLines.scala:214)
at __C18collect_distributed_array_shuffle_initial_write.apply_region1_42(Unknown Source)
at __C18collect_distributed_array_shuffle_initial_write.apply(Unknown Source)
at __C18collect_distributed_array_shuffle_initial_write.apply(Unknown Source)
at is.hail.backend.BackendUtils.$anonfun$collectDArray$2(BackendUtils.scala:38)
at is.hail.utils.package$.using(package.scala:635)
at is.hail.backend.BackendUtils.$anonfun$collectDArray$1(BackendUtils.scala:37)
at is.hail.utils.package$.using(package.scala:635)
at is.hail.backend.BackendUtils.collectDArray(BackendUtils.scala:36)
at __C5Compiled.__m7split_Let(Emit.scala)
at __C5Compiled.apply(Emit.scala)
at is.hail.expr.ir.CompileAndEvaluate$.$anonfun$_apply$7(CompileAndEvaluate.scala:74)
at scala.runtime.java8.JFunction0$mcJ$sp.apply(JFunction0$mcJ$sp.java:23)
at is.hail.utils.ExecutionTimer.time(ExecutionTimer.scala:81)
at is.hail.expr.ir.CompileAndEvaluate$._apply(CompileAndEvaluate.scala:74)
at is.hail.expr.ir.CompileAndEvaluate$.$anonfun$apply$1(CompileAndEvaluate.scala:19)
at is.hail.utils.ExecutionTimer.time(ExecutionTimer.scala:81)
at is.hail.expr.ir.CompileAndEvaluate$.apply(CompileAndEvaluate.scala:19)
at is.hail.expr.ir.lowering.LowerDistributedSort$.distributedSort(LowerDistributedSort.scala:163)
at is.hail.backend.service.ServiceBackend.lowerDistributedSort(ServiceBackend.scala:356)
at is.hail.backend.Backend.lowerDistributedSort(Backend.scala:100)
at is.hail.expr.ir.lowering.LowerAndExecuteShuffles$.$anonfun$apply$1(LowerAndExecuteShuffles.scala:23)
at is.hail.expr.ir.RewriteBottomUp$.$anonfun$apply$4(RewriteBottomUp.scala:26)
at is.hail.utils.StackSafe$More.advance(StackSafe.scala:60)
at is.hail.utils.StackSafe$.run(StackSafe.scala:16)
at is.hail.utils.StackSafe$StackFrame.run(StackSafe.scala:32)
at is.hail.expr.ir.RewriteBottomUp$.apply(RewriteBottomUp.scala:36)
at is.hail.expr.ir.lowering.LowerAndExecuteShuffles$.apply(LowerAndExecuteShuffles.scala:20)
at is.hail.expr.ir.lowering.LowerAndExecuteShufflesPass.transform(LoweringPass.scala:157)
at is.hail.expr.ir.lowering.LoweringPass.$anonfun$apply$3(LoweringPass.scala:16)
at is.hail.utils.ExecutionTimer.time(ExecutionTimer.scala:81)
at is.hail.expr.ir.lowering.LoweringPass.$anonfun$apply$1(LoweringPass.scala:16)
at is.hail.utils.ExecutionTimer.time(ExecutionTimer.scala:81)
at is.hail.expr.ir.lowering.LoweringPass.apply(LoweringPass.scala:14)
at is.hail.expr.ir.lowering.LoweringPass.apply$(LoweringPass.scala:13)
at is.hail.expr.ir.lowering.LowerAndExecuteShufflesPass.apply(LoweringPass.scala:151)
at is.hail.expr.ir.lowering.LoweringPipeline.$anonfun$apply$1(LoweringPipeline.scala:22)
at is.hail.expr.ir.lowering.LoweringPipeline.$anonfun$apply$1$adapted(LoweringPipeline.scala:20)
at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
at is.hail.expr.ir.lowering.LoweringPipeline.apply(LoweringPipeline.scala:20)
at is.hail.backend.service.ServiceBackend.execute(ServiceBackend.scala:312)
at is.hail.backend.service.ServiceBackend.execute(ServiceBackend.scala:348)
at is.hail.backend.service.ServiceBackendSocketAPI2.$anonfun$executeOneCommand$12(ServiceBackend.scala:700)
at is.hail.backend.service.ServiceBackendSocketAPI2.withIRFunctionsReadFromInput(ServiceBackend.scala:803)
at is.hail.backend.service.ServiceBackendSocketAPI2.$anonfun$executeOneCommand$11(ServiceBackend.scala:698)
at is.hail.backend.service.ServiceBackendSocketAPI2.$anonfun$executeOneCommand$2(ServiceBackend.scala:656)
at is.hail.backend.ExecuteContext$.$anonfun$scoped$3(ExecuteContext.scala:75)
at is.hail.utils.package$.using(package.scala:635)
at is.hail.backend.ExecuteContext$.$anonfun$scoped$2(ExecuteContext.scala:75)
at is.hail.utils.package$.using(package.scala:635)
at is.hail.annotations.RegionPool$.scoped(RegionPool.scala:17)
at is.hail.backend.ExecuteContext$.scoped(ExecuteContext.scala:63)
at is.hail.backend.service.ServiceBackendSocketAPI2.$anonfun$executeOneCommand$1(ServiceBackend.scala:646)
at is.hail.utils.ExecutionTimer$.time(ExecutionTimer.scala:52)
at is.hail.utils.ExecutionTimer$.logTime(ExecutionTimer.scala:59)
at is.hail.backend.service.ServiceBackendSocketAPI2.withExecuteContext$1(ServiceBackend.scala:633)
at is.hail.backend.service.ServiceBackendSocketAPI2.executeOneCommand(ServiceBackend.scala:695)
at is.hail.backend.service.ServiceBackendSocketAPI2$.$anonfun$main$6(ServiceBackend.scala:461)
at is.hail.backend.service.ServiceBackendSocketAPI2$.$anonfun$main$6$adapted(ServiceBackend.scala:460)
at is.hail.utils.package$.using(package.scala:635)
at is.hail.backend.service.ServiceBackendSocketAPI2$.$anonfun$main$5(ServiceBackend.scala:460)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at is.hail.services.package$.retryTransientErrors(package.scala:124)
at is.hail.backend.service.ServiceBackendSocketAPI2$.$anonfun$main$4(ServiceBackend.scala:460)
at is.hail.backend.service.ServiceBackendSocketAPI2$.$anonfun$main$4$adapted(ServiceBackend.scala:458)
at is.hail.utils.package$.using(package.scala:635)
at is.hail.backend.service.ServiceBackendSocketAPI2$.$anonfun$main$3(ServiceBackend.scala:458)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at is.hail.services.package$.retryTransientErrors(package.scala:124)
at is.hail.backend.service.ServiceBackendSocketAPI2$.main(ServiceBackend.scala:458)
at is.hail.backend.service.Main$.main(Main.scala:33)
at is.hail.backend.service.Main.main(Main.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at is.hail.JVMEntryway$1.run(JVMEntryway.java:105)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
java.net.SocketException: Connection reset
at java.net.SocketInputStream.read(SocketInputStream.java:210)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at sun.security.ssl.SSLSocketInputRecord.read(SSLSocketInputRecord.java:464)
at sun.security.ssl.SSLSocketInputRecord.decodeInputRecord(SSLSocketInputRecord.java:237)
at sun.security.ssl.SSLSocketInputRecord.decode(SSLSocketInputRecord.java:190)
at sun.security.ssl.SSLTransport.decode(SSLTransport.java:109)
at sun.security.ssl.SSLSocketImpl.decode(SSLSocketImpl.java:1400)
at sun.security.ssl.SSLSocketImpl.readApplicationRecord(SSLSocketImpl.java:1368)
at sun.security.ssl.SSLSocketImpl.access$300(SSLSocketImpl.java:73)
at sun.security.ssl.SSLSocketImpl$AppInputStream.read(SSLSocketImpl.java:962)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:284)
at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
at sun.net.www.MeteredStream.read(MeteredStream.java:134)
at java.io.FilterInputStream.read(FilterInputStream.java:133)
at sun.net.www.protocol.http.HttpURLConnection$HttpInputStream.read(HttpURLConnection.java:3456)
at com.google.api.client.http.javanet.NetHttpResponse$SizeValidatingInputStream.read(NetHttpResponse.java:164)
at java.nio.channels.Channels$ReadableByteChannelImpl.read(Channels.java:385)
at is.hail.relocated.com.google.cloud.storage.StorageByteChannels$ScatteringByteChannelFacade.read(StorageByteChannels.java:226)
at is.hail.relocated.com.google.cloud.storage.ApiaryUnbufferedReadableByteChannel.read(ApiaryUnbufferedReadableByteChannel.java:104)
at is.hail.relocated.com.google.cloud.storage.UnbufferedReadableByteChannelSession$UnbufferedReadableByteChannel.read(UnbufferedReadableByteChannelSession.java:36)
at is.hail.relocated.com.google.cloud.storage.DefaultBufferedReadableByteChannel.read(DefaultBufferedReadableByteChannel.java:106)
at is.hail.relocated.com.google.cloud.storage.StorageByteChannels$SynchronizedBufferedReadableByteChannel.read(StorageByteChannels.java:84)
at is.hail.relocated.com.google.cloud.storage.BaseStorageReadChannel.read(BaseStorageReadChannel.java:91)
at is.hail.io.fs.GoogleStorageFS$$anon$1.readHandlingRequesterPays(GoogleStorageFS.scala:205)
at is.hail.io.fs.GoogleStorageFS$$anon$1.fill(GoogleStorageFS.scala:242)
at is.hail.io.fs.FSSeekableInputStream.read(FS.scala:164)
at java.io.DataInputStream.read(DataInputStream.java:100)
at is.hail.expr.ir.GenericLines$$anon$2.loadBuffer(GenericLines.scala:84)
at is.hail.expr.ir.GenericLines$$anon$2.readLine(GenericLines.scala:194)
at is.hail.expr.ir.GenericLines$$anon$2.hasNext(GenericLines.scala:214)
at __C18collect_distributed_array_shuffle_initial_write.apply_region1_42(Unknown Source)
at __C18collect_distributed_array_shuffle_initial_write.apply(Unknown Source)
at __C18collect_distributed_array_shuffle_initial_write.apply(Unknown Source)
at is.hail.backend.BackendUtils.$anonfun$collectDArray$2(BackendUtils.scala:38)
at is.hail.utils.package$.using(package.scala:635)
at is.hail.backend.BackendUtils.$anonfun$collectDArray$1(BackendUtils.scala:37)
at is.hail.utils.package$.using(package.scala:635)
at is.hail.backend.BackendUtils.collectDArray(BackendUtils.scala:36)
at __C5Compiled.__m7split_Let(Emit.scala)
at __C5Compiled.apply(Emit.scala)
at is.hail.expr.ir.CompileAndEvaluate$.$anonfun$_apply$7(CompileAndEvaluate.scala:74)
at scala.runtime.java8.JFunction0$mcJ$sp.apply(JFunction0$mcJ$sp.java:23)
at is.hail.utils.ExecutionTimer.time(ExecutionTimer.scala:81)
at is.hail.expr.ir.CompileAndEvaluate$._apply(CompileAndEvaluate.scala:74)
at is.hail.expr.ir.CompileAndEvaluate$.$anonfun$apply$1(CompileAndEvaluate.scala:19)
at is.hail.utils.ExecutionTimer.time(ExecutionTimer.scala:81)
at is.hail.expr.ir.CompileAndEvaluate$.apply(CompileAndEvaluate.scala:19)
at is.hail.expr.ir.lowering.LowerDistributedSort$.distributedSort(LowerDistributedSort.scala:163)
at is.hail.backend.service.ServiceBackend.lowerDistributedSort(ServiceBackend.scala:356)
at is.hail.backend.Backend.lowerDistributedSort(Backend.scala:100)
at is.hail.expr.ir.lowering.LowerAndExecuteShuffles$.$anonfun$apply$1(LowerAndExecuteShuffles.scala:23)
at is.hail.expr.ir.RewriteBottomUp$.$anonfun$apply$4(RewriteBottomUp.scala:26)
at is.hail.utils.StackSafe$More.advance(StackSafe.scala:60)
at is.hail.utils.StackSafe$.run(StackSafe.scala:16)
at is.hail.utils.StackSafe$StackFrame.run(StackSafe.scala:32)
at is.hail.expr.ir.RewriteBottomUp$.apply(RewriteBottomUp.scala:36)
at is.hail.expr.ir.lowering.LowerAndExecuteShuffles$.apply(LowerAndExecuteShuffles.scala:20)
at is.hail.expr.ir.lowering.LowerAndExecuteShufflesPass.transform(LoweringPass.scala:157)
at is.hail.expr.ir.lowering.LoweringPass.$anonfun$apply$3(LoweringPass.scala:16)
at is.hail.utils.ExecutionTimer.time(ExecutionTimer.scala:81)
at is.hail.expr.ir.lowering.LoweringPass.$anonfun$apply$1(LoweringPass.scala:16)
at is.hail.utils.ExecutionTimer.time(ExecutionTimer.scala:81)
at is.hail.expr.ir.lowering.LoweringPass.apply(LoweringPass.scala:14)
at is.hail.expr.ir.lowering.LoweringPass.apply$(LoweringPass.scala:13)
at is.hail.expr.ir.lowering.LowerAndExecuteShufflesPass.apply(LoweringPass.scala:151)
at is.hail.expr.ir.lowering.LoweringPipeline.$anonfun$apply$1(LoweringPipeline.scala:22)
at is.hail.expr.ir.lowering.LoweringPipeline.$anonfun$apply$1$adapted(LoweringPipeline.scala:20)
at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
at is.hail.expr.ir.lowering.LoweringPipeline.apply(LoweringPipeline.scala:20)
at is.hail.backend.service.ServiceBackend.execute(ServiceBackend.scala:312)
at is.hail.backend.service.ServiceBackend.execute(ServiceBackend.scala:348)
at is.hail.backend.service.ServiceBackendSocketAPI2.$anonfun$executeOneCommand$12(ServiceBackend.scala:700)
at is.hail.backend.service.ServiceBackendSocketAPI2.withIRFunctionsReadFromInput(ServiceBackend.scala:803)
at is.hail.backend.service.ServiceBackendSocketAPI2.$anonfun$executeOneCommand$11(ServiceBackend.scala:698)
at is.hail.backend.service.ServiceBackendSocketAPI2.$anonfun$executeOneCommand$2(ServiceBackend.scala:656)
at is.hail.backend.ExecuteContext$.$anonfun$scoped$3(ExecuteContext.scala:75)
at is.hail.utils.package$.using(package.scala:635)
at is.hail.backend.ExecuteContext$.$anonfun$scoped$2(ExecuteContext.scala:75)
at is.hail.utils.package$.using(package.scala:635)
at is.hail.annotations.RegionPool$.scoped(RegionPool.scala:17)
at is.hail.backend.ExecuteContext$.scoped(ExecuteContext.scala:63)
at is.hail.backend.service.ServiceBackendSocketAPI2.$anonfun$executeOneCommand$1(ServiceBackend.scala:646)
at is.hail.utils.ExecutionTimer$.time(ExecutionTimer.scala:52)
at is.hail.utils.ExecutionTimer$.logTime(ExecutionTimer.scala:59)
at is.hail.backend.service.ServiceBackendSocketAPI2.withExecuteContext$1(ServiceBackend.scala:633)
at is.hail.backend.service.ServiceBackendSocketAPI2.executeOneCommand(ServiceBackend.scala:695)
at is.hail.backend.service.ServiceBackendSocketAPI2$.$anonfun$main$6(ServiceBackend.scala:461)
at is.hail.backend.service.ServiceBackendSocketAPI2$.$anonfun$main$6$adapted(ServiceBackend.scala:460)
at is.hail.utils.package$.using(package.scala:635)
at is.hail.backend.service.ServiceBackendSocketAPI2$.$anonfun$main$5(ServiceBackend.scala:460)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at is.hail.services.package$.retryTransientErrors(package.scala:124)
at is.hail.backend.service.ServiceBackendSocketAPI2$.$anonfun$main$4(ServiceBackend.scala:460)
at is.hail.backend.service.ServiceBackendSocketAPI2$.$anonfun$main$4$adapted(ServiceBackend.scala:458)
at is.hail.utils.package$.using(package.scala:635)
at is.hail.backend.service.ServiceBackendSocketAPI2$.$anonfun$main$3(ServiceBackend.scala:458)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at is.hail.services.package$.retryTransientErrors(package.scala:124)
at is.hail.backend.service.ServiceBackendSocketAPI2$.main(ServiceBackend.scala:458)
at is.hail.backend.service.Main$.main(Main.scala:33)
at is.hail.backend.service.Main.main(Main.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at is.hail.JVMEntryway$1.run(JVMEntryway.java:105)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
2023-05-04 01:04:37.742 GoogleStorageFS$: INFO: close: gs://cpg-acute-care-hail/batch-tmp/tmp/hail/pV2Mgy4FVKSGKMwZGafyTh/dRRY6iUfFz/out
2023-05-04 01:04:38.077 GoogleStorageFS$: INFO: closed: gs://cpg-acute-care-hail/batch-tmp/tmp/hail/pV2Mgy4FVKSGKMwZGafyTh/dRRY6iUfFz/out
Fixes #12983
---
After an `FSSeekableInputStream` method (successfully) returns,
`getPosition` always represents the intended location within the object.
We can entirely eliminate `lazyPosition` because it tracks the
same value as `getPosition` when `reader == null`.
For retryable reads, we just seek back to the known correct location of
the stream before attempting to read again.
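
The retry-by-seek idea can be sketched as follows. This is only an illustration under stated assumptions, not the actual change to `GoogleStorageFS`: `FlakySource` and `RetryingReader` are hypothetical stand-ins, and `maxAttempts` is an invented parameter. The point it shows is that the tracked position only advances after a successful read, so a failed read can always be retried from a known-good offset, even if the underlying connection consumed bytes before it dropped.

```scala
import java.io.IOException
import java.nio.ByteBuffer
import scala.annotation.tailrec

object RetryReadSketch {
  // Hypothetical stand-in for a remote object; the first `failures` reads throw.
  final class FlakySource(data: Array[Byte], private var failures: Int) {
    private var offset = 0 // where the underlying "connection" currently points

    def seek(pos: Long): Unit = offset = pos.toInt

    def read(bb: ByteBuffer): Int = {
      if (failures > 0) {
        failures -= 1
        offset += 1 // simulate a stream that advanced before the error
        throw new IOException("Connection reset")
      }
      if (offset >= data.length) -1
      else {
        val n = math.min(bb.remaining(), data.length - offset)
        bb.put(data, offset, n)
        offset += n
        n
      }
    }
  }

  // Hypothetical reader: `pos` is the intended location and is only
  // updated once a read has returned successfully.
  final class RetryingReader(src: FlakySource, maxAttempts: Int = 5) {
    private var pos = 0L
    def getPosition: Long = pos

    def read(bb: ByteBuffer): Int = {
      val mark = bb.position() // discard any partial fill on retry

      @tailrec def attempt(left: Int): Int = {
        bb.position(mark)
        src.seek(pos) // always restart from the known-correct offset
        val result =
          try Right(src.read(bb))
          catch { case e: IOException => Left(e) }
        result match {
          case Right(n) =>
            if (n > 0) pos += n
            n
          case Left(e) =>
            if (left <= 1) throw e else attempt(left - 1)
        }
      }

      attempt(maxAttempts)
    }
  }
}
```

With a source that fails twice, a single `read` call on `RetryingReader` still fills the buffer from offset 0, and `getPosition` reflects only the bytes that were actually delivered. A real implementation would presumably retry only errors classified as transient rather than every `IOException`.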