Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Flaky Hive product tests: Error committing write to Hive #4936

Closed
ebyhr opened this issue Aug 23, 2020 · 21 comments
Closed

Flaky Hive product tests: Error committing write to Hive #4936

ebyhr opened this issue Aug 23, 2020 · 21 comments
Labels
bug Something isn't working test

Comments

@ebyhr
Copy link
Member

ebyhr commented Aug 23, 2020

2020-08-23T02:43:55.0287769Z presto-master       | 2020-08-23T08:28:55.018+0545	ERROR	remote-task-callback-7	io.prestosql.execution.StageStateMachine	Stage 20200823_024354_00034_zq34q.1 failed
2020-08-23T02:43:55.0288266Z presto-master       | io.prestosql.spi.PrestoException: Error committing write to Hive
2020-08-23T02:43:55.0288614Z presto-master       | 	at io.prestosql.plugin.hive.orc.OrcFileWriter.commit(OrcFileWriter.java:155)
2020-08-23T02:43:55.0288938Z presto-master       | 	at io.prestosql.plugin.hive.HiveWriter.commit(HiveWriter.java:86)
2020-08-23T02:43:55.0289270Z presto-master       | 	at io.prestosql.plugin.hive.HivePageSink.doFinish(HivePageSink.java:190)
2020-08-23T02:43:55.0289665Z presto-master       | 	at io.prestosql.plugin.hive.authentication.UserGroupInformationUtils.lambda$executeActionInDoAs$0(UserGroupInformationUtils.java:29)
2020-08-23T02:43:55.0290001Z presto-master       | 	at java.base/java.security.AccessController.doPrivileged(Native Method)
2020-08-23T02:43:55.0290302Z presto-master       | 	at java.base/javax.security.auth.Subject.doAs(Subject.java:361)
2020-08-23T02:43:55.0290652Z presto-master       | 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1710)
2020-08-23T02:43:55.0291036Z presto-master       | 	at io.prestosql.plugin.hive.authentication.UserGroupInformationUtils.executeActionInDoAs(UserGroupInformationUtils.java:27)
2020-08-23T02:43:55.0291421Z presto-master       | 	at io.prestosql.plugin.hive.authentication.ImpersonatingHdfsAuthentication.doAs(ImpersonatingHdfsAuthentication.java:39)
2020-08-23T02:43:55.0291760Z presto-master       | 	at io.prestosql.plugin.hive.HdfsEnvironment.doAs(HdfsEnvironment.java:89)
2020-08-23T02:43:55.0292082Z presto-master       | 	at io.prestosql.plugin.hive.HivePageSink.finish(HivePageSink.java:181)
2020-08-23T02:43:55.0292453Z presto-master       | 	at io.prestosql.plugin.base.classloader.ClassLoaderSafeConnectorPageSink.finish(ClassLoaderSafeConnectorPageSink.java:77)
2020-08-23T02:43:55.0292799Z presto-master       | 	at io.prestosql.operator.TableWriterOperator.finish(TableWriterOperator.java:203)
2020-08-23T02:43:55.0293124Z presto-master       | 	at io.prestosql.operator.Driver.processInternal(Driver.java:397)
2020-08-23T02:43:55.0293482Z presto-master       | 	at io.prestosql.operator.Driver.lambda$processFor$8(Driver.java:283)
2020-08-23T02:43:55.0293801Z presto-master       | 	at io.prestosql.operator.Driver.tryWithLock(Driver.java:675)
2020-08-23T02:43:55.0294105Z presto-master       | 	at io.prestosql.operator.Driver.processFor(Driver.java:276)
2020-08-23T02:43:55.0294439Z presto-master       | 	at io.prestosql.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:1076)
2020-08-23T02:43:55.0294791Z presto-master       | 	at io.prestosql.execution.executor.PrioritizedSplitRunner.process(PrioritizedSplitRunner.java:163)
2020-08-23T02:43:55.0295127Z presto-master       | 	at io.prestosql.execution.executor.TaskExecutor$TaskRunner.run(TaskExecutor.java:484)
2020-08-23T02:43:55.0295460Z presto-master       | 	at io.prestosql.$gen.Presto_1cc5db2____20200823_024202_2.run(Unknown Source)
2020-08-23T02:43:55.0295815Z presto-master       | 	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
2020-08-23T02:43:55.0296262Z presto-master       | 	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
2020-08-23T02:43:55.0296531Z presto-master       | 	at java.base/java.lang.Thread.run(Thread.java:834)
2020-08-23T02:43:55.0297045Z presto-master       | Caused by: org.apache.hadoop.ipc.RemoteException: File /tmp/presto-hive/569fc3af-2a52-4a1b-85d4-38e844339646/20200823_024354_00034_zq34q_82a58b59-5216-45f7-84ed-c6af8b18f34a could only be written to 0 of the 1 minReplication nodes. There are 1 datanode(s) running and no node(s) are excluded in this operation.
2020-08-23T02:43:55.0297401Z presto-master       | 	at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:2121)
2020-08-23T02:43:55.0297735Z presto-master       | 	at org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:286)
2020-08-23T02:43:55.0298184Z presto-master       | 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2706)
2020-08-23T02:43:55.0298515Z presto-master       | 	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:875)
2020-08-23T02:43:55.0298877Z presto-master       | 	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:561)
2020-08-23T02:43:55.0299250Z presto-master       | 	at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
2020-08-23T02:43:55.0299576Z presto-master       | 	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
2020-08-23T02:43:55.0299850Z presto-master       | 	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
2020-08-23T02:43:55.0300130Z presto-master       | 	at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
2020-08-23T02:43:55.0300399Z presto-master       | 	at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
2020-08-23T02:43:55.0300668Z presto-master       | 	at java.security.AccessController.doPrivileged(Native Method)
2020-08-23T02:43:55.0300930Z presto-master       | 	at javax.security.auth.Subject.doAs(Subject.java:422)
2020-08-23T02:43:55.0301225Z presto-master       | 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
2020-08-23T02:43:55.0301484Z presto-master       | 	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)
2020-08-23T02:43:55.0301696Z presto-master       | 
2020-08-23T02:43:55.0301955Z presto-master       | 	at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1511)
2020-08-23T02:43:55.0302216Z presto-master       | 	at org.apache.hadoop.ipc.Client.call(Client.java:1457)
2020-08-23T02:43:55.0302476Z presto-master       | 	at org.apache.hadoop.ipc.Client.call(Client.java:1367)
2020-08-23T02:43:55.0302773Z presto-master       | 	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
2020-08-23T02:43:55.0303066Z presto-master       | 	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
2020-08-23T02:43:55.0303329Z presto-master       | 	at com.sun.proxy.$Proxy295.addBlock(Unknown Source)
2020-08-23T02:43:55.0303660Z presto-master       | 	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:513)
2020-08-23T02:43:55.0303962Z presto-master       | 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
2020-08-23T02:43:55.0304273Z presto-master       | 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
2020-08-23T02:43:55.0304596Z presto-master       | 	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
2020-08-23T02:43:55.0304870Z presto-master       | 	at java.base/java.lang.reflect.Method.invoke(Method.java:566)
2020-08-23T02:43:55.0305179Z presto-master       | 	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
2020-08-23T02:43:55.0305494Z presto-master       | 	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
2020-08-23T02:43:55.0305802Z presto-master       | 	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
2020-08-23T02:43:55.0306112Z presto-master       | 	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
2020-08-23T02:43:55.0306415Z presto-master       | 	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
2020-08-23T02:43:55.0306680Z presto-master       | 	at com.sun.proxy.$Proxy296.addBlock(Unknown Source)
2020-08-23T02:43:55.0307244Z presto-master       | 	at org.apache.hadoop.hdfs.DFSOutputStream.addBlock(DFSOutputStream.java:1081)
2020-08-23T02:43:55.0307596Z presto-master       | 	at org.apache.hadoop.hdfs.DataStreamer.locateFollowingBlock(DataStreamer.java:1865)
2020-08-23T02:43:55.0307932Z presto-master       | 	at org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1668)
2020-08-23T02:43:55.0308248Z presto-master       | 	at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:716)
2020-08-23T02:43:55.0308490Z presto-master       | 
2020-08-23T02:43:55.0308702Z presto-master       | 
2020-08-23T02:43:55.0582129Z tests               | 2020-08-23 08:28:55 INFO: FAILURE     /    io.prestosql.tests.hive.TestHiveCreateTable.testCreateTable (Groups: storage_formats) took 0.7 seconds
2020-08-23T02:43:55.0636265Z tests               | 2020-08-23 08:28:55 SEVERE: Failure cause:
2020-08-23T02:43:55.0636502Z tests               | io.prestosql.tempto.query.QueryExecutionException: java.sql.SQLException: Query failed (#20200823_024354_00034_zq34q): Error committing write to Hive
2020-08-23T02:43:55.0636726Z tests               | 	at io.prestosql.tempto.query.JdbcQueryExecutor.execute(JdbcQueryExecutor.java:114)
2020-08-23T02:43:55.0636895Z tests               | 	at io.prestosql.tempto.query.JdbcQueryExecutor.executeQuery(JdbcQueryExecutor.java:82)
2020-08-23T02:43:55.0637068Z tests               | 	at io.prestosql.tests.hive.TestHiveCreateTable.testCreateTable(TestHiveCreateTable.java:42)
2020-08-23T02:43:55.0637236Z tests               | 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
2020-08-23T02:43:55.0637410Z tests               | 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
2020-08-23T02:43:55.0637596Z tests               | 	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
2020-08-23T02:43:55.0637758Z tests               | 	at java.base/java.lang.reflect.Method.invoke(Method.java:566)
2020-08-23T02:43:55.0637932Z tests               | 	at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:104)
2020-08-23T02:43:55.0638090Z tests               | 	at org.testng.internal.Invoker.invokeMethod(Invoker.java:645)
2020-08-23T02:43:55.0638241Z tests               | 	at org.testng.internal.Invoker.invokeTestMethod(Invoker.java:851)
2020-08-23T02:43:55.0638392Z tests               | 	at org.testng.internal.Invoker.invokeTestMethods(Invoker.java:1177)
2020-08-23T02:43:55.0638550Z tests               | 	at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:129)
2020-08-23T02:43:55.0638711Z tests               | 	at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:112)
2020-08-23T02:43:55.0638874Z tests               | 	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
2020-08-23T02:43:55.0639025Z tests               | 	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
2020-08-23T02:43:55.0639185Z tests               | 	at java.base/java.lang.Thread.run(Thread.java:834)
2020-08-23T02:43:55.0639353Z tests               | Caused by: java.sql.SQLException: Query failed (#20200823_024354_00034_zq34q): Error committing write to Hive
2020-08-23T02:43:55.0639526Z tests               | 	at io.prestosql.jdbc.AbstractPrestoResultSet.resultsException(AbstractPrestoResultSet.java:1757)
2020-08-23T02:43:55.0639695Z tests               | 	at io.prestosql.jdbc.PrestoResultSet$ResultsPageIterator.computeNext(PrestoResultSet.java:225)
2020-08-23T02:43:55.0639856Z tests               | 	at io.prestosql.jdbc.PrestoResultSet$ResultsPageIterator.computeNext(PrestoResultSet.java:185)
2020-08-23T02:43:55.0640024Z tests               | 	at io.prestosql.jdbc.$internal.guava.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:141)
2020-08-23T02:43:55.0640299Z tests               | 	at io.prestosql.jdbc.$internal.guava.collect.AbstractIterator.hasNext(AbstractIterator.java:136)
2020-08-23T02:43:55.0640719Z tests               | 	at java.base/java.util.Spliterators$IteratorSpliterator.tryAdvance(Spliterators.java:1811)
2020-08-23T02:43:55.0640887Z tests               | 	at java.base/java.util.stream.StreamSpliterators$WrappingSpliterator.lambda$initPartialTraversalState$0(StreamSpliterators.java:294)
2020-08-23T02:43:55.0641050Z tests               | 	at java.base/java.util.stream.StreamSpliterators$AbstractWrappingSpliterator.fillBuffer(StreamSpliterators.java:206)
2020-08-23T02:43:55.0641210Z tests               | 	at java.base/java.util.stream.StreamSpliterators$AbstractWrappingSpliterator.doAdvance(StreamSpliterators.java:161)
2020-08-23T02:43:55.0641370Z tests               | 	at java.base/java.util.stream.StreamSpliterators$WrappingSpliterator.tryAdvance(StreamSpliterators.java:300)
2020-08-23T02:43:55.0641522Z tests               | 	at java.base/java.util.Spliterators$1Adapter.hasNext(Spliterators.java:681)
2020-08-23T02:43:55.0641675Z tests               | 	at io.prestosql.jdbc.PrestoResultSet$AsyncIterator.lambda$new$0(PrestoResultSet.java:131)
2020-08-23T02:43:55.0641816Z tests               | 	at java.base/java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1736)
2020-08-23T02:43:55.0641947Z tests               | 	... 3 more
2020-08-23T02:43:55.0642082Z tests               | Caused by: io.prestosql.spi.PrestoException: Error committing write to Hive
2020-08-23T02:43:55.0642231Z tests               | 	at io.prestosql.plugin.hive.orc.OrcFileWriter.commit(OrcFileWriter.java:155)
2020-08-23T02:43:55.0642375Z tests               | 	at io.prestosql.plugin.hive.HiveWriter.commit(HiveWriter.java:86)
2020-08-23T02:43:55.0642520Z tests               | 	at io.prestosql.plugin.hive.HivePageSink.doFinish(HivePageSink.java:190)
2020-08-23T02:43:55.0642684Z tests               | 	at io.prestosql.plugin.hive.authentication.UserGroupInformationUtils.lambda$executeActionInDoAs$0(UserGroupInformationUtils.java:29)
2020-08-23T02:43:55.0642846Z tests               | 	at java.base/java.security.AccessController.doPrivileged(Native Method)
2020-08-23T02:43:55.0642991Z tests               | 	at java.base/javax.security.auth.Subject.doAs(Subject.java:361)
2020-08-23T02:43:55.0643140Z tests               | 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1710)
2020-08-23T02:43:55.0643296Z tests               | 	at io.prestosql.plugin.hive.authentication.UserGroupInformationUtils.executeActionInDoAs(UserGroupInformationUtils.java:27)
2020-08-23T02:43:55.0643463Z tests               | 	at io.prestosql.plugin.hive.authentication.ImpersonatingHdfsAuthentication.doAs(ImpersonatingHdfsAuthentication.java:39)
2020-08-23T02:43:55.0643622Z tests               | 	at io.prestosql.plugin.hive.HdfsEnvironment.doAs(HdfsEnvironment.java:89)
2020-08-23T02:43:55.0643751Z tests               | 	at io.prestosql.plugin.hive.HivePageSink.finish(HivePageSink.java:181)
2020-08-23T02:43:55.0643909Z tests               | 	at io.prestosql.plugin.base.classloader.ClassLoaderSafeConnectorPageSink.finish(ClassLoaderSafeConnectorPageSink.java:77)
2020-08-23T02:43:55.0644276Z tests               | 	at io.prestosql.operator.TableWriterOperator.finish(TableWriterOperator.java:203)
2020-08-23T02:43:55.0644428Z tests               | 	at io.prestosql.operator.Driver.processInternal(Driver.java:397)
2020-08-23T02:43:55.0644573Z tests               | 	at io.prestosql.operator.Driver.lambda$processFor$8(Driver.java:283)
2020-08-23T02:43:55.0644715Z tests               | 	at io.prestosql.operator.Driver.tryWithLock(Driver.java:675)
2020-08-23T02:43:55.0644855Z tests               | 	at io.prestosql.operator.Driver.processFor(Driver.java:276)
2020-08-23T02:43:55.0645004Z tests               | 	at io.prestosql.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:1076)
2020-08-23T02:43:55.0645158Z tests               | 	at io.prestosql.execution.executor.PrioritizedSplitRunner.process(PrioritizedSplitRunner.java:163)
2020-08-23T02:43:55.0645458Z tests               | 	at io.prestosql.execution.executor.TaskExecutor$TaskRunner.run(TaskExecutor.java:484)
2020-08-23T02:43:55.0645609Z tests               | 	at io.prestosql.$gen.Presto_1cc5db2____20200823_024202_2.run(Unknown Source)
2020-08-23T02:43:55.0645760Z tests               | 	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
2020-08-23T02:43:55.0645898Z tests               | 	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
2020-08-23T02:43:55.0646039Z tests               | 	at java.base/java.lang.Thread.run(Thread.java:834)
2020-08-23T02:43:55.0646742Z tests               | Caused by: org.apache.hadoop.ipc.RemoteException: File /tmp/presto-hive/569fc3af-2a52-4a1b-85d4-38e844339646/20200823_024354_00034_zq34q_82a58b59-5216-45f7-84ed-c6af8b18f34a could only be written to 0 of the 1 minReplication nodes. There are 1 datanode(s) running and no node(s) are excluded in this operation.
2020-08-23T02:43:55.0646969Z tests               | 	at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:2121)
2020-08-23T02:43:55.0647144Z tests               | 	at org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:286)
2020-08-23T02:43:55.0647313Z tests               | 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2706)
2020-08-23T02:43:55.0647482Z tests               | 	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:875)
2020-08-23T02:43:55.0647657Z tests               | 	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:561)
2020-08-23T02:43:55.0647837Z tests               | 	at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
2020-08-23T02:43:55.0648019Z tests               | 	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
2020-08-23T02:43:55.0648164Z tests               | 	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
2020-08-23T02:43:55.0648304Z tests               | 	at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
2020-08-23T02:43:55.0648447Z tests               | 	at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
2020-08-23T02:43:55.0648589Z tests               | 	at java.security.AccessController.doPrivileged(Native Method)
2020-08-23T02:43:55.0648729Z tests               | 	at javax.security.auth.Subject.doAs(Subject.java:422)
2020-08-23T02:43:55.0648875Z tests               | 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
2020-08-23T02:43:55.0649018Z tests               | 	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)
2020-08-23T02:43:55.0649142Z tests               | 
2020-08-23T02:43:55.0649272Z tests               | 	at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1511)
2020-08-23T02:43:55.0649415Z tests               | 	at org.apache.hadoop.ipc.Client.call(Client.java:1457)
2020-08-23T02:43:55.0649538Z tests               | 	at org.apache.hadoop.ipc.Client.call(Client.java:1367)
2020-08-23T02:43:55.0649680Z tests               | 	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
2020-08-23T02:43:55.0649827Z tests               | 	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
2020-08-23T02:43:55.0649967Z tests               | 	at com.sun.proxy.$Proxy295.addBlock(Unknown Source)
2020-08-23T02:43:55.0650123Z tests               | 	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:513)
2020-08-23T02:43:55.0650286Z tests               | 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
2020-08-23T02:43:55.0650497Z tests               | 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
2020-08-23T02:43:55.0650699Z tests               | 	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
2020-08-23T02:43:55.0650850Z tests               | 	at java.base/java.lang.reflect.Method.invoke(Method.java:566)
2020-08-23T02:43:55.0651005Z tests               | 	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
2020-08-23T02:43:55.0651164Z tests               | 	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
2020-08-23T02:43:55.0651322Z tests               | 	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
2020-08-23T02:43:55.0651466Z tests               | 	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
2020-08-23T02:43:55.0651626Z tests               | 	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
2020-08-23T02:43:55.0651767Z tests               | 	at com.sun.proxy.$Proxy296.addBlock(Unknown Source)
2020-08-23T02:43:55.0651908Z tests               | 	at org.apache.hadoop.hdfs.DFSOutputStream.addBlock(DFSOutputStream.java:1081)
2020-08-23T02:43:55.0652057Z tests               | 	at org.apache.hadoop.hdfs.DataStreamer.locateFollowingBlock(DataStreamer.java:1865)
2020-08-23T02:43:55.0652205Z tests               | 	at org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1668)
2020-08-23T02:43:55.0652351Z tests               | 	at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:716)

https://github.com/prestosql/presto/pull/4846/checks?check_run_id=1017285394

@ebyhr ebyhr added bug Something isn't working test labels Aug 23, 2020
@ebyhr ebyhr changed the title Flaky TestHiveMaterializedView.testWrite Flaky io.prestosql.tests.product.launcher.cli.TestRun Aug 23, 2020
@ebyhr ebyhr changed the title Flaky io.prestosql.tests.product.launcher.cli.TestRun Flaky TestHiveCreateTable.testCreateTable Aug 23, 2020
@findepi findepi changed the title Flaky TestHiveCreateTable.testCreateTable Flaky Hive product tests: Error committing write to Hive Oct 4, 2020
@findepi
Copy link
Member

findepi commented Oct 4, 2020

TestHivePartitionsTable.testShowPartitionsFromHiveTableWithTooManyPartitions failed in https://github.com/prestosql/presto/runs/1205417196

2020-10-04T13:17:25.2762652Z tests               | 2020-10-04 19:02:25 INFO: FAILURE     /    io.prestosql.tests.hive.TestHivePartitionsTable.testShowPartitionsFromHiveTableWithTooManyPartitions (Groups: hive_partitioning) took 4.1 seconds
2020-10-04T13:17:25.2814457Z tests               | 2020-10-04 19:02:25 SEVERE: Failure cause:
2020-10-04T13:17:25.2841985Z tests               | io.prestosql.tempto.query.QueryExecutionException: java.sql.SQLException: Query failed (#20201004_131721_00842_pt8xg): Error committing write to Hive
2020-10-04T13:17:25.2844345Z tests               | 	at io.prestosql.tempto.query.JdbcQueryExecutor.execute(JdbcQueryExecutor.java:114)
2020-10-04T13:17:25.2846441Z tests               | 	at io.prestosql.tempto.query.JdbcQueryExecutor.executeQuery(JdbcQueryExecutor.java:82)
2020-10-04T13:17:25.2848427Z tests               | 	at io.prestosql.tempto.query.QueryExecutor.query(QueryExecutor.java:57)
2020-10-04T13:17:25.2850467Z tests               | 	at io.prestosql.tests.hive.TestHivePartitionsTable.lambda$createPartitions$4(TestHivePartitionsTable.java:171)
2020-10-04T13:17:25.2852496Z tests               | 	at java.base/java.util.stream.Streams$RangeIntSpliterator.forEachRemaining(Streams.java:104)
2020-10-04T13:17:25.2853830Z tests               | 	at java.base/java.util.stream.IntPipeline$Head.forEach(IntPipeline.java:593)
2020-10-04T13:17:25.2855806Z tests               | 	at io.prestosql.tests.hive.TestHivePartitionsTable.createPartitions(TestHivePartitionsTable.java:168)
2020-10-04T13:17:25.2860649Z tests               | 	at io.prestosql.tests.hive.TestHivePartitionsTable.testShowPartitionsFromHiveTableWithTooManyPartitions(TestHivePartitionsTable.java:135)
2020-10-04T13:17:25.2864788Z tests               | 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
2020-10-04T13:17:25.2867051Z tests               | 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
2020-10-04T13:17:25.2869979Z tests               | 	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
2020-10-04T13:17:25.2871800Z tests               | 	at java.base/java.lang.reflect.Method.invoke(Method.java:566)
2020-10-04T13:17:25.2873552Z tests               | 	at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:104)
2020-10-04T13:17:25.2875345Z tests               | 	at org.testng.internal.Invoker.invokeMethod(Invoker.java:645)
2020-10-04T13:17:25.2876760Z tests               | 	at org.testng.internal.Invoker.invokeTestMethod(Invoker.java:851)
2020-10-04T13:17:25.2878192Z tests               | 	at org.testng.internal.Invoker.invokeTestMethods(Invoker.java:1177)
2020-10-04T13:17:25.2879962Z tests               | 	at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:129)
2020-10-04T13:17:25.2881716Z tests               | 	at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:112)
2020-10-04T13:17:25.2883415Z tests               | 	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
2020-10-04T13:17:25.2885438Z tests               | 	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
2020-10-04T13:17:25.2886677Z tests               | 	at java.base/java.lang.Thread.run(Thread.java:834)
2020-10-04T13:17:25.2887530Z tests               | Caused by: java.sql.SQLException: Query failed (#20201004_131721_00842_pt8xg): Error committing write to Hive
2020-10-04T13:17:25.2889358Z tests               | 	at io.prestosql.jdbc.AbstractPrestoResultSet.resultsException(AbstractPrestoResultSet.java:1761)
2020-10-04T13:17:25.2891644Z tests               | 	at io.prestosql.jdbc.PrestoResultSet$ResultsPageIterator.computeNext(PrestoResultSet.java:228)
2020-10-04T13:17:25.2893538Z tests               | 	at io.prestosql.jdbc.PrestoResultSet$ResultsPageIterator.computeNext(PrestoResultSet.java:188)
2020-10-04T13:17:25.2895798Z tests               | 	at io.prestosql.jdbc.$internal.guava.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:141)
2020-10-04T13:17:25.2898171Z tests               | 	at io.prestosql.jdbc.$internal.guava.collect.AbstractIterator.hasNext(AbstractIterator.java:136)
2020-10-04T13:17:25.2899986Z tests               | 	at java.base/java.util.Spliterators$IteratorSpliterator.tryAdvance(Spliterators.java:1811)
2020-10-04T13:17:25.2901951Z tests               | 	at java.base/java.util.stream.StreamSpliterators$WrappingSpliterator.lambda$initPartialTraversalState$0(StreamSpliterators.java:294)
2020-10-04T13:17:25.2904351Z tests               | 	at java.base/java.util.stream.StreamSpliterators$AbstractWrappingSpliterator.fillBuffer(StreamSpliterators.java:206)
2020-10-04T13:17:25.2906729Z tests               | 	at java.base/java.util.stream.StreamSpliterators$AbstractWrappingSpliterator.doAdvance(StreamSpliterators.java:161)
2020-10-04T13:17:25.2908940Z tests               | 	at java.base/java.util.stream.StreamSpliterators$WrappingSpliterator.tryAdvance(StreamSpliterators.java:300)
2020-10-04T13:17:25.2910514Z tests               | 	at java.base/java.util.Spliterators$1Adapter.hasNext(Spliterators.java:681)
2020-10-04T13:17:25.2911849Z tests               | 	at io.prestosql.jdbc.PrestoResultSet$AsyncIterator.lambda$new$0(PrestoResultSet.java:134)
2020-10-04T13:17:25.2993976Z tests               | 	at java.base/java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1736)
2020-10-04T13:17:25.2994977Z tests               | 	... 3 more
2020-10-04T13:17:25.2995653Z tests               | Caused by: io.prestosql.spi.PrestoException: Error committing write to Hive
2020-10-04T13:17:25.2997022Z tests               | 	at io.prestosql.plugin.hive.orc.OrcFileWriter.commit(OrcFileWriter.java:155)
2020-10-04T13:17:25.2998544Z tests               | 	at io.prestosql.plugin.hive.HiveWriter.commit(HiveWriter.java:86)
2020-10-04T13:17:25.3000214Z tests               | 	at io.prestosql.plugin.hive.HivePageSink.doFinish(HivePageSink.java:190)
2020-10-04T13:17:25.3002499Z tests               | 	at io.prestosql.plugin.hive.authentication.NoHdfsAuthentication.doAs(NoHdfsAuthentication.java:23)
2020-10-04T13:17:25.3004672Z tests               | 	at io.prestosql.plugin.hive.HdfsEnvironment.doAs(HdfsEnvironment.java:96)
2020-10-04T13:17:25.3006219Z tests               | 	at io.prestosql.plugin.hive.HivePageSink.finish(HivePageSink.java:181)
2020-10-04T13:17:25.3009043Z tests               | 	at io.prestosql.plugin.base.classloader.ClassLoaderSafeConnectorPageSink.finish(ClassLoaderSafeConnectorPageSink.java:77)
2020-10-04T13:17:25.3011991Z tests               | 	at io.prestosql.operator.TableWriterOperator.finish(TableWriterOperator.java:208)
2020-10-04T13:17:25.3013613Z tests               | 	at io.prestosql.operator.Driver.processInternal(Driver.java:397)
2020-10-04T13:17:25.3014873Z tests               | 	at io.prestosql.operator.Driver.lambda$processFor$8(Driver.java:283)
2020-10-04T13:17:25.3016054Z tests               | 	at io.prestosql.operator.Driver.tryWithLock(Driver.java:675)
2020-10-04T13:17:25.3017240Z tests               | 	at io.prestosql.operator.Driver.processFor(Driver.java:276)
2020-10-04T13:17:25.3018776Z tests               | 	at io.prestosql.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:1076)
2020-10-04T13:17:25.3021073Z tests               | 	at io.prestosql.execution.executor.PrioritizedSplitRunner.process(PrioritizedSplitRunner.java:163)
2020-10-04T13:17:25.3023252Z tests               | 	at io.prestosql.execution.executor.TaskExecutor$TaskRunner.run(TaskExecutor.java:484)
2020-10-04T13:17:25.3024393Z tests               | 	at io.prestosql.$gen.Presto_84cbc9e____20201004_131106_2.run(Unknown Source)
2020-10-04T13:17:25.3025680Z tests               | 	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
2020-10-04T13:17:25.3027471Z tests               | 	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
2020-10-04T13:17:25.3028619Z tests               | 	at java.base/java.lang.Thread.run(Thread.java:834)
2020-10-04T13:17:25.3031162Z tests               | Caused by: org.apache.hadoop.ipc.RemoteException: File /tmp/presto-hive/ddbad1e4-d67c-49bf-9b39-cd862cfe5cf2/part_col=3/20201004_131721_00842_pt8xg_a2bc03d2-4b6c-49a0-a481-1d61d15e62d5 could only be replicated to 0 nodes instead of minReplication (=1).  There are 1 datanode(s) running and no node(s) are excluded in this operation.
2020-10-04T13:17:25.3034568Z tests               | 	at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1719)
2020-10-04T13:17:25.3037791Z tests               | 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3508)
2020-10-04T13:17:25.3040562Z tests               | 	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:694)
2020-10-04T13:17:25.3044980Z tests               | 	at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.addBlock(AuthorizationProviderProxyClientProtocol.java:219)
2020-10-04T13:17:25.3050723Z tests               | 	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:507)
2020-10-04T13:17:25.3055612Z tests               | 	at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
2020-10-04T13:17:25.3058809Z tests               | 	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
2020-10-04T13:17:25.3060116Z tests               | 	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
2020-10-04T13:17:25.3060951Z tests               | 	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2281)
2020-10-04T13:17:25.3061918Z tests               | 	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2277)
2020-10-04T13:17:25.3063033Z tests               | 	at java.security.AccessController.doPrivileged(Native Method)
2020-10-04T13:17:25.3064328Z tests               | 	at javax.security.auth.Subject.doAs(Subject.java:422)
2020-10-04T13:17:25.3065937Z tests               | 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1924)
2020-10-04T13:17:25.3067545Z tests               | 	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2275)
2020-10-04T13:17:25.3068170Z tests               | 
2020-10-04T13:17:25.3069066Z tests               | 	at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1511)
2020-10-04T13:17:25.3070300Z tests               | 	at org.apache.hadoop.ipc.Client.call(Client.java:1457)
2020-10-04T13:17:25.3071324Z tests               | 	at org.apache.hadoop.ipc.Client.call(Client.java:1367)
2020-10-04T13:17:25.3072679Z tests               | 	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
2020-10-04T13:17:25.3095596Z tests               | 	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
2020-10-04T13:17:25.3096863Z tests               | 	at com.sun.proxy.$Proxy241.addBlock(Unknown Source)
2020-10-04T13:17:25.3099220Z tests               | 	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:513)
2020-10-04T13:17:25.3102226Z tests               | 	at jdk.internal.reflect.GeneratedMethodAccessor411.invoke(Unknown Source)
2020-10-04T13:17:25.3104494Z tests               | 	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
2020-10-04T13:17:25.3106385Z tests               | 	at java.base/java.lang.reflect.Method.invoke(Method.java:566)
2020-10-04T13:17:25.3108210Z tests               | 	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
2020-10-04T13:17:25.3110631Z tests               | 	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
2020-10-04T13:17:25.3112734Z tests               | 	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
2020-10-04T13:17:25.3114910Z tests               | 	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
2020-10-04T13:17:25.3117136Z tests               | 	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
2020-10-04T13:17:25.3118629Z tests               | 	at com.sun.proxy.$Proxy242.addBlock(Unknown Source)
2020-10-04T13:17:25.3119792Z tests               | 	at org.apache.hadoop.hdfs.DFSOutputStream.addBlock(DFSOutputStream.java:1081)
2020-10-04T13:17:25.3121653Z tests               | 	at org.apache.hadoop.hdfs.DataStreamer.locateFollowingBlock(DataStreamer.java:1865)
2020-10-04T13:17:25.3123766Z tests               | 	at org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1668)
2020-10-04T13:17:25.3125482Z tests               | 	at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:716)
2020-10-04T13:17:25.3126222Z tests               | 

@findepi
Copy link
Member

findepi commented Oct 9, 2020

@losipiuk
Copy link
Member

#5747

@findepi
Copy link
Member

findepi commented Nov 4, 2020

@findepi
Copy link
Member

findepi commented Nov 4, 2020

@findepi
Copy link
Member

findepi commented Nov 4, 2020

I marked two example failures with @Flaky -- #5811 and #5829.
however, since the nature of the problem seems only so much related to actual test, we may need a more generic approach... unless, of course, we learn how to fix the problem.

@findepi
Copy link
Member

findepi commented Jan 13, 2021

@MiguelWeezardo would it be possible to add a retry to onHive() so that these are automatically retried without need for @Flaky on sooo many tests?
(many are still not covered yet)

@kokosing
Copy link
Member

it be possible to add a retry to onHive()

That should not be enabled by default. There are plenty of queries that should not be retried or they are simply not retryable.

How about.

onHive()
  .retryWhen(ex -> ex.getMessages().contains("Hive is flaky again...")
  .execute("select 1 from \"$dummy\"");

?

Another option is wrapping query executors like:


    /**
     * This method could be only used for state-less JDBC connections - i.e. Presto.
     * Otherwise connection will be surely broken.
     */
    public static QueryExecutor retryOnNetworkIssues(QueryExecutor queryExecutor)
    {
        RetryPolicy<QueryResult> queryRetryPolicy = new RetryPolicy<QueryResult>()
                .withBackoff(1, 60, ChronoUnit.SECONDS)
                .withMaxRetries(10)
                .handleIf(QueryExecutors::isNetworkIssue)
                .onRetry(event -> log.warn("Query failed on attempt %s, will retry. Exception: %s", event.getAttemptCount(), event.getLastFailure().getMessage()));

        return new QueryExecutor()
        {
            @Override
            public QueryResult executeQuery(String sql, QueryParam... params)
                    throws QueryExecutionException
            {
                return Failsafe.with(queryRetryPolicy)
                        .get(() -> queryExecutor.executeQuery(sql, params));
            }

            @Override
            public Connection getConnection()
            {
                return queryExecutor.getConnection();
            }

            @Override
            public void close()
            {
                queryExecutor.close();
            }
        };
    }

    private static boolean isNetworkIssue(Throwable throwable)
    {
        return Throwables.getCausalChain(throwable).stream()
                .anyMatch(exception -> exception instanceof SocketException
                        || exception instanceof SocketTimeoutException);
    }

@findepi
Copy link
Member

findepi commented Jan 14, 2021

That should not be enabled by default. There are plenty of queries that should not be retried or they are simply not retryable.

@kokosing i meant to only retry those that fail with

static final String ERROR_COMMITTING_WRITE_TO_HIVE_MATCH = "Error committing write to Hive(?s:.*)" +
// "could only be written to 0 of the 1 minReplication" is the error wording used by e.g. HDP 3
"(could only be replicated to 0 nodes instead of minReplication|could only be written to 0 of the 1 minReplication)";

plus, we can match on the query being done, eg it's not safe to retry INSERT statements

How about.

onHive()
  .retryWhen(ex -> ex.getMessages().contains("Hive is flaky again...")
  .execute("select 1 from \"$dummy\"");

if i were to change each test's code, i'd just continue with the current approach

@Flaky(issue = ERROR_COMMITTING_WRITE_TO_HIVE_ISSUE, match = ERROR_COMMITTING_WRITE_TO_HIVE_MATCH)

@MiguelWeezardo
Copy link
Member

Should be fixed by #6601

@findepi
Copy link
Member

findepi commented Jan 15, 2021

Hopefully solved by #6601 thanks @MiguelWeezardo

if that turns out not to be sufficient in some cases, we may need a more focused approach in those cases, so let me close the issue

@findepi findepi closed this as completed Jan 15, 2021
@ksobolew
Copy link
Contributor

Well, I saw what looks like this in one of my PRs:

https://github.com/trinodb/trino/runs/1911881714?check_suite_focus=true#step:6:12515

@MiguelWeezardo
Copy link
Member

MiguelWeezardo commented Feb 18, 2021

Well, I saw what looks like this in one of my PRs:

https://github.com/trinodb/trino/runs/1911881714?check_suite_focus=true#step:6:12515

The Failsafe code is on the failure stack, but it looks like no retries were attempted.
This might mean the regular expression failed to match the exception message.

@kokosing
Copy link
Member

kokosing commented Feb 18, 2021

Is there any issues with message regex?

@kokosing kokosing reopened this Feb 18, 2021
@MiguelWeezardo
Copy link
Member

MiguelWeezardo commented Feb 18, 2021

Is there any issues with message regex?

The regex is multiline, and focuses on messages from two different exceptions in the causal chain, which may be why it didn't match.
HiveProductTest#isErrorCommittingToHive matches it against the getMessage() string of each exception in the causal chain. I assumed the getMessage() includes the messages of it's causes just as we see it in the logs, but this was probably wrong.
I guess this code should generate the entire recursive stacktrace as a string and match against that.

@MiguelWeezardo
Copy link
Member

Throwables#getStackTraceAsString might be useful here

@findepi
Copy link
Member

findepi commented Feb 18, 2021

The regex is multiline, and focuses on messages from two different exceptions in the causal chain, ...

That's why it has (?s:.*) and not just .*.

@MiguelWeezardo
Copy link
Member

The regex is multiline, and focuses on messages from two different exceptions in the causal chain, ...

That's why it has (?s:.*) and not just .*.

I thought it's a simple non-capturing group, and failed to notice the s flag until now.

@MiguelWeezardo
Copy link
Member

I copied the matching from FlakyTestRetryAnalizer in #6944

@losipiuk
Copy link
Member

Happens also to Iceberg+Spark tests when trying to write to HDFS:
https://github.com/trinodb/trino/runs/5280873825?check_suite_focus=true

@findepi
Copy link
Member

findepi commented Jan 27, 2023

We have retries in place, so no more work planned to be done here, I am closing the issue

Please reopen if you see this again. Be sure to provide failure link AND stacktraces. Github actions logs are not kept for ever.

@findepi findepi closed this as completed Jan 27, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working test
Development

No branches or pull requests

6 participants