Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CI][C++] Fix arrow-s3fs-test timeouts on macOS C++ job #40410

Open
amoeba opened this issue Mar 7, 2024 · 18 comments
Open

[CI][C++] Fix arrow-s3fs-test timeouts on macOS C++ job #40410

amoeba opened this issue Mar 7, 2024 · 18 comments

Comments

@amoeba
Copy link
Member

amoeba commented Mar 7, 2024

Describe the bug, including details regarding any error messages, version, and platform.

The arrow-s3fs-test has been failing due to what appears to be a timeout on the "AMD64 macOS 12 C++" job for a while.

A recent example is: https://github.com/apache/arrow/actions/runs/8179809257/job/22366589254?pr=40373#step:13:265 and the relevant output is:

2024-03-06T23:05:01.0379130Z 97/97 Test #73: arrow-s3fs-test ..............................***Timeout 300.04 sec
2024-03-06T23:05:01.0381760Z Running arrow-s3fs-test, redirecting output into /Users/runner/work/arrow/arrow/build/cpp/build/test-logs/arrow-s3fs-test.txt (attempt 1/1)
2024-03-06T23:05:01.0383120Z 
2024-03-06T23:05:01.0383360Z       Start 73: arrow-s3fs-test
2024-03-06T23:10:01.0805520Z     Test #73: arrow-s3fs-test ..............................***Timeout 300.04 sec
2024-03-06T23:10:01.0808220Z Running arrow-s3fs-test, redirecting output into /Users/runner/work/arrow/arrow/build/cpp/build/test-logs/arrow-s3fs-test.txt (attempt 1/1)
2024-03-06T23:10:01.0810650Z 
2024-03-06T23:10:01.0810890Z       Start 73: arrow-s3fs-test
2024-03-06T23:15:01.1167650Z     Test #73: arrow-s3fs-test ..............................***Timeout 300.04 sec
2024-03-06T23:15:01.1171300Z Running arrow-s3fs-test, redirecting output into /Users/runner/work/arrow/arrow/build/cpp/build/test-logs/arrow-s3fs-test.txt (attempt 1/1)
2024-03-06T23:15:01.1172580Z 
2024-03-06T23:15:01.1175960Z 
2024-03-06T23:15:01.1176500Z 99% tests passed, 1 tests failed out of 97
2024-03-06T23:15:01.1176970Z 
2024-03-06T23:15:01.1177690Z Label Time Summary:
2024-03-06T23:15:01.1178690Z arrow-compute-tests    =  89.33 sec*proc (13 tests)
2024-03-06T23:15:01.1179450Z arrow-tests            = 449.67 sec*proc (41 tests)
2024-03-06T23:15:01.1183940Z arrow_acero            = 148.43 sec*proc (13 tests)
2024-03-06T23:15:01.1184610Z arrow_dataset          = 101.76 sec*proc (14 tests)
2024-03-06T23:15:01.1185230Z Errors while running CTest
2024-03-06T23:15:01.1185750Z arrow_flight           =  14.60 sec*proc (2 tests)
2024-03-06T23:15:01.1186400Z arrow_substrait        =   1.58 sec*proc (1 test)
2024-03-06T23:15:01.1187050Z filesystem             = 334.44 sec*proc (5 tests)
2024-03-06T23:15:01.1187830Z gandiva-tests          =  28.20 sec*proc (4 tests)
2024-03-06T23:15:01.1188560Z parquet-tests          =  59.08 sec*proc (9 tests)
2024-03-06T23:15:01.1189210Z unittest               = 892.66 sec*proc (97 tests)
2024-03-06T23:15:01.1189650Z 
2024-03-06T23:15:01.1189820Z Total Test time (real) = 1008.26 sec
2024-03-06T23:15:01.1190170Z 
2024-03-06T23:15:01.1190330Z The following tests FAILED:
2024-03-06T23:15:01.1190830Z 	 73 - arrow-s3fs-test (Timeout)

gist for posterity.

I expect this causes a lot of noise for maintainers so figuring out why this test is timing out would be good.

Component(s)

C++, Continuous Integration

@amoeba
Copy link
Member Author

amoeba commented Mar 7, 2024

When running this test locally on my non-AMD64 Mac, I actually get three failures:

[ RUN      ] TestS3FS.CreateDir
/Users/bryce/src/apache/arrow/cpp/src/arrow/filesystem/s3fs_test.cc:934: Failure
Failed
Expected 'fs_->CreateDir("bucket/somefile")' to fail with IOError, but got OK
[  FAILED  ] TestS3FS.CreateDir (37 ms)
[ RUN      ] TestS3FSGeneric.CopyFile
Warning: The standard parity is set to 0. This can lead to data loss.
/Users/bryce/src/apache/arrow/cpp/src/arrow/filesystem/test_util.cc:570: Failure
Failed
Expected 'fs->CopyFile("AB/abc", "def/mno")' to fail with IOError, but got OK
[  FAILED  ] TestS3FSGeneric.CopyFile (66 ms)
[ RUN      ] TestS3FSGeneric.MoveFile
Warning: The standard parity is set to 0. This can lead to data loss.
/Users/bryce/src/apache/arrow/cpp/src/arrow/filesystem/test_util.cc:448: Failure
Failed
Expected 'fs->Move("AB/pqr", "xxx/mno")' to fail with IOError, but got OK
[  FAILED  ] TestS3FSGeneric.MoveFile (81 ms)

I see in this job we run ctest with,

ctest --label-regex unittest --output-on-failure --parallel 4 --repeat until-pass:3 --timeout 300

My hunch is that it's possible the test is failing and ctest is retrying. My local machine is fast enough that it can get 3 repeats in before the timeout but it's possible GitHub's runner is slow enough that the timeout happens first.

Another thing I noticed is that, when ctest automatically re-attempts the test as requested, it still prints "attempt 1/1" each time:

Running arrow-s3fs-test, redirecting output into /Users/bryce/src/apache/arrow/cpp/build/build/test-logs/arrow-s3fs-test.txt (attempt 1/1)

@amoeba
Copy link
Member Author

amoeba commented Mar 7, 2024

I reverted my minio and mc binaries to the ones we pin for the tests and the above failures go away so I don't think that's what's going on. Though the above errors may be something to be aware of we ever need to upgrade minio/mc.

@felipecrv
Copy link
Contributor

Relevant link -> #34671

@amoeba
Copy link
Member Author

amoeba commented Mar 8, 2024

I haven't been able to reproduce locally (without modifying anything) but I have been able to reproduce once on CI with a stripped-down workflow in this run https://github.com/amoeba/arrow/actions/runs/8194916281/job/22416431654 so I'm going to try debugging that with tmate.

@pitrou
Copy link
Member

pitrou commented Mar 10, 2024

Good luck with that, my attempts at reproducing while logging in using tmate actually failed here:
#40401

kou added a commit that referenced this issue Mar 11, 2024
### Rationale for this change

We can use GitHub hosted M1 macOS runner.

### What changes are included in this PR?

* Add a job on macos-14
* Update expected L2 CPU cache range to 32KiB-12MiB from 32KiB-8MiB because M1 macOS runner has 12MiB
* Disable arrow-s3fs-test for now. It'll be fixed by GH-40410
 
### Are these changes tested?

Yes.

### Are there any user-facing changes?

No.
* GitHub Issue: #40082

Authored-by: Sutou Kouhei <kou@clear-code.com>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>
@amoeba
Copy link
Member Author

amoeba commented Mar 14, 2024

I'm having success reproducing the timeout by SSH'ing into GHA so this has now turned into me figuring out how to get useful information out of the hanging test with dtrace. A did a bit of searching to find out how to get thread stack traces from a running process and ended up trying sudo dtrace -p $PID -n 'profile-997 /pid == '$PID'/ { @[ustack(100)] = count();}' -o somefile which I don'k think is close but not quite what I'm after. It behaves like it captures just a single stack trace so I'm going to keep experimenting.

But so far, it produces this more often than not, which is interesting,

libsystem_kernel.dylib`__pipe+0xa
libcurl.4.dylib`curl_easy_perform+0x77
libaws-cpp-sdk-core.dylib`Aws::Http::CurlHttpClient::MakeRequest(std::__1::shared_ptr<Aws::Http::HttpRequest> const&, Aws::Utils::RateLimits::RateLimiterInterface*, Aws::Utils::RateLimits::RateLimiterInterface*) const+0xda6
libaws-cpp-sdk-core.dylib`std::__1::__function::__func<Aws::Client::AWSClient::AttemptOneRequest(std::__1::shared_ptr<Aws::Http::HttpRequest> const&, Aws::AmazonWebServiceRequest const&, char const*, char const*, char const*) const::$_24, std::__1::allocator<Aws::Client::AWSClient::AttemptOneRequest(std::__1::shared_ptr<Aws::Http::HttpRequest> const&, Aws::AmazonWebServiceRequest const&, char const*, char const*, char const*) const::$_24>, std::__1::shared_ptr<Aws::Http::HttpResponse> ()>::operator()()+0x26
libaws-cpp-sdk-core.dylib`std::__1::shared_ptr<Aws::Http::HttpResponse> smithy::components::tracing::TracingUtils::MakeCallWithTiming<std::__1::shared_ptr<Aws::Http::HttpResponse> >(std::__1::function<std::__1::shared_ptr<Aws::Http::HttpResponse> ()>, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, smithy::components::tracing::Meter const&, std::__1::map<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::less<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, std::__1::allocator<std::__1::pair<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > >&&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&)+0x48
libaws-cpp-sdk-core.dylib`Aws::Client::AWSClient::AttemptOneRequest(std::__1::shared_ptr<Aws::Http::HttpRequest> const&, Aws::AmazonWebServiceRequest const&, char const*, char const*, char const*) const+0x779
libaws-cpp-sdk-core.dylib`Aws::Client::AWSClient::AttemptExhaustively(Aws::Http::URI const&, Aws::AmazonWebServiceRequest const&, Aws::Http::HttpMethod, char const*, char const*, char const*) const+0x43b
libaws-cpp-sdk-core.dylib`Aws::Client::AWSXMLClient::MakeRequest(Aws::Http::URI const&, Aws::AmazonWebServiceRequest const&, Aws::Http::HttpMethod, char const*, char const*, char const*) const+0x42
libarrow.1600.0.0.dylib`arrow::fs::(anonymous namespace)::S3Client::GetBucketRegion(Aws::S3::Model::HeadBucketRequest const&)+0xf2
libarrow.1600.0.0.dylib`arrow::fs::(anonymous namespace)::S3Client::GetBucketRegion(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&)+0xb8
libarrow.1600.0.0.dylib`arrow::fs::(anonymous namespace)::RegionResolver::ResolveRegionUncached(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&)+0xda
libarrow.1600.0.0.dylib`arrow::fs::(anonymous namespace)::RegionResolver::ResolveRegion(d::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&)+0x12c
libarrow.1600.0.0.dylib`arrow::fs::ResolveS3BucketRegion(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&)+0x265a
libarrow_gtestd.1.11.0.dylib`void testing::internal::HandleSehExceptionsInMethodIfSuppord<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*)+0x
libarrow_gtestd.1.11.0.dylib`void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*)+0x6a
libarrow_gtestd.1.11.0.dylib`testing::TestInfo::Run()+0xlibarrow_gtestd.1.11.0.dylib`testing::TestSuite::Run()+0x104

@pitrou
Copy link
Member

pitrou commented Mar 14, 2024

More easily perhaps, you can use gdb -p to take control of a running process and debug it.
lldb has something similar.

@amoeba
Copy link
Member Author

amoeba commented Mar 14, 2024

Thanks. A bt all gives something similar. In an attempt to catch when the test was approaching timeout, I got a baseline for how long the test takes to complete normally which was pretty consistently ~25s. I ran the test by itself until it took longer and waited until ~60s and attached using lldb -p and then ran bt all:

* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
  * frame #0: 0x00007ff8179a40aa libsystem_kernel.dylib`__ulock_wait + 10
    frame #1: 0x00007ff8179e0c95 libsystem_pthread.dylib`_pthread_join + 358
    frame #2: 0x00007ff82c299cda libcurl.4.dylib`Curl_thread_join + 23
    frame #3: 0x00007ff82c28c45c libcurl.4.dylib`Curl_resolver_kill + 41
    frame #4: 0x00007ff82c2c21a0 libcurl.4.dylib`multi_done + 75
    frame #5: 0x00007ff82c2c5876 libcurl.4.dylib`multi_handle_timeout + 291
    frame #6: 0x00007ff82c2c4254 libcurl.4.dylib`multi_runsingle + 3542
    frame #7: 0x00007ff82c2c3391 libcurl.4.dylib`curl_multi_perform + 212
    frame #8: 0x00007ff82c29e7eb libcurl.4.dylib`curl_easy_perform + 299
    frame #9: 0x0000000102139df2 libaws-cpp-sdk-core.dylib`Aws::Http::CurlHttpClient::MakeRequest(std::__1::shared_ptr<Aws::Http::HttpRequest> const&, Aws::Utils::RateLimits::RateLimiterInterface*, Aws::Utils::RateLimits::RateLimiterInterface*) const + 3494
    frame #10: 0x000000010212b61c libaws-cpp-sdk-core.dylib`std::__1::__function::__func<Aws::Client::AWSClient::AttemptOneRequest(std::__1::shared_ptr<Aws::Http::HttpRequest> const&, Aws::AmazonWebServiceRequest const&, char const*, char const*, char const*) const::$_24, std::__1::allocator<Aws::Client::AWSClient::AttemptOneRequest(std::__1::shared_ptr<Aws::Http::HttpRequest> const&, Aws::AmazonWebServiceRequest const&, char const*, char const*, char const*) const::$_24>, std::__1::shared_ptr<Aws::Http::HttpResponse> ()>::operator()() + 38
    frame #11: 0x0000000102116e78 libaws-cpp-sdk-core.dylib`std::__1::shared_ptr<Aws::Http::HttpResponse> smithy::components::tracing::TracingUtils::MakeCallWithTiming<std::__1::shared_ptr<Aws::Http::HttpResponse> >(std::__1::function<std::__1::shared_ptr<Aws::Http::HttpResponse> ()>, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, smithy::components::tracing::Meter const&, std::__1::map<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::less<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, std::__1::allocator<std::__1::pair<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > >&&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) + 72
    frame #12: 0x0000000102113883 libaws-cpp-sdk-core.dylib`Aws::Client::AWSClient::AttemptOneRequest(std::__1::shared_ptr<Aws::Http::HttpRequest> const&, Aws::AmazonWebServiceRequest const&, char const*, char const*, char const*) const + 1913
    frame #13: 0x00000001021116af libaws-cpp-sdk-core.dylib`Aws::Client::AWSClient::AttemptExhaustively(Aws::Http::URI const&, Aws::AmazonWebServiceRequest const&, Aws::Http::HttpMethod, char const*, char const*, char const*) const + 1083
    frame #14: 0x000000010211c612 libaws-cpp-sdk-core.dylib`Aws::Client::AWSXMLClient::MakeRequest(Aws::Http::URI const&, Aws::AmazonWebServiceRequest const&, Aws::Http::HttpMethod, char const*, char const*, char const*) const + 66
    frame #15: 0x0000000120bbb4d2 libarrow.1600.0.0.dylib`arrow::fs::(anonymous namespace)::S3Client::GetBucketRegion(Aws::S3::Model::HeadBucketRequest const&) + 242
    frame #16: 0x0000000120bbb368 libarrow.1600.0.0.dylib`arrow::fs::(anonymous namespace)::S3Client::GetBucketRegion(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) + 184
    frame #17: 0x0000000120bbac9a libarrow.1600.0.0.dylib`arrow::fs::(anonymous namespace)::RegionResolver::ResolveRegionUncached(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) + 218
    frame #18: 0x0000000120b56d3c libarrow.1600.0.0.dylib`arrow::fs::(anonymous namespace)::RegionResolver::ResolveRegion(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) + 300
    frame #19: 0x0000000120b49ab5 libarrow.1600.0.0.dylib`arrow::fs::ResolveS3BucketRegion(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) + 613
    frame #20: 0x0000000100c7cfd4 arrow-s3fs-test`arrow::fs::S3RegionResolutionTest_NonExistentBucket_Test::TestBody() + 68
    frame #21: 0x00000001026db47b libarrow_gtestd.1.11.0.dylib`void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) + 123
    frame #22: 0x000000010269dcba libarrow_gtestd.1.11.0.dylib`void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) + 106
    frame #23: 0x000000010269dc03 libarrow_gtestd.1.11.0.dylib`testing::Test::Run() + 195
    frame #24: 0x000000010269ed81 libarrow_gtestd.1.11.0.dylib`testing::TestInfo::Run() + 241
    frame #25: 0x000000010269fe24 libarrow_gtestd.1.11.0.dylib`testing::TestSuite::Run() + 260
    frame #26: 0x00000001026af16b libarrow_gtestd.1.11.0.dylib`testing::internal::UnitTestImpl::RunAllTests() + 987
    frame #27: 0x00000001026e04bb libarrow_gtestd.1.11.0.dylib`bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) + 123
    frame #28: 0x00000001026aeb2a libarrow_gtestd.1.11.0.dylib`bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) + 106
    frame #29: 0x00000001026aea15 libarrow_gtestd.1.11.0.dylib`testing::UnitTest::Run() + 197
    frame #30: 0x00000001011e0f21 libarrow_gtest_maind.1.11.0.dylib`RUN_ALL_TESTS() + 17
    frame #31: 0x00000001011e0efd libarrow_gtest_maind.1.11.0.dylib`main + 61
    frame #32: 0x000000010afe352e dyld`start + 462
  thread #2
    frame #0: 0x00007ff8179a401a libsystem_kernel.dylib`__workq_kernreturn + 10
    frame #1: 0x00007ff8179dc034 libsystem_pthread.dylib`_pthread_wqthread + 426
    frame #2: 0x00007ff8179daf57 libsystem_pthread.dylib`start_wqthread + 15
  thread #3, name = 'AwsEventLoop 1'
    frame #0: 0x00007ff8179a730e libsystem_kernel.dylib`kevent + 10
    frame #1: 0x0000000101fe99ec libaws-c-io.1.0.0.dylib`aws_event_loop_thread + 404
    frame #2: 0x00000001024ffd3d libaws-c-common.1.0.0.dylib`thread_fn + 345
    frame #3: 0x00007ff8179df4e1 libsystem_pthread.dylib`_pthread_start + 125
    frame #4: 0x00007ff8179daf6b libsystem_pthread.dylib`thread_start + 15
  thread #4
    frame #0: 0x00007ff8179a730e libsystem_kernel.dylib`kevent + 10
    frame #1: 0x00007ff817a052ee libsystem_info.dylib`_mdns_search_ex + 1217
    frame #2: 0x00007ff817a07c1c libsystem_info.dylib`mdns_addrinfo + 483
    frame #3: 0x00007ff817a079e4 libsystem_info.dylib`search_addrinfo + 164
    frame #4: 0x00007ff817a00659 libsystem_info.dylib`si_addrinfo + 1811
    frame #5: 0x00007ff8179ffeac libsystem_info.dylib`getaddrinfo + 176
    frame #6: 0x00007ff82c296186 libcurl.4.dylib`Curl_getaddrinfo_ex + 36
    frame #7: 0x00007ff82c28ca24 libcurl.4.dylib`getaddrinfo_thread + 83
    frame #8: 0x00007ff82c299c8d libcurl.4.dylib`curl_thread_create_thunk + 29
    frame #9: 0x00007ff8179df4e1 libsystem_pthread.dylib`_pthread_start + 125
    frame #10: 0x00007ff8179daf6b libsystem_pthread.dylib`thread_start + 15

@pitrou
Copy link
Member

pitrou commented Mar 15, 2024

Thanks a lot @amoeba . It appears therefore that it is timing out when trying to join a thread using Curl_thread_join. This is quite unexpected, though there seems to be a similar issue reported on the Internet:
nim-lang/Nim#15019 (comment)

Edit: at second sight, the sentence above is incorrect. There is a thread pending on Curl_getaddrinfo_ex, which means that the DNS request is still ongoing. Presumably DNS requests on macOS can take a lot of time to time out?

@pitrou
Copy link
Member

pitrou commented Mar 15, 2024

It would be nice if you could try again to catch another backtrace, so that we can see if it's always the same test timing out. If it is, we could just disable that test on macOS...

@pitrou
Copy link
Member

pitrou commented Mar 15, 2024

Even better: can you try getting a traceback later? It seems DNS resolution timeout on macOS is 30 seconds.

@pitrou
Copy link
Member

pitrou commented Sep 12, 2024

I've tried reenabling it but it still times out

@pitrou
Copy link
Member

pitrou commented Sep 18, 2024

@amoeba Could you make progress on the diagnosis here? Otherwise, perhaps we can simply skip the NonExistentBucket test on macOS, which would hopefully allow re-enabling the other S3 tests on CI.

@amoeba
Copy link
Member Author

amoeba commented Sep 18, 2024

Yeah, I can pick this back up this week. Sorry I dropped it!

@amoeba
Copy link
Member Author

amoeba commented Sep 23, 2024

I spent some time Saturday trying to reproduce the timeout like I'd done before and couldn't until I was about to head out so I'll keep trying this week.

@amoeba
Copy link
Member Author

amoeba commented Sep 23, 2024

I think I was able to reproduce the hang this morning and I collected 6 backtraces spread out of 1-2min that were all essentially identical:

bt all output
(lldb) bt all
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
  * frame #0: 0x000000019213bee8 libsystem_kernel.dylib`__select + 8
    frame #1: 0x00000001a9c3cc64 libcurl.4.dylib`Curl_poll + 516
    frame #2: 0x00000001a9c3501c libcurl.4.dylib`multi_wait + 640
    frame #3: 0x00000001a9c0eda8 libcurl.4.dylib`curl_easy_perform + 268
    frame #4: 0x0000000105b7f720 libaws-cpp-sdk-core.dylib`Aws::Http::CurlHttpClient::MakeRequest(std::__1::shared_ptr<Aws::Http::HttpRequest> const&, Aws::Utils::RateLimits::RateLimiterInterface*, Aws::Utils::RateLimits::RateLimiterInterface*) const + 3524
    frame #5: 0x0000000105b425a8 libaws-cpp-sdk-core.dylib`std::__1::shared_ptr<Aws::Http::HttpResponse> smithy::components::tracing::TracingUtils::MakeCallWithTiming<std::__1::shared_ptr<Aws::Http::HttpResponse>>(std::__1::function<std::__1::shared_ptr<Aws::Http::HttpResponse> ()>, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, smithy::components::tracing::Meter const&, std::__1::map<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::less<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>, std::__1::allocator<std::__1::pair<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>>>&&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&) + 84
    frame #6: 0x0000000105b4e24c libaws-cpp-sdk-core.dylib`Aws::Client::AWSClient::AttemptOneRequest(std::__1::shared_ptr<Aws::Http::HttpRequest> const&, Aws::AmazonWebServiceRequest const&, char const*, char const*, char const*) const + 1772
    frame #7: 0x0000000105b4c3e8 libaws-cpp-sdk-core.dylib`Aws::Client::AWSClient::AttemptExhaustively(Aws::Http::URI const&, Aws::AmazonWebServiceRequest const&, Aws::Http::HttpMethod, char const*, char const*, char const*) const + 948
    frame #8: 0x0000000105b55a0c libaws-cpp-sdk-core.dylib`Aws::Client::AWSXMLClient::MakeRequest(Aws::Http::URI const&, Aws::AmazonWebServiceRequest const&, Aws::Http::HttpMethod, char const*, char const*, char const*) const + 64
    frame #9: 0x0000000105eec284 libaws-cpp-sdk-s3.dylib`std::__1::__function::__func<Aws::S3::S3Client::PutObject(Aws::S3::Model::PutObjectRequest const&) const::$_105, std::__1::allocator<Aws::S3::S3Client::PutObject(Aws::S3::Model::PutObjectRequest const&) const::$_105>, Aws::Utils::Outcome<Aws::S3::Model::PutObjectResult, Aws::S3::S3Error> ()>::operator()() + 1624
    frame #10: 0x0000000105e94050 libaws-cpp-sdk-s3.dylib`Aws::Utils::Outcome<Aws::S3::Model::PutObjectResult, Aws::S3::S3Error> smithy::components::tracing::TracingUtils::MakeCallWithTiming<Aws::Utils::Outcome<Aws::S3::Model::PutObjectResult, Aws::S3::S3Error>>(std::__1::function<Aws::Utils::Outcome<Aws::S3::Model::PutObjectResult, Aws::S3::S3Error> ()>, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, smithy::components::tracing::Meter const&, std::__1::map<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::less<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>, std::__1::allocator<std::__1::pair<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>>>&&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&) + 84
    frame #11: 0x0000000105e936c0 libaws-cpp-sdk-s3.dylib`Aws::S3::S3Client::PutObject(Aws::S3::Model::PutObjectRequest const&) const + 1512
    frame #12: 0x0000000110ef0060 libarrow.1800.dylib`arrow::fs::S3FileSystem::Impl::CreateEmptyDir(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, std::__1::basic_string_view<char, std::__1::char_traits<char>>) + 524
    frame #13: 0x0000000110eeecf4 libarrow.1800.dylib`arrow::fs::S3FileSystem::CreateDir(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, bool) + 1820
    frame #14: 0x00000001046e0a30 arrow-s3fs-test`arrow::fs::FileSystem::CreateDir(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&) + 56
    frame #15: 0x00000001046e349c arrow-s3fs-test`arrow::fs::TestS3FS_CreateDir_Test::TestBody() + 4432
    frame #16: 0x00000001060dfad8 libarrow_gtestd.1.11.0.dylib`void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) + 132
    frame #17: 0x000000010609d8b8 libarrow_gtestd.1.11.0.dylib`void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) + 96
    frame #18: 0x000000010609d808 libarrow_gtestd.1.11.0.dylib`testing::Test::Run() + 192
    frame #19: 0x000000010609e8ac libarrow_gtestd.1.11.0.dylib`testing::TestInfo::Run() + 244
    frame #20: 0x000000010609f99c libarrow_gtestd.1.11.0.dylib`testing::TestSuite::Run() + 276
    frame #21: 0x00000001060ad70c libarrow_gtestd.1.11.0.dylib`testing::internal::UnitTestImpl::RunAllTests() + 1008
    frame #22: 0x00000001060e6d6c libarrow_gtestd.1.11.0.dylib`bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) + 132
    frame #23: 0x00000001060ad0e0 libarrow_gtestd.1.11.0.dylib`bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) + 96
    frame #24: 0x00000001060acfcc libarrow_gtestd.1.11.0.dylib`testing::UnitTest::Run() + 216
    frame #25: 0x0000000104dabef0 libarrow_gtest_maind.1.11.0.dylib`RUN_ALL_TESTS() + 16
    frame #26: 0x0000000104dabed4 libarrow_gtest_maind.1.11.0.dylib`main + 76
    frame #27: 0x0000000191de7154 dyld`start + 2476
  thread #2
    frame #0: 0x0000000192132aa4 libsystem_kernel.dylib`__workq_kernreturn + 8
  thread #3, name = 'AwsEventLoop 1'
    frame #0: 0x0000000192136f40 libsystem_kernel.dylib`kevent + 8
    frame #1: 0x0000000105cea0d4 libaws-c-io.1.0.0.dylib`aws_event_loop_thread + 412
    frame #2: 0x0000000105d4cd18 libaws-c-common.1.dylib`thread_fn + 340
    frame #3: 0x0000000192171f94 libsystem_pthread.dylib`_pthread_start + 136
  thread #4
    frame #0: 0x00000001921345ec libsystem_kernel.dylib`__psynch_cvwait + 8
    frame #1: 0x000000019217255c libsystem_pthread.dylib`_pthread_cond_wait + 1228
    frame #2: 0x0000000192097b14 libc++.1.dylib`std::__1::condition_variable::wait(std::__1::unique_lock<std::__1::mutex>&) + 28
    frame #3: 0x000000011135129c libarrow.1800.dylib`arrow::internal::WorkerLoop(std::__1::shared_ptr<arrow::internal::ThreadPool::State>, std::__1::__list_iterator<std::__1::thread, void*>) + 1040
    frame #4: 0x0000000111350e54 libarrow.1800.dylib`arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::$_6::operator()() const + 88
    frame #5: 0x0000000111350dc8 libarrow.1800.dylib`decltype(std::declval<arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::$_6>()()) std::__1::__invoke[abi:ue170006]<arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::$_6>(arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::$_6&&) + 24
    frame #6: 0x0000000111350da4 libarrow.1800.dylib`void std::__1::__thread_execute[abi:ue170006]<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct>>, arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::$_6>(std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct>>, arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::$_6>&, std::__1::__tuple_indices<>) + 28
    frame #7: 0x0000000111350a1c libarrow.1800.dylib`void* std::__1::__thread_proxy[abi:ue170006]<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct>>, arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::$_6>>(void*) + 84
    frame #8: 0x0000000192171f94 libsystem_pthread.dylib`_pthread_start + 136
  thread #5
    frame #0: 0x00000001921345ec libsystem_kernel.dylib`__psynch_cvwait + 8
    frame #1: 0x000000019217255c libsystem_pthread.dylib`_pthread_cond_wait + 1228
    frame #2: 0x0000000192097b14 libc++.1.dylib`std::__1::condition_variable::wait(std::__1::unique_lock<std::__1::mutex>&) + 28
    frame #3: 0x000000011135129c libarrow.1800.dylib`arrow::internal::WorkerLoop(std::__1::shared_ptr<arrow::internal::ThreadPool::State>, std::__1::__list_iterator<std::__1::thread, void*>) + 1040
    frame #4: 0x0000000111350e54 libarrow.1800.dylib`arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::$_6::operator()() const + 88
    frame #5: 0x0000000111350dc8 libarrow.1800.dylib`decltype(std::declval<arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::$_6>()()) std::__1::__invoke[abi:ue170006]<arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::$_6>(arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::$_6&&) + 24
    frame #6: 0x0000000111350da4 libarrow.1800.dylib`void std::__1::__thread_execute[abi:ue170006]<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct>>, arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::$_6>(std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct>>, arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::$_6>&, std::__1::__tuple_indices<>) + 28
    frame #7: 0x0000000111350a1c libarrow.1800.dylib`void* std::__1::__thread_proxy[abi:ue170006]<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct>>, arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::$_6>>(void*) + 84
    frame #8: 0x0000000192171f94 libsystem_pthread.dylib`_pthread_start + 136
  thread #6
    frame #0: 0x00000001921345ec libsystem_kernel.dylib`__psynch_cvwait + 8
    frame #1: 0x000000019217255c libsystem_pthread.dylib`_pthread_cond_wait + 1228
    frame #2: 0x0000000192097b14 libc++.1.dylib`std::__1::condition_variable::wait(std::__1::unique_lock<std::__1::mutex>&) + 28
    frame #3: 0x000000011135129c libarrow.1800.dylib`arrow::internal::WorkerLoop(std::__1::shared_ptr<arrow::internal::ThreadPool::State>, std::__1::__list_iterator<std::__1::thread, void*>) + 1040
    frame #4: 0x0000000111350e54 libarrow.1800.dylib`arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::$_6::operator()() const + 88
    frame #5: 0x0000000111350dc8 libarrow.1800.dylib`decltype(std::declval<arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::$_6>()()) std::__1::__invoke[abi:ue170006]<arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::$_6>(arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::$_6&&) + 24
    frame #6: 0x0000000111350da4 libarrow.1800.dylib`void std::__1::__thread_execute[abi:ue170006]<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct>>, arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::$_6>(std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct>>, arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::$_6>&, std::__1::__tuple_indices<>) + 28
    frame #7: 0x0000000111350a1c libarrow.1800.dylib`void* std::__1::__thread_proxy[abi:ue170006]<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct>>, arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::$_6>>(void*) + 84
    frame #8: 0x0000000192171f94 libsystem_pthread.dylib`_pthread_start + 136
  thread #7
    frame #0: 0x00000001921345ec libsystem_kernel.dylib`__psynch_cvwait + 8
    frame #1: 0x000000019217255c libsystem_pthread.dylib`_pthread_cond_wait + 1228
    frame #2: 0x0000000192097b14 libc++.1.dylib`std::__1::condition_variable::wait(std::__1::unique_lock<std::__1::mutex>&) + 28
    frame #3: 0x000000011135129c libarrow.1800.dylib`arrow::internal::WorkerLoop(std::__1::shared_ptr<arrow::internal::ThreadPool::State>, std::__1::__list_iterator<std::__1::thread, void*>) + 1040
    frame #4: 0x0000000111350e54 libarrow.1800.dylib`arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::$_6::operator()() const + 88
    frame #5: 0x0000000111350dc8 libarrow.1800.dylib`decltype(std::declval<arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::$_6>()()) std::__1::__invoke[abi:ue170006]<arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::$_6>(arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::$_6&&) + 24
    frame #6: 0x0000000111350da4 libarrow.1800.dylib`void std::__1::__thread_execute[abi:ue170006]<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct>>, arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::$_6>(std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct>>, arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::$_6>&, std::__1::__tuple_indices<>) + 28
    frame #7: 0x0000000111350a1c libarrow.1800.dylib`void* std::__1::__thread_proxy[abi:ue170006]<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct>>, arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::$_6>>(void*) + 84
    frame #8: 0x0000000192171f94 libsystem_pthread.dylib`_pthread_start + 136
  thread #8
    frame #0: 0x00000001921345ec libsystem_kernel.dylib`__psynch_cvwait + 8
    frame #1: 0x000000019217255c libsystem_pthread.dylib`_pthread_cond_wait + 1228
    frame #2: 0x0000000192097b14 libc++.1.dylib`std::__1::condition_variable::wait(std::__1::unique_lock<std::__1::mutex>&) + 28
    frame #3: 0x000000011135129c libarrow.1800.dylib`arrow::internal::WorkerLoop(std::__1::shared_ptr<arrow::internal::ThreadPool::State>, std::__1::__list_iterator<std::__1::thread, void*>) + 1040
    frame #4: 0x0000000111350e54 libarrow.1800.dylib`arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::$_6::operator()() const + 88
    frame #5: 0x0000000111350dc8 libarrow.1800.dylib`decltype(std::declval<arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::$_6>()()) std::__1::__invoke[abi:ue170006]<arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::$_6>(arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::$_6&&) + 24
    frame #6: 0x0000000111350da4 libarrow.1800.dylib`void std::__1::__thread_execute[abi:ue170006]<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct>>, arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::$_6>(std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct>>, arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::$_6>&, std::__1::__tuple_indices<>) + 28
    frame #7: 0x0000000111350a1c libarrow.1800.dylib`void* std::__1::__thread_proxy[abi:ue170006]<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct>>, arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::$_6>>(void*) + 84
    frame #8: 0x0000000192171f94 libsystem_pthread.dylib`_pthread_start + 136
  thread #9
    frame #0: 0x00000001921345ec libsystem_kernel.dylib`__psynch_cvwait + 8
    frame #1: 0x000000019217255c libsystem_pthread.dylib`_pthread_cond_wait + 1228
    frame #2: 0x0000000192097b14 libc++.1.dylib`std::__1::condition_variable::wait(std::__1::unique_lock<std::__1::mutex>&) + 28
    frame #3: 0x000000011135129c libarrow.1800.dylib`arrow::internal::WorkerLoop(std::__1::shared_ptr<arrow::internal::ThreadPool::State>, std::__1::__list_iterator<std::__1::thread, void*>) + 1040
    frame #4: 0x0000000111350e54 libarrow.1800.dylib`arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::$_6::operator()() const + 88
    frame #5: 0x0000000111350dc8 libarrow.1800.dylib`decltype(std::declval<arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::$_6>()()) std::__1::__invoke[abi:ue170006]<arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::$_6>(arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::$_6&&) + 24
    frame #6: 0x0000000111350da4 libarrow.1800.dylib`void std::__1::__thread_execute[abi:ue170006]<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct>>, arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::$_6>(std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct>>, arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::$_6>&, std::__1::__tuple_indices<>) + 28
    frame #7: 0x0000000111350a1c libarrow.1800.dylib`void* std::__1::__thread_proxy[abi:ue170006]<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct>>, arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::$_6>>(void*) + 84
    frame #8: 0x0000000192171f94 libsystem_pthread.dylib`_pthread_start + 136
  thread #10
    frame #0: 0x00000001921345ec libsystem_kernel.dylib`__psynch_cvwait + 8
    frame #1: 0x000000019217255c libsystem_pthread.dylib`_pthread_cond_wait + 1228
    frame #2: 0x0000000192097b14 libc++.1.dylib`std::__1::condition_variable::wait(std::__1::unique_lock<std::__1::mutex>&) + 28
    frame #3: 0x000000011135129c libarrow.1800.dylib`arrow::internal::WorkerLoop(std::__1::shared_ptr<arrow::internal::ThreadPool::State>, std::__1::__list_iterator<std::__1::thread, void*>) + 1040
    frame #4: 0x0000000111350e54 libarrow.1800.dylib`arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::$_6::operator()() const + 88
    frame #5: 0x0000000111350dc8 libarrow.1800.dylib`decltype(std::declval<arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::$_6>()()) std::__1::__invoke[abi:ue170006]<arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::$_6>(arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::$_6&&) + 24
    frame #6: 0x0000000111350da4 libarrow.1800.dylib`void std::__1::__thread_execute[abi:ue170006]<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct>>, arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::$_6>(std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct>>, arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::$_6>&, std::__1::__tuple_indices<>) + 28
    frame #7: 0x0000000111350a1c libarrow.1800.dylib`void* std::__1::__thread_proxy[abi:ue170006]<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct>>, arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::$_6>>(void*) + 84
    frame #8: 0x0000000192171f94 libsystem_pthread.dylib`_pthread_start + 136
  thread #11
    frame #0: 0x00000001921345ec libsystem_kernel.dylib`__psynch_cvwait + 8
    frame #1: 0x000000019217255c libsystem_pthread.dylib`_pthread_cond_wait + 1228
    frame #2: 0x0000000192097b14 libc++.1.dylib`std::__1::condition_variable::wait(std::__1::unique_lock<std::__1::mutex>&) + 28
    frame #3: 0x000000011135129c libarrow.1800.dylib`arrow::internal::WorkerLoop(std::__1::shared_ptr<arrow::internal::ThreadPool::State>, std::__1::__list_iterator<std::__1::thread, void*>) + 1040
    frame #4: 0x0000000111350e54 libarrow.1800.dylib`arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::$_6::operator()() const + 88
    frame #5: 0x0000000111350dc8 libarrow.1800.dylib`decltype(std::declval<arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::$_6>()()) std::__1::__invoke[abi:ue170006]<arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::$_6>(arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::$_6&&) + 24
    frame #6: 0x0000000111350da4 libarrow.1800.dylib`void std::__1::__thread_execute[abi:ue170006]<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct>>, arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::$_6>(std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct>>, arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::$_6>&, std::__1::__tuple_indices<>) + 28
    frame #7: 0x0000000111350a1c libarrow.1800.dylib`void* std::__1::__thread_proxy[abi:ue170006]<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct>>, arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::$_6>>(void*) + 84
    frame #8: 0x0000000192171f94 libsystem_pthread.dylib`_pthread_start + 136
(lldb)

This trace is sufficiently different from the previous one that I'm going to try to capture another trace later today. The earlier trace was in a different test and stuck on a different syscall. Will report back.

@amoeba
Copy link
Member Author

amoeba commented Sep 24, 2024

I wasn't able to reproduce the hang and get a debugger attached after numerous attempts today. I can keep trying but I'm not sure what a good next step here would be. The process I've been taking has been to run a modified cpp.yml workflow file, conditionally start a tmate session if the tests timeout, ssh into the VM, re-run the tests and wait until it appears things have timed out, then finally attach via lldb.

@pitrou
Copy link
Member

pitrou commented Sep 24, 2024

Thanks @amoeba . I think we're quite clear now that every hang that could be reproduced was simply stalled waiting for a network response, so it's probably not a bug in our S3 filesystem at least.

We could perhaps try to disable the non-existent bucket tests on macOS to see if that makes our CI more reliable...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants