Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[2024.2.0.0-b116][AlmaLinux9][Connection Pooling][YBM] Connections erroring out during query execution with error 'SSL SYSCALL error: EOF detected'. #24979

Open
yugabyte-ci opened this issue Nov 19, 2024 · 0 comments
Assignees
Labels
jira-originated kind/bug This issue is a bug priority/highest Highest priority issue

Comments

@yugabyte-ci
Copy link
Contributor

yugabyte-ci commented Nov 19, 2024

Jira Link: DB-13388

@yugabyte-ci yugabyte-ci added jira-originated kind/bug This issue is a bug priority/highest Highest priority issue labels Nov 19, 2024
manav-yb pushed a commit that referenced this issue Nov 20, 2024
…tion manager

Summary:
Set pthread_attr_setstacksize to 512 KB in ysql connection manager. This is to fix crashes involving Alma 9 machines.
```
#0  0x000055e430191fd7 in tcmalloc::tcmalloc_internal::PageTracker::Get(tcmalloc::tcmalloc_internal::Length) ()
#1  0x000055e430192774 in tcmalloc::tcmalloc_internal::HugePageFiller<tcmalloc::tcmalloc_internal::PageTracker>::TryGet(tcmalloc::tcmalloc_internal::Length, unsigned long)
    ()
#2  0x000055e430160b3a in tcmalloc::tcmalloc_internal::HugePageAwareAllocator::New(tcmalloc::tcmalloc_internal::Length, unsigned long) ()
#3  0x000055e4301437a2 in void* tcmalloc::tcmalloc_internal::SampleifyAllocation<tcmalloc::tcmalloc_internal::Static, tcmalloc::tcmalloc_internal::TCMallocPolicy<tcmalloc::tcmalloc_internal::MallocOomPolicy, tcmalloc::tcmalloc_internal::MallocAlignPolicy, tcmalloc::tcmalloc_internal::AllocationAccessHotPolicy, tcmalloc::tcmalloc_internal::InvokeHooksPolicy, tcmalloc::tcmalloc_internal::LocalNumaPartitionPolicy> >(tcmalloc::tcmalloc_internal::Static&, tcmalloc::tcmalloc_internal::TCMallocPolicy<tcmalloc::tcmalloc_internal::MallocOomPolicy, tcmalloc::tcmalloc_internal::MallocAlignPolicy, tcmalloc::tcmalloc_internal::AllocationAccessHotPolicy, tcmalloc::tcmalloc_internal::InvokeHooksPolicy, tcmalloc::tcmalloc_internal::LocalNumaPartitionPolicy>, unsigned long, unsigned long, unsigned long, void*, tcmalloc::tcmalloc_internal::Span*, unsigned long*) ()
#4  0x000055e43014339b in void* slow_alloc<tcmalloc::tcmalloc_internal::TCMallocPolicy<tcmalloc::tcmalloc_internal::MallocOomPolicy, tcmalloc::tcmalloc_internal::MallocAlignPolicy, tcmalloc::tcmalloc_internal::AllocationAccessHotPolicy, tcmalloc::tcmalloc_internal::InvokeHooksPolicy, tcmalloc::tcmalloc_internal::LocalNumaPartitionPolicy>, decltype(nullptr)>(tcmalloc::tcmalloc_internal::TCMallocPolicy<tcmalloc::tcmalloc_internal::MallocOomPolicy, tcmalloc::tcmalloc_internal::MallocAlignPolicy, tcmalloc::tcmalloc_internal::AllocationAccessHotPolicy, tcmalloc::tcmalloc_internal::InvokeHooksPolicy, tcmalloc::tcmalloc_internal::LocalNumaPartitionPolicy>, unsigned long, decltype(nullptr))
    ()
#5  0x000055e4301404e6 in malloc ()
#6  0x00007fc3a67b1d7e in ssl3_setup_write_buffer () from /home/centos/code/local_testing/conn_manager_ssl_auth/yugabyte-b124/bin/../lib/yb-thirdparty/libssl.so.3
#7  0x00007fc3a67ae824 in do_ssl3_write () from /home/centos/code/local_testing/conn_manager_ssl_auth/yugabyte-b124/bin/../lib/yb-thirdparty/libssl.so.3
#8  0x00007fc3a67ae3e1 in ssl3_write_bytes () from /home/centos/code/local_testing/conn_manager_ssl_auth/yugabyte-b124/bin/../lib/yb-thirdparty/libssl.so.3
#9  0x00007fc3a67cc251 in ssl3_do_write () from /home/centos/code/local_testing/conn_manager_ssl_auth/yugabyte-b124/bin/../lib/yb-thirdparty/libssl.so.3
#10 0x00007fc3a67c1a6a in state_machine () from /home/centos/code/local_testing/conn_manager_ssl_auth/yugabyte-b124/bin/../lib/yb-thirdparty/libssl.so.3
#11 0x000055e43013d303 in mm_tls_handshake_cb (handle=<optimized out>) at ../../src/odyssey/third_party/machinarium/sources/tls.c:453
#12 0x000055e43013b9e7 in mm_epoll_step (poll=0x37beffd71a60, timeout=<optimized out>) at ../../src/odyssey/third_party/machinarium/sources/epoll.c:79
#13 0x000055e43013b386 in mm_loop_step (loop=0x37beffd72980) at ../../src/odyssey/third_party/machinarium/sources/loop.c:64
#14 machine_main (arg=0x37beffd72780) at ../../src/odyssey/third_party/machinarium/sources/machine.c:56
#15 0x00007fc3a5e89c02 in start_thread (arg=<optimized out>) at pthread_create.c:443
#16 0x00007fc3a5f0ec40 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81```
Note that, this 512 KB value is the same as the value used in tserver and master processes via the min_thread_stack_size_bytes GFlag introduced in https://phorge.dev.yugabyte.com/D38053.
Jira: DB-13388

Test Plan: Jenkins: enable connection manager, all tests

Reviewers: skumar, stiwary

Reviewed By: stiwary

Subscribers: yql

Differential Revision: https://phorge.dev.yugabyte.com/D40087
manav-yb pushed a commit that referenced this issue Nov 20, 2024
…KB in ysql connection manager

Summary:
Original commit: None / D40087
Set pthread_attr_setstacksize to 512 KB in ysql connection manager. This is to fix crashes involving Alma 9 machines.
```
#0  0x000055e430191fd7 in tcmalloc::tcmalloc_internal::PageTracker::Get(tcmalloc::tcmalloc_internal::Length) ()
#1  0x000055e430192774 in tcmalloc::tcmalloc_internal::HugePageFiller<tcmalloc::tcmalloc_internal::PageTracker>::TryGet(tcmalloc::tcmalloc_internal::Length, unsigned long)
    ()
#2  0x000055e430160b3a in tcmalloc::tcmalloc_internal::HugePageAwareAllocator::New(tcmalloc::tcmalloc_internal::Length, unsigned long) ()
#3  0x000055e4301437a2 in void* tcmalloc::tcmalloc_internal::SampleifyAllocation<tcmalloc::tcmalloc_internal::Static, tcmalloc::tcmalloc_internal::TCMallocPolicy<tcmalloc::tcmalloc_internal::MallocOomPolicy, tcmalloc::tcmalloc_internal::MallocAlignPolicy, tcmalloc::tcmalloc_internal::AllocationAccessHotPolicy, tcmalloc::tcmalloc_internal::InvokeHooksPolicy, tcmalloc::tcmalloc_internal::LocalNumaPartitionPolicy> >(tcmalloc::tcmalloc_internal::Static&, tcmalloc::tcmalloc_internal::TCMallocPolicy<tcmalloc::tcmalloc_internal::MallocOomPolicy, tcmalloc::tcmalloc_internal::MallocAlignPolicy, tcmalloc::tcmalloc_internal::AllocationAccessHotPolicy, tcmalloc::tcmalloc_internal::InvokeHooksPolicy, tcmalloc::tcmalloc_internal::LocalNumaPartitionPolicy>, unsigned long, unsigned long, unsigned long, void*, tcmalloc::tcmalloc_internal::Span*, unsigned long*) ()
#4  0x000055e43014339b in void* slow_alloc<tcmalloc::tcmalloc_internal::TCMallocPolicy<tcmalloc::tcmalloc_internal::MallocOomPolicy, tcmalloc::tcmalloc_internal::MallocAlignPolicy, tcmalloc::tcmalloc_internal::AllocationAccessHotPolicy, tcmalloc::tcmalloc_internal::InvokeHooksPolicy, tcmalloc::tcmalloc_internal::LocalNumaPartitionPolicy>, decltype(nullptr)>(tcmalloc::tcmalloc_internal::TCMallocPolicy<tcmalloc::tcmalloc_internal::MallocOomPolicy, tcmalloc::tcmalloc_internal::MallocAlignPolicy, tcmalloc::tcmalloc_internal::AllocationAccessHotPolicy, tcmalloc::tcmalloc_internal::InvokeHooksPolicy, tcmalloc::tcmalloc_internal::LocalNumaPartitionPolicy>, unsigned long, decltype(nullptr))
    ()
#5  0x000055e4301404e6 in malloc ()
#6  0x00007fc3a67b1d7e in ssl3_setup_write_buffer () from /home/centos/code/local_testing/conn_manager_ssl_auth/yugabyte-b124/bin/../lib/yb-thirdparty/libssl.so.3
#7  0x00007fc3a67ae824 in do_ssl3_write () from /home/centos/code/local_testing/conn_manager_ssl_auth/yugabyte-b124/bin/../lib/yb-thirdparty/libssl.so.3
#8  0x00007fc3a67ae3e1 in ssl3_write_bytes () from /home/centos/code/local_testing/conn_manager_ssl_auth/yugabyte-b124/bin/../lib/yb-thirdparty/libssl.so.3
#9  0x00007fc3a67cc251 in ssl3_do_write () from /home/centos/code/local_testing/conn_manager_ssl_auth/yugabyte-b124/bin/../lib/yb-thirdparty/libssl.so.3
#10 0x00007fc3a67c1a6a in state_machine () from /home/centos/code/local_testing/conn_manager_ssl_auth/yugabyte-b124/bin/../lib/yb-thirdparty/libssl.so.3
#11 0x000055e43013d303 in mm_tls_handshake_cb (handle=<optimized out>) at ../../src/odyssey/third_party/machinarium/sources/tls.c:453
#12 0x000055e43013b9e7 in mm_epoll_step (poll=0x37beffd71a60, timeout=<optimized out>) at ../../src/odyssey/third_party/machinarium/sources/epoll.c:79
#13 0x000055e43013b386 in mm_loop_step (loop=0x37beffd72980) at ../../src/odyssey/third_party/machinarium/sources/loop.c:64
#14 machine_main (arg=0x37beffd72780) at ../../src/odyssey/third_party/machinarium/sources/machine.c:56
#15 0x00007fc3a5e89c02 in start_thread (arg=<optimized out>) at pthread_create.c:443
#16 0x00007fc3a5f0ec40 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81```
Note that, this 512 KB value is the same as the value used in tserver and master processes via the min_thread_stack_size_bytes GFlag introduced in https://phorge.dev.yugabyte.com/D38053.
Jira: DB-13388

Test Plan: Jenkins: enable connection manager, all tests

Reviewers: skumar, stiwary

Reviewed By: stiwary

Subscribers: yql

Differential Revision: https://phorge.dev.yugabyte.com/D40088
manav-yb pushed a commit that referenced this issue Nov 20, 2024
…2 KB in ysql connection manager

Summary:
Original commit: None / D40088
Set pthread_attr_setstacksize to 512 KB in ysql connection manager. This is to fix crashes involving Alma 9 machines.
```
#0  0x000055e430191fd7 in tcmalloc::tcmalloc_internal::PageTracker::Get(tcmalloc::tcmalloc_internal::Length) ()
#1  0x000055e430192774 in tcmalloc::tcmalloc_internal::HugePageFiller<tcmalloc::tcmalloc_internal::PageTracker>::TryGet(tcmalloc::tcmalloc_internal::Length, unsigned long)
    ()
#2  0x000055e430160b3a in tcmalloc::tcmalloc_internal::HugePageAwareAllocator::New(tcmalloc::tcmalloc_internal::Length, unsigned long) ()
#3  0x000055e4301437a2 in void* tcmalloc::tcmalloc_internal::SampleifyAllocation<tcmalloc::tcmalloc_internal::Static, tcmalloc::tcmalloc_internal::TCMallocPolicy<tcmalloc::tcmalloc_internal::MallocOomPolicy, tcmalloc::tcmalloc_internal::MallocAlignPolicy, tcmalloc::tcmalloc_internal::AllocationAccessHotPolicy, tcmalloc::tcmalloc_internal::InvokeHooksPolicy, tcmalloc::tcmalloc_internal::LocalNumaPartitionPolicy> >(tcmalloc::tcmalloc_internal::Static&, tcmalloc::tcmalloc_internal::TCMallocPolicy<tcmalloc::tcmalloc_internal::MallocOomPolicy, tcmalloc::tcmalloc_internal::MallocAlignPolicy, tcmalloc::tcmalloc_internal::AllocationAccessHotPolicy, tcmalloc::tcmalloc_internal::InvokeHooksPolicy, tcmalloc::tcmalloc_internal::LocalNumaPartitionPolicy>, unsigned long, unsigned long, unsigned long, void*, tcmalloc::tcmalloc_internal::Span*, unsigned long*) ()
#4  0x000055e43014339b in void* slow_alloc<tcmalloc::tcmalloc_internal::TCMallocPolicy<tcmalloc::tcmalloc_internal::MallocOomPolicy, tcmalloc::tcmalloc_internal::MallocAlignPolicy, tcmalloc::tcmalloc_internal::AllocationAccessHotPolicy, tcmalloc::tcmalloc_internal::InvokeHooksPolicy, tcmalloc::tcmalloc_internal::LocalNumaPartitionPolicy>, decltype(nullptr)>(tcmalloc::tcmalloc_internal::TCMallocPolicy<tcmalloc::tcmalloc_internal::MallocOomPolicy, tcmalloc::tcmalloc_internal::MallocAlignPolicy, tcmalloc::tcmalloc_internal::AllocationAccessHotPolicy, tcmalloc::tcmalloc_internal::InvokeHooksPolicy, tcmalloc::tcmalloc_internal::LocalNumaPartitionPolicy>, unsigned long, decltype(nullptr))
    ()
#5  0x000055e4301404e6 in malloc ()
#6  0x00007fc3a67b1d7e in ssl3_setup_write_buffer () from /home/centos/code/local_testing/conn_manager_ssl_auth/yugabyte-b124/bin/../lib/yb-thirdparty/libssl.so.3
#7  0x00007fc3a67ae824 in do_ssl3_write () from /home/centos/code/local_testing/conn_manager_ssl_auth/yugabyte-b124/bin/../lib/yb-thirdparty/libssl.so.3
#8  0x00007fc3a67ae3e1 in ssl3_write_bytes () from /home/centos/code/local_testing/conn_manager_ssl_auth/yugabyte-b124/bin/../lib/yb-thirdparty/libssl.so.3
#9  0x00007fc3a67cc251 in ssl3_do_write () from /home/centos/code/local_testing/conn_manager_ssl_auth/yugabyte-b124/bin/../lib/yb-thirdparty/libssl.so.3
#10 0x00007fc3a67c1a6a in state_machine () from /home/centos/code/local_testing/conn_manager_ssl_auth/yugabyte-b124/bin/../lib/yb-thirdparty/libssl.so.3
#11 0x000055e43013d303 in mm_tls_handshake_cb (handle=<optimized out>) at ../../src/odyssey/third_party/machinarium/sources/tls.c:453
#12 0x000055e43013b9e7 in mm_epoll_step (poll=0x37beffd71a60, timeout=<optimized out>) at ../../src/odyssey/third_party/machinarium/sources/epoll.c:79
#13 0x000055e43013b386 in mm_loop_step (loop=0x37beffd72980) at ../../src/odyssey/third_party/machinarium/sources/loop.c:64
#14 machine_main (arg=0x37beffd72780) at ../../src/odyssey/third_party/machinarium/sources/machine.c:56
#15 0x00007fc3a5e89c02 in start_thread (arg=<optimized out>) at pthread_create.c:443
#16 0x00007fc3a5f0ec40 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81```
Note that, this 512 KB value is the same as the value used in tserver and master processes via the min_thread_stack_size_bytes GFlag introduced in https://phorge.dev.yugabyte.com/D38053.
Jira: DB-13388

Test Plan: Jenkins: enable connection manager, all tests

Reviewers: skumar, stiwary

Reviewed By: stiwary

Subscribers: yql

Differential Revision: https://phorge.dev.yugabyte.com/D40089
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
jira-originated kind/bug This issue is a bug priority/highest Highest priority issue
Projects
None yet
Development

No branches or pull requests

2 participants