[fix](brpc) coredump caused by brpc checking #44047
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
run buildall
clang-tidy review says "All clean, LGTM! 👍"
PR approved by at least one committer and no changes requested.
PR approved by anyone and no changes requested.
TeamCity be ut coverage result: |
```
/root/doris/be/src/runtime/fragment_mgr.cpp:1064:20: runtime error: member call on null pointer of type 'doris::PBackendService_Stub'
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /root/doris/be/src/runtime/fragment_mgr.cpp:1064:20 in
*** Query id: 0-0 ***
*** is nereids: 0 ***
*** tablet id: 0 ***
*** Aborted at 1731663847 (unix time) try "date -d @1731663847" if you are using GNU date ***
*** Current BE git commitID: b663df0e50 ***
*** SIGSEGV address not mapped to object (@0x0) received by PID 17169 (TID 17463 OR 0x7f746d21a700) from PID 0; stack trace: ***
 0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at /root/doris/be/src/common/signal_handler.h:421
 1# PosixSignals::chained_handler(int, siginfo_t*, void*) [clone .part.0] in /usr/lib/jvm/java-17-openjdk-amd64/lib/server/libjvm.so
 2# JVM_handle_linux_signal in /usr/lib/jvm/java-17-openjdk-amd64/lib/server/libjvm.so
 3# 0x00007F7601263090 in /lib/x86_64-linux-gnu/libc.so.6
 4# doris::FragmentMgr::_check_brpc_available(std::shared_ptr<doris::PBackendService_Stub> const&, doris::FragmentMgr::BrpcItem const&) in /mnt/ssd01/pipline/OpenSourceDoris/clusterEnv/P0/Cluster0/be/lib/doris_be
 5# doris::FragmentMgr::cancel_worker() at /root/doris/be/src/runtime/fragment_mgr.cpp:1022
 6# doris::Thread::supervise_thread(void*) at /root/doris/be/src/util/thread.cpp:499
 7# start_thread at /build/glibc-SzIz7B/glibc-2.31/nptl/pthread_create.c:478
 8# __clone at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
```
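The trace reports a member call through a null `std::shared_ptr<doris::PBackendService_Stub>` inside `_check_brpc_available`. As a minimal sketch of the general guard pattern (not the actual change in this PR), the snippet below checks the pointer before calling through it. `FakeStub`, `hand_shake`, and `check_stub_available` are hypothetical names invented for illustration; they are not Doris or brpc APIs.

```cpp
#include <iostream>
#include <memory>
#include <string>

// Hypothetical stand-in for a generated RPC stub; not the real Doris class.
struct FakeStub {
    // Pretend health probe that always succeeds.
    bool hand_shake() const { return true; }
};

// Hypothetical helper: report "unavailable" for a null stub instead of
// dereferencing it, which is the undefined behavior flagged by UBSan.
bool check_stub_available(const std::shared_ptr<FakeStub>& stub, const std::string& host) {
    if (stub == nullptr) {
        std::cerr << "stub for " << host << " is null, skipping availability check\n";
        return false;
    }
    return stub->hand_shake();
}
```

A caller would pass whatever stub it obtained from its client cache; with this guard, an empty pointer is logged and treated as unavailable rather than crashing the worker thread.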
pick #44047

### What problem does this PR solve?

Issue Number: close #xxx

Related PR: #xxx

Problem Summary:

### Release note

None

### Check List (For Author)

- Test <!-- At least one of them must be included. -->
    - [ ] Regression test
    - [ ] Unit Test
    - [ ] Manual test (add detailed scripts or steps below)
    - [ ] No need to test or manual test. Explain why:
        - [ ] This is a refactor/code format and no logic has been changed.
        - [ ] Previous test can cover this change.
        - [ ] No code files have been changed.
        - [ ] Other reason <!-- Add your reason? -->
- Behavior changed:
    - [ ] No.
    - [ ] Yes. <!-- Explain the behavior change -->
- Does this need documentation?
    - [ ] No.
    - [ ] Yes. <!-- Add document PR link here. eg: apache/doris-website#1214 -->

### Check List (For Reviewer who merge this PR)

- [ ] Confirm the release note
- [ ] Confirm test cases
- [ ] Confirm document
- [ ] Add branch pick label <!-- Add branch pick label that this PR should merge into -->
### What problem does this PR solve?

```
/root/doris/be/src/runtime/fragment_mgr.cpp:1064:20: runtime error: member call on null pointer of type 'doris::PBackendService_Stub'
    #0 0x55bd899c9aaa in doris::FragmentMgr::_check_brpc_available(std::shared_ptr<doris::PBackendService_Stub> const&, doris::FragmentMgr::BrpcItem const&) /root/doris/be/src/runtime/fragment_mgr.cpp:1064:20
    #1 0x55bd899c521f in doris::FragmentMgr::cancel_worker() /root/doris/be/src/runtime/fragment_mgr.cpp:1021:13
    #2 0x55bd8a4c97ae in std::function<void ()>::operator()() const /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/std_function.h:560:9
    #3 0x55bd8a4c97ae in doris::Thread::supervise_thread(void*) /root/doris/be/src/util/thread.cpp:498:5
    #4 0x7f7601092608 in start_thread /build/glibc-SzIz7B/glibc-2.31/nptl/pthread_create.c:477:8
    #5 0x7f760133f132 in __clone /build/glibc-SzIz7B/glibc-2.31/misc/../sysdeps/unix/sysv/linux/x86_64/clone.S:95
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /root/doris/be/src/runtime/fragment_mgr.cpp:1064:20 in
*** Query id: 0-0 ***
*** is nereids: 0 ***
*** tablet id: 0 ***
*** Aborted at 1731663847 (unix time) try "date -d @1731663847" if you are using GNU date ***
*** Current BE git commitID: b663df0e50 ***
*** SIGSEGV address not mapped to object (@0x0) received by PID 17169 (TID 17463 OR 0x7f746d21a700) from PID 0; stack trace: ***
 0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at /root/doris/be/src/common/signal_handler.h:421
 1# PosixSignals::chained_handler(int, siginfo_t*, void*) [clone .part.0] in /usr/lib/jvm/java-17-openjdk-amd64/lib/server/libjvm.so
 2# JVM_handle_linux_signal in /usr/lib/jvm/java-17-openjdk-amd64/lib/server/libjvm.so
 3# 0x00007F7601263090 in /lib/x86_64-linux-gnu/libc.so.6
 4# doris::FragmentMgr::_check_brpc_available(std::shared_ptr<doris::PBackendService_Stub> const&, doris::FragmentMgr::BrpcItem const&) in /mnt/ssd01/pipline/OpenSourceDoris/clusterEnv/P0/Cluster0/be/lib/doris_be
 5# doris::FragmentMgr::cancel_worker() at /root/doris/be/src/runtime/fragment_mgr.cpp:1022
 6# doris::Thread::supervise_thread(void*) at /root/doris/be/src/util/thread.cpp:499
 7# start_thread at /build/glibc-SzIz7B/glibc-2.31/nptl/pthread_create.c:478
 8# __clone at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
```