-
Notifications
You must be signed in to change notification settings - Fork 884
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add stacktrace into cudf exception types #13298
Merged
Merged
Changes from all commits
Commits
Show all changes
60 commits
Select commit
Hold shift + click to select a range
7d7cb1e
Add `stacktrace`
ttnghia 6b51a29
Update docs
ttnghia 22e6553
Adjust frame index
ttnghia f4ae832
Adopt `get_stacktrace` in `identify_stream_usage.cpp`
ttnghia f641add
Rewrite stacktrace
ttnghia 8d45ee3
Fix format
ttnghia aa2b963
Fix comment
ttnghia 78d39f5
Update cmake
ttnghia cde4ec5
Merge branch 'branch-23.06' into stacktrace
ttnghia a03a497
Add header
ttnghia 1d291fa
Rewrite stacktrace.cpp
ttnghia c88dc7f
Update cmake
ttnghia 9085058
Rewrite stacktrace accessor
ttnghia b158d65
Update copyright year
ttnghia 9b04caa
Update JNI to adopt stacktrace
ttnghia 14ab74f
Change `stacktrace` return type
ttnghia c1d8dee
Update JNI
ttnghia 4747704
Remove redundant definition
ttnghia 97c8046
Rename variable
ttnghia 353cac8
Optimize JNI
ttnghia eb324ca
Move header into `cudf/detail/utilities/`
ttnghia 1ebde06
Fix `meta.yaml`
ttnghia a2c830b
Add compile option
ttnghia 9b816e5
Change cmake
ttnghia 0ca1bba
Change cast type
ttnghia ef7ebb9
Merge branch 'branch-23.06' into stacktrace
ttnghia b820f14
Update cpp/CMakeLists.txt
ttnghia e247cb8
Update copyright year
ttnghia 4112288
Merge branch 'branch-23.06' into stacktrace
ttnghia 3d7e337
Remove variable
ttnghia b24420b
Rewrite stacktrace
ttnghia 5da984a
Change comments
ttnghia 1033535
Using `c*` headers, and add `std::` prefix
ttnghia 991acbc
Encapsulate headers in `#ifdef`
ttnghia ba8d97b
Add `// __GNUC__`
ttnghia 06d7ce7
Reset
ttnghia 83bd318
Other approach for cmake
ttnghia 695bfa9
Link `cudf_backtrace` with stream test
ttnghia 0656b3b
Remove cmake string replace for `_CXX_FLAGS_`
ttnghia 3224559
Apply suggestions from code review
ttnghia 27da4e4
Merge branch 'branch-23.06' into stacktrace
ttnghia 7b2cca9
Merge branch 'branch-23.06' into stacktrace
ttnghia 72859b8
Fix JNI bugs
ttnghia 651ed1c
Add more default constructor for Java exception classes
ttnghia 5d86822
Merge branch 'branch-23.06' into stacktrace
ttnghia 7f1df5b
Merge branch 'branch-23.06' into stacktrace
ttnghia a36c191
Disable stacktrace completely by default
ttnghia f382ea6
Merge branch 'branch-23.08' into stacktrace
ttnghia 0c312c8
Update cpp/src/utilities/stacktrace.cpp
ttnghia f25bf64
Merge branch 'branch-23.08' into stacktrace
ttnghia bd4085d
Add more stacktrace usage
ttnghia e08ba04
Merge branch 'branch-23.08' into stacktrace
ttnghia e44f2a2
Merge branch 'branch-23.08' into stacktrace
ttnghia e7c5967
Update cpp/CMakeLists.txt
ttnghia 0e0399c
Change macro name
ttnghia 25ff793
Not throwing new exception
ttnghia dc03826
Fix typo
ttnghia 58ff5c7
Throw cuda exception but with error code
ttnghia 6ebee14
Add default stacktrace message
ttnghia 9ebe2a2
Add exception check macro
ttnghia File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,47 @@ | ||
/* | ||
* Copyright (c) 2023, NVIDIA CORPORATION. | ||
* | ||
* Licensed under the Apache License, Version 2.0 (the "License"); | ||
* you may not use this file except in compliance with the License. | ||
* You may obtain a copy of the License at | ||
* | ||
* http://www.apache.org/licenses/LICENSE-2.0 | ||
* | ||
* Unless required by applicable law or agreed to in writing, software | ||
* distributed under the License is distributed on an "AS IS" BASIS, | ||
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
* See the License for the specific language governing permissions and | ||
* limitations under the License. | ||
*/ | ||
|
||
#pragma once | ||
|
||
#include <string> | ||
|
||
namespace cudf::detail { | ||
/** | ||
* @addtogroup utility_stacktrace | ||
* @{ | ||
* @file | ||
*/ | ||
|
||
/** | ||
* @brief Specify whether the last stackframe is included in the stacktrace. | ||
*/ | ||
enum class capture_last_stackframe : bool { YES, NO }; | ||
|
||
/** | ||
* @brief Query the current stacktrace and return the whole stacktrace as one string. | ||
* | ||
* Depending on the value of the flag `capture_last_frame`, the caller that executes stacktrace | ||
* retrieval can be included in the output result. | ||
* | ||
* @param capture_last_frame Flag to specify if the current stackframe will be included into | ||
* the output | ||
* @return A string storing the whole current stacktrace | ||
*/ | ||
std::string get_stacktrace(capture_last_stackframe capture_last_frame); | ||
|
||
/** @} */ // end of group | ||
|
||
} // namespace cudf::detail |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,88 @@ | ||
/* | ||
* Copyright (c) 2023, NVIDIA CORPORATION. | ||
* | ||
* Licensed under the Apache License, Version 2.0 (the "License"); | ||
* you may not use this file except in compliance with the License. | ||
* You may obtain a copy of the License at | ||
* | ||
* http://www.apache.org/licenses/LICENSE-2.0 | ||
* | ||
* Unless required by applicable law or agreed to in writing, software | ||
* distributed under the License is distributed on an "AS IS" BASIS, | ||
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
* See the License for the specific language governing permissions and | ||
* limitations under the License. | ||
*/ | ||
|
||
#include <cudf/detail/utilities/stacktrace.hpp> | ||
|
||
#if defined(__GNUC__) && defined(CUDF_BUILD_STACKTRACE_DEBUG) | ||
#include <cxxabi.h> | ||
#include <execinfo.h> | ||
|
||
#include <cstdlib> | ||
#include <cstring> | ||
#include <sstream> | ||
#endif // defined(__GNUC__) && defined(CUDF_BUILD_STACKTRACE_DEBUG) | ||
|
||
namespace cudf::detail { | ||
|
||
std::string get_stacktrace(capture_last_stackframe capture_last_frame) | ||
{ | ||
#if defined(__GNUC__) && defined(CUDF_BUILD_STACKTRACE_DEBUG) | ||
constexpr int max_stack_depth = 64; | ||
void* stack[max_stack_depth]; | ||
|
||
auto const depth = backtrace(stack, max_stack_depth); | ||
auto const modules = backtrace_symbols(stack, depth); | ||
|
||
if (modules == nullptr) { return "No stacktrace could be captured!"; } | ||
|
||
std::stringstream ss; | ||
|
||
// Skip one more depth to avoid including the stackframe of this function. | ||
auto const skip_depth = 1 + (capture_last_frame == capture_last_stackframe::YES ? 0 : 1); | ||
for (auto i = skip_depth; i < depth; ++i) { | ||
// Each modules[i] string contains a mangled name in the format like following: | ||
// `module_name(function_name+0x012) [0x01234567890a]` | ||
// We need to extract function name and function offset. | ||
char* begin_func_name = std::strstr(modules[i], "("); | ||
char* begin_func_offset = std::strstr(modules[i], "+"); | ||
char* end_func_offset = std::strstr(modules[i], ")"); | ||
|
||
auto const frame_idx = i - skip_depth; | ||
if (begin_func_name && begin_func_offset && end_func_offset && | ||
begin_func_name < begin_func_offset) { | ||
// Split `modules[i]` into separate null-terminated strings. | ||
// After this, mangled function name will then be [begin_func_name, begin_func_offset), and | ||
// function offset is in [begin_func_offset, end_func_offset). | ||
*(begin_func_name++) = '\0'; | ||
*(begin_func_offset++) = '\0'; | ||
*end_func_offset = '\0'; | ||
|
||
// We need to demangle function name. | ||
int status{0}; | ||
char* func_name = abi::__cxa_demangle(begin_func_name, nullptr, nullptr, &status); | ||
|
||
ss << "#" << frame_idx << ": " << modules[i] << " : " | ||
<< (status == 0 /*demangle success*/ ? func_name : begin_func_name) << "+" | ||
<< begin_func_offset << "\n"; | ||
free(func_name); | ||
} else { | ||
ss << "#" << frame_idx << ": " << modules[i] << "\n"; | ||
} | ||
} | ||
|
||
free(modules); | ||
|
||
return ss.str(); | ||
#else | ||
#ifdef CUDF_BUILD_STACKTRACE_DEBUG | ||
return "Stacktrace is only supported when built with a GNU compiler."; | ||
#else | ||
return "libcudf was not built with stacktrace support."; | ||
#endif // CUDF_BUILD_STACKTRACE_DEBUG | ||
#endif // __GNUC__ | ||
} | ||
|
||
} // namespace cudf::detail |
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Folks should be aware that this is changing the size of the exception type even when compiling in non-stacktrace mode.
That isn't to say it's a good idea to not inherit from
stacktrace_recorder
in non-stacktrace mode because there would be the possibility for weird ABI issues with different builds of libcudf having different sizes for exception types.The ideal way to do this without changing the size of the exception type would be to include the stack trace in the
what()
string such that there is only a singlestring
member.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The "what()" string is accessed for logging. If we extend it, it may be very large for downstream applications to print out every time. This issue has been mentioned before, please see #12422 (comment).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does the recorder need to be part of the signature?
The stack trace does not seem to have any real association with the exception object.
Could it not just be a free function used like this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately not.
stacktrace()
gives you the call stack up to the point insidestacktrace()
. From there, you cannot trace back to where the exception was created.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In fact, the stack trace is created and attached to the exception object. The exception classes now derive from
stacktrace_recorder
, which captures the call stack immediately upon construction.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An alternative would be for each exception type to own a
stacktrace_recorder
that is constructed in its constructor. That approach would require ignoring 2 frames instead of just 1, though, and it would be more verbose. I don't have a strong opinion that that is a necessary change here.