You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
We recently upgraded our DXC build from 0392e60 to ea3623f, and are now observing spurious SEGFAULTs. Backtraces lead to the -ftime-trace profiler introduced in #4873, even though we never set this command-line parameter and never initialize the function, and the non-null pointer value it reads from TimeTraceProfilerInstance is similarly garbage, hence the segfault.
We dlopen("libdxcompiler.so", RTLD_LAZY | RTLD_LOCAL) multiple times in our process (resulting in the same mapped library!), and access the library from many threads at once, and that has already been causing concurrency problems with statics such as #4818.
Backtrace
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00007fc54ad0a0a8 in llvm::Entry::Entry (this=0x4864fe8941000000) at DirectXShaderCompiler/lib/Support/TimeProfiler.cpp:56
56 struct Entry {
[Current thread is 1 (Thread 0x7fc65cfff6c0 (LWP 178339))]
(gdb)
(gdb) bt
#0 0x00007fc54ad0a0a8 in llvm::Entry::Entry (this=0x4864fe8941000000) at DirectXShaderCompiler/lib/Support/TimeProfiler.cpp:56
#1 std::__new_allocator<llvm::Entry>::construct<llvm::Entry, llvm::Entry> (this=0x7fc54ad08ba0 <SignalHandler(int)>, __p=0x4864fe8941000000, __args=...) at /usr/bin/../lib64/gcc/x86_64-pc-linux-gnu/13.1.1/../../../../include/c++/13.1.1/bits/new_allocator.h:187
#2 std::allocator_traits<std::allocator<llvm::Entry> >::construct<llvm::Entry, llvm::Entry> (__a=..., __p=0x4864fe8941000000, __args=...) at /usr/bin/../lib64/gcc/x86_64-pc-linux-gnu/13.1.1/../../../../include/c++/13.1.1/bits/alloc_traits.h:537
#3 std::vector<llvm::Entry, std::allocator<llvm::Entry> >::emplace_back<llvm::Entry> (this=0x7fc54ad08ba0 <SignalHandler(int)>, __args=...) at /usr/bin/../lib64/gcc/x86_64-pc-linux-gnu/13.1.1/../../../../include/c++/13.1.1/bits/vector.tcc:117
#4 std::vector<llvm::Entry, std::allocator<llvm::Entry> >::push_back (this=0x7fc54ad08ba0 <SignalHandler(int)>, __x=...) at /usr/bin/../lib64/gcc/x86_64-pc-linux-gnu/13.1.1/../../../../include/c++/13.1.1/bits/stl_vector.h:1296
#5 llvm::TimeTraceProfiler::begin(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, llvm::function_ref<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > ()>) (
this=this@entry=0x7fc54ad08ba0 <SignalHandler(int)>, Name="Frontend", Detail=...) at DirectXShaderCompiler/lib/Support/TimeProfiler.cpp:72
#6 0x00007fc54ad09f85 in llvm::timeTraceProfilerBegin (Name=..., Detail=...) at DirectXShaderCompiler/lib/Support/TimeProfiler.cpp:172
#7 0x00007fc54b5d78eb in llvm::TimeTraceScope::TimeTraceScope (this=0x7fc65cff6b18, Name=..., Detail=...) at DirectXShaderCompiler/include/llvm/Support/TimeProfiler.h:58
#8 clang::ParseAST (S=..., PrintStats=false, SkipFunctionBodies=false) at DirectXShaderCompiler/tools/clang/lib/Parse/ParseAST.cpp:104
#9 0x00007fc54ad6aeb8 in clang::CodeGenAction::ExecuteAction (this=0x7fc65cff7148) at DirectXShaderCompiler/tools/clang/lib/CodeGen/CodeGenAction.cpp:807
#10 0x00007fc54aed10ae in clang::FrontendAction::Execute (this=0x7fc65cff7148) at DirectXShaderCompiler/tools/clang/lib/Frontend/FrontendAction.cpp:455
#11 0x00007fc54a929a7a in DxcCompiler::Compile (this=this@entry=0x7fc648973920, pSource=pSource@entry=0x7fc65cff7db0, pArguments=pArguments@entry=0x564f5b60b900, argCount=20, pIncludeHandler=0x7fc648838b20, riid=..., ppResult=0x7fc65cff7d60)
at DirectXShaderCompiler/tools/clang/tools/dxcompiler/dxcompilerobj.cpp:1061
#12 0x00007fc54a924afb in hlsl::DxcCompilerAdapter::WrapCompile (this=<optimized out>, bPreprocess=<optimized out>, pSource=<optimized out>, pSourceName=<optimized out>, pEntryPoint=<optimized out>, pTargetProfile=0x7fc648985610 L"cs_6_6", pArguments=<optimized out>,
argCount=<optimized out>, pDefines=<optimized out>, defineCount=<optimized out>, pIncludeHandler=<optimized out>, ppResult=<optimized out>, ppDebugBlobName=<optimized out>, ppDebugBlob=<optimized out>)
at DirectXShaderCompiler/tools/clang/tools/dxcompiler/dxcompilerobj.cpp:1869
#13 0x00007fc54a925b2f in hlsl::DxcCompilerAdapter::CompileWithDebug (this=0x7fc65cff6a20, pSource=0x8, pSourceName=0x7fc65cff6a30 L"\x48979600翆\x48978b80翆\x222b4300\xdc25a39c\b", pEntryPoint=0x0, pTargetProfile=0x1 <error: Cannot access memory at address 0x1>,
pArguments=0x7fc65cff6a20, argCount=<optimized out>, pDefines=<optimized out>, defineCount=<optimized out>, pIncludeHandler=<optimized out>, ppResult=<optimized out>, ppDebugBlobName=<optimized out>, ppDebugBlob=<optimized out>)
at DirectXShaderCompiler/tools/clang/tools/dxcompiler/dxcompilerobj.cpp:1773
#14 0x00007fc54a9264b8 in hlsl::DxcCompilerAdapter::Compile (this=0x7fc65cff6a20, pSource=0x7fc65cff6a90, pSourceName=0x8 <error: Cannot access memory at address 0x8>, pEntryPoint=0x7fc65cff6a30 L"\x48979600翆\x48978b80翆\x222b4300\xdc25a39c\b", pTargetProfile=0x0,
pArguments=0x1, argCount=11, pDefines=0x7fc55ce19830, defineCount=2, pIncludeHandler=0x7fc648838b20, ppResult=0x7fc65cff8970) at DirectXShaderCompiler/tools/clang/tools/dxcompiler/dxcompileradapter.h:69
#15 0x0000564f58ae170e in hassle_rs::wrapper::DxcCompiler::compile (self=<optimized out>, blob=<optimized out>, source_name=..., entry_point=..., target_profile=..., args=..., include_handler=..., defines=...) at src/wrapper.rs:271
Notice how the above stacktrace says the this pointer for TimeTraceProfiler (acquired from TimeTraceProfilerInstance) is suddenly a SignalHandler<int> object. Printing out and comparing the pointers shows that TimeTraceProfilerInstance sits directly after RegisteredSignalInfo in memory:
RegisteredSignalInfo is an array of 16 elements, yet when planting some printfs in RegisterHandler()NumRegisteredSignals runs up to 32 or 48 instead of 16, concurrently from two/three threads. This is not caught by the assert() because those are disabled in release builds (and causing other issues I'll file shortly). It works most (±60%) of the time, likely because one of the threads gets to updating NumRegisteredSignals quick enough before any other checks for 0:
// If the handlers are already registered, we're done.
if (NumRegisteredSignals != 0) return;
Presumably this needs to be guarded with a lock or call_once? Additionally the new time profiler pointer might have to be guarded by a lock or call_once as well.
I'm curious why this wasn't surfacing for us earlier: are the handlers registered differently/more often now? Signals.inc hasn't changed for a very long time, and perhaps RegisteredSignalInfo was previously followed by more-or-less irrelevant static data members?
(We do have a longstanding issue where ctrl+c'ing our application after it compiled lots of shaders runs into a jmp to an invalid (or null) memory address from a signal handler, but have never figured out where it was coming from: that could be related to this.)
And how about enabling asserts in release builds? That would have preempted lots of digging as I was originally dismissing this assert as irrelevant "debug" noise (🤦) while debugging on a release build to match/repro our CI environment.
The text was updated successfully, but these errors were encountered:
We recently upgraded our DXC build from 0392e60 to ea3623f, and are now observing spurious SEGFAULTs. Backtraces lead to the
-ftime-trace
profiler introduced in #4873, even though we never set this command-line parameter and never initialize the function, and the non-null pointer value it reads fromTimeTraceProfilerInstance
is similarly garbage, hence the segfault.We
dlopen("libdxcompiler.so", RTLD_LAZY | RTLD_LOCAL)
multiple times in our process (resulting in the same mapped library!), and access the library from many threads at once, and that has already been causing concurrency problems withstatic
s such as #4818.Backtrace
Notice how the above stacktrace says the
this
pointer forTimeTraceProfiler
(acquired fromTimeTraceProfilerInstance
) is suddenly aSignalHandler<int>
object. Printing out and comparing the pointers shows thatTimeTraceProfilerInstance
sits directly afterRegisteredSignalInfo
in memory:RegisteredSignalInfo
is an array of 16 elements, yet when planting someprintf
s inRegisterHandler()
NumRegisteredSignals
runs up to 32 or 48 instead of 16, concurrently from two/three threads. This is not caught by theassert()
because those are disabled in release builds (and causing other issues I'll file shortly). It works most (±60%) of the time, likely because one of the threads gets to updatingNumRegisteredSignals
quick enough before any other checks for 0:DirectXShaderCompiler/lib/Support/Unix/Signals.inc
Lines 121 to 122 in ea3623f
Presumably this needs to be guarded with a lock or
call_once
? Additionally the new time profiler pointer might have to be guarded by a lock orcall_once
as well.I'm curious why this wasn't surfacing for us earlier: are the handlers registered differently/more often now?
Signals.inc
hasn't changed for a very long time, and perhapsRegisteredSignalInfo
was previously followed by more-or-less irrelevantstatic
data members?(We do have a longstanding issue where ctrl+c'ing our application after it compiled lots of shaders runs into a
jmp
to an invalid (or null) memory address from a signal handler, but have never figured out where it was coming from: that could be related to this.)And how about enabling
assert
s in release builds? That would have preempted lots of digging as I was originally dismissing this assert as irrelevant "debug" noise (🤦) while debugging on a release build to match/repro our CI environment.The text was updated successfully, but these errors were encountered: