[Clang] Add timeout for GPU detection utilities (#94751)
Summary:
The utilities `nvptx-arch` and `amdgpu-arch` are used to support
`--offload-arch=native` among other utilities in clang. However, these
rely on the GPU drivers to query the features. In certain cases these
drivers can become locked up, which will lead to indefinite hangs on any
compiler jobs running in the meantime.

This patch adds a ten-second timeout for these utilities; when it
expires, the job is killed and an error is reported.
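
The timeout behaviour described above can be illustrated with a small standalone sketch. This is not the code from the patch, which simply forwards `SecondsToWait` to `llvm::sys::ExecuteAndWait`; it is a POSIX-only approximation of the same idea, and the helper name `runWithTimeout`, the 10 ms polling interval, and the choice of `SIGKILL` are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cerrno>
#include <chrono>
#include <optional>
#include <string>
#include <thread>
#include <vector>

#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper (not an LLVM API): run Argv[0] with the given arguments
// and wait at most SecondsToWait seconds for it to exit. Zero means "wait
// forever", matching the default value used in the patch.
// Returns the exit code, or std::nullopt if the child could not be forked,
// was killed by a signal, or was killed here because it ran out of time.
std::optional<int> runWithTimeout(const std::vector<std::string> &Argv,
                                  unsigned SecondsToWait) {
  assert(!Argv.empty() && "need at least a program name");

  // Build the null-terminated argument array that execvp expects.
  std::vector<char *> Args;
  for (const std::string &Arg : Argv)
    Args.push_back(const_cast<char *>(Arg.c_str()));
  Args.push_back(nullptr);

  pid_t Child = fork();
  if (Child < 0)
    return std::nullopt;
  if (Child == 0) {
    execvp(Args[0], Args.data());
    _exit(127); // Only reached if exec failed.
  }

  using Clock = std::chrono::steady_clock;
  const auto Deadline = Clock::now() + std::chrono::seconds(SecondsToWait);
  int Status = 0;
  for (;;) {
    // Block when no timeout was requested; otherwise poll without blocking.
    pid_t Done = waitpid(Child, &Status, SecondsToWait == 0 ? 0 : WNOHANG);
    if (Done == Child)
      break;
    if (Done < 0 && errno != EINTR)
      return std::nullopt;
    if (SecondsToWait != 0 && Clock::now() >= Deadline) {
      kill(Child, SIGKILL);       // A hung child will not exit on its own.
      waitpid(Child, &Status, 0); // Reap it so no zombie is left behind.
      return std::nullopt;
    }
    if (SecondsToWait != 0)
      std::this_thread::sleep_for(std::chrono::milliseconds(10));
  }
  if (WIFEXITED(Status))
    return WEXITSTATUS(Status);
  return std::nullopt;
}
```

Under these assumptions, a call such as `runWithTimeout({"amdgpu-arch"}, 10)` would play roughly the role of `executeToolChainProgram(Program, /*SecondsToWait=*/10)` in the diff below: a query that finishes in time yields its exit code, while one stuck on a locked-up driver is killed after ten seconds instead of blocking the compile job.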
jhuber6 authored Jun 7, 2024
1 parent c5fcc2e commit 2981f3a
Showing 4 changed files with 8 additions and 7 deletions.
3 changes: 2 additions & 1 deletion clang/include/clang/Driver/ToolChain.h
@@ -205,7 +205,8 @@ class ToolChain {

   /// Executes the given \p Executable and returns the stdout.
   llvm::Expected<std::unique_ptr<llvm::MemoryBuffer>>
-  executeToolChainProgram(StringRef Executable) const;
+  executeToolChainProgram(StringRef Executable,
+                          unsigned SecondsToWait = 0) const;

   void setTripleEnvironment(llvm::Triple::EnvironmentType Env);

8 changes: 4 additions & 4 deletions clang/lib/Driver/ToolChain.cpp
@@ -104,7 +104,8 @@ ToolChain::ToolChain(const Driver &D, const llvm::Triple &T,
 }

 llvm::Expected<std::unique_ptr<llvm::MemoryBuffer>>
-ToolChain::executeToolChainProgram(StringRef Executable) const {
+ToolChain::executeToolChainProgram(StringRef Executable,
+                                   unsigned SecondsToWait) const {
   llvm::SmallString<64> OutputFile;
   llvm::sys::fs::createTemporaryFile("toolchain-program", "txt", OutputFile);
   llvm::FileRemover OutputRemover(OutputFile.c_str());
@@ -115,9 +116,8 @@ ToolChain::executeToolChainProgram(StringRef Executable) const {
   };

   std::string ErrorMessage;
-  if (llvm::sys::ExecuteAndWait(Executable, {}, {}, Redirects,
-                                /* SecondsToWait */ 0,
-                                /*MemoryLimit*/ 0, &ErrorMessage))
+  if (llvm::sys::ExecuteAndWait(Executable, {}, {}, Redirects, SecondsToWait,
+                                /*MemoryLimit=*/0, &ErrorMessage))
     return llvm::createStringError(std::error_code(),
                                    Executable + ": " + ErrorMessage);

2 changes: 1 addition & 1 deletion clang/lib/Driver/ToolChains/AMDGPU.cpp
@@ -877,7 +877,7 @@ AMDGPUToolChain::getSystemGPUArchs(const ArgList &Args) const {
   else
     Program = GetProgramPath("amdgpu-arch");

-  auto StdoutOrErr = executeToolChainProgram(Program);
+  auto StdoutOrErr = executeToolChainProgram(Program, /*SecondsToWait=*/10);
   if (!StdoutOrErr)
     return StdoutOrErr.takeError();

2 changes: 1 addition & 1 deletion clang/lib/Driver/ToolChains/Cuda.cpp
@@ -826,7 +826,7 @@ NVPTXToolChain::getSystemGPUArchs(const ArgList &Args) const {
   else
     Program = GetProgramPath("nvptx-arch");

-  auto StdoutOrErr = executeToolChainProgram(Program);
+  auto StdoutOrErr = executeToolChainProgram(Program, /*SecondsToWait=*/10);
   if (!StdoutOrErr)
     return StdoutOrErr.takeError();

