Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[New offload driver][llvm-link][SYCL] Use LTO instead of llvm-link to link device input bitcodes #13395

Draft
wants to merge 1 commit into
base: sycl
Choose a base branch
from
Draft
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions buildbot/configure.py
Original file line number Diff line number Diff line change
Expand Up @@ -40,6 +40,7 @@ def do_configure(args):
libdevice_dir = os.path.join(abs_src_dir, "libdevice")
fusion_dir = os.path.join(abs_src_dir, "sycl-fusion")
llvm_targets_to_build = args.host_target
llvm_experimental_targets_to_build = 'SPIRV'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i wonder if @bader has an opinion on the effect on CI/other users since basically everyone except this team isn't going to use this so its wasted time/cycles for them

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer SPIR-V backend to be enabled by the option, which is disabled by default to avoid the impact on other DPC++ compiler developers.
I'm okay with turning on this option in CI if some reasonable number of tests are passing already and we want to avoid regressions in these tests.

To be honest, I expected much more changes to make LTO framework usable for SYCL compilation flow. Specifically replacing SPIR-V translator with SPIR-V back-end should introduce some number of .

@VyacheslavLevytskyy, will it be useful for you if we enable built of SPIR-V target in our CI (e.g. gather SYCL tests pass rate on additional platforms)? NOTE: sycl branch is usually behind main branch (few days, but can be weeks).

Overall, SPIR-V backend today is relatively small, so the overhead it not significant.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sarnex and @bader

Thanks so much for this discussion. One pointer is that I intend to use SPIR-V backend only for LTO with early exit condition, where we emit linked .bc file and exit early. I do not intend to use the SPIR-V backend for LLVM to SPIR-V translation at this stage.

Thanks
Sincerely

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bader Indeed, it'd be helpful to gather info about other environments easier, even with a delay of days/weeks. This move would make sense one day, but only on condition that we introduced a couple of minor changes into the translation chain under a new option. These would include (1) diagnostics from spirv-val about produced SPIR-V output, and (2) verification that llvm-spirv -r is able to produce LLVM IR from the SPIR-V output. Until this is changed all intermediate crashes are hidden, and we see only final numbers of fail/success cases, without any tips about possible causes of fails.

llvm_enable_projects = 'clang;' + llvm_external_projects
libclc_targets_to_build = ''
libclc_gen_remangled_variants = 'OFF'
Expand Down Expand Up @@ -154,6 +155,7 @@ def do_configure(args):
"-DCMAKE_BUILD_TYPE={}".format(args.build_type),
"-DLLVM_ENABLE_ASSERTIONS={}".format(llvm_enable_assertions),
"-DLLVM_TARGETS_TO_BUILD={}".format(llvm_targets_to_build),
"-DLLVM_EXPERIMENTAL_TARGETS_TO_BUILD={}".format(llvm_experimental_targets_to_build),
"-DLLVM_EXTERNAL_PROJECTS={}".format(llvm_external_projects),
"-DLLVM_EXTERNAL_SYCL_SOURCE_DIR={}".format(sycl_dir),
"-DLLVM_EXTERNAL_LLVM_SPIRV_SOURCE_DIR={}".format(spirv_dir),
Expand Down
5 changes: 2 additions & 3 deletions clang/test/Driver/linker-wrapper-sycl.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -2,10 +2,9 @@

/// Check for list of commands for standalone clang-linker-wrapper run for sycl
// RUN: clang-linker-wrapper -sycl-device-library-location=%S/Inputs -sycl-device-libraries=libsycl-crt.o,libsycl-complex.o -sycl-post-link-options="SYCL_POST_LINK_OPTIONS" -llvm-spirv-options="LLVM_SPIRV_OPTIONS" "--host-triple=x86_64-unknown-linux-gnu" "--triple=spir64" "--linker-path=/usr/bin/ld" "--" HOST_LINKER_FLAGS "-dynamic-linker" HOST_DYN_LIB "-o" "a.out" HOST_LIB_PATH HOST_STAT_LIB %S/Inputs/test-sycl.o --dry-run 2>&1 | FileCheck -check-prefix=CHK-CMDS %s
// CHK-CMDS: "{{.*}}llvm-link" [[INPUT:.*]].bc -o [[FIRSTLLVMLINKOUT:.*]].bc --suppress-warnings
// CHK-CMDS-NEXT: "{{.*}}clang-offload-bundler" -type=o -targets=sycl-spir64-unknown-unknown -input={{.*}}libsycl-crt.o -output=[[FIRSTUNBUNDLEDLIB:.*]].bc -unbundle -allow-missing-bundles
// CHK-CMDS: "{{.*}}clang-offload-bundler" -type=o -targets=sycl-spir64-unknown-unknown -input={{.*}}libsycl-crt.o -output=[[FIRSTUNBUNDLEDLIB:.*]].bc -unbundle -allow-missing-bundles
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like some tests are failing

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like we need some upstream changes for SPIRV backend to be registered for both spir and spirv target triples. I have talked about this to the SPIR-V backend team and this is in their TODO list. I will update this PR with latest information.

Thanks

// CHK-CMDS-NEXT: "{{.*}}clang-offload-bundler" -type=o -targets=sycl-spir64-unknown-unknown -input={{.*}}libsycl-complex.o -output=[[SECONDUNBUNDLEDLIB:.*]].bc -unbundle -allow-missing-bundles
// CHK-CMDS-NEXT: "{{.*}}llvm-link" -only-needed [[FIRSTLLVMLINKOUT]].bc [[FIRSTUNBUNDLEDLIB]].bc [[SECONDUNBUNDLEDLIB]].bc -o [[SECONDLLVMLINKOUT:.*]].bc --suppress-warnings
// CHK-CMDS-NEXT: "{{.*}}llvm-link" -only-needed {{.*}}.bc [[FIRSTUNBUNDLEDLIB]].bc [[SECONDUNBUNDLEDLIB]].bc -o [[SECONDLLVMLINKOUT:.*]].bc --suppress-warnings
// CHK-CMDS-NEXT: "{{.*}}sycl-post-link" SYCL_POST_LINK_OPTIONS -o [[SYCLPOSTLINKOUT:.*]].table [[SECONDLLVMLINKOUT]].bc
// LLVM-SPIRV is not called in dry-run
// CHK-CMDS-NEXT: offload-wrapper: input: [[LLVMSPIRVOUT:.*]].table, output: [[WRAPPEROUT:.*]].bc
Expand Down
36 changes: 12 additions & 24 deletions clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -963,13 +963,6 @@ static Expected<StringRef> linkDevice(ArrayRef<StringRef> InputFiles,
SmallVector<StringRef, 16> InputFilesVec;
for (StringRef InputFile : InputFiles)
InputFilesVec.emplace_back(InputFile);
// First llvm-link step.
auto LinkedFile = sycl::linkDeviceInputFiles(InputFilesVec, Args);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you test a few basic examples to make sure the results execute successfully on a GPU?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not exactly. Let me do that. I will keep this PR in draft mode till I can get to verify this. Thanks

if (!LinkedFile)
reportError(LinkedFile.takeError());

InputFilesVec.clear();
InputFilesVec.emplace_back(*LinkedFile);

// Get SYCL device library files
// Gathering device library files
Expand Down Expand Up @@ -1278,10 +1271,6 @@ Error linkBitcodeFiles(SmallVectorImpl<OffloadFile> &InputFiles,
const llvm::Triple Triple(Args.getLastArgValue(OPT_triple_EQ));
StringRef Arch = Args.getLastArgValue(OPT_arch_EQ);

// Early exit for SPIR targets
if (Triple.isSPIR())
return Error::success();

SmallVector<OffloadFile, 4> BitcodeInputFiles;
DenseSet<StringRef> StrongResolutions;
DenseSet<StringRef> UsedInRegularObj;
Expand Down Expand Up @@ -1360,14 +1349,13 @@ Error linkBitcodeFiles(SmallVectorImpl<OffloadFile> &InputFiles,
BitcodeOutput.push_back(*TempFileOrErr);
return false;
};

// We assume visibility of the whole program if every input file was bitcode.
auto Features = getTargetFeatures(BitcodeInputFiles);
auto LTOBackend = Args.hasArg(OPT_embed_bitcode) ||
Args.hasArg(OPT_builtin_bitcode_EQ) ||
Args.hasArg(OPT_clang_backend)
? createLTO(Args, Features, OutputBitcode)
: createLTO(Args, Features);
auto LTOBackend =
Args.hasArg(OPT_embed_bitcode) || Args.hasArg(OPT_builtin_bitcode_EQ) ||
Args.hasArg(OPT_clang_backend) || Triple.isSPIROrSPIRV()
? createLTO(Args, Features, OutputBitcode)
: createLTO(Args, Features);

// We need to resolve the symbols so the LTO backend knows which symbols need
// to be kept or can be internalized. This is a simplified symbol resolution
Expand Down Expand Up @@ -1477,10 +1465,11 @@ Error linkBitcodeFiles(SmallVectorImpl<OffloadFile> &InputFiles,

// Append the new inputs to the device linker input. If the user requested an
// internalizing link we need to pass the bitcode to clang.
for (StringRef File :
Args.hasArg(OPT_clang_backend) || Args.hasArg(OPT_builtin_bitcode_EQ)
? BitcodeOutput
: Files)
for (StringRef File : Args.hasArg(OPT_clang_backend) ||
Args.hasArg(OPT_builtin_bitcode_EQ) ||
Triple.isSPIROrSPIRV()
? BitcodeOutput
: Files)
OutputFiles.push_back(File);

return Error::success();
Expand Down Expand Up @@ -1770,7 +1759,6 @@ Expected<SmallVector<StringRef>> linkAndWrapDeviceFiles(
SmallVector<StringRef> InputFiles;
if (Error Err = linkBitcodeFiles(Input, InputFiles, LinkerArgs))
return Err;

// Write any remaining device inputs to an output file for the linker.
for (const OffloadFile &File : Input) {
auto FileNameOrErr = writeOffloadFile(File);
Expand All @@ -1780,8 +1768,8 @@ Expected<SmallVector<StringRef>> linkAndWrapDeviceFiles(
}

if (HasSYCLOffloadKind) {
// Link the remaining device files using the device linker for SYCL
// offload.
// Link in the remaining device library files using the device linker for
// SYCL offload.
auto TmpOutputOrErr = sycl::linkDevice(InputFiles, LinkerArgs);
if (!TmpOutputOrErr)
return TmpOutputOrErr.takeError();
Expand Down
Loading