[SYCL] Move SYCL Module Splitting to library. Part 2 #13282

Merged: 8 commits, Jun 7, 2024
30 changes: 30 additions & 0 deletions llvm/include/llvm/SYCLLowerIR/ModuleSplitter.h
@@ -19,6 +19,7 @@
#include "llvm/ADT/StringRef.h"
#include "llvm/IR/Function.h"
#include "llvm/Support/Error.h"
#include "llvm/Support/PropertySetIO.h"

#include <memory>
#include <string>
@@ -196,6 +197,8 @@ class ModuleDesc {

ModuleDesc clone() const;

std::string makeSymbolTable() const;

const SYCLDeviceRequirements &getOrComputeDeviceRequirements() const {
if (!Reqs.has_value())
Reqs = computeDeviceRequirements(*this);
@@ -270,6 +273,33 @@ void dumpEntryPoints(const Module &M, bool OnlyKernelsAreEntryPoints = false,
const char *msg = "", int Tab = 0);
#endif // NDEBUG

struct SplitModule {
Contributor:
Why do we need this new data structure? Can we not continue to use IrPropSymFilenameTriple?

Thanks

Contributor Author:
This is possible.

std::string ModuleFilePath;
util::PropertySetRegistry Properties;
std::string Symbols;

SplitModule() = default;
SplitModule(const SplitModule &) = default;
SplitModule &operator=(const SplitModule &) = default;
SplitModule(SplitModule &&) = default;
SplitModule &operator=(SplitModule &&) = default;

SplitModule(std::string_view File, util::PropertySetRegistry Properties,
std::string Symbols)
: ModuleFilePath(File), Properties(std::move(Properties)),
Symbols(std::move(Symbols)) {}
};

struct ModuleSplitterSettings {
asudarsa marked this conversation as resolved.
IRSplitMode Mode;
bool OutputAssembly = false; // Bitcode or LLVM IR.
StringRef OutputPrefix;
};

/// Splits the given module \p M according to the given \p Settings.
Expected<std::vector<SplitModule>>
splitSYCLModule(std::unique_ptr<Module> M, ModuleSplitterSettings Settings);

} // namespace module_split

} // namespace llvm
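
The `SplitModule` record above takes its `Properties` and `Symbols` arguments by value and moves them into the members. The following stdlib-only sketch illustrates that pattern under stated assumptions: `SplitModuleSketch`, `PropertyMap` and `makeExample` are hypothetical names introduced here, and `PropertyMap` is only a stand-in for `util::PropertySetRegistry`, not the LLVM type.

```cpp
#include <map>
#include <string>
#include <string_view>
#include <utility>

// Hypothetical stand-in for util::PropertySetRegistry (an assumption made
// for this sketch; the real type lives in llvm/Support/PropertySetIO.h).
using PropertyMap = std::map<std::string, std::string>;

struct SplitModuleSketch {
  std::string ModuleFilePath;
  PropertyMap Properties;
  std::string Symbols;

  SplitModuleSketch() = default;

  // Sink parameters are taken by value and moved into the members, so a
  // caller that passes temporaries or std::move'd objects avoids a deep copy.
  SplitModuleSketch(std::string_view File, PropertyMap Props, std::string Syms)
      : ModuleFilePath(File), Properties(std::move(Props)),
        Symbols(std::move(Syms)) {}
};

// Builds one record the way a caller might after writing a split to disk.
// The file name and property values are made up for illustration.
SplitModuleSketch makeExample() {
  PropertyMap Props{{"example-key", "example-value"}};
  std::string Syms = "kernelA\nkernelB\n";
  return SplitModuleSketch("out_0.ll", std::move(Props), std::move(Syms));
}
```

Calling `makeExample()` yields a record whose `ModuleFilePath` is `out_0.ll`; the moved-from `Props` and `Syms` locals are not used again.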
1 change: 1 addition & 0 deletions llvm/lib/SYCLLowerIR/CMakeLists.txt
@@ -92,6 +92,7 @@ add_llvm_component_library(LLVMSYCLLowerIR
LINK_COMPONENTS
Analysis
Core
Passes
Support
ipo
)
70 changes: 70 additions & 0 deletions llvm/lib/SYCLLowerIR/ModuleSplitter.cpp
@@ -12,16 +12,20 @@
#include "llvm/ADT/SetVector.h"
#include "llvm/ADT/SmallPtrSet.h"
#include "llvm/ADT/StringExtras.h"
#include "llvm/Bitcode/BitcodeWriterPass.h"
#include "llvm/IR/Constants.h"
#include "llvm/IR/Function.h"
#include "llvm/IR/InstIterator.h"
#include "llvm/IR/Instructions.h"
#include "llvm/IR/LegacyPassManager.h"
#include "llvm/IR/Module.h"
#include "llvm/IRPrinter/IRPrintingPasses.h"
#include "llvm/Passes/PassBuilder.h"
#include "llvm/SYCLLowerIR/DeviceGlobals.h"
#include "llvm/SYCLLowerIR/LowerInvokeSimd.h"
#include "llvm/SYCLLowerIR/SYCLUtils.h"
#include "llvm/Support/Error.h"
#include "llvm/Support/FileSystem.h"
#include "llvm/Transforms/IPO.h"
#include "llvm/Transforms/IPO/GlobalDCE.h"
#include "llvm/Transforms/IPO/StripDeadPrototypes.h"
@@ -733,6 +737,14 @@ void EntryPointGroup::rebuild(const Module &M) {
Functions.insert(const_cast<Function *>(&F));
}

std::string ModuleDesc::makeSymbolTable() const {
std::string ST;
Contributor:
llvm::SmallString would be a better fit to reduce the number of reallocations.

Contributor Author:
The total size of the string is rarely small enough to benefit from small-string optimization; C++ mangled names are very long. What size would you suggest for SmallString in this case?

Contributor:
The point is not that the string would fit into a pre-allocated area on the stack. SmallString is a wrapper around SmallVector, which does not fully reallocate on every += operation, because its capacity grows in larger steps, similar to std::vector's push_back.

for (const Function *F : EntryPoints.Functions)
ST += (Twine(F->getName()) + "\n").str();

return ST;
}
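
The review thread on `makeSymbolTable` is about how often the buffer is reallocated while names are appended. One way to make that independent of any container's growth policy is to compute the final length first and reserve it once. The sketch below is not what the PR does; it is a stdlib-only illustration, and `makeSymbolTableSketch` plus its `std::vector<std::string_view>` input are assumptions standing in for the loop over `EntryPoints.Functions`.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper: joins entry-point names into one string, with each
// name followed by a newline, as the symbol table above does.
std::string makeSymbolTableSketch(const std::vector<std::string_view> &Names) {
  // First pass: total size is every name plus its trailing '\n'.
  std::size_t Total = 0;
  for (std::string_view N : Names)
    Total += N.size() + 1;

  std::string ST;
  ST.reserve(Total); // one allocation up front instead of growth on demand

  // Second pass: append in place, without building a temporary per name.
  for (std::string_view N : Names) {
    ST.append(N);
    ST.push_back('\n');
  }
  return ST;
}
```

For example, `makeSymbolTableSketch({"kernelA", "kernelB"})` produces the two names on separate lines, each terminated by a newline.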

namespace {
// This is a helper class that groups/categorizes functions based on
// provided rules. It is intended to be used in device code split
@@ -1143,5 +1155,63 @@ SmallVector<ModuleDesc, 2> splitByESIMD(ModuleDesc &&MD,
return Result;
}

static Error saveModuleIRInFile(Module &M, StringRef FilePath,
bool OutputAssembly) {
int FD = -1;
if (std::error_code EC = sys::fs::openFileForWrite(FilePath, FD))
return errorCodeToError(EC);

raw_fd_ostream OS(FD, true);
ModulePassManager MPM;
ModuleAnalysisManager MAM;
PassBuilder PB;
PB.registerModuleAnalyses(MAM);
if (OutputAssembly)
MPM.addPass(PrintModulePass(OS));
else
MPM.addPass(BitcodeWriterPass(OS));

MPM.run(M, MAM);
return Error::success();
}

static Expected<SplitModule> saveModuleDesc(ModuleDesc &MD, std::string Prefix,
bool OutputAssembly) {
SplitModule SM;
Prefix += OutputAssembly ? ".ll" : ".bc";
Error E = saveModuleIRInFile(MD.getModule(), Prefix, OutputAssembly);
if (E)
return E;

SM.ModuleFilePath = Prefix;
SM.Symbols = MD.makeSymbolTable();
return SM;
}

Expected<std::vector<SplitModule>>
splitSYCLModule(std::unique_ptr<Module> M, ModuleSplitterSettings Settings) {
ModuleDesc MD = std::move(M); // makeModuleDesc() ?
// FIXME: false arguments are temporary for now.
auto Splitter =
getDeviceCodeSplitter(std::move(MD), Settings.Mode, false, false);
size_t ID = 0;
std::vector<SplitModule> OutputImages;
while (Splitter->hasMoreSplits()) {
ModuleDesc MD2 = Splitter->nextSplit();
MD2.fixupLinkageOfDirectInvokeSimdTargets();

std::string OutIRFileName = (Settings.OutputPrefix + "_" + Twine(ID)).str();
auto SplittedImageOrErr =
saveModuleDesc(MD2, OutIRFileName, Settings.OutputAssembly);
if (!SplittedImageOrErr)
return SplittedImageOrErr.takeError();

OutputImages.emplace_back(std::move(*SplittedImageOrErr));
++ID;
}

return OutputImages;
}

} // namespace module_split
} // namespace llvm
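
The naming scheme used by `splitSYCLModule` and `saveModuleDesc` above is `<OutputPrefix>_<ID>` followed by `.ll` for textual IR or `.bc` for bitcode, with `ID` counting up from 0. This is what the `%t2_0.ll`, `%t2_1.ll` paths in the tests rely on. A minimal stdlib-only sketch of that scheme follows; `makeSplitFileName` and `expectedSplitFiles` are hypothetical helpers written for illustration and do not exist in the library.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper mirroring the naming above:
// "<Prefix>_<ID>" plus ".ll" (textual IR) or ".bc" (bitcode).
std::string makeSplitFileName(const std::string &Prefix, std::size_t ID,
                              bool OutputAssembly) {
  return Prefix + "_" + std::to_string(ID) + (OutputAssembly ? ".ll" : ".bc");
}

// Lists the file names a run producing NumSplits modules would write,
// in the same order in which the IDs are assigned (0, 1, 2, ...).
std::vector<std::string> expectedSplitFiles(const std::string &Prefix,
                                            std::size_t NumSplits,
                                            bool OutputAssembly) {
  std::vector<std::string> Files;
  Files.reserve(NumSplits);
  for (std::size_t ID = 0; ID < NumSplits; ++ID)
    Files.push_back(makeSplitFileName(Prefix, ID, OutputAssembly));
  return Files;
}
```

With a prefix of `t2`, three splits and assembly output, this yields `t2_0.ll`, `t2_1.ll` and `t2_2.ll`.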
1 change: 1 addition & 0 deletions llvm/test/CMakeLists.txt
@@ -146,6 +146,7 @@ set(LLVM_TEST_DEPENDS
sanstats
spirv-to-ir-wrapper
sycl-post-link
sycl-module-split
split-file
verify-uselistorder
yaml-bench
1 change: 1 addition & 0 deletions llvm/test/lit.cfg.py
@@ -244,6 +244,7 @@ def get_asan_rtlib():
"sanstats",
"llvm-remarkutil",
"spirv-to-ir-wrapper",
"sycl-module-split",
]
)

@@ -5,6 +5,13 @@
; RUN: FileCheck %s -input-file=%t_0.sym --check-prefixes CHECK-TU0-TXT
; RUN: FileCheck %s -input-file=%t_1.sym --check-prefixes CHECK-TU1-TXT

; RUN: sycl-module-split -split=auto -S < %s -o %t2
; By default auto mode is equal to source mode
; RUN: FileCheck %s -input-file=%t2_0.ll --check-prefixes CHECK-TU0,CHECK
; RUN: FileCheck %s -input-file=%t2_1.ll --check-prefixes CHECK-TU1,CHECK
; RUN: FileCheck %s -input-file=%t2_0.sym --check-prefixes CHECK-TU0-TXT
; RUN: FileCheck %s -input-file=%t2_1.sym --check-prefixes CHECK-TU1-TXT

target datalayout = "e-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024"
target triple = "spir64-unknown-linux"

@@ -10,6 +10,12 @@
; RUN: FileCheck %s -input-file=%t_0.sym --check-prefixes CHECK-TU0-TXT
; RUN: FileCheck %s -input-file=%t_1.sym --check-prefixes CHECK-TU1-TXT

; RUN: sycl-module-split -split=auto -S < %s -o %t2
; RUN: FileCheck %s -input-file=%t2_0.ll --check-prefixes CHECK-TU0,CHECK
; RUN: FileCheck %s -input-file=%t2_1.ll --check-prefixes CHECK-TU1,CHECK
; RUN: FileCheck %s -input-file=%t2_0.sym --check-prefixes CHECK-TU0-TXT
; RUN: FileCheck %s -input-file=%t2_1.sym --check-prefixes CHECK-TU1-TXT

target datalayout = "e-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024"
target triple = "spir64-unknown-linux"

@@ -14,6 +14,18 @@
; RUN: FileCheck %s -input-file=%t_0.sym --check-prefixes CHECK-TU0-SYM
; RUN: FileCheck %s -input-file=%t_1.sym --check-prefixes CHECK-TU1-SYM
;
;
; RUN: sycl-module-split -split=auto -S < %s -o %t2
;
; RUN: FileCheck %s -input-file=%t2_0.ll --check-prefixes CHECK-TU0-IR \
; RUN: --implicit-check-not TU0_kernel --implicit-check-not _Z3foov \
; RUN: --implicit-check-not _Z4foo3v
; RUN: FileCheck %s -input-file=%t2_1.ll --check-prefixes CHECK-TU1-IR \
; RUN: --implicit-check-not TU1_kernel --implicit-check-not _Z4foo2v \
; RUN: --implicit-check-not _Z4foo1v
; RUN: FileCheck %s -input-file=%t2_0.sym --check-prefixes CHECK-TU0-SYM
; RUN: FileCheck %s -input-file=%t2_1.sym --check-prefixes CHECK-TU1-SYM

; CHECK-TU0-SYM: _ZTSZ4mainE11TU1_kernel0
; CHECK-TU0-SYM: _ZTSZ4mainE11TU1_kernel1
;
@@ -4,6 +4,12 @@
; RUN: FileCheck %s -input-file=%t_0.ll --check-prefix=CHECK-IR0
; RUN: FileCheck %s -input-file=%t_1.ll --check-prefix=CHECK-IR1

; RUN: sycl-module-split -split=auto -S < %s -o %t2
; RUN: FileCheck %s -input-file=%t2_0.sym --check-prefix=CHECK-SYM0
; RUN: FileCheck %s -input-file=%t2_1.sym --check-prefix=CHECK-SYM1
; RUN: FileCheck %s -input-file=%t2_0.ll --check-prefix=CHECK-IR0
; RUN: FileCheck %s -input-file=%t2_1.ll --check-prefix=CHECK-IR1

; This test checks that we can properly perform device code split by tracking
; all uses of functions (not only direct calls)

@@ -3,6 +3,12 @@
; RUN: FileCheck %s -input-file=%t_1.ll --check-prefixes CHECK-TU1,CHECK
; RUN: FileCheck %s -input-file=%t_0.sym --check-prefixes CHECK-TU0-TXT
; RUN: FileCheck %s -input-file=%t_1.sym --check-prefixes CHECK-TU1-TXT

; RUN: sycl-module-split -split=source -S < %s -o %t2
; RUN: FileCheck %s -input-file=%t2_0.ll --check-prefixes CHECK-TU0,CHECK
; RUN: FileCheck %s -input-file=%t2_1.ll --check-prefixes CHECK-TU1,CHECK
; RUN: FileCheck %s -input-file=%t2_0.sym --check-prefixes CHECK-TU0-TXT
; RUN: FileCheck %s -input-file=%t2_1.sym --check-prefixes CHECK-TU1-TXT
; ModuleID = 'basic-module-split.ll'
source_filename = "basic-module-split.ll"
target datalayout = "e-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024"
@@ -12,6 +12,17 @@
; RUN: --implicit-check-not @BAZ --implicit-check-not @kernel_B \
; RUN: --implicit-check-not @kernel_C
;
; RUN: sycl-module-split -split=auto -S < %s -o %t2
; RUN: FileCheck %s -input-file=%t2_0.ll --check-prefix CHECK0 \
; RUN: --implicit-check-not @foo --implicit-check-not @kernel_A \
; RUN: --implicit-check-not @kernel_B --implicit-check-not @baz
; RUN: FileCheck %s -input-file=%t2_1.ll --check-prefix CHECK1 \
; RUN: --implicit-check-not @kernel_A --implicit-check-not @kernel_C
; RUN: FileCheck %s -input-file=%t2_2.ll --check-prefix CHECK2 \
; RUN: --implicit-check-not @foo --implicit-check-not @bar \
; RUN: --implicit-check-not @BAZ --implicit-check-not @kernel_B \
; RUN: --implicit-check-not @kernel_C
;
; RUN: sycl-post-link -split=source -S < %s -o %t.table
; RUN: FileCheck %s -input-file=%t_0.ll --check-prefix CHECK0 \
; RUN: --implicit-check-not @foo --implicit-check-not @kernel_A \
@@ -23,6 +34,17 @@
; RUN: --implicit-check-not @BAZ --implicit-check-not @kernel_B \
; RUN: --implicit-check-not @kernel_C
;
; RUN: sycl-module-split -split=source -S < %s -o %t2
; RUN: FileCheck %s -input-file=%t2_0.ll --check-prefix CHECK0 \
; RUN: --implicit-check-not @foo --implicit-check-not @kernel_A \
; RUN: --implicit-check-not @kernel_B --implicit-check-not @baz
; RUN: FileCheck %s -input-file=%t2_1.ll --check-prefix CHECK1 \
; RUN: --implicit-check-not @kernel_A --implicit-check-not @kernel_C
; RUN: FileCheck %s -input-file=%t2_2.ll --check-prefix CHECK2 \
; RUN: --implicit-check-not @foo --implicit-check-not @bar \
; RUN: --implicit-check-not @BAZ --implicit-check-not @kernel_B \
; RUN: --implicit-check-not @kernel_C
;
; RUN: sycl-post-link -split=kernel -S < %s -o %t.table
; RUN: FileCheck %s -input-file=%t_0.ll --check-prefix CHECK0 \
; RUN: --implicit-check-not @foo --implicit-check-not @kernel_A \
@@ -33,6 +55,17 @@
; RUN: --implicit-check-not @foo --implicit-check-not @bar \
; RUN: --implicit-check-not @BAZ --implicit-check-not @kernel_B \
; RUN: --implicit-check-not @kernel_C
;
; RUN: sycl-module-split -split=kernel -S < %s -o %t2
; RUN: FileCheck %s -input-file=%t2_0.ll --check-prefix CHECK0 \
; RUN: --implicit-check-not @foo --implicit-check-not @kernel_A \
; RUN: --implicit-check-not @kernel_B --implicit-check-not @baz
; RUN: FileCheck %s -input-file=%t2_1.ll --check-prefix CHECK1 \
; RUN: --implicit-check-not @kernel_A --implicit-check-not @kernel_C
; RUN: FileCheck %s -input-file=%t2_2.ll --check-prefix CHECK2 \
; RUN: --implicit-check-not @foo --implicit-check-not @bar \
; RUN: --implicit-check-not @BAZ --implicit-check-not @kernel_B \
; RUN: --implicit-check-not @kernel_C

; CHECK0-DAG: define spir_kernel void @kernel_C
; CHECK0-DAG: define spir_func i32 @bar
@@ -5,6 +5,15 @@
; RUN: FileCheck %s -input-file=%t.files_1.sym --check-prefixes CHECK-MODULE1-TXT
; RUN: FileCheck %s -input-file=%t.files_2.ll --check-prefixes CHECK-MODULE2,CHECK
; RUN: FileCheck %s -input-file=%t.files_2.sym --check-prefixes CHECK-MODULE2-TXT
;
; RUN: sycl-module-split -split=kernel -S < %s -o %t2.files
; RUN: FileCheck %s -input-file=%t2.files_0.ll --check-prefixes CHECK-MODULE0,CHECK
; RUN: FileCheck %s -input-file=%t2.files_0.sym --check-prefixes CHECK-MODULE0-TXT
; RUN: FileCheck %s -input-file=%t2.files_1.ll --check-prefixes CHECK-MODULE1,CHECK
; RUN: FileCheck %s -input-file=%t2.files_1.sym --check-prefixes CHECK-MODULE1-TXT
; RUN: FileCheck %s -input-file=%t2.files_2.ll --check-prefixes CHECK-MODULE2,CHECK
; RUN: FileCheck %s -input-file=%t2.files_2.sym --check-prefixes CHECK-MODULE2-TXT

; ModuleID = 'one-kernel-per-module.ll'
source_filename = "one-kernel-per-module.ll"
target datalayout = "e-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024"
@@ -21,6 +21,20 @@
; RUN: FileCheck %s -input-file=%t_2.sym --check-prefixes CHECK-M2-SYMS \
; RUN: --implicit-check-not kernel0 --implicit-check-not kernel1

; RUN: sycl-module-split -split=auto -S < %s -o %t2
; RUN: FileCheck %s -input-file=%t2_0.ll --check-prefixes CHECK-M0-IR \
; RUN: --implicit-check-not kernel0 --implicit-check-not kernel1
; RUN: FileCheck %s -input-file=%t2_1.ll --check-prefixes CHECK-M1-IR \
; RUN: --implicit-check-not kernel0 --implicit-check-not kernel1
; RUN: FileCheck %s -input-file=%t2_2.ll --check-prefixes CHECK-M2-IR \
; RUN: --implicit-check-not kernel0 --implicit-check-not kernel1
; RUN: FileCheck %s -input-file=%t2_0.sym --check-prefixes CHECK-M0-SYMS \
; RUN: --implicit-check-not kernel0 --implicit-check-not kernel1
; RUN: FileCheck %s -input-file=%t2_1.sym --check-prefixes CHECK-M1-SYMS \
; RUN: --implicit-check-not kernel0 --implicit-check-not kernel1
; RUN: FileCheck %s -input-file=%t2_2.sym --check-prefixes CHECK-M2-SYMS \
; RUN: --implicit-check-not kernel0 --implicit-check-not kernel1

; RUN: sycl-post-link -split=source -symbols -S < %s -o %t.table
; RUN: FileCheck %s -input-file=%t_0.ll --check-prefixes CHECK-M0-IR \
; RUN: --implicit-check-not kernel0 --implicit-check-not kernel1
@@ -35,6 +49,20 @@
; RUN: FileCheck %s -input-file=%t_2.sym --check-prefixes CHECK-M2-SYMS \
; RUN: --implicit-check-not kernel0 --implicit-check-not kernel1

; RUN: sycl-module-split -split=source -S < %s -o %t2
; RUN: FileCheck %s -input-file=%t2_0.ll --check-prefixes CHECK-M0-IR \
; RUN: --implicit-check-not kernel0 --implicit-check-not kernel1
; RUN: FileCheck %s -input-file=%t2_1.ll --check-prefixes CHECK-M1-IR \
; RUN: --implicit-check-not kernel0 --implicit-check-not kernel1
; RUN: FileCheck %s -input-file=%t2_2.ll --check-prefixes CHECK-M2-IR \
; RUN: --implicit-check-not kernel0 --implicit-check-not kernel1
; RUN: FileCheck %s -input-file=%t2_0.sym --check-prefixes CHECK-M0-SYMS \
; RUN: --implicit-check-not kernel0 --implicit-check-not kernel1
; RUN: FileCheck %s -input-file=%t2_1.sym --check-prefixes CHECK-M1-SYMS \
; RUN: --implicit-check-not kernel0 --implicit-check-not kernel1
; RUN: FileCheck %s -input-file=%t2_2.sym --check-prefixes CHECK-M2-SYMS \
; RUN: --implicit-check-not kernel0 --implicit-check-not kernel1

; RUN: sycl-post-link -split=kernel -symbols -S < %s -o %t.table
; RUN: FileCheck %s -input-file=%t_0.ll --check-prefixes CHECK-M0-IR \
; RUN: --implicit-check-not kernel0 --implicit-check-not kernel1
@@ -49,6 +77,20 @@
; RUN: FileCheck %s -input-file=%t_2.sym --check-prefixes CHECK-M2-SYMS \
; RUN: --implicit-check-not kernel0 --implicit-check-not kernel1

; RUN: sycl-module-split -split=kernel -S < %s -o %t2
; RUN: FileCheck %s -input-file=%t2_0.ll --check-prefixes CHECK-M0-IR \
; RUN: --implicit-check-not kernel0 --implicit-check-not kernel1
; RUN: FileCheck %s -input-file=%t2_1.ll --check-prefixes CHECK-M1-IR \
; RUN: --implicit-check-not kernel0 --implicit-check-not kernel1
; RUN: FileCheck %s -input-file=%t2_2.ll --check-prefixes CHECK-M2-IR \
; RUN: --implicit-check-not kernel0 --implicit-check-not kernel1
; RUN: FileCheck %s -input-file=%t2_0.sym --check-prefixes CHECK-M0-SYMS \
; RUN: --implicit-check-not kernel0 --implicit-check-not kernel1
; RUN: FileCheck %s -input-file=%t2_1.sym --check-prefixes CHECK-M1-SYMS \
; RUN: --implicit-check-not kernel0 --implicit-check-not kernel1
; RUN: FileCheck %s -input-file=%t2_2.sym --check-prefixes CHECK-M2-SYMS \
; RUN: --implicit-check-not kernel0 --implicit-check-not kernel1

; Regardless of device code split mode, each kernel should go into a separate
; device image
