Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[HLSL] Apply NoRecurse attrib to all HLSL functions #105907

Merged
merged 1 commit into from
Aug 29, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
10 changes: 7 additions & 3 deletions clang/lib/CodeGen/CodeGenFunction.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -1064,13 +1064,17 @@ void CodeGenFunction::StartFunction(GlobalDecl GD, QualType RetTy,
// OpenCL C 2.0 v2.2-11 s6.9.i:
// Recursion is not supported.
//
// HLSL
// Recursion is not supported.
//
// SYCL v1.2.1 s3.10:
// kernels cannot include RTTI information, exception classes,
// recursive code, virtual functions or make use of C++ libraries that
// are not compiled for the device.
if (FD && ((getLangOpts().CPlusPlus && FD->isMain()) ||
getLangOpts().OpenCL || getLangOpts().SYCLIsDevice ||
(getLangOpts().CUDA && FD->hasAttr<CUDAGlobalAttr>())))
if (FD &&
((getLangOpts().CPlusPlus && FD->isMain()) || getLangOpts().OpenCL ||
getLangOpts().HLSL || getLangOpts().SYCLIsDevice ||
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the norecurse attribute apply to direct recursion only or indirectly as well (e.g. A->B->A). We do have bounded recursion in raytracing in that TraceRay can invoke a ClosetHit shader which can then again call TraceRay.

I don't think that changes anything about this PR, but just throwing it out there in case it matters.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This attribute is descriptive and used in a few places to determine if recursion is expected or not. There is not actually any place that checks for recursion and produces an error for HLSL nor for OpenCL. There is a check for it when inlining takes place, but that's too late for diagnostics. #105244 is meant to address this problem.

(getLangOpts().CUDA && FD->hasAttr<CUDAGlobalAttr>())))
Fn->addFnAttr(llvm::Attribute::NoRecurse);

llvm::RoundingMode RM = getLangOpts().getDefaultRoundingMode();
Expand Down
93 changes: 93 additions & 0 deletions clang/test/CodeGenHLSL/implicit-norecurse-attrib.hlsl
Original file line number Diff line number Diff line change
@@ -0,0 +1,93 @@
// RUN: %clang_cc1 -x hlsl -triple dxil-pc-shadermodel6.3-library -finclude-default-header %s -emit-llvm -disable-llvm-passes -o - | FileCheck %s
// RUN: %clang_cc1 -x hlsl -triple dxil-pc-shadermodel6.0-compute -finclude-default-header %s -emit-llvm -disable-llvm-passes -o - | FileCheck %s

// Verify that a few different function types all get the NoRecurse attribute

#define MAX 100

struct Node {
uint value;
uint key;
uint left, right;
};

// CHECK: Function Attrs:{{.*}}norecurse
// CHECK: define noundef i32 @"?Find@@YAIY0GE@UNode@@I@Z"(ptr noundef byval([100 x %struct.Node]) align 4 %SortedTree, i32 noundef %key) [[IntAttr:\#[0-9]+]]
// CHECK: ret i32
// Find and return value corresponding to key in the SortedTree
uint Find(Node SortedTree[MAX], uint key) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need such complicated functions for this test? Does the body matter here or is it really just the internal/external properties that matter. In my opinion, the function body is distracting from the essence of the test here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably not. I admit I got carried away, but in the process I discovered the constraints of the current implementation and potentially introduced some incidentals that might catch future issues by creating a more representative shader. I realize that philosophies might differ here, but I'm reluctant to change it.

uint nix = 0; // head
while(true) {
if (nix < 0)
return 0.0; // Not found
Node n = SortedTree[nix];
if (n.key == key)
return n.value;
if (key < n.key)
nix = n.left;
else
nix = n.right;
}
}

// CHECK: Function Attrs:{{.*}}norecurse
// CHECK: define noundef i1 @"?InitTree@@YA_NY0GE@UNode@@V?$RWBuffer@T?$__vector@I$03@__clang@@@hlsl@@I@Z"(ptr noundef byval([100 x %struct.Node]) align 4 %tree, ptr noundef byval(%"class.hlsl::RWBuffer") align 4 %encodedTree, i32 noundef %maxDepth) [[ExtAttr:\#[0-9]+]]
// CHECK: ret i1
// Initialize tree with given buffer
// Imagine the inout works
export
bool InitTree(/*inout*/ Node tree[MAX], RWBuffer<uint4> encodedTree, uint maxDepth) {
uint size = pow(2.f, maxDepth) - 1;
if (size > MAX) return false;
for (uint i = 1; i < size; i++) {
tree[i].value = encodedTree[i].x;
tree[i].key = encodedTree[i].y;
tree[i].left = encodedTree[i].z;
tree[i].right = encodedTree[i].w;
}
return true;
}

RWBuffer<uint4> gTree;

// Mangled entry points are internal
// CHECK: Function Attrs:{{.*}}norecurse
// CHECK: define internal void @"?main@@YAXI@Z"(i32 noundef %GI) [[IntAttr]]
// CHECK: ret void

// Canonical entry points are external and shader attributed
// CHECK: Function Attrs:{{.*}}norecurse
// CHECK: define void @main() [[EntryAttr:\#[0-9]+]]
// CHECK: ret void

[numthreads(1,1,1)]
[shader("compute")]
void main(uint GI : SV_GroupIndex) {
Node haystack[MAX];
uint needle = 0;
if (InitTree(haystack, gTree, GI))
needle = Find(haystack, needle);
}

// Mangled entry points are internal
// CHECK: Function Attrs:{{.*}}norecurse
// CHECK: define internal void @"?defaultMain@@YAXXZ"() [[IntAttr]]
// CHECK: ret void

// Canonical entry points are external and shader attributed
// CHECK: Function Attrs:{{.*}}norecurse
// CHECK: define void @defaultMain() [[EntryAttr]]
// CHECK: ret void

[numthreads(1,1,1)]
[shader("compute")]
void defaultMain() {
Node haystack[MAX];
uint needle = 0;
if (InitTree(haystack, gTree, 4))
needle = Find(haystack, needle);
}

// CHECK: attributes [[IntAttr]] = {{.*}} norecurse
// CHECK: attributes [[ExtAttr]] = {{.*}} norecurse
// CHECK: attributes [[EntryAttr]] = {{.*}} norecurse
4 changes: 2 additions & 2 deletions clang/test/CodeGenHLSL/this-assignment-overload.hlsl
Original file line number Diff line number Diff line change
Expand Up @@ -25,7 +25,7 @@ void main() {
}

// This test makes a probably safe assumption that HLSL 202x includes operator overloading for assignment operators.
// CHECK: define linkonce_odr noundef i32 @"?getFirst@Pair@@QAAHXZ"(ptr noundef nonnull align 4 dereferenceable(8) %this) #2 align 2 {
// CHECK: define linkonce_odr noundef i32 @"?getFirst@Pair@@QAAHXZ"(ptr noundef nonnull align 4 dereferenceable(8) %this) #0 align 2 {
// CHECK-NEXT:entry:
// CHECK-NEXT:%this.addr = alloca ptr, align 4
// CHECK-NEXT:%Another = alloca %struct.Pair, align 4
Expand All @@ -42,7 +42,7 @@ void main() {
// CHECK-NEXT:%0 = load i32, ptr %First2, align 4
// CHECK-NEXT:ret i32 %0

// CHECK: define linkonce_odr noundef i32 @"?getSecond@Pair@@QAAHXZ"(ptr noundef nonnull align 4 dereferenceable(8) %this) #2 align 2 {
// CHECK: define linkonce_odr noundef i32 @"?getSecond@Pair@@QAAHXZ"(ptr noundef nonnull align 4 dereferenceable(8) %this) #0 align 2 {
// CHECK-NEXT:entry:
// CHECK-NEXT:%this.addr = alloca ptr, align 4
// CHECK-NEXT:%agg.tmp = alloca %struct.Pair, align 4
Expand Down
4 changes: 2 additions & 2 deletions clang/test/CodeGenHLSL/this-assignment.hlsl
Original file line number Diff line number Diff line change
Expand Up @@ -24,7 +24,7 @@ void main() {
}

// This tests reference like implicit this in HLSL
// CHECK: define linkonce_odr noundef i32 @"?getFirst@Pair@@QAAHXZ"(ptr noundef nonnull align 4 dereferenceable(8) %this) #3 align 2 {
// CHECK: define linkonce_odr noundef i32 @"?getFirst@Pair@@QAAHXZ"(ptr noundef nonnull align 4 dereferenceable(8) %this) #0 align 2 {
// CHECK-NEXT:entry:
// CHECK-NEXT:%this.addr = alloca ptr, align 4
// CHECK-NEXT:%Another = alloca %struct.Pair, align 4
Expand All @@ -34,7 +34,7 @@ void main() {
// CHECK-NEXT:call void @llvm.memcpy.p0.p0.i32(ptr align 4 %this1, ptr align 4 %Another, i32 8, i1 false)
// CHECK-NEXT:%First = getelementptr inbounds nuw %struct.Pair, ptr %this1, i32 0, i32 0

// CHECK: define linkonce_odr noundef i32 @"?getSecond@Pair@@QAAHXZ"(ptr noundef nonnull align 4 dereferenceable(8) %this) #3 align 2 {
// CHECK: define linkonce_odr noundef i32 @"?getSecond@Pair@@QAAHXZ"(ptr noundef nonnull align 4 dereferenceable(8) %this) #0 align 2 {
// CHECK-NEXT:entry:
// CHECK-NEXT:%this.addr = alloca ptr, align 4
// CHECK-NEXT:%ref.tmp = alloca %struct.Pair, align 4
Expand Down
Loading