Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[HLSL] Apply NoRecurse attrib to all HLSL functions #105907

Merged
merged 1 commit into from
Aug 29, 2024

Conversation

pow2clk
Copy link
Contributor

@pow2clk pow2clk commented Aug 24, 2024

Previously, functions named "main" got the NoRecurse attribute consistent with the behavior of C++, which HLSL largely follows. However, standard recursion is not allowed in HLSL, so all functions should really have this attribute. This doesn't prevent recursion, but rather signals that these functions aren't expected to recurse.

Practically, this was done so that entry point functions named "main" would have all have the same attributes as otherwise identical entry points with other names.

This required small changes to the this assignment tests because they no longer generate so many attribute sets since more of them match.

related to #105244
but done to simplify testing for #89806

Previously, functions named "main" got the NoRecurse attribute consistent
with the behavior of C++, which HLSL largely follows. However,
standard recursion is not allowed in HLSL, so all functions should
really have this attribute. This doesn't prevent recursion, but
rather signals that these functions aren't expected to recurse.

Practically, this was done so that entry point functions named "main"
would have all have the same attributes as otherwise identical
entry points with other names.

This required small changes to the this assignment tests because they
no longer generate so many attribute sets since more of them match.

related to llvm#105244
but done to simplify testing for llvm#89806
@pow2clk pow2clk added the HLSL HLSL Language Support label Aug 24, 2024
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:codegen labels Aug 24, 2024
@llvmbot
Copy link
Collaborator

llvmbot commented Aug 24, 2024

@llvm/pr-subscribers-clang-codegen
@llvm/pr-subscribers-clang

@llvm/pr-subscribers-hlsl

Author: Greg Roth (pow2clk)

Changes

Previously, functions named "main" got the NoRecurse attribute consistent with the behavior of C++, which HLSL largely follows. However, standard recursion is not allowed in HLSL, so all functions should really have this attribute. This doesn't prevent recursion, but rather signals that these functions aren't expected to recurse.

Practically, this was done so that entry point functions named "main" would have all have the same attributes as otherwise identical entry points with other names.

This required small changes to the this assignment tests because they no longer generate so many attribute sets since more of them match.

related to #105244
but done to simplify testing for #89806


Full diff: https://github.com/llvm/llvm-project/pull/105907.diff

4 Files Affected:

  • (modified) clang/lib/CodeGen/CodeGenFunction.cpp (+7-3)
  • (added) clang/test/CodeGenHLSL/implicit-norecurse-attrib.hlsl (+93)
  • (modified) clang/test/CodeGenHLSL/this-assignment-overload.hlsl (+2-2)
  • (modified) clang/test/CodeGenHLSL/this-assignment.hlsl (+2-2)
diff --git a/clang/lib/CodeGen/CodeGenFunction.cpp b/clang/lib/CodeGen/CodeGenFunction.cpp
index c89eaa0f4e3bfc..a5747283e98058 100644
--- a/clang/lib/CodeGen/CodeGenFunction.cpp
+++ b/clang/lib/CodeGen/CodeGenFunction.cpp
@@ -1064,13 +1064,17 @@ void CodeGenFunction::StartFunction(GlobalDecl GD, QualType RetTy,
   // OpenCL C 2.0 v2.2-11 s6.9.i:
   //     Recursion is not supported.
   //
+  // HLSL
+  //     Recursion is not supported.
+  //
   // SYCL v1.2.1 s3.10:
   //     kernels cannot include RTTI information, exception classes,
   //     recursive code, virtual functions or make use of C++ libraries that
   //     are not compiled for the device.
-  if (FD && ((getLangOpts().CPlusPlus && FD->isMain()) ||
-             getLangOpts().OpenCL || getLangOpts().SYCLIsDevice ||
-             (getLangOpts().CUDA && FD->hasAttr<CUDAGlobalAttr>())))
+  if (FD &&
+      ((getLangOpts().CPlusPlus && FD->isMain()) || getLangOpts().OpenCL ||
+       getLangOpts().HLSL || getLangOpts().SYCLIsDevice ||
+       (getLangOpts().CUDA && FD->hasAttr<CUDAGlobalAttr>())))
     Fn->addFnAttr(llvm::Attribute::NoRecurse);
 
   llvm::RoundingMode RM = getLangOpts().getDefaultRoundingMode();
diff --git a/clang/test/CodeGenHLSL/implicit-norecurse-attrib.hlsl b/clang/test/CodeGenHLSL/implicit-norecurse-attrib.hlsl
new file mode 100644
index 00000000000000..ae3a3b5f90199f
--- /dev/null
+++ b/clang/test/CodeGenHLSL/implicit-norecurse-attrib.hlsl
@@ -0,0 +1,93 @@
+// RUN: %clang_cc1 -x hlsl -triple dxil-pc-shadermodel6.3-library  -finclude-default-header %s -emit-llvm -disable-llvm-passes -o - | FileCheck %s
+// RUN: %clang_cc1 -x hlsl -triple dxil-pc-shadermodel6.0-compute  -finclude-default-header %s -emit-llvm -disable-llvm-passes -o - | FileCheck %s
+
+// Verify that a few different function types all get the NoRecurse attribute
+
+#define MAX 100
+
+struct Node {
+  uint value;
+  uint key;
+  uint left, right;
+};
+
+// CHECK: Function Attrs:{{.*}}norecurse
+// CHECK: define noundef i32 @"?Find@@YAIY0GE@UNode@@I@Z"(ptr noundef byval([100 x %struct.Node]) align 4 %SortedTree, i32 noundef %key) [[IntAttr:\#[0-9]+]]
+// CHECK: ret i32
+// Find and return value corresponding to key in the SortedTree
+uint Find(Node SortedTree[MAX], uint key) {
+  uint nix = 0; // head
+  while(true) {
+    if (nix < 0)
+      return 0.0; // Not found
+    Node n = SortedTree[nix];
+    if (n.key == key)
+      return n.value;
+    if (key < n.key)
+      nix = n.left;
+    else
+      nix = n.right;
+  }
+}
+
+// CHECK: Function Attrs:{{.*}}norecurse
+// CHECK: define noundef i1 @"?InitTree@@YA_NY0GE@UNode@@V?$RWBuffer@T?$__vector@I$03@__clang@@@hlsl@@I@Z"(ptr noundef byval([100 x %struct.Node]) align 4 %tree, ptr noundef byval(%"class.hlsl::RWBuffer") align 4 %encodedTree, i32 noundef %maxDepth) [[ExtAttr:\#[0-9]+]]
+// CHECK: ret i1
+// Initialize tree with given buffer
+// Imagine the inout works
+export
+bool InitTree(/*inout*/ Node tree[MAX], RWBuffer<uint4> encodedTree, uint maxDepth) {
+  uint size = pow(2.f, maxDepth) - 1;
+  if (size > MAX) return false;
+  for (uint i = 1; i < size; i++) {
+    tree[i].value = encodedTree[i].x;
+    tree[i].key   = encodedTree[i].y;
+    tree[i].left  = encodedTree[i].z;
+    tree[i].right = encodedTree[i].w;
+  }
+  return true;
+}
+
+RWBuffer<uint4> gTree;
+
+// Mangled entry points are internal
+// CHECK: Function Attrs:{{.*}}norecurse
+// CHECK: define internal void @"?main@@YAXI@Z"(i32 noundef %GI) [[IntAttr]]
+// CHECK: ret void
+
+// Canonical entry points are external and shader attributed
+// CHECK: Function Attrs:{{.*}}norecurse
+// CHECK: define void @main() [[EntryAttr:\#[0-9]+]]
+// CHECK: ret void
+
+[numthreads(1,1,1)]
+[shader("compute")]
+void main(uint GI : SV_GroupIndex) {
+  Node haystack[MAX];
+  uint needle = 0;
+  if (InitTree(haystack, gTree, GI))
+    needle = Find(haystack, needle);
+}
+
+// Mangled entry points are internal
+// CHECK: Function Attrs:{{.*}}norecurse
+// CHECK: define internal void @"?defaultMain@@YAXXZ"() [[IntAttr]]
+// CHECK: ret void
+
+// Canonical entry points are external and shader attributed
+// CHECK: Function Attrs:{{.*}}norecurse
+// CHECK: define void @defaultMain() [[EntryAttr]]
+// CHECK: ret void
+
+[numthreads(1,1,1)]
+[shader("compute")]
+void defaultMain() {
+  Node haystack[MAX];
+  uint needle = 0;
+  if (InitTree(haystack, gTree, 4))
+    needle = Find(haystack, needle);
+}
+
+// CHECK: attributes [[IntAttr]] = {{.*}} norecurse
+// CHECK: attributes [[ExtAttr]] = {{.*}} norecurse
+// CHECK: attributes [[EntryAttr]] = {{.*}} norecurse
diff --git a/clang/test/CodeGenHLSL/this-assignment-overload.hlsl b/clang/test/CodeGenHLSL/this-assignment-overload.hlsl
index 0c4905e0f45980..f0affcb69a3fcd 100644
--- a/clang/test/CodeGenHLSL/this-assignment-overload.hlsl
+++ b/clang/test/CodeGenHLSL/this-assignment-overload.hlsl
@@ -25,7 +25,7 @@ void main() {
 }
 
 // This test makes a probably safe assumption that HLSL 202x includes operator overloading for assignment operators.
-// CHECK:     define linkonce_odr noundef i32 @"?getFirst@Pair@@QAAHXZ"(ptr noundef nonnull align 4 dereferenceable(8) %this) #2 align 2 {
+// CHECK:     define linkonce_odr noundef i32 @"?getFirst@Pair@@QAAHXZ"(ptr noundef nonnull align 4 dereferenceable(8) %this) #0 align 2 {
 // CHECK-NEXT:entry:
 // CHECK-NEXT:%this.addr = alloca ptr, align 4
 // CHECK-NEXT:%Another = alloca %struct.Pair, align 4
@@ -42,7 +42,7 @@ void main() {
 // CHECK-NEXT:%0 = load i32, ptr %First2, align 4
 // CHECK-NEXT:ret i32 %0
 
-// CHECK:     define linkonce_odr noundef i32 @"?getSecond@Pair@@QAAHXZ"(ptr noundef nonnull align 4 dereferenceable(8) %this) #2 align 2 {
+// CHECK:     define linkonce_odr noundef i32 @"?getSecond@Pair@@QAAHXZ"(ptr noundef nonnull align 4 dereferenceable(8) %this) #0 align 2 {
 // CHECK-NEXT:entry:
 // CHECK-NEXT:%this.addr = alloca ptr, align 4
 // CHECK-NEXT:%agg.tmp = alloca %struct.Pair, align 4
diff --git a/clang/test/CodeGenHLSL/this-assignment.hlsl b/clang/test/CodeGenHLSL/this-assignment.hlsl
index 6916afcde40546..5c8de0a18ef7ca 100644
--- a/clang/test/CodeGenHLSL/this-assignment.hlsl
+++ b/clang/test/CodeGenHLSL/this-assignment.hlsl
@@ -24,7 +24,7 @@ void main() {
 }
 
 // This tests reference like implicit this in HLSL
-// CHECK:     define linkonce_odr noundef i32 @"?getFirst@Pair@@QAAHXZ"(ptr noundef nonnull align 4 dereferenceable(8) %this) #3 align 2 {
+// CHECK:     define linkonce_odr noundef i32 @"?getFirst@Pair@@QAAHXZ"(ptr noundef nonnull align 4 dereferenceable(8) %this) #0 align 2 {
 // CHECK-NEXT:entry:
 // CHECK-NEXT:%this.addr = alloca ptr, align 4
 // CHECK-NEXT:%Another = alloca %struct.Pair, align 4
@@ -34,7 +34,7 @@ void main() {
 // CHECK-NEXT:call void @llvm.memcpy.p0.p0.i32(ptr align 4 %this1, ptr align 4 %Another, i32 8, i1 false)
 // CHECK-NEXT:%First = getelementptr inbounds nuw %struct.Pair, ptr %this1, i32 0, i32 0
 
-// CHECK:     define linkonce_odr noundef i32 @"?getSecond@Pair@@QAAHXZ"(ptr noundef nonnull align 4 dereferenceable(8) %this) #3 align 2 {
+// CHECK:     define linkonce_odr noundef i32 @"?getSecond@Pair@@QAAHXZ"(ptr noundef nonnull align 4 dereferenceable(8) %this) #0 align 2 {
 // CHECK-NEXT:entry:
 // CHECK-NEXT:%this.addr = alloca ptr, align 4
 // CHECK-NEXT:%ref.tmp = alloca %struct.Pair, align 4

Copy link
Contributor

@dmpots dmpots left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, LGTM.

// CHECK: define noundef i32 @"?Find@@YAIY0GE@UNode@@I@Z"(ptr noundef byval([100 x %struct.Node]) align 4 %SortedTree, i32 noundef %key) [[IntAttr:\#[0-9]+]]
// CHECK: ret i32
// Find and return value corresponding to key in the SortedTree
uint Find(Node SortedTree[MAX], uint key) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need such complicated functions for this test? Does the body matter here or is it really just the internal/external properties that matter. In my opinion, the function body is distracting from the essence of the test here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably not. I admit I got carried away, but in the process I discovered the constraints of the current implementation and potentially introduced some incidentals that might catch future issues by creating a more representative shader. I realize that philosophies might differ here, but I'm reluctant to change it.

(getLangOpts().CUDA && FD->hasAttr<CUDAGlobalAttr>())))
if (FD &&
((getLangOpts().CPlusPlus && FD->isMain()) || getLangOpts().OpenCL ||
getLangOpts().HLSL || getLangOpts().SYCLIsDevice ||
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the norecurse attribute apply to direct recursion only or indirectly as well (e.g. A->B->A). We do have bounded recursion in raytracing in that TraceRay can invoke a ClosetHit shader which can then again call TraceRay.

I don't think that changes anything about this PR, but just throwing it out there in case it matters.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This attribute is descriptive and used in a few places to determine if recursion is expected or not. There is not actually any place that checks for recursion and produces an error for HLSL nor for OpenCL. There is a check for it when inlining takes place, but that's too late for diagnostics. #105244 is meant to address this problem.

@pow2clk pow2clk merged commit 2dc3b50 into llvm:main Aug 29, 2024
12 checks passed
@pow2clk pow2clk deleted the hlsl_norecurse branch August 29, 2024 17:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:codegen clang Clang issues not falling into any other category HLSL HLSL Language Support
Projects
Status: No status
Development

Successfully merging this pull request may close these issues.

5 participants