Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CodeGen] -fsanitize=alignment: add cl::opt sanitize-alignment-builtin to disable memcpy instrumentation #69240

Merged
merged 1 commit into from
Oct 18, 2023

Conversation

MaskRay
Copy link
Member

@MaskRay MaskRay commented Oct 16, 2023

Deploying #67766 to a large internal codebase uncovers many bugs (many are
probably benign but need cleaning up). There are also issues in high-profile
open-source projects like v8. Add a cl::opt to disable builtin instrumentation
for -fsanitize=alignment to help large codebase users.

In the long term, this cl::opt option may still be useful to debug
-fsanitize=alignment instrumentation on builtins, so we probably want to keep it
around.

…n to disable memcpy instrumentation

Deploying llvm#67766 to a large internal codebase uncovers many bugs (many are
probably benign but need cleaning up). There are also issues in high-profile
open-source projects like v8. Add a cl::opt to disable builtin instrumentation
for -fsanitize=alignment to help large codebase users.

In the long term, this cl::opt option may still be useful to debug
-fsanitize=alignment instrumentation on builtins, so we probably want to keep it
around.
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:codegen labels Oct 16, 2023
@llvmbot
Copy link
Member

llvmbot commented Oct 16, 2023

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clang-codegen

Author: Fangrui Song (MaskRay)

Changes

Deploying #67766 to a large internal codebase uncovers many bugs (many are
probably benign but need cleaning up). There are also issues in high-profile
open-source projects like v8. Add a cl::opt to disable builtin instrumentation
for -fsanitize=alignment to help large codebase users.

In the long term, this cl::opt option may still be useful to debug
-fsanitize=alignment instrumentation on builtins, so we probably want to keep it
around.


Full diff: https://github.com/llvm/llvm-project/pull/69240.diff

2 Files Affected:

  • (modified) clang/lib/CodeGen/CGBuiltin.cpp (+6-1)
  • (modified) clang/test/CodeGen/catch-undef-behavior.c (+31-14)
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index 4d86e8a769846c4..2cf124efc6fb7f7 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -66,6 +66,11 @@ using namespace clang;
 using namespace CodeGen;
 using namespace llvm;
 
+static llvm::cl::opt<bool> ClSanitizeAlignmentBuiltin(
+    "sanitize-alignment-builtin", llvm::cl::Hidden,
+    llvm::cl::desc("Instrument builtin functions for -fsanitize=alignment"),
+    llvm::cl::init(true));
+
 static void initializeAlloca(CodeGenFunction &CGF, AllocaInst *AI, Value *Size,
                              Align AlignmentInBytes) {
   ConstantInt *Byte;
@@ -2788,7 +2793,7 @@ RValue CodeGenFunction::EmitBuiltinExpr(const GlobalDecl GD, unsigned BuiltinID,
     EmitNonNullArgCheck(RValue::get(Val), Arg->getType(), Arg->getExprLoc(), FD,
                         ParmNum);
 
-    if (SanOpts.has(SanitizerKind::Alignment)) {
+    if (SanOpts.has(SanitizerKind::Alignment) && ClSanitizeAlignmentBuiltin) {
       SanitizerSet SkippedChecks;
       SkippedChecks.set(SanitizerKind::All);
       SkippedChecks.clear(SanitizerKind::Alignment);
diff --git a/clang/test/CodeGen/catch-undef-behavior.c b/clang/test/CodeGen/catch-undef-behavior.c
index af37ef9e8565b1e..b33c13a68a5c66f 100644
--- a/clang/test/CodeGen/catch-undef-behavior.c
+++ b/clang/test/CodeGen/catch-undef-behavior.c
@@ -1,6 +1,8 @@
-// RUN: %clang_cc1 -fsanitize=alignment,null,object-size,shift-base,shift-exponent,return,signed-integer-overflow,vla-bound,float-cast-overflow,integer-divide-by-zero,bool,returns-nonnull-attribute,nonnull-attribute -fsanitize-recover=alignment,null,object-size,shift-base,shift-exponent,signed-integer-overflow,vla-bound,float-cast-overflow,integer-divide-by-zero,bool,returns-nonnull-attribute,nonnull-attribute -emit-llvm %s -o - -triple x86_64-linux-gnu | FileCheck %s --check-prefix=CHECK-COMMON --check-prefix=CHECK-UBSAN
-// RUN: %clang_cc1 -fsanitize-trap=alignment,null,object-size,shift-base,shift-exponent,return,signed-integer-overflow,vla-bound,float-cast-overflow,integer-divide-by-zero,bool,returns-nonnull-attribute,nonnull-attribute -fsanitize-recover=alignment,null,object-size,shift-base,shift-exponent,signed-integer-overflow,vla-bound,float-cast-overflow,integer-divide-by-zero,bool,returns-nonnull-attribute,nonnull-attribute -fsanitize=alignment,null,object-size,shift-base,shift-exponent,return,signed-integer-overflow,vla-bound,float-cast-overflow,integer-divide-by-zero,bool,returns-nonnull-attribute,nonnull-attribute -fsanitize-recover=alignment,null,object-size,shift-base,shift-exponent,signed-integer-overflow,vla-bound,float-cast-overflow,integer-divide-by-zero,bool,returns-nonnull-attribute,nonnull-attribute -emit-llvm %s -o - -triple x86_64-linux-gnu | FileCheck %s --check-prefix=CHECK-COMMON --check-prefix=CHECK-TRAP
+// RUN: %clang_cc1 -fsanitize=alignment,null,object-size,shift-base,shift-exponent,return,signed-integer-overflow,vla-bound,float-cast-overflow,integer-divide-by-zero,bool,returns-nonnull-attribute,nonnull-attribute -fsanitize-recover=alignment,null,object-size,shift-base,shift-exponent,signed-integer-overflow,vla-bound,float-cast-overflow,integer-divide-by-zero,bool,returns-nonnull-attribute,nonnull-attribute -emit-llvm %s -o - -triple x86_64-linux-gnu | FileCheck %s --check-prefixes=CHECK-COMMON,CHECK-UBSAN,CHECK-ALIGNMENT-BUILTIN
+// RUN: %clang_cc1 -fsanitize-trap=alignment,null,object-size,shift-base,shift-exponent,return,signed-integer-overflow,vla-bound,float-cast-overflow,integer-divide-by-zero,bool,returns-nonnull-attribute,nonnull-attribute -fsanitize-recover=alignment,null,object-size,shift-base,shift-exponent,signed-integer-overflow,vla-bound,float-cast-overflow,integer-divide-by-zero,bool,returns-nonnull-attribute,nonnull-attribute -fsanitize=alignment,null,object-size,shift-base,shift-exponent,return,signed-integer-overflow,vla-bound,float-cast-overflow,integer-divide-by-zero,bool,returns-nonnull-attribute,nonnull-attribute -fsanitize-recover=alignment,null,object-size,shift-base,shift-exponent,signed-integer-overflow,vla-bound,float-cast-overflow,integer-divide-by-zero,bool,returns-nonnull-attribute,nonnull-attribute -emit-llvm %s -o - -triple x86_64-linux-gnu | FileCheck %s --check-prefixes=CHECK-COMMON,CHECK-ALIGNMENT-BUILTIN,CHECK-TRAP
 // RUN: %clang_cc1 -fsanitize=signed-integer-overflow -emit-llvm %s -o - -triple x86_64-linux-gnu | FileCheck %s --check-prefix=CHECK-OVERFLOW
+/// A variant of CHECK-UBSAN with -sanitize-alignment-builtin disabled
+// RUN: %clang_cc1 -fsanitize=alignment,null,object-size,shift-base,shift-exponent,return,signed-integer-overflow,vla-bound,float-cast-overflow,integer-divide-by-zero,bool,returns-nonnull-attribute,nonnull-attribute -fsanitize-recover=alignment,null,object-size,shift-base,shift-exponent,signed-integer-overflow,vla-bound,float-cast-overflow,integer-divide-by-zero,bool,returns-nonnull-attribute,nonnull-attribute -emit-llvm %s -o - -triple x86_64-linux-gnu -mllvm -sanitize-alignment-builtin=0 | FileCheck %s --check-prefixes=CHECK-COMMON,CHECK-UBSAN-NO-ALIGNMENT-BUILTIN
 
 // CHECK-UBSAN: @[[INT:.*]] = private unnamed_addr constant { i16, i16, [6 x i8] } { i16 0, i16 11, [6 x i8] c"'int'\00" }
 
@@ -363,11 +365,13 @@ extern void *memcpy(void *, const void *, unsigned long) __attribute__((nonnull(
 void call_memcpy_nonnull(void *p, void *q, int sz) {
   // CHECK-COMMON: icmp ne ptr {{.*}}, null
   // CHECK-UBSAN: call void @__ubsan_handle_nonnull_arg
+  // CHECK-UBSAN-NO-ALIGNMENT-BUILTIN: call void @__ubsan_handle_nonnull_arg
   // CHECK-TRAP: call void @llvm.ubsantrap(i8 16)
   // CHECK-COMMON-NOT: call
 
   // CHECK-COMMON: icmp ne ptr {{.*}}, null
   // CHECK-UBSAN: call void @__ubsan_handle_nonnull_arg
+  // CHECK-UBSAN-NO-ALIGNMENT-BUILTIN: call void @__ubsan_handle_nonnull_arg
   // CHECK-TRAP: call void @llvm.ubsantrap(i8 16)
   // CHECK-COMMON-NOT: call
 
@@ -379,18 +383,23 @@ void call_memcpy_nonnull(void *p, void *q, int sz) {
 void call_memcpy(long *p, short *q, int sz) {
   // CHECK-COMMON: icmp ne ptr {{.*}}, null
   // CHECK-UBSAN: call void @__ubsan_handle_nonnull_arg(
+  // CHECK-UBSAN-NO-ALIGNMENT-BUILTIN: call void @__ubsan_handle_nonnull_arg(
   // CHECK-TRAP: call void @llvm.ubsantrap(i8 16)
-  // CHECK-COMMON: and i64 %[[#]], 7, !nosanitize
-  // CHECK-COMMON: icmp eq i64 %[[#]], 0, !nosanitize
+  // CHECK-ALIGNMENT-BUILTIN: and i64 %[[#]], 7, !nosanitize
+  // CHECK-ALIGNMENT-BUILTIN: icmp eq i64 %[[#]], 0, !nosanitize
   // CHECK-UBSAN: call void @__ubsan_handle_type_mismatch_v1(ptr @[[LINE_1600]]
+  // CHECK-UBSAN-NO-ALIGNMENT-BUILTIN-NOT: call void @__ubsan_handle_type_mismatch_v1(
   // CHECK-TRAP: call void @llvm.ubsantrap(i8 22)
 
   // CHECK-COMMON: icmp ne ptr {{.*}}, null
   // CHECK-UBSAN: call void @__ubsan_handle_nonnull_arg(
+  // CHECK-UBSAN-NO-ALIGNMENT-BUILTIN: call void @__ubsan_handle_nonnull_arg(
+  // CHECK-UBSAN-DISABLE-BUILTIN: call
   // CHECK-TRAP: call void @llvm.ubsantrap(i8 16)
-  // CHECK-COMMON: and i64 %[[#]], 1, !nosanitize
-  // CHECK-COMMON: icmp eq i64 %[[#]], 0, !nosanitize
+  // CHECK-ALIGNMENT-BUILTIN: and i64 %[[#]], 1, !nosanitize
+  // CHECK-ALIGNMENT-BUILTIN: icmp eq i64 %[[#]], 0, !nosanitize
   // CHECK-UBSAN: call void @__ubsan_handle_type_mismatch_v1(
+  // CHECK-UBSAN-NO-ALIGNMENT-BUILTIN-NOT: call
   // CHECK-TRAP: call void @llvm.ubsantrap(i8 22)
 
   // CHECK-COMMON: call void @llvm.memcpy.p0.p0.i64(ptr align 8 %0, ptr align 2 %1, i64 %conv, i1 false)
@@ -405,14 +414,16 @@ void call_memcpy(long *p, short *q, int sz) {
 
 // CHECK-COMMON-LABEL: define{{.*}} void @call_memcpy_inline(
 void call_memcpy_inline(long *p, short *q) {
-  // CHECK-COMMON: and i64 %[[#]], 7, !nosanitize
-  // CHECK-COMMON: icmp eq i64 %[[#]], 0, !nosanitize
+  // CHECK-ALIGNMENT-BUILTIN: and i64 %[[#]], 7, !nosanitize
+  // CHECK-ALIGNMENT-BUILTIN: icmp eq i64 %[[#]], 0, !nosanitize
   // CHECK-UBSAN: call void @__ubsan_handle_type_mismatch_v1(
+  // CHECK-UBSAN-NO-ALIGNMENT-BUILTIN-NOT: call
   // CHECK-TRAP: call void @llvm.ubsantrap(i8 22)
 
-  // CHECK-COMMON: and i64 %[[#]], 1, !nosanitize
-  // CHECK-COMMON: icmp eq i64 %[[#]], 0, !nosanitize
+  // CHECK-ALIGNMENT-BUILTIN: and i64 %[[#]], 1, !nosanitize
+  // CHECK-ALIGNMENT-BUILTIN: icmp eq i64 %[[#]], 0, !nosanitize
   // CHECK-UBSAN: call void @__ubsan_handle_type_mismatch_v1(
+  // CHECK-UBSAN-NO-ALIGNMENT-BUILTIN-NOT: call
   // CHECK-TRAP: call void @llvm.ubsantrap(i8 22)
 
   // CHECK-COMMON: call void @llvm.memcpy.inline.p0.p0.i64(ptr align 8 %0, ptr align 2 %1, i64 2, i1 false)
@@ -425,10 +436,12 @@ extern void *memmove(void *, const void *, unsigned long) __attribute__((nonnull
 void call_memmove_nonnull(void *p, void *q, int sz) {
   // CHECK-COMMON: icmp ne ptr {{.*}}, null
   // CHECK-UBSAN: call void @__ubsan_handle_nonnull_arg
+  // CHECK-UBSAN-NO-ALIGNMENT-BUILTIN: call void @__ubsan_handle_nonnull_arg(
   // CHECK-TRAP: call void @llvm.ubsantrap(i8 16)
 
   // CHECK-COMMON: icmp ne ptr {{.*}}, null
   // CHECK-UBSAN: call void @__ubsan_handle_nonnull_arg
+  // CHECK-UBSAN-NO-ALIGNMENT-BUILTIN: call void @__ubsan_handle_nonnull_arg(
   // CHECK-TRAP: call void @llvm.ubsantrap(i8 16)
   memmove(p, q, sz);
 }
@@ -437,18 +450,22 @@ void call_memmove_nonnull(void *p, void *q, int sz) {
 void call_memmove(long *p, short *q, int sz) {
   // CHECK-COMMON: icmp ne ptr {{.*}}, null
   // CHECK-UBSAN: call void @__ubsan_handle_nonnull_arg(
+  // CHECK-UBSAN-NO-ALIGNMENT-BUILTIN: call void @__ubsan_handle_nonnull_arg(
   // CHECK-TRAP: call void @llvm.ubsantrap(i8 16)
-  // CHECK-COMMON: and i64 %[[#]], 7, !nosanitize
-  // CHECK-COMMON: icmp eq i64 %[[#]], 0, !nosanitize
+  // CHECK-ALIGNMENT-BUILTIN: and i64 %[[#]], 7, !nosanitize
+  // CHECK-ALIGNMENT-BUILTIN: icmp eq i64 %[[#]], 0, !nosanitize
   // CHECK-UBSAN: call void @__ubsan_handle_type_mismatch_v1(
+  // CHECK-UBSAN-NO-ALIGNMENT-BUILTIN-NOT: call
   // CHECK-TRAP: call void @llvm.ubsantrap(i8 22)
 
   // CHECK-COMMON: icmp ne ptr {{.*}}, null
   // CHECK-UBSAN: call void @__ubsan_handle_nonnull_arg(
+  // CHECK-UBSAN-NO-ALIGNMENT-BUILTIN: call void @__ubsan_handle_nonnull_arg(
   // CHECK-TRAP: call void @llvm.ubsantrap(i8 16)
-  // CHECK-COMMON: and i64 %[[#]], 1, !nosanitize
-  // CHECK-COMMON: icmp eq i64 %[[#]], 0, !nosanitize
+  // CHECK-ALIGNMENT-BUILTIN: and i64 %[[#]], 1, !nosanitize
+  // CHECK-ALIGNMENT-BUILTIN: icmp eq i64 %[[#]], 0, !nosanitize
   // CHECK-UBSAN: call void @__ubsan_handle_type_mismatch_v1(
+  // CHECK-UBSAN-NO-ALIGNMENT-BUILTIN-NOT: call
   // CHECK-TRAP: call void @llvm.ubsantrap(i8 22)
 
   // CHECK-COMMON: call void @llvm.memmove.p0.p0.i64(ptr align 8 %0, ptr align 2 %1, i64 %conv, i1 false)

@efriedma-quic
Copy link
Collaborator

For reference, can you give a couple examples of code where this is triggering?

If this is triggering in practice, do we want a real driver option to control the sanitizer? The alignment attributes themselves?

@vitalybuka
Copy link
Collaborator

For reference, can you give a couple examples of code where this is triggering?

If this is triggering in practice, do we want a real driver option to control the sanitizer? The alignment attributes themselves?

I am not sure we need special driver flag for that.

@MaskRay After reading about amount of unique cases you see, maybe ignore list is easier?

@MaskRay
Copy link
Member Author

MaskRay commented Oct 18, 2023

For reference, can you give a couple examples of code where this is triggering?
If this is triggering in practice, do we want a real driver option to control the sanitizer? The alignment attributes themselves?

I am not sure we need special driver flag for that.

@MaskRay After reading about amount of unique cases you see, maybe ignore list is easier?

Given the scale of our internal code base, the identified unique failures are moderate, but still too many to fix for rolling releases. Eventually we can remove the cl::opt option, but having the option buys us time to clean up the code base.

Many problems identified look like this:
https://chromium.googlesource.com/v8/v8.git/+/bb2ac8991c64601be9852d88c22abac2d6a6c39b/src/bigint/bigint.h#120

 protected:
  friend class ShiftedDigits;
  digit_t* digits_;     // digits_ may be misaligned
  int len_;
 private:
  // We require externally-provided digits arrays to be 4-byte aligned, but
  // not necessarily 8-byte aligned; so on 64-bit platforms we use memcpy
  // to allow unaligned reads.
  digit_t read_4byte_aligned(int i) {
    if (sizeof(digit_t) == 4) {
      return digits_[i];
    } else {
      digit_t result;
      memcpy(&result, digits_ + i, sizeof(result));  // unspecified behavior identified here
      return result;
    }
  }
};

@vitalybuka
Copy link
Collaborator

Either way seems fine, but ignore list looks more usually to me.

@MaskRay MaskRay merged commit e8fe4de into llvm:main Oct 18, 2023
Comment on lines +69 to +73
static llvm::cl::opt<bool> ClSanitizeAlignmentBuiltin(
"sanitize-alignment-builtin", llvm::cl::Hidden,
llvm::cl::desc("Instrument builtin functions for -fsanitize=alignment"),
llvm::cl::init(true));

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are you using a cl::opt here rather than a proper CodeGenOption? We generally don't use cl::opt in clang and want our options to be provided via our option structs instead, so that clang can be used as a library.

Copy link
Member Author

@MaskRay MaskRay Oct 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I acknowledge that cl::opt is discouraged (clang/lib has very few of them). This is a temporary workaround (so that our users can perform cleanups without being blocked) and should not be in Clang for too long. I'll revert this after ~two months, assuming that rolling update users (like us) have caught up on the change and tested their code bases. Given how large our code base is and how few (relatively) problems identified, I reasonably expect that the effect of instrumenting memcpy is small and we don't really need a driver option or cc1 option.

I modeled this after cl::opt<bool> ClSanitizeOnOptimizerEarlyEP in clang/lib/CodeGen/BackendUtil.cpp and thought that cl::opt might look more experimental than a cc1 option...

@MaskRay
Copy link
Member Author

MaskRay commented Nov 10, 2023

I plan to revert this workaround soon.

MaskRay added a commit that referenced this pull request Nov 13, 2023
…t-builtin to disable memcpy instrumentation (#69240)"

This reverts commit e8fe4de.

memcpy/memmove instrumentation for -fsanitize=alignment has been tested
on a huge code base. There were some cleanups but the number does not
justify a workaround.
zahiraam pushed a commit to zahiraam/llvm-project that referenced this pull request Nov 20, 2023
…t-builtin to disable memcpy instrumentation (llvm#69240)"

This reverts commit e8fe4de.

memcpy/memmove instrumentation for -fsanitize=alignment has been tested
on a huge code base. There were some cleanups but the number does not
justify a workaround.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:codegen clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants