__builtin_assume with embedded function calls are ignored #91612

davidben · 2024-05-09T16:02:02Z

I'm not sure if this is a bug or working as intended, but it surprised me. Consider this code:

bool f1(std::string_view sv, size_t n) {
    return n <= sv.size();
}

bool f2(std::string_view sv, size_t n) {
    __builtin_assume(n <= sv.size());
    return n <= sv.size();
}

bool f3(std::string_view sv, size_t n) {
    size_t size = sv.size();
    __builtin_assume(n <= size);
    return n <= sv.size();
}

https://godbolt.org/z/1G6cao4ss

While this is a toy function, this is a minimized version of plausible patterns in real world code that wishes to simultaneously use bounds-checked STL mode, but also needs to manually annotate a few performance-critical invariants that the compiler is unable to discover.

Although f2 and f3 look like they would be the same, Clang only deletes the check in f3. Now, Clang does emit a warning, so perhaps this is intended, but the warning says:

<source>:8:22: warning: the argument to '__builtin_assume' has side effects that will be discarded [-Wassume]
    8 |     __builtin_assume(n <= sv.size());
      |

This reads to me like saying that the side effects in sv.size() will be discarded at program runtime, but not that the compiler will also discard the stated invariant. As a programmer, I would read this and think, well, okay, sv.size() actually no side effects but the type system doesn't know that, so I can silence that warning, and assume that the optimizer will still figure it out.

However, it seems that it actually means that the whole invariant is discarded and the optimizer doesn't use the invariant. This is a bit non-obvious. It can be worked around with the pattern in f3, but:

The programmer needs to know to do that
It is more verbose
Presumably __builtin_assume will actually be a macro that expands to nothing on non-Clang compilers, but then we'll get a warning about unused size, which is extra tedious.

The text was updated successfully, but these errors were encountered:

danakj · 2024-05-09T16:51:16Z

Noting here that libc++'s _LIBCPP_ASSUME macro actually just turns off the warning before calling __builtin_assume so the developer will have no diagnostic at all to know it's doing nothing (if indeed that is always the outcome).

llvm-project/libcxx/include/__assert

Lines 29 to 35 in 2083e97

    
           #if 0 && __has_builtin(__builtin_assume) 
        
           #  define _LIBCPP_ASSUME(expression)                                                                                   \ 
        
               (_LIBCPP_DIAGNOSTIC_PUSH _LIBCPP_CLANG_DIAGNOSTIC_IGNORED("-Wassume")                                              \ 
        
                    __builtin_assume(static_cast<bool>(expression)) _LIBCPP_DIAGNOSTIC_POP) 
        
           #else 
        
           #  define _LIBCPP_ASSUME(expression) ((void)0) 
        
           #endif

davidben · 2024-05-09T18:21:17Z

@danakj Yeah, I think we should fix this so that _LIBCPP_ASSUME still does __builtin_assume but we turn off the automatic thing where disabled assertions become assumes. (Possibly indefinitely because some assertions are off even in hardened mode, but every __builtin_assume is a safety hazard. They're sometimes useful, but probably the bar for adding them should be higher than "someone wanted a debug assert somewhere".)

Sirraide · 2024-05-14T14:12:19Z

Yeah, this is a known issue. To summarise: LLVM is currently not equipped to handle assumptions with side-effects properly, so we just drop the assumption if it has side-effects. The same applies to C++23’s [[assume]] attribute.

Sirraide · 2024-05-14T14:13:00Z

See also: https://discourse.llvm.org/t/rfc-builtin-assume-assume-optimization-behavior/76943

llvmbot · 2024-05-14T14:14:36Z

@llvm/issue-subscribers-clang-frontend

Author: David Benjamin (davidben)

I'm not sure if this is a bug or working as intended, but it surprised me. Consider this code:

bool f1(std::string_view sv, size_t n) {
    return n &lt;= sv.size();
}

bool f2(std::string_view sv, size_t n) {
    __builtin_assume(n &lt;= sv.size());
    return n &lt;= sv.size();
}

bool f3(std::string_view sv, size_t n) {
    size_t size = sv.size();
    __builtin_assume(n &lt;= size);
    return n &lt;= sv.size();
}

https://godbolt.org/z/1G6cao4ss

While this is a toy function, this is a minimized version of plausible patterns in real world code that wishes to simultaneously use bounds-checked STL mode, but also needs to manually annotate a few performance-critical invariants that the compiler is unable to discover.

Although f2 and f3 look like they would be the same, Clang only deletes the check in f3. Now, Clang does emit a warning, so perhaps this is intended, but the warning says:

&lt;source&gt;:8:22: warning: the argument to '__builtin_assume' has side effects that will be discarded [-Wassume]
    8 |     __builtin_assume(n &lt;= sv.size());
      |

This reads to me like saying that the side effects in sv.size() will be discarded at program runtime, but not that the compiler will also discard the stated invariant. As a programmer, I would read this and think, well, okay, sv.size() actually no side effects but the type system doesn't know that, so I can silence that warning, and assume that the optimizer will still figure it out.

However, it seems that it actually means that the whole invariant is discarded and the optimizer doesn't use the invariant. This is a bit non-obvious. It can be worked around with the pattern in f3, but:

The programmer needs to know to do that
It is more verbose
Presumably __builtin_assume will actually be a macro that expands to nothing on non-Clang compilers, but then we'll get a warning about unused size, which is extra tedious.

llvmbot · 2024-05-14T14:14:39Z

@llvm/issue-subscribers-clang-frontend

Author: David Benjamin (davidben)

I'm not sure if this is a bug or working as intended, but it surprised me. Consider this code:

bool f1(std::string_view sv, size_t n) {
    return n &lt;= sv.size();
}

bool f2(std::string_view sv, size_t n) {
    __builtin_assume(n &lt;= sv.size());
    return n &lt;= sv.size();
}

bool f3(std::string_view sv, size_t n) {
    size_t size = sv.size();
    __builtin_assume(n &lt;= size);
    return n &lt;= sv.size();
}

https://godbolt.org/z/1G6cao4ss

While this is a toy function, this is a minimized version of plausible patterns in real world code that wishes to simultaneously use bounds-checked STL mode, but also needs to manually annotate a few performance-critical invariants that the compiler is unable to discover.

Although f2 and f3 look like they would be the same, Clang only deletes the check in f3. Now, Clang does emit a warning, so perhaps this is intended, but the warning says:

&lt;source&gt;:8:22: warning: the argument to '__builtin_assume' has side effects that will be discarded [-Wassume]
    8 |     __builtin_assume(n &lt;= sv.size());
      |

This reads to me like saying that the side effects in sv.size() will be discarded at program runtime, but not that the compiler will also discard the stated invariant. As a programmer, I would read this and think, well, okay, sv.size() actually no side effects but the type system doesn't know that, so I can silence that warning, and assume that the optimizer will still figure it out.

However, it seems that it actually means that the whole invariant is discarded and the optimizer doesn't use the invariant. This is a bit non-obvious. It can be worked around with the pattern in f3, but:

The programmer needs to know to do that
It is more verbose
Presumably __builtin_assume will actually be a macro that expands to nothing on non-Clang compilers, but then we'll get a warning about unused size, which is extra tedious.

Sirraide · 2024-05-14T14:16:44Z

I think we should fix this so that _LIBCPP_ASSUME still does __builtin_assume

Also, as a remark, iirc someone on the libc++ team disabled _LIBCPP_ASSUME beacuse __builtin_assume can currently actually worsen codegen in some cases, so it currently should not be used w/o actually testing that it improves something.

davidben · 2024-05-14T14:23:50Z

That discussion seems unrelated to this issue. That seems to talk about how LLVM does not handle stray assumptions very well.

That is, don't just add naively convert all asserts to assumptions and add them for specific optimization purpose. What libc++ ran into was that naive use was bad, and then it worked around with too big of a hammer. See #91801

Just because assumes are bad when used naively doesn't mean they're always bad. This issue is about the interaction of cases when they are valuable, such as informing the compiler about invariants around bounds.

Now, when you write code in C++ or Rust or any other language that encourages abstractions, you will often call small accessors, which have "side effects". They don't actually have side effects, but you'll only discover this after inlining, which you always want in this context.

This issue is about assumptions with "side effects" getting ignored before that inlining. You can work around this by storing things in locals, but the warning is ambiguously phrased and doesn't even make it clear that the assumption is being ignored, just the side effects, but there are no side effects to ignore. At the least, LLVM should rephrase the warning.

Sirraide · 2024-05-14T14:30:48Z

the warning is ambiguously phrased and doesn't even make it clear that the assumption is being ignored, just the side effects

I see yeah; I suppose the idea behind the warning is that if it worked correctly, you’d get the assumption information, but since the expression is never actually evaluated, then any of its side-effects will be discarded. As it stands, we just discard the entire assumption instead because LLVM can’t actually deal w/ discarding side-effects properly at the moment.

there are no side effects to ignore.

Yeah, we only really make a conservative assumption as to whether something could have side-effects, because this warning here is actually coming from Clang, not LLVM; we don’t actually look at the function body so we don’t know whether or not it has side-effects.

I suppose since we are actually discading the assumption at the moment, it would be more correct to print something along the lines of ‘assumption will be ignored, as it contains side-effects’ and change it back to what is is right now once the backend can actually handle assumptions with side-effects.

cor3ntin · 2024-05-14T14:31:45Z

See the reply here #55636 (comment)

Sirraide · 2024-05-14T14:32:00Z

‘assumption will be ignored, as it contains side-effects’

Or perhaps even ‘as it contains potential side-effects’

davidben · 2024-05-14T14:32:02Z

Although, even better than rephrasing the warning, would be to simply lower assume like any other function call:

Compute the boolean
Assume the result of the computation

This makes Clang a better tool for programmers. Side effects work like anything else and we're no better or worse off w.r.t. naive vs reasoned uses of assume.

But if there's some deeper reason why that's not viable, the warning should be rephrased at the very least.

Sirraide · 2024-05-14T14:33:57Z

Although, even better than rephrasing the warning, would be to simply lower assume like any other function call:

Compute the boolean

Assume the result of the computation

This makes Clang a better tool for programmers. Side effects work like anything else and we're no better or worse off w.r.t. naive vs reasoned uses of assume.

But if there's some deeper reason why that's not viable, the warning should be rephrased at the very least.

It’s not that something like that isn’t viable. Afaik that’s more or less what GCC does—it moves the assumption into a separate function so it can guarantee that it’s never evaluated. It’s just that someone would actually have to go ahead and implement that in LLVM...

davidben · 2024-05-14T15:23:28Z

(Does GCC even have __builtin_assume? I don't see it in its docs.)

Ohh, okay I think I see the disconnect now. I was saying that the following would be pretty solid semantics for __builtin_assume:

void __builtin_assume(bool pred) {
  if (!pred) {
    __builtin_unreachable();
  }
}

That is, rather than muck about with side effects, just say the predicate is evaluated and move on. Evaluating something like idx <= str.size() is fine because the compiler will inline it and discover it can delete the whole thing. (Though I suppose there is a risk of the compiler not inlining it. It'd be nice if the optimizer could more reliably discover function purity without inlining.)

That's how Rust does this, for example:
https://doc.rust-lang.org/stable/core/intrinsics/fn.assume.html

But I suspect you all are thinking of the C++23 assume attribute, and intending for __builtin_assume to match that. That does seem to specify that the predicate's side effects must not happen. Indeed there are some interesting examples that include a function call for the sake of matching the subsequent call. (Neat, although effectively implementing that sounds like a nightmare!)
https://en.cppreference.com/w/cpp/language/attributes/assume

Given that, yeah, it makes sense that my suggestion isn't quite what we want. So probably the immediate thing to do is to rephrase the warning, at least until Clang is able to implement the C++23 attribute more fully.

Sirraide · 2024-05-14T15:37:48Z

(Does GCC even have __builtin_assume? I don't see it in its docs.)

It does not, but it has __attribute__((assume(...))), which does pretty much the same (except that GCC’s backend can handle expressions w/ side-effects from what I know).

That is, rather than muck about with side effects, just say the predicate is evaluated and move on

I won’t deny that there are cases where that would probably not be an issue, but I don’t think we can change the semantics of __builtin_assume to do that at this point... If you want an assume that always evaluates its operand, then perhaps something along the lines of this would work:

#define _Assume(expr) (!(expr) ? __builtin_assume(0) : (void)0)

That said, I have no idea how good the optimiser would be at dealing with this; given that it already has problems optimising away assumptions that it doesn’t need—which can block other optimisations—I’m not sure how well this would work.

But I suspect you all are thinking of the C++23 assume attribute, and intending for __builtin_assume to match that.

Yeah, we had an extensive discussion about this topic when I implemented C++23’s [[assume]] (you can check the Discourse thread I linked) and decided that that would be the least confusing course of action.

Given that, yeah, it makes sense that my suggestion isn't quite what we want. So probably the immediate thing to do is to rephrase the warning, at least until Clang is able to implement the C++23 attribute more fully.

Yeah, I can take a look at making the warning a bit clearer when I have the time. We should be able to support expressions w/ (potential) side-effects in both __builtin_assume and [[assume]] once the optimiser has better support for them, but as I said, someone would have to implement that first (iirc there was some talk about maybe making that a gsoc project or sth like that, but I’m only really familiar w/ the frontend, so idk what the state of that is atm).

Currently, if the argument to `__builtin_assume` and friends contains side-effects, we issue the following diagnostic: ``` <source>:1:34: warning: the argument to '__builtin_assume' has side effects that will be discarded [-Wassume] 1 | void f(int x) { __builtin_assume(x++); } | ``` The issue here is that this diagnostic misrepresents what is actually happening: not only do we discard the side-effects of the expression, but we also don’t even emit any assumption information at all because the backend is not equipped to deal with eliminating side-effects in cases such as this. This has caused some confusion (see #91612) beacuse the current wording of the warning suggests that, sensibly, only the side-effects of the expression, and not the assumption itself, will be discarded. This pr updates the diagnostic to state what is actually happening: that the assumption has no effect at all because its argument contains side-effects: ``` <source>:1:34: warning: assumption is ignored because it contains (potential) side-effects [-Wassume] 1 | void f(int x) { __builtin_assume(x++); } | ``` I’ve deliberately included ‘(potential)’ here because even expressions that only contain potential side-effects (e.g. `true ? x : x++` or a call to a function that is pure, but we don’t know that it is) cause the assumption to be discarded. This, too, has caused some confusion because it was erroneously assumed that Clang would e.g. infer that a function call is pure and not discard the assumption as a result when that isn’t the case. This is intended to be temporary; we should revert back to the original diagnostic once we have proper support for assumptions with side-effects in the backend (in which case the side-effects will still be discarded, but the assumption won’t) This fixes #91612.

github-actions bot added the new issue label May 9, 2024

coopp mentioned this issue May 9, 2024

[HLSL] Compiler-as-a-library interface for clang HLSL #63631

Open

8 tasks

Sirraide added clang:frontend Language frontend issues, e.g. anything involving "Sema" and removed new issue labels May 14, 2024

Sirraide self-assigned this May 20, 2024

Sirraide mentioned this issue May 22, 2024

[Clang] [NFC] Clarify assume diagnostic #93077

Merged

Sirraide closed this as completed in #93077 May 27, 2024

EugeneZelenko added clang:diagnostics New/improved warning or error message in Clang, but not in clang-tidy or static analyzer and removed clang:frontend Language frontend issues, e.g. anything involving "Sema" labels May 27, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

__builtin_assume with embedded function calls are ignored #91612

__builtin_assume with embedded function calls are ignored #91612

davidben commented May 9, 2024

danakj commented May 9, 2024

davidben commented May 9, 2024

Sirraide commented May 14, 2024

Sirraide commented May 14, 2024

llvmbot commented May 14, 2024

llvmbot commented May 14, 2024

Sirraide commented May 14, 2024

davidben commented May 14, 2024

Sirraide commented May 14, 2024

cor3ntin commented May 14, 2024

Sirraide commented May 14, 2024

davidben commented May 14, 2024

Sirraide commented May 14, 2024

davidben commented May 14, 2024

Sirraide commented May 14, 2024

__builtin_assume with embedded function calls are ignored #91612

__builtin_assume with embedded function calls are ignored #91612

Comments

davidben commented May 9, 2024

danakj commented May 9, 2024

davidben commented May 9, 2024

Sirraide commented May 14, 2024

Sirraide commented May 14, 2024

llvmbot commented May 14, 2024

llvmbot commented May 14, 2024

Sirraide commented May 14, 2024

davidben commented May 14, 2024

Sirraide commented May 14, 2024

cor3ntin commented May 14, 2024

Sirraide commented May 14, 2024

davidben commented May 14, 2024

Sirraide commented May 14, 2024

davidben commented May 14, 2024

Sirraide commented May 14, 2024