Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

__builtin_assume with embedded function calls are ignored #91612

Closed
davidben opened this issue May 9, 2024 · 15 comments · Fixed by #93077
Closed

__builtin_assume with embedded function calls are ignored #91612

davidben opened this issue May 9, 2024 · 15 comments · Fixed by #93077
Assignees
Labels
clang:diagnostics New/improved warning or error message in Clang, but not in clang-tidy or static analyzer

Comments

@davidben
Copy link
Contributor

davidben commented May 9, 2024

I'm not sure if this is a bug or working as intended, but it surprised me. Consider this code:

bool f1(std::string_view sv, size_t n) {
    return n <= sv.size();
}

bool f2(std::string_view sv, size_t n) {
    __builtin_assume(n <= sv.size());
    return n <= sv.size();
}

bool f3(std::string_view sv, size_t n) {
    size_t size = sv.size();
    __builtin_assume(n <= size);
    return n <= sv.size();
}

https://godbolt.org/z/1G6cao4ss

While this is a toy function, this is a minimized version of plausible patterns in real world code that wishes to simultaneously use bounds-checked STL mode, but also needs to manually annotate a few performance-critical invariants that the compiler is unable to discover.

Although f2 and f3 look like they would be the same, Clang only deletes the check in f3. Now, Clang does emit a warning, so perhaps this is intended, but the warning says:

<source>:8:22: warning: the argument to '__builtin_assume' has side effects that will be discarded [-Wassume]
    8 |     __builtin_assume(n <= sv.size());
      |    

This reads to me like saying that the side effects in sv.size() will be discarded at program runtime, but not that the compiler will also discard the stated invariant. As a programmer, I would read this and think, well, okay, sv.size() actually no side effects but the type system doesn't know that, so I can silence that warning, and assume that the optimizer will still figure it out.

However, it seems that it actually means that the whole invariant is discarded and the optimizer doesn't use the invariant. This is a bit non-obvious. It can be worked around with the pattern in f3, but:

  • The programmer needs to know to do that
  • It is more verbose
  • Presumably __builtin_assume will actually be a macro that expands to nothing on non-Clang compilers, but then we'll get a warning about unused size, which is extra tedious.
@danakj
Copy link
Contributor

danakj commented May 9, 2024

Noting here that libc++'s _LIBCPP_ASSUME macro actually just turns off the warning before calling __builtin_assume so the developer will have no diagnostic at all to know it's doing nothing (if indeed that is always the outcome).

#if 0 && __has_builtin(__builtin_assume)
# define _LIBCPP_ASSUME(expression) \
(_LIBCPP_DIAGNOSTIC_PUSH _LIBCPP_CLANG_DIAGNOSTIC_IGNORED("-Wassume") \
__builtin_assume(static_cast<bool>(expression)) _LIBCPP_DIAGNOSTIC_POP)
#else
# define _LIBCPP_ASSUME(expression) ((void)0)
#endif

@davidben
Copy link
Contributor Author

davidben commented May 9, 2024

@danakj Yeah, I think we should fix this so that _LIBCPP_ASSUME still does __builtin_assume but we turn off the automatic thing where disabled assertions become assumes. (Possibly indefinitely because some assertions are off even in hardened mode, but every __builtin_assume is a safety hazard. They're sometimes useful, but probably the bar for adding them should be higher than "someone wanted a debug assert somewhere".)

@Sirraide
Copy link
Member

Yeah, this is a known issue. To summarise: LLVM is currently not equipped to handle assumptions with side-effects properly, so we just drop the assumption if it has side-effects. The same applies to C++23’s [[assume]] attribute.

@Sirraide
Copy link
Member

@Sirraide Sirraide added clang:frontend Language frontend issues, e.g. anything involving "Sema" and removed new issue labels May 14, 2024
@llvmbot
Copy link
Collaborator

llvmbot commented May 14, 2024

@llvm/issue-subscribers-clang-frontend

Author: David Benjamin (davidben)

I'm not sure if this is a bug or working as intended, but it surprised me. Consider this code:
bool f1(std::string_view sv, size_t n) {
    return n &lt;= sv.size();
}

bool f2(std::string_view sv, size_t n) {
    __builtin_assume(n &lt;= sv.size());
    return n &lt;= sv.size();
}

bool f3(std::string_view sv, size_t n) {
    size_t size = sv.size();
    __builtin_assume(n &lt;= size);
    return n &lt;= sv.size();
}

https://godbolt.org/z/1G6cao4ss

While this is a toy function, this is a minimized version of plausible patterns in real world code that wishes to simultaneously use bounds-checked STL mode, but also needs to manually annotate a few performance-critical invariants that the compiler is unable to discover.

Although f2 and f3 look like they would be the same, Clang only deletes the check in f3. Now, Clang does emit a warning, so perhaps this is intended, but the warning says:

&lt;source&gt;:8:22: warning: the argument to '__builtin_assume' has side effects that will be discarded [-Wassume]
    8 |     __builtin_assume(n &lt;= sv.size());
      |    

This reads to me like saying that the side effects in sv.size() will be discarded at program runtime, but not that the compiler will also discard the stated invariant. As a programmer, I would read this and think, well, okay, sv.size() actually no side effects but the type system doesn't know that, so I can silence that warning, and assume that the optimizer will still figure it out.

However, it seems that it actually means that the whole invariant is discarded and the optimizer doesn't use the invariant. This is a bit non-obvious. It can be worked around with the pattern in f3, but:

  • The programmer needs to know to do that
  • It is more verbose
  • Presumably __builtin_assume will actually be a macro that expands to nothing on non-Clang compilers, but then we'll get a warning about unused size, which is extra tedious.

1 similar comment
@llvmbot
Copy link
Collaborator

llvmbot commented May 14, 2024

@llvm/issue-subscribers-clang-frontend

Author: David Benjamin (davidben)

I'm not sure if this is a bug or working as intended, but it surprised me. Consider this code:
bool f1(std::string_view sv, size_t n) {
    return n &lt;= sv.size();
}

bool f2(std::string_view sv, size_t n) {
    __builtin_assume(n &lt;= sv.size());
    return n &lt;= sv.size();
}

bool f3(std::string_view sv, size_t n) {
    size_t size = sv.size();
    __builtin_assume(n &lt;= size);
    return n &lt;= sv.size();
}

https://godbolt.org/z/1G6cao4ss

While this is a toy function, this is a minimized version of plausible patterns in real world code that wishes to simultaneously use bounds-checked STL mode, but also needs to manually annotate a few performance-critical invariants that the compiler is unable to discover.

Although f2 and f3 look like they would be the same, Clang only deletes the check in f3. Now, Clang does emit a warning, so perhaps this is intended, but the warning says:

&lt;source&gt;:8:22: warning: the argument to '__builtin_assume' has side effects that will be discarded [-Wassume]
    8 |     __builtin_assume(n &lt;= sv.size());
      |    

This reads to me like saying that the side effects in sv.size() will be discarded at program runtime, but not that the compiler will also discard the stated invariant. As a programmer, I would read this and think, well, okay, sv.size() actually no side effects but the type system doesn't know that, so I can silence that warning, and assume that the optimizer will still figure it out.

However, it seems that it actually means that the whole invariant is discarded and the optimizer doesn't use the invariant. This is a bit non-obvious. It can be worked around with the pattern in f3, but:

  • The programmer needs to know to do that
  • It is more verbose
  • Presumably __builtin_assume will actually be a macro that expands to nothing on non-Clang compilers, but then we'll get a warning about unused size, which is extra tedious.

@Sirraide
Copy link
Member

I think we should fix this so that _LIBCPP_ASSUME still does __builtin_assume

Also, as a remark, iirc someone on the libc++ team disabled _LIBCPP_ASSUME beacuse __builtin_assume can currently actually worsen codegen in some cases, so it currently should not be used w/o actually testing that it improves something.

@davidben
Copy link
Contributor Author

That discussion seems unrelated to this issue. That seems to talk about how LLVM does not handle stray assumptions very well.

That is, don't just add naively convert all asserts to assumptions and add them for specific optimization purpose. What libc++ ran into was that naive use was bad, and then it worked around with too big of a hammer. See #91801

Just because assumes are bad when used naively doesn't mean they're always bad. This issue is about the interaction of cases when they are valuable, such as informing the compiler about invariants around bounds.

Now, when you write code in C++ or Rust or any other language that encourages abstractions, you will often call small accessors, which have "side effects". They don't actually have side effects, but you'll only discover this after inlining, which you always want in this context.

This issue is about assumptions with "side effects" getting ignored before that inlining. You can work around this by storing things in locals, but the warning is ambiguously phrased and doesn't even make it clear that the assumption is being ignored, just the side effects, but there are no side effects to ignore. At the least, LLVM should rephrase the warning.

@Sirraide
Copy link
Member

the warning is ambiguously phrased and doesn't even make it clear that the assumption is being ignored, just the side effects

I see yeah; I suppose the idea behind the warning is that if it worked correctly, you’d get the assumption information, but since the expression is never actually evaluated, then any of its side-effects will be discarded. As it stands, we just discard the entire assumption instead because LLVM can’t actually deal w/ discarding side-effects properly at the moment.

there are no side effects to ignore.

Yeah, we only really make a conservative assumption as to whether something could have side-effects, because this warning here is actually coming from Clang, not LLVM; we don’t actually look at the function body so we don’t know whether or not it has side-effects.

I suppose since we are actually discading the assumption at the moment, it would be more correct to print something along the lines of ‘assumption will be ignored, as it contains side-effects’ and change it back to what is is right now once the backend can actually handle assumptions with side-effects.

@cor3ntin
Copy link
Contributor

See the reply here #55636 (comment)

@Sirraide
Copy link
Member

‘assumption will be ignored, as it contains side-effects’

Or perhaps even ‘as it contains potential side-effects’

@davidben
Copy link
Contributor Author

Although, even better than rephrasing the warning, would be to simply lower assume like any other function call:

  1. Compute the boolean
  2. Assume the result of the computation

This makes Clang a better tool for programmers. Side effects work like anything else and we're no better or worse off w.r.t. naive vs reasoned uses of assume.

But if there's some deeper reason why that's not viable, the warning should be rephrased at the very least.

@Sirraide
Copy link
Member

Although, even better than rephrasing the warning, would be to simply lower assume like any other function call:

  1. Compute the boolean
  2. Assume the result of the computation

This makes Clang a better tool for programmers. Side effects work like anything else and we're no better or worse off w.r.t. naive vs reasoned uses of assume.

But if there's some deeper reason why that's not viable, the warning should be rephrased at the very least.

It’s not that something like that isn’t viable. Afaik that’s more or less what GCC does—it moves the assumption into a separate function so it can guarantee that it’s never evaluated. It’s just that someone would actually have to go ahead and implement that in LLVM...

@davidben
Copy link
Contributor Author

(Does GCC even have __builtin_assume? I don't see it in its docs.)

Ohh, okay I think I see the disconnect now. I was saying that the following would be pretty solid semantics for __builtin_assume:

void __builtin_assume(bool pred) {
  if (!pred) {
    __builtin_unreachable();
  }
}

That is, rather than muck about with side effects, just say the predicate is evaluated and move on. Evaluating something like idx <= str.size() is fine because the compiler will inline it and discover it can delete the whole thing. (Though I suppose there is a risk of the compiler not inlining it. It'd be nice if the optimizer could more reliably discover function purity without inlining.)

That's how Rust does this, for example:
https://doc.rust-lang.org/stable/core/intrinsics/fn.assume.html

But I suspect you all are thinking of the C++23 assume attribute, and intending for __builtin_assume to match that. That does seem to specify that the predicate's side effects must not happen. Indeed there are some interesting examples that include a function call for the sake of matching the subsequent call. (Neat, although effectively implementing that sounds like a nightmare!)
https://en.cppreference.com/w/cpp/language/attributes/assume

Given that, yeah, it makes sense that my suggestion isn't quite what we want. So probably the immediate thing to do is to rephrase the warning, at least until Clang is able to implement the C++23 attribute more fully.

@Sirraide
Copy link
Member

(Does GCC even have __builtin_assume? I don't see it in its docs.)

It does not, but it has __attribute__((assume(...))), which does pretty much the same (except that GCC’s backend can handle expressions w/ side-effects from what I know).

That is, rather than muck about with side effects, just say the predicate is evaluated and move on

I won’t deny that there are cases where that would probably not be an issue, but I don’t think we can change the semantics of __builtin_assume to do that at this point... If you want an assume that always evaluates its operand, then perhaps something along the lines of this would work:

#define _Assume(expr) (!(expr) ? __builtin_assume(0) : (void)0)

That said, I have no idea how good the optimiser would be at dealing with this; given that it already has problems optimising away assumptions that it doesn’t need—which can block other optimisations—I’m not sure how well this would work.

But I suspect you all are thinking of the C++23 assume attribute, and intending for __builtin_assume to match that.

Yeah, we had an extensive discussion about this topic when I implemented C++23’s [[assume]] (you can check the Discourse thread I linked) and decided that that would be the least confusing course of action.

Given that, yeah, it makes sense that my suggestion isn't quite what we want. So probably the immediate thing to do is to rephrase the warning, at least until Clang is able to implement the C++23 attribute more fully.

Yeah, I can take a look at making the warning a bit clearer when I have the time. We should be able to support expressions w/ (potential) side-effects in both __builtin_assume and [[assume]] once the optimiser has better support for them, but as I said, someone would have to implement that first (iirc there was some talk about maybe making that a gsoc project or sth like that, but I’m only really familiar w/ the frontend, so idk what the state of that is atm).

@Sirraide Sirraide self-assigned this May 20, 2024
Sirraide added a commit that referenced this issue May 27, 2024
Currently, if the argument to `__builtin_assume` and friends contains
side-effects, we issue the following diagnostic:
```
<source>:1:34: warning: the argument to '__builtin_assume' has side 
effects that will be discarded [-Wassume]
    1 | void f(int x) { __builtin_assume(x++); }
      |                                  
```
The issue here is that this diagnostic misrepresents what is actually
happening: not only do we discard the side-effects of the expression,
but we also don’t even emit any assumption information at all because
the backend is not equipped to deal with eliminating side-effects in
cases such as this.

This has caused some confusion (see #91612) beacuse the current wording
of the warning suggests that, sensibly, only the side-effects of the
expression, and not the assumption itself, will be discarded.

This pr updates the diagnostic to state what is actually happening: that
the assumption has no effect at all because its argument contains
side-effects:
```
<source>:1:34: warning: assumption is ignored because it contains 
(potential) side-effects [-Wassume]
    1 | void f(int x) { __builtin_assume(x++); }
      |                                  
```

I’ve deliberately included ‘(potential)’ here because even expressions
that only contain potential side-effects (e.g. `true ? x : x++` or a
call to a function that is pure, but we don’t know that it is) cause the
assumption to be discarded. This, too, has caused some confusion because
it was erroneously assumed that Clang would e.g. infer that a function
call is pure and not discard the assumption as a result when that isn’t
the case.

This is intended to be temporary; we should revert back to the original
diagnostic once we have proper support for assumptions with side-effects
in the backend (in which case the side-effects will still be discarded,
but the assumption won’t)

This fixes #91612.
@EugeneZelenko EugeneZelenko added clang:diagnostics New/improved warning or error message in Clang, but not in clang-tidy or static analyzer and removed clang:frontend Language frontend issues, e.g. anything involving "Sema" labels May 27, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:diagnostics New/improved warning or error message in Clang, but not in clang-tidy or static analyzer
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants