Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Instrumentation] Fix EdgeCounts vector size in SetBranchWeights #99064

Merged

Conversation

avikivity
Copy link
Contributor

@avikivity avikivity commented Jul 16, 2024

SetBranchWeights() calculates the size of the EdgeCounts vector
using OutEdges.Size(), but this is an under-estimate with coroutines.

Use the number of successors, as the vector will be indexed by
the result of the GetSuccessorNumber() function.

Rename the Size local, to make it clear what it refers to.

A unit test, provided by @ellishg, is included.

Fixes #97962

(regression from ffd337b)

Copy link

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be
notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write
permissions for the repository. In which case you can instead tag reviewers by
name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review
by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate
is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

@llvmbot llvmbot added PGO Profile Guided Optimizations llvm:transforms labels Jul 16, 2024
@llvmbot
Copy link
Collaborator

llvmbot commented Jul 16, 2024

@llvm/pr-subscribers-coroutines
@llvm/pr-subscribers-pgo

@llvm/pr-subscribers-llvm-transforms

Author: Avi Kivity (avikivity)

Changes

SetBranchWeights() calculates the size of the EdgeCounts vector using OutEdges.Size(), but this is an under-estimate with coroutines.

Use the number of successors, as the vector will be indexed by the result of the GetSuccessorNumber() function.

This crashes in 18.1, but somehow doesn't in main. Still, it looks like the fix is applicable to both.

Fixes #97962


Full diff: https://github.com/llvm/llvm-project/pull/99064.diff

1 Files Affected:

  • (modified) llvm/lib/Transforms/Instrumentation/PGOInstrumentation.cpp (+3-1)
diff --git a/llvm/lib/Transforms/Instrumentation/PGOInstrumentation.cpp b/llvm/lib/Transforms/Instrumentation/PGOInstrumentation.cpp
index 35b1bbf21be97..d32d00700bcb3 100644
--- a/llvm/lib/Transforms/Instrumentation/PGOInstrumentation.cpp
+++ b/llvm/lib/Transforms/Instrumentation/PGOInstrumentation.cpp
@@ -1624,7 +1624,9 @@ void PGOUseFunc::setBranchWeights() {
 
     // We have a non-zero Branch BB.
     unsigned Size = BBCountInfo.OutEdges.size();
-    SmallVector<uint64_t, 2> EdgeCounts(Size, 0);
+    unsigned SuccessorCount = BB.getTerminator()->getNumSuccessors();
+
+    SmallVector<uint64_t, 2> EdgeCounts(SuccessorCount, 0);
     uint64_t MaxCount = 0;
     for (unsigned s = 0; s < Size; s++) {
       const PGOUseEdge *E = BBCountInfo.OutEdges[s];

@avikivity
Copy link
Contributor Author

Requesting review from @xur-llvm as author of this function.

@avikivity avikivity force-pushed the PGOUseFunc-setBranchWeights-fix-EdgeCounts-size branch from eae8cec to b7db364 Compare July 21, 2024 10:55
@avikivity
Copy link
Contributor Author

Updated: renamed Size to OutEdgesCount to make it clear what it refers to.

@ellishg
Copy link
Contributor

ellishg commented Jul 23, 2024

Seems ok, but I'd like to see a test so we don't break this in the future.

@avikivity
Copy link
Contributor Author

Seems ok, but I'd like to see a test so we don't break this in the future.

I'm not an llvm developer, so any tips about reducing my C++ reproducer to a minimal llvm reproducer would be appreciated.

@ellishg
Copy link
Contributor

ellishg commented Jul 24, 2024

Seems ok, but I'd like to see a test so we don't break this in the future.

I'm not an llvm developer, so any tips about reducing my C++ reproducer to a minimal llvm reproducer would be appreciated.

There are a few ways to get a minimal reproducer depending on your setup. https://llvm.org/docs/HowToSubmitABug.html

I would first try to get a reproducer with unoptimized IR by invoking clang with -emit-llvm -S -Xclang -disable-llvm-optzns -o a.ll. Then you can use llvm-reduce to further reduce the IR to a small test case. https://llvm.org/docs/CommandGuide/llvm-reduce.html#example

@avikivity
Copy link
Contributor Author

Tacking on those flags to the command line (without -Xclang which was rejected) I get an error.

command:

 "/usr/bin/clang++" "-cc1" "-triple" "x86_64-redhat-linux-gnu" "-emit-llvm-bc" "-flto=thin" "-flto-unit" "-disable-free" "-clear-ast-before-backend" "-disable-llvm-verifier" "-discard-value-names" "-main-file-name" "rpc.cc" "-mrelocation-model" "static" "-mframe-pointer=none" "-fmath-errno" "-ffp-contract=on" "-fno-rounding-math" "-mconstructor-aliases" "-funwind-tables=2" "-target-cpu" "westmere" "-debug-info-kind=constructor" "-dwarf-version=5" "-debugger-tuning=gdb" "--compress-debug-sections=zlib" "-fdebug-compilation-dir=/home/avi/scylla-maint/build/release-cs-pgo/seastar.lto" "-fdebug-prefix-map=/home/avi/scylla-maint=." "-fprofile-instrument=csllvm" "-fprofile-instrument-path=/home/avi/scylla-maint/build/release-cs-pgo/default_%m.profraw" "-fprofile-instrument-use-path=merged.profdata" "-fcoverage-compilation-dir=/home/avi/scylla-maint/build/release-cs-pgo/seastar.lto" "-sys-header-deps" "-D" "FMT_SHARED" "-D" "SEASTAR_API_LEVEL=7" "-D" "SEASTAR_DEFERRED_ACTION_REQUIRE_NOEXCEPT" "-D" "SEASTAR_HAS_MEMBARRIER" "-D" "SEASTAR_HAVE_ASAN_FIBER_SUPPORT" "-D" "SEASTAR_HAVE_HWLOC" "-D" "SEASTAR_HAVE_NUMA" "-D" "SEASTAR_HAVE_SYSTEMTAP_SDT" "-D" "SEASTAR_LOGGER_COMPILE_TIME_FMT" "-D" "SEASTAR_LOGGER_TYPE_STDOUT" "-D" "SEASTAR_PTHREAD_ATTR_SETAFFINITY_NP" "-D" "SEASTAR_SCHEDULING_GROUPS_COUNT=18" "-D" "SEASTAR_SSTRING" "-D" "SEASTAR_STRERROR_R_CHAR_P" "-D" "NDEBUG" "-U" "_FORTIFY_SOURCE" "-U" "NDEBUG" "-fmacro-prefix-map=/home/avi/scylla-maint=." "-fcoverage-prefix-map=/home/avi/scylla-maint=." "-O2" "-Werror=unused-result" "-Wno-error=#warnings" "-Wall" "-Wimplicit-fallthrough" "-Wdeprecated" "-Wno-error=deprecated" "-Wno-backend-plugin" "-Werror" "-std=gnu++23" "-fdeprecated-macro" "-ferror-limit" "19" "-fvisibility=hidden" "-fgnuc-version=4.2.1" "-fno-implicit-modules" "-fskip-odr-check-in-gmf" "-fcxx-exceptions" "-fexceptions" "-fcolor-diagnostics" "-vectorize-loops" "-vectorize-slp" "-mllvm" "-pgso=false" "-mllvm" "-enable-value-profiling=false" "-faddrsig" "-D__GCC_HAVE_DWARF2_CFI_ASM=1" "-x" "c++" "rpc-195796.cpp"  -emit-llvm -S  -disable-llvm-optzns -o a.ll

error:

fatal error: error in backend: Do not know how to promote this operator!

@avikivity
Copy link
Contributor Author

I'm guessing some internal pass related to coroutines cannot be skipped.

@avikivity
Copy link
Contributor Author

avikivity commented Jul 25, 2024

cvise was able to reduce the source, but I don't think it's very helpful.

namespace std {
typedef long unsigned size_t;
typedef long ptrdiff_t;
template <typename _Tp, _Tp __v> struct integral_constant {
  static constexpr _Tp value = __v;
};
template <bool __v> using __bool_constant = integral_constant<bool, __v>;
using false_type = __bool_constant<false>;
template <bool, typename = void> struct enable_if;
template <typename _Tp> struct enable_if<true, _Tp> {
  using type = _Tp;
};
namespace __detail {
auto __and_fn() -> false_type;
}
template <typename...> struct __and_ : decltype(__detail::__and_fn()){};
template <typename...> constexpr bool __and_v = __and_<>::value;
template <typename> struct is_void : false_type {};
template <typename _Tp> struct remove_reference {
  using type = _Tp;
};
template <typename _Tp> struct remove_reference<_Tp &> {
  using type = _Tp;
};
template <bool, typename, typename> struct conditional;
template <typename _Iftrue, typename _Iffalse>
struct conditional<false, _Iftrue, _Iffalse> {
  using type = _Iffalse;
};
template <bool _Cond, typename> using enable_if_t = enable_if<_Cond>::type;
template <bool _Cond, typename _Iftrue, typename _Iffalse>
using conditional_t = conditional<_Cond, _Iftrue, _Iffalse>::type;
template <typename _Tp> constexpr bool is_void_v = is_void<_Tp>::value;
template <typename _Tp> constexpr bool is_object_v = __is_object(_Tp);
template <typename _Tp>
constexpr bool is_copy_constructible_v = __is_constructible(_Tp);
template <typename _Tp>
constexpr bool is_move_constructible_v = __is_constructible(_Tp);
template <typename _Tp, typename _Up>
constexpr bool is_same_v = __is_same(_Tp, _Up);
template <typename _Tp> _Tp &&forward(typename remove_reference<_Tp>::type &);
template <typename _Tp> remove_reference<_Tp>::type &&move(_Tp &&);
template <typename> struct iterator_traits;
template <typename> struct __is_nonvolatile_trivially_copyable {
  enum { __value };
};
template <typename, typename> struct __memcpyable;
template <typename _Tp>
struct __memcpyable<_Tp *, const _Tp *>
    : __is_nonvolatile_trivially_copyable<_Tp> {};
template <typename> struct __is_move_iterator {
  enum { __value };
};
struct random_access_iterator_tag {};
template <typename _Tp>
  requires is_object_v<_Tp>
struct iterator_traits<_Tp> {
  using iterator_category = random_access_iterator_tag;
};
template <typename _Iter>
iterator_traits<_Iter>::iterator_category __iterator_category(_Iter) {
  return typename iterator_traits<_Iter>::iterator_category();
}
} // namespace std
void *operator new(std::size_t, void *);
namespace std {
template <typename...> class tuple;
template <typename _Tp> _Tp min(_Tp __a, _Tp) { return __a; }
template <typename _Iterator> _Iterator __niter_base(_Iterator) noexcept;
template <bool, bool, typename> struct __copy_move {
  template <typename _Tp, typename _Up>
  static void __assign_one(_Tp __to, _Up __from) {
    *__to = *__from;
  }
  template <typename _Tp, typename _Up>
  static _Up *__copy_m(_Tp *__first, _Tp *__last, _Up *__result) {
    ptrdiff_t _Num = __last - __first;
    if (__builtin_expect(_Num > 1, true))
      __builtin_memmove(__result, __first, _Num);
    else if (_Num == 1)
      __assign_one(__result, __first);
    return __result;
  }
};
template <bool _IsMove, typename _II, typename _OI>
_OI __copy_move_a2(_II __first, _II __last, _OI __result) {
  typedef typename iterator_traits<_II>::iterator_category _Category;
  return __copy_move<_IsMove, __memcpyable<_OI, _II>::__value,
                     _Category>::__copy_m(__first, __last, __result);
}
template <bool _IsMove, typename _II, typename _OI>
_OI __copy_move_a1(_II __first, _II __last, _OI __result) {
  return __copy_move_a2<_IsMove>(__first, __last, __result);
}
template <bool _IsMove, typename _II, typename _OI>
_OI __copy_move_a(_II __first, _II __last, _OI __result) {
  return __copy_move_a1<_IsMove>(__first, __niter_base(__last), __result);
}
template <typename _II, typename _OI>
_OI copy(_II __first, _II __last, _OI __result) {
  return __copy_move_a<__is_move_iterator<_II>::__value>(__first, __last,
                                                         __result);
}
template <bool, bool, bool, bool, typename> struct _Enable_copy_move {};
} // namespace std
typedef int __uint32_t;
typedef long unsigned size_t;
typedef __uint32_t uint32_t;
void __assert_fail(char *, const char *, int, const char *) noexcept
    __attribute__((__noreturn__));
namespace std {
template <typename _Tp>
class optional
    : _Enable_copy_move<is_copy_constructible_v<_Tp>, __and_v<>,
                        is_move_constructible_v<_Tp>, __and_v<>, _Tp> {};
template <typename _RandomAccessIterator, typename _Size,
          typename _OutputIterator>
_OutputIterator __copy_n(_RandomAccessIterator __first, _Size,
                         _OutputIterator __result, random_access_iterator_tag) {
  return copy(__first, __first, __result);
}
template <typename _InputIterator, typename _Size, typename _OutputIterator>
_OutputIterator copy_n(_InputIterator __first, _Size __n,
                       _OutputIterator __result) {
  auto __n2(__n);
  if (__n2 <= 0)
    return __result;
  return __copy_n(__first, __n2, __result, __iterator_category(__first));
}
template <typename> class unique_ptr {};
} // namespace std
namespace seastar {
class deleter {
  struct impl;
  impl *_impl;

public:
  ~deleter();
  bool is_raw_object() noexcept;
  void to_raw_object() noexcept;
};
struct deleter::impl {
  unsigned refs;
  virtual ~impl();
};
inline deleter::~deleter() {
  if (is_raw_object()) {
    to_raw_object();
    return;
  }
  if (_impl && --_impl->refs == 0)
    delete _impl;
}
deleter make_free_deleter(void *);
template <typename CharType> class temporary_buffer {
  CharType *_buffer;
  deleter _deleter;

public:
  temporary_buffer(size_t) : _deleter(make_free_deleter(_buffer)) {}
  temporary_buffer(temporary_buffer &&) : _deleter() {}
  temporary_buffer &operator=(temporary_buffer &&) noexcept;
  const CharType *get() noexcept;
  CharType *get_write() noexcept;
  size_t size() noexcept;
};
} // namespace seastar
namespace std {
template <typename _Tp> struct atomic {
  _Tp load() noexcept;
};
} // namespace std
namespace seastar {
namespace internal {
struct preemption_monitor {
  std::atomic<uint32_t> tail;
};
preemption_monitor *get_need_preempt_var() {
  preemption_monitor *g_need_preempt;
  return g_need_preempt;
}
} // namespace internal
bool need_preempt() {
  auto np = internal::get_need_preempt_var();
  auto tail = np->tail.load();
  return __builtin_expect(tail, false);
}
namespace internal {
struct monostate;
template <typename...> struct future_stored_type;
template <typename T> struct future_stored_type<T> {
  using type = std::conditional_t<std::is_void_v<T>, monostate, T>;
};
template <typename... T>
using future_stored_type_t = future_stored_type<T...>::type;
template <typename, bool> struct uninitialized_wrapper_base;
template <typename T> struct uninitialized_wrapper_base<T, false> {
  using tuple_type = T;
  union any {
    any() {}
    ~any();
    T value;
  } _v;
  template <typename... U>
  std::enable_if_t<!std::is_same_v<std::tuple<>, std::tuple<tuple_type>>, void>
  uninitialized_set(U &&...vs) {
    new (&_v) T(T(std::forward<U>(vs)...));
  }
  T &uninitialized_get() { return _v.value; }
};
template <typename> constexpr bool can_inherit = 0;
template <typename T>
struct uninitialized_wrapper : uninitialized_wrapper_base<T, can_inherit<T>> {};
} // namespace internal
struct future_state_base {
  enum state { exception_min };
  union any {
    any() noexcept;
    bool available() { return st >= exception_min; }
    bool has_result() noexcept;
    state st;
  } _u;
  bool available() { return _u.available(); }
};
template <typename T>
struct future_state : future_state_base, internal::uninitialized_wrapper<T> {
  future_state() = default;
  void move_it(future_state &&x) {
    if (_u.has_result())
      this->uninitialized_set(std::move(x.uninitialized_get()));
  }
  future_state(future_state &&x) { move_it(std::move(x)); }
  void set() {
    _u.st ? void()
          : __assert_fail("", __builtin_FILE(), __builtin_LINE(),
                          __PRETTY_FUNCTION__);
    new (this) future_state;
  }
};
namespace internal {
class promise_base {
protected:
  future_state_base *_state;
  promise_base(future_state_base *) {}
  friend class future_base;
};
template <typename T> class promise_base_with_type : promise_base {
protected:
  using future_state = future_state<T>;
  future_state *get_state() { return static_cast<future_state *>(_state); }
  promise_base_with_type(future_state_base *state) : promise_base(state) {}
  template <typename A> void set_value(A &&) {
    if (auto s = get_state())
      s->set();
  }
};
} // namespace internal
template <typename T> class promise : internal::promise_base_with_type<T> {
  using future_state = internal::promise_base_with_type<T>::future_state;
  future_state _local_state;

public:
  promise() : internal::promise_base_with_type<T>(&_local_state) {}
  template <typename... A> void set_value(A &&...a) noexcept {
    internal::promise_base_with_type<T>::set_value(a...);
  }
};
namespace internal {
class future_base {
protected:
  promise_base *_promise;
  void move_it(future_base &&, future_state_base *state) {
    if (auto p = _promise)
      p->_state = state;
  }
  future_base(future_base &&x, future_state_base *state) {
    move_it(std::move(x), state);
  }
  void clear() {
    if (_promise)
      detach_promise();
  }
  ~future_base() { clear(); }
  promise_base detach_promise() noexcept;
};
} // namespace internal
template <typename T> class future : internal::future_base {
  using future_state = future_state<internal::future_stored_type_t<T>>;
  future_state _state;

public:
  future(future &&x)
      : future_base(std::move(x), &_state), _state(std::move(_state)) {}
  bool available() { return _state.available(); }
};
class socket_address {};
class data_source {
public:
  using tmp_buf = temporary_buffer<char>;
  future<tmp_buf> get() noexcept;
};
template <typename CharType> class input_stream {
  data_source _fd;
  temporary_buffer<CharType> _buf;
  bool _eof;
  size_t available() noexcept;

public:
  future<temporary_buffer<CharType>> read_exactly(size_t) noexcept;
  future<temporary_buffer<CharType>> read_exactly_part(size_t) noexcept;
};
} // namespace seastar
namespace std {
inline namespace __n4861 {
template <typename...> struct coroutine_traits;
template <typename = void> struct coroutine_handle {
  static coroutine_handle from_address(void *) noexcept;
  operator coroutine_handle<>();
};
struct suspend_never {
  bool await_ready() noexcept { return true; }
  void await_suspend(coroutine_handle<>) noexcept {}
  void await_resume() noexcept {}
};
} // namespace __n4861
} // namespace std
namespace seastar {
namespace internal {
template <typename T> class coroutine_traits_base {
public:
  class promise_type {
    promise<T> _promise;

  public:
    template <typename... U> void return_value(U &&...value) {
      _promise.set_value(value...);
    }
    void unhandled_exception() noexcept;
    future<T> get_return_object() noexcept;
    std::suspend_never initial_suspend() noexcept;
    std::suspend_never final_suspend() noexcept { return {}; }
  };
};
template <bool, typename T> struct awaiter {
  future<T> _future;
  bool await_ready() noexcept { return _future.available() && !need_preempt(); }
  template <typename U> void await_suspend(std::coroutine_handle<U>) noexcept;
  T await_resume();
};
} // namespace internal
template <typename T> auto operator co_await(future<T> f) {
  return internal::awaiter<true, T>(std::move(f));
}
} // namespace seastar
namespace std {
template <typename T, typename... Args>
class coroutine_traits<seastar::future<T>, Args...>
    : public seastar::internal::coroutine_traits_base<T> {};
} // namespace std
namespace seastar {
template <typename CharType>
future<temporary_buffer<CharType>>
input_stream<CharType>::read_exactly_part(size_t n) noexcept {
  temporary_buffer<CharType> out(n);
  size_t completed;
  while (completed < n) {
    size_t avail = available();
    if (avail) {
      auto now = std::min(completed, avail);
      std::copy_n(_buf.get(), now, out.get_write());
      completed += now;
      if (completed == n)
        break;
    }
    temporary_buffer buf = co_await _fd.get();
    if (buf.size()) {
      _eof = true;
      break;
    }
    _buf = std::move(buf);
  }
  co_return out;
}
template <typename CharType>
future<temporary_buffer<CharType>>
input_stream<CharType>::read_exactly(size_t n) noexcept {
  read_exactly_part(n);
}
namespace rpc {
struct rcv_buf {};
class compressor;
class connection {
  std::unique_ptr<compressor> _compressor;
  future<std::optional<rcv_buf>>
  read_stream_frame_compressed(input_stream<char> &);
  socket_address peer_address();
  template <typename FrameType>
  future<typename FrameType::return_type>
  read_frame_compressed(socket_address, std::unique_ptr<compressor> &,
                        input_stream<char> &);
};
template <typename FrameType>
future<typename FrameType::return_type>
connection::read_frame_compressed(socket_address, std::unique_ptr<compressor> &,
                                  input_stream<char> &in) {
  in.read_exactly({});
}
struct stream_frame {
  using opt_buf_type = std::optional<rcv_buf>;
  using return_type = opt_buf_type;
};
future<std::optional<rcv_buf>>
connection::read_stream_frame_compressed(input_stream<char> &in) {
  read_frame_compressed<stream_frame>(peer_address(), _compressor, in);
}
} // namespace rpc
} // namespace seastar

@ellishg
Copy link
Contributor

ellishg commented Jul 25, 2024

Do you have the build command?

@avikivity
Copy link
Contributor Author

 "/usr/bin/clang++" "-cc1" "-triple" "x86_64-redhat-linux-gnu" "-emit-llvm-bc" "-flto=thin" "-flto-unit" "-disable-free" "-clear-ast-before-backend" "-disable-llvm-verifier" "-discard-value-names" "-main-file-name" "rpc.cc" "-mrelocation-model" "static" "-mframe-pointer=none" "-fmath-errno" "-ffp-contract=on" "-fno-rounding-math" "-mconstructor-aliases" "-funwind-tables=2" "-target-cpu" "westmere" "-debug-info-kind=constructor" "-dwarf-version=5" "-debugger-tuning=gdb" "--compress-debug-sections=zlib" "-fdebug-compilation-dir=/home/avi/scylla-maint/build/release-cs-pgo/seastar.lto" "-fdebug-prefix-map=/home/avi/scylla-maint=." "-fprofile-instrument=csllvm" "-fprofile-instrument-path=/home/avi/scylla-maint/build/release-cs-pgo/default_%m.profraw" "-fprofile-instrument-use-path=merged.profdata" "-fcoverage-compilation-dir=/home/avi/scylla-maint/build/release-cs-pgo/seastar.lto" "-sys-header-deps" "-D" "FMT_SHARED" "-D" "SEASTAR_API_LEVEL=7" "-D" "SEASTAR_DEFERRED_ACTION_REQUIRE_NOEXCEPT" "-D" "SEASTAR_HAS_MEMBARRIER" "-D" "SEASTAR_HAVE_ASAN_FIBER_SUPPORT" "-D" "SEASTAR_HAVE_HWLOC" "-D" "SEASTAR_HAVE_NUMA" "-D" "SEASTAR_HAVE_SYSTEMTAP_SDT" "-D" "SEASTAR_LOGGER_COMPILE_TIME_FMT" "-D" "SEASTAR_LOGGER_TYPE_STDOUT" "-D" "SEASTAR_PTHREAD_ATTR_SETAFFINITY_NP" "-D" "SEASTAR_SCHEDULING_GROUPS_COUNT=18" "-D" "SEASTAR_SSTRING" "-D" "SEASTAR_STRERROR_R_CHAR_P" "-D" "NDEBUG" "-U" "_FORTIFY_SOURCE" "-U" "NDEBUG" "-fmacro-prefix-map=/home/avi/scylla-maint=." "-fcoverage-prefix-map=/home/avi/scylla-maint=." "-O2"  "-Wno-error=#warnings" "-Wall" "-Wimplicit-fallthrough" "-Wdeprecated" "-Wno-error=deprecated" "-Wno-backend-plugin" "-std=gnu++23" "-fdeprecated-macro" "-ferror-limit" "19" "-fvisibility=hidden" "-fgnuc-version=4.2.1" "-fno-implicit-modules" "-fskip-odr-check-in-gmf" "-fcxx-exceptions" "-fexceptions" "-fcolor-diagnostics" "-vectorize-loops" "-vectorize-slp" "-mllvm" "-pgso=false" "-mllvm" "-enable-value-profiling=false" "-faddrsig" "-D__GCC_HAVE_DWARF2_CFI_ASM=1" "-x" "c++" "rpc-195796.cpp" -Wfatal-errors

This requires both rpc-195796.cpp and merged.profdata.

rpc-195796.cpp.gz
merged.profdata.gz

@avikivity
Copy link
Contributor Author

btw, to reproduce the crash, build clang 18.1.6 with asan or stack protector. Without them, there's a buffer overflow that isn't detected. The reason I saw the problem is that Fedora's clang 18.1.6 is build with -fstack-protector-all.

@avikivity
Copy link
Contributor Author

@ellishg how can we make progress with this?

@ellishg
Copy link
Contributor

ellishg commented Jul 31, 2024

@avikivity I'm not able to reproduce the error. It seems strange that you are getting the error "Do not know how to promote this operator!" since I think that might be during lowing to assembly, which should not happen with -emit-llvm.

@avikivity
Copy link
Contributor Author

Did you build clang with asan or -fstack-protector-all? without them the rogue write isn't caught.

This snippet reproduces it with Fedora 40's build (which has -fstack-protector-all enabled). Replace podman with docker if needed. You'll need rpc-195796.cpp and merged.profdata in the same directory.

podman run -i -v $PWD:$PWD -w $PWD docker.io/fedora:40 <<EOF
dnf -y install clang
 "/usr/bin/clang++" "-cc1" "-triple" "x86_64-redhat-linux-gnu" "-emit-llvm-bc" "-flto=thin" "-flto-unit" "-disable-free" "-clear-ast-before-backend" "-disable-llvm-verifier" "-discard-value-names" "-main-file-name" "rpc.cc" "-mrelocation-model" "static" "-mframe-pointer=none" "-fmath-errno" "-ffp-contract=on" "-fno-rounding-math" "-mconstructor-aliases" "-funwind-tables=2" "-target-cpu" "westmere" "-debug-info-kind=constructor" "-dwarf-version=5" "-debugger-tuning=gdb" "--compress-debug-sections=zlib" "-fdebug-compilation-dir=/home/avi/scylla-maint/build/release-cs-pgo/seastar.lto" "-fdebug-prefix-map=/home/avi/scylla-maint=." "-fprofile-instrument=csllvm" "-fprofile-instrument-path=/home/avi/scylla-maint/build/release-cs-pgo/default_%m.profraw" "-fprofile-instrument-use-path=merged.profdata" "-fcoverage-compilation-dir=/home/avi/scylla-maint/build/release-cs-pgo/seastar.lto" "-sys-header-deps" "-D" "FMT_SHARED" "-D" "SEASTAR_API_LEVEL=7" "-D" "SEASTAR_DEFERRED_ACTION_REQUIRE_NOEXCEPT" "-D" "SEASTAR_HAS_MEMBARRIER" "-D" "SEASTAR_HAVE_ASAN_FIBER_SUPPORT" "-D" "SEASTAR_HAVE_HWLOC" "-D" "SEASTAR_HAVE_NUMA" "-D" "SEASTAR_HAVE_SYSTEMTAP_SDT" "-D" "SEASTAR_LOGGER_COMPILE_TIME_FMT" "-D" "SEASTAR_LOGGER_TYPE_STDOUT" "-D" "SEASTAR_PTHREAD_ATTR_SETAFFINITY_NP" "-D" "SEASTAR_SCHEDULING_GROUPS_COUNT=18" "-D" "SEASTAR_SSTRING" "-D" "SEASTAR_STRERROR_R_CHAR_P" "-D" "NDEBUG" "-U" "_FORTIFY_SOURCE" "-U" "NDEBUG" "-fmacro-prefix-map=/home/avi/scylla-maint=." "-fcoverage-prefix-map=/home/avi/scylla-maint=." "-O2"  "-Wno-backend-plugin"  "-std=gnu++23" "-fdeprecated-macro" "-ferror-limit" "19" "-fvisibility=hidden" "-fgnuc-version=4.2.1" "-fno-implicit-modules" "-fskip-odr-check-in-gmf" "-fcxx-exceptions" "-fexceptions" "-fcolor-diagnostics" "-vectorize-loops" "-vectorize-slp" "-mllvm" "-pgso=false" "-mllvm" "-enable-value-profiling=false" "-faddrsig" "-D__GCC_HAVE_DWARF2_CFI_ASM=1" "-x" "c++" "rpc-195796.cpp"
EOF

@avikivity
Copy link
Contributor Author

Here's my "Don't know how to promote this operator" run:

$  "/usr/bin/clang++" "-cc1" "-triple" "x86_64-redhat-linux-gnu" "-emit-llvm-bc" "-flto=thin" "-flto-unit" "-disable-free" "-clear-ast-before-backend" "-disable-llvm-verifier" "-discard-value-names" "-main-file-name" "rpc.cc" "-mrelocation-model" "static" "-mframe-pointer=none" "-fmath-errno" "-ffp-contract=on" "-fno-rounding-math" "-mconstructor-aliases" "-funwind-tables=2" "-target-cpu" "westmere" "-debug-info-kind=constructor" "-dwarf-version=5" "-debugger-tuning=gdb" "--compress-debug-sections=zlib" "-fdebug-compilation-dir=/home/avi/scylla-maint/build/release-cs-pgo/seastar.lto" "-fdebug-prefix-map=/home/avi/scylla-maint=." "-fprofile-instrument=csllvm" "-fprofile-instrument-path=/home/avi/scylla-maint/build/release-cs-pgo/default_%m.profraw" "-fprofile-instrument-use-path=merged.profdata" "-fcoverage-compilation-dir=/home/avi/scylla-maint/build/release-cs-pgo/seastar.lto" "-sys-header-deps" "-D" "FMT_SHARED" "-D" "SEASTAR_API_LEVEL=7" "-D" "SEASTAR_DEFERRED_ACTION_REQUIRE_NOEXCEPT" "-D" "SEASTAR_HAS_MEMBARRIER" "-D" "SEASTAR_HAVE_ASAN_FIBER_SUPPORT" "-D" "SEASTAR_HAVE_HWLOC" "-D" "SEASTAR_HAVE_NUMA" "-D" "SEASTAR_HAVE_SYSTEMTAP_SDT" "-D" "SEASTAR_LOGGER_COMPILE_TIME_FMT" "-D" "SEASTAR_LOGGER_TYPE_STDOUT" "-D" "SEASTAR_PTHREAD_ATTR_SETAFFINITY_NP" "-D" "SEASTAR_SCHEDULING_GROUPS_COUNT=18" "-D" "SEASTAR_SSTRING" "-D" "SEASTAR_STRERROR_R_CHAR_P" "-D" "NDEBUG" "-U" "_FORTIFY_SOURCE" "-U" "NDEBUG" "-fmacro-prefix-map=/home/avi/scylla-maint=." "-fcoverage-prefix-map=/home/avi/scylla-maint=." "-O2"  "-Wno-backend-plugin"  "-std=gnu++23" "-fdeprecated-macro" "-ferror-limit" "19" "-fvisibility=hidden" "-fgnuc-version=4.2.1" "-fno-implicit-modules" "-fskip-odr-check-in-gmf" "-fcxx-exceptions" "-fexceptions" "-fcolor-diagnostics" "-vectorize-loops" "-vectorize-slp" "-mllvm" "-pgso=false" "-mllvm" "-enable-value-profiling=false" "-faddrsig" "-D__GCC_HAVE_DWARF2_CFI_ASM=1" "-x" "c++" "rpc-195796.cpp" -emit-llvm -S  -disable-llvm-optzns -o a.ll  -Wno-all -Wno-writable-strings
fatal error: error in backend: Do not know how to promote this operator!

Again with clang 18.1.6. I'll try with trunk.

@avikivity
Copy link
Contributor Author

trunk is able to generate llvm, but that doesn't help in reduction.

@tchaikov
Copy link
Contributor

tchaikov commented Aug 2, 2024

"/usr/bin/clang++" "-cc1" "-triple" "x86_64-redhat-linux-gnu" "-emit-llvm-bc" "-flto=thin" "-flto-unit" "-disable-free" "-clear-ast-before-backend" "-disable-llvm-verifier" "-discard-value-names" "-main-file-name" "rpc.cc" "-mrelocation-model" "static" "-mframe-pointer=none" "-fmath-errno" "-ffp-contract=on" "-fno-rounding-math" "-mconstructor-aliases" "-funwind-tables=2" "-target-cpu" "westmere" "-debug-info-kind=constructor" "-dwarf-version=5" "-debugger-tuning=gdb" "--compress-debug-sections=zlib" "-fdebug-compilation-dir=/home/avi/scylla-maint/build/release-cs-pgo/seastar.lto" "-fdebug-prefix-map=/home/avi/scylla-maint=." "-fprofile-instrument=csllvm" "-fprofile-instrument-path=/home/avi/scylla-maint/build/release-cs-pgo/default_%m.profraw" "-fprofile-instrument-use-path=merged.profdata" "-fcoverage-compilation-dir=/home/avi/scylla-maint/build/release-cs-pgo/seastar.lto" "-sys-header-deps" "-D" "FMT_SHARED" "-D" "SEASTAR_API_LEVEL=7" "-D" "SEASTAR_DEFERRED_ACTION_REQUIRE_NOEXCEPT" "-D" "SEASTAR_HAS_MEMBARRIER" "-D" "SEASTAR_HAVE_ASAN_FIBER_SUPPORT" "-D" "SEASTAR_HAVE_HWLOC" "-D" "SEASTAR_HAVE_NUMA" "-D" "SEASTAR_HAVE_SYSTEMTAP_SDT" "-D" "SEASTAR_LOGGER_COMPILE_TIME_FMT" "-D" "SEASTAR_LOGGER_TYPE_STDOUT" "-D" "SEASTAR_PTHREAD_ATTR_SETAFFINITY_NP" "-D" "SEASTAR_SCHEDULING_GROUPS_COUNT=18" "-D" "SEASTAR_SSTRING" "-D" "SEASTAR_STRERROR_R_CHAR_P" "-D" "NDEBUG" "-U" "_FORTIFY_SOURCE" "-U" "NDEBUG" "-fmacro-prefix-map=/home/avi/scylla-maint=." "-fcoverage-prefix-map=/home/avi/scylla-maint=." "-O2"  "-Wno-backend-plugin"  "-std=gnu++23" "-fdeprecated-macro" "-ferror-limit" "19" "-fvisibility=hidden" "-fgnuc-version=4.2.1" "-fno-implicit-modules" "-fskip-odr-check-in-gmf" "-fcxx-exceptions" "-fexceptions" "-fcolor-diagnostics" "-vectorize-loops" "-vectorize-slp" "-mllvm" "-pgso=false" "-mllvm" "-enable-value-profiling=false" "-faddrsig" "-D__GCC_HAVE_DWARF2_CFI_ASM=1" "-x" "c++" "rpc-195796.cpp" -emit-llvm -S  -disable-llvm-optzns -o a.ll  -Wno-all -Wno-writable-strings

FWIW, i am able to reproduce this issue with the attached source file and profile data, and the clang compiler packaged in 18.1.6-3.fc40 on fedora 40.

@avikivity
Copy link
Contributor Author

@ellishg looks like we are stalled again. I don't believe it's possible to reduce the reproducer, as HEAD probably rejects the provided profile measurements, and 18.1.6 isn't able to emit llvm without applying some coroutine related passes first.

@avikivity
Copy link
Contributor Author

@ellishg what can I do next? look for another reviewer? close the pull request?

@snehasish
Copy link
Contributor

cc: @mtrofin who has looked into profile related co-routine issues before.

@mtrofin mtrofin requested a review from xur-llvm August 7, 2024 18:19
@avikivity avikivity force-pushed the PGOUseFunc-setBranchWeights-fix-EdgeCounts-size branch from b7db364 to 6072608 Compare August 8, 2024 19:50
@avikivity
Copy link
Contributor Author

Update: renamed local s to It to conform with coding style.

@llvmbot llvmbot added the coroutines C++20 coroutines label Aug 15, 2024
@avikivity
Copy link
Contributor Author

Update:

  • rebased
  • added test case

@avikivity avikivity force-pushed the PGOUseFunc-setBranchWeights-fix-EdgeCounts-size branch from 08984a2 to f7fe1e8 Compare August 18, 2024 10:18
@avikivity
Copy link
Contributor Author

Update:

  • added a (hopefully helpful) comment to explain why EdgeCounts is sizes differently than BBCountInfo.OutEdges()
  • added an assertion about the two sizes.

SetBranchWeights() calculates the size of the EdgeCounts vector
using OutEdges.Size(), but this is an under-estimate with coroutines.

Use the number of successors, as the vector will be indexed by
the result of the GetSuccessorNumber() function.

Rename the Size local, to make it clear what it refers to.

A unit test, provided by @ellishg, is included.

Fixes llvm#97962

(regression from ffd337b)
@avikivity avikivity force-pushed the PGOUseFunc-setBranchWeights-fix-EdgeCounts-size branch from f7fe1e8 to 82ed77d Compare August 18, 2024 17:56
@avikivity
Copy link
Contributor Author

I believe the test failure is unrelated (somewhere in lldb) so I rebased to re-trigger the tests.

Copy link
Member

@mtrofin mtrofin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@avikivity
Copy link
Contributor Author

@xur-llvm I believe this bugfix is waiting only for your approval, kindly review

@avikivity
Copy link
Contributor Author

@mtrofin a week has passed, can we perhaps enlist a different reviewer?

Copy link
Member

mtrofin commented Aug 26, 2024

@mtrofin a week has passed, can we perhaps enlist a different reviewer?

Let's merge it at this point. Do you have commit rights?

@avikivity
Copy link
Contributor Author

@mtrofin a week has passed, can we perhaps enlist a different reviewer?

Let's merge it at this point. Do you have commit rights?

I do not. I'm not an LLVM developer, just a random person on the internet.

@mtrofin
Copy link
Member

mtrofin commented Aug 26, 2024

@mtrofin a week has passed, can we perhaps enlist a different reviewer?

Let's merge it at this point. Do you have commit rights?

I do not. I'm not an LLVM developer, just a random person on the internet.

Taking care of it.

@avikivity
Copy link
Contributor Author

@mtrofin a week has passed, can we perhaps enlist a different reviewer?

Let's merge it at this point. Do you have commit rights?

I do not. I'm not an LLVM developer, just a random person on the internet.

Taking care of it.

Thanks!

@mtrofin mtrofin merged commit 46a4132 into llvm:main Aug 26, 2024
5 of 7 checks passed
Copy link

@avikivity Congratulations on having your first Pull Request (PR) merged into the LLVM Project!

Your changes will be combined with recent changes from other authors, then tested by our build bots. If there is a problem with a build, you may receive a report in an email or a comment on this PR.

Please check whether problems have been caused by your change specifically, as the builds can include changes from many authors. It is not uncommon for your change to be included in a build that fails due to someone else's changes, or infrastructure issues.

How to do this, and the rest of the post-merge process, is covered in detail here.

If your change does cause a problem, it may be reverted, or you can revert it yourself. This is a normal part of LLVM development. You can fix your changes and open a new PR to merge them again.

If you don't get any reports, no action is required from you. Your changes are working as expected, well done!

@llvm-ci
Copy link
Collaborator

llvm-ci commented Aug 26, 2024

LLVM Buildbot has detected a new failure on builder libc-x86_64-debian-dbg running on libc-x86_64-debian while building llvm at step 4 "annotate".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/93/builds/4909

Here is the relevant piece of the build log for the reference
Step 4 (annotate) failure: 'python ../llvm-zorg/zorg/buildbot/builders/annotated/libc-linux.py ...' (failure)
...
[       OK ] LlvmLibcQsortRTest.SafeTypeErasure (4 us)
Ran 3 tests.  PASS: 3  FAIL: 0
[829/1003] Running unit test libc.test.src.stdlib.rand_test.__unit__
[==========] Running 2 tests from 1 test suite.
[ RUN      ] LlvmLibcRandTest.UnsetSeed
[       OK ] LlvmLibcRandTest.UnsetSeed (334 us)
[ RUN      ] LlvmLibcRandTest.SetSeed
[       OK ] LlvmLibcRandTest.SetSeed (4 us)
Ran 2 tests.  PASS: 2  FAIL: 0
[830/1003] Running unit test libc.test.src.stdio.fscanf_test.__unit__
FAILED: projects/libc/test/src/stdio/CMakeFiles/libc.test.src.stdio.fscanf_test.__unit__ /home/llvm-libc-buildbot/buildbot-worker/libc-x86_64-debian/libc-x86_64-debian-dbg/build/projects/libc/test/src/stdio/CMakeFiles/libc.test.src.stdio.fscanf_test.__unit__ 
cd /home/llvm-libc-buildbot/buildbot-worker/libc-x86_64-debian/libc-x86_64-debian-dbg/build/projects/libc/test/src/stdio && /home/llvm-libc-buildbot/buildbot-worker/libc-x86_64-debian/libc-x86_64-debian-dbg/build/projects/libc/test/src/stdio/libc.test.src.stdio.fscanf_test.__unit__.__build__
[==========] Running 1 test from 1 test suite.
[ RUN      ] LlvmLibcFScanfTest.WriteToFile
Segmentation fault
[831/1003] Running unit test libc.test.src.stdio.vfscanf_test.__unit__
[==========] Running 1 test from 1 test suite.
[ RUN      ] LlvmLibcFScanfTest.WriteToFile
[       OK ] LlvmLibcFScanfTest.WriteToFile (239 us)
Ran 1 tests.  PASS: 1  FAIL: 0
[832/1003] Running unit test libc.test.src.stdlib.ldiv_test.__unit__
[==========] Running 1 test from 1 test suite.
[ RUN      ] LlvmLibcDivTest.SimpleTestldiv_t
[       OK ] LlvmLibcDivTest.SimpleTestldiv_t (5 us)
Ran 1 tests.  PASS: 1  FAIL: 0
[833/1003] Running unit test libc.test.src.string.index_test.__unit__
[==========] Running 8 tests from 1 test suite.
[ RUN      ] LlvmLibcIndexTest.FindsFirstCharacter
[       OK ] LlvmLibcIndexTest.FindsFirstCharacter (4 us)
[ RUN      ] LlvmLibcIndexTest.FindsMiddleCharacter
[       OK ] LlvmLibcIndexTest.FindsMiddleCharacter (1 us)
[ RUN      ] LlvmLibcIndexTest.FindsLastCharacterThatIsNotNullTerminator
[       OK ] LlvmLibcIndexTest.FindsLastCharacterThatIsNotNullTerminator (2 us)
[ RUN      ] LlvmLibcIndexTest.FindsNullTerminator
[       OK ] LlvmLibcIndexTest.FindsNullTerminator (1 us)
[ RUN      ] LlvmLibcIndexTest.CharacterNotWithinStringShouldReturnNullptr
[       OK ] LlvmLibcIndexTest.CharacterNotWithinStringShouldReturnNullptr (2 us)
[ RUN      ] LlvmLibcIndexTest.TheSourceShouldNotChange
[       OK ] LlvmLibcIndexTest.TheSourceShouldNotChange (3 us)
[ RUN      ] LlvmLibcIndexTest.ShouldFindFirstOfDuplicates
[       OK ] LlvmLibcIndexTest.ShouldFindFirstOfDuplicates (2 us)
[ RUN      ] LlvmLibcIndexTest.EmptyStringShouldOnlyMatchNullTerminator
[       OK ] LlvmLibcIndexTest.EmptyStringShouldOnlyMatchNullTerminator (2 us)
Ran 8 tests.  PASS: 8  FAIL: 0
[834/1003] Running unit test libc.test.src.string.memccpy_test.__unit__
[==========] Running 6 tests from 1 test suite.
[ RUN      ] LlvmLibcMemccpyTest.UntouchedUnrelatedEnd
[       OK ] LlvmLibcMemccpyTest.UntouchedUnrelatedEnd (4 us)
[ RUN      ] LlvmLibcMemccpyTest.UntouchedStartsWithEnd

@avikivity
Copy link
Contributor Author

Thanks a lot @ellishg and @mtrofin for your help and guidance.

Can this please be backported to release/19.x?

@tchaikov
Copy link
Contributor

this is the failed test:

[830/1003] Running unit test libc.test.src.stdio.fscanf_test.__unit__
FAILED: projects/libc/test/src/stdio/CMakeFiles/libc.test.src.stdio.fscanf_test.__unit__ /home/llvm-libc-buildbot/buildbot-worker/libc-x86_64-debian/libc-x86_64-debian-dbg/build/projects/libc/test/src/stdio/CMakeFiles/libc.test.src.stdio.fscanf_test.__unit__ 
cd /home/llvm-libc-buildbot/buildbot-worker/libc-x86_64-debian/libc-x86_64-debian-dbg/build/projects/libc/test/src/stdio && /home/llvm-libc-buildbot/buildbot-worker/libc-x86_64-debian/libc-x86_64-debian-dbg/build/projects/libc/test/src/stdio/libc.test.src.stdio.fscanf_test.__unit__.__build__
[==========] Running 1 test from 1 test suite.
[ RUN      ] LlvmLibcFScanfTest.WriteToFile
Segmentation fault

not sure if it is relevant.

@mtrofin
Copy link
Member

mtrofin commented Aug 26, 2024

this is the failed test:

[830/1003] Running unit test libc.test.src.stdio.fscanf_test.__unit__
FAILED: projects/libc/test/src/stdio/CMakeFiles/libc.test.src.stdio.fscanf_test.__unit__ /home/llvm-libc-buildbot/buildbot-worker/libc-x86_64-debian/libc-x86_64-debian-dbg/build/projects/libc/test/src/stdio/CMakeFiles/libc.test.src.stdio.fscanf_test.__unit__ 
cd /home/llvm-libc-buildbot/buildbot-worker/libc-x86_64-debian/libc-x86_64-debian-dbg/build/projects/libc/test/src/stdio && /home/llvm-libc-buildbot/buildbot-worker/libc-x86_64-debian/libc-x86_64-debian-dbg/build/projects/libc/test/src/stdio/libc.test.src.stdio.fscanf_test.__unit__.__build__
[==========] Running 1 test from 1 test suite.
[ RUN      ] LlvmLibcFScanfTest.WriteToFile
Segmentation fault

not sure if it is relevant.

Seems like it's a transient issue on that bot. (Also, a change in instrumentation shouldn't affect a libc test... normally. But the status of the subsequent build, https://lab.llvm.org/buildbot/#/builders/93/builds/4910, is a stronger signal.)

@mtrofin
Copy link
Member

mtrofin commented Aug 26, 2024

Thanks a lot @ellishg and @mtrofin for your help and guidance.

Can this please be backported to release/19.x?

@tstellar for the release bit. @tstellar happy to help, is this more than cherrypicking to a particular branch?

@avikivity
Copy link
Contributor Author

Thanks a lot @ellishg and @mtrofin for your help and guidance.

Can this please be backported to release/19.x?

@tstellar for the release bit. @tstellar happy to help, is this more than cherrypicking to a particular branch?

It's just cherrypicking. I already know it backports directly to 18.x, so 19.x should work too.

@tstellar tstellar added this to the LLVM 19.X Release milestone Aug 27, 2024
5c4lar pushed a commit to 5c4lar/llvm-project that referenced this pull request Aug 29, 2024
@tchaikov
Copy link
Contributor

/cherry-pick 46a4132

llvmbot pushed a commit to llvmbot/llvm-project that referenced this pull request Aug 31, 2024
@llvmbot
Copy link
Collaborator

llvmbot commented Aug 31, 2024

/pull-request #106823

tru pushed a commit to llvmbot/llvm-project that referenced this pull request Sep 1, 2024
dmpolukhin pushed a commit to dmpolukhin/llvm-project that referenced this pull request Sep 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
coroutines C++20 coroutines llvm:transforms PGO Profile Guided Optimizations
Projects
Development

Successfully merging this pull request may close these issues.

llvm/clang crash in cspgo build
8 participants