Commit Breaks Llama3.1_405b_tp8 Compilation #19833

Closed
stbaione opened this issue Jan 28, 2025 · 1 comment · Fixed by #19835
Labels
bug 🐞 Something isn't working

Comments

@stbaione

What happened?

Compilation was broken for Llama3.1_405b_tp8, with the following error message:

iree-compile: iree/third_party/llvm-project/mlir/include/mlir/IR/StorageUniquerSupport.h:180: static ConcreteT mlir::detail::StorageUserBase<mlir::iree_compiler::IREE::VectorExt::NestedLayoutAttr, mlir::Attribute, mlir::iree_compiler::IREE::VectorExt::detail::NestedLayoutAttrStorage, mlir::detail::AttributeUniquer, mlir::iree_compiler::IREE::VectorExt::VectorLayoutInterface::Trait>::get(mlir::MLIRContext *, Args &&...) [ConcreteT = mlir::iree_compiler::IREE::VectorExt::NestedLayoutAttr, BaseT = mlir::Attribute, StorageT = mlir::iree_compiler::IREE::VectorExt::detail::NestedLayoutAttrStorage, UniquerT = mlir::detail::AttributeUniquer, Traits = <mlir::iree_compiler::IREE::VectorExt::VectorLayoutInterface::Trait>, Args = <llvm::ArrayRef<long> &, llvm::ArrayRef<long> &, llvm::ArrayRef<long> &, llvm::ArrayRef<long> &, llvm::ArrayRef<long> &, llvm::SmallVector<long, 6> &, llvm::SmallVector<long, 6> &>]: Assertion `succeeded( ConcreteT::verifyInvariants(getDefaultDiagnosticEmitFn(ctx), args...))' failed.
Please report issues to https://github.com/iree-org/iree/issues and include the crash backtrace.
Stack dump without symbol names (ensure you have llvm-symbolizer in your PATH or set the environment var `LLVM_SYMBOLIZER_PATH` to point to it):
iree-compile: iree/third_party/llvm-project/mlir/include/mlir/IR/StorageUniquerSupport.h:180: static ConcreteT mlir::detail::StorageUserBase<mlir::iree_compiler::IREE::VectorExt::NestedLayoutAttr, mlir::Attribute, mlir::iree_compiler::IREE::VectorExt::detail::NestedLayoutAttrStorage, mlir::detail::AttributeUniquer, mlir::iree_compiler::IREE::VectorExt::VectorLayoutInterface::Trait>::get(mlir::MLIRContext *, Args &&...) [ConcreteT = mlir::iree_compiler::IREE::VectorExt::NestedLayoutAttr, BaseT = mlir::Attribute, StorageT = mlir::iree_compiler::IREE::VectorExt::detail::NestedLayoutAttrStorage, UniquerT = mlir::detail::AttributeUniquer, Traits = <mlir::iree_compiler::IREE::VectorExt::VectorLayoutInterface::Trait>, Args = <llvm::ArrayRef<long> &, llvm::ArrayRef<long> &, llvm::ArrayRef<long> &, llvm::ArrayRef<long> &, llvm::ArrayRef<long> &, llvm::SmallVector<long, 6> &, llvm::SmallVector<long, 6> &>]: Assertion `succeeded( ConcreteT::verifyInvariants(getDefaultDiagnosticEmitFn(ctx), args...))' failed.
Aborted (core dumped)

Bisecting points to 4b0ca34
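A bisect like this can be automated with `git bisect run`, which treats exit code 0 as good, 1 to 127 (except 125) as bad, 125 as "skip this commit", and anything else as a reason to abort. An assertion failure kills `iree-compile` with a signal, which a shell reports as an exit code above 127, so the crash needs to be mapped to a plain "bad" result. The wrapper below is a minimal sketch of that mapping, not the script used for this issue; the names `bisect_exit_code` and `main` are hypothetical, and rebuilding `iree-compile` at each commit is assumed to happen inside the command you pass in.

```python
import subprocess
import sys

# Exit code that tells `git bisect run` the commit cannot be tested.
SKIP = 125


def bisect_exit_code(returncode):
    """Map a child process return code to a `git bisect run` verdict.

    0 means good. Everything else, including a negative return code
    (killed by a signal such as SIGABRT from a failed assertion), is
    reported as 1 so that git marks the commit bad instead of aborting.
    """
    return 0 if returncode == 0 else 1


def main(argv):
    """Run the given command and return the exit code for git bisect."""
    if not argv:
        print("usage: bisect_check.py <command> [args...]", file=sys.stderr)
        return SKIP
    try:
        result = subprocess.run(argv)
    except FileNotFoundError:
        # The binary does not exist at this commit (for example, not built).
        return SKIP
    return bisect_exit_code(result.returncode)


# Only act when a command was supplied on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(main(sys.argv[1:]))
```

Saved under a hypothetical name such as `bisect_check.py`, it would be invoked as `git bisect run python bisect_check.py <command>` after marking a known good and a known bad revision. Note that this sketch reports a failed build as bad as well; a build step that can fail for unrelated reasons should return 125 instead.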

Steps to reproduce your issue

  1. Download the 405b MLIR
  2. Run iree-compile:
iree-compile 405b_instruct_fp16.mlir -o 405b_artifacts/tp8/llama_1_28.vmfb \
  --iree-hal-target-device=hip[0] \
  --iree-hal-target-device=hip[1] \
  --iree-hal-target-device=hip[2] \
  --iree-hal-target-device=hip[3] \
  --iree-hal-target-device=hip[4] \
  --iree-hal-target-device=hip[5] \
  --iree-hal-target-device=hip[6] \
  --iree-hal-target-device=hip[7] \
  --iree-hip-target=gfx942 \
  --iree-dispatch-creation-enable-aggressive-fusion=true \
  --iree-global-opt-propagate-transposes=true \
  --iree-opt-aggressively-propagate-transposes=true \
  --iree-opt-data-tiling=false \
  --iree-preprocessing-pass-pipeline='builtin.module(util.func(iree-preprocessing-generalize-linalg-matmul-experimental))' \
  --iree-hal-indirect-command-buffers=true \
  --iree-stream-resource-memory-model=discrete \
  --iree-hal-memoization=true \
  --iree-opt-strip-assertions
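Because the command repeats one `--iree-hal-target-device` flag per shard, it can be convenient to generate it rather than type it. The following is a small sketch that assembles the same argument list as the command above; the helper name `build_compile_command` and its parameters are hypothetical, while every flag is copied from the reproduction step. Passing the arguments as a list also avoids any shell interpretation of the `[` and `]` in `hip[0]`.

```python
import shlex


def build_compile_command(input_mlir, output_vmfb, num_devices=8,
                          hip_target="gfx942"):
    """Assemble the iree-compile argument list from the reproduction step."""
    cmd = ["iree-compile", input_mlir, "-o", output_vmfb]
    # One target-device flag per tensor-parallel shard: hip[0] .. hip[N-1].
    cmd += [f"--iree-hal-target-device=hip[{i}]" for i in range(num_devices)]
    cmd += [
        f"--iree-hip-target={hip_target}",
        "--iree-dispatch-creation-enable-aggressive-fusion=true",
        "--iree-global-opt-propagate-transposes=true",
        "--iree-opt-aggressively-propagate-transposes=true",
        "--iree-opt-data-tiling=false",
        "--iree-preprocessing-pass-pipeline="
        "builtin.module(util.func("
        "iree-preprocessing-generalize-linalg-matmul-experimental))",
        "--iree-hal-indirect-command-buffers=true",
        "--iree-stream-resource-memory-model=discrete",
        "--iree-hal-memoization=true",
        "--iree-opt-strip-assertions",
    ]
    return cmd


if __name__ == "__main__":
    cmd = build_compile_command(
        "405b_instruct_fp16.mlir", "405b_artifacts/tp8/llama_1_28.vmfb"
    )
    # Print a shell-quoted form for inspection; to execute it, pass the
    # list itself to subprocess.run(cmd).
    print(shlex.join(cmd))
```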

What component(s) does this issue relate to?

Compiler

Version information

22b34b5

Additional context

No response

@stbaione stbaione added the bug 🐞 Something isn't working label Jan 28, 2025
@IanWood1
Contributor

IanWood1 commented Jan 28, 2025

I'll try to isolate the failing dispatch. The commit seems to be sending the compiler down a bad codegen path. Do we need to revert until it's fixed?

ita9naiwa pushed a commit to ita9naiwa/iree that referenced this issue Feb 4, 2025
IanWood1 added a commit that referenced this issue Feb 12, 2025
Reland the changes to fold attention ops with broadcasts with a small
tweak to `AttentionOpDetail` so that the batch dimensions are properly
computed when an operand is broadcasted.


Original PR #19828
Revert PR #19835
Issue causing revert #19833

---------

Signed-off-by: Ian Wood <ianwood2024@u.northwestern.edu>
2 participants