Hang calling llc with aie2p target, after opt with -On n>0. #315
Comments
First, an a priori clarification: llc doesn't link; it transforms the LLVM code into an object file. Anyway, I'm looking at this.
BTW, thanks for the reproducer, it hangs beautifully. FYI, it usually suffices to have the input to llc, i.e. the output of opt.
We seem to be looping in the legalizer around G_UNMERGE_VALUES.
The process memory size doesn't grow. We are stuck in MachineDominatorTree::dominates, but successive samples show different G_PTR_ADDs with the same G_LOAD.
The G_LOAD is a scalar load that doesn't occur in the input program. I guess it derives from scalarizing a vector load.
OMG. We have scalarized every vector load and end up with one basic block of ~30000 instructions. We're stuck on the very first scalar load in that block. Perhaps it isn't an infinite loop, just a bit quadratic in block size. Well, not every vector load: only some of them are missing alignment, but those dominate the code size.
%3 and %4 are properly aligned, and result in a proper vector load. @newling, we currently need proper alignment on all loads and stores. Perhaps we can compose them from fewer aligned vector loads around the unaligned address, but I assume that will still not be the intended code.
@martien-de-jong thanks for investigating this. If this is a load/store alignment problem, it's not the first time for us; it's usually what trips me up when lowering through peano/AIE. I've tried changing (1) all loads in input.ll to and then running with Can you please remind me what the alignment expectation is (or point me to docs to shed some light)? If I can get any input.ll to get through
I'll dig a bit further. I'm mostly looking at the llvm input to llc, which is translated to machine IR by the standard irtranslator. That initial machine IR has the 4 byte aligned load. I hope opt is cooperating in propagating alignments from input to output loads, and I hope irtranslator is doing the right thing, because that is a bit of a black box for us.
On aie2p any vector load / store of size 512 bits or larger should have at least 64 byte alignment. In your input.ll there are several instances of 4 byte aligned loads and stores, like
Indeed. You folks are moving fast, it's not often that bumping a component forward by 2 days fixes anything :) I have a path forward now: fix up the alignments that our compiler is generating. And for me the 'hang' isn't high priority anymore (although of course an error for unsupported scalar code would be nice).
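For anyone else fixing up alignments in generated IR, the rule quoted above (vector loads / stores of 512 bits or more need at least 64 byte alignment on aie2p) can be checked mechanically before handing the file to llc. The following is only a rough sketch, not part of peano: the function names, the regex, and the element-size table are made up for illustration, and it only understands simple textual `load` / `store` lines that carry an explicit `align`.

```python
import re

# Hypothetical helper, not part of llvm-aie. Matches simple textual IR lines such as
#   %5 = load <16 x i32>, ptr %b, align 4
#   store <32 x bfloat> %w, ptr %d, align 4
# Accesses without an explicit "align N" are not matched and therefore not reported.
VECTOR_ACCESS = re.compile(
    r"^\s*(?:\S+\s*=\s*)?(?P<op>load|store)\b[^<\n]*"
    r"<(?P<n>\d+) x (?P<elem>[a-z0-9]+)>.*?\balign (?P<align>\d+)"
)

# Bit widths of the floating-point element types this sketch knows about.
FLOAT_BITS = {"half": 16, "bfloat": 16, "float": 32, "double": 64}


def element_bits(elem):
    """Return the width in bits of an element type, or None if unknown."""
    if elem in FLOAT_BITS:
        return FLOAT_BITS[elem]
    m = re.fullmatch(r"i(\d+)", elem)
    return int(m.group(1)) if m else None


def find_underaligned(ir_text, min_bits=512, min_align=64):
    """List (line number, op, vector bits, align) for accesses breaking the rule."""
    findings = []
    for lineno, line in enumerate(ir_text.splitlines(), 1):
        m = VECTOR_ACCESS.search(line)
        if not m:
            continue
        bits = element_bits(m["elem"])
        if bits is None:
            continue
        total_bits = int(m["n"]) * bits
        align = int(m["align"])
        if total_bits >= min_bits and align < min_align:
            findings.append((lineno, m["op"], total_bits, align))
    return findings
```

Running `find_underaligned(open("input.opt.ll").read())` on the output of opt would then list the candidate lines. The thresholds default to the numbers given in the comment above and can be overridden if the requirement differs for another target.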
The majority of this PR is refactoring to make it possible to support 2+ target ISAs.

- aievec.matmul now verifies its operand shapes based on the target device in the module.
- I've removed a bunch of intrinsic matmul shapes for phoenix that we're not using, to simplify the code.
- The XLLVM dialect ops now include the target device in their names, i.e. the name now includes either `AIE2` or `AIE2P`.
- Only matmul can lower to XLLVM from aievec for AIE2P; all other aievec ops (UPS, etc.) have an assert on them that the device is AIE2.
- A few XLLVM ops are removed: we don't use broadcast or a few others, so I've trimmed the set of ops we support down.
- This PR adds utils `isAie2` and `isAie2P` to AMDAIEUtils.h.

I have confirmed that a linalg matmul can compile all the way through peano for AIE2P, but only with -O0. Next step after this is to fix alignment issues in iree-amd-aie to get this to work for -On n>0: Xilinx/llvm-aie#315
I'm starting to integrate support for aie2p into the IREE compiler. My first attempt for a small matmul is hitting a hang during object file generation with llc, but only when opt is run with -On for n>0.
I am using peano wheel from today, 28 January: llvm_aie-19.0.0.2025012801+24e4e160.dist-info from https://github.com/Xilinx/llvm-aie/releases
input.ll is attached, and below.
input.opt.ll with -O0 is generated with the following command:
llvm-aie/bin/opt -vectorize-loops=false -vectorize-slp=false --two-entry-phi-node-folding-threshold=10 -mandatory-inlining-before-opt=false -basic-aa-full-phi-analysis=true -basic-aa-max-lookup-search-depth=10 -O0 --inline-threshold=10 --disable-builtin=memset -S input.ll -o input.opt.ll
input.opt.ll with -O1 is generated with the following command (identical to the above, but with -O1):
llvm-aie/bin/opt -vectorize-loops=false -vectorize-slp=false --two-entry-phi-node-folding-threshold=10 -mandatory-inlining-before-opt=false -basic-aa-full-phi-analysis=true -basic-aa-max-lookup-search-depth=10 -O1 --inline-threshold=10 --disable-builtin=memset -S input.ll -o input.opt.ll
Object file generation works fine with the following command when run on the output of opt with '-O0':
llvm-aie/bin/llc input.opt.ll -O2 --march=aie2p --function-sections --filetype=obj -o input.o
but the above fails when run on the output of opt with '-O1':
repro_files.zip
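To tell a hang apart from a slow run when reproducing this, the two steps above can be wrapped in a small script with a timeout. This is a minimal sketch: the opt and llc flags are copied from the commands above (the position of -On is moved, which should not matter), while the function names and the 120 second limit are arbitrary choices made for illustration.

```python
import subprocess

# Flags copied from the opt commands above; only the -On level varies.
OPT_FLAGS = [
    "-vectorize-loops=false",
    "-vectorize-slp=false",
    "--two-entry-phi-node-folding-threshold=10",
    "-mandatory-inlining-before-opt=false",
    "-basic-aa-full-phi-analysis=true",
    "-basic-aa-max-lookup-search-depth=10",
    "--inline-threshold=10",
    "--disable-builtin=memset",
]


def opt_cmd(level, opt="llvm-aie/bin/opt"):
    """Build the opt command line for optimization level `level`."""
    return [opt, *OPT_FLAGS, f"-O{level}", "-S", "input.ll", "-o", "input.opt.ll"]


def llc_cmd(llc="llvm-aie/bin/llc"):
    """Build the llc command line used for object file generation."""
    return [llc, "input.opt.ll", "-O2", "--march=aie2p",
            "--function-sections", "--filetype=obj", "-o", "input.o"]


def finishes_within(cmd, seconds):
    """Run cmd; return True if it exits in time, False if it had to be killed."""
    try:
        subprocess.run(cmd, check=True, timeout=seconds)
        return True
    except subprocess.TimeoutExpired:
        return False


def main(limit_seconds=120):  # the limit is an arbitrary assumption
    for level in (0, 1):
        subprocess.run(opt_cmd(level), check=True)
        ok = finishes_within(llc_cmd(), limit_seconds)
        print(f"opt -O{level}: llc {'finished' if ok else 'timed out'}")
```

Calling `main()` from the directory that contains input.ll and the llvm-aie install runs both variants in turn. A timeout only shows that llc did not finish within the limit, not that it would never finish.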
What is causing the hang?
input.ll: