-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
rocblas: add spack test support #27436
Conversation
It seems that most ROCm math libraries need patches to build the tests properly. Patching every version back to 3.5 seems like unnecessary complexity. I'm just going to try to get 4.3 working, and anything that happens to work before then is a bonus. I need to reformat the PR, but I'm having a bit of trouble with
|
@cgmb what do you think, should we drop older versions of rocm packages at some point? I don't think we're ever going to be able to support them properly unless AMD keeps CI running on machines with old and new drivers... Dropping a couple of old versions would probably keep things simpler. |
It's not like I have any authority on this matter, but if it were up to me personally, I would drop anything prior to 4.1.0. That's when target-id was introduced, and it's kind of a dividing line for compiler compatibility. Spack is currently specifying a 1:1 relationship between the llvm-amdgpu version and the ROCm math library versions, but that's really more of a recommendation than a hard requirement. It depends on what architectures you're targeting, but I think generally you can use llvm-amdgpu@4.1.0: to build any math lib (possibly with a few exceptions). I suppose that highlights a missing feature in Spack: an understanding of GPU architectures. It should be fine for llvm-andgpu@4.1.0 to build rocBLAS@4.5.0, but that version of LLVM doesn't have support for gfx90a. Spack includes logic for handling that sort of problem with CPU architectures. For example, Spack picks up that I have a Zen 2 processor and my system compiler is gcc@7.5.0, so it chooses Of course, that then underscores that we're still hacking HIP language support in rather than doing it properly. Spack still thinks that rocBLAS is being compiled as Additionally, AMD libraries should start using CMake's HIP language support (from CMake 3.21.3+), and Spack should add explicit support for setting That's a rather large chunk of work. I suppose I should maybe see if I can find out if that's already been planned, or if there's someone I can talk to in order to make it happen. We shouldn't be hacking things together forever. ... That was kind of a tangent, wasn't it? Back to your point. I would probably drop versions before 4.1.0. It takes a couple hours to do a basic build and test of rocBLAS. Doing that three or four times is feasible, but ten is a lot! |
For what it's worth, I'm on spack develop (b16bfe4) and I can't get rocblas@4.0.0 to install anyway.
If I specify
If I specify
|
i will look into fixing the error with rocblas@4.0.0 in my next PR |
I've rebased against develop and tested 4.5.2, 4.3.1, 4.2.0 and 4.1.0. Please take another look. |
@haampie, @srekolam, and @arjun-raj-kuppala could you please review? |
@spackbot run pipeline |
I've started that pipeline for you! |
@spackbot run pipeline |
I've started that pipeline for you! |
@spackbot run pipeline |
I've started that pipeline for you! |
@spackbot run pipeline |
I've started that pipeline for you! |
@haampie, all tests have passed. Can we get this merge train started? |
@haampie, is there something you're waiting for? |
🎉 |
Sorry for the delay, too many github notifications on a daily basis 😬 |
* rocblas: add spack test support * rocblas: limit spack test support to @4.2.0: * rocblas: simplify define
This change adds support for building the rocblas-test client and teaches spack how to run the fast subset of tests. I also tried to narrow down the cmake, python and rocm dependencies a bit. It takes ages to build Tensile, so I've only tested on 4.3.1 thus far.
rocBLAS also supports using blis as a reference library. That would make the tests significantly faster, but the CMake for using blas is more or less the same for rocblas, hipblas (#27074), rocsolver (#26919) and hipsolver, so I figured I'd start with that.
To give it a spin, try:
@haampie @srekolam @arjun-raj-kuppala