Add AutoTester2 CI Configs (Sans Power9 & ROCM w/ TPLS) #2174
@ndellingwood can you confirm some of the changes suggested in my review?
-DKokkosKernels_INST_OFFSET_SIZE_T=ON \
-DKokkosKernels_INST_OFFSET_INT=ON \
-DKokkosKernels_INST_LAYOUTLEFT=ON \
-DKokkosKernels_ENABLE_TPL_CUSOLVER=OFF \
cuSOLVER and cuSPARSE are off when CUDA is OFF, but on the other hand I do not see ROCBLAS here?
I will triple-check this but this is the cmake command generated from cm_test_all_sandia. So, we want to match the old configs here.
Dropping the disabled CUSOLVER and/or CUSPARSE options in non-CUDA builds should be safe to test; they are there as an artifact of some clunkiness around the parsing and setting of TPL options here:
kokkos-kernels/cm_generate_makefile.bash
Lines 174 to 240 in 4f3220c
get_kernels_tpls_list() {
  echo "parsing KOKKOSKERNELS_TPLS=$KOKKOSKERNELS_TPLS"
  KOKKOSKERNELS_TPLS_LIST_CMD=
  KOKKOSKERNELS_USER_TPL_PATH_CMD=
  KOKKOSKERNELS_USER_TPL_LIBNAME_CMD=
  CUBLAS_DEFAULT=OFF
  CUSPARSE_DEFAULT=OFF
  CUSOLVER_DEFAULT=OFF
  ROCBLAS_DEFAULT=OFF
  ROCSPARSE_DEFAULT=OFF
  PARSE_TPLS_LIST=$(echo $KOKKOSKERNELS_TPLS | tr "," "\n")
  for TPLS_ in $PARSE_TPLS_LIST
  do
    UC_TPLS=$(echo $TPLS_ | tr "[:lower:]" "[:upper:]")
    KOKKOSKERNELS_TPLS_CMD="-DKokkosKernels_ENABLE_TPL_${UC_TPLS}=ON ${KOKKOSKERNELS_TPLS_CMD}"
    if [ "$UC_TPLS" == "CUBLAS" ]; then
      CUBLAS_DEFAULT=ON
    fi
    if [ "$UC_TPLS" == "CUSPARSE" ]; then
      CUSPARSE_DEFAULT=ON
    fi
    if [ "$UC_TPLS" == "CUSOLVER" ]; then
      CUSOLVER_DEFAULT=ON
    fi
    if [ "$UC_TPLS" == "ROCBLAS" ]; then
      ROCBLAS_DEFAULT=ON
    fi
    if [ "$UC_TPLS" == "ROCSPARSE" ]; then
      ROCSPARSE_DEFAULT=ON
    fi
    if [ "$UC_TPLS" == "BLAS" ]; then
      if [ "$BLAS_PATH" != "" ]; then
        echo User BLAS_PATH=$BLAS_PATH
        KOKKOSKERNELS_USER_TPL_PATH_CMD="-DBLAS_LIBRARY_DIRS=${BLAS_PATH} ${KOKKOSKERNELS_USER_TPL_PATH_CMD}"
      fi
      if [ "$BLAS_LIBNAME" != "" ]; then
        echo User BLAS_LIBNAME=$BLAS_LIBNAME
        KOKKOSKERNELS_USER_TPL_LIBNAME_CMD="-DBLAS_LIBRARIES=${BLAS_LIBNAME} ${KOKKOSKERNELS_USER_TPL_LIBNAME_CMD}"
      fi
    fi
    if [ "$UC_TPLS" == "LAPACK" ] || [ "$UC_TPLS" == "BLAS" ]; then
      if [ "$LAPACK_PATH" != "" ]; then
        echo User LAPACK_PATH=$LAPACK_PATH
        KOKKOSKERNELS_USER_TPL_PATH_CMD="-DLAPACK_LIBRARY_DIRS=${LAPACK_PATH} ${KOKKOSKERNELS_USER_TPL_PATH_CMD}"
      fi
      if [ "$LAPACK_LIBNAME" != "" ]; then
        echo User LAPACK_LIBNAME=$LAPACK_LIBNAME
        KOKKOSKERNELS_USER_TPL_LIBNAME_CMD="-DLAPACK_LIBRARIES=${LAPACK_LIBNAME} ${KOKKOSKERNELS_USER_TPL_LIBNAME_CMD}"
      fi
    fi
  done
  if [ "$CUBLAS_DEFAULT" == "OFF" ]; then
    KOKKOSKERNELS_TPLS_CMD="-DKokkosKernels_ENABLE_TPL_CUBLAS=OFF ${KOKKOSKERNELS_TPLS_CMD}"
  fi
  if [ "$CUSPARSE_DEFAULT" == "OFF" ]; then
    KOKKOSKERNELS_TPLS_CMD="-DKokkosKernels_ENABLE_TPL_CUSPARSE=OFF ${KOKKOSKERNELS_TPLS_CMD}"
  fi
  if [ "$CUSOLVER_DEFAULT" == "OFF" ]; then
    KOKKOSKERNELS_TPLS_CMD="-DKokkosKernels_ENABLE_TPL_CUSOLVER=OFF ${KOKKOSKERNELS_TPLS_CMD}"
  fi
  if [ "$ROCBLAS_DEFAULT" == "OFF" ]; then
    KOKKOSKERNELS_TPLS_CMD="-DKokkosKernels_ENABLE_TPL_ROCBLAS=OFF ${KOKKOSKERNELS_TPLS_CMD}"
  fi
  if [ "$ROCSPARSE_DEFAULT" == "OFF" ]; then
    KOKKOSKERNELS_TPLS_CMD="-DKokkosKernels_ENABLE_TPL_ROCSPARSE=OFF ${KOKKOSKERNELS_TPLS_CMD}"
  fi
}
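For illustration only, here is a minimal sketch of what a follow-on cleanup could look like if the explicit OFF defaults were dropped. The helper name `tpl_flags` is hypothetical and is not part of cm_generate_makefile.bash; it simply emits an ON flag for each TPL in a comma-separated list and nothing else. Unlike the snippet above, it appends rather than prepends, so the flags keep the order of the input list.

```shell
#!/usr/bin/env bash
# Hypothetical helper (an assumption, not the script's actual code):
# emit ON flags only for the listed TPLs, with no explicit OFF flags.
tpl_flags() {
  local list="$1" flags="" tpl uc
  local IFS=','                      # split the list on commas
  for tpl in $list; do
    [ -z "$tpl" ] && continue        # skip empty entries
    uc=$(printf '%s' "$tpl" | tr '[:lower:]' '[:upper:]')
    flags="${flags:+$flags }-DKokkosKernels_ENABLE_TPL_${uc}=ON"
  done
  printf '%s\n' "$flags"
}

tpl_flags "blas,rocblas,rocsparse"
```

Whether dropping the OFF flags is actually safe depends on the CMake defaults for each TPL, which is why it should be tested before being relied on.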
I checked and can confirm this is the exact CMake command that cm_test_all_sandia generates. This exact command has been verified to configure, build, and test properly on the mi210s. I would recommend using this to start with and cleaning it up in one or more follow-on PRs.
Yeah, I wasn't contesting that they are set by cm_test_all_sandia, just pointing out that they are there as an unnecessary artifact of the snippet above and that cleanup would be safe.
Thanks for confirming the matching configuration. I recognize now that your aim with this PR is to have completely matching configs as a baseline. We can clean up unnecessary options in an immediate follow-on PR.
Sure thing. Yeah, this is a first pass at transitioning from AutoTester to AutoTester2.
-DKokkosKernels_ENABLE_TPL_ROCSOLVER=ON \
-DKokkosKernels_ENABLE_TPL_ROCSPARSE=ON \
-DKokkosKernels_ENABLE_TPL_BLAS=ON \
-DKokkosKernels_ENABLE_TPL_CUBLAS=OFF \
You can also remove cublas
Same as above. I will triple-check, but I will leave these to match the old configs from cm_test_all_sandia. All of these are global state and can have all kinds of 'fun' side effects.
.github/workflows/power9.yml
Outdated
-DKokkos_ENABLE_SERIAL=${{ matrix.serial }} \
-DKokkos_ENABLE_OPENMP=${{ matrix.openmp }} \
-DKokkos_ARCH_PASCAL60=ON \
-DKokkos_ARCH_POWER8=ON \
Is it Power8 or Power9? The name of that workflow is power9, so we should check eventually.
Same as above WRT cm_test_all_sandia.
@e10harvey what machine was this tested on? Pascal60 and Power8 have not been available as options in cm_test_all_sandia since white/ride were decommissioned.
Also, this is a Serial / OpenMP build, so the GPU arch does not need to be specified; we can just drop it.
This has not been tested yet; only the 7 items checked off in the list above have been tested and confirmed to pass.
Also, FWIW, this is the exact CMake command being generated by cm_test_all_sandia for KokkosKernels_PullRequest_GCC930_Light_Tpls_GCC930_Tpls_CLANG13CUDA10.
I looked at the job to sanity check where the arch vars come from; it looks like they are passed in manually (not preset within the script). I'm assuming this may have been copy-paste carryover from testing on a previous machine. Thanks for pointing to that job to clarify.
So same thing as what I wrote below, can we remove it from the PR and only add it later when it is ready?
Sure, no problem.
.github/workflows/power9.yml
Outdated
-DCMAKE_INSTALL_PREFIX=$PWD/../install-${{ matrix.backend }} \
-DKokkos_ENABLE_SERIAL=${{ matrix.serial }} \
-DKokkos_ENABLE_OPENMP=${{ matrix.openmp }} \
-DKokkos_ARCH_PASCAL60=ON \
Actually, I do not think that we still have a Pascal build, or that we support this architecture at all.
Shouldn't this be V100?
No, same as above WRT cm_test_all_sandia.
.github/workflows/power9.yml
Outdated
-DKokkosKernels_ENABLE_TPL_CUSOLVER=OFF \
-DKokkosKernels_ENABLE_TPL_BLAS=OFF \
-DKokkosKernels_ENABLE_TPL_CUBLAS=OFF \
-DKokkosKernels_ENABLE_TPL_CUSPARSE=OFF \
Again, all the CUDA and ROCm TPLs will be off automatically because the associated Kokkos backends are not enabled.
same as above WRT cm_test_all_sandia.
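As a rough sketch of the cleanup suggested in this thread, assuming the reviewer's point holds that CUDA and ROCm TPLs are off by default when their backends are disabled, the redundant flags could be filtered out of an argument list like this. The function name `drop_gpu_tpl_off` is made up for illustration and does not exist in the repository.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print each argument on its own line, skipping explicit
# OFF flags for CUDA (CU*) and ROCm (ROC*) TPLs. All other flags pass through.
drop_gpu_tpl_off() {
  local arg
  for arg in "$@"; do
    case "$arg" in
      -DKokkosKernels_ENABLE_TPL_CU*=OFF|-DKokkosKernels_ENABLE_TPL_ROC*=OFF) ;;
      *) printf '%s\n' "$arg" ;;
    esac
  done
}

# The four flags from the diff context above; only the BLAS flag survives.
drop_gpu_tpl_off \
  -DKokkosKernels_ENABLE_TPL_CUSOLVER=OFF \
  -DKokkosKernels_ENABLE_TPL_BLAS=OFF \
  -DKokkosKernels_ENABLE_TPL_CUBLAS=OFF \
  -DKokkosKernels_ENABLE_TPL_CUSPARSE=OFF
```

This only trims the command line; it does not verify that the resulting configuration is equivalent, which would still need a configure run to confirm.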
.github/workflows/power9_tpls.yml
Outdated
-DKokkosKernels_ENABLE_TPL_ROCSPARSE=OFF \
-DKokkosKernels_ENABLE_TPL_ROCBLAS=OFF \
-DKokkosKernels_ENABLE_TPL_CUSOLVER=OFF \
-DKokkosKernels_ENABLE_TPL_BLAS=ON \
Same as above, but here you might want to turn on LAPACK since BLAS is on.
same as above WRT cm_test_all_sandia.
.github/workflows/power9_tpls.yml
Outdated
-DKokkosKernels_ENABLE_TPL_CUSPARSE=OFF \
-DCMAKE_EXE_LINKER_FLAGS="-lgfortran -lm" \
-DLAPACK_LIBRARIES=lapack \
-DBLAS_LIBRARIES=blas \
That one might be openblas actually? Usually I also have to specify -D BLAS_LIBRARY_DIR=/path/to/openblas/lib
Same thing would be needed for LAPACK
same as above WRT cm_test_all_sandia.
Okay but that one is most likely not working at all. Do you have a config output for this command line?
Since the BLAS/LAPACK implementation used is supposedly openblas based on the name of the build, the library will not be able to find it.
This has not been tested yet; only the 7 items checked off in the list above have been tested and confirmed to pass.
Okay, in that case can we remove it from this PR?
My experience tells me that if we check it in with known issues and do not work them out immediately, it will likely stay there forever, or until something important goes wrong and we have to fix it while we are putting out a fire.
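One way to catch the BLAS/LAPACK naming concern raised in this thread before running CMake is a quick pre-configure check along the following lines. This is only a sketch: `have_lib` is a hypothetical helper, the `/usr/lib` fallback is a placeholder, and `BLAS_PATH` / `BLAS_LIBNAME` are borrowed from the variable names used in cm_generate_makefile.bash.

```shell
#!/usr/bin/env bash
# Hypothetical check: does lib<name>.so or lib<name>.a exist in <dir>?
have_lib() {
  local dir="$1" name="$2"
  [ -e "$dir/lib$name.so" ] || [ -e "$dir/lib$name.a" ]
}

# Placeholder defaults; adjust to the actual OpenBLAS install location.
BLAS_DIR="${BLAS_PATH:-/usr/lib}"
BLAS_NAME="${BLAS_LIBNAME:-blas}"

if have_lib "$BLAS_DIR" "$BLAS_NAME"; then
  echo "found lib${BLAS_NAME} in ${BLAS_DIR}"
else
  echo "lib${BLAS_NAME} not found in ${BLAS_DIR}"
fi
```

For an OpenBLAS-based build, running this with `BLAS_LIBNAME=openblas` and the matching `BLAS_PATH` would show whether the name and directory agree before the configure step is attempted.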
@ndellingwood, @lucbv: In case I didn't communicate this clearly, this PR is a first pass at transitioning to AutoTester2. In short, the
@e10harvey sure, I understand that this is a work in progress so I am not too picky, but at the same time I think removing the yaml files that are clearly wrong, even if coming from our old buggy configuration, is still fair in a first pass.
I agree
This PR adds KokkosKernels AutoTester2 CI configs as GitHub Actions workflows. Each CI check runs in a container and is currently set to run manually (you need to push a button in the PR to run it; it won't run automatically).
Each entry in the summary below maps the AutoTester CI check name to its AutoTester2 CI check name (<AUTOTESTER_CI_NAME> -> <AUTOTESTER2_CI_NAME>). Everything checked off below has been tested. For an example of what this looks like for the kokkos-kernels developers, see: https://github.com/testing-autotester4github/kokkos-kernels/actions/runs/8602581477/job/23712104423.
@lucbv, @cwpearson, @ndellingwood, @srajama1: Please let me know which of the PR_* checks listed below you would like to run alongside the AutoTester as non-blocking CI checks so I can push yaml file updates here to ensure they run automatically in every PR.
Summary:
KokkosKernels_PullRequest_GCC930_Light_Tpls_GCC930_Tpls_CLANG13CUDA10 -> PR_POWER9_VOLTA70_GCC930_CLANG13_CUDA10_OPENMP_SERIAL_CUDA_LEFT_OPENBLAS_OPENLAPACK_REL