Releases: ROCm/aomp

AOMP Release 15.0-2

20 May 01:54

These are the release notes for AOMP 15.0-2. This release uses modifications to the LLVM development trunk called the "amd-stg-open" branch. This is found at https://github.com/RadeonOpenCompute/llvm-project. The amd-stg-open branch is constantly changing as AMD merges upstream development trunk with its internal open development efforts. Some AMD modifications are experimental and/or under review for the LLVM upstream mono-repo. The AOMP release is a snapshot of amd-stg-open and supporting repositories to build various components.
For AOMP 15.0-2, the last trunk commit is 3bef90dff64fc717c5d5e33a4d5fb47a4566d04a on May 15, 2022. This is the third AOMP release for LLVM 15 development. The last amd-only commit is 651deba7aa2805d1fe19ace427548f42f2c7a29f on May 16. This forms a frozen branch now called "aomp-15.0-2". See https://github.com/RadeonOpenCompute/llvm-project/tree/aomp-15.0-2

AOMP is a "standalone" build of all necessary ROCm components with the exception of the kernel module. The non-llvm-project components for this release were built with ROCm 5.1.x sources.

The changes from 15.0-1 to 15.0-2 include:

  • Add user requested hint value AMD_unsafe_fp_atomics to match AMD_fast_fp_atomics.
  • Fixes to compile SPEC CPU with A + A options.
  • Add implementation of omp_is_initial_device to the new OpenMP runtime.
  • Add Fortran-specific functions to the new OpenMP runtime. The classic Flang compiler does not use the same OpenMP API
    as Clang and does not call __kmpc_parallel_51, the function responsible for thread parallelization: it increases the parallel level and launches parallel code.
  • Update cloc.sh in aomp-extras to pass bitcode for abi version.
  • Fix timing accuracy for OMPT target data transfer and kernel dispatch trace records.

AOMP Release 15.0-1

06 Apr 18:43

These are the release notes for AOMP 15.0-1. This release uses modifications to the LLVM development trunk called the "amd-stg-open" branch. This is found at https://github.com/RadeonOpenCompute/llvm-project. The amd-stg-open branch is constantly changing as AMD merges upstream development trunk with its internal open development efforts. Some AMD modifications are experimental and/or under review for the LLVM upstream mono-repo. The AOMP release is a snapshot of amd-stg-open and supporting repositories to build various components.
For AOMP 15.0-1, the last trunk commit is 6ec79a15cbe9539faf121b5ad39f195dc611fc09 on Mar 29, 2022. This is the second AOMP release for LLVM 15 development. The last amd-only commit is 7eb00e23dd0bd034c4b502a4a99e32b49ac010eb on Mar 26. This forms a frozen branch now called "aomp-15.0-1". See https://github.com/RadeonOpenCompute/llvm-project/tree/aomp-15.0-1

AOMP is a "standalone" build of all necessary ROCm components with the exception of the kernel module. The non-llvm-project components for this release were built with ROCm 5.1.x sources.

The changes from 15.0-0 to 15.0-1 include:

  • Switch to ROCm 5.1.x sources

AOMP Release 15.0-0

04 Apr 14:20

These are the release notes for AOMP 15.0-0. This release uses modifications to the LLVM development trunk called the "amd-stg-open" branch. This is found at https://github.com/RadeonOpenCompute/llvm-project. The amd-stg-open branch is constantly changing as AMD merges upstream development trunk with its internal open development efforts. Some AMD modifications are experimental and/or under review for the LLVM upstream mono-repo. The AOMP release is a snapshot of amd-stg-open and supporting repositories to build various components.

For AOMP 15.0, the last trunk commit is 6ec79a15cbe9539faf121b5ad39f195dc611fc09 on Mar 29, 2022. This is the first AOMP release for LLVM 15 development. The last amd-only commit is 7eb00e23dd0bd034c4b502a4a99e32b49ac010eb on Mar 26. This forms a frozen branch now called "aomp-15.0-0". See https://github.com/RadeonOpenCompute/llvm-project/tree/aomp-15.0-0

AOMP is a "standalone" build of all necessary ROCm components with the exception of the kernel module. The non-llvm-project components for this release were built with ROCm 5.0.x sources.

The changes from 14.0-3 to 15.0-0 include:

  • New development infrastructure supports source builds of supplemental components.
    There are two types of supplemental components: prerequisite and post-build. All supplemental components are built and installed in subdirectories of the directory specified by the AOMP_SUPP environment variable, which defaults to $HOME/local.
    Supplemental components are not included in the aomp installation package because they are for development, build, and test only. The prerequisite components are cmake, hwloc, and rocmsmilib. These are built with the build_prereq.sh script. Post-build components are for testing and require the AOMP or ROCm compiler to be installed. The current post-build supplemental components are openmpi, hdf5, silo, and fftw. They are built with the build_supp.sh script. build_prereq.sh is a symbolic link to build_supp.sh. For each component, the script fetches the source, builds the component, and then installs it.

  • Enhanced support for CU masking in the openmp_set_cu_mask wedge script. This now supports multiple devices and no longer requires that the number of CUs be a multiple of the number of ranks. If the total number of CUs is not a multiple of ranks, appropriate controls ensure each rank gets an equal set of CUs on some device.

  • Added new scripts to build and run the GenASiS and GESTS applications. The scripts scan for the dependent libraries provided by build_supp.sh and use them to build and run the applications.
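
The CU masking behavior described above can be pictured with a small sketch. This is not the logic of the openmp_set_cu_mask script; it is one possible policy, written under the assumption that ranks are spread round-robin over devices and that leftover CUs are simply left unused. The function name partition_cus and the (device, first_cu, cu_count) tuple layout are hypothetical.

```python
from typing import Dict, List, Tuple


def partition_cus(cus_per_device: List[int], num_ranks: int) -> Dict[int, Tuple[int, int, int]]:
    """Map each rank to (device, first_cu, cu_count); cu_count is equal for all ranks.

    Hypothetical policy for illustration only, not the AOMP script's algorithm.
    """
    if num_ranks <= 0 or not cus_per_device:
        raise ValueError("need at least one rank and one device")
    num_devices = len(cus_per_device)
    # Ranks go round-robin over devices; the busiest device hosts this many ranks.
    max_ranks_on_device = -(-num_ranks // num_devices)  # ceiling division
    # Size the share so it fits even on the smallest device when it is the busiest one.
    share = min(cus_per_device) // max_ranks_on_device
    if share == 0:
        raise ValueError("more ranks than CUs available per device")
    assignment = {}
    for rank in range(num_ranks):
        device = rank % num_devices   # which device this rank uses
        slot = rank // num_devices    # position of this rank on that device
        assignment[rank] = (device, slot * share, share)
    return assignment


if __name__ == "__main__":
    # 110 CUs is not a multiple of 4 ranks: each rank gets 27 CUs, 2 stay unused.
    for rank, (device, first_cu, count) in partition_cus([110], 4).items():
        print(f"rank {rank}: device {device}, CUs {first_cu}..{first_cu + count - 1}")
```

With one 110-CU device and four ranks, every rank receives 27 CUs on the same device, which matches the "equal set of CUs on some device" wording even though 110 is not a multiple of 4.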

Performance Improvements

  • A new type of GPU kernel called "no-loop" is created for simple target regions. Currently this is an opt-in feature because it ignores runtime environment variables that require additional loop logic.
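
As a rough illustration of the idea, the following sketch simulates the two kernel shapes on the host. It is a conceptual model only, not the code AOMP generates: the function names are hypothetical, and the assumption is that a conventional kernel keeps a per-thread loop so that it still works when the launch size is capped (for example by a runtime setting), while a "no-loop" kernel relies on one thread per iteration.

```python
def run_with_loop(n, num_threads, body):
    """Conventional shape: a fixed number of threads stride over n iterations."""
    for tid in range(num_threads):            # each simulated GPU thread
        for i in range(tid, n, num_threads):  # per-thread loop logic
            body(i)


def run_no_loop(n, body):
    """No-loop shape: one simulated thread per iteration, so no loop is needed."""
    for tid in range(n):  # simulated launch of n threads
        body(tid)         # each thread handles exactly one iteration


if __name__ == "__main__":
    x = list(range(8))
    y_loop = [0] * 8
    y_noloop = [0] * 8
    run_with_loop(8, 3, lambda i: y_loop.__setitem__(i, 2 * x[i]))
    run_no_loop(8, lambda i: y_noloop.__setitem__(i, 2 * x[i]))
    print(y_loop == y_noloop)
```

Both shapes compute the same result. The no-loop version is simpler per thread but only works if the launch really provides one thread per iteration, which is consistent with the feature being opt-in.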

Reliability improvements

  • Increase the maximum number of captured variables in a target region to 48 from 32. Future plans are to remove this maximum completely.

AOMP Release 14.0-3

08 Mar 15:25

These are the release notes for AOMP 14.0-3. This release uses modifications to the LLVM development trunk called the "amd-stg-open" branch. This is found at https://github.com/RadeonOpenCompute/llvm-project. The amd-stg-open branch is constantly changing as AMD merges upstream development trunk with its internal open development efforts. Some AMD modifications are experimental and/or under review for the LLVM mono-repo. The AOMP release is a snapshot of amd-stg-open and supporting repositories to build various components.

For AOMP 14.0-3, the last trunk commit is db01b123d012df2f0e6acf7e90bf4ba63382587c on Feb 2, 2022. That was the last upstream trunk commit before the beginning of LLVM 15 development. The last amd-only commit is b566cb1cb8f1fc8fede66a8e3af258b95009d190 on Feb 11. This forms a frozen branch now called "aomp-14.0-3". See https://github.com/RadeonOpenCompute/llvm-project/tree/aomp-14.0-3 .

AOMP is a "standalone" build of all necessary ROCm components with the exception of the kernel module. The non-llvm-project components for this release were built with ROCm 5.0 sources.

The changes from 14.0-2 to 14.0-3 include:

  • Update to ROCm 5.0 components.
  • Fix to libompd cmake to support enabling HWLOC.
  • Fix to mygpu when using it from /usr/bin

Known issues

  • For badly formed custom mappers, host access to unmapped struct members causes segfault.
  • Usage of flang -g results in differing debug_info_version and dwarf_version.

AOMP Release 14.0-2

11 Feb 22:00

These are the release notes for AOMP 14.0-2. This release uses modifications to the LLVM development trunk called the "amd-stg-open" branch. This is found at https://github.com/RadeonOpenCompute/llvm-project. The amd-stg-open branch is constantly changing as AMD merges upstream development trunk with its internal open development efforts. Some AMD modifications are experimental and/or under review for the LLVM mono-repo. The AOMP release is a snapshot of amd-stg-open and supporting repositories to build various components.

For AOMP 14.0-2, the last trunk commit is db01b123d012df2f0e6acf7e90bf4ba63382587c on Feb 2, 2022. That was the last upstream trunk commit before the beginning of LLVM 15 development. The last amd-only commit is 4dcce9a16b1685dd87069abdf1274fa75b91a928 on Feb 8. This forms a frozen branch now called "aomp-14.0-2". See https://github.com/RadeonOpenCompute/llvm-project/tree/aomp-14.0-2 .

AOMP is a "standalone" build of all necessary ROCm components with the exception of the kernel module. The non-llvm-project components for this release were built with ROCm 4.5 sources.

The changes from 14.0-1 to 14.0-2 include:

  • Device runtime performance improvements by overlapping multiple copies between host and device.
  • Inclusion of a static build of hwloc in libomp.so. This supports the use of PLACES for CPU affinity.
  • Fixes to support compilation for nvptx64.
  • Fix to support uninitialized integers in hostrpc.
  • Support for managed memory allocations in OpenMP
  • Fix to support optimization of constant indexes for shared memory arrays in a target region.
  • Fix to resolve an unresolved global found in certain rocm-device-lib bitcode files.

Known issues

  • For badly formed custom mappers, host access to unmapped struct members causes segfault.
  • Usage of flang -g results in differing debug_info_version and dwarf_version.

AOMP Release 14.0-1

20 Jan 16:16

These are the release notes for AOMP 14.0-1. This release uses modifications to the LLVM development trunk called the "amd-stg-open" branch. This is found at https://github.com/RadeonOpenCompute/llvm-project. The amd-stg-open branch is constantly changing as AMD merges upstream development trunk with its internal open development efforts. Some AMD modifications are experimental and/or under review for the LLVM mono-repo. The AOMP release is a snapshot of amd-stg-open and supporting repositories to build various components.

For AOMP 14.0-1, the last trunk commit is 9be193bc58b356e2d2e0bddff59a404358e2c75e on Jan 11. The last amd-only commit is a4a503a2b65b37f4c8e4931d502cc6d53810b5f8 on Jan 13. This forms a frozen branch now called "aomp-14.0-1". See https://github.com/RadeonOpenCompute/llvm-project/tree/aomp-14.0-1 .

This is a major update from AOMP 14.0-0. The changes include:

  • A restructuring of the clang driver to a) remove the clang-build-select-link tool and b) remove all "post" clang linking with mlink attributes on the clang -cc1 command. All device library linking is now done in the llvm-link step that follows clang -cc1. Furthermore, libraries, including the critical libomptarget-..-.bc library, are internalized by the llvm-link step to avoid unnecessary bitcode for the backend.
  • The construction of libomptarget-..-.bc library now includes rocm-device-lib functions, device libm functions, hostrpc stubs, and lastly the OpenMP deviceRTLs. This all-inclusive library simplifies the device toolchain and improves performance.
  • Elimination of the need for the aomp-extras library.

Known issues:

  • Compilation for nvidia GPUs is broken. We will fix this in 14.0-2.

AOMP Release 14.0-0

16 Nov 13:40
Pre-release

These are the release notes for AOMP 14.0-0. This release uses modifications to the LLVM development trunk called the "amd-stg-open" branch. This is found at https://github.com/RadeonOpenCompute/llvm-project. The amd-stg-open branch is constantly changing as AMD merges upstream development trunk with its internal open development efforts. Some AMD modifications are experimental and/or under review for the LLVM mono-repo. The AOMP release is a snapshot of amd-stg-open and supporting repositories to build various components.

For AOMP 14.0-0, the last trunk commit is a0633f5ccb04e4b1613eeb23af10ad729dace2b5 on Nov 8. The last amd-only commit is 8a48924725f0c53217d108b1d4b95f6ba0038031 on Nov 8. This forms a frozen branch now called "aomp-14.0-0". See https://github.com/RadeonOpenCompute/llvm-project/tree/aomp-14.0-0 . The difference from the upstream LLVM trunk is found in the patch below. It is 35563 lines across 345 files, not including test directories.

Changes from aomp 13.0-6:

  • AOMP is now based on amd-stg-open branch
  • Most components are built from ROCm release 4.5 sources
  • Components are now cloned using a manifest file. The script clone_aomp.sh is still used to clone and update repos.
  • New hip build method
  • Support for unified shared memory on gfx90a
  • Support for atomic hint clause to enable fast floating point atomics
  • Support for LLVM IR code generation with updated device RTL (deviceRTLs)
  • Support for target ID with XNACK settings
  • Support for cross-platform offload device identification LLVM library and tool (offload-arch).
  • Fixed many reduction problems and nested parallelism

Known Issues:

  • Slow CPU device-to-host data transfer speeds
  • Miniqmc, Kokkos, Raja fail to build
  • Non-deterministic failures in qmcpack deterministic tests
  • Possible incorrect linking of libclang-cpp.so in the build of libomptarget.so

Check later for more updates...

AOMP Release 13.0-6

23 Aug 22:44
Pre-release

These are the release notes for AOMP_13.0-6. The source code base for this release is the upstream LLVM 13 monorepo main branch as of April 1, 2021 with hash value 0889181625bb570e463362ab8f53f9a14c886b2e. Updates from the aomp-stg-open repo were added as one commit per different file as of April 6, 2021.

Update to ROCm 4.3 sources.
Flang cmake race condition fixed.

AOMP Release 13.0-5

29 Jul 15:07
Pre-release

THIS IS AN OLD RELEASE. DO NOT DOWNLOAD. PLEASE DOWNLOAD THE LATEST RELEASE.

These are the release notes for AOMP_13.0-5. The source code base for this release is the upstream LLVM 13 monorepo main branch as of April 1, 2021 with hash value 0889181625bb570e463362ab8f53f9a14c886b2e. Updates from the aomp-stg-open repo were added as one commit per different file as of April 6, 2021.

This release includes a demo of a new LLVM library called libLLVMOffloadArch.cpp. The clang tool offload-arch is now built with this library. The libomptarget runtime no longer runs the binary "offload-arch -c" and captures its stdout. Instead, it makes a library call to libLLVMOffloadArch.cpp to determine current capabilities. The tool offload-arch is still created with the llvm build, and the sources are in llvm-project/llvm/lib/OffloadArch/offload-arch . Updates were made so that offload-arch returns the first visible GPU, which could be the result of setting ROCM_VISIBLE_DEVICES for amdgpus.

This release starts to deprecate the use of mygpu in favor of offload-arch. A new version of mygpu calls offload-arch. The tables used to drive mygpu have been deleted. All pci-id tables for offloading identification are now in llvm library OffloadArch.

Added a new command-line option, -offload-usm, which turns on the OpenMP "requires unified_shared_memory" pragma and sets toolchain flags appropriately. This avoids having to change every source file to turn on unified shared memory.

Build changes:

  • Update list of gfx names to include gfx1030 and gfx1031

Known Issues:

  • 9 Clang lit test failures
  • Long build times when large numbers of archive libraries are needed because toolchain must unbundle the archive for device linking.

AOMP Release 13.0-4

18 Jun 16:37
Pre-release

THIS IS AN OLD RELEASE. DO NOT DOWNLOAD. PLEASE DOWNLOAD THE LATEST RELEASE.

These are the release notes for AOMP_13.0-4. The source code base for this release is the upstream LLVM 13 monorepo main branch as of April 1, 2021 with hash value 0889181625bb570e463362ab8f53f9a14c886b2e. Updates from the aomp-stg-open repo were added as one commit per different file as of April 6, 2021. This release is primarily a bug-fix release for the regressions in 13.0-3. 13.0-3 had significant driver changes to support multiple images, which caused a number of regressions that were fixed in this release. We strongly recommend deleting 13.0-3 and using this release.

Features:

  • Support larger CU masks of up to 128 bits, that is, up to 128 CUs.
  • Warn when HIP code tries to use OpenMP offloading that this is not supported and that target constructs will be ignored.
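
To make the 128-bit limit concrete, here is a minimal sketch that builds a bitmask with one bit per CU and renders it as a fixed-width hex string. The helper name cu_mask_hex and the hex output format are assumptions for illustration; the format that AOMP's scripts or runtime actually accept may differ.

```python
def cu_mask_hex(first_cu: int, count: int, total_bits: int = 128) -> str:
    """Return a hex bitmask with `count` bits set, starting at bit `first_cu`.

    Bit i set means CU i is enabled. Hypothetical helper, not an AOMP API.
    """
    if first_cu < 0 or count <= 0 or first_cu + count > total_bits:
        raise ValueError("CU range does not fit in the mask")
    mask = ((1 << count) - 1) << first_cu
    # Pad to the full mask width: 4 bits per hex digit.
    return f"0x{mask:0{total_bits // 4}x}"


if __name__ == "__main__":
    print(cu_mask_hex(0, 4))    # lowest four CUs
    print(cu_mask_hex(120, 8))  # highest eight CUs of a 128-bit mask
```

Python integers are arbitrary precision, so the 128-bit mask needs no special handling here; in C or shell the same mask would typically have to be split into several 32-bit or 64-bit words.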

Fixes:

  • Fixed examples/cloc/vector_copy_hip and vector_copy_hip_omp to use HIP_PLATFORM_AMD instead of the deprecated HIP_PLATFORM_HCC.
  • Fixed examples/hip/device_lib to unbundle openmp library since toolchain is looking for hip library.
  • Fixed examples/hip/writeIndex.
  • Fixed test/hip-openmp/aomp_hip_launch_test. Bug in driver.cpp and new name for a.out file.
  • Fixed test/hip-openmp/hip_host_register.
  • Fixed hipcc when using noroot installs.
  • Fixed host compilation picking up the wrong libomp.so on some systems.
  • Fixed RPATH to include lib64.
  • Fixed RAJA build issue.
  • Fixed issue where targetID was not handled properly with march/fopenmp-targets.

Build changes:

  • Update list of gfx names to include gfx90a
  • Update list of NVPTX GPU names to: 30, 35, 37, 50, 52, 53, 60, 61, 62

Known Issues:

  • 9 Clang lit test failures
  • Long build times when large numbers of archive libraries are needed because toolchain must unbundle the archive for device linking.