Releases: ROCm/aomp

AOMP Release 0.7-2

09 Sep 17:46
Pre-release

THIS IS AN OLD RELEASE. DO NOT DOWNLOAD. PLEASE DOWNLOAD THE LATEST RELEASE.

The source code base for this release is the clang/llvm 9.0 development trunk as of August 2, 2019. These are the other changes included in this release.

  • Fixed reductions producing an incorrect value on the device. The worker threads were calling a full work group/block barrier, which threw off synchronization between the master warp and the worker warps.
  • Fixed compilation errors on AOMP cloc examples
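
The reduction fix above concerns code of roughly the following shape. This is a minimal sketch of a standard OpenMP target reduction, not a test case taken from the AOMP repository; the function names `device_sum` and `check_device_sum` are hypothetical. When compiled without OpenMP support the pragma is ignored and the loop runs serially on the host, so the expected value is the same either way.

```c
/* Sum the integers 0..n-1 inside an OpenMP target region.
 * The reduction clause combines the per-thread partial sums;
 * map(tofrom: sum) copies the result back to the host. */
long device_sum(int n) {
    long sum = 0;
    #pragma omp target teams distribute parallel for reduction(+:sum) map(tofrom: sum)
    for (int i = 0; i < n; i++) {
        sum += i;
    }
    return sum;
}

/* Compare the offloaded result with the closed form n*(n-1)/2.
 * Returns 1 when the reduction produced the correct value, 0 otherwise. */
int check_device_sum(int n) {
    long expected = (long)n * (n - 1) / 2;
    return device_sum(n) == expected;
}
```

Calling `check_device_sum` from a small `main` and building with the AOMP clang and its OpenMP offload options (not shown here) gives a quick check that a device reduction returns the right value.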

AOMP Release 0.7-1

04 Sep 22:12
Pre-release

THIS IS AN OLD RELEASE. DO NOT DOWNLOAD. PLEASE DOWNLOAD THE LATEST RELEASE.

The source code base for this release is the clang/llvm 9.0 development trunk as of August 2, 2019. These are the other changes included in this release.

  • Added logic to use the FileID and LineNum of the parent file (the includer) instead of the includee file where the target region is located. This avoids creating symbols with the same name when including a header file that has a c++ template with a target region.
  • For OpenMP+HIP, hip is on when processing host bc, so clang must be told that the input is IR and not HIP.
  • Fixed the HIP toolchain so that the custom linker tool build-select is not called for hip applications. It is only called for openmp. This fixes a problem where kernels are not seen when multiple source files are specified.
  • Cleaned up some things to lessen the patch from upstream HIP.cpp
  • Added the hip header hip_host_runtime_api.h to avoid modifications to the hip repository
  • Added hipcc wrapper script with modifications to work from AOMP install directory
  • Check if an archive contains device code for AMDGCN.
  • Cleanup deviceRTL for amdgcn to prepare for common GPU deviceRTL
  • Added rocminfo utilities to support hip.
  • Deferred the fix for the reductions issue to 0.7-2.

AOMP Release 0.7-0

02 Aug 20:53
Pre-release

THIS IS AN OLD RELEASE. DO NOT DOWNLOAD. PLEASE DOWNLOAD THE LATEST RELEASE.

This release is a major update from 0.6-5. The source code base for this release is the clang/llvm 9.0 development trunk as of July 15, 2019. These are the other changes included in this release.

  • The package now installs in /usr/lib/aomp_0.7-X with symbolic link from /usr/lib/aomp.
  • Uses build of rocm-device-libs exactly from rocm 2.6 source files.
  • New untested infrastructure to eventually support fortran with flang
  • Moved to the new llvm-project repository. This is the new monorepo that eliminates need for clang, llvm, lld, and openmp repositories.
  • no longer build for nvptx backend, removed cuda examples
  • moved utils to aomp-extras repository
  • moved custom libraries from rocm-device-libs to aomp-device-libs
  • hcc is now built with rocm 2.6. hcc is not in the package because we only use it to build the hip runtime.
  • roct and rocr are now built from rocm 2.6 sources
  • comgr is now built from the rocm 2.6 sources.
  • fixes for a number of new test cases

AOMP Release 0.6-5

29 Jun 14:01
Pre-release

Like 0.6-4, this release 0.6-5 of aomp is based off the stable version of clang/llvm 8.0.
These are the changes found in 0.6-5 compared to the previous 0.6-4 release.

  • Added support for archives of bundles on command line.
  • Created hostcall payload on system memory instead of GPU memory. This avoids cache effects of HBM memory that gets flushed only at kernel boundaries.
  • Cleaned up examples.
  • Readability changes to various README files in docs.
  • Added SLES-15-SP1 source install dependencies and important notes for linux support.
  • Emit struct of per kernel attributes.
  • Detect and warn that a target exit data clause fails, rather than abort.
  • Fixed linking issue when archive files contain no BC files.

AOMP Release 0.6-4

17 Jun 13:31
Pre-release

Like 0.6-3, this release 0.6-4 of aomp is based off the stable version of clang/llvm 8.0.

These are the changes found in 0.6-4 compared to the previous 0.6-3 release.

  • support for building on SLES15 SP1
  • rpm package for SLES15 SP1
  • do not create a host thread for GPU hostcall services if no services are used by any kernel in the application. This fixes a performance regression we saw with openmpapps in 0.6-3 because none of those apps currently use printf on the device. This still needs more study.
  • Reorganized the github README and linked pages to make it less confusing and to ready support for more platforms.
  • removed hip wrapper scripts such as hipcc. Users must compile hip with clang++ as demonstrated in the examples to get openmp support with hip.
  • properly set amdgpu-flat-work-group-size for generic mode: add wave_size
  • add -lelf to link step of libomptarget.rtl.hsa.so
  • more gracefully exit when gpu arch of kernel does not match device arch
  • refine LIBOMPTARGET_KERNEL_TRACE: 1 => minimal, 2 => more verbose

AOMP Release 0.6-3

28 May 18:20
Pre-release

Like 0.6-2, this release is based off the stable version of clang/llvm 8.0.

These changes are from 0.6-2.

  • New support for synchronous services called hostcall.
  • The source to support hostcall can be found in a new repository called aomp-extras in the hostcall directory
  • There are minor changes to atmi to support hostcall. These are in branch atmi-0.5-063.
  • Removed printf end-of-kernel service and added to hostcall. printf is now much more reliable from the gpu.
  • Enhancements to toolchain to support static device libraries
  • Fix to correctly pick up math functions from libm-.bc. Previously the math functions were seen as builtins.
  • Suppress calls to __kmpc_push_target_count for host code, resolves undefined reference.
  • Allow -frtti flag to be honored if user requests it on command line.
  • Add AOMP/include path before /usr/local/include to pick up correct header for omp.h.
  • Generate Metadata for both SPMD and Generic offload targets.
  • Honor OMP_TEAM_LIMIT for work groups, just like OMP_NUM_TEAMS.
  • Added *_wg_size symbol to reflect compile time known thread limit for a kernel.
  • Added support to openmp runtimes to support 1024 threads per team/work group.
  • Reenabled SILoadStoreOptimizer pass after pulling upstream fix for scalar carry corruption.
  • Fixed amdgcn noinline and alwaysinline incompatibility issue for the Parallel Data Sharing Wrapper
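
The team-related items above (OMP_NUM_TEAMS, OMP_TEAM_LIMIT) can be observed with a small probe like the one below. This is a hedged sketch using only standard OpenMP calls; the function name `observed_teams` is hypothetical, and how the environment variables influence the launched team count is taken from the release note, not verified here. Without OpenMP support the function simply returns 1.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/* Return the number of teams the runtime actually launched for a
 * target teams region. Only team 0 writes the result, so there is
 * no race on the mapped variable. */
int observed_teams(void) {
    int teams = 1;  /* fallback when OpenMP is not enabled */
    #pragma omp target teams map(tofrom: teams)
    {
#ifdef _OPENMP
        if (omp_get_team_num() == 0) {
            teams = omp_get_num_teams();
        }
#endif
    }
    return teams;
}
```

Running a program that prints `observed_teams()` under different OMP_NUM_TEAMS or OMP_TEAM_LIMIT settings shows whether the runtime honors them on a given device.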

AOMP Release 0.6-2

01 May 01:34
Pre-release

This release uses the release_80 stable release of the clang/llvm/lld/openmp repositories. The artifacts for this release include the patches to the release_80 repos to support openmp for amdgcn.

Here are the fixes for 0.6-2

  • Fixed issue with constant size teams and threads.
  • Moved to the stable clang/llvm 8.0 code base
  • Fixed code in deviceRTLs/amdgcn to set Max_Warp_Number to 16; it was 64
  • Enable Float16 for 0.6-2, disabled by default in release_80 merge
  • Disabled the metadata optimization and provided the environment variable AMDGPU_ENABLE_META_OPT_BUG to enable it
  • Add archive handling for bc linking.
  • For performance, rewrite select_outline_wrapper calls to be direct calls.
    Example: change the generated code from:
    @_HASHW_DeclareSharedMemory_cpp__omp_outlined___wrapper =
    local_unnamed_addr addrspace(4) constant i64 -4874776124079246075
    call void @select_outline_wrapper(i16 0, i32 %6, i64 -4874776124079246075)
    to:
    call void @DeclareSharedMemory_cpp__omp_outlined___wrapper(i16 0, i32 %6)
  • In release_80, the Loop_tripcount API is now used, so num_groups/teams must be limited
    to no more than Max_Teams. This fixes assertok_error and snap4.
    The num_teams clause is also handled inside the loop_tripcount logic.
  • BALLOT_SYNC macro replaced with ACTIVEMASK in release_80

AOMP Release 0.6-1

15 Apr 16:56
Pre-release

Changes from 0.6-0 to 0.6-1:

  • Disabled SILoadStoreOptimizer pass to work around 64 bit address calculation issue

  • Added 6 new device APIs as extensions to the OpenMP device APIs

    • omp_ext_get_warp_id
    • omp_ext_get_lane_id
    • omp_ext_get_master_thread_id
    • omp_ext_get_smid
    • omp_ext_is_spmd_mode
    • omp_ext_get_active_threads_mask
  • Added get_launch_vals to the rtl, with a rewritten algorithm for computing threads and teams

    • Throttle code for teams and threads off by default, enabled with THREAD_TEAM_THROTTLE
  • Added support for the LLC- and OPT-specific environment variables AOMP_LLC_ARGS and AOMP_OPT_ARGS

    • Allows adding compiler options to opt and llc via env-var, useful for triage, dumps, and debug.
  • Added clang-unbundle-archive tool.

  • Added support for device library archives in clang when using -l flag.

  • Updated llvm-link to work with archives of .bc components

  • Added new method AddStaticDeviceLibs to CommonArgs.cpp that searches for static device
    libraries using -l and -L command line options in a way similar to the search method used for
    host libraries including which directories to search for. The differences from host search are:

    • Searches look for names that specify the architecture and/or GPU
    • Searches look in the libdevice subdirectory of each host directory path
    • Searches look for filenames with .a suffix before searching for .bc suffix
  • Cleanup of aomp build scripts including split of llvm component into llvm, clang, and lld.

  • Fix where llvm-config is found during build

  • Added installed binaries from llvm to help with clang lit testing

  • New build script for comgr. This is not part of the compiler build yet. Developers and those building from source can run build_comgr.sh

  • Do not build hip runtime for ppc and arm builds.

  • Added two new smoke tests and improved automation of smoke tests

  • Corrected mymcpu and mygpu for vega20

AOMP Release 0.6-0

25 Feb 23:08
Pre-release

This is the initial release of AOMP.

AOMP is the new name for HCC2. The last HCC2 release was HCC2 0.5-4.
Changes from HCC2 0.5-4

  • AOMP is built from sources for ROCm 2.1.
  • AOMP can build for Nvidia cards so install of CUDA 10 SDK is required.
  • AOMP needs to build hcc for proper build of hip.

Two of the openmpapps are known to fail. We are working to fix this in 0.6-1.

If you built aomp from source, it installs into $HOME/rocm/aomp by default. This package installs into /opt/rocm/aomp. Many of the samples look first in $HOME/rocm/aomp. To override this, set the AOMP environment variable:

export AOMP=/opt/rocm/aomp