Releases: GridTools/gridtools

GridTools version 1.1.0

07 Oct 12:02
12ee091

GridTools

In GridTools v1.1.0 we set the default C++ standard to C++14 and drop compatibility with C++11. This requires at least CUDA 9.0.

Changes since v1.0.0

Full introduction of the SID concept

The backend is completely restructured based on the SID (stencil iterable data) concept. There should be no user-facing changes as long as user code only used the documented public API (*). The changes separate the backend implementation from the core library to allow non-intrusive extension of the library with new backends. Additionally, the maintainability of the GridTools infrastructure is significantly improved.
Performance should improve in general, but might be worse for specific computations. We did not observe a common pattern for performance improvement or degradation.

(*) There is one change which might trigger different behavior (though the old behavior was not documented): temporary fields are now implicitly three-dimensional. Prior to this version, the user could have abused a 2D temporary field for accumulating values between k-levels.

New

  • New example illustrating the type-erasure pattern for computations. #1318
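
For readers unfamiliar with the pattern, the following is a minimal, generic C++14 sketch of type erasure. It is not the example added in #1318 and does not use any GridTools API: the names sketch::any_computation and sketch::counting_computation are hypothetical and exist only for illustration. The idea is that a handle with a fixed type hides the concrete (possibly heavily templated) computation type behind a std::function.

```cpp
#include <cassert>
#include <functional>
#include <utility>

namespace sketch {
    // Hypothetical type-erased handle (not a GridTools type): any object with a
    // run() member can be stored, and callers only ever see any_computation.
    class any_computation {
        std::function<void()> m_run;

      public:
        template <class Computation>
        explicit any_computation(Computation c)
            : m_run([c = std::move(c)]() mutable { c.run(); }) {}

        void run() { m_run(); }
    };

    // Hypothetical concrete computation used to exercise the handle.
    struct counting_computation {
        int *counter;
        void run() { ++*counter; }
    };
} // namespace sketch
```

With such a handle, the concrete type only needs to be visible in the translation unit that constructs it; all other code can be compiled against the erased type.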

Deprecation (support will be removed in GridTools v2.0.0)

  • Using gridtools::c_bindings is deprecated. Switch to the standalone cpp_bindgen: https://github.com/GridTools/cpp_bindgen.
  • global_accessor is deprecated, use in_accessor (without extents) instead.
  • make_global_parameter with backend as template parameter is deprecated. The backend is not needed anymore.

Fixes / Cleanup

  • Fix performance for CUDA 9.2 / 10.0 #1281 #1327 #1339
  • Use C++14 features. #1307
  • Use multiple threads in storage initialization. #1300
  • Remove dependency on boost::mpl and boost::fusion
  • Fixes required to compile GridTools with HIP-Clang. Full support for AMD GPUs via HIP-Clang will come in a future release. #1363
  • Fix a bug in communication. #1355
  • global_parameter no longer requires pre-allocated storage, because with CUDA it is now passed via constant memory. It is therefore a lightweight wrapper around the value type and can be created without overhead, e.g. when passing it to computation.run().

Infrastructure/Development

  • The bash build script is replaced by a Python-driven build process, see the wiki for how to get the environment. #1273 #1298 #1341
  • Improved Jenkins performance plots. #1301 #1338
  • Googletest is now pulled in with CMake's FetchContent instead of being part of the repository. #1310

GridTools version 1.0.3

07 Oct 12:10
8468d20

Fixes

  • Fix a module in communication #1356
  • CMake: fix storage module #1353

GridTools version 1.0.2

05 Aug 13:15
2d42ea7

Fixes

  • The workaround implemented in v1.0.1 did not fully recover CUDA 8.0 performance for CUDA >= 9.2. A further workaround now recovers performance. See #1326.
  • Make GT_DEFAULT_VERTICAL_BLOCK_SIZE macro modifiable for the user. See #1350.

GridTools version 1.0.1

16 May 07:56

Fixes

  • Workaround for performance regression in CUDA 9.2 and newer, see #1223.

GridTools version 1.0.0

05 Apr 16:49
5dfeace

GridTools

An introduction to GridTools can be found in the documentation. Functionality as described in the documentation is considered public API, other functionality is considered internal and might change without notice.

Upgrading from pre-release versions

In the process of finalizing GridTools v1.0, the API was changed in many places across the past pre-release versions. See the descriptions of those releases for information on how to update to the latest API.

Changes since v0.21.0

API breaking changes

The backend strategy (naive/block) was removed and replaced by a separate naive backend. (#1238, #1240, #1244)

In the process, the target tags became obsolete as they were just referring to a backend. Therefore target was renamed to backend.
To update, apply the following changes:

  • backend_t::make_global_parameter(…) is now make_global_parameter<backend_t>(). Same for update_global_parameter.
  • backend_t::storage_traits_t was removed, use storage_traits<backend_t> instead.
  • target::X is now called backend::X.
  • CMake variables GT_ENABLE_TARGET_X are renamed to GT_ENABLE_BACKEND_X.

Other

  • Already deprecated functions were removed (#1232)
  • Removed 2D and packing version from gcl (#1233)
  • The last 2 parameters of axis are encapsulated in types, and the order of these parameters is reversed, e.g. use axis<2, axis_config::offset_limit<4>> instead of axis<2,0,4> (#1257)
  • The call operator is removed from the global parameter (#1256)

Preparation for public release

21 Mar 14:37
33d8e68
Pre-release

Changes since 0.20.0

API breaking changes

  • Conditionals (if_, switch_) are removed.
  • Rename all files and folders with - (dash) to _ (underscore).
  • Rename reactivate_device_write_views() to reactivate_target_write_views()
  • Removed the multiple-kernel implementation for boundary conditions.

Examples

  • Examples are now provided with standalone CMakeLists.txt. The examples are used as a test for the GridTools CMake installation in our regression tests.
  • C-bindings example was added.

Performance improvements

  • mc: changed loop order and added omp statement for boundary conditions

Bug fixes

  • Restores x86 performance, which was broken in 0.20.0.
  • Restores CUDA performance for layout transformations, which was broken in 0.20.0.
  • Enable a workaround for CUDA 10.1 which already existed for CUDA < 10.1.
  • CMake: export the mpi workaround
  • CMake: fix a path for gt_bindings.cmake

API changes in preparation for the public release

27 Feb 09:13

Changes since 0.19.0

API breaking changes

Naming changes

A lot of public GridTools functions, types and macros were renamed to consistently use lower-case:

  • arg_list -> param_list as the elements are the parameters of the stencil operator (not the arguments).
  • Do-method -> apply-method
  • enumtype::in and enumtype::inout -> intent::in, intent::inout
  • execute<enumtype::forward> etc. -> execute::forward
  • access_mode::ReadOnly, ReadWrite -> access_mode::read_only, read_write
  • cache_type::IJ, K -> cache_type::ij, k
  • direction::I, J, K -> direction::i, j, k
  • ownership::ExternalCPU, ExternalGPU -> ownership::external_cpu, external_gpu
  • STRUCTURED_GRIDS -> GT_STRUCTURED_GRIDS
  • FLOAT_PRECISION -> GT_FLOAT_PRECISION
  • BACKEND_* -> GT_BACKEND_*
  • ENABLE_METERS -> GT_ENABLE_METERS
  • storage_info_interface -> storage_info

Removed

  • Removed axis<...>::with_offset_limit, axis<...>::with_extra_offsets as they were confusing. These options have to be set directly as template arguments to the axis.

Internal API changes

  • GRIDTOOLS_STATIC_ASSERT -> GT_STATIC_ASSERT
  • ASSERT_OR_THROW -> GT_ASSERT_OR_THROW
  • DISALLOW_COPY_AND_ASSIGN -> GT_DISALLOW_COPY_AND_ASSIGN
  • _USE_GPU_ -> GT_USE_GPU
  • GTREPO_* -> GT_REPO_*
  • GRIDTOOLS_PP_* -> GT_PP_*
  • PEDANTIC -> GT_PEDANTIC
  • VERBOSE -> GT_VERBOSE
  • RESTRICT -> GT_RESTRICT
  • __DISABLE_CACHING__ -> GT_DISABLE_CACHING
  • META_STORAGE_INDEX_LIMIT -> GT_META_STORAGE_INDEX_LIMIT
  • Removed ALLOW_EMPTY_EXTENTS, _USE_DATATYPES_
  • Added GT_-prefix to some file-local macros to minimize conflict probability.
  • _GCL_GPU_ -> GCL_GPU
  • _GCL_MPI_ -> GCL_MPI
  • CUDAMSG -> GCL_CUDAMSG
  • _GCL_CHECK_DESTRUCTOR -> GCL_CHECK_DESTRUCTOR
  • HOSTWORKAROUND -> GCL_HOSTWORKAROUND
  • NULL -> nullptr
  • Added GCL_-prefix to GCL macros.
  • Replaced GCL header guards by #pragma once

Other API changes

  • Structured grids are now the default.
  • Users should use make_param_list to create the param_list instead of explicitly using boost::mpl::vector. In the future, using boost::mpl::vector might not work anymore: the underlying type is an implementation detail, not public API.
  • cache_type is now an enum class. Update code by prefixing all ij and k with cache_type::
  • Introduces make_expandable_computation(expand_factor<N>, ...) and removes the respective overload of make_computation; and make_expandable_positional_computation(expand_factor<N>, ...) and removes the respective overload of make_positional_computation
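
The cache_type item above follows from how scoped enums work in C++. The sketch below uses a locally defined stand-in enum, not the GridTools header, and the helper is_k_cache is hypothetical; it only illustrates why the enumerators now need the cache_type:: prefix.

```cpp
#include <cassert>

namespace sketch {
    // Local stand-in for illustration only; the real definition lives in the
    // GridTools headers and may differ in detail.
    enum class cache_type { ij, k };

    // Hypothetical helper. With a scoped enum, writing plain `k` here would not
    // compile: the enumerator has to be spelled cache_type::k.
    constexpr bool is_k_cache(cache_type c) { return c == cache_type::k; }
} // namespace sketch
```

Scoped enumerators also do not convert implicitly to int, so code that relied on such conversions would need an explicit static_cast.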

New functionality

  • Distributed boundaries: timers for pack/unpack, exchange, and boundary condition.

New example

  • Tridiagonal solver

Bug fixes

  • Fix CUDA type unsigned long long char, which was a copy and paste bug from the CUDA programming guide where they are missing a comma.
  • Add != to halo_descriptors (== already existed).
  • fortran_array_adapter: Throw if datastore was not allocated.
  • c_bindings: wrap line for procedures.
  • repository: bindings support to add a prefix.
  • In CUDA temporaries are only allocated if they are not cached.
  • User-friendly error on missing backend in make_computation.
  • User-friendly error for the argument type check of make_multistage.
  • Added back check_grid_against_extents.
  • communication: only exchange the part of the buffer which is actually used by the exchange (not the full allocated buffer)
  • Work around an nvcc problem with unrolling a loop in hypercube_iterator.
  • Fix to the pointer sharing constructor of storage_info.

Other changes

  • Documentation was updated

Internal changes

  • Added hymap which is a boost::fusion-like map.
  • Updates to sid

New versioning scheme

28 Jan 14:22
Pre-release

Starting with this release we introduce a new versioning scheme.

Changes since 1.08.02 (which would have been 0.18.2 in the new versioning scheme).

New versioning scheme

Version number: X.Y.Z

  • X: Major version will be 0 until the public release, then it will be 1, probably until a new major feature, e.g. complete icgrid.
  • Y: Minor version will be increased after every API change and new smaller features, probably very often.
  • Z: Patch version will be increased for bug fixes.

The CMake version matching is changed in this release to COMPATIBILITY SameMinorVersion, which means the following: if the user requires find_package(GridTools 0.18.2), then 0.18.3 (a newer patch release) is compatible, while 0.18.1 (older than the requested release) and 0.19.0 (a newer minor release) are rejected.
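
The matching rule can be restated as a small predicate. This is only an illustration of the rule as described above, written in C++ for consistency with the library; the type version and the function same_minor_compatible are hypothetical names, and the actual check is performed by CMake, not by this code.

```cpp
#include <cassert>

// Hypothetical stand-in for a version number X.Y.Z (not a GridTools or CMake type).
struct version {
    int x; // major
    int y; // minor
    int z; // patch
};

// Returns true if `found` satisfies `requested` under the rule described above:
// same major and minor version, and a patch level that is not older than requested.
constexpr bool same_minor_compatible(version requested, version found) {
    return found.x == requested.x
        && found.y == requested.y
        && found.z >= requested.z;
}
```

For the case in the text, same_minor_compatible(version{0, 18, 2}, version{0, 18, 3}) is true, while both version{0, 18, 1} and version{0, 19, 0} are rejected.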

API breaking changes

Removes reduction support from the stencil-composition API

  • make_reduction is removed
  • The computation type erasure no longer has ReturnType as its first template argument, i.e. computation<void, args...> needs to be replaced by computation<args...>.
  • run method of computation returns void now.

New functionality

Possibility to query intent and extent for placeholders from computation

  • computation.get_arg_intent(my_arg()) returns enumtype::intent
  • computation.get_arg_extent(my_arg()) returns rt_extent which contains extents in i,j,k directions

Performance improvements

  • Several unneeded cudaDeviceSynchronize() calls in boundary_conditions are removed.

Bug fixes

  • c_bindings: support for multiple template arguments in generic bindings macro

Internal changes

  • added convenience library for integral constants with __host__ __device__ conversion and construction with custom literal _c
  • SID utilities
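
As background for the _c literal mentioned above, here is a minimal host-only C++14 sketch of how a user-defined literal can produce an integral constant whose value is part of its type. It is an assumption-laden illustration, not the GridTools implementation: the namespace sketch, the helper parse_decimal, and the choice of int as value type are all hypothetical, the CUDA __host__ __device__ annotations are omitted, and only plain decimal literals are handled.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

namespace sketch {
    // Folds the digit characters of a plain decimal literal into a number at
    // compile time. Hex/octal prefixes and digit separators are not handled.
    template <char... Digits>
    constexpr int parse_decimal() {
        const char digits[] = {Digits...};
        int value = 0;
        for (std::size_t i = 0; i < sizeof...(Digits); ++i)
            value = value * 10 + (digits[i] - '0');
        return value;
    }

    namespace literals {
        // Literal operator template: 42_c yields std::integral_constant<int, 42>.
        template <char... Digits>
        constexpr std::integral_constant<int, parse_decimal<Digits...>()> operator""_c() {
            return {};
        }
    } // namespace literals
} // namespace sketch
```

After `using namespace sketch::literals;`, an expression such as 42_c carries its value in its type, so it can be used where a compile-time constant is required (for example as a template argument via decltype(42_c)::value).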