Releases: GridTools/gridtools
GridTools version 1.1.0
GridTools
In GridTools v1.1.0 we set the default C++ standard to C++14 and drop support for C++11. This requires at least CUDA 9.0.
Changes since v1.0.0
Full introduction of the SID concept
The backend is completely restructured based on the SID (Stencil Iterable Data) concept. There should be no user-facing changes as long as user code only used the documented public API (*). The changes separate the backend implementation from the core library, which allows non-intrusive extension of the library with new backends. Additionally, maintainability of the GridTools infrastructure is significantly improved.
Performance should improve in general, but might be worse for specific computations. No common pattern for performance improvement or degradation has been observed.
(*) One change might trigger different behavior (though the old behavior was not documented): temporary fields are now implicitly three-dimensional. Prior to this version, a user could have abused a 2D temporary field to accumulate values between k-levels.
New
- New example illustrating the type-erasure pattern for computations. #1318
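For readers unfamiliar with the term, the following is a minimal, generic sketch of type erasure in C++14. It is not the example added in #1318 and does not use any GridTools API: `any_computation` and `counting_computation` are hypothetical names, and the only assumption is that the wrapped object offers a `run()` member.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical type-erased handle: hides the concrete (often heavily
// templated) computation type behind a small virtual interface.
class any_computation {
    struct concept_t {
        virtual ~concept_t() = default;
        virtual void run() = 0;
    };

    template <class T>
    struct model_t final : concept_t {
        explicit model_t(T impl) : impl_(std::move(impl)) {}
        void run() override { impl_.run(); }
        T impl_;
    };

    std::unique_ptr<concept_t> self_;

public:
    // Accepts any type that provides a run() member function.
    template <class T>
    explicit any_computation(T impl)
        : self_(std::make_unique<model_t<T>>(std::move(impl))) {}

    void run() {
        assert(self_ != nullptr);
        self_->run();
    }
};

// Stand-in for a concrete computation; it only counts how often it ran.
struct counting_computation {
    int* counter;
    void run() { ++*counter; }
};
```

With such a wrapper, only the translation unit that constructs the `any_computation` has to see the concrete type. Code that merely calls `run()` can be compiled against the small non-template interface.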
Deprecation (support will be removed in GridTools v2.0.0)
- Using the gridtools::c_bindings is deprecated. Switch to the standalone https://github.com/GridTools/cpp_bindgen.
- `global_accessor` is deprecated, use `in_accessor` (without extents) instead.
- `make_global_parameter` with `backend` as template parameter is deprecated. The `backend` is not needed anymore.
Fixes / Cleanup
- Fix performance for CUDA 9.2 / 10.0 #1281 #1327 #1339
- Use C++14 features. #1307
- Use multiple threads in storage initialization. #1300
- Remove dependency on boost::mpl and boost::fusion
- Fixes required to compile GridTools with HIP-Clang. Full support for AMD GPUs via HIP-Clang will come in a future release. #1363
- Fix a bug in communication #1355.
- The `global_parameter` doesn't require pre-allocated storage (as it is now passed via constant memory in case of CUDA), therefore `global_parameter` is a lightweight wrapper around the value type, which can be created without overhead, e.g. when passing it to `computation.run()`.
Infrastructure/Development
GridTools version 1.0.3
GridTools version 1.0.2
GridTools version 1.0.1
Fixes
- Workaround for performance regression in CUDA 9.2 and newer, see #1223.
GridTools version 1.0.0
GridTools
An introduction to GridTools can be found in the documentation. Functionality as described in the documentation is considered public API, other functionality is considered internal and might change without notice.
Upgrading from pre-release versions
In the process of finalizing GridTools v1.0, the API was changed in many places across the pre-release versions. See the descriptions of those releases for information on how to update to the latest API.
Changes since v0.21.0
API breaking changes
The backend strategy (naive/block) was removed and replaced by a separate `naive` backend. (#1238, #1240, #1244)
In the process, the `target` tags became obsolete as they were just referring to a `backend`. Therefore `target` was renamed to `backend`.
To update, apply the following changes:
- `backend_t::make_global_parameter(…)` is now `make_global_parameter<backend_t>()`. Same for `update_global_parameter`.
- `backend_t::storage_traits_t` was removed, use `storage_traits<backend_t>` instead.
- `target::X` is now called `backend::X`.
- CMake variables `GT_ENABLE_TARGET_X` are renamed to `GT_ENABLE_BACKEND_X`.
Other
- Already deprecated functions were removed (#1232)
- Removed 2D and packing version from gcl (#1233)
- The last 2 parameters of axis are encapsulated in types, and the order of these parameters is reversed, e.g. use `axis<2, axis_config::offset_limit<4>>` instead of `axis<2,0,4>` (#1257)
- The call operator is removed from the global parameter (#1256)
Preparation for public release
Changes since 0.20.0
API breaking changes
- Conditionals (`if_`, `switch_`) are removed.
- Rename all files and folders with `-` (dash) to `_` (underscore).
- Rename `reactivate_device_write_views()` to `reactivate_target_write_views()`
- Removed the multiple-kernel implementation for boundary conditions.
Examples
- Examples are now provided with standalone CMakeLists.txt. The examples are used as a test for the GridTools CMake installation in our regression tests.
- C-bindings example was added.
Performance improvements
- mc: changed loop order and added omp statement for boundary conditions
Bug fixes
- Restores x86 performance, which was broken in 0.20.0.
- Restores CUDA performance for layout transformations, which was broken in 0.20.0.
- Enable a workaround for CUDA 10.1 which already existed for CUDA < 10.1.
- CMake: export the mpi workaround
- CMake: fix a path for gt_bindings.cmake
API changes in preparation for the public release
Changes since 0.19.0
API breaking changes
Naming changes
A lot of public GridTools functions, types and macros were renamed to consistently use lower-case
- `arg_list` → `param_list` as the elements are the parameters of the stencil operator (not the arguments).
- `Do`-method → `apply`-method
- `enumtype::in` and `enumtype::inout` → `intent::in`, `intent::inout`
- `execute<enumtype::forward>` etc. → `execute::forward`
- `access_mode::ReadOnly`, `ReadWrite` → `access_mode::read_only`, `read_write`
- `cache_type::IJ, K` → `cache_type::ij, k`
- `direction::I, J, K` → `direction::i, j, k`
- `ownership::ExternalCPU, ExternalGPU` → `ownership::external_cpu, external_gpu`
- `STRUCTURED_GRIDS` → `GT_STRUCTURED_GRIDS`
- `FLOAT_PRECISION` → `GT_FLOAT_PRECISION`
- `BACKEND_*` → `GT_BACKEND_*`
- `ENABLE_METERS` → `GT_ENABLE_METERS`
- `storage_info_interface` → `storage_info`
Removed
- Removed `axis<...>::with_offset_limit`, `axis<...>::with_extra_offsets` as they were confusing. These options have to be set directly as template arguments to the `axis`.
Internal API changes
- `GRIDTOOLS_STATIC_ASSERT` → `GT_STATIC_ASSERT`
- `ASSERT_OR_THROW` → `GT_ASSERT_OR_THROW`
- `DISALLOW_COPY_AND_ASSIGN` → `GT_DISALLOW_COPY_AND_ASSIGN`
- `_USE_GPU_` → `GT_USE_GPU`
- `GTREPO_*` → `GT_REPO_*`
- `GRIDTOOLS_PP_*` → `GT_PP_*`
- `PEDANTIC` → `GT_PEDANTIC`
- `VERBOSE` → `GT_VERBOSE`
- `RESTRICT` → `GT_RESTRICT`
- `__DISABLE_CACHING__` → `GT_DISABLE_CACHING`
- `META_STORAGE_INDEX_LIMIT` → `GT_META_STORAGE_INDEX_LIMIT`
- Removed `ALLOW_EMPTY_EXTENTS`, `_USE_DATATYPES_`
- Added `GT_`-prefix to some file-local macros to minimize conflict probability.
- `_GCL_GPU_` → `GCL_GPU`
- `_GCL_MPI_` → `GCL_MPI`
- `CUDAMSG` → `GCL_CUDAMSG`
- `_GCL_CHECK_DESTRUCTOR` → `GCL_CHECK_DESTRUCTOR`
- `HOSTWORKAROUND` → `GCL_HOSTWORKAROUND`
- `NULL` → `nullptr`
- Added `GCL_`-prefix to GCL macros.
- Replaced GCL header guards by `#pragma once`
Other API changes
- Structured grids is now the default
- Users should use `make_param_list` to create the `param_list` instead of explicitly using `boost::mpl::vector`. In the future using `boost::mpl::vector` might not work anymore, the underlying type is implementation detail, not public API
- `cache_type` is now an enum class. Update code by prefixing all `ij` and `k` with `cache_type::`
- Introduces `make_expandable_computation(expand_factor<N>, ...)` and removes the respective overload of `make_computation`; and `make_expandable_positional_computation(expand_factor<N>, ...)` and removes the respective overload of `make_positional_computation`
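The effect of turning `cache_type` into an enum class can be illustrated with a small stand-in. The enum below only mirrors the enumerator names from the note above and is not the GridTools definition; `is_k_cache` is a hypothetical helper added for the illustration.

```cpp
#include <type_traits>

// Stand-in that mirrors the enumerator names mentioned above;
// this is not the GridTools definition.
enum class cache_type { ij, k };

// With a scoped enum the enumerators are only reachable through the
// cache_type:: prefix; an unqualified `k` would no longer compile.
constexpr bool is_k_cache(cache_type c) { return c == cache_type::k; }

static_assert(is_k_cache(cache_type::k), "qualified enumerator is accepted");
static_assert(!is_k_cache(cache_type::ij), "ij and k are distinct values");

// Scoped enumerators also no longer convert implicitly to int.
static_assert(!std::is_convertible<cache_type, int>::value,
              "no implicit conversion to int");
```

Besides requiring the prefix, the scoped enum keeps short names such as `k` out of the enclosing namespace, which is presumably the motivation for the change.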
New functionality
- Distributed boundaries: timers for pack/unpack, exchange, and boundary condition.
New example
- Tridiagonal solver
Bug fixes
- Fix CUDA type unsigned long long char, which was a copy and paste bug from the CUDA programming guide where they are missing a comma.
- Add `!=` to halo_descriptors (`==` already existed).
- fortran_array_adapter: Throw if datastore was not allocated.
- c_bindings: wrap line for procedures.
- repository: bindings support to add a prefix.
- In CUDA temporaries are only allocated if they are not cached.
- User-friendly error on missing backend in make_computation.
- User-friendly error argument type check of make_multistage.
- Added back check_grid_against_extents
- communication: only exchange the part of the buffer which is actually used by the exchange (not the full allocated buffer)
- Workaround nvcc which has problems in unrolling a loop in hypercube_iterator.
- Fix to the pointer sharing constructor of storage_info.
Other changes
- Documentation was updated
Internal changes
- Added `hymap` which is a boost::fusion-like map.
- Updates to sid
New versioning scheme
Starting with this release we introduce a new versioning scheme.
Changes since 1.08.02 (which would have been 0.18.2 in the new versioning scheme).
New versioning scheme
Version number: X.Y.Z
- X: Major version will be 0 until the public release, then it will be 1, probably until a new major feature, e.g. complete icgrid.
- Y: Minor version will be increased after every API change and new smaller features, probably very often.
- Z: Patch version will be increased for bug fixes.
The CMake version matching is changed in this release to `COMPATIBILITY SameMinorVersion` which means the following: Let's say the user requires `find_package(GridTools 0.18.2)`. Then `0.18.3` (a newer patch release) will be compatible; `0.18.1` (an older than requested release) and `0.19.0` (a newer minor release) will be rejected.
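The matching rule described above can be written down as a small predicate. This is only a sketch of the rule as stated in these notes, not CMake's implementation; the `version` struct and the function name `same_minor_compatible` are made up for the illustration.

```cpp
// Hypothetical stand-in for a package version; not a GridTools or CMake type.
struct version {
    int major_;
    int minor_;
    int patch_;
};

// Sketch of the SameMinorVersion rule as described above: the installed
// package is accepted only if major and minor match the request exactly
// and its patch level is at least the requested one.
constexpr bool same_minor_compatible(version installed, version requested) {
    return installed.major_ == requested.major_ &&
           installed.minor_ == requested.minor_ &&
           installed.patch_ >= requested.patch_;
}

// The three cases from the text, for a request of 0.18.2.
static_assert(same_minor_compatible({0, 18, 3}, {0, 18, 2}),
              "a newer patch release is accepted");
static_assert(!same_minor_compatible({0, 18, 1}, {0, 18, 2}),
              "an older patch release is rejected");
static_assert(!same_minor_compatible({0, 19, 0}, {0, 18, 2}),
              "a newer minor release is rejected");
```

Under this rule, a project that depends on GridTools has to raise the requested version explicitly whenever it moves to a new minor release, which matches the statement that minor versions may contain API changes.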
API breaking changes
Removes reduction support from the stencil-composition API
- `make_reduction` is removed
- `computation` type erasure doesn't have `ReturnType` as a first template argument, i.e. `computation<void, args...>` needs to be replaced by `computation<args...>`.
- `run` method of computation returns `void` now.
New functionality
Possibility to query intent and extent for placeholders from computation
- `computation.get_arg_intent(my_arg())` returns `enumtype::intent`
- `computation.get_arg_extent(my_arg())` returns `rt_extent` which contains extents in i,j,k directions
Performance improvements
- several unneeded `cudaDeviceSynchronize()` in boundary_conditions are removed
Bug fixes
- c_bindings: support for multiple template arguments in generic bindings macro
Internal changes
- added convenience library for integral constants with `__host__ __device__` conversion and construction with custom literal `_c`
- SID utilities
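To give an idea of what constructing an integral constant from a `_c` literal can look like, here is a minimal host-only C++14 sketch. It is not the GridTools implementation: the helper `fold_digits` is hypothetical, the result type is assumed to be `std::integral_constant<int, N>`, only plain decimal literals are handled, and the `__host__ __device__` annotations are left out so that the snippet compiles without CUDA.

```cpp
#include <type_traits>

// Hypothetical helper: folds the decimal digit characters of a literal
// into an int at compile time. Base case: no digits left.
template <int Acc>
constexpr int fold_digits() {
    return Acc;
}

// Recursive case: consume one digit and continue with the rest.
template <int Acc, char D, char... Rest>
constexpr int fold_digits() {
    return fold_digits<Acc * 10 + (D - '0'), Rest...>();
}

// Literal operator template: the compiler passes the characters of the
// literal as template arguments, so 42_c arrives here as <'4', '2'>.
// Only plain decimal literals are handled in this sketch.
template <char... Digits>
constexpr std::integral_constant<int, fold_digits<0, Digits...>()> operator""_c() {
    return {};
}

// The value is part of the type, so it can be checked at compile time.
static_assert(decltype(42_c)::value == 42, "value is encoded in the type");
static_assert(std::is_same<decltype(3_c), std::integral_constant<int, 3>>::value,
              "3_c has type integral_constant<int, 3>");
```

Because the value lives in the type, such constants can be passed as ordinary function arguments while still being usable as template arguments inside the callee.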