Introduction
This document contains the release notes for Clad, the automatic differentiation plugin for Clang, release 1.8. Clad is built on top of the Clang and LLVM compiler infrastructure. Here we describe the status of Clad in some detail, including major improvements from the previous release and new feature work.
Note that if you are reading this file from a git checkout, this document applies to the next release, not the current one.
What's New in Clad 1.8?
Some of the major new features and improvements to Clad are listed here. Generic improvements to Clad as a whole or to its underlying infrastructure are described first.
External Dependencies
- Clad works with clang-8 to clang-18.
Forward Mode & Reverse Mode
- Support std::array and std::vector.
Forward Mode
- Support std::tie, std::atan2 and std::acos.
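To make concrete what forward-mode support for std::atan2 and std::acos amounts to, here is a minimal hand-written sketch in plain C++. It is not Clad's generated code or its built-in derivative definitions; the `Dual` struct and the `*_pushforward_sketch` names are hypothetical and exist only for this illustration. Each function returns the primal value together with the directional derivative obtained from the standard calculus identities.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper type: a value paired with its directional derivative.
struct Dual {
  double value;
  double deriv;
};

// Hand-written forward-mode rule for std::atan2(y, x), using
//   d/dy atan2(y, x) =  x / (x^2 + y^2)
//   d/dx atan2(y, x) = -y / (x^2 + y^2)
// d_y and d_x are the input tangents (seeds).
inline Dual atan2_pushforward_sketch(double y, double x, double d_y, double d_x) {
  const double r2 = x * x + y * y;
  assert(r2 > 0.0 && "atan2 is not differentiable at the origin");
  return {std::atan2(y, x), (x * d_y - y * d_x) / r2};
}

// Hand-written forward-mode rule for std::acos(x), using
//   d/dx acos(x) = -1 / sqrt(1 - x^2), valid for -1 < x < 1.
inline Dual acos_pushforward_sketch(double x, double d_x) {
  assert(x > -1.0 && x < 1.0 && "derivative of acos is unbounded at +-1");
  return {std::acos(x), -d_x / std::sqrt(1.0 - x * x)};
}
```

Seeding `d_y = 1, d_x = 0` yields the partial derivative with respect to `y`; swapping the seeds yields the partial with respect to `x`.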
Reverse Mode
- Support std::initializer_list and sizeof.
- Support pointer-valued functions.
- Support range-based for loops.
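As an illustration of what reverse-mode differentiation of a range-based for loop involves, the following plain C++ sketch writes the gradient by hand. It is an assumption-laden illustration, not the code Clad emits: the function names `fold` and `fold_grad_sketch` and the explicit `tape` vector are hypothetical. The forward sweep records the accumulator before each iteration, and the reverse sweep walks the iterations backwards to propagate the adjoint.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Primal function: a range-based for loop with a loop-carried dependency.
inline double fold(const std::vector<double>& x) {
  double acc = 0.0;
  for (double v : x)
    acc = acc * v + v;
  return acc;
}

// Hand-written reverse pass for fold. Adjoints are accumulated into d_x,
// which must have the same size as x.
inline void fold_grad_sketch(const std::vector<double>& x,
                             std::vector<double>& d_x) {
  assert(d_x.size() == x.size());

  // Forward sweep: store the accumulator value seen at the start of
  // each iteration, since the reverse sweep needs it.
  std::vector<double> tape;
  tape.reserve(x.size());
  double acc = 0.0;
  for (double v : x) {
    tape.push_back(acc);
    acc = acc * v + v;
  }

  // Reverse sweep: visit iterations in reverse order.
  // For acc_new = acc_prev * v + v:
  //   d(acc_new)/dv        = acc_prev + 1
  //   d(acc_new)/d(acc_prev) = v
  double d_acc = 1.0;  // seed: d(result)/d(result)
  for (std::size_t i = x.size(); i-- > 0;) {
    d_x[i] += d_acc * (tape[i] + 1.0);
    d_acc *= x[i];
  }
}
```

For `x = {2, 3, 4}` the primal unrolls to `x0*x1*x2 + x1*x2 + x2`, so the partial derivatives can be checked by hand against the accumulated `d_x`.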
CUDA
- Add support for CUDA device pullbacks.
- Enable computation of derivatives of CUDA global kernels.
Misc
- Enable immediate evaluation mode (consteval and constexpr) with a new clad mode clad::immediate_mode.
- Make clad::CladFunction and clad::array_ref constexpr.
- Support operators defined outside of classes.
- Add Varied analysis to the reverse mode.
- Add support for Kokkos::View, Kokkos::deep_copy, Kokkos::resize and parallel_for in reverse mode.
- Add support for Kokkos::parallel_for, Kokkos::fence, Kokkos::View and Kokkos::parallel_reduce in forward mode.
Fixed Bugs
472 480 527 682 684 830 855 1000 1019 1033 1049 1057 1070 1071 1081 1087 1151 1162
Special Kudos
This release wouldn't have happened without the efforts of our contributors, listed in the form of Firstname Lastname (#contributions):
petro.zarytskyi (30)
Vassil Vassilev (22)
Atell Krasnopolski (22)
Christina Koutsou (17)
Mihail Mihov (12)
ovdiiuv (7)
kchristin (6)
Vipul Cariappa (3)
Alexander Penev (3)
mcbarton (1)
infinite-void-16 (1)
fsfod (1)
Vaibhav Thakkar (1)
Max Andriychuk (1)
Infinite Void (1)
Austeja (1)
What's Changed
- Product of references in different scope fix by @ovdiiuv in #1030
- Redesign of the rangebased for loops body by @ovdiiuv in #1034
- Desugar the type before analyzing it in TBR by @PetroZarytskyi in #1046
- Add support of custom _forw functions by @infinite-void-16 in #1037
- Update FileCheck test lines for number literals and types generated on windows by @fsfod in #1042
- Support pointer-valued functions in the reverse mode by @PetroZarytskyi in #1047
- Add support for Kokkos::parallel_for in the fwd mode by @gojakuch in #1022
- Add support for std::initializer_list in the reverse mode by @PetroZarytskyi in #1018
- Add support for Kokkos::fence in the fwd mode by @gojakuch in #1048
- Fix collectDataFromPredecessors to collect data from the passed branch by @ovdiiuv in #1051
- Add a test for a call expr with a bool literal by @PetroZarytskyi in #1055
- Introduce placeholders in the reverse mode to correctly handle multiplication by @PetroZarytskyi in #1039
- Fix braceless if differentiation in reverse mode by @ovdiiuv in #1058
- Provide pushforward methods for Kokkos::View indexing by @gojakuch in #1061
- Add support for Kokkos::parallel_reduce in the fwd mode by @gojakuch in #1056
- Make the signatures of KokkosBuiltins.h more general by @gojakuch in #1063
- Add primitive support of reverse-mode constructor custom derivatives by @infinite-void-16 in #1045
- Fix CallDeclOnly unit test by @kchristin22 in #1075
- Try to improve literal types by @vaithak in #998
- Fix custom reverse_forws for operators by @gojakuch in #1076
- Fix zero types in the new STL custom derivatives test by @gojakuch in #1077
- Enable computation of CUDA global kernels derivative in reverse mode by @kchristin22 in #1059
- [cuda][ci] Run the cuda tests on a self-hosted runner with a gpu. by @vgvassilev in #1069
- Add support for std::array in the rvs mode by @gojakuch in #1080
- Cover the specifics of debugging the architecture check runner in the docs by @gojakuch in #1085
- [cuda] Enable filecheck on tests by @vgvassilev in #1083
- Store and restore output array elements for jacobians by @ovdiiuv in #1093
- Add basic support for std::tie and tuples in the fwd mode by @gojakuch in #1094
- Enhance the support of std::vector and std::array in the fwd mode by @gojakuch in #1099
- Add support for 'std::atan2' and 'std::acos' by @ZeptoStarling in #1097
- Enhance the support of std::vector and std::array in the rvs mode by @gojakuch in #1101
- Fix appendage of nullptrs to args of a CUDA kernel by @kchristin22 in #1102
- Add support of CUDA builtins by @kchristin22 in #1092
- Fix some cases of std::vector::push_back in the rvs mode by @gojakuch in #1103
- Move checks in CladFunction to compile-time by @MihailMihov in #1090
- Coverage CUDA self-hosted by @alexander-penev in #1107
- Handle write-race conditions in CUDA kernels: Add atomic operation by @kchristin22 in #1104
- Remove excessive parameters from Derive functions by @PetroZarytskyi in #1110
- Fix the generation of invalid code in some common cases by @gojakuch in #1088
- Add support for basic Kokkos operations in the rvs mode by @gojakuch in #1068
- Add support for Kokkos::deep_copy in the rvs mode by @gojakuch in #1115
- Fix synth literal function for enums by @kchristin22 in #1113
- Add support of CUDA device pullbacks by @kchristin22 in #1111
- Store/restore reference args only if they are lvalue and non-const by @PetroZarytskyi in #1117
- Add Varied analysis to the reverse mode by @ovdiiuv in #1084
- Support operators defined outside of classes by @PetroZarytskyi in #1119
- Kokkos reverse mode improvements by @gojakuch in #1116
- Initialize derivative pointers that are allocated through malloc or realloc by @kchristin22 in #1124
- Add support for parameter specification in Varied Analysis by @ovdiiuv in #1122
- Add support of cuda kernels as pullback functions by @kchristin22 in #1114
- Fix incorrect derivative when loops contain continue by @kchristin22 in #833
- Add support for differentiation of immediate functions by @MihailMihov in #1109
- Add cudaMemset call after cudaMalloc for derivative pointers by @kchristin22 in #1129
- Avoid creation of local derivative of const parameter by @kchristin22 in #1131
- Revert skipping creation of local adjoints for const params and declare those as non-const by @kchristin22 in #1134
- Don't create pullbacks for function with not varied parameters by @ovdiiuv in #1127
- Fix _r local vars being passed to non-ref cuda kernel pullbacks by @kchristin22 in #1133
- Remove excessive FD and request parameters from DeriveVectorMode by @PetroZarytskyi in #1136
- Always use valid location when generating operators by @PetroZarytskyi in #1137
- Add CUDA 3D tensor contraction demo by @kchristin22 in #1141
- Improve array_expression/array/array_ref operators by @PetroZarytskyi in #1138
- Support conversion between different clad::array instantiations by @PetroZarytskyi in #1143
- Use clad::zero_vector instead of 0 in the vectorized forward mode by @PetroZarytskyi in #1140
- Reimplement jacobians using the vectorized forward mode by @PetroZarytskyi in #1121
- Add doc page for usage of Clad with CUDA by @kchristin22 in #1144
- Enable setup tmate when on debug mode, regardless of whether the runner succeeded by @kchristin22 in #1149
- Fix struct initialization and return stmts by @kchristin22 in #1142
- Clone printf and fprintf calls by @kchristin22 in #1147
- Add CUDA demos by @kchristin22 in #1139
- Clone base decl when having an anonymous struct or union by @kchristin22 in #1152
- [cmake] Enable building clad within the LLVM monorepo. by @vgvassilev in #1157
- [cmake] Fix target typo. by @vgvassilev in #1158
- [cmake] Improve the cmake infrastructure. by @vgvassilev in #1159
- No need to handle recursive calls separately in the reverse mode by @PetroZarytskyi in #1160
- Use a single point to process non-differentiable functions in the reverse mode by @PetroZarytskyi in #1161
- Add fabs_pushforward to built-in derivatives by @gojakuch in #1165
- [ci] Bump clang-tidy trying to fix it by @vgvassilev in #1166
- Don't consider arrays as a special case in DifferentiateVarDecl by @PetroZarytskyi in #1164
- Initialize adjoints of aggregate types with init lists by @PetroZarytskyi in #1163
- Fix cmake include directories by @Vipul-Cariappa in #1167
- fix cmake dependencies when building together with clang by @Vipul-Cariappa in #1168
- Fix warning by @vgvassilev in #1169
- Constify interfaces. NFC by @vgvassilev in #1170
- Revert "fix cmake dependencies when building together with clang" by @Vipul-Cariappa in #1173
- Synthesize the nested name specifiers to include the namespace qualifiers. by @vgvassilev in #1172
- Simplify handling of diff request options. NFC by @vgvassilev in #1174
- Respect shadow declarations when writing propagators. by @vgvassilev in #1171
- Add llvm 18 osx ci by @vgvassilev in #1025
New Contributors
- @infinite-void-16 made their first contribution in #1037
- @fsfod made their first contribution in #1042
- @ZeptoStarling made their first contribution in #1097
- @Vipul-Cariappa made their first contribution in #1167
Full Changelog: v1.7...v1.8