Releases: HiPerCoRe/KTT

Version 0.6 RC1

19 Feb 14:47
Pre-release
  • Added support for multiple compute queues and asynchronous operations
  • Added support for online autotuning (kernel tuning combined with regular kernel runs)
  • Added support for kernel arguments with user-defined data types
  • Users now have greater control over kernel argument handling; tuner run modes were deprecated as a result
  • Validated kernel arguments can now have a user-defined comparator
  • Added MCMC searcher
  • Added local memory argument modifiers, which work similarly to kernel thread size modifiers
  • Added new buffer handling methods to the tuning manipulator API
  • Added support for floating-point kernel parameters
  • Added a method for retrieving the kernel source code for a specified kernel configuration
  • Implemented caching of compiled kernels when using a tuning manipulator
  • Fixed several bugs in kernel composition methods
  • Fixed several rare bugs that could occur while using a tuning manipulator
  • Added tutorials and several new examples
  • Fixed paths to kernel files in examples on Linux
  • Significantly improved documentation and added FAQ
  • Added macro definitions for KTT version
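
To illustrate the user-defined comparator item above, here is a minimal sketch of a tolerance-based comparator for a user-defined argument type. The `Particle` struct, the `ElementComparator` alias, and `makeParticleComparator` are hypothetical names chosen for this example, and the type-erased `const void*` signature is an assumption rather than a confirmed description of the KTT API.

```cpp
#include <cassert>
#include <cmath>
#include <functional>

// Hypothetical user-defined data type used as a kernel argument element.
struct Particle
{
    float position;
    float velocity;
};

// Type-erased comparator receiving one computed element and one reference element.
// This signature is an assumption made for illustration only.
using ElementComparator = std::function<bool(const void*, const void*)>;

// Builds a comparator that accepts two particles as equal when both fields
// differ by at most `tolerance`.
ElementComparator makeParticleComparator(const float tolerance)
{
    assert(tolerance >= 0.0f);

    return [tolerance](const void* resultPointer, const void* referencePointer)
    {
        const auto* result = static_cast<const Particle*>(resultPointer);
        const auto* reference = static_cast<const Particle*>(referencePointer);

        return std::fabs(result->position - reference->position) <= tolerance
            && std::fabs(result->velocity - reference->velocity) <= tolerance;
    };
}
```

A comparator of this kind lets validation tolerate small floating-point differences between the tuned kernel output and the reference output instead of requiring bitwise equality.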

Version 0.5 Beta

27 Oct 16:53
d52c73c
Pre-release
  • Added support for kernel compositions
  • Added two tuner modes: tuning mode and low-overhead computation mode
  • Added support for storing buffers in host memory, including support for zero-copy buffers when computation mode is used
  • Kernel arguments can now be retrieved through the API by using a new method for running kernels
  • Added an option to automatically ensure that the global size is a multiple of the local size
  • The best kernel configuration can now be retrieved through the API
  • Added an option to switch between CUDA and OpenCL global size notation
  • Improvements to tuning manipulator API
  • Usability improvements to dimension vector
  • Tweaks to CUDA backend
  • Minor improvements to result printer
  • Improved examples and documentation

Version 0.4 Beta

19 Jun 13:14
Pre-release
  • Added support for the CUDA API
  • Significantly improved tuning manipulator API
  • Simplified baseline tuning manipulator and reference class usage
  • Improved overall tuner performance
  • Added support for uploading arguments into local (shared) memory
  • Configurations with a local size larger than the maximum supported by the current device are now automatically excluded from computation
  • Fixed a memory leak in the OpenCL backend
  • Fixed several bugs in tuning manipulator API
  • Fixed a crash in the annealing searcher
  • Added an option to print results from failed kernel runs
  • Improved tuner info messages
  • Improved CSV printing method
  • KTT is now compiled as a dynamic (shared) library
  • Added build customization options to the premake script
  • Additions and improvements to examples
  • Improved documentation
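
The automatic exclusion of oversized configurations mentioned above can be pictured as a simple filter over candidate local sizes. The sketch below assumes the limit is the device's maximum number of threads per work-group; `LocalSize`, `totalThreads`, and `excludeOversized` are hypothetical names, and the actual KTT logic may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical three-dimensional local (work-group) size.
struct LocalSize
{
    std::size_t x;
    std::size_t y;
    std::size_t z;
};

// Total number of threads in one work-group.
std::size_t totalThreads(const LocalSize& size)
{
    return size.x * size.y * size.z;
}

// Returns only those candidates whose work-group fits within deviceMaximum threads.
std::vector<LocalSize> excludeOversized(const std::vector<LocalSize>& candidates,
    const std::size_t deviceMaximum)
{
    assert(deviceMaximum > 0);
    std::vector<LocalSize> valid;

    for (const LocalSize& candidate : candidates)
    {
        if (totalThreads(candidate) <= deviceMaximum)
        {
            valid.push_back(candidate);
        }
    }

    return valid;
}
```

Filtering before launch avoids spending tuning time on configurations that the device would reject anyway.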

Version 0.3.1 Beta

15 May 15:28
Pre-release
  • Added support for new argument data types (8, 16, 32 and 64 bits long)
  • Added support for specifying the time unit used in result printing
  • Added new utility methods to the tuning manipulator API
  • Improvements to tuning manipulator
  • Fixed bugs in the tuning manipulator API
  • Read-only arguments are now cached in the OpenCL backend
  • Improved documentation

Version 0.3 Beta

08 May 16:32
Pre-release
  • Added a tuning manipulator interface
  • Added support for validating multiple arguments with a reference class
  • Added support for the short argument data type
  • Added a method for printing the contents of kernel arguments to a file
  • Added a method for specifying where info messages are printed
  • Additions and improvements to documentation
  • Improvements to samples
  • Fixed a bug in the CSV printing method
  • Other minor bug fixes and improvements

Version 0.2 Beta

10 Apr 18:07
Pre-release
  • Added methods for result printing
  • Added methods for kernel output validation
  • Additions and improvements to samples
  • Added API documentation
  • Implemented an annealing searcher
  • Fixed the build on Linux

Version 0.1 Beta

02 Apr 14:57
Pre-release
  • Basic autotuning functionality is now available