Version 0.7 RC2
Pre-release
- Introduced stop condition API for offline tuning
- Added support for persistent kernel arguments
- Added global kernel cache; its capacity can be controlled through the API
- Significant improvements to online tuning capabilities and performance
- Improvements to asynchronous functionality in tuning manipulator
- Online tuning and kernel running methods now return information about computation status and duration
- Fixed bug in device synchronization method in tuning manipulator
- Fixed memory leak in CUDA backend
- Fixed incorrect handling of invalid kernel results in some situations
- Added new examples
- Improvements to sort and reduction examples