Releases: termoshtt/accel
Release v0.1.0
Accel v0.1.0
CUDA-based GPGPU framework for Rust
Compile PTX Kernel from Rust using NVPTX backend of LLVM
Since February 2017, Rust can be compiled into PTX assembly using the NVPTX backend of LLVM, as demonstrated in japaric/nvptx. However, this requires a complicated setup. Accel generates this setup automatically using the procedural macro feature.
proc-macro-attribute-based approach like futures-await
The accel-derive crate introduces a proc-macro, #[kernel], which generates two functions. One, called the "kernel", is compiled into PTX code; the other, called the "caller", launches it from CPU code using cudaLaunchKernel. During macro expansion, a support crate, rust2ptx, is created in the $HOME/.rust2ptx directory, and the generated kernel function (saved as lib.rs) is compiled there using xargo. The generated PTX assembly is inserted into the source code of the "caller" and is thus embedded into the executable binary.
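The kernel/caller split can be illustrated with a small, GPU-free sketch in plain Rust. Everything here is hypothetical and is not Accel's generated code: ADD_PTX is a placeholder string standing in for the embedded PTX, add_kernel plays the role of the per-thread "kernel" body, and add plays the role of the "caller". Where the real caller would pass the embedded PTX to the CUDA runtime and launch via cudaLaunchKernel, this stand-in just loops over thread indices on the CPU.

```rust
// Hypothetical stand-in for the PTX text that the macro would embed
// into the caller. A real build would place compiler-generated PTX here.
const ADD_PTX: &str = "// placeholder: PTX generated from the kernel body";

// "Kernel": the per-thread body. On a GPU, each thread would receive
// its own index `i`; out-of-range threads must do nothing.
fn add_kernel(i: usize, a: &[f64], b: &[f64], c: &mut [f64]) {
    if i < c.len() {
        c[i] = a[i] + b[i];
    }
}

// "Caller": the host-side entry point. This sketch emulates a launch of
// `threads` threads with a CPU loop (an assumption for illustration);
// the real caller would launch the embedded PTX on the device instead.
fn add(threads: usize, a: &[f64], b: &[f64], c: &mut [f64]) {
    let _embedded: &str = ADD_PTX; // lives in the binary as a constant
    for i in 0..threads {
        add_kernel(i, a, b, c);
    }
}

fn main() {
    let a = vec![1.0, 2.0, 3.0];
    let b = vec![10.0, 20.0, 30.0];
    let mut c = vec![0.0; 3];
    // Launch one more "thread" than there are elements to exercise
    // the bounds check inside the kernel.
    add(4, &a, &b, &mut c);
    println!("{:?}", c);
}
```

The point of the pattern is that the user writes only the kernel body; the host-side wrapper and the embedding of the compiled code are produced for them.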
Simple memory management using Unified Memory
Unified Memory (UM) is a feature introduced in CUDA 6 and extended in CUDA 8. With it, memory can be managed without considering whether it resides on the CPU or the GPU. Accel introduces the accel::UVec struct, which manages UM via RAII and can be used as a slice through Deref.
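The RAII-plus-Deref idea can be sketched without a GPU. The ManagedBuf type below is hypothetical and is not accel::UVec: it allocates from the host heap via std::alloc, whereas a UM-backed type would presumably call cudaMallocManaged in its constructor and cudaFree in Drop. Implementing DerefMut in addition to Deref is an assumption made here so that the buffer can be written through the slice API.

```rust
use std::alloc::{alloc_zeroed, dealloc, Layout};
use std::ops::{Deref, DerefMut};
use std::slice;

// Hypothetical stand-in for a unified-memory vector of f64.
// It owns a raw allocation and frees it when dropped (RAII).
struct ManagedBuf {
    ptr: *mut f64,
    len: usize,
}

impl ManagedBuf {
    fn new(len: usize) -> Self {
        assert!(len > 0, "zero-length buffers are not supported in this sketch");
        let layout = Layout::array::<f64>(len).expect("layout overflow");
        // Host allocation as a stand-in for a managed allocation.
        // All-zero bytes are a valid f64 (0.0).
        let ptr = unsafe { alloc_zeroed(layout) } as *mut f64;
        assert!(!ptr.is_null(), "allocation failed");
        ManagedBuf { ptr, len }
    }
}

impl Drop for ManagedBuf {
    // Release the allocation automatically when the owner goes out of scope.
    fn drop(&mut self) {
        let layout = Layout::array::<f64>(self.len).expect("layout overflow");
        unsafe { dealloc(self.ptr as *mut u8, layout) };
    }
}

impl Deref for ManagedBuf {
    type Target = [f64];
    // Expose the buffer as an ordinary read-only slice.
    fn deref(&self) -> &[f64] {
        unsafe { slice::from_raw_parts(self.ptr, self.len) }
    }
}

impl DerefMut for ManagedBuf {
    // Expose the buffer as a mutable slice.
    fn deref_mut(&mut self) -> &mut [f64] {
        unsafe { slice::from_raw_parts_mut(self.ptr, self.len) }
    }
}

fn main() {
    let mut v = ManagedBuf::new(4);
    // Slice methods such as iter_mut, iter, and len come from Deref/DerefMut.
    for (i, x) in v.iter_mut().enumerate() {
        *x = i as f64;
    }
    let sum: f64 = v.iter().sum();
    println!("len = {}, sum = {}", v.len(), sum);
} // `v` is dropped here and its memory is released.
```

Because the type dereferences to a slice, existing slice-based code works on it unchanged, and no explicit free call is needed.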