Hardware info:
Nvidia GeForce GTX 1060 3GB
compute capability 6.1
Pascal series
SM count = 9
Software info:
Windows 10
Visual Studio 2022
CUDA 12.4
nvcc V12.4.99
requires additional arguments: --expt-extended-lambda -Xcompiler "-openmp"
(included in VS solution)
Building:
This is a Visual Studio project, so you can (hopefully) use the solution to build it
main.cu is the only non-header file, so the only build step is calling nvcc main.cu with the additional arguments listed under Software info
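For a build outside Visual Studio, the single step would look roughly like the command below. This is a sketch that assumes nvcc and the MSVC compiler (cl.exe) are both on PATH, for example in a Visual Studio developer command prompt, and that it is run from the directory containing main.cu. The flags are the ones listed under Software info.

```shell
nvcc main.cu --expt-extended-lambda -Xcompiler "-openmp"
```

With no -o option given, nvcc on Windows names the resulting executable a.exe.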
Sample output:
GPU finished in 0.194812 seconds.
GPU data copying took 0.003253 seconds.
GPU total: 0.198065 seconds.
CPU finished in 5.23115 seconds.
OMP (8 threads) finished in 1.20689 seconds.