Tuned OpenCL BLAS
Updated Nov 8, 2024 - C++
Distributed Communication-Optimal Matrix-Matrix Multiplication Algorithm
BELLA: a Computationally-Efficient and Highly-Accurate Long-Read to Long-Read Aligner and Overlapper
Serial and parallel implementations of matrix multiplication
Scientific computing with Metal in C++: Matrix multiplication example
Offload Eigen operations to GPUs
Algorithms for matrix-matrix multiplication, dgemm, AVX-256, AVX-512
A small playground for testing the benefits of running calculations on the CPU or GPU with multiple threads.
Parallelizing Strassen’s matrix multiplication using OpenMP, MPI and CUDA.
Multi-GPU CUDA-based scheduler.
Contains implementations of cache-optimized and external memory algorithms.
2D and 3D Matrix Convolution and Matrix Multiplication with CUDA
Simple C++ library for dealing with matrices.
Sample implementations of the Strassen algorithm for matrix multiplication
Probabilistic method for the computation of the approximate product of two matrices
CVM Class Library
Different matrix multiplication implementations and benchmarks on CPUs