tretre91/TER

gemm

A C++ implementation of the BLAS general matrix-matrix multiplication routines (xgemm).

Build

The library is header-only: to use it, copy the files located in the include folder into your project.

To build the tests and benchmarks, you need OpenBLAS installed on your system.

The tests and benchmarks can then be built and run with CMake:

$ cmake -S . -B build -DGEMM_BUILD_TEST=ON -DGEMM_BUILD_BENCHMARK=ON # -DGEMM_USE_CTEST=ON
$ cmake --build build --target test --target benchmark
$ ./build/test/test
$ ./build/benchmark/benchmark --reporter=benchmark

or with xmake:

$ xmake
$ xmake build
$ xmake run test
$ xmake run benchmark --reporter=benchmark

Benchmark

Some benchmark results are available here. They were run on an AMD Ryzen 5 3500U CPU locked at 2100 MHz. The program was compiled with GCC 13.1.1 and the -O3 -march=native options.

In the benchmarks, the single-precision version gemm<float> is run against OpenBLAS's sgemm and, for small enough matrices, against a naive algorithm (with the loop swapping optimization).
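For readers unfamiliar with the loop swapping optimization, the sketch below shows one common form of it: a triple-loop product whose two inner loops are exchanged so that the innermost loop walks rows of B and C contiguously. The function name `naive_gemm`, its signature, and the row-major layout are assumptions made for illustration. They are not taken from this repository, and the baseline used in the benchmark may differ.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical reference routine (not part of this library's API).
// Computes C = alpha * A * B + beta * C for row-major matrices,
// where A is m x k, B is k x n and C is m x n.
void naive_gemm(std::size_t m, std::size_t n, std::size_t k,
                float alpha,
                const std::vector<float>& a,
                const std::vector<float>& b,
                float beta,
                std::vector<float>& c)
{
    // Scale C once up front so the main loops only accumulate.
    // When beta is zero, C is overwritten rather than multiplied,
    // so stale values (including NaN) do not leak into the result.
    for (std::size_t i = 0; i < m * n; ++i) {
        c[i] = (beta == 0.0f) ? 0.0f : beta * c[i];
    }

    // Loop order i, p, j instead of the textbook i, j, p:
    // the innermost loop now reads B and writes C with stride 1.
    for (std::size_t i = 0; i < m; ++i) {
        for (std::size_t p = 0; p < k; ++p) {
            const float aip = alpha * a[i * k + p];
            for (std::size_t j = 0; j < n; ++j) {
                c[i * n + j] += aip * b[p * n + j];
            }
        }
    }
}
```

With the textbook i, j, p order, the innermost loop reads B column-wise, which touches a new cache line on almost every iteration for row-major storage. Swapping the two inner loops leaves the result unchanged up to floating-point rounding while making every inner access sequential.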

Plot

The number of cycles per computed matrix entry can be plotted by running the following commands:

$ ./build/benchmark/benchmark --reporter=plot::out=metrics.json
$ python3 benchmark/plot.py metrics.json -o plot.svg

This produces the following plots for the benchmark run mentioned above:

GFlops:

Cycles per computed value:
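As a rough guide to reading the two plots, the sketch below shows how such metrics are usually derived from a measured run time. The helper names and formulas are assumptions, not code from benchmark/plot.py or the benchmark sources: it counts 2·m·n·k floating-point operations per product and assumes a fixed clock frequency, such as the 2100 MHz lock mentioned above.

```cpp
#include <cstddef>

// Hypothetical helpers, not taken from this repository.

// Throughput in GFLOP/s for an (m x k) * (k x n) product, counting one
// multiplication and one addition per inner-product term (2*m*n*k flops).
double gflops(std::size_t m, std::size_t n, std::size_t k, double seconds)
{
    const double flops = 2.0 * static_cast<double>(m)
                             * static_cast<double>(n)
                             * static_cast<double>(k);
    return flops / seconds / 1e9;
}

// Average number of CPU cycles spent per entry of the m x n result,
// assuming the clock runs at a fixed frequency given in Hz.
double cycles_per_entry(std::size_t m, std::size_t n,
                        double seconds, double frequency_hz)
{
    const double entries = static_cast<double>(m) * static_cast<double>(n);
    return seconds * frequency_hz / entries;
}
```

For example, under these assumptions a 1000 x 1000 x 1000 product that takes one second corresponds to 2 GFLOP/s, and at 2.1 GHz it costs 2100 cycles per computed entry.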

TODO

  • Integrate the kernels into the main matrix multiplication function
    • Have a working implementation
    • Fix performance issues
  • Microkernels
    • Kernel composition function
    • 1x(1, 2, 4, 8)x(1, 2, 4, 8) kernels
    • 2x(1, 2, 4, 8)x(1, 2, 4, 8) kernels
    • 4x(1, 2, 4, 8)x(1, 2, 4, 8) kernels
    • 8x(1, 2, 4, 8)x(1, 2, 4, 8) kernels
  • Tests
    • Kernels
    • Big matrices
    • Fix precision issues
    • Double
  • Benchmarks
    • Small matrices
    • Big matrices
    • Double
    • Plots
