This repository implements and benchmarks different matrix transpose algorithms. Definitely check out the corresponding blog post.
The data folder contains the original benchmark data from the tested architectures that was used in the experimental analyses.
The lib folder contains the C functions used by all tested algorithms, together with the corresponding header file.
The src folder contains C files with the different algorithms for matrix transposition.
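To illustrate what such an algorithm does, here is a minimal sketch of a naive out-of-place transpose. It is not taken from this repository: the function name transpose_naive, the double element type, and the row-major square layout are all assumptions made for the example.

```c
#include <stddef.h>

/* Hypothetical helper, not the repository's code: naive out-of-place
 * transpose of an n x n matrix stored in row-major order. */
void transpose_naive(const double *in, double *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            /* Element (i, j) of the input becomes element (j, i) of the output. */
            out[j * n + i] = in[i * n + j];
        }
    }
}
```

The input is read row by row while the output is written with a stride of n elements. That strided access is the kind of cache-unfriendly pattern that alternative transpose algorithms typically try to avoid.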
After cloning the repository you can run
make
to compile the files in src and start the benchmarks. The results are stored in the stats folder (which is created by the Makefile). Warning: the benchmarks can take a long time to finish!
If you only want to compile the files in src, run
make compile
This should compile all C files in src and store the resulting binaries in the (newly created) bin folder without starting the benchmarks. Compiled binaries follow the naming convention <ALGORITHM>-<OPTIMIZATION LEVEL>.
The correctness of the provided implementations can be verified by running the compiled binaries in 'debug mode'. After compilation you can run
./bin/<BINARY> <MATRIX SIZE> --debug
For example,
./bin/naive-0 2 --debug
should output a randomly initialized matrix with dimension 2^2 and the corresponding transposed matrix. Additionally, the execution time and the effective bandwidth are displayed.
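The two quantities reported in debug mode can be sketched as follows. This is only an illustration under stated assumptions, not the repository's implementation: the helper names is_transpose and effective_bandwidth_gbs are hypothetical, n is taken to be the side length of a square row-major matrix, and effective bandwidth is assumed to count one read and one write per element.

```c
#include <stddef.h>

/* Hypothetical check: returns 1 if out is the transpose of in
 * (both n x n, row-major), otherwise 0. */
int is_transpose(const double *in, const double *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            if (out[j * n + i] != in[i * n + j]) {
                return 0;
            }
        }
    }
    return 1;
}

/* Hypothetical metric: effective bandwidth in GB/s, assuming every element
 * is read once and written once during the transpose. */
double effective_bandwidth_gbs(size_t n, size_t elem_size, double seconds)
{
    double bytes_moved = 2.0 * (double)n * (double)n * (double)elem_size;
    return bytes_moved / seconds / 1e9;
}
```

Under these assumptions, a 1000 x 1000 matrix of 8-byte elements transposed in 0.016 seconds moves 16 MB, which corresponds to about 1 GB/s.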