Gate Aggregate Performance Tests; Issue #158 (#165)
* Added gate benchmark project

* Changed x & y labels

* added cleanup bash script

* Updated usage prompt

* added cmake policy 3.14

* Fixed CMP0127 warning msg

* adjustment to font size

* removed typos in comments

* cmake_cxx_flags passed via commandline argument

* removed number of program runs & added const

* removed number of program runs & added const

* added comments on implementation

* Markdown formatting

* code format updated

* bullet point on aggregate gate performance

* build directory handled with pushd & popd

* merge of changelog

* saving figures instead of showing them

* entry now fits style of previous entries

* added clang-tidy to adhere to rest of library

* compiler specified by user & removed flags

* incorporated changes; phrasing now more general

* removed readout of compiler flags & set to "-O3"

* removed comments on setting flags & cpp standard

* fixed clang-tidy nested namespace warning

* fixed clang-tidy nodiscard warning

* fixed formatting

* Update examples/README.md

Co-authored-by: Lee James O'Riordan <mlxd@users.noreply.github.com>

* Update examples/README.md

Co-authored-by: Lee James O'Riordan <mlxd@users.noreply.github.com>

* Update examples/README.md

Co-authored-by: Lee James O'Riordan <mlxd@users.noreply.github.com>

* Update examples/README.md

Co-authored-by: Lee James O'Riordan <mlxd@users.noreply.github.com>

* Update examples/README.md

Co-authored-by: Lee James O'Riordan <mlxd@users.noreply.github.com>

* Update examples/gate_benchmark.cpp

Co-authored-by: Lee James O'Riordan <mlxd@users.noreply.github.com>

* Update examples/README.md

Co-authored-by: Lee James O'Riordan <mlxd@users.noreply.github.com>

* Update examples/README.md

Co-authored-by: Lee James O'Riordan <mlxd@users.noreply.github.com>

* added gate performance test under 0.20.00-dev

* added Python plotting requirements

* Update examples/README.md

Co-authored-by: Lee James O'Riordan <mlxd@users.noreply.github.com>

* Trigger CI

Co-authored-by: Isidor Schoch <isschoch@ethz.ch>
Co-authored-by: Lee James O'Riordan <mlxd@users.noreply.github.com>
Co-authored-by: antalszava <antalszava@gmail.com>
4 people authored Dec 6, 2021
1 parent d1a4981 commit 82c197b
Showing 14 changed files with 278 additions and 16 deletions.
7 changes: 5 additions & 2 deletions .github/CHANGELOG.md
@@ -2,6 +2,9 @@

### New features since last release

+* Added examples folder containing aggregate gate performance test.
+[(#165)](https://github.com/PennyLaneAI/pennylane-lightning/pull/165)
+
### Breaking changes

### Improvements
@@ -22,7 +25,7 @@ Chae-Yeun Park

This release contains contributions from (in alphabetical order):

-Ali Asadi
+Ali Asadi, Isidor Schoch

---

@@ -353,4 +356,4 @@ Initial release.

This release contains contributions from (in alphabetical order):

-Tom Bromley, Josh Izaac, Nathan Killoran, Antal Száva
+Tom Bromley, Josh Izaac, Nathan Killoran, Antal Száva
4 changes: 4 additions & 0 deletions examples/.gitignore
@@ -0,0 +1,4 @@
*.csv
compiler_info.txt
*/build/*
*.png
46 changes: 46 additions & 0 deletions examples/CMakeLists.txt
@@ -0,0 +1,46 @@
#############################
## I. Set project details
#############################
cmake_minimum_required(VERSION 3.14)
set(CMAKE_POLICY_DEFAULT_CMP0127 NEW)

project("gate_benchmark"
    VERSION 0.1.0
    DESCRIPTION "Benchmark of parametric & non-parametric gates."
    LANGUAGES CXX
)

option(ENABLE_WARNINGS "Enable warnings" ON)
option(ENABLE_CLANG_TIDY "Enable clang-tidy build checks" OFF)

if(ENABLE_CLANG_TIDY)
    if (NOT DEFINED CLANG_TIDY_BINARY)
        set(CLANG_TIDY_BINARY clang-tidy)
    endif()
    set(CMAKE_CXX_CLANG_TIDY ${CLANG_TIDY_BINARY};
                             -extra-arg=-std=c++17;
    )
endif()

#############################
## II. Fetch project
#############################

include(FetchContent)

FetchContent_Declare(
    Pennylane-lightning
    GIT_REPOSITORY https://github.com/PennyLaneAI/pennylane-lightning
    GIT_TAG master
)
FetchContent_MakeAvailable(Pennylane-lightning)

#############################
## III. Create project target
#############################

set(CMAKE_CXX_STANDARD 17)
set(CMAKE_CXX_FLAGS "-O3")

add_executable(gate_benchmark gate_benchmark.cpp)
target_link_libraries(gate_benchmark pennylane_lightning)
20 changes: 20 additions & 0 deletions examples/README.md
@@ -0,0 +1,20 @@
# Gate aggregate performance tests
Run `bash run_gate_benchmark.sh $CXX_COMPILER` in a terminal, where `$CXX_COMPILER` is the compiler you wish to use (e.g. `bash run_gate_benchmark.sh clang++`).
The script sets the `CXX` environment variable to `$CXX_COMPILER`, builds the `gate_benchmark` project automatically, runs the benchmark, and plots the results.

## Implementation details:
* The compiler is selected by the bash script `run_gate_benchmark.sh`; the optimization level is fixed to `-O3` in `CMakeLists.txt`
* The PennyLane-Lightning benchmark is provided in the `gate_benchmark.cpp` file
* Plotting is accomplished with the Python script `gate_benchmark_plotter.py`
* Plotting requires the packages listed in `requirements.txt`
* The number of gate repetitions is set to 3 and can be changed in the bash script `run_gate_benchmark.sh` by modifying the `num_gate_reps` variable

### `gate_benchmark.cpp`:
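* One random angle is generated per gate repetition and qubit; that angle is reused for all of the parameterised gates (RX, RY, RZ, CRX, CRY, CRZ) applied to that qubit in that repetition
* For each qubit, the gates are applied in the order X, Y, Z, H, CNOT, CZ, RX, RY, RZ, CRX, CRY, CRZ; two-qubit gates act on the qubit and its neighbour `(index + 1) % num_qubits`
* This sweep over all qubits is repeated `num_gate_reps` times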
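
As a rough illustration of the loop structure described above, the following Python sketch enumerates the gate applications in the order the benchmark applies them. It is not part of the commit: the function name `gate_sequence`, the list names, and the tuple layout are hypothetical and chosen only for this illustration. The actual benchmark is the C++ code in `gate_benchmark.cpp`.

```python
# Hypothetical illustration: enumerates the gate applications only. It does
# not simulate a state vector and does not call PennyLane-Lightning.
NON_PARAM_1Q = ["X", "Y", "Z", "H"]
NON_PARAM_2Q = ["CNOT", "CZ"]
PARAM_1Q = ["RX", "RY", "RZ"]
PARAM_2Q = ["CRX", "CRY", "CRZ"]


def gate_sequence(num_gate_reps, num_qubits):
    """Return (gate name, target qubits) tuples in application order."""
    ops = []
    for _ in range(num_gate_reps):
        for index in range(num_qubits):
            single = (index,)
            # Two-qubit gates act on the qubit and its cyclic neighbour.
            pair = (index, (index + 1) % num_qubits)
            for group, wires in (
                (NON_PARAM_1Q, single),
                (NON_PARAM_2Q, pair),
                (PARAM_1Q, single),
                (PARAM_2Q, pair),
            ):
                ops.extend((name, wires) for name in group)
    return ops


if __name__ == "__main__":
    ops = gate_sequence(num_gate_reps=3, num_qubits=6)
    print(len(ops))  # 12 gates * 6 qubits * 3 repetitions = 216
    print(ops[:6])
```

Each benchmark run therefore applies `12 * num_qubits * num_gate_reps` gates, which is worth keeping in mind when comparing timings across qubit counts.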

### `gate_benchmark_plotter.py`:
* The first plot (`avg_time.png`) shows the absolute runtime in milliseconds on linear axes
* The second plot (`scaling.png`) uses a log-log scale, which better depicts the exponential scaling of the relative runtime with respect to the number of simulated qubits
* Both plots show the time needed to execute the gate sequence, averaged over the repetitions
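
The averaging and normalisation behind the two plots can be sketched in plain Python. This is an illustration under stated assumptions, not the plotter itself: `gate_benchmark_plotter.py` uses pandas and NumPy, whereas the helper names `average_by_qubits` and `relative_times` and the sample rows below are made up for this example.

```python
from collections import defaultdict


def average_by_qubits(rows):
    """Average the measured time (ms) over all rows with the same qubit count."""
    grouped = defaultdict(list)
    for num_qubits, time_ms in rows:
        grouped[num_qubits].append(time_ms)
    return {n: sum(ts) / len(ts) for n, ts in sorted(grouped.items())}


def relative_times(averages):
    """Divide every average by the average of the smallest qubit count."""
    baseline = averages[min(averages)]
    return {n: t / baseline for n, t in averages.items()}


if __name__ == "__main__":
    # Made-up (num_qubits, time in ms) rows, for illustration only.
    rows = [(6, 1.0), (6, 3.0), (8, 8.0), (10, 32.0)]
    averages = average_by_qubits(rows)
    print(averages)
    print(relative_times(averages))
```

In this sketch the baseline is the smallest qubit count present in the data, so the relative time of that entry is always 1.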
96 changes: 96 additions & 0 deletions examples/gate_benchmark.cpp
@@ -0,0 +1,96 @@
#include <algorithm>
#include <chrono>
#include <cmath>
#include <cstdlib>
#include <iostream>
#include <random>
#include <stdexcept>
#include <string>
#include <vector>

#include "StateVectorManaged.hpp"

/**
 * @brief Outputs wall-time for gate-benchmark.
 * @param argc Number of arguments + 1 passed by user.
 * @param argv Binary name followed by number of times gate is repeated and
 * number of qubits.
 * @return Returns 0 if completed successfully.
 */
int main(int argc, char *argv[]) {
    using TestType = double;

    // Handle input: expect exactly two user arguments
    if (argc != 3) {
        std::cerr << "Wrong number of inputs. User provided " << argc - 1
                  << " inputs. "
                  << "Usage: " << argv[0] << " $num_gate_reps $num_qubits"
                  << std::endl;
        return -1;
    }
    const size_t num_gate_reps = std::stoi(argv[1]);
    const size_t num_qubits = std::stoi(argv[2]);

    // Generate random values for parametric gates
    // (one value in [0, 1) per gate repetition and qubit)
    std::random_device rd;
    std::default_random_engine eng(rd());
    std::uniform_real_distribution<TestType> distr(0.0, 1.0);
    std::vector<std::vector<TestType>> random_parameter_vector(num_gate_reps);
    std::for_each(
        random_parameter_vector.begin(), random_parameter_vector.end(),
        [num_qubits, &eng, &distr](std::vector<TestType> &vec) {
            vec.resize(num_qubits);
            std::for_each(vec.begin(), vec.end(),
                          [&eng, &distr](TestType &val) { val = distr(eng); });
        });

    // Run each gate specified number of times and measure walltime
    Pennylane::StateVectorManaged<TestType> svdat{num_qubits};
    std::chrono::time_point<std::chrono::high_resolution_clock> t_start, t_end;
    t_start = std::chrono::high_resolution_clock::now();
    for (size_t gate_rep = 0; gate_rep < num_gate_reps; gate_rep++) {
        for (size_t index = 0; index < num_qubits; index++) {
            // Apply single qubit non-parametric operations
            const auto int_idx = svdat.getInternalIndices({index});
            const auto ext_idx = svdat.getExternalIndices({index});
            svdat.applyPauliX(int_idx, ext_idx, false);
            svdat.applyPauliY(int_idx, ext_idx, false);
            svdat.applyPauliZ(int_idx, ext_idx, false);
            svdat.applyHadamard(int_idx, ext_idx, false);

            // Apply two qubit non-parametric operations
            const auto two_qubit_int_idx =
                svdat.getInternalIndices({index, (index + 1) % num_qubits});
            const auto two_qubit_ext_idx =
                svdat.getExternalIndices({index, (index + 1) % num_qubits});
            svdat.applyCNOT(two_qubit_int_idx, two_qubit_ext_idx, false);
            svdat.applyCZ(two_qubit_int_idx, two_qubit_ext_idx, false);

            // Apply single qubit parametric operations
            const TestType angle =
                2.0 * M_PI * random_parameter_vector[gate_rep][index];
            svdat.applyRX(int_idx, ext_idx, false, angle);
            svdat.applyRY(int_idx, ext_idx, false, angle);
            svdat.applyRZ(int_idx, ext_idx, false, angle);

            // Apply two qubit parametric operations
            svdat.applyCRX(two_qubit_int_idx, two_qubit_ext_idx, false, angle);
            svdat.applyCRY(two_qubit_int_idx, two_qubit_ext_idx, false, angle);
            svdat.applyCRZ(two_qubit_int_idx, two_qubit_ext_idx, false, angle);
        }
    }
    t_end = std::chrono::high_resolution_clock::now();

    // Output walltime in csv format (Num Qubits, Time (milliseconds)),
    // averaged over the number of gate repetitions
    const auto walltime =
        0.001 * ((std::chrono::duration_cast<std::chrono::microseconds>(
                      t_end - t_start))
                     .count());
    std::cout << num_qubits << ", "
              << walltime / static_cast<double>(num_gate_reps) << std::endl;

    return 0;
}
39 changes: 39 additions & 0 deletions examples/gate_benchmark_plotter.py
@@ -0,0 +1,39 @@
import sys
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd # Needed to read .csv file

if __name__ == "__main__":
    assert len(sys.argv) == 3, "Usage: $PYTHON3_PATH " + sys.argv[0] + " $PATH_TO_CSV $PATH_TO_COMPILER_INFO"

    data_df = pd.read_csv(sys.argv[1])
    num_qubits_idx = data_df.columns.get_loc("Num Qubits")
    time_idx = data_df.columns.get_loc(" Time (milliseconds)")

    compiler_info = open(sys.argv[2], 'r').readlines()
    optimization = "-O3"

    data = data_df.to_numpy()
    avg_time_arr = [np.average(data[data[:, num_qubits_idx]==num_qubits][:, time_idx]) for num_qubits in data[:, num_qubits_idx]]

    # Plot absolute values in lin-lin plot
    plt.title("Averaged Absolute Time vs Number of Qubits\n")
    plt.xlabel("Number of Qubits in $[1]$")
    plt.ylabel("Time in $[ms]$")
    plt.grid(linestyle=':')
    plt.plot(data[:, num_qubits_idx], avg_time_arr, "rX")
    plt.figtext(0.05,0.0, ("Compiler:\t" + compiler_info[0] + "Optimization:\t" + optimization).expandtabs(), fontsize=7, va="bottom", ha="left")
    plt.subplots_adjust(bottom=0.2)
    plt.savefig("avg_time.png", dpi=200)
    plt.close()

    # Plot relative values in log-log plot
    plt.title("Scaling Behaviour: Relative Time vs Number of Qubits")
    plt.xlabel("Number of Qubits in $[1]$")
    plt.ylabel("Relative Time (compared to 1 qubit) in $[1]$")
    plt.grid(linestyle=':')
    plt.loglog(data[:, num_qubits_idx], avg_time_arr/avg_time_arr[0], "rX")
    plt.figtext(0.05,0.0, ("Compiler:\t" + compiler_info[0] + "Optimization:\t" + optimization).expandtabs(), fontsize=7, va="bottom", ha="left")
    plt.subplots_adjust(bottom=0.2)
    plt.savefig("scaling.png", dpi=200)
    plt.close()
3 changes: 3 additions & 0 deletions examples/requirements.txt
@@ -0,0 +1,3 @@
numpy
matplotlib
pandas
7 changes: 7 additions & 0 deletions examples/run_cleanup.sh
@@ -0,0 +1,7 @@
#!/bin/bash
echo Removing build folder, gate_benchmark.csv, compiler_info.txt, avg_time.png, scaling.png
rm -rf build
rm gate_benchmark.csv
rm compiler_info.txt
rm avg_time.png
rm scaling.png
48 changes: 48 additions & 0 deletions examples/run_gate_benchmark.sh
@@ -0,0 +1,48 @@
#!/bin/bash

crt_dir=$(pwd)

# Require the compiler as the first argument
if [ $# -eq 0 ]; then
    echo "Usage: bash $0 CXX_COMPILER"
    exit 1
fi

# Export env variables in case cmake reverts to default values
export CXX=$1

# Compiler version & optimization
compiler_file_name=compiler_info.txt
path_to_compiler_file=$crt_dir/$compiler_file_name
echo "Creating $path_to_compiler_file"
$CXX --version | head -n 1 > $path_to_compiler_file

# CMake & make
mkdir -p build
pushd ./build
cmake -DCMAKE_CXX_COMPILER=$CXX .. && make
popd

# Parameter initialization
min_num_qubits=6
max_num_qubits=22
num_qubits_increment=2
num_gate_reps=3

# Creating data file
data_file_name=gate_benchmark.csv
binary_dir=$crt_dir/build
binary_name=gate_benchmark
path_to_binary=$binary_dir/$binary_name
path_to_csv=$crt_dir/$data_file_name
echo "Creating $path_to_csv"
echo "Num Qubits, Time (milliseconds)" > $path_to_csv

# Generate data
for ((num_qubits=$min_num_qubits; num_qubits<$max_num_qubits+1; num_qubits+=$num_qubits_increment)); do
    printf "Run with %1d gate repetitions and %2d qubits \n" "$num_gate_reps" "$num_qubits"
    $path_to_binary ${num_gate_reps} ${num_qubits} >> $path_to_csv
done

# Plot results
python_path=$(which python3)
echo "Plotting results"
$python_path gate_benchmark_plotter.py $path_to_csv $path_to_compiler_file
6 changes: 2 additions & 4 deletions pennylane_lightning/src/algorithms/AdjointDiff.hpp
@@ -136,8 +136,7 @@ void applyGeneratorControlledPhaseShift(
} // namespace
/// @endcond

-namespace Pennylane {
-namespace Algorithms {
+namespace Pennylane::Algorithms {

/**
* @brief Utility struct for observable operations used by AdjointJacobian
@@ -764,5 +763,4 @@ template <class T = double> class AdjointJacobian {
}
};

-} // namespace Algorithms
-} // namespace Pennylane
+} // namespace Pennylane::Algorithms
6 changes: 2 additions & 4 deletions pennylane_lightning/src/simulator/Gates.hpp
@@ -12,8 +12,7 @@ using namespace Pennylane::Util;
}
/// @endcond

-namespace Pennylane {
-namespace Gates {
+namespace Pennylane::Gates {

/**
* @brief Create a matrix representation of the PauliX gate data in row-major
@@ -507,5 +506,4 @@ static auto getControlledPhaseShift(const std::vector<U> &params)
return getControlledPhaseShift<T>(params.front());
}

-} // namespace Gates
-} // namespace Pennylane
+} // namespace Pennylane::Gates
2 changes: 1 addition & 1 deletion pennylane_lightning/src/simulator/StateVector.hpp
@@ -275,7 +275,7 @@ template <class fp_t = double> class StateVector {
*
* @return const CFP_t* Pointer to statevector data.
*/
-auto getData() const -> CFP_t * { return arr_; }
+[[nodiscard]] auto getData() const -> CFP_t * { return arr_; }

/**
* @brief Get the underlying data pointer.
4 changes: 3 additions & 1 deletion pennylane_lightning/src/simulator/StateVectorManaged.hpp
@@ -71,7 +71,9 @@ class StateVectorManaged : public StateVector<fp_t> {
return *this;
}
auto getDataVector() -> std::vector<CFP_t> & { return data_; }
-auto getDataVector() const -> const std::vector<CFP_t> & { return data_; }
+[[nodiscard]] auto getDataVector() const -> const std::vector<CFP_t> & {
+    return data_;
+}

auto getInternalIndices(const std::vector<size_t> &qubit_indices)
-> std::vector<size_t> {
6 changes: 2 additions & 4 deletions pennylane_lightning/src/util/Util.hpp
@@ -53,8 +53,7 @@ using CBLAS_LAYOUT = enum CBLAS_LAYOUT {
#endif
/// @endcond

-namespace Pennylane {
-namespace Util {
+namespace Pennylane::Util {

/**
* @brief Compile-time scalar real times complex number.
@@ -780,5 +779,4 @@ template <class T> struct remove_cvref {
using type = std::remove_cv_t<std::remove_reference_t<T>>;
};

-} // namespace Util
-} // namespace Pennylane
+} // namespace Pennylane::Util
