Support arbitrary gates #91
Conversation
Codecov Report
@@            Coverage Diff             @@
##           master      #91      +/-   ##
==========================================
+ Coverage   98.03%   98.38%   +0.34%
==========================================
  Files           3        3
  Lines          51       62      +11
==========================================
+ Hits           50       61      +11
  Misses          1        1

Continue to review full report at Codecov.
unsigned int indx = 2 * (baseIndex + j);
CplxType matrix_elem(real_matrix[indx], real_matrix[indx + 1]);
shiftedState[index] += matrix_elem * v[j];
This version of applyKernel() is the same as the AbstractGate one, except for the lines here. At this point, we have to remember that real_matrix is of the form [real, imag, real, imag, real, imag, ...], so we need to make jumps of 2 instead of 1.
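For readers of this thread, a minimal self-contained sketch of that stride-2 access pattern (illustrative only; applyRow and its arguments are not the actual lightning kernel):

#include <complex>
#include <cstddef>
#include <vector>

using CplxType = std::complex<double>;

// Apply one row of a matrix stored as interleaved [re, im, re, im, ...]
// doubles to a slice v of the state vector.
CplxType applyRow(const std::vector<double>& real_matrix,
                  std::size_t baseIndex,
                  const std::vector<CplxType>& v) {
    CplxType acc{0.0, 0.0};
    for (std::size_t j = 0; j < v.size(); ++j) {
        std::size_t indx = 2 * (baseIndex + j);      // jump of 2 per complex entry
        CplxType matrix_elem(real_matrix[indx],       // real part
                             real_matrix[indx + 1]);  // imaginary part
        acc += matrix_elem * v[j];
    }
    return acc;
}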
operations = {
    "BasisState",
    "QubitStateVector",
kernel_operations = {
By doing this, lightning.qubit will now support any gate that default.qubit does. Further, if the gate is in kernel_operations then we have an optimized kernel for it. If not, we'll directly use the gate's matrix to apply the gate through standard matrix multiplication.
auto size = parameters.size();
unsigned int log2size = 0;
while (size >>= 1) ++log2size;
unsigned int numQubits = (log2size - 1) / 2;
We infer the number of qubits based upon the input parameters.
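As a concrete check of that inference (an illustration, not part of the PR): a 2-qubit unitary is a 4 × 4 complex matrix, i.e. 16 complex entries flattened to 32 interleaved reals; shifting 32 right until it reaches zero gives log2size = 5, and numQubits = (5 - 1) / 2 = 2, as expected.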
    op_names.append(o.name)
    op_param.append(o.parameters)
else:
    op_names.append("QubitUnitary")
Here, any gate that does not have an optimized kernel in lightning.qubit will be treated as a unitary by the backend. We just need to find the unitary matrix, which we then view as [real, imag, real, imag, real, imag, ...] using mat.ravel().view(np.float64) (no copies required). These real numbers are then passed as parameters of the gate.
Is the issue here that we cannot pass a complex array with pybind and thus need to represent it as a flattened real array?
@@ -323,6 +323,23 @@ namespace Pennylane {
    void applyKernel(const StateVector& state, const std::vector<size_t>& indices, const std::vector<size_t>& externalIndices);
};

class Unitary : public AbstractGate {
private:
    const std::vector<CplxType> matrix;
We actually don't explicitly define matrix, and so matrix and asMatrix are not needed. However, they need to be defined in the header because they are virtual in AbstractGate.
I'm not sure why we don't just properly define the std::vector<CplxType> matrix, and re-use the base-class implementation for applyKernel() for no additional work? This would simply involve initialising a vector<CplxType> in the create() method from the parameters, and passing that into the constructor for the class.
Although, with that said, I'm wondering why we're even trying to encode the gate in the parameter list instead of just modifying the pybind bindings to pass the matrix in as a separate parameter.
I'm not sure why we don't just properly define the std::vector matrix, and re-use the base-class implementation for applyKernel() for no additional work? This would simply involve initialising a vector in the create() method from the parameters, and passing that into the constructor for the class.
Agreed, I would have liked to do that but it wasn't clear to me how to cope with the matrix being passed as real values through the parameters argument as a vector {real, imag, real, imag, ...} (also agree that we should discuss whether this approach is the best way). If we have params as a 2*d-length flattened and real vector, can we convert that to a d-length complex vector with no overhead/copying?
If we have params as a 2*d length flattened and real vector, can we convert that to a d-length complex vector with no overhead/copying?
This solution already involves copying (the Unitary constructor is calling the vector copy constructor). In fact, because we're relying on the pybind implicit conversion from a Python list op_param to vector<vector<double>> params, the overhead is already awful, scaling with the size of the matrix! So we really should be passing it in separately.
because we're relying on the pybind implicit conversion from a python list op_param to vector<vector> params, the overhead is already awful, scaling with the size of the matrix
Actually I take this back; this may not be the case. It's a Python list of what's potentially a mixture of Python lists and numpy array views. I'm not entirely sure how that gets handled.
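As a side note on the "can we view a 2*d interleaved real vector as d complex values without copying" question above, here is a minimal pointer-level sketch (not part of the PR). It leans on the array-layout guarantee for std::complex; strictly, the standard only spells out the complex-to-double direction, but the reverse shown here is the common idiom, and building an actual std::vector<CplxType> would still require a copy, as noted above.

#include <complex>
#include <vector>

// Sketch only: view an interleaved [re, im, re, im, ...] buffer as complex
// values without copying. An array of std::complex<double> is laid out as
// adjacent (real, imag) double pairs, which is what this reinterpretation
// relies on.
const std::complex<double>* asComplexView(const std::vector<double>& interleaved) {
    return reinterpret_cast<const std::complex<double>*>(interleaved.data());
}

// Usage: with d = interleaved.size() / 2, the complex entries are
// asComplexView(interleaved)[0] ... asComplexView(interleaved)[d - 1].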
tests/test_apply.py
Outdated
@pytest.mark.parametrize("wires", range(2, 10))
def test_qubit_unitary(wires):
    """Test if lightning.qubit successfully applies an arbitrary unitary"""

    unitary = unitary_group.rvs(2 ** wires, random_state=0)

    dev = qml.device("lightning.qubit", wires=wires)

    @qml.qnode(dev)
    def f():
        qml.QubitUnitary(unitary, wires=range(wires))
        return qml.state()

    target_state = unitary[:, 0]
    assert np.allclose(f(), target_state)
Variants of this test are almost certainly provided by the device test suite, but I'm happy to have it explicitly here even if there is some redundancy.
This is really great, with many clever solutions and workarounds! 💯 😍 Nothing major, maybe just worth commenting on the approach that is taken to allow support. Would just jump in for a quick round before approval.
Pennylane::Unitary Pennylane::Unitary::create(const vector<double>& parameters) {
    auto size = parameters.size();
    unsigned int log2size = 0;
    while (size >>= 1) ++log2size;
Personally, would not mind having this still be a while loop, but with syntax of:

while (size >>= 1) {
    ++log2size;
}

Also, this functionality could potentially come in handy in other parts of the code base too; could consider defining it in Utils.hpp.
Why are we using while loops for what should be a trivial relation between parameters.size() and log2size? Isn't it just log2size = ((int) log2(parameters.size()) - 1) / 2, where log2 is defined in cmath (we should really validate to ensure the original matrix dimension is exactly a power of 2)?
Good question, I just followed a suggestion from a quick search. However, I guess there are no issues with doing what you suggest when parameters.size() is a power of two (?).
I just followed a suggestion from a quick search.
If we really cared about floating-point precision issues, we could just round the result to the nearest integer.
I guess there are no issues with doing what you suggest when parameters.size() is a power of two (?).
I'm not sure why this would make a difference (it's essentially an implied floor, instead of the current implementation's implied ceiling, neither of which is more correct); either way, we should be validating that it is exactly so!
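A minimal sketch of the validation being discussed (illustrative only; inferNumQubits is a hypothetical helper, not code from this PR):

#include <cmath>
#include <cstddef>
#include <stdexcept>

// Infer the qubit count from an interleaved-real parameter vector whose length
// should be 2 * (2^n)^2 = 2^(2n + 1), validating that it has exactly that form.
unsigned int inferNumQubits(std::size_t size) {
    // The length must be a power of two...
    if (size == 0 || (size & (size - 1)) != 0)
        throw std::invalid_argument("parameter count is not a power of two");
    unsigned int log2size = static_cast<unsigned int>(std::round(std::log2(size)));
    // ...and the exponent must be odd (2n + 1), so the matrix dimension is 2^n.
    if (log2size % 2 == 0)
        throw std::invalid_argument("parameters do not form a square complex matrix");
    return (log2size - 1) / 2;
}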
@@ -323,6 +323,23 @@ namespace Pennylane {
    void applyKernel(const StateVector& state, const std::vector<size_t>& indices, const std::vector<size_t>& externalIndices);
};

class Unitary : public AbstractGate {
For the existing gates we knew the matrix representation in advance (and could hardcode them). Down the line, would be nice to check that there's no major performance degradation if we have a circuit with several layers that apply the same QubitUnitary. In such a case, parameters will be passed for every QubitUnitary and lightning will consider each of them separately. 🤔
class Unitary : public AbstractGate {
private:
    const std::vector<CplxType> matrix;
    const std::vector<double> real_matrix;
Worth leaving a comment on how this vector represents the complex matrix.
    return Pennylane::Unitary(numQubits, parameters);
}

Pennylane::Unitary::Unitary(const int numQubits, const std::vector<double>& real_matrix)
Find the name real_matrix confusing; matrix_with_real_and_imag (or something better)?
{}

void Pennylane::Unitary::applyKernel(const StateVector& state, const std::vector<size_t>& indices, const std::vector<size_t>& externalIndices) {
    const vector<double>& real_matrix = asRealMatrix();
Depending on the relative order of the PRs, inverse computations would be required here as per #89?
Pennylane::Unitary Pennylane::Unitary::create(const vector<double>& parameters) {
    auto size = parameters.size();
Probably not a use case for now since we are binding PennyLane, though it might be worth verifying the size of parameters.
with many clever solutions and workarounds!
I find that this is generally indicative of one of three things:
- There are genuine shortcomings in the language features, in which case ¯\_(ツ)_/¯; or
- There are problems with the class design, in which case we should modify said design to support the case that we're adding; or
- We need to take a step back, re-evaluate what it is we're trying to do, and see if there's a more natural way of accomplishing things.
This PR feels like we're trying to fit a square into a triangle-shaped hole, so it feels like 3) to me. And, in the first instance, it should already be shaped like a triangle!
@@ -583,6 +629,7 @@ static map<string, function<unique_ptr<Pennylane::AbstractGate>(const vector<dou
    addToDispatchTable<Pennylane::CGeneralRotationGate>(dispatchTable);
    addToDispatchTable<Pennylane::ToffoliGate>(dispatchTable);
    addToDispatchTable<Pennylane::CSWAPGate>(dispatchTable);
    addToDispatchTable<Pennylane::Unitary>(dispatchTable);
Personally, I'd rather we just treat anything that doesn't match the dispatch table:

if (dispatchTableIterator == dispatchTable.end())

as a Unitary automatically. And if we pass the matrix (or a functional to generate it) as a separate parameter, then we can get rid of the kernel_operations list in Python, relieving the Python end from the need to know implementation details in the backend.
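A self-contained miniature of that fallback behaviour, for illustration (the names below are stand-ins, not the actual lightning classes, and the matrix is assumed to arrive as a separate argument as suggested):

#include <complex>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <vector>

struct Gate { virtual ~Gate() = default; };
// Stands in for the generic Unitary fallback that just stores a matrix.
struct UnitaryGate : Gate {
    explicit UnitaryGate(std::vector<std::complex<double>> m) : matrix(std::move(m)) {}
    std::vector<std::complex<double>> matrix;
};

using Factory = std::function<std::unique_ptr<Gate>(const std::vector<double>&)>;

std::unique_ptr<Gate> constructGate(const std::map<std::string, Factory>& table,
                                    const std::string& name,
                                    const std::vector<double>& params,
                                    const std::vector<std::complex<double>>& matrix) {
    auto it = table.find(name);
    if (it == table.end())
        return std::make_unique<UnitaryGate>(matrix);  // no optimised kernel: generic unitary
    return it->second(params);                          // optimised kernel path
}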
I like the idea of (a) Unitary being the dispatch table default and (b) kernel_operations being removed (that also worried me in terms of maintainability). However, if we do (b), don't we then need to pass the matrix from Python to the backend for every operation (including those with a defined kernel)? Is there an overhead in doing that:
- Even if passed by reference, do we expect any overhead from PyBind?
- There would be a (potentially negligible) overhead in creating the matrix in Python when not needed to. E.g., for Rot?
Even if passed by reference, do we expect any overhead from PyBind?
If it's just a numpy array that we're referencing, I don't think so. It should be easy enough to test by just passing through a humongous matrix and seeing if it makes a difference.
There would be a (potentially negligible) overhead in creating the matrix in Python when not needed to. E.g., for Rot?
That's exactly why I've suggested trying to pass a functional to generate the matrix instead of directly passing the matrix, so that it can be called on-demand. There may be some complications in managing the lifetime of the returned object so that it doesn't get garbage collected by Python prematurely, but I haven't looked into it.
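A rough pybind11 sketch of the "pass a functional" idea (matrixFromFactory is a hypothetical helper, not part of this PR; lifetime management of the returned array is not addressed here):

#include <pybind11/pybind11.h>
#include <pybind11/numpy.h>
#include <pybind11/complex.h>
#include <complex>
#include <vector>

namespace py = pybind11;

// Call a Python matrix factory on demand and copy the result into a
// std::vector<std::complex<double>> for the backend. The factory is only
// invoked when the backend actually needs the matrix.
std::vector<std::complex<double>> matrixFromFactory(const py::function& factory) {
    auto arr = factory().cast<
        py::array_t<std::complex<double>, py::array::c_style | py::array::forcecast>>();
    auto buf = arr.request();
    const auto* data = static_cast<const std::complex<double>*>(buf.ptr);
    return {data, data + buf.size};
}

On the Python side, the factory could be a small callable that builds the gate's matrix only when asked, so kernel-backed gates never pay that cost.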
Thanks @ThomasLoke, really appreciate the feedback and guidance on this! And I agree that this may be a case of taking a step back and potentially re-evaluating, although also balancing that against the cost of slower development.
So the current approach is: if an op doesn't have a supported kernel, pass the unitary matrix as a flat, real-valued vector. In the backend, define a tweaked kernel that correctly calculates the matrix multiplication.
What would our ideal approach look like?
- If an op doesn't have a supported kernel, we somehow pass the matrix to the backend
- If an op name isn't defined in the dispatch table, we fall back to Unitary
- Unitary just relies on the applyKernel from AbstractGate
So for me the question seems to be about how and when to pass the matrix to the backend. I also wonder if we're starting to see limitations in the simple serialization approach we used to communicate between frontend and backend. For example, what if we could pass a list of Gate objects to the backend (perhaps this would be more lightweight than PL's current Operation class).
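Purely as an illustration of that last idea (hypothetical, not something this PR implements), a lightweight descriptor passed once per operation might look like:

#include <complex>
#include <string>
#include <vector>

// Hypothetical gate descriptor that the Python layer could serialise and hand
// to the backend as a single std::vector<GateSpec>.
struct GateSpec {
    std::string name;                          // e.g. "RX" or "QubitUnitary"
    std::vector<double> params;                // gate parameters (angles, ...)
    std::vector<unsigned int> wires;           // target wires
    std::vector<std::complex<double>> matrix;  // populated only when no kernel exists
};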
And I agree that this may be a case of taking a step back and potentially re-evaluating, although also balancing that against the cost of slower development.
That's partially a matter of perspective. Slower development in the present to get things right may save a lot of pain in the future. But I don't dictate development milestones, so ¯\_(ツ)_/¯
What would our ideal approach look like?
As mentioned in the preceding comments, ideally Python shouldn't even need to distinguish between optimised and non-optimised gates. Instead, it'd just pass in a functional to generate the matrix, and the backend will invoke that if necessary.
In terms of serialisation, passing in an extra argument to extend the Bindings interface does feel a bit suboptimal. Ideally, we'd just encapsulate all the settings that define a gate within a class of some sort, and just instantiate a single list of those objects to the backend. But reworking this doesn't need to be the domain of this PR.
I have no major thoughts other than some of the comments already expressed here. I agree that it seems like a better solution to somehow use the already existing applyKernel here, rather than applying the complex-splitting trick (although I too find it quite clever). Not sure how this could be solved better though. 🤔
unitary_view = mat.ravel().view(np.float64)
op_param.append(unitary_view)
Oh, view() splits up the complex number into two floats. Nice! 😮
Closing this for now while we think about the best way to implement.