Introduce device capability flag and default handler for parameter broadcasting (#2590)

* introduce Operator.ndim_params, Operator.batch_size, QuantumTape.batch_size

* linting

* changelog

* enable tf.function input_signature usage

* black

* test for unsilenced error

* Apply suggestions from code review

Co-authored-by: Josh Izaac <josh146@gmail.com>

* introduce device flag and batch_transform for unbroadcasting; use transform in device.batch_transform

* black, [skip ci]

* code review

* string formatting [skip ci]

* operation broadcasting interface tests

* unbroadcast_expand

* tests for expand function

* tests

* black

* compatibility with TensorFlow 2.6

* builtins unstack

* failing case coverage

* stop using I in operation.py [skip ci]

* Apply suggestions from code review

Co-authored-by: Josh Izaac <josh146@gmail.com>

* review

* Apply suggestions from code review

Co-authored-by: Josh Izaac <josh146@gmail.com>

* review [skip ci]

* move changelog section from "improvements" to "new features"

* changelog

* add missing files

* namespace

* linting variable names

* pin protobuf<4.21.0

* docstring

* unpin protobuf

* Allow broadcasting in the numerical representations of standard operations (#2609)

* commit old changes

* intermed

* clean up, move broadcast dimension first

* update tests that manually set ndim_params for default ops

* pin protobuf<4.21.0

* improve shape coercion order

* changelog formatting

* broadcasted pow tests

* attribute test, ControlledQubitUnitary update

* test kwargs attributes

* Apply suggestions from code review

Co-authored-by: Josh Izaac <josh146@gmail.com>

* changelog

* review

* remove prints

* explicit attribute supports_broadcasting tests

* tests disentangle

* fix

* PauliRot broadcasted identity compatible with TF

* rename "batched" into "broadcasted" for uniform namespace

* old TF version support in qubitunitary unitarity check

* python3.7 support

* Apply suggestions from code review

Co-authored-by: Josh Izaac <josh146@gmail.com>

* linebreak

Co-authored-by: Josh Izaac <josh146@gmail.com>

* black

* black again

* feature collision amend tests

* black [skip ci]

Co-authored-by: Josh Izaac <josh146@gmail.com>
dwierichs and josh146 authored Jun 3, 2022
1 parent 3344c77 commit 85cc93f
Showing 27 changed files with 2,761 additions and 382 deletions.
134 changes: 120 additions & 14 deletions doc/releases/changelog-dev.md
@@ -4,20 +4,119 @@

<h3>New features since last release</h3>

* Operators have new attributes `ndim_params` and `batch_size`, and `QuantumTapes` have the new
attribute `batch_size`.
- `Operator.ndim_params` contains the expected number of dimensions per parameter of the operator,
- `Operator.batch_size` contains the size of an additional parameter broadcasting axis, if present,
- `QuantumTape.batch_size` contains the `batch_size` of its operations (see below).
* Parameter broadcasting within operations and tapes was introduced.
[(#2575)](https://github.com/PennyLaneAI/pennylane/pull/2575)
[(#2590)](https://github.com/PennyLaneAI/pennylane/pull/2590)
[(#2609)](https://github.com/PennyLaneAI/pennylane/pull/2609)

Parameter broadcasting refers to passing parameters with a (single) leading additional
dimension (compared to the expected parameter shape) to `Operator`s.
Introducing this concept involves multiple changes:

1. New class attributes
- `Operator.ndim_params` can be specified by developers to provide the expected number of dimensions for each parameter
of an operator.
- `Operator.batch_size` returns the size of an additional parameter-broadcasting axis,
if present.
- `QuantumTape.batch_size` returns the `batch_size` of its operations (see logic below).
- `Device.capabilities()["supports_broadcasting"]` is a Boolean flag indicating whether a
device is natively able to apply broadcasted operators.
2. New functionalities
- `Operator`s use their new `ndim_params` attribute to set their new attribute `batch_size`
at instantiation. `batch_size=None` corresponds to unbroadcasted operators.
- `QuantumTape`s automatically determine their new `batch_size` attribute from the
`batch_size`s of their operations. For this, all `Operators` in the tape must have the same
`batch_size` or `batch_size=None`. That is, mixing broadcasted and unbroadcasted `Operators`
is allowed, but mixing broadcasted `Operators` with differing `batch_size` is not,
similar to NumPy broadcasting.
- A new tape `batch_transform` called `broadcast_expand` was added. It transforms a single
tape with `batch_size!=None` (broadcasted) into multiple tapes with `batch_size=None`
(unbroadcasted) each.
- `Device`s can handle broadcasted `QuantumTape`s by applying `broadcast_expand` if
the new flag `capabilities()["supports_broadcasting"]` is set to `False` (the default).
3. Feature support
- Many parametrized operations now have the attribute `ndim_params` and
allow arguments with a broadcasting dimension in their numerical representations.
This includes all gates in `ops/qubit/parametric_ops` and `ops/qubit/matrix_ops`.
The broadcast dimension is the first dimension in these representations.
Note that for most operations the broadcasted parameter has to be passed as a `tensor`
rather than as a Python `list` or `tuple`.

**Example**

Instantiating a rotation gate with a one-dimensional array leads to a broadcasted `Operation`:

When providing an operator with the `ndim_params` attribute, it will
determine whether (and with which `batch_size`) its input parameter(s)
is/are broadcasted.
A `QuantumTape` can then infer from its operations whether it is batched.
For this, all `Operators` in the tape must have the same `batch_size` or `batch_size=None`.
That is, mixing broadcasted and unbroadcasted `Operators` is allowed, but mixing broadcasted
`Operators` with differing `batch_size` is not, similar to NumPy broadcasting.
```pycon
>>> op = qml.RX(np.array([0.1, 0.2, 0.3], requires_grad=True), 0)
>>> op.batch_size
3
```

Its matrix is correspondingly augmented by a leading dimension of size `batch_size`:

```pycon
>>> np.round(op.matrix(), 4)
tensor([[[0.9988+0.j    , 0.    -0.05j  ],
         [0.    -0.05j  , 0.9988+0.j    ]],
        [[0.995 +0.j    , 0.    -0.0998j],
         [0.    -0.0998j, 0.995 +0.j    ]],
        [[0.9888+0.j    , 0.    -0.1494j],
         [0.    -0.1494j, 0.9888+0.j    ]]], requires_grad=True)
>>> op.matrix().shape
(3, 2, 2)
```

A tape with such an operation will detect the `batch_size` and inherit it:

```pycon
>>> with qml.tape.QuantumTape() as tape:
...     qml.apply(op)
>>> tape.batch_size
3
```

A tape may contain broadcasted and unbroadcasted `Operation`s

```pycon
>>> with qml.tape.QuantumTape() as tape:
...     qml.apply(op)
...     qml.RY(1.9, 0)
>>> tape.batch_size
3
```

but not `Operation`s with differing (non-`None`) `batch_size`s:

```pycon
>>> with qml.tape.QuantumTape() as tape:
...     qml.apply(op)
...     qml.RY(np.array([1.9, 2.4]), 0)
ValueError: The batch sizes of the tape operations do not match, they include 3 and 2.
```

When creating a valid broadcasted tape, we can expand it into unbroadcasted tapes with
the new `broadcast_expand` transform, and execute the three tapes independently.

```pycon
>>> with qml.tape.QuantumTape() as tape:
...     qml.apply(op)
...     qml.RY(1.9, 0)
...     qml.apply(op)
...     qml.expval(qml.PauliZ(0))
>>> tapes, fn = qml.transforms.broadcast_expand(tape)
>>> len(tapes)
3
>>> dev = qml.device("default.qubit", wires=1)
>>> fn(qml.execute(tapes, dev, None))
array([-0.33003414, -0.34999899, -0.38238817])
```

However, devices will handle this automatically under the hood:

```pycon
>>> qml.execute([tape], dev, None)[0]
array([-0.33003414, -0.34999899, -0.38238817])
```
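The same broadcasting also flows through a QNode built on a device without native support,
because the device's `batch_transform` applies `broadcast_expand` under the hood. A minimal
sketch (the circuit is hypothetical; only the output shape is shown):

```pycon
>>> dev = qml.device("default.qubit", wires=1)
>>> @qml.qnode(dev)
... def circuit(x):
...     # x may carry a single leading broadcasting dimension
...     qml.RX(x, wires=0)
...     return qml.expval(qml.PauliZ(0))
>>> qml.math.shape(circuit(np.array([0.1, 0.2, 0.3], requires_grad=True)))
(3,)
```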

* Boolean mask indexing of the parameter-shift Hessian
[(#2538)](https://github.com/PennyLaneAI/pennylane/pull/2538)
@@ -133,11 +232,17 @@
for `qml.QueuingContext.update_info` in a variety of places.
[(#2612)](https://github.com/PennyLaneAI/pennylane/pull/2612)

* `BasisEmbedding` can accept an int as argument instead of a list of bits (optionally). Example: `qml.BasisEmbedding(4, wires = range(4))` is now equivalent to `qml.BasisEmbedding([0,1,0,0], wires = range(4))` (because 4=0b100).
* `BasisEmbedding` can accept an int as argument instead of a list of bits (optionally).
[(#2601)](https://github.com/PennyLaneAI/pennylane/pull/2601)

Example:

`qml.BasisEmbedding(4, wires = range(4))` is now equivalent to
`qml.BasisEmbedding([0,1,0,0], wires = range(4))` (because `4=0b100`).

* Introduced a new `is_hermitian` property to determine if an operator can be used in a measurement process.
[(#2629)](https://github.com/PennyLaneAI/pennylane/pull/2629)

<h3>Breaking changes</h3>

* The `qml.queuing.Queue` class is now removed.
@@ -179,7 +284,8 @@
as trainable do not have any impact on the QNode output.
[(#2584)](https://github.com/PennyLaneAI/pennylane/pull/2584)

* `QNode`'s now can interpret variations on the interface name, like `"tensorflow"` or `"jax-jit"`, when requesting backpropagation.
* `QNode`s can now interpret variations on the interface name, like `"tensorflow"`
or `"jax-jit"`, when requesting backpropagation.
[(#2591)](https://github.com/PennyLaneAI/pennylane/pull/2591)

* Fixed a bug for `diff_method="adjoint"` where incorrect gradients were
42 changes: 30 additions & 12 deletions pennylane/_device.py
@@ -108,7 +108,7 @@ class Device(abc.ABC):
"""

# pylint: disable=too-many-public-methods,too-many-instance-attributes
_capabilities = {"model": None}
_capabilities = {"model": None, "supports_broadcasting": False}
"""The capabilities dictionary stores the properties of a device. Devices can add their
own custom properties and overwrite existing ones by overriding the ``capabilities()`` method."""
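For plugin devices that do implement broadcasting natively, the new flag would be switched on by
overriding ``capabilities()`` in the usual way; a hedged sketch (``MyDevice`` is hypothetical):

```python
import pennylane as qml


class MyDevice(qml.Device):
    """Hypothetical sketch of a plugin device that applies broadcasted operators natively."""

    @classmethod
    def capabilities(cls):
        capabilities = super().capabilities().copy()
        # Declare native support so Device.batch_transform skips the broadcast_expand fallback
        capabilities.update(supports_broadcasting=True)
        return capabilities
```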

@@ -705,11 +705,6 @@ def batch_transform(self, circuit):
the sequence of circuits to be executed, and a post-processing function
to be applied to the list of evaluated circuit results.
"""

# If the observable contains a Hamiltonian and the device does not
# support Hamiltonians, or if the simulation uses finite shots, or
# if the Hamiltonian explicitly specifies an observable grouping,
# split tape into multiple tapes of diagonalizable known observables.
supports_hamiltonian = self.supports_observable("Hamiltonian")
finite_shots = self.shots is not None
grouping_known = all(
@@ -723,26 +718,49 @@ def batch_transform(self, circuit):
return_types = [m.return_type for m in circuit.observables]

if hamiltonian_in_obs and ((not supports_hamiltonian or finite_shots) or grouping_known):
# If the observable contains a Hamiltonian and the device does not
# support Hamiltonians, or if the simulation uses finite shots, or
# if the Hamiltonian explicitly specifies an observable grouping,
# split tape into multiple tapes of diagonalizable known observables.
try:
return qml.transforms.hamiltonian_expand(circuit, group=False)
circuits, hamiltonian_fn = qml.transforms.hamiltonian_expand(circuit, group=False)

except ValueError as e:
raise ValueError(
"Can only return the expectation of a single Hamiltonian observable"
) from e

if (
elif (
len(circuit._obs_sharing_wires) > 0
and not hamiltonian_in_obs
and not qml.measurements.Sample in return_types
and not qml.measurements.Probability in return_types
):
# Check for case of non-commuting terms and that there are no Hamiltonians
# TODO: allow for Hamiltonians in list of observables as well.
return qml.transforms.split_non_commuting(circuit)
circuits, hamiltonian_fn = qml.transforms.split_non_commuting(circuit)

else:
# otherwise, return the output of an identity transform
circuits, hamiltonian_fn = [circuit], lambda res: res[0]

# Check whether the circuit was broadcasted (then the Hamiltonian-expanded
# ones will be as well) and whether broadcasting is supported
if circuit.batch_size is None or self.capabilities().get("supports_broadcasting"):
# If the circuit wasn't broadcasted or broadcasting is supported, no action required
return circuits, hamiltonian_fn

# Expand each of the broadcasted Hamiltonian-expanded circuits
expanded_tapes, expanded_fn = qml.transforms.map_batch_transform(
qml.transforms.broadcast_expand, circuits
)

# Chain the postprocessing functions of the broadcasted-tape expansions and the Hamiltonian
# expansion. Note that the application order is reversed compared to the expansion order,
# i.e. while we first applied `hamiltonian_expand` to the tape, we need to process the
# results from the broadcast expansion first.
total_processing = lambda results: hamiltonian_fn(expanded_fn(results))

# otherwise, return an identity transform
return [circuit], lambda res: res[0]
return expanded_tapes, total_processing

@property
def op_queue(self):
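The chaining above mirrors what one would do by hand: first expand the Hamiltonian, then expand
the broadcast, and compose the processing functions in reverse order. A rough pycon sketch under
those assumptions (the example tape and the resulting tape count are illustrative only):

```pycon
>>> H = qml.Hamiltonian([1.0, 2.0], [qml.PauliZ(0), qml.PauliX(0)])
>>> with qml.tape.QuantumTape() as tape:
...     qml.RX(np.array([0.1, 0.2, 0.3], requires_grad=True), wires=0)
...     qml.expval(H)
>>> circuits, hamiltonian_fn = qml.transforms.hamiltonian_expand(tape, group=False)
>>> expanded_tapes, expanded_fn = qml.transforms.map_batch_transform(
...     qml.transforms.broadcast_expand, circuits
... )
>>> len(expanded_tapes)  # 2 Hamiltonian terms x batch_size 3
6
>>> total_processing = lambda results: hamiltonian_fn(expanded_fn(results))
```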
2 changes: 2 additions & 0 deletions pennylane/math/single_dispatch.py
@@ -42,11 +42,13 @@ def _i(name):
ar.register_function("builtins", "block_diag", lambda x: _scipy_block_diag(*x))
ar.register_function("numpy", "gather", lambda x, indices: x[np.array(indices)])
ar.register_function("numpy", "unstack", list)
ar.register_function("builtins", "unstack", list)

# the following is required to ensure that SciPy sparse Hamiltonians passed to
# qml.SparseHamiltonian are not automatically 'unwrapped' to dense NumPy arrays.
ar.register_function("scipy", "to_numpy", lambda x: x)
ar.register_function("scipy", "shape", np.shape)
ar.register_function("scipy", "ndim", np.ndim)


def _scatter_element_add_numpy(tensor, index, value):
30 changes: 21 additions & 9 deletions pennylane/operation.py
@@ -190,29 +190,38 @@ def expand_matrix(base_matrix, wires, wire_order):
# TODO[Maria]: In future we should consider making ``utils.expand`` differentiable and calling it here.
wire_order = Wires(wire_order)
n = len(wires)
interface = qml.math._multi_dispatch(base_matrix) # pylint: disable=protected-access
shape = qml.math.shape(base_matrix)
batch_dim = shape[0] if len(shape) == 3 else None
interface = qml.math.get_interface(base_matrix) # pylint: disable=protected-access

# operator's wire positions relative to wire ordering
op_wire_pos = wire_order.indices(wires)

identity = qml.math.reshape(
qml.math.eye(2 ** len(wire_order), like=interface), [2] * len(wire_order) * 2
qml.math.eye(2 ** len(wire_order), like=interface), [2] * (len(wire_order) * 2)
)
axes = (list(range(n, 2 * n)), op_wire_pos)
# The first entry of axes is range(n, 2n) if batch_dim is None, and range(n+1, 2n+1) otherwise
axes = (list(range(-n, 0)), op_wire_pos)

# reshape op.matrix()
op_matrix_interface = qml.math.convert_like(base_matrix, identity)
mat_op_reshaped = qml.math.reshape(op_matrix_interface, [2] * n * 2)
shape = [batch_dim] + [2] * (n * 2) if batch_dim else [2] * (n * 2)
mat_op_reshaped = qml.math.reshape(op_matrix_interface, shape)
mat_tensordot = qml.math.tensordot(
mat_op_reshaped, qml.math.cast_like(identity, mat_op_reshaped), axes
)

unused_idxs = [idx for idx in range(len(wire_order)) if idx not in op_wire_pos]
# permute matrix axes to match wire ordering
perm = op_wire_pos + unused_idxs
mat = qml.math.moveaxis(mat_tensordot, wire_order.indices(wire_order), perm)
sources = wire_order.indices(wire_order)
if batch_dim:
perm = [p + 1 for p in perm]
sources = [s + 1 for s in sources]

mat = qml.math.reshape(mat, (2 ** len(wire_order), 2 ** len(wire_order)))
mat = qml.math.moveaxis(mat_tensordot, sources, perm)
shape = [batch_dim] + [2 ** len(wire_order)] * 2 if batch_dim else [2 ** len(wire_order)] * 2
mat = qml.math.reshape(mat, shape)

return mat
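
A rough sketch of what the updated ``expand_matrix`` enables when called directly on a
broadcasted matrix (the helper lives in ``pennylane.operation``; only the shapes matter here):

```pycon
>>> from pennylane.operation import expand_matrix
>>> batched = qml.RX(np.array([0.1, 0.2, 0.3], requires_grad=True), wires=1).matrix()
>>> batched.shape  # leading broadcasting dimension
(3, 2, 2)
>>> expand_matrix(batched, wires=[1], wire_order=[0, 1]).shape
(3, 4, 4)
```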

@@ -804,7 +813,9 @@ def label(self, decimals=None, base_label=None, cache=None):

if len(qml.math.shape(params[0])) != 0:
# assume that if the first parameter is matrix-valued, there is only a single parameter
# this holds true for all current operations and templates
# this holds true for all current operations and templates unless parameter broadcasting
# is used
# TODO[dwierichs]: Implement a proper label for broadcasted operators
if (
cache is None
or not isinstance(cache.get("matrices", None), list)
@@ -926,7 +937,8 @@ def _check_batching(self, params):
]
if not qml.math.allclose(first_dims, first_dims[0]):
raise ValueError(
f"Batching was attempted but the batched dimensions do not match: {first_dims}."
"Broadcasting was attempted but the broadcasted dimensions "
f"do not match: {first_dims}."
)
self._batch_size = first_dims[0]

@@ -1409,7 +1421,7 @@ def matrix(self, wire_order=None):
canonical_matrix = self.compute_matrix(*self.parameters, **self.hyperparameters)

if self.inverse:
canonical_matrix = qml.math.conj(qml.math.T(canonical_matrix))
canonical_matrix = qml.math.conj(qml.math.moveaxis(canonical_matrix, -2, -1))

if wire_order is None or self.wires == Wires(wire_order):
return canonical_matrix
2 changes: 1 addition & 1 deletion pennylane/ops/functions/matrix.py
@@ -141,6 +141,6 @@ def _matrix(tape, wire_order=None):

for op in tape.operations:
U = matrix(op, wire_order=wire_order)
unitary_matrix = qml.math.dot(U, unitary_matrix)
unitary_matrix = qml.math.tensordot(U, unitary_matrix, axes=[[-1], [-2]])

return unitary_matrix
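
The switch from ``dot`` to ``tensordot`` over the last axis of ``U`` and the second-to-last axis
of the accumulated matrix keeps ordinary matrix products working while also admitting a
broadcasted ``U``; a small NumPy-only sketch of the shapes involved:

```pycon
>>> U = np.random.rand(3, 4, 4)      # broadcasted operator matrix (batch of 3)
>>> acc = np.eye(4)                  # accumulated circuit matrix so far
>>> np.tensordot(U, acc, axes=[[-1], [-2]]).shape
(3, 4, 4)
>>> np.tensordot(np.random.rand(4, 4), acc, axes=[[-1], [-2]]).shape
(4, 4)
```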
32 changes: 32 additions & 0 deletions pennylane/ops/qubit/attributes.py
@@ -199,3 +199,35 @@ def __contains__(self, obj):
representation using ``np.linalg.eigvals``, which fails for some tensor types that the matrix
may be cast in on backpropagation devices.
"""

supports_broadcasting = Attribute(
[
"QubitUnitary",
"ControlledQubitUnitary",
"DiagonalQubitUnitary",
"RX",
"RY",
"RZ",
"PhaseShift",
"ControlledPhaseShift",
"Rot",
"MultiRZ",
"PauliRot",
"CRX",
"CRY",
"CRZ",
"CRot",
"U1",
"U2",
"U3",
"IsingXX",
"IsingYY",
"IsingZZ",
]
)
"""Attribute: Operations that support parameter broadcasting.
For such operations, the input parameters are allowed to have a single leading additional
broadcasting dimension, creating the operation with a ``batch_size`` and leading to
broadcasted tapes when used in a ``QuantumTape``.
"""