Chunk Hamiltonian, PauliSentence, LinearCombination [sc-65680] #873

Merged: 37 commits merged on Sep 5, 2024
Changes from 16 commits
Commits (37)
e92539e
WIP
vincentmr Aug 26, 2024
891c165
WIP
vincentmr Aug 26, 2024
e285aaf
Simplify
vincentmr Aug 27, 2024
0556f34
Auto update version from '0.38.0-dev49' to '0.38.0-dev52'
ringo-but-quantum Aug 27, 2024
6e19fc8
Merge branch 'master' into chunck_hamiltonian
vincentmr Aug 27, 2024
352cfca
Apply suggestions from code review [skip ci]
vincentmr Aug 28, 2024
742f115
Fix split_obs [skip ci].
vincentmr Aug 28, 2024
db789b2
Remove obsolete _chunk_iterable [skip ci].
vincentmr Aug 28, 2024
099a641
Fix pylint warnings [skip ci].
vincentmr Aug 28, 2024
070fabd
Fix docstring [skip ci].
vincentmr Aug 28, 2024
21091b1
Remove obsolete unreachable branches. [skip ci]
vincentmr Aug 28, 2024
f000217
Simplify tests [skip ci]
vincentmr Aug 28, 2024
5bca9b9
trigger ci
vincentmr Aug 28, 2024
308dbb3
Remove property.
vincentmr Aug 28, 2024
f0d7f42
Update changelog
vincentmr Aug 28, 2024
3593d43
Update jac shape.
vincentmr Aug 28, 2024
e8d6682
Merge branch 'master' into chunck_hamiltonian
vincentmr Aug 29, 2024
a88f2aa
Auto update version from '0.38.0-dev52' to '0.38.0-dev53'
ringo-but-quantum Aug 29, 2024
c94d55e
Optimize applyInPlace hamiltonian.
vincentmr Aug 29, 2024
d5f434d
Remove obsolete function.
vincentmr Aug 29, 2024
cbdf0e9
Fix import
vincentmr Aug 29, 2024
c926b55
Fix TestSerializeObs
vincentmr Aug 29, 2024
24a33b5
Trigger CIs
AmintorDusko Aug 30, 2024
f4adcda
Remove LAPACK=ON from LGPU C++ tests.
vincentmr Aug 30, 2024
5b1ac6c
Merge branch 'master' into chunck_hamiltonian
vincentmr Aug 30, 2024
607ba06
Auto update version from '0.38.0-dev53' to '0.38.0-dev54'
ringo-but-quantum Aug 30, 2024
f77262c
trigger CIs
AmintorDusko Sep 3, 2024
7a56ee5
trigger ci
vincentmr Sep 3, 2024
5713ece
Merge remote-tracking branch 'origin/master' into chunck_hamiltonian
vincentmr Sep 3, 2024
47e2ef2
Revert pyproject
vincentmr Sep 3, 2024
5715a1c
Update version.
vincentmr Sep 3, 2024
3687d1e
Auto update version from '0.39.0-dev0' to '0.39.0-dev1'
ringo-but-quantum Sep 3, 2024
82af6ec
Merge remote-tracking branch 'origin/master' into chunck_hamiltonian
vincentmr Sep 5, 2024
3b8600e
Auto update version from '0.39.0-dev2' to '0.39.0-dev3'
ringo-but-quantum Sep 5, 2024
66ff9b4
Merge branch 'master' into chunck_hamiltonian
vincentmr Sep 5, 2024
1570216
Auto update version from '0.39.0-dev3' to '0.39.0-dev4'
ringo-but-quantum Sep 5, 2024
f4b8425
Import getenv only
vincentmr Sep 5, 2024
3 changes: 3 additions & 0 deletions .github/CHANGELOG.md
@@ -36,6 +36,9 @@

### Improvements

* Smarter defaults for the `split_obs` argument in the serializer. The serializer now splits linear combinations into a fixed number of chunks instead of into all of their individual terms (illustrated in the sketch below).
[(#873)](https://github.com/PennyLaneAI/pennylane-lightning/pull/873/)

* Updated calls of ``size_t`` to ``std::size_t`` everywhere.
[(#816)](https://github.com/PennyLaneAI/pennylane-lightning/pull/816/)

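To make the new behaviour concrete, here is a standalone sketch that mirrors the `_chunk_iterable` and `_chunk_ham_terms` helpers added in `pennylane_lightning/core/_serialize.py` below; it is illustrative only, not the exact library code.

```python
from itertools import islice


def _chunk_iterable(iteration, num_chunks):
    # Lazily yield tuples of at most `num_chunks` elements
    # (pattern from https://stackoverflow.com/a/22045226).
    iteration = iter(iteration)
    return iter(lambda: tuple(islice(iteration, num_chunks)), ())


def _chunk_ham_terms(coeffs, ops, split_num=1):
    # Split a (coeffs, ops) pair into `split_num` roughly equal sub-Hamiltonians.
    num_terms = len(coeffs)
    step_size = num_terms // split_num + bool(num_terms % split_num)
    return list(_chunk_iterable(coeffs, step_size)), list(_chunk_iterable(ops, step_size))


coeffs = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7]
ops = ["Z0", "Z1", "X0@X1", "Y0", "Y1", "Z0@Z1", "X0"]
c_chunks, o_chunks = _chunk_ham_terms(coeffs, ops, split_num=3)
print(c_chunks)  # [(0.1, 0.2, 0.3), (0.4, 0.5, 0.6), (0.7,)]
print(o_chunks)  # [('Z0', 'Z1', 'X0@X1'), ('Y0', 'Y1', 'Z0@Z1'), ('X0',)]
```

With an integer `split_obs` (here 3), a seven-term linear combination is serialized as three sub-Hamiltonians rather than as seven single-term observables, which is what `split_obs=True` used to produce.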
49 changes: 38 additions & 11 deletions pennylane_lightning/core/_serialize.py
@@ -14,6 +14,7 @@
r"""
Helper functions for serializing quantum tapes.
"""
from itertools import islice
from typing import List, Sequence, Tuple

import numpy as np
@@ -47,14 +48,20 @@
}


def _chunk_iterable(iteration, num_chunks):
"""Lazy-evaluated chunking of given iterable from https://stackoverflow.com/a/22045226"""
iteration = iter(iteration)
return iter(lambda: tuple(islice(iteration, num_chunks)), ())


class QuantumScriptSerializer:
"""Serializer class for `pennylane.tape.QuantumScript` data.

Args:
device_name: device shortname.
use_csingle (bool): whether to use np.complex64 instead of np.complex128
use_mpi (bool, optional): If using MPI to accelerate calculation. Defaults to False.
split_obs (bool, optional): If splitting the observables in a list. Defaults to False.
split_obs (Union[bool, int], optional): Whether to split the observables into a list, or the number of chunks to split them into. Defaults to False.

"""

@@ -214,17 +221,34 @@
obs = observable.obs if isinstance(observable, Tensor) else observable.operands
return self.tensor_obs([self._ob(o, wires_map) for o in obs])

def _chunk_ham_terms(self, coeffs, ops, split_num: int = 1) -> List:
"Create split_num sub-Hamiltonians from a single high term-count Hamiltonian"
num_terms = len(coeffs)
step_size = num_terms // split_num + bool(num_terms % split_num)
c_coeffs = list(_chunk_iterable(coeffs, step_size))
c_ops = list(_chunk_iterable(ops, step_size))
return c_coeffs, c_ops

def _hamiltonian(self, observable, wires_map: dict = None):
coeffs, ops = observable.terms()
coeffs = np.array(unwrap(coeffs)).astype(self.rtype)
if self.split_obs:
ops_l = []
for t in ops:
term_cpp = self._ob(t, wires_map)
if isinstance(term_cpp, Sequence):
ops_l.extend(term_cpp)
else:
ops_l.append(term_cpp)
c, o = self._chunk_ham_terms(coeffs, ops_l, self.split_obs)
hams = [self.hamiltonian_obs(c_coeffs, c_obs) for (c_coeffs, c_obs) in zip(c, o)]
return hams

terms = [self._ob(t, wires_map) for t in ops]
# TODO: This is in case `_hamiltonian` is called recursively which would cause a list
# to be passed where `_ob` expects an observable.
terms = [t[0] if isinstance(t, Sequence) and len(t) == 1 else t for t in terms]

if self.split_obs:
return [self.hamiltonian_obs([c], [t]) for (c, t) in zip(coeffs, terms)]

return self.hamiltonian_obs(coeffs, terms)

def _sparse_hamiltonian(self, observable, wires_map: dict = None):
@@ -282,11 +306,14 @@
terms = [self._pauli_word(pw, wires_map) for pw in pwords]
coeffs = np.array(coeffs).astype(self.rtype)

if self.split_obs:
c, o = self._chunk_ham_terms(coeffs, terms, self.split_obs)
psentences = [self.hamiltonian_obs(c_coeffs, c_obs) for (c_coeffs, c_obs) in zip(c, o)]
return psentences

if len(terms) == 1 and coeffs[0] == 1.0:
return terms[0]

if self.split_obs:
return [self.hamiltonian_obs([c], [t]) for (c, t) in zip(coeffs, terms)]
return self.hamiltonian_obs(coeffs, terms)

# pylint: disable=protected-access, too-many-return-statements
@@ -326,17 +353,17 @@
"""

serialized_obs = []
offset_indices = [0]
obs_indices = []

for observable in tape.observables:
for i, observable in enumerate(tape.observables):
ser_ob = self._ob(observable, wires_map)
if isinstance(ser_ob, list):
serialized_obs.extend(ser_ob)
offset_indices.append(offset_indices[-1] + len(ser_ob))
obs_indices.extend([i] * len(ser_ob))
else:
serialized_obs.append(ser_ob)
offset_indices.append(offset_indices[-1] + 1)
return serialized_obs, offset_indices
obs_indices.append(i)
return serialized_obs, obs_indices

def serialize_ops(self, tape: QuantumTape, wires_map: dict = None) -> Tuple[
List[List[str]],
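A hedged usage sketch of the updated serializer interface follows. It assumes a local `pennylane-lightning` build with compiled bindings; the device name, circuit, and wire map are placeholder choices for illustration, not taken from this PR.

```python
import pennylane as qml

from pennylane_lightning.core._serialize import QuantumScriptSerializer

# A four-term Hamiltonian and a single-term observable on two wires.
ham = qml.Hamiltonian(
    [0.3, 0.5, 0.2, 0.4],
    [qml.PauliZ(0), qml.PauliX(1), qml.PauliY(0), qml.PauliZ(1)],
)
tape = qml.tape.QuantumScript(
    [qml.RX(0.1, wires=0)], [qml.expval(ham), qml.expval(qml.PauliZ(0))]
)

# An integer split_obs asks the serializer to split the Hamiltonian into that
# many chunks instead of into its individual terms.
serializer = QuantumScriptSerializer(
    "lightning.qubit", use_csingle=False, use_mpi=False, split_obs=2
)
wires_map = {0: 0, 1: 1}
serialized_obs, obs_indices = serializer.serialize_observables(tape, wires_map)
# obs_indices maps each serialized (sub-)observable back to its tape observable,
# e.g. [0, 0, 1]: two chunks of `ham` followed by the PauliZ(0) expectation.
```

The second return value replaces the former `offset_indices`, as shown in `lightning_base.py` below.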
2 changes: 1 addition & 1 deletion pennylane_lightning/core/_version.py
@@ -16,4 +16,4 @@
Version number (major.minor.patch[-label])
"""

__version__ = "0.38.0-dev51"
__version__ = "0.38.0-dev52"
21 changes: 10 additions & 11 deletions pennylane_lightning/core/lightning_base.py
@@ -16,8 +16,8 @@
This module contains the base class for all PennyLane Lightning simulator devices,
and interfaces with C++ for improved performance.
"""
from itertools import islice, product
from typing import List
from itertools import product
from typing import List, Union

import numpy as np
import pennylane as qml
@@ -31,12 +31,6 @@
from ._version import __version__


def _chunk_iterable(iteration, num_chunks):
"Lazy-evaluated chunking of given iterable from https://stackoverflow.com/a/22045226"
iteration = iter(iteration)
return iter(lambda: tuple(islice(iteration, num_chunks)), ())


class LightningBase(QubitDevice):
"""PennyLane Lightning Base device.

@@ -262,11 +256,16 @@ def _get_basis_state_index(self, state, wires):

# pylint: disable=too-many-function-args, assignment-from-no-return, too-many-arguments
def _process_jacobian_tape(
self, tape, starting_state, use_device_state, use_mpi: bool = False, split_obs: bool = False
self,
tape,
starting_state,
use_device_state,
use_mpi: bool = False,
split_obs: Union[bool, int] = False,
):
state_vector = self._init_process_jacobian_tape(tape, starting_state, use_device_state)

obs_serialized, obs_idx_offsets = QuantumScriptSerializer(
obs_serialized, obs_indices = QuantumScriptSerializer(
self.short_name, self.use_csingle, use_mpi, split_obs
).serialize_observables(tape, self.wire_map)

@@ -309,7 +308,7 @@ def _process_jacobian_tape(
"tp_shift": tp_shift,
"record_tp_rows": record_tp_rows,
"all_params": all_params,
"obs_idx_offsets": obs_idx_offsets,
"obs_indices": obs_indices,
}

@staticmethod
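The switch from cumulative `offset_indices` to per-entry `obs_indices` in `_process_jacobian_tape` can be summarised with a small standalone example (the values are illustrative, not taken from a real tape):

```python
# Suppose tape.observables = [H, Z] and the serializer splits H into three chunks.

# Old bookkeeping: cumulative offsets into the flat list of serialized observables;
# observable i occupied slots offset_indices[i]:offset_indices[i + 1].
offset_indices = [0, 3, 4]

# New bookkeeping: one entry per serialized (sub-)observable, pointing back to
# the tape observable it came from.
obs_indices = [0, 0, 0, 1]

# Rows of the split Jacobian belonging to observable 0 are now found by a simple
# membership test instead of slicing between consecutive offsets.
rows_for_obs0 = [k for k, idx in enumerate(obs_indices) if idx == 0]
print(rows_for_obs0)  # [0, 1, 2]
```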
80 changes: 24 additions & 56 deletions pennylane_lightning/lightning_gpu/lightning_gpu.py
@@ -30,6 +30,7 @@
from pennylane.measurements import Expectation, State
from pennylane.ops.op_math import Adjoint
from pennylane.wires import Wires
from scipy.sparse import csr_matrix

from pennylane_lightning.core._serialize import QuantumScriptSerializer, global_phase_diagonal
from pennylane_lightning.core._version import __version__
@@ -633,8 +634,12 @@
# Check adjoint diff support
self._check_adjdiff_supported_operations(tape.operations)

if self._mpi:
split_obs = False # with MPI, batched mode computes the Jacobian one observable at a time, so there is no point in splitting linear combinations
else:
split_obs = self._dp.getTotalDevices() if self._batch_obs else False
processed_data = self._process_jacobian_tape(
tape, starting_state, use_device_state, self._mpi, self._batch_obs
tape, starting_state, use_device_state, self._mpi, split_obs
)

if not processed_data: # training_params is empty
@@ -653,68 +658,31 @@
adjoint_jacobian = _adj_dtype(self.use_csingle, self._mpi)()

if self._batch_obs: # Batching of Measurements
if not self._mpi: # Single-node path, controlled batching over available GPUs
num_obs = len(processed_data["obs_serialized"])
batch_size = (
num_obs
if isinstance(self._batch_obs, bool)
else self._batch_obs * self._dp.getTotalDevices()
)
jac = []
for chunk in range(0, num_obs, batch_size):
obs_chunk = processed_data["obs_serialized"][chunk : chunk + batch_size]
jac_chunk = adjoint_jacobian.batched(
self._gpu_state,
obs_chunk,
processed_data["ops_serialized"],
trainable_params,
)
jac.extend(jac_chunk)
else: # MPI path, restrict memory per known GPUs
jac = adjoint_jacobian.batched(
self._gpu_state,
processed_data["obs_serialized"],
processed_data["ops_serialized"],
trainable_params,
)

jac = adjoint_jacobian.batched(
self._gpu_state,
processed_data["obs_serialized"],
processed_data["ops_serialized"],
trainable_params,
)
else:
jac = adjoint_jacobian(
self._gpu_state,
processed_data["obs_serialized"],
processed_data["ops_serialized"],
trainable_params,
)

jac = np.array(jac) # only for parameters differentiable with the adjoint method
jac = jac.reshape(-1, len(trainable_params))
jac_r = np.zeros((len(tape.observables), processed_data["all_params"]))
if not self._batch_obs:
jac_r[:, processed_data["record_tp_rows"]] = jac
else:
# Reduce over decomposed expval(H), if required.
for idx in range(len(processed_data["obs_idx_offsets"][0:-1])):
if (
processed_data["obs_idx_offsets"][idx + 1]
- processed_data["obs_idx_offsets"][idx]
) > 1:
jac_r[idx, :] = np.sum(
jac[
processed_data["obs_idx_offsets"][idx] : processed_data[
"obs_idx_offsets"
][idx + 1],
:,
],
axis=0,
)
else:
jac_r[idx, :] = jac[
processed_data["obs_idx_offsets"][idx] : processed_data["obs_idx_offsets"][
idx + 1
],
:,
]

jac = np.array(jac)
has_shape0 = bool(len(jac))

num_obs = len(np.unique(processed_data["obs_indices"]))
rows = processed_data["obs_indices"]
cols = np.arange(len(rows), dtype=int)
data = np.ones(len(rows))
red_mat = csr_matrix((data, (rows, cols)), shape=(num_obs, len(rows)))
jac = red_mat @ jac.reshape((len(rows), -1))
jac = jac.reshape(-1, len(trainable_params)) if has_shape0 else jac
jac_r = np.zeros((jac.shape[0], processed_data["all_params"]))
jac_r[:, processed_data["record_tp_rows"]] = jac
return self._adjoint_jacobian_processing(jac_r)

# pylint: disable=inconsistent-return-statements, line-too-long, missing-function-docstring
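The `csr_matrix` contraction introduced above replaces the explicit offset-based summation loop. The sketch below reproduces the same reduction on toy numbers (shapes and values are illustrative only):

```python
import numpy as np
from scipy.sparse import csr_matrix

# Four serialized (sub-)observables mapping back to two tape observables, and a
# fake adjoint Jacobian with 3 trainable parameters, flattened as the C++ kernel
# returns it.
obs_indices = [0, 0, 0, 1]
jac = np.arange(12, dtype=float)

num_obs = len(np.unique(obs_indices))  # 2
rows = obs_indices
cols = np.arange(len(rows), dtype=int)
data = np.ones(len(rows))
red_mat = csr_matrix((data, (rows, cols)), shape=(num_obs, len(rows)))

# Summing the rows that share a tape-observable index recombines the chunks of a
# split Hamiltonian into a single Jacobian row per observable.
reduced = red_mat @ jac.reshape((len(rows), -1))
print(reduced.shape)  # (2, 3)
print(reduced[0])     # [ 9. 12. 15.], the sum of the first three sub-rows
```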
36 changes: 9 additions & 27 deletions pennylane_lightning/lightning_kokkos/lightning_kokkos.py
@@ -19,7 +19,6 @@

import os
import sys
from os import getenv
from pathlib import Path
from typing import List
from warnings import warn
@@ -34,7 +33,7 @@

from pennylane_lightning.core._serialize import QuantumScriptSerializer, global_phase_diagonal
from pennylane_lightning.core._version import __version__
from pennylane_lightning.core.lightning_base import LightningBase, _chunk_iterable
from pennylane_lightning.core.lightning_base import LightningBase

try:
# pylint: disable=import-error, no-name-in-module
@@ -194,7 +193,7 @@
batch_obs=False,
kokkos_args=None,
): # pylint: disable=unused-argument, too-many-arguments
super().__init__(wires, shots=shots, c_dtype=c_dtype)
super().__init__(wires, shots=shots, c_dtype=c_dtype, batch_obs=batch_obs)

if kokkos_args is None:
self._kokkos_state = _kokkos_dtype(c_dtype)(self.num_wires)
@@ -717,33 +716,16 @@

trainable_params = processed_data["tp_shift"]

# If requested batching over observables, chunk into OMP_NUM_THREADS sized chunks.
# This will allow use of Lightning with adjoint for large-qubit numbers AND large
# numbers of observables, enabling choice between compute time and memory use.
requested_threads = int(getenv("OMP_NUM_THREADS", "1"))

adjoint_jacobian = AdjointJacobianC64() if self.use_csingle else AdjointJacobianC128()

if self._batch_obs and requested_threads > 1: # pragma: no cover
obs_partitions = _chunk_iterable(processed_data["obs_serialized"], requested_threads)
jac = []
for obs_chunk in obs_partitions:
jac_local = adjoint_jacobian(
processed_data["state_vector"],
obs_chunk,
processed_data["ops_serialized"],
trainable_params,
)
jac.extend(jac_local)
else:
jac = adjoint_jacobian(
processed_data["state_vector"],
processed_data["obs_serialized"],
processed_data["ops_serialized"],
trainable_params,
)
jac = adjoint_jacobian(
processed_data["state_vector"],
processed_data["obs_serialized"],
processed_data["ops_serialized"],
trainable_params,
)
jac = np.array(jac)
jac = jac.reshape(-1, len(trainable_params))
jac = jac.reshape(-1, len(trainable_params)) if len(jac) else jac
jac_r = np.zeros((jac.shape[0], processed_data["all_params"]))
jac_r[:, processed_data["record_tp_rows"]] = jac
if hasattr(qml, "active_return"): # pragma: no cover