-
Notifications
You must be signed in to change notification settings - Fork 127
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Save experiment and analysis config as artifacts #1424
base: main
Are you sure you want to change the base?
Save experiment and analysis config as artifacts #1424
Conversation
Added load method to `BaseExperiment` Added backend initialization to the experiment if available. `from_config` now also loads analysis if available.
Co-authored-by: ItamarGoldman <itamargoldman@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's only a partial review for now. I'm publishing it to emphasize some possible issues that I've spotted in BaseAnalysis.run
.
I'm intentionally not checking the Request changes
box, because I'm going on vacation in a few days.
|
||
Raises: | ||
QiskitError: If the experiment class is not stored, | ||
QiskitError: If the drawer class is not stored, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replace "successful" with "successfully" in the subsequent line
type is currently stored, or the :meth:`~.BaseExperiment.from_config` class method of the | ||
appropriate experiment. | ||
This stores the current configuration of a :class:`.BaseDrawer` and can be used to reconstruct | ||
the drawer class using either the :meth:`drawer` property if the drawer class type is |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Property of which class? And how is it related to whether the class type is stored?
def config(self) -> AnalysisConfig: | ||
"""Return the config dataclass for this analysis. We replicate the method from `BaseAnalysis` | ||
because we cannot directly modify the returned object. This limitation arises from our use of | ||
it hashable we use `dataclasses.dataclass(frozen=True)` to ensure its hashability. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The last sentence is grammatically incorrect (and indeed it's not clear what it's trying to say).
else: | ||
self._artifacts[artifact.artifact_id] = artifact | ||
else: | ||
self._artifacts[artifact.artifact_id] = artifact |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The logic here is somewhat strange and unintuitive, I don't know if it's documented anywhere, and it's not clear where it comes from. From the code I see that multiple, say five calls to add_artifacts
with overwrite_name=False
will create a collection of five different artifacts having the same name. Until the sixth call sets overweite_name
to True
, in this case all the five existing artifacts with that name will remain in the collection, but will be overridden by the new artifact. However a sixth artifact with artifact_id
will not be added.
expdata.add_artifacts(result) | ||
if ( | ||
result.name != "analysis_config" | ||
): # Not saving constituent analysis configs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the flow that will make _run_analysis
return an analysis config?
ArtifactData(name="analysis_config", data=self.config()), | ||
overwrite_name=True, | ||
overwrite_id=True, | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If replace_results
is False
then we create a copy of the ExperimentData
object. Both the original object and its copy will now contain the analysis config artifact, although the original one will not contain the corresponding analysis result.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm also concerned about the old analysis config that's stored in the experiment data and is now overridden. The concern applies for the two possible values of replace_results
.
args = tuple(getattr(self, "__init_args__", OrderedDict()).values()) | ||
kwargs = dict(getattr(self, "__init_kwargs__", OrderedDict())) | ||
# Only store non-default valued options | ||
options = dict((key, getattr(self._options, key)) for key in self._set_options) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BaseAnalysis
and BaseCurveAnalysis
should call a common helper function, instead of duplicating the code. This is possible in spite of everything being hashable.
|
||
@dataclasses.dataclass(frozen=True) | ||
class PlotterConfig: | ||
"""Store configuration settings for a Drawer class. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"""Store configuration settings for a Drawer class. | |
"""Store configuration settings for a Plotter class. |
cls: type = None | ||
options: Dict[str, Any] = dataclasses.field(default_factory=dict) | ||
figure_options: Dict[str, Any] = dataclasses.field(default_factory=dict) | ||
# drawer: Dict[str, Any] = dataclasses.field(default_factory=dict) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# drawer: Dict[str, Any] = dataclasses.field(default_factory=dict) |
"""Store configuration settings for a Drawer class. | ||
|
||
This stores the current configuration of a :class:`.BasePlotter` and can be used to reconstruct | ||
the plotter class using either the :meth:`plotter` property if the plotter class type is |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as before.
|
||
Raises: | ||
QiskitError: If the plotter class is not stored, | ||
was not successful deserialized, or reconstruction |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
was not successful deserialized, or reconstruction | |
was not successfully deserialized, or reconstruction |
if not service: | ||
raise QiskitError("A service must be provided to save the experiment.") | ||
if not self.backend and not backend: | ||
raise QiskitError("Backend must be set to save the experiment.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is the backend mandatory?
f"but you have versions {cur_versions}." | ||
) | ||
try: | ||
reconstructed_experiment = experiment_config.cls.from_config(experiment_config) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just out of curiosity, does the cls
here have any effect? Or is it only for readability?
@@ -2404,7 +2409,7 @@ def copy(self, copy_results: bool = True) -> "ExperimentData": | |||
|
|||
with self._artifacts.lock: | |||
new_instance._artifacts = ThreadSafeOrderedDict() | |||
new_instance.add_artifacts(self._artifacts.values()) | |||
new_instance.add_artifacts(self._artifacts.values(), overwrite_id=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does it matter?
@@ -552,27 +554,34 @@ def image( | |||
def figure(self): | |||
"""Return figure object handler to be saved in the database.""" | |||
|
|||
def config(self) -> Dict: | |||
def config(self) -> DrawerConfig: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an interface change, do we have to follow some type of deprecation policy?
if "figure_options" in value: | ||
instance.set_figure_options(**value["figure_options"]) | ||
return instance | ||
return cls.from_config(value) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that we have configs, do we still need __json_encode__
and __json_decode__
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If these are still required, with __json_encode__
now returning PlotterConfig
and not dict, would that work?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Various questions and comments at this stage. I couldn't fully test it due to issues with BaseExperiment.load
; once solved I'll run some manual tests
@@ -553,36 +554,39 @@ def _configure_drawer(self): | |||
# Use drawer.set_figure_options so figure options are serialized. | |||
self.drawer.set_figure_options(**_drawer_figure_options) | |||
|
|||
def config(self) -> Dict: | |||
def config(self) -> PlotterConfig: | |||
"""Return the config dictionary for this drawing.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"""Return the config dictionary for this drawing.""" | |
"""Return the config dataclass for this plotter.""" |
|
||
@classmethod | ||
def from_config(cls, config: Union[PlotterConfig, Dict]) -> "BasePlotter": | ||
"""Initialize a plotter class from analysis config""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"""Initialize a plotter class from analysis config""" | |
"""Initialize a plotter class from a plotter config""" |
@@ -552,27 +554,34 @@ def image( | |||
def figure(self): | |||
"""Return figure object handler to be saved in the database.""" | |||
|
|||
def config(self) -> Dict: | |||
def config(self) -> DrawerConfig: | |||
"""Return the config dictionary for this drawer.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"""Return the config dictionary for this drawer.""" | |
"""Return the config dataclass for this drawer.""" |
} | ||
@classmethod | ||
def from_config(cls, config: Union[DrawerConfig, Dict]) -> "BaseDrawer": | ||
"""Initialize a drawer class from analysis config""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"""Initialize a drawer class from analysis config""" | |
"""Initialize a drawer class from a drawer config""" |
ret.set_options(**config.options) | ||
if config.figure_options: | ||
ret.set_figure_options(**config.figure_options) | ||
return ret | ||
|
||
def __json_encode__(self): | ||
return self.config() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With this now returning DrawerConfig
and not a dict, would this work?
if "figure_options" in value: | ||
instance.set_figure_options(**value["figure_options"]) | ||
return instance | ||
return cls.from_config(value) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If these are still required, with __json_encode__
now returning PlotterConfig
and not dict, would that work?
from qiskit_experiments.framework import BaseExperiment, ExperimentData | ||
|
||
|
||
def load_all( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would change name to something a bit more informative, like load_experiment_and_data
@@ -422,3 +424,36 @@ def _initialize( | |||
DeprecationWarning, | |||
) | |||
self.set_options(data_subfit_map=data_subfit_map) | |||
|
|||
def config(self) -> AnalysisConfig: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we have config
/from_config
with plotter handling specifically here in BaseCurveAnalysis
and not in BaseAnalysis
? What about analyzes not derived from BaseCurveAnalysis
, e.g. TomographyAnalysis
or StarkP1SpectAnalysis
?
data = service.experiment(experiment_id, json_decoder=ExperimentDecoder) | ||
|
||
# Recreate artifacts | ||
experiment_config_filename = "experiment_config.zip" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better store the name in one place to be used here and in `ExperimentData.init" (maybe in helpers.py to avoid cyclic dependency)
@@ -171,13 +179,20 @@ def config(self) -> ExperimentConfig: | |||
(key, getattr(self._transpile_options, key)) for key in self._set_transpile_options | |||
) | |||
run_options = dict((key, getattr(self._run_options, key)) for key in self._set_run_options) | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is an issue with config handling when it comes to serializing a backend object when one is provided in the __init__
method. In this case, the backend is included in the config e.g. via kwargs
and is serialized in way that upon deserialization (through BaseExperiment.load
) the following warning is emitted:
/tmp/qiskit-experiments/qiskit_experiments/framework/json.py:581: UserWarning: Could not deserialize instance of class <class 'qiskit_ibm_runtime.ibm_backend.IBMBackend'> from settings {}. Value is {'class': <class 'qiskit_ibm_runtime.ibm_backend.IBMBackend'>, 'settings': {}, 'version': '0.23.0'}
The following exception was raised:
Traceback (most recent call last):
File "/tmp/qiskit-experiments/qiskit_experiments/framework/json.py", line 338, in _deserialize_object
return cls(**settings)
TypeError: IBMBackend.init() missing 3 required positional arguments: 'configuration', 'service', and 'api_client'return _deserialize_object(obj_val)
The experiment still loads, but this warning is annoying.
cls, | ||
experiment_id: str, | ||
service: Optional[IBMExperimentService] = None, | ||
provider: Optional[Provider] = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If service
it not None, but provider
is None, we get this warning:
/tmp/qiskit-experiments/qiskit_experiments/framework/base_experiment.py:286: UserWarning: Unable to retrieve backend.
warnings.warn("Unable to retrieve backend.")
I think this should be fixed - either don't allow an empty provider or update the doc string to explain the behavior when provider
is empty. Currently the doc string explicitly says that IBMProvider
is required
Summary
Adds
BaseExperiment.load
andBaseExperiment.save
methods and a helper functionload_all()
that loads both experiment and experiment data, with the option to run analysis:This PR incorporates changes from @ItamarGoldman's in #1423. To enable saving and loading, this PR saves experiment and analysis config as
experiment_config
andanalysis_config
in experiment data artifacts. Becauseanalysis_options
shouldn't be added as a new artifact every time analysis is rerun, this also addsoverwrite_name
option toadd_artifacts()
and renames the currentoverwrite
option tooverwrite_id
. have been incorporated.To-do
Notes
load_all()
is inefficient becauseBaseExperiment.load()
andExperimentData.load()
make separate and identical calls to the experiment service. Ideally this would be combined in a separate interface class in a future refactoring.