Allowing more flexible cost functions for optimizers #959
Conversation
Implements the basic idea from Issue #683.
Only tests and changelog to go.
…ptimize_more_parameters
Codecov Report
    @@            Coverage Diff             @@
    ##           master     #959      +/-   ##
    ==========================================
    + Coverage   97.86%   97.88%   +0.01%
    ==========================================
      Files         150      150
      Lines       10876    10964      +88
    ==========================================
    + Hits        10644    10732      +88
      Misses        232      232
Continue to review full report at Codecov.
Great work @albi3ro 🎉
I've left a few comments, mostly relating to comments, rearranging of logic, docstrings, and tests, but I didn't spot anything major.
In general, I really like your solution here.
- `opt.step(cost, x, y, z, data)` is a nice extension of the existing syntax (`opt.step(cost, x)`). The logic extension is also quite straightforward: we simply loop over all `*args` and update those that are trainable. I think this is very intuitive, since a non-trainable parameter has a 'gradient' of 0, and you wouldn't expect the parameter to be updated anyway (see the sketch after this list).
- It is also backwards compatible with the old approach! Which is really nice. I've created a PR over in the QML repo (Test new optimizers qml#189) to run all the demonstrations using this branch, to see if there are any issues (the QML demos are a good real-world test of user-facing changes). As I write this, 73% of all demos have passed, with no failures 🤞
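A minimal sketch of the extended syntax (the cost function, values, and stepsize here are hypothetical, not taken from the PR):

```python
from pennylane import numpy as np
from pennylane.optimize import GradientDescentOptimizer

def cost(x, y, z, data):
    # simple quadratic cost; `data` is not meant to be optimized
    return np.sum((x + y + z - data) ** 2)

x = np.array(0.1, requires_grad=True)
y = np.array(0.2, requires_grad=True)
z = np.array(0.3, requires_grad=True)
data = np.array(1.0, requires_grad=False)

opt = GradientDescentOptimizer(stepsize=0.1)

# all arguments are passed positionally; only those with requires_grad=True
# are updated, and `data` is returned unchanged
x, y, z, data = opt.step(cost, x, y, z, data)
```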
I only have two questions regarding the custom `grad_fn`:

- I'm curious why the tests passed before, since the test data did not previously conform to the same output shape as `qml.grad` (see the sketch after this list).
- I'm curious if there is a path to retaining support for custom `grad_fn`s with scalar outputs. I don't think this is too important, since use of custom grad functions is rare. In the future, however, we might want to write the optimizers to support Torch/TF, so it might be something we re-consider down the line.
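For reference, a rough sketch of the expected `grad_fn` output shape after this change, based on the updated `grad_uni_fns` test fixtures discussed later in this thread (the cost function here is hypothetical): the custom gradient returns a tuple with one entry per trainable argument, mirroring `qml.grad`, rather than a bare scalar.

```python
from pennylane import numpy as np
from pennylane.optimize import GradientDescentOptimizer

def cost(x):
    return np.sin(x)

def grad_fn(x):
    # tuple output, one entry per trainable argument, matching qml.grad
    return (np.cos(x),)

opt = GradientDescentOptimizer(stepsize=0.1)
x = np.array(0.5, requires_grad=True)
x = opt.step(cost, x, grad_fn=grad_fn)
```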
tests/test_optimize.py (outdated):

    @pytest.mark.parametrize(
        "opt",
        [
            GradientDescentOptimizer(stepsize),
            MomentumOptimizer(stepsize, momentum=gamma),
            NesterovMomentumOptimizer(stepsize, momentum=gamma),
            AdagradOptimizer(stepsize),
            RMSPropOptimizer(stepsize, decay=gamma),
            AdamOptimizer(stepsize, beta1=gamma, beta2=delta),
            RotosolveOptimizer(),
        ],
    )
    class Test_over_opts:
        def test_kwargs(self, opt, tol):
I think there is an indentation issue here; this class/decorator should be at the module level (no indentation).
Actually, I'm a bit surprised this works! I didn't think pytest would find/test nested classes.
So I should take it out of the `TestOptimizer` class?
I think what @josh146 meant was to make the new class another stand-alone class, just like `TestOptimizer`.
Yep!
will do.
tests/test_optimize.py (outdated):

    if getattr(opt, "reset", None):
        opt.reset()

    assert x_new_one != pytest.approx(x_new_two, abs=tol)
Nice test!
Perhaps we can be a bit more explicit, though, in checking that the cost function was called with the correct arguments. The `mocker` pytest fixture can help us do this:
    def test_kwargs(self, opt, tol, mocker):
        """Test that the keywords get passed and alter the function"""

        class Cost:
            @staticmethod
            def kwargs_func(x, c=1.0):
                return (x - c) ** 2

        cost = Cost()
        spy = mocker.spy(cost, "kwargs_func")

        x_new_one = opt.step(cost.kwargs_func, 1.0, c=1.0)
        assert spy.call_args_list[-1][1]["c"] == 1.0

        x_new_two = opt.step(cost.kwargs_func, 1.0, c=2.0)
        assert spy.call_args_list[-1][1]["c"] == 2.0

        if getattr(opt, "reset", None):
            opt.reset()

        assert x_new_one != pytest.approx(x_new_two, abs=tol)
`mocker.spy` creates a new `Spy` object that we can use to check the arguments/keyword arguments it was called with.
The reason I wrapped the cost function in a class is that I suddenly realised I don't know how to spy on a pure function 🤔 @antalszava @thisac do you guys have any idea?
Hmm, not quite sure actually! It seems to be good when there's a module object to pass as the 1st argument, so e.g., `mocker.spy(qml, "about")` should be fine (see pytest-dev/pytest-mock#97 (comment)).
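One possible workaround, not from this thread, just a sketch with hypothetical names: spy on the pure function through the module object that owns it (here via `sys.modules`), since `mocker.spy` only needs an object plus an attribute name.

```python
import sys

def pure_cost(x, c=1.0):
    return (x - c) ** 2

def test_spy_on_module_level_function(mocker):
    # patch the function on its owning module; the global name lookup inside
    # the test then resolves to the spy
    spy = mocker.spy(sys.modules[__name__], "pure_cost")

    pure_cost(1.0, c=2.0)

    assert spy.call_args[1]["c"] == 2.0
```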
tests/test_optimize.py (outdated):

    """Test multiple arguments to function"""
    x_new, y_new = opt.step(func, *args)
Spy could also be used here to ensure that the function was called with the args you expect
tests/test_optimize.py (outdated):

    assert len(args_new) == pytest.approx(2, abs=tol)
    assert args_new[0] != pytest.approx(x, abs=tol)
    assert args_new[1] == pytest.approx(data, abs=tol)
Nice!
An additional test you could add is one like in the changelog example; a cost function with trainable, non-trainable, and keyword arguments.
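Something along these lines, perhaps (a rough sketch with hypothetical names, not the test that ended up in the PR):

```python
from pennylane import numpy as np
from pennylane.optimize import GradientDescentOptimizer

def test_trainable_nontrainable_and_kwargs(tol=1e-6):
    """Only the trainable argument is updated; data and keyword args pass through."""

    def cost(x, data, c=1.0):
        return c * np.sum((x - data) ** 2)

    x = np.array([1.0, 2.0], requires_grad=True)
    data = np.array([0.5, 0.5], requires_grad=False)

    opt = GradientDescentOptimizer(stepsize=0.1)
    x_new, data_new = opt.step(cost, x, data, c=2.0)

    # the trainable argument moves, the non-trainable data does not
    assert not np.allclose(x_new, x, atol=tol)
    assert np.allclose(data_new, data, atol=tol)
```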
Co-authored-by: Josh Izaac <josh146@gmail.com>
Hi @albi3ro, a great change for improving how optimizers can work! 💯 Mostly picking up details here, so commenting for now. Left a couple of comments and questions for my own understanding.
tests/test_optimize.py (outdated):

    - grad_uni_fns = [np.cos,
    -                 lambda x: np.exp(x / 10.) / 10.,
    -                 lambda x: 2 * x]
    + grad_uni_fns = [lambda x: (np.cos(x), ),
Maybe it's better to make these more explicit by passing a single-element list to `tuple`, just because the `(x,)` syntax for creating single-element tuples could be missed. It would help readability for future reference.

Suggested change:

    - grad_uni_fns = [lambda x: (np.cos(x), ),
    + grad_uni_fns = [lambda x: tuple([np.cos(x)]),
Good point, that does look a lot clearer.
Great job @albi3ro! Looks really nice, and these changes are fantastic to have. 💯
I mainly have a few comments on code, conventions and docstrings, but overall it looks great!
pennylane/optimize/adagrad.py (outdated):

    # Due to a bug in unflatten, input PennyLane tensors
    # are being unwrapped. Here, we cast them back to PennyLane
    # tensors. Long term, we should fix this bug in unflatten.
    # https://github.com/PennyLaneAI/pennylane/issues/966
👍
"""Update x with one step of the optimizer and return the corresponding objective | ||
function value prior to the step. |
There are a few `x`s in some docstrings that should be updated to args with `requires_grad=True`.

Suggested change:

    - """Update x with one step of the optimizer and return the corresponding objective
    - function value prior to the step.
    + """Update differentiable arguments with one step of the optimizer and return the corresponding objective
    + function value prior to the step.
    @@ -47,86 +48,116 @@ def update_stepsize(self, stepsize):
            """
            self._stepsize = stepsize

    -   def step_and_cost(self, objective_fn, x, grad_fn=None):
    +   def step_and_cost(self, objective_fn, *args, grad_fn=None, **kwargs):
            """Update x with one step of the optimizer and return the corresponding objective
"""Update x with one step of the optimizer and return the corresponding objective | |
"""Update args with one step of the optimizer and return the corresponding objective |
    grad_fn (function): Optional gradient function of the
        objective function with respect to the variables ``x``.
        If ``None``, the gradient function is computed automatically.
        Must match shape of autograd derivative.
Might be nicer to clarify a bit better what shape means in this case. E.g. "will always be a tuple containing ...".
pennylane/optimize/momentum.py (outdated):

    self.accumulation = [None] * len(args)

    trained_index = 0
    for index, arg in enumerate(args):
        if getattr(arg, "requires_grad", True):
            x_flat = _flatten(arg)
            grad_flat = _flatten(grad[trained_index])
            trained_index += 1

            self._update_momentum(index, grad_flat)

            x_new_flat = [e - a for a, e in zip(self.accumulation[index], x_flat)]

            args_new[index] = unflatten(x_new_flat, arg)
That seems mildly annoying. I agree that there's probably a better way of rearranging the optimizers to make this a bit easier for future development.
@albi3ro I think both @antalszava and I reviewed this at almost exactly the same time. 😆 It seems like we commented on the same things in quite a few places (so don't get too confused when reading our comments/suggestions).
Co-authored-by: antalszava <antalszava@gmail.com> Co-authored-by: Theodor <theodor@xanadu.ai>
…ptimize_more_parameters
Amazing work @albi3ro! 💯
This is ready to merge now from my side, I just have one or two questions about the tests.
    # list and tuple unpacking is also supported
    params = (x, y, data)
    params = opt.step(cost, *params)
Don't forget to add your name to contributors! (unless you have done that already, and I missed it)
tests/test_optimize.py (outdated):

        assert cost == pytest.approx(func(x, data, y, c=0.5), abs=tol)

    def test_steps_the_same(self, opt, opt_name, tol):
        """Tests optimizing single parameter same as with several at a time"""
I'm not quite sure I follow this docstring. I find the 'same as with several' slightly hard to parse?
Could be good to expand this docstring to make the purpose of this test slightly clearer
From other tests, we know that the step for a single argument is correct. I want to make sure it has the same correct step regardless of position in the function call.
I will try to come up with a better way of explaining it concisely.
tests/test_optimize.py (outdated):

    def test_kwargs(self, opt, opt_name, tol):
        """Test that the keywords get passed and alter the function"""
Did you end up getting `mocker.spy` to work, to ensure that the keyword argument gets passed to this function?
Looks good @albi3ro! 💯
Happy for this to be merged in once the mocker import is removed!
tests/test_optimize.py (outdated):

    @@ -19,6 +19,7 @@

      import numpy as onp
      import pytest
    + from pytest_mock import mocker
No need to import `mocker` here; `mocker` is a fixture, so pytest will automatically perform 'dependency injection' so that Python can find it in required test functions. Not a very pythonic system!
Suggested change (remove the import):

    - from pytest_mock import mocker
    assert kwargs2 == {"c": 2.0}
    assert kwargs3 == {"c": 3.0}

    assert cost_three == pytest.approx(wrapper.func(x, c=3.0), abs=tol)
💯
[ch1081]
        forward = getattr(g, "forward", None)

        return grad, forward

    def apply_grad(self, grad, x):
        r"""Update the variables x to take a single optimization step. Flattens and unflattens
    def apply_grad(self, grad, args):
Just some feedback from the hackathon, based on a submission that uses `compute_grad` and `apply_grad`: they had some confusion because they needed to switch from passing `x=params` to `args=[params]`, and also (for `compute_grad`) they needed to explicitly pass `kwargs={}`.
Should we consider making these methods similar to `step(*args, **kwargs)`? Or, alternatively, just make them hidden methods. (See the sketch below for the difference.)
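For illustration, a sketch of the two call patterns (hypothetical cost and parameters; the lower-level keyword names and return values follow the feedback and diff snippets above, so treat the exact details as assumptions rather than the settled API):

```python
from pennylane import numpy as np
from pennylane.optimize import GradientDescentOptimizer

def cost(x, c=1.0):
    return c * np.sum(x ** 2)

params = np.array([0.1, 0.2], requires_grad=True)
opt = GradientDescentOptimizer(stepsize=0.1)

# high-level API: positional and keyword arguments are passed directly
params = opt.step(cost, params, c=1.0)

# lower-level API as reported in the feedback: arguments go in as a sequence,
# keyword arguments as an explicit dict
grad, forward = opt.compute_grad(cost, args=[params], kwargs={"c": 1.0})
new_args = opt.apply_grad(grad, args=[params])  # assumed to return the updated arguments
```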
@trbromley I tend to agree, especially if this would be more of a user-facing method (not sure how much of an issue it would be otherwise). There was a brief discussion re. this when this was under review. Perhaps just making them hidden by preceding with an underscore would be good here. 🤔 @albi3ro @josh146
I'm somewhat ambivalent here. We definitely considered `compute_grad` and `apply_grad` 'private' methods when first designing the optimizers, as they were not intended to be used as the public optimizer API.
But it probably makes more sense to keep them public and standardize the signatures, since the fact that users were finding and using them suggests they add value.
Unless there's a larger demand from users for this function, it'd be inefficient to unpack and repack the values. It would add operations for no real purpose.
I'm in favour of 'doing nothing' for now unless there is more feedback and demand.
As discussed in Issue #683.
This allows cost functions to accept any combination of trainable parameters and data. The optimizer determines which variables to update based on the `requires_grad` attribute. This change will allow users to do something like the sketch below:
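A sketch of the kind of call this enables (the cost function and values are illustrative; the call patterns mirror the snippets quoted in the review above):

```python
from pennylane import numpy as np
from pennylane.optimize import GradientDescentOptimizer

def cost(x, y, data, c=1.0):
    return c * np.sum((x + y - data) ** 2)

x = np.array(0.1, requires_grad=True)
y = np.array(0.2, requires_grad=True)
data = np.array(1.0, requires_grad=False)

opt = GradientDescentOptimizer(stepsize=0.1)

# trainable args are updated; `data` and the keyword argument pass through
x, y, data = opt.step(cost, x, y, data, c=2.0)

# list and tuple unpacking is also supported
params = (x, y, data)
params = opt.step(cost, *params, c=2.0)
```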
To Do: