Allowing more flexible cost functions for optimizers #959

Merged
merged 49 commits into master from optimize_more_parameters on Jan 6, 2021

Conversation

albi3ro
Contributor

@albi3ro albi3ro commented Dec 9, 2020

As discussed in Issue #683

This allows cost functions to accept any combination of parameters and data. The optimizer will determine which arguments to update based on the requires_grad attribute.

This change will allow users to do something like:

import pennylane as qml
from pennylane import numpy as np

def f(x, y, z):
    return x + y + z

x = np.array([1], requires_grad=True)
y = np.array([1], requires_grad=False)
z = np.array([1], requires_grad=True)

opt = qml.GradientDescentOptimizer()
x_new, y_old, z_new = opt.step(f, x, y, z)
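
Keyword arguments are forwarded to the cost function as well. As a rough sketch of the intended usage (my own illustration rather than code from the PR, reusing the imports above):

def cost(x, data, c=1.0):
    return np.sum((x - c) ** 2) + np.sum(data)

data = np.array([0.1, 0.2], requires_grad=False)
x0 = np.array([0.5, 0.6], requires_grad=True)

opt = qml.GradientDescentOptimizer(0.1)
# only x0 is updated; data comes back unchanged and c is passed straight through
x_new, data_same = opt.step(cost, x0, data, c=2.0)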

To Do:

  • improve flow and readability in GradientDescent
  • Implement in the other optimizers
  • Tests
  • Changelog

@albi3ro albi3ro added the WIP 🚧 Work-in-progress label Dec 9, 2020
@albi3ro albi3ro removed the WIP 🚧 Work-in-progress label Dec 16, 2020
@albi3ro albi3ro changed the title from "[WIP] Allowing more flexible cost functions for optimizers" to "Allowing more flexible cost functions for optimizers" Dec 16, 2020
@codecov

codecov bot commented Dec 16, 2020

Codecov Report

Merging #959 (f7e9d67) into master (09a6239) will increase coverage by 0.01%.
The diff coverage is 100.00%.


@@            Coverage Diff             @@
##           master     #959      +/-   ##
==========================================
+ Coverage   97.86%   97.88%   +0.01%     
==========================================
  Files         150      150              
  Lines       10876    10964      +88     
==========================================
+ Hits        10644    10732      +88     
  Misses        232      232              
Impacted Files Coverage Δ
pennylane/optimize/adagrad.py 100.00% <100.00%> (ø)
pennylane/optimize/adam.py 100.00% <100.00%> (ø)
pennylane/optimize/gradient_descent.py 100.00% <100.00%> (ø)
pennylane/optimize/momentum.py 100.00% <100.00%> (ø)
pennylane/optimize/nesterov_momentum.py 100.00% <100.00%> (ø)
pennylane/optimize/qng.py 100.00% <100.00%> (ø)
pennylane/optimize/rms_prop.py 100.00% <100.00%> (ø)
pennylane/optimize/rotoselect.py 95.65% <100.00%> (ø)
pennylane/optimize/rotosolve.py 100.00% <100.00%> (ø)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@albi3ro albi3ro added the review-ready 👌 label (PRs which are ready for review by someone from the core team) Dec 16, 2020
Member

@josh146 josh146 left a comment

Great work @albi3ro 🎉

I've left a few comments, mostly relating to code comments, rearranging of logic, docstrings, and tests, but I didn't spot anything major.

In general, I really like your solution here.

  • opt.step(cost, x, y, z, data) is a nice extension of the existing syntax (opt.step(cost, x)). The logic extension is also quite straightforward: we simply loop over all *args and update those that are trainable (see the sketch after this list). I think this is very intuitive, since a non-trainable parameter has a 'gradient' of 0, and you wouldn't expect the parameter to be updated anyway.

  • It is also backwards compatible with the old approach! Which is really nice. I've created a PR over in the QML repo (Test new optimizers qml#189) to run all the demonstrations using this branch, to see if there are any issues (the QML demo is a good real-world test case of user-facing changes). As I write this, 73% of all demos have passed, with no failures 🤞
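
To make the first point concrete, here is a minimal sketch of the update loop (my paraphrase of the pattern, assuming grad_fn returns one gradient per trainable argument; not the PR's actual implementation):

def sketch_step(grad_fn, stepsize, *args, **kwargs):
    grads = grad_fn(*args, **kwargs)  # one gradient per trainable argument
    new_args = list(args)
    trained_index = 0
    for index, arg in enumerate(args):
        if getattr(arg, "requires_grad", True):
            # apply a gradient-descent update only to trainable arguments
            new_args[index] = arg - stepsize * grads[trained_index]
            trained_index += 1
    # non-trainable arguments are returned unchanged
    return new_args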

I only have two questions regarding the custom grad_fn:

  • I'm curious why the tests passed before, since the test data did not previously conform to the same output shape as qml.grad.

  • I'm curious if there is a path to retaining support for custom grad_fns with scalar outputs. I don't think this is too important, since use of custom grad functions is rare. In the future, however, we might want to write the optimizers to support Torch/TF, so it might be something we re-consider down the line.

.github/CHANGELOG.md (outdated, resolved)
.github/CHANGELOG.md (outdated, resolved)
.github/CHANGELOG.md (outdated, resolved)
pennylane/optimize/gradient_descent.py (outdated, resolved)
pennylane/optimize/gradient_descent.py (outdated, resolved)
tests/test_optimize.py (resolved)
Comment on lines 675 to 688
@pytest.mark.parametrize(
"opt",
[
GradientDescentOptimizer(stepsize),
MomentumOptimizer(stepsize, momentum=gamma),
NesterovMomentumOptimizer(stepsize, momentum=gamma),
AdagradOptimizer(stepsize),
RMSPropOptimizer(stepsize, decay=gamma),
AdamOptimizer(stepsize, beta1=gamma, beta2=delta),
RotosolveOptimizer(),
],
)
class Test_over_opts:
def test_kwargs(self, opt, tol):
Member

I think there is an indentation issue here; this class/decorator should be at the module level (no indentation).

Actually, I'm a bit surprised this works! I didn't think pytest would find/test nested classes.

Contributor Author

So I should take it out of the TestOptimizer class?

Contributor

Think what @josh146 meant was to make the new class be another stand-alone class, just as TestOptimizer.

Member

Yep!

Contributor Author

will do.

if getattr(opt, "reset", None):
opt.reset()

assert x_new_one != pytest.approx(x_new_two, abs=tol)
Member

Nice test!

Perhaps we can be a bit more explicit though in checking that the cost function was called with the correct arguments. The mocker pytest fixture can help us do this:

def test_kwargs(self, opt, tol, mocker):
    """Test that the keywords get passed and alter the function"""

    class Cost:
        @staticmethod
        def kwargs_func(x, c=1.0):
            return (x - c) ** 2

    cost = Cost()
    spy = mocker.spy(cost, "kwargs_func")

    x_new_one = opt.step(cost.kwargs_func, 1.0, c=1.0)
    assert spy.call_args_list[-1][1]["c"] == 1.0

    x_new_two = opt.step(cost.kwargs_func, 1.0, c=2.0)
    assert spy.call_args_list[-1][1]["c"] == 2.0

    if getattr(opt, "reset", None):
        opt.reset()

    assert x_new_one != pytest.approx(x_new_two, abs=tol)

mocker.spy creates a new Spy object that we can use to check the arguments and keyword arguments the function was called with.

The reason I wrapped the cost function in a class is that I suddenly realised I don't know how to spy on a pure function 🤔 @antalszava @thisac do you guys have any idea?

Contributor

Hmm, not quite sure actually! It seems to be good when there's a module object to pass as the 1st argument, so e.g., mocker.spy(qml, "about") should be fine (see pytest-dev/pytest-mock#97 (comment)).
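
For reference, a minimal sketch of that pattern (my illustration, not a test from the PR; qml.about is just a convenient module-level function to spy on):

import pennylane as qml

def test_spy_on_module_function(mocker):
    spy = mocker.spy(qml, "about")  # wraps qml.about while leaving its behaviour intact
    qml.about()
    assert spy.call_count == 1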

Comment on lines 710 to 711
"""Test multiple arguments to function"""
x_new, y_new = opt.step(func, *args)
Member

Spy could also be used here to ensure that the function was called with the args you expect


assert len(args_new) == pytest.approx(2, abs=tol)
assert args_new[0] != pytest.approx(x, abs=tol)
assert args_new[1] == pytest.approx(data, abs=tol)
Member

Nice!

An additional test you could add is one like in the changelog example; a cost function with trainable, non-trainable, and keyword arguments.

albi3ro and others added 4 commits December 17, 2020 08:16
Co-authored-by: Josh Izaac <josh146@gmail.com>
Co-authored-by: Josh Izaac <josh146@gmail.com>
Co-authored-by: Josh Izaac <josh146@gmail.com>
Co-authored-by: Josh Izaac <josh146@gmail.com>
Contributor

@antalszava antalszava left a comment

Hi @albi3ro, a great change for improving how optimizers can work! 💯 Mostly picking up details here, so commenting for now. Left a couple of comments and questions for my own understanding.

pennylane/optimize/gradient_descent.py (outdated, resolved)
pennylane/optimize/gradient_descent.py (resolved)
pennylane/optimize/gradient_descent.py (outdated, resolved)
pennylane/optimize/gradient_descent.py (outdated, resolved)
pennylane/optimize/gradient_descent.py (outdated, resolved)
grad_uni_fns = [np.cos,
lambda x: np.exp(x / 10.) / 10.,
lambda x: 2 * x]
grad_uni_fns = [lambda x: (np.cos(x), ),
Contributor

Maybe it's better to make these more explicit by passing a single-element list to tuple(), since the (x,) syntax for creating single-element tuples could be easy to miss. It would just help readability for future reference.

Suggested change
grad_uni_fns = [lambda x: (np.cos(x), ),
grad_uni_fns = [lambda x: tuple([np.cos(x)]),

Contributor Author

good point. that does look a lot clearer

tests/test_optimize.py (outdated, resolved)
Comment on lines 675 to 688
@pytest.mark.parametrize(
"opt",
[
GradientDescentOptimizer(stepsize),
MomentumOptimizer(stepsize, momentum=gamma),
NesterovMomentumOptimizer(stepsize, momentum=gamma),
AdagradOptimizer(stepsize),
RMSPropOptimizer(stepsize, decay=gamma),
AdamOptimizer(stepsize, beta1=gamma, beta2=delta),
RotosolveOptimizer(),
],
)
class Test_over_opts:
def test_kwargs(self, opt, tol):
Contributor

Think what @josh146 meant was to make the new class be another stand-alone class, just as TestOptimizer.

.github/CHANGELOG.md (outdated, resolved)
.github/CHANGELOG.md (outdated, resolved)
Contributor

@thisac thisac left a comment

Great job @albi3ro! Looks really nice, and these changes are fantastic to have. 💯

I mainly have a few comments on code, conventions and docstrings, but overall it looks great!

.github/CHANGELOG.md (resolved)
Comment on lines 88 to 91
# Due to a bug in unflatten, input PennyLane tensors
# are being unwrapped. Here, we cast them back to PennyLane
# tensors. Long term, we should fix this bug in unflatten.
# https://github.com/PennyLaneAI/pennylane/issues/966
Contributor

👍

pennylane/optimize/adagrad.py (resolved)
Comment on lines 52 to 53
"""Update x with one step of the optimizer and return the corresponding objective
function value prior to the step.
Contributor

There are a few xs in some docstrings that should be updated to args with requires_grad=True.

Suggested change
"""Update x with one step of the optimizer and return the corresponding objective
function value prior to the step.
"""Update differentiable arguments with one step of the optimizer and return the corresponding objective
function value prior to the step.

pennylane/optimize/gradient_descent.py (resolved)
pennylane/optimize/adam.py (resolved)
@@ -47,86 +48,116 @@ def update_stepsize(self, stepsize):
"""
self._stepsize = stepsize

def step_and_cost(self, objective_fn, x, grad_fn=None):
def step_and_cost(self, objective_fn, *args, grad_fn=None, **kwargs):
"""Update x with one step of the optimizer and return the corresponding objective
Contributor

Suggested change
"""Update x with one step of the optimizer and return the corresponding objective
"""Update args with one step of the optimizer and return the corresponding objective

grad_fn (function): Optional gradient function of the
objective function with respect to the variables ``x``.
If ``None``, the gradient function is computed automatically.
Must match shape of autograd derivative.
Contributor

Might be nicer to clarify a bit better what shape means in this case. E.g. "will always be a tuple containing ...".
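
For example, a custom grad_fn under the new convention might look like this (a hedged sketch; the tuple-per-trainable-argument shape mirrors the updated grad_uni_fns in the tests):

import pennylane as qml
from pennylane import numpy as np

def cost(x):
    return x ** 2

def custom_grad(x):
    # per the discussion above, the optimizer now expects the same tuple
    # shape that qml.grad produces: one entry per trainable argument
    return (2 * x,)

opt = qml.GradientDescentOptimizer(0.1)
x_new = opt.step(cost, np.array(1.0, requires_grad=True), grad_fn=custom_grad)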

pennylane/optimize/gradient_descent.py (outdated, resolved)
Comment on lines 62 to 75
self.accumulation = [None] * len(args)

trained_index = 0
for index, arg in enumerate(args):
if getattr(arg, "requires_grad", True):
x_flat = _flatten(arg)
grad_flat = _flatten(grad[trained_index])
trained_index += 1

self._update_momentum(index, grad_flat)

x_new_flat = [e - a for a, e in zip(self.accumulation[index], x_flat)]

args_new[index] = unflatten(x_new_flat, arg)
Contributor

That seems mildly annoying. I agree that there's probably a better way of rearranging the optimizers making this a bit easier for future development.

@thisac
Contributor

thisac commented Dec 17, 2020

@albi3ro I think @antalszava and I reviewed this at almost exactly the same time. 😆 It seems we commented on the same things in quite a few places (so don't get too confused when reading our comments/suggestions).

@albi3ro albi3ro requested a review from josh146 December 22, 2020 20:22
@albi3ro albi3ro self-assigned this Dec 22, 2020
Member

@josh146 josh146 left a comment

Amazing work @albi3ro! 💯

This is ready to merge now from my side, I just have one or two questions about the tests.

pennylane/optimize/rotosolve.py (outdated, resolved)
# list and tuple unpacking is also supported
params = (x, y, data)
params = opt.step(cost, *params)
Member

Don't forget to add your name to contributors! (unless you have done that already, and I missed it)

assert cost == pytest.approx(func(x, data, y, c=0.5), abs=tol)

def test_steps_the_same(self, opt, opt_name, tol):
"""Tests optimizing single parameter same as with several at a time"""
Member

I'm not quite sure I follow this docstring. I find the 'same as with several' slightly hard to parse?

Member

Could be good to expand this docstring to make the purpose of this test slightly clearer

Contributor Author

From other tests, we know that the step for a single argument is correct. I want to make sure it has the same correct step regardless of position in the function call.

I will try to come up with a better way of explaining it concisely.
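
Roughly, the idea is something like the following sketch (my illustration of the intent, not the test itself):

from pennylane import numpy as np
import pennylane as qml

def f_single(x):
    return np.sin(x)

def f_multi(x, y):
    return np.sin(x) + 0.0 * y  # gradient w.r.t. x is the same as for f_single

x = np.array(0.5, requires_grad=True)
y = np.array(1.2, requires_grad=True)

opt = qml.GradientDescentOptimizer(0.1)
x_alone = opt.step(f_single, x)
x_multi, _ = opt.step(f_multi, x, y)
assert np.allclose(x_alone, x_multi)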

Comment on lines 738 to 739
def test_kwargs(self, opt, opt_name, tol):
"""Test that the keywords get passed and alter the function"""
Member

Did you end up getting mocker.spy to work, to ensure that the keyword argument gets passed to this function?

Member

@josh146 josh146 left a comment

Looks good @albi3ro! 💯

Happy for this to be merged in once the mocker import is removed!

@@ -19,6 +19,7 @@

import numpy as onp
import pytest
from pytest_mock import mocker
Member

No need to import mocker here; mocker is a fixture, so pytest will automatically perform 'dependency injection' so that Python can find it in required test functions. Not a very pythonic system!

Suggested change (remove this line)
from pytest_mock import mocker

Comment on lines +774 to +777
assert kwargs2 == {"c": 2.0}
assert kwargs3 == {"c": 3.0}

assert cost_three == pytest.approx(wrapper.func(x, c=3.0), abs=tol)
Member

💯

@albi3ro albi3ro merged commit 3807064 into master Jan 6, 2021
@albi3ro albi3ro deleted the optimize_more_parameters branch January 6, 2021 16:51
@josh146
Member

josh146 commented Jan 14, 2021

[ch1081]

forward = getattr(g, "forward", None)

return grad, forward

def apply_grad(self, grad, x):
r"""Update the variables x to take a single optimization step. Flattens and unflattens
def apply_grad(self, grad, args):
Contributor

Just some feedback from the hackathon based upon a submission that uses compute_grad and apply_grad: they had some confusion because they needed to switch from passing x=params to args=[params] and also (for compute_grad) they needed to explicitly pass kwargs={}.

Should we consider making these methods similar to step(*args, **kwargs)? Or, alternatively, just make them hidden methods.
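
For concreteness, the change they hit looks roughly like this (hedged: reconstructed from the description above rather than from the diff):

from pennylane import numpy as np
import pennylane as qml

def cost(x):
    return np.sum(x ** 2)

params = np.array([0.1, 0.2], requires_grad=True)
opt = qml.GradientDescentOptimizer(0.1)

# old (pre-PR) usage was roughly: opt.compute_grad(cost, x=params)
# new usage: positional args are passed as a sequence, and kwargs must be
# supplied explicitly, even when there are none
grad, forward = opt.compute_grad(cost, args=[params], kwargs={})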

Contributor

@trbromley I tend to agree, especially if this would be more of a user-facing method (not sure how much of an issue it would be otherwise). There was a brief discussion about this while the PR was under review. Perhaps just making them hidden by prefixing them with an underscore would be good here. 🤔 @albi3ro @josh146

Member

I'm somewhat ambivalent here. We definitely considered compute_grad and apply_grad 'private' methods when first designing the optimizers, as they were not intended to be used as the public optimizer API.

But it probably makes more sense to keep them public, and standardize the signatures, since the fact that users were finding and using them suggests they add value.

Contributor Author

Unless there's a larger demand from users for this function, it'd be inefficient to unpack and repack the values. It would add operations for no real purpose.

Member

I'm in favour of 'doing nothing' for now unless there is more feedback and demand.
