Add online newton optimizer #258
Conversation
Hi! Thanks a lot for the PR! Could you fix the pylint errors before we review (so that we can focus on the important things in the review)? Let me know if there are any issues or you have any questions! Thanks a lot!
Hi, sure. I fixed the pylint error in an additional commit. Feel free to ask me to change things.
Hi! Thanks for fixing the pylint error - could you also do the same in the test file? I think that's still holding up the checks.
Hi! I fixed the tests! All checks should pass now.
Great, thank you very much for fixing all the checks! I've assigned myself and should have time to review this week.
Hi! Thank you so much for the contribution again!
I have been reviewing your code, but in order to check the equations it would be very helpful if you could point me to exactly the equation, algorithm, or section in the paper that you have implemented so that I can compare it to the code more easily.
I will add the comments I have so far below so that they don't get lost in the meantime, but please don't make any changes yet as I'm still reviewing the code.
Thanks a lot for your help and also thanks again for this PR!
Thanks a lot for pointing out the algorithm in the paper! I've now finished the first pass - I only have some additional questions on floating point arithmetic.
Thank you very much for the PR again!!
optax/_src/transform.py (Outdated)
hessian_inv: base.Updates
...
def sherman_morrison(a_inv, u, v):
If we decide to make this private, I'd also change the implementation to be specific to our use case where u == v. In any case, please also add a docstring explaining what this does and what Sherman-Morrison is used for in this case. Thanks a lot!
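For context, a minimal sketch (my own illustration, not the code from this PR) of the Sherman-Morrison identity specialised to the u == v case discussed above; in the online Newton step, a rank-1 update of this form keeps the inverse of A_t = A_{t-1} + g_t g_t^T current without ever inverting a matrix:

import jax.numpy as jnp

def sherman_morrison_rank1(a_inv, g):
  """Return (A + g g^T)^{-1} given A^{-1} (assumed symmetric) and a vector g.

  Uses (A + g g^T)^{-1} = A^{-1} - (A^{-1} g)(A^{-1} g)^T / (1 + g^T A^{-1} g).
  """
  a_inv_g = a_inv @ g                                 # A^{-1} g
  denom = 1.0 + g @ a_inv_g                           # 1 + g^T A^{-1} g
  return a_inv - jnp.outer(a_inv_g, a_inv_g) / denom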
Hi @mkunesch, thanks for your review.
Hi @eserie! How are you getting on with the other changes? No rush at all, but I wanted to make sure you are not waiting for my review of the changes you have already made in the meantime! Thanks a lot for making them - I've added one comment on one of the changes. Thanks a lot!
Hi @mkunesch!
Hi! Some of them are with regards to the tests - but we could also merge into experimental where we require less testing for new code. Thanks a lot!
Hello @eserie, can we get this submitted?
Hi, I finally got around to going over all your comments and added a new test for the multi-dimensional weights case.
There is still one change requested by @mkunesch - could you address it so that we can get this submitted? Thanks a lot for the contribution!
This branch has conflicts that must be resolved
Hi! Thanks a lot @eserie for making the changes. I just returned from holiday and will make sure to review by the end of the week.
Hi! Thanks a lot for making the changes!
I did another pass and have a few detailed comments (mostly formatting).
The checks currently fail - but I think that's due to tree_multimap being deprecated. You can just replace it by tree_map and it should work.
Thanks a lot again!
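A minimal sketch of the suggested replacement (the pytree names here are hypothetical, not taken from the PR): jax.tree_util.tree_map accepts the same (function, *pytrees) signature that the deprecated jax.tree_multimap did.

import jax

grads = {"w": 1.0, "b": 2.0}
hessian_inv = {"w": 0.5, "b": 0.25}

# Before (deprecated): jax.tree_multimap(lambda g, h: h * g, grads, hessian_inv)
updates = jax.tree_util.tree_map(lambda g, h: h * g, grads, hessian_inv)
print(updates)  # {'b': 0.5, 'w': 0.5}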
Hi @mkunesch, thanks for all your comments. I tried to address them all, hope all is ok now!
Hi! Thanks a lot! Before merging I wanted to make sure I could use it myself, but had difficulty finding parameters that could find the minimum of a parabola. Would it be possible to add the optimizer to the test? Other than that, the PR looks great to me up to minor formatting edits (full stops, spaces, capitalization etc.) that I could fix during merging if that's ok with you! (But if you prefer, I'm also happy to make file comments and you can fix them.) Thanks again!
Hi! I added the optimizer to the test as requested. No problem for me if you want to fix the formatting issues you have seen (on my side the checks done in test.sh all passed). NB: to find the parameters, I played with this code in an interactive session:

import jax
import jax.numpy as jnp
import optax
from matplotlib import pyplot as plt
from optax._src import alias
from optax._src import update
from optax._src import numerics

a = 1.0
b = 100.0
initial_params = jnp.array([0.0, 0.0])
final_params = jnp.array([a, a**2])

def fun(params):
  # Rosenbrock-type objective with minimum at (a, a**2).
  return numerics.abs_sq(a - params[0]) + b * numerics.abs_sq(
      params[1] - params[0] ** 2
  )

opt = alias.online_newton_step(5.0e-1, eps=1.0)
params = initial_params
state = opt.init(params)

@jax.jit
def step_(state_params, _):
  state, params = state_params
  _, grad = jax.value_and_grad(fun)(params)
  updates, state = opt.update(grad, state)
  params = update.apply_updates(params, updates)
  return (state, params), params

# scan returns the final (state, params) carry and the stacked parameter history.
(state, params), params_history = jax.lax.scan(
    step_, (state, params), jnp.arange(10000))

plt.plot(jnp.linalg.norm(params_history - final_params[None, :], axis=1))
plt.title("$| w - w^* |$")
jnp.linalg.norm(params_history[-1] - final_params)
Hi @eserie,
Thanks a lot for adding the optimizer to the tests and finding parameter combinations that work!
We have experimented more with the optimizer in the past month (trying it on various functions and some deep learning workloads such as MNIST) and we have concluded that this optimizer may not be a good fit for the core optax API at this moment. Optax is currently focused on optimizers that can be substituted into most deep learning training loops (ideally with the default parameters), and we have found that when using ONS on DL problems, finding the right parameters can be tricky and it may run out of memory for typical DL networks.
From the paper and what you have written, ONS is at its best in online learning on time series and streaming data, so we would suggest publishing this optimizer as part of a repository that specializes in these applications. You can of course still use the optax machinery and make this work with optax, but we would suggest providing this as a third-party component that uses optax rather than integrating it into the core API. We would of course be happy to prominently post a link to it here in this PR (or a new issue) so that people can find it easily if they search for it in optax.
Sorry to have decided this towards the end of the code review. We will try to prevent this in the future by creating a series of problems every optimizer should solve with standard parameters before starting the review. Still, thank you very much for making all the changes and we hope that the comments are useful for your own open-source version!
Thanks a lot for filing the PR again and we hope you understand the decision!
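A back-of-the-envelope sketch (my own, assuming a full ONS update that maintains a dense d x d inverse matrix in float32, which is not necessarily how this PR stores it across pytree leaves) of why memory becomes a problem at typical deep-learning parameter counts:

d = 1_000_000                  # parameters in a modest DL model
bytes_needed = d * d * 4       # float32 entries of the d x d inverse matrix
print(bytes_needed / 1e12)     # -> 4.0, i.e. about 4 TB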
Hi @mkunesch,
Thanks a lot for taking the time to experiment with it! It's true that I've been experimenting with it mostly in online learning settings, and I started to come to the same conclusion as you when I wrote this last test.
To make the link: I already have an implementation of the ONS method in the open-source project wax-ml (https://github.com/eserie/wax-ml/blob/main/wax/optim/newton.py), and I will reflect there the adjustments we made during this review!
In any case, it was a real pleasure to interact with you, and thank you all for maintaining this great project!