Implementation of Quasi-Hyperbolic Momentum for Adam #81

Merged: 23 commits merged into mlpack:master on May 15, 2019

Conversation

niteya-shah
Contributor

@niteya-shah niteya-shah commented Feb 12, 2019

Hello, this PR implements the update rule of QHAdam as discussed in the paper: https://arxiv.org/pdf/1810.06801v3.pdf
This PR should be straightforward, as it is a simple update-rule change like Nadam.
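
For context, a rough sketch of the QHAdam step this PR adds is shown below. This is a sketch only, not the merged source; the names m, v, mDash, vDash, and iterate follow the snippets quoted later in this review, and m and v are the usual Adam moving averages of the gradient and squared gradient, whose accumulation is omitted here.

// Bias-correct the first and second moment estimates, exactly as in Adam.
const double biasCorrection1 = 1.0 - std::pow(beta1, iteration);
const double biasCorrection2 = 1.0 - std::pow(beta2, iteration);
const arma::mat mDash = m / biasCorrection1;
const arma::mat vDash = v / biasCorrection2;

// Quasi-hyperbolic step: v1 interpolates between the raw gradient and mDash
// in the numerator, and v2 interpolates between the squared gradient and
// vDash in the denominator.
iterate -= stepSize * (((1 - v1) * gradient + v1 * mDash) /
    (arma::sqrt((1 - v2) * (gradient % gradient) + v2 * vDash) + epsilon));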

TODO list

  • Implement QHAdam update rule
  • Write Tests for QHAdam
  • Validate the improvements

Empty file, will discuss with assignee about the PR
QHAdam updates and an overloaded constructor for QHAdam
1) Added SGD update QHSGD
2) Fixed some Adam issues wrt QHAdam
@niteya-shah
Contributor Author

@zoq can you please review my PR? The paper also describes QHAdamR and QHAdamW, which I will implement after my other PR is merged.

/**
* Tests the QHadam optimizer using a simple test function.
*/
TEST_CASE("SimpleQHdamTestFunction", "[AdamTest]")
Member

Picky style comment, this should be SimpleQHAdamTestFunction.

/**
* Run QHAdam on logistic regression and make sure the results are acceptable.
*/
TEST_CASE("QHadamLogisticRegressionTest", "[AdamTest]")
Member

See comment above.

/**
* Construct the Quasi Hyperbolic update policy with the given parameters.
* @param v The quasi hyperbolic term.
* @param momentum The momentum ter.
Member

term instead of ter.

* QHSGD). This allows this method to combine the features of many optimisers
* and provide better optimisation control.
*
* TODO: Paper information
Member

Good idea.


/**
*
* TODO: Fill in these details as well on info about algorithm
Member

Good idea 👍

const double biasCorrection1 = 1.0 - std::pow(beta1, iteration);
const double biasCorrection2 = 1.0 - std::pow(beta2, iteration);

const auto mDash = m / biasCorrection1;
Member

Do you mind using double here, to be consistent with the rest of the codebase?


iterate -= stepSize * ((((1 - v1) * gradient) + v1 * mDash) /
(arma::sqrt(((1 - v2) * (gradient % gradient)) +
v2 * vDash) + epsilon ));
Member

Can you remove the extra space at the end.

const auto mDash = m / biasCorrection1;
const auto vDash = v / biasCorrection2;

iterate -= stepSize * ((((1 - v1) * gradient) + v1 * mDash) /
Member

Do you mind adding a comment here: "QHAdam recovers Adam when ν1 = ν2 = 1."?
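
For reference, with ν1 = ν2 = 1 the quasi-hyperbolic weighting collapses, so the step above reduces to the standard Adam update (a quick check, in the paper's notation):

(1 - \nu_1)\, g_t + \nu_1 \hat{m}_t = \hat{m}_t, \qquad
(1 - \nu_2)\, g_t^2 + \nu_2 \hat{v}_t = \hat{v}_t,

\theta_{t+1} = \theta_t - \alpha \, \frac{\hat{m}_t}{\sqrt{\hat{v}_t} + \epsilon}.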

// The number of iterations.
double iteration;

//The first quasi-hyperbolic term.
Member

Missing space right after //. Same for v2.

*
* @param epsilon The epsilon value used to initialise the squared gradient
* parameter.
* @param beta1 The smoothing parameter.
Member

Looks like the order is wrong.

@niteya-shah
Contributor Author

@zoq sorry for the TODOs; I had forgotten that I put them in. I have made the changes.

incorrect parameter type of double assigned to variable
@niteya-shah
Contributor Author

@zoq can you review my PR?

/**
* @file adamw.hpp
* @author Niteya Shah
*
Member

Can you add the license and a description?

* @author Niteya Shah
*
*/
#ifndef ENSMALLEN_ADAM_ADAMW_HPP
Member

This is not AdamW

* @param beta1 Exponential decay rate for the first moment estimates.
* @param beta2 Exponential decay rate for the weighted infinity norm
estimates.
* @param eps Value used to initialise the mean squared gradient parameter.
Member

Do you mind renaming this one to epsilon to match the rest of the code?

* @param resetPolicy If true, parameters are reset before every Optimize
* call; otherwise, their values are retained.
*/

Member

We can remove the extra line here.

double& V2() { return optimizer.UpdatePolicy().V2(); }

private:
//! The Stochastic Gradient Descent object with AdamW policy.
Member

We use QHAdam here.

/**
* QHAdam is a optimising strategy based on the Quasi-Hyperbolic step when
* applied to the Adam Optimiser . QH updates can be considered to a weighted
* average of the momentum . QHAdam , based on its paramterisation can recover
Member

Looks like there are some spaces we can remove.

/**
* Construct the QHAdam update policy with the given parameters.
*
* @param v1 The first quasi-hyperbolic term.
Member

The parameter description order doesn't match the one below.

public:
/**
* Construct the Quasi Hyperbolic update policy with the given parameters.
* @param v The quasi hyperbolic term.
Member

Can you add another empty line between the method description and the beginning of the parameter description.

*/
QHUpdate(const double v = 0.7,
const double momentum = 0.999) :
momentum(momentum),
Member

Can you use 4 spaces instead of 2 here.

/**
* @file quasi_hyperbolic_momentum_sgd_test.cpp
* @author Niteya Shah
*
Member

See comment above.


#### See also:

* [Quasi-Hyperbolic Momentom and Adam For Deep Learning](https://arxiv.org/pdf/1810.06801.pdf)
Member

Should be "Momentum".

*An optimizer for [differentiable separable functions](#differentiable-separable-functions).*

QHAdam is an optimizer which implements the QHAdam Adam algorithm
which uses Quasi-Hyperbolic Descent with the Adam Optimizer. This Method is the Adam Variant of the Quasi -
Member

Should be "This method is the Adam variant of the quasi-hyperbolic update for Adam".

Member

@zoq zoq left a comment

Looks good to me, there are some minor style issues which I can fix during the merge process.

@mlpack-bot mlpack-bot bot left a comment

Second approval provided automatically after 24 hours. 👍

Member

@rcurtin rcurtin left a comment

@niteya-shah nice contribution, sorry it took me so long to review this. There are a number of little style issues and documentation bits, but I think they can all be handled during merge. @zoq if you want, I can do the merge and style/doc fixes, just let me know. I saw that you had a couple changes you wanted to make too. In any case, if you merge it, we should add something to HISTORY.md and then I'll run my release script. :)

@@ -508,6 +508,7 @@ The following optimizers can be used with differentiable functions:
- [AdaDelta](#adadelta)
- [AdaGrad](#adagrad)
- [Adam](#adam)
- [QHAdam](#qhadam)
Member

@rcurtin rcurtin May 12, 2019

Just a minor note, maybe we can address it during merge---all the other optimizers are arranged alphabetically, we should probably keep it like that.

@@ -1394,6 +1394,114 @@ double Optimize(arma::mat& X);
* [Semidefinite programming on Wikipedia](https://en.wikipedia.org/wiki/Semidefinite_programming)
* [Semidefinite programs](#semidefinite-programs) (includes example usage of `PrimalDualSolver`)

## Quasi-Hyperbolic Momentum Update
Member

Should this be called Quasi-Hyperbolic Momentum Update SGD (QHSGD)? The anchors will need to be updated. (In fact the qhsgd anchor right now doesn't point anywhere.)


Quasi Hyperbolic Momentum Update is an update policy for SGD where the Quasi Hyperbolic terms are added to the
parametrisation. Simply put QHM’s update rule is a weighted average of momentum’s and plain SGD’s
update rule.
Member

This is a bit confusing for users since it's not described as an optimizer. I would suggest something like this:

Quasi-hyperbolic momentum update SGD (QHSGD) is an SGD-like optimizer with momentum where quasi-hyperbolic terms are added to the parametrization.  The update rule for this optimizer is a weighted average of momentum SGD and vanilla SGD.
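
For readers, an illustrative sketch of that rule is below. It mirrors the QHUpdate policy added in this PR and reuses its v and momentum parameter names, but the exact member variables and layout here are assumed.

// Momentum buffer: exponential moving average of the gradient.
velocity = momentum * velocity + (1 - momentum) * gradient;

// Quasi-hyperbolic step: a v-weighted average of the plain SGD step
// (the raw gradient) and the momentum step (the velocity buffer).
iterate -= stepSize * ((1 - v) * gradient + v * velocity);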

different values of those terms can recover the following Adam Polices
QHAdam recovers Adam when ν1 = ν2 = 1
QHAdam recovers RMSProp when ν1 = 0 and ν2 = 1
QHAdam reovers NAdam when ν1 = β1 and ν2 = 1
Member

I think we can do a little cleanup here too:

QHAdam is an optimizer that uses quasi-hyperbolic descent with the Adam optimizer.  This replaces the moment estimators of Adam with quasi-hyperbolic terms, and different values of the `v1` and `v2` parameters are equivalent to the following other optimizers:

 * When `v1 = v2 = 1`, `QHAdam` is equivalent to `Adam`.
 * When `v1 = 0` and `v2 = 1`, `QHAdam` is equivalent to `RMSProp`.
 * When `v1 = beta1` and `v2 = 1`, `QHAdam` is equivalent to `Nadam`.

I really like the reductions in the documentation.
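
For reference, the update these reductions describe, in the paper's notation (matching the C++ snippets above, with g_t the current gradient and \hat{m}_t, \hat{v}_t the bias-corrected Adam moment estimates), is:

\theta_{t+1} = \theta_t - \alpha \, \frac{(1 - \nu_1)\, g_t + \nu_1\, \hat{m}_t}{\sqrt{(1 - \nu_2)\, g_t^2 + \nu_2\, \hat{v}_t} + \epsilon}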

#### See also:
* [Quasi-Hyperbolic Momentum and Adam For Deep Learning](https://arxiv.org/pdf/1810.06801.pdf)
* [SGD in Wikipedia](https://en.wikipedia.org/wiki/Stochastic_gradient_descent)
* [SGD](#standard-sgd)
Member

Might be worth also linking to Adam, Nadam, and RMSprop? (And vice versa from those?)

*
* QHAdam can optimize differentiable separable functions.
* For more details, see the documentation on function
* types included with this distribution or on the ensmallen website.
Member

The lines look a little short here, they can probably be streamlined.
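
As a side note for readers of this thread, a minimal usage sketch is given below. It assumes ensmallen's documented interface for differentiable separable functions (NumFunctions, Shuffle, and batched Evaluate/Gradient), and that QHAdam is default-constructible like the other ensmallen optimizers; ToySeparableFunction and all parameter values are hypothetical, for illustration only.

#include <ensmallen.hpp>
#include <cmath>

// A toy differentiable separable objective: f(x) = sum_i ||x - a_i||^2,
// where each column of `data` is one point a_i.  (Hypothetical example.)
class ToySeparableFunction
{
 public:
  explicit ToySeparableFunction(const arma::mat& data) : data(data) { }

  size_t NumFunctions() const { return data.n_cols; }

  // The terms are independent here, so shuffling is a no-op.
  void Shuffle() { }

  double Evaluate(const arma::mat& x, const size_t begin,
                  const size_t batchSize)
  {
    double objective = 0.0;
    for (size_t i = begin; i < begin + batchSize; ++i)
      objective += std::pow(arma::norm(x - data.col(i)), 2.0);
    return objective;
  }

  void Gradient(const arma::mat& x, const size_t begin, arma::mat& gradient,
                const size_t batchSize)
  {
    gradient.zeros(x.n_rows, x.n_cols);
    for (size_t i = begin; i < begin + batchSize; ++i)
      gradient += 2.0 * (x - data.col(i));
  }

 private:
  arma::mat data;
};

int main()
{
  arma::mat data(2, 100, arma::fill::randn);
  ToySeparableFunction f(data);

  arma::mat coordinates(2, 1, arma::fill::randu);
  ens::QHAdam optimizer;  // Default hyperparameters (assumed to exist).
  optimizer.Optimize(f, coordinates);

  // The minimizer of this objective is the mean of the data columns.
  coordinates.print("optimum:");
}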

const bool resetPolicy = true);

/**
* Optimize the given function using QHAdam. The given starting point will be
Member

The spacing seems off on this comment, it should be of the form

/**
 * comment
 */

(note the alignment of the *)

// In case it hasn't been included yet.
#include "qhadam.hpp"

namespace ens {
Member

Spacing seems off here too.

/*
* Tests the Quasi Hyperbolic Momentum SGD update policy.
*/
TEST_CASE("QHSGDSpeedUpTestFunction", "[QHMomentumSGDTest]")
Member

The name SpeedUp probably isn't meaningful here, just QHSGDTestFunction should be fine.

@mlpack-bot

mlpack-bot bot commented May 12, 2019

Hello there! Thanks for your contribution. I see that this is your first contribution to mlpack. If you'd like to add your name to the list of contributors in src/mlpack/core.hpp and COPYRIGHT.txt and you haven't already, please feel free to push a change to this PR---or, if it gets merged before you can, feel free to open another PR.

In addition, if you'd like some stickers to put on your laptop, I'd be happy to help get them in the mail for you. Just send an email with your physical mailing address to stickers@mlpack.org, and then one of the mlpack maintainers will put some stickers in an envelope for you. It may take a few weeks to get them, depending on your location. 👍

1 similar comment from mlpack-bot (May 12, 2019)

@rcurtin
Member

rcurtin commented May 12, 2019

Sorry for the erroneous mlpack-bot message :( (but if you want stickers, as always, we're happy to send them)

@rcurtin rcurtin merged commit 0f17a19 into mlpack:master May 15, 2019
@rcurtin
Member

rcurtin commented May 15, 2019

@niteya-shah thanks for the contribution! I am releasing 1.15.0 with the new support now. I made some style fixes in 7e8108d.
