
Bugfixes of Issue #36 (#37)

Closed · wants to merge 2 commits

Conversation

JP-SystemsX
Contributor

Addressing Issue #36

The `_get_layer_params` function of the fine-tuning arguments should now:

  • retrieve all trainable parameters, as long as the naming convention is kept
  • avoid an exploding learning rate caused by a bug in the calculation of the layer-wise decreasing learning rate (see the sketch below)
  • raise an exception if a model doesn't keep the naming convention

Still has problems with:

  • models that don't keep the naming convention, such as ALBERT

The `_get_layer_params` function didn't include all necessary parameters and still didn't throw an exception.
Layer-wise decay was easily exponentiated by a negative number, which made the learning rate explode.
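To illustrate the learning rate bug, here is a minimal sketch of layer-wise learning-rate decay. The function name, the regex, and the treatment of non-layer parameters are assumptions for illustration, not the actual small-text implementation.

```python
import re


def get_layer_params(model, base_lr, decay_factor):
    """Group trainable parameters by layer and assign per-layer learning rates.

    Layer i (0 = lowest) receives base_lr * decay_factor ** (num_layers - i),
    so the learning rate shrinks towards the lower layers.
    """
    # Infer the number of encoder layers from names like "encoder.layer.11.*".
    layer_indices = [
        int(m.group(1))
        for name, _ in model.named_parameters()
        if (m := re.search(r'\.layer\.(\d+)\.', name))
    ]
    if not layer_indices:
        # Models such as ALBERT deviate from this naming convention;
        # raising is better than silently returning an incomplete parameter set.
        raise ValueError('Model does not follow the expected naming convention.')
    num_layers = max(layer_indices) + 1

    param_groups = []
    for name, param in model.named_parameters():
        if not param.requires_grad:
            continue
        match = re.search(r'\.layer\.(\d+)\.', name)
        # For brevity, non-layer parameters (e.g. embeddings) count as layer 0;
        # a real implementation would treat the classification head separately.
        layer_idx = int(match.group(1)) if match else 0
        exponent = num_layers - layer_idx
        # The bug: if the exponent ever turns negative (e.g. num_layers is
        # miscounted), then with decay_factor < 1 we get
        # decay_factor ** exponent > 1, so the learning rate grows instead
        # of decaying. Clamping the exponent avoids the explosion.
        lr = base_lr * decay_factor ** max(exponent, 0)
        param_groups.append({'params': [param], 'lr': lr})
    return param_groups
```

For example, with decay_factor = 0.95 and twelve layers, layer 0 ends up at roughly 0.54 * base_lr; with a negative exponent the factor would instead exceed 1 and compound upwards.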
@chschroeder
Contributor

Awesome! Thank you for your work and the PR 👍.

There is a small hurdle before I can take a close look: these changes would be better placed on the v1.3.x branch, so that I can quickly publish a v1.3.1 bugfix release. On the main branch, where they are now, everything past the last release is unreleased; if we started there, the fix would have to wait for the v2.0.0 release (which needs some more time).

Sorry if that has been unclear. I can assist you with the git rebase tomorrow; I am too tired now ;).

@JP-SystemsX
Contributor Author

Oh, sorry, my bad, I pressed the wrong button. I actually wanted those changes applied to the dev branch, where I also added them, not to main (as written in your 'how to contribute' docs).
Would a new PR to the dev branch already solve the problem?

@chschroeder
Contributor

In this case dev is also not suitable, since dev in particular already contains v2.0.0 changes. Please start from the v1.3.x branch. (Then I can make a bugfix release, and we can just merge the changes into dev afterwards.)

I think you could update this PR, but if a new PR is easier, feel free to just open one. Thanks!

@JP-SystemsX
Contributor Author

Yeah, a new PR was far easier.
I cherry-picked those commits over to a new branch based on v1.3.x and opened a new PR from that branch: PR #38.
