
Rework the Kaiming initialization. #573

Merged — 2 commits into main, Nov 20, 2022
Conversation

LaurentMazare (Owner)

Add support for the following in Kaiming initialization:

  • Use FanIn or FanOut rather than always defaulting to FanIn.
  • Add support for using a normal distribution.
  • Specify the non-linearity that follows this layer so that the gain is adapted.

This is a breaking change, so it will only be included in the next major release. It also changes the default behavior of Kaiming uniform initialization (the default for most linear/conv layers): the gain is multiplied by sqrt(2) to account for the non-linearity, and a bugfix included here multiplies the bounds by sqrt(3). Overall, the initial values are scaled up by a factor of sqrt(6).
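The bound computation described above can be sketched as follows. This is a minimal illustration of the math, not the tch-rs API: the names `FanMode`, `relu_gain`, and `kaiming_uniform_bound` are hypothetical, and it assumes the standard Kaiming formula where weights are drawn from U(-b, b) with b = gain * sqrt(3 / fan).

```rust
/// Illustrative fan-mode selector; not the actual tch-rs type.
#[derive(Clone, Copy)]
enum FanMode {
    FanIn,
    FanOut,
}

/// Gain for a ReLU non-linearity: sqrt(2). This is the sqrt(2) factor
/// mentioned above that the reworked default now applies.
fn relu_gain() -> f64 {
    2f64.sqrt()
}

/// Bound b of the uniform distribution U(-b, b) used by Kaiming uniform
/// initialization: b = gain * sqrt(3 / fan). The sqrt(3) factor is the
/// bugfix mentioned above; `mode` selects fan-in or fan-out.
fn kaiming_uniform_bound(fan_in: i64, fan_out: i64, mode: FanMode, gain: f64) -> f64 {
    let fan = match mode {
        FanMode::FanIn => fan_in,
        FanMode::FanOut => fan_out,
    } as f64;
    gain * (3.0 / fan).sqrt()
}

fn main() {
    // Example: a linear layer with 128 inputs, 64 outputs, ReLU after it.
    // With fan-in mode: b = sqrt(2) * sqrt(3 / 128) = sqrt(6 / 128).
    let b = kaiming_uniform_bound(128, 64, FanMode::FanIn, relu_gain());
    println!("{b:.6}"); // prints "0.216506"
}
```

Note how the sqrt(2) gain and the sqrt(3) bound factor combine into the overall sqrt(6) scaling mentioned in the description.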

@LaurentMazare LaurentMazare merged commit df445ed into main Nov 20, 2022
@LaurentMazare LaurentMazare deleted the kaiming-initialization branch May 23, 2023 07:14