
Re-init weights to avoid recompile #609

Closed
PilCAki opened this issue Aug 27, 2015 · 20 comments

Comments

@PilCAki

PilCAki commented Aug 27, 2015

If you'd like to average NNs with different inits, there is currently no easy way to get a different random initialization of the weights without recompiling.

Currently I can train nets on medium-sized data in about a minute, but compiling can take several minutes when the structure is particularly complex.

It's possible to save the weights and re-init with the original weights. However, there should be a method to reset a model to a "like-new" state, non-identical to the first init, without having to recompile.
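For reference, the save-and-restore workaround mentioned above looks roughly like the sketch below; the catch is that it restores the *same* initial weights every time rather than drawing a fresh random init (the model here is just illustrative):

```python
from keras.models import Sequential
from keras.layers import Dense

# Illustrative model; any compiled model works the same way.
model = Sequential([Dense(64, activation='relu', input_dim=100),
                    Dense(1)])
model.compile(optimizer='sgd', loss='mse')

# Snapshot the weights right after compiling, before any training.
initial_weights = model.get_weights()

# ... train the model ...

# "Reset" without recompiling -- but it restores the *same* init every time.
model.set_weights(initial_weights)
```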

@fchollet
Collaborator

I've thought about it, and I think I agree.

Would you be interested in submitting a PR for a layer.reset() method, from which we could build a model.reset() method? (Note that the layer method should also work for containers, recursively, since containers must have the same API as layers.)
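To make the proposal concrete, the contract might look like the sketch below. Nothing here exists in Keras yet; the `params` and `initializers` attributes are assumptions about layer internals, used purely for illustration:

```python
class Layer(object):
    def reset(self):
        # Hypothetical: re-draw each weight from the initializer
        # recorded when the layer was built.
        for weight, init in zip(self.params, self.initializers):
            weight.set_value(init(weight.get_value().shape))

class Container(Layer):
    def reset(self):
        # Containers expose the same API as layers, so reset
        # simply recurses into the child layers.
        for layer in self.layers:
            layer.reset()
```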

@PilCAki
Author

PilCAki commented Aug 28, 2015

Yes, I would be interested in that. I'll see what I can do, thanks :)

@pkch

pkch commented Dec 2, 2015

In the meantime, it might be worth clarifying in the documentation how to get a fresh set of randomly initialized weights. As far as I can tell, it doesn't happen just from recompiling?

@fgolemo

fgolemo commented Mar 7, 2016

After some experimenting:

  • you can go through each layer and call its build() function, which resets the weights but doesn't affect the compiled model
  • you can recompile, but that doesn't reset the weights
  • you can do both, in that order, and it will work (i.e. randomize the weights and biases); see the sketch below

I tried implementing that here (only for the Sequential model so far): #1908
Please give me feedback on whether it works for you. I tried it with a few different models and it seemed to reset them successfully.
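A minimal sketch of the two steps above combined, against the Keras-1.x-era API (the exact build() signature varied between versions, so treat this as illustrative):

```python
def reset_weights_and_recompile(model, optimizer='sgd', loss='mse'):
    # 1) Rebuild each layer: this re-draws its weights from the layer's
    #    initializer, but the compiled functions still reference the
    #    old parameters.
    for layer in model.layers:
        layer.build()
    # 2) Recompile so training/prediction pick up the fresh weights.
    model.compile(optimizer=optimizer, loss=loss)
    return model
```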

@seven7e

seven7e commented Sep 5, 2016

@fgolemo Hi, I also have the need to reset weights at the start of each round of cross-validation. Is the reset() function available in a release now?

@gokceneraslan
Contributor

No.

@stale

stale bot commented Jun 13, 2017

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 30 days if no further activity occurs, but feel free to re-open a closed issue if needed.

@daferna

daferna commented Jun 15, 2017

I would like this feature because something suspicious seems to be happening with the memory allocator (at least with the Theano backend). I am using the functional API to do cross-validation, and even if I (1) delete the Python variables for my network, (2) recreate the network and recompile it, and (3) print the auto-generated Theano tensor names with model.summary() (verifying that they are actually different) and re-fit the network on different data, my per-fold loss is (nearly) monotonically decreasing.

For example, my fold losses come out to something like [1.17, 0.22, 0.08, 0.004, 0.04], which suggests that later folds are somehow getting access to the previous fold's weights.

@vmalyi

vmalyi commented Jul 11, 2017

I also feel the need for a layer.reset() method, for the case where I train multiple models with the same configuration but different data.

Thanks!

@stale

stale bot commented Oct 9, 2017

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 30 days if no further activity occurs, but feel free to re-open a closed issue if needed.

@HeikoSchuett

Hi! Has there been any progress on this? I would also love to have this feature for cross-validation of models!

@AndersAsa

Also interested

@javedqadruddin

This seems to work: https://www.codementor.io/nitinsurya/how-to-re-initialize-keras-model-weights-et41zre2g
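For context, the linked snippet works by re-running each layer's initializer op in the current TensorFlow session, roughly like this (TF1-era Keras; a sketch of the linked approach, not an official API):

```python
from keras import backend as K

def reinitialize_weights(model):
    session = K.get_session()
    for layer in model.layers:
        # Re-run the initializer ops for layers that have them.
        if hasattr(layer, 'kernel_initializer'):
            layer.kernel.initializer.run(session=session)
        if hasattr(layer, 'bias_initializer') and getattr(layer, 'use_bias', False):
            layer.bias.initializer.run(session=session)
```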

@bschreck

@javedqadruddin That doesn't work for recurrent layers.
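One way to extend that snippet to recurrent layers is to look at the inner cell object, which is where Keras RNN layers keep their kernel, recurrent kernel, and bias (again a sketch in the same TF1 session style, not a supported API):

```python
def reinitialize_recurrent_layer(layer, session):
    # RNN layers (LSTM, GRU, ...) store their variables on layer.cell;
    # for non-recurrent layers, fall back to the layer itself.
    target = getattr(layer, 'cell', layer)
    for name in ('kernel', 'recurrent_kernel', 'bias'):
        var = getattr(target, name, None)
        if var is not None:
            var.initializer.run(session=session)
```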

@MinnML

MinnML commented Aug 28, 2018

I have the same issue. My workaround for now is to put all the model-creation and compile lines in a function, call del model at the end of each iteration, and recreate the model with that function. With this, my loss starts from roughly the same point each time instead of monotonically decreasing.
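A self-contained sketch of that workaround in a cross-validation loop (the data, model, and hyperparameters here are placeholders):

```python
import numpy as np
from sklearn.model_selection import KFold
from keras.models import Sequential
from keras.layers import Dense

def build_model():
    # Keep all model creation and compilation in one function, so
    # every call returns a freshly initialized, freshly compiled model.
    model = Sequential([Dense(64, activation='relu', input_dim=100),
                        Dense(1)])
    model.compile(optimizer='sgd', loss='mse')
    return model

X, y = np.random.rand(500, 100), np.random.rand(500)  # placeholder data

for train_idx, val_idx in KFold(n_splits=5).split(X):
    model = build_model()
    model.fit(X[train_idx], y[train_idx],
              validation_data=(X[val_idx], y[val_idx]), epochs=3)
    del model  # drop the reference before the next fold
```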

@ceylanb

ceylanb commented Jun 17, 2019

> I have the same issue. My workaround for now is to put all the model-creation and compile lines in a function, call del model at the end of each iteration, and recreate the model with that function.

A memory leak occurred when I created the model and deleted it with del model, so that approach does not work for me. I just need to set the layer weights randomly, and in doing so the layer's initializer should be used rather than numpy.

> This seems to work: https://www.codementor.io/nitinsurya/how-to-re-initialize-keras-model-weights-et41zre2g

I have tried this solution, but an error occurred because of an empty layer initializer: "*** AttributeError: 'NoneType' object has no attribute 'run'"
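A small guard works around that AttributeError: iterate over model.weights and skip any variable that doesn't carry an initializer op (same TF1 session style as the snippets above; a sketch, not an official API):

```python
from keras import backend as K

def safe_reinitialize(model):
    session = K.get_session()
    for weight in model.weights:
        initializer = getattr(weight, 'initializer', None)
        # Variables from some custom or shared layers may have no
        # initializer op attached; skip them instead of crashing.
        if initializer is not None:
            initializer.run(session=session)
```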

@raullalves

Any progress on re-initializing a model with recurrent layers?

@vedal

vedal commented Apr 7, 2020

Is this currently on the roadmap?

@Korrakas

Kindly but strongly reviving this thread.
K-fold cross-validation isn't an esoteric thing to do with your data, yet the lack of a simple, clear way to reinitialize weights makes it unnecessarily complicated. Resetting to the saved initial weights, while it kind of works, (1) is hacky and (2) introduces a bias that conflicts with one of k-fold's objectives: evaluating an architecture.

@GF-Huang

Any solutions?
