Re-init weights to avoid recompile #609
Comments
Thought about it, I think I agree. Would you be interested in submitting a PR for a […]
Yes, I would be interested in that. I'll see what I can do, thanks :)
In the meantime, it might be worth clarifying in the documentation how to get a fresh set of randomly initialized weights. At this point, I don't think that happens just from recompiling?
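For what it's worth, a minimal sketch of what such a re-init could look like, assuming the Keras 2-era backend API (`K.int_shape`, `K.eval`, `K.set_value`) and layers that expose `kernel_initializer`/`bias_initializer`; `random_reinit` is a hypothetical helper, not an existing Keras function:

```python
from keras import backend as K

def random_reinit(layer):
    # Hypothetical helper: draw fresh values from the layer's configured
    # initializers and assign them in place, so the already-compiled model
    # gets a new random init without recompiling.
    if getattr(layer, 'kernel_initializer', None) is not None:
        shape = K.int_shape(layer.kernel)
        K.set_value(layer.kernel, K.eval(layer.kernel_initializer(shape)))
    if getattr(layer, 'use_bias', False) and layer.bias_initializer is not None:
        shape = K.int_shape(layer.bias)
        K.set_value(layer.bias, K.eval(layer.bias_initializer(shape)))

# Usage: for layer in model.layers: random_reinit(layer)
```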
After some experimenting around: […]
I tried implementing that here (only for the Sequential model so far): #1908
@fgolemo Hi, I also have the need to reset weights at the start of each round of cross-validation. Is the reset() function available in a release now?
No.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 30 days if no further activity occurs, but feel free to re-open a closed issue if needed.
I would like this feature, because I feel like something suspicious is happening with the memory allocator (at least with the Theano backend). I am using the functional API to do cross-validation, and even if I (1) delete the Python variables for my network, (2) recreate the network and recompile it, (3) print out the auto-generated Theano tensor names using model.summary() (verifying that they are actually different), and re-fit the network on different data, I notice my per-fold loss is (nearly) monotonically decreasing. For example, my fold losses come out as something like [1.17, 0.22, 0.08, 0.004, 0.04], which suggests that later folds are somehow getting access to the weights of the previous fold.
I also feel the need for a layer.reset() method, for the case where I train multiple models with the same configuration but different data. Thanks!
This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 30 days if no further activity occurs, but feel free to re-open a closed issue if needed.
Hi! Has there been any progress on this? I would also love to have this feature for cross-validation of models!
Also interested
@javedqadruddin That doesn't work for recurrent layers
I have the same issue. My workaround for now is to put all the lines of model creation and compilation in a function, do "del model" at the end of each iteration, and recreate the model with that function. With this, my loss starts from roughly the same point each iteration instead of monotonically decreasing.
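For reference, a minimal sketch of that workaround; the architecture and the `folds` iterable are placeholders, and the `K.clear_session()` call is an extra step often suggested alongside "del model" to release backend graph state between iterations:

```python
from keras import backend as K
from keras.models import Sequential
from keras.layers import Dense

def build_model():
    # All model creation and compilation wrapped in one function.
    model = Sequential()
    model.add(Dense(64, activation='relu', input_dim=20))
    model.add(Dense(1))
    model.compile(optimizer='adam', loss='mse')
    return model

for train_x, train_y in folds:  # `folds` is assumed to be defined elsewhere
    model = build_model()       # fresh weights (and a recompile) every iteration
    model.fit(train_x, train_y, epochs=10)
    del model
    K.clear_session()           # release graph state between iterations
```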
Memory leak occurred when I create the model and delete it with "del model" repeatedly.
I have tried this solution, but an error occurred because of an empty layer initializer. The error was: "*** AttributeError: 'NoneType' object has no attribute 'run'"
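Presumably the snippet being discussed is the session-based re-init that circulates for the TensorFlow backend. A hedged sketch, with a guard for variables whose initializer is None (which would explain the AttributeError above) and with `recurrent_kernel` included so recurrent layers are covered as well:

```python
from keras import backend as K

def reset_weights(model):
    # Re-run each variable's initializer op in the current TF session.
    # The None checks skip variables that carry no initializer op.
    session = K.get_session()
    for layer in model.layers:
        for var_name in ('kernel', 'bias', 'recurrent_kernel'):
            var = getattr(layer, var_name, None)
            if var is not None and getattr(var, 'initializer', None) is not None:
                var.initializer.run(session=session)
```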
Any progress on re-initializing a model with recurrent layers?
Is this currently on the roadmap?
Kindly but strongly revive this thread.
Any solutions?
Original issue:
If you'd like to average NNs with different inits, there is currently no easy way to get a different random initialization of the weights without recompiling.
Currently, I can train nets on medium-sized data in about a minute, but compiling can take several minutes if the model has a particularly complex structure.
It's possible to save the weights and re-initialize with the original weights. However, there should be a method to reset the model to a "like-new" state, with weights non-identical to the first init, without having to recompile.
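For context, the save-and-restore workaround mentioned above looks roughly like this (a sketch; `build_model` and `folds` are placeholders). It restores the *same* initial weights every time rather than drawing a fresh random init, which is exactly the limitation being described:

```python
model = build_model()                  # assumed model-building function that also compiles
initial_weights = model.get_weights()  # snapshot right after compile

for fold_x, fold_y in folds:           # `folds` is assumed to be defined elsewhere
    model.set_weights(initial_weights) # reset to the original init, no recompile
    model.fit(fold_x, fold_y, epochs=10)
```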