training for classification does not converge #19

zwangab91 · 2018-07-27T06:52:31Z

I tried to train the classification models for AlexNet and Inception with the hyperparameters in train.py ('learning_rate_decay_type': 'exponential', 'learning_rate': '0.01', 'learning_rate_decay_factor': '0.1'), but the loss fluctuates around 6 and 11 respectively for the two models. I tried tuning the learning rate in the range from 1e-5 to 0.1, but the training still shows no sign of convergence, even after 10,000 steps. Could you tell me which hyperparameters you used to train the classification models, so that I can reproduce the results, and what the final values of the cross-entropy loss were?
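For readers unfamiliar with the schedule named in these settings, here is a minimal sketch of a common form of exponential learning-rate decay. It is not taken from train.py: the function name, the `decay_steps` value, and the staircase behaviour are assumptions for illustration only, since the thread does not say how often the decay is applied.

```python
def exponential_decay(base_lr, decay_factor, step, decay_steps, staircase=True):
    """Common exponential decay: base_lr * decay_factor ** (step / decay_steps).

    decay_steps is NOT given in the thread; any value passed in is an assumption.
    With staircase=True the rate drops in discrete jumps every decay_steps steps.
    """
    exponent = step // decay_steps if staircase else step / decay_steps
    return base_lr * decay_factor ** exponent


if __name__ == "__main__":
    # Values from the post: learning_rate=0.01, learning_rate_decay_factor=0.1.
    # decay_steps=25_000 is a hypothetical placeholder, not the repository's value.
    for step in (0, 10_000, 25_000, 50_000):
        print(step, exponential_decay(0.01, 0.1, step, decay_steps=25_000))
```

Under these assumed settings the learning rate would still be 0.01 at step 10,000, so the schedule itself would not yet have changed anything at the point where the loss was inspected.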

yuantailing · 2018-07-27T08:27:32Z

We didn't tune the hyper-parameters. The hyper-parameters we used are the ones you find in the git repository.
I don't remember the final cross-entropy loss, but as far as I know the loss consists only of the cross-entropy term.
10,000 steps is far from convergence; we trained for 100,000 steps. 1 epoch is 800,000 / 64 = 12,500 steps. Don't expect the network to learn well before 1 epoch.

Please be patient. I believe you can reproduce the result without any modification (the only difference will come from the random seed).
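The arithmetic in this reply can be checked with a short sketch. The dataset size (800,000) and batch size (64) are the figures quoted above; the function names are made up for illustration and do not come from the repository.

```python
import math


def steps_per_epoch(num_examples: int, batch_size: int) -> int:
    """Number of optimizer steps needed to see every example once."""
    return math.ceil(num_examples / batch_size)


def epochs_completed(steps: int, num_examples: int, batch_size: int) -> float:
    """Fraction of the dataset (in epochs) covered after a number of steps."""
    return steps * batch_size / num_examples


if __name__ == "__main__":
    num_examples, batch_size = 800_000, 64  # figures quoted in the thread
    print(steps_per_epoch(num_examples, batch_size))            # steps in one epoch
    print(epochs_completed(10_000, num_examples, batch_size))   # the asker's run
    print(epochs_completed(100_000, num_examples, batch_size))  # the authors' run
```

With these numbers, 10,000 steps covers 0.8 of an epoch, while the 100,000 steps used by the authors correspond to 8 epochs.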

zwangab91 · 2018-07-30T10:23:36Z

Thanks! The loss did drop down to around 2 after 5 epochs of training.
