(I apologize if this question is better suited for Stack Overflow, but I figure posting it here will reach the right audience sooner.)
I'm training this CTC-cost model on the LibriSpeech "train-other-500" dataset, which contains 500 hours of speech audio with transcripts. I'm using the "dev-other" dataset for validation, which is apparently a more challenging audio set to model.
I trained the model over 20 epochs and have provided the distribution of the costs below. The weights are updated using Nesterov momentum.
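(For clarity, here is a minimal NumPy sketch of the update rule I mean. The flat-parameter layout and the `grad_fn` callable are simplifications for illustration, not my actual training code.)

```python
import numpy as np

def nesterov_step(params, velocity, grad_fn, lr=1e-3, momentum=0.9):
    """One Nesterov momentum update (Sutskever look-ahead form).

    params, velocity: flat parameter/velocity arrays (simplified layout).
    grad_fn: callable returning the gradient at a given parameter vector.
    lr, momentum: placeholder hyperparameters, not my actual settings.
    """
    # Evaluate the gradient at the look-ahead point params + momentum * velocity.
    lookahead_grad = grad_fn(params + momentum * velocity)
    velocity = momentum * velocity - lr * lookahead_grad
    return params + velocity, velocity
```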
Since the validation performance plateaus at around iter=25000, I checkpointed the model there and continued training with an exponential learning-rate decay schedule, decreasing the learning rate after each epoch (starting from iter=25000). The CTC costs after a few epochs under this schedule are shown below:
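(Roughly, the schedule looks like the sketch below; `base_lr` and `decay` are illustrative placeholders, not the exact values I'm using.)

```python
def decayed_lr(epoch, base_lr=1e-4, decay=0.95):
    """Exponential per-epoch decay, applied after the iter=25000 checkpoint.

    epoch counts from 0 at the checkpoint; base_lr and decay are placeholders.
    """
    return base_lr * (decay ** epoch)
```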
Unfortunately, this strategy doesn't appear to improve the model's performance. Does anyone have suggestions for improving the model beyond what I've described above?
From the looks of it, your model has high variance. Try reducing the initial learning rate and adding regularization (dropout, or augmenting the inputs with noise); if those don't help, experiment with the model architecture.
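A minimal NumPy sketch of both regularization ideas; `noise_std` and `keep_prob` are placeholder values you'd need to tune:

```python
import numpy as np

rng = np.random.default_rng(0)

def augment_with_noise(features, noise_std=0.05):
    """Add zero-mean Gaussian noise to input features (noise_std is a guess to tune)."""
    return features + rng.normal(0.0, noise_std, size=features.shape)

def dropout(activations, keep_prob=0.9):
    """Inverted dropout: zero each unit with prob 1 - keep_prob, rescale the rest.

    Apply at training time only; use the activations unchanged at test time.
    """
    mask = rng.random(activations.shape) < keep_prob
    return activations * mask / keep_prob
```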