
Pre-trained models on ESC-50 #40

Open
Antoine101 opened this issue Jan 15, 2024 · 3 comments

@Antoine101

Hi Khaled,

I want to use the following checkpoints.
[image: checkpoint list]

Just to make sure: when you say pre-trained models on ESC-50 in this case, you mean (in chronological order):

  1. Starting from a model trained on ImageNet
  2. Then training it on Audioset
  3. And finally fine-tuning it on ESC-50

If so, how can I know which config from default_cfgs in model.py was used for the checkpoints above?

Also, have you pre-trained on all ESC-50 folds at once? During cross-validation with sklearn's GridSearchCV, the model is ultimately refit on all folds with the best hyperparameter configuration found. Shouldn't we do the same in deep learning?

Cheers

Antoine

@kkoutini (Owner)

Hi, yes, they are trained in exactly that order: ImageNet -> Audioset -> ESC-50.
There is a model for each fold: the model with fold1 in its name is trained on all folds except fold 1.
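The leave-one-fold-out naming scheme described above can be sketched as follows (a minimal illustration; the fold numbering is ESC-50's standard five pre-made folds, and the helper name is hypothetical, not part of the PaSST code):

```python
# Sketch of the ESC-50 leave-one-fold-out scheme: the checkpoint named
# after fold k is trained on the other four folds and evaluated on fold k.

ALL_FOLDS = [1, 2, 3, 4, 5]  # ESC-50 ships with five pre-made folds

def split_for_checkpoint(test_fold):
    """Return (train_folds, test_fold) for the checkpoint named `foldK`."""
    train_folds = [f for f in ALL_FOLDS if f != test_fold]
    return train_folds, test_fold

# e.g. the "fold1" checkpoint:
train_folds, test_fold = split_for_checkpoint(1)
print(train_folds, test_fold)  # [2, 3, 4, 5] 1
```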
I'm not sure I understand your last question completely, but the hyperparameters used are the same for all folds. The PaSST default config can be found here, and here are some examples of how to run it. One thing to note is that the config of the pretrained model (specified by arch) must match the config of the model you're trying to fine-tune. For example, you cannot load PaSST-L (arch=passt_l_kd_p16_128_ap47) while using the PaSST-S config, or change the patch size or overlap. In these cases, the weight shapes won't match when loading the pre-trained models.
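The shape-mismatch failure mode can be illustrated in plain Python (no PyTorch, and the parameter name and tensor shapes below are illustrative, not the real PaSST shapes): changing the patch size changes the shape of the patch-embedding weight, so the checkpoint tensor no longer fits the model.

```python
# Minimal sketch of why the pretrained arch must match the fine-tuning
# config: loading fails when parameter shapes disagree, e.g. a different
# patch size changes the patch-embedding weight shape.

def check_loadable(model_shapes, checkpoint_shapes):
    """Return the list of parameters whose shapes disagree (empty = loadable)."""
    mismatched = []
    for name, shape in model_shapes.items():
        ckpt_shape = checkpoint_shapes.get(name)
        if ckpt_shape is not None and ckpt_shape != shape:
            mismatched.append((name, shape, ckpt_shape))
    return mismatched

# Model configured with patch size 16 vs. a checkpoint trained with patch size 8:
model = {"patch_embed.weight": (768, 1, 16, 16)}
ckpt = {"patch_embed.weight": (768, 1, 8, 8)}
print(check_loadable(model, ckpt))  # one mismatch for patch_embed.weight
```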

@Antoine101 (Author)

Thank you for getting back to me, Khaled!

I was trying to draw a parallel with sklearn's GridSearchCV, which implements cross-validation and has a refit parameter. It basically means that your model is trained and validated on all combinations of folds to find the best hyperparameter configuration, but once it is found, the model is refit with that configuration on the whole dataset (all folds merged).
So I wondered whether the same should be done in deep learning. The more data, the better, so I would assume that cross-validation here just gives you an average performance across all folds, but retraining your model on all folds together at the end would give you even better performance.
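The cross-validate-then-refit idea above can be sketched in a few lines of plain Python. This is a hedged illustration of the GridSearchCV-style refit step, not PaSST code; `toy_train` and `toy_eval` are stand-ins for real training and evaluation.

```python
# Cross-validate each hyperparameter setting, pick the best by mean score,
# then refit one final model on all folds merged (sklearn's refit=True idea).

def cross_val_score(train, evaluate, folds, params):
    """Mean held-out score of `params` over a leave-one-fold-out loop."""
    scores = []
    for i, test_fold in enumerate(folds):
        train_data = sum(folds[:i] + folds[i + 1:], [])
        model = train(train_data, params)
        scores.append(evaluate(model, test_fold))
    return sum(scores) / len(scores)

def grid_search_with_refit(train, evaluate, folds, param_grid):
    """Pick the best config by CV, then retrain once on the whole dataset."""
    best_params = max(param_grid,
                      key=lambda p: cross_val_score(train, evaluate, folds, p))
    return train(sum(folds, []), best_params), best_params

# Toy stand-ins: "training" records the lr; "evaluation" prefers lr near 0.1.
def toy_train(data, params):
    return {"lr": params["lr"], "n": len(data)}

def toy_eval(model, fold):
    return -abs(model["lr"] - 0.1)

folds = [[1, 2], [3, 4], [5, 6]]
grid = [{"lr": 0.01}, {"lr": 0.1}, {"lr": 1.0}]
model, best = grid_search_with_refit(toy_train, toy_eval, folds, grid)
print(best, model["n"])  # {'lr': 0.1} 6  (refit saw all six examples)
```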

Now, the thing is that ESC-50 is a challenge with pre-made folds and no held-out test set, so you wouldn't be able to test a model trained on all folds.

Anyway, that's not really related to your framework; I was just curious.

@kkoutini (Owner)

Hi! Thanks for the explanation. I don't know if there is a best way to do it, since training on all the folds for every hyperparameter setting can be slow for large models, but of course the results will be less noisy.
