
Pre-trained models on ESC-50 #40

Open
Antoine101 opened this issue Jan 15, 2024 · 3 comments

@Antoine101

Hi Khaled,

I want to use the following checkpoints.
[image: checkpoint list]

Just to make sure: when you say pre-trained models on ESC-50 in this case, you mean (in chronological order):

  1. Starting from a model trained on ImageNet
  2. Then training it on Audioset
  3. And finally fine-tuning it on ESC-50

If so, how can I know which config from default_cfgs in model.py was used for the checkpoints above?

Also, have you pre-trained on all ESC-50 folds at once? During cross-validation with sklearn's GridSearchCV, the model is ultimately refit on all folds with the best hyperparameter configuration found. Shouldn't we do the same in deep learning?

Cheers

Antoine

@kkoutini (Owner)

Hi, yes, they are trained in exactly that order: ImageNet -> Audioset -> ESC-50.
There is a model for each fold: the model with fold1 in its name is trained on all folds except fold 1.
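The leave-one-fold-out naming scheme described above can be sketched as follows (a minimal illustration; the fold numbering is ESC-50's standard five pre-made folds, and the helper name is hypothetical, not part of the PaSST code):

```python
# Sketch of the ESC-50 leave-one-fold-out scheme: the checkpoint named
# after fold k is trained on the other four folds and evaluated on fold k.

ALL_FOLDS = [1, 2, 3, 4, 5]  # ESC-50 ships with five pre-made folds

def split_for_checkpoint(test_fold):
    """Return (train_folds, test_fold) for the checkpoint named `foldK`."""
    train_folds = [f for f in ALL_FOLDS if f != test_fold]
    return train_folds, test_fold

# e.g. the "fold1" checkpoint:
train_folds, test_fold = split_for_checkpoint(1)
print(train_folds, test_fold)  # [2, 3, 4, 5] 1
```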
I'm not sure I understand your last question completely, but the hyperparameters used are the same for all folds. The PaSST default config can be found here, and here are some examples of how to run it. One thing to note is that the config of the pretrained model (specified by arch) must match the config of the model you're trying to fine-tune. For example, you cannot load PaSST-L (arch=passt_l_kd_p16_128_ap47) while using the PaSST-S config, or change the patch size or overlap. In these cases, the weight shapes won't match when loading the pre-trained models.
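The shape-mismatch failure mode can be illustrated in plain Python (no PyTorch, and the parameter name and tensor shapes below are illustrative, not the real PaSST shapes): changing the patch size changes the shape of the patch-embedding weight, so the checkpoint tensor no longer fits the model.

```python
# Minimal sketch of why the pretrained arch must match the fine-tuning
# config: loading fails when parameter shapes disagree, e.g. a different
# patch size changes the patch-embedding weight shape.

def check_loadable(model_shapes, checkpoint_shapes):
    """Return the list of parameters whose shapes disagree (empty = loadable)."""
    mismatched = []
    for name, shape in model_shapes.items():
        ckpt_shape = checkpoint_shapes.get(name)
        if ckpt_shape is not None and ckpt_shape != shape:
            mismatched.append((name, shape, ckpt_shape))
    return mismatched

# Model configured with patch size 16 vs. a checkpoint trained with patch size 8:
model = {"patch_embed.weight": (768, 1, 16, 16)}
ckpt = {"patch_embed.weight": (768, 1, 8, 8)}
print(check_loadable(model, ckpt))  # one mismatch for patch_embed.weight
```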

@Antoine101 (Author)

Thank you for getting back to me, Khaled!

I was trying to draw a parallel with sklearn's GridSearchCV, which implements cross-validation and has a refit parameter. It basically means that your model is trained and validated on all combinations of folds to find the best hyperparameter configuration, but once it is found, the model is refit with that configuration on the whole dataset (all folds merged).
So I wondered whether the same should be done in deep learning. The more data, the better, so I would assume that cross-validation here just gives you an average performance across all folds, but retraining your model on all folds together at the end would give you even better performance.
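The cross-validate-then-refit idea above can be sketched in a few lines of plain Python. This is a hedged illustration of the GridSearchCV-style refit step, not PaSST code; `toy_train` and `toy_eval` are stand-ins for real training and evaluation.

```python
# Cross-validate each hyperparameter setting, pick the best by mean score,
# then refit one final model on all folds merged (sklearn's refit=True idea).

def cross_val_score(train, evaluate, folds, params):
    """Mean held-out score of `params` over a leave-one-fold-out loop."""
    scores = []
    for i, test_fold in enumerate(folds):
        train_data = sum(folds[:i] + folds[i + 1:], [])
        model = train(train_data, params)
        scores.append(evaluate(model, test_fold))
    return sum(scores) / len(scores)

def grid_search_with_refit(train, evaluate, folds, param_grid):
    """Pick the best config by CV, then retrain once on the whole dataset."""
    best_params = max(param_grid,
                      key=lambda p: cross_val_score(train, evaluate, folds, p))
    return train(sum(folds, []), best_params), best_params

# Toy stand-ins: "training" records the lr; "evaluation" prefers lr near 0.1.
def toy_train(data, params):
    return {"lr": params["lr"], "n": len(data)}

def toy_eval(model, fold):
    return -abs(model["lr"] - 0.1)

folds = [[1, 2], [3, 4], [5, 6]]
grid = [{"lr": 0.01}, {"lr": 0.1}, {"lr": 1.0}]
model, best = grid_search_with_refit(toy_train, toy_eval, folds, grid)
print(best, model["n"])  # {'lr': 0.1} 6  (refit saw all six examples)
```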

Now, the thing is that ESC-50 is a challenge with pre-made folds and no held-out test set, so you wouldn't be able to test a model trained on all folds.

Anyway, that's not really related to your framework; I was just curious.

@kkoutini (Owner)

Hi! Thanks for the explanation. I don't know if there is a best way to do it, since training on all the folds for every hyperparameter setting can be slow for large models, but of course the results will be less noisy.
