Questions about whether to use additional datasets #5

Open
challenge-my opened this issue Dec 3, 2024 · 22 comments
@challenge-my

challenge-my commented Dec 3, 2024

Hi Kevin,

I am very interested in your great work for DCASE2023, so I have some questions. Did you use a GPU for training in your TensorFlow code? I got an error during the training process. Have you encountered this before, and how did you handle it? The error message and the relevant code are shown below.

Thank you!
Olaf

[screenshot of the error message]

@wilkinghoff
Owner

Hi Olaf,

Thank you for your interest! Yes, I used GPUs for training.

It seems that something went wrong during training; maybe the scale parameter of the loss function somehow blew up. I have observed this occasionally as well. Does this error occur every time you train?

If you want, you can try replacing the loss with the AdaProj loss, which is numerically much more stable: https://github.com/wilkinghoff/AdaProj

Best,
Kevin

@challenge-my
Author

First of all, thank you for your advice; I will run the code from the link you sent on my server. I have encountered the above problem about 8 times now. Secondly, may I confirm the data layout? In the ./eval_data directory, I put the additional dataset and the evaluation dataset together by machine type, and I put the development dataset in the ./dev_data directory. Is this correct? Thank you.

@challenge-my
Author

challenge-my commented Dec 4, 2024

Hi Kevin,
It still doesn't work. I have run the code more than ten times, and it still reports the error above. Do you have any suggestions? Thank you.

@wilkinghoff
Owner

> In the ./eval_data directory, I put the additional dataset and the evaluation dataset together by machine type, and I put the development dataset in the ./dev_data directory. Is this correct?

Hi Olaf,

yes, I think so. You should have a structure like this:
/dev_data/machine_type1/train/file42
or
/eval_data/machine_type3/test/file66

Best,
Kevin

@wilkinghoff
Owner

> It still doesn't work. I have run the code more than ten times, and it still reports the error above.

Hi Olaf,

have you deleted the stored network weights between runs?

To make sure that the scale parameter is really the issue, you can replace "if training:" with "if 0:" in line 41 of subcluster_adacos.py.

Best,
Kevin

@challenge-my
Author

Hi Kevin,
After I deleted the stored model weights, the numerical error above was solved, but now there is a type error. I have tried many times but cannot solve it. Have you encountered this problem? Thank you.

[screenshot of the type error]

@challenge-my
Author

Hi Kevin,
Thank you for sharing your code. I encountered some difficulties when trying to run it locally and was unable to get it working properly. To move forward with my work, I was wondering if you could send me the version that runs successfully on your machine; it would help me better understand and resolve the issues. My email address is lixiang1@stu.xmu.edu.cn. I would greatly appreciate your help.

Additionally, I will make sure to cite your work wherever I use your code, acknowledging your contribution.

Thank you again for your support!

Best regards,
Olaf

@wilkinghoff
Owner

> After I deleted the stored model weights, the numerical error above was solved, but now there is a type error.

Hi Olaf,

that's good! I sometimes noticed this error when storing/re-loading the extracted features to hard disk within the same run. You can either just re-run the script after the features are already stored or only store them in RAM to solve the issue.
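
The pattern in question is essentially the usual cache-to-disk idiom; a rough sketch (the file name and the dummy extraction below are placeholders, not the actual script):

import os
import numpy as np

def load_or_extract_features(path='features_sketch.npy'):
    # If the file already exists (e.g. from a previous run), only the np.load
    # branch is taken, so extraction and re-loading never both happen within
    # the same run.
    if os.path.isfile(path):
        return np.load(path)
    features = np.random.rand(4, 64).astype(np.float32)  # placeholder for real feature extraction
    np.save(path, features)
    return features

The RAM-only variant simply skips the np.save/np.load round trip and returns the extracted array directly.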

Best,
Kevin

@wilkinghoff
Owner

> I was wondering if you could send me the version that runs successfully on your machine, which would help me better understand and resolve the issues.

Hi Olaf,

Sorry, but I cannot provide you with any other code, as this is the version I have been running on my machine.

Best,
Kevin

@challenge-my
Author

> I sometimes noticed this error when storing/re-loading the extracted features to hard disk within the same run. You can either just re-run the script after the features are already stored or only store them in RAM to solve the issue.

Thank you very much, Kevin. Best wishes.

@challenge-my
Author

> Sorry, but I cannot provide you with any other code, as this is the version I have been running on my machine.

Thank you~

@challenge-my
Author

Hi Kevin,
First of all, thank you for your answers above. Secondly, I have fixed the error, but I ran into another problem when running the code again. I would be very grateful if you could tell me what the folder './dcase2023_task2_evaluator-main/ground_truth_data/ground_truth_' is for and where the files inside it come from. How should I get these files? The red part in the picture shows the specific directory and code. Thank you.
[screenshot of the directory and code]

Best, Olaf

@wilkinghoff
Owner

Hi Olaf,
Happy that I can help. You can find the files in this official repo: https://github.com/nttcslab/dcase2023_task2_evaluator
Best,
Kevin

@challenge-my
Author

Hi Kevin,
Thanks to your help, I finally got results, but my results on the evaluation set are very poor: basically only about 50%, far below the accuracy in your paper, and some are even about 20% lower than yours. What do you think is the reason? Is it a parameter issue, for example the batch size, or not enough training epochs? Thank you.

[screenshot of the results]

Best, Olaf

@wilkinghoff
Owner

Hi Olaf,

the results for the evaluation set are completely random. The reason is that the audio recordings and label files do not correspond to each other because they are not sorted consistently, and the resulting order depends on your operating system. You can try sorting the file lists before loading (both the files themselves and the corresponding labels).
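
For example, something like this before pairing recordings with labels (a sketch; the example path below is a placeholder):

import os

def sorted_file_list(directory):
    # os.listdir returns entries in an arbitrary, OS/filesystem-dependent order,
    # so sort explicitly before pairing recordings with their labels
    return sorted(os.listdir(directory))

# e.g. sorted_file_list('./eval_data/machine_type/test')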

Best,
Kevin

@challenge-my
Author

Hi Kevin,
Thank you for your answer. May I ask another question? Are the recordings and label files that do not correspond to each other the ones shown in the picture below, i.e. the files in the ./dcase2023_task2_evaluator-main/ground_truth_data and ./dcase2023_task2_evaluator-main/ground_truth_domain directories? If so, how should they be matched? I am a beginner and not very familiar with this, but I am trying my best to learn, and I would be very grateful if you could show me how to match the recording files with the labels. Thank you~

[screenshot]

@wilkinghoff
Owner

Hi Olaf,
sure! Sorry, I could have specified that a bit more clearly. You are right. The mismatch comes from the way the evaluation files are loaded, because there I am just looping over the directory (see lines 357/358 in main_statex+featex.py). Depending on the OS, this may result in an order different from the alphabetical one. So the only thing you need to change is to sort the array eval_files alphabetically and use the same sorting indices to reorder eval_raw, eval_ids, eval_normal, eval_atts and eval_domains directly after loading them.
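
Something like this, directly after loading (a sketch; it assumes all of these are NumPy arrays in the same order as the unsorted file list):

import numpy as np

def sort_eval_arrays(eval_files, eval_raw, eval_ids, eval_normal, eval_atts, eval_domains):
    # Sort the file list alphabetically and apply the same permutation to every
    # parallel array, so recordings and labels line up regardless of the order
    # in which the OS listed the directory.
    eval_files = np.asarray(eval_files)
    order = np.argsort(eval_files)
    return (eval_files[order], eval_raw[order], eval_ids[order],
            eval_normal[order], eval_atts[order], eval_domains[order])
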
I hope this helps.
Best,
Kevin

@challenge-my
Author

Hi Kevin,
Thank you for your suggestion; I tried it. I sorted eval_files in lexicographic order, then sorted eval_raw, eval_ids, eval_normal, eval_atts and eval_domains in the same order and stored them in npy format, but the result on the evaluation set is still very poor. The red part is the code I added, which only does the sorting. Do you think the problem is in the newly added code, or is there some other reason? Thank you.
[screenshots of the added code and the results]

@wilkinghoff
Owner

Hi Olaf,
sorry, my bad! The sorting should be done for the test files of course and not for the eval files. I hope that this solves your issues.
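
I.e., the same argsort pattern as in the earlier sketch, just applied to the test-set arrays (the test_* names below are assumptions mirroring the eval_* ones):

# reuse the helper from the earlier sketch on the test-set arrays (names assumed)
test_files, test_raw, test_ids, test_normal, test_atts, test_domains = sort_eval_arrays(
    test_files, test_raw, test_ids, test_normal, test_atts, test_domains)
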
Best,
Kevin

@challenge-my
Author

Hi Kevin,
Thank you very much for your help; I have successfully solved my problem. I will cite you wherever I use your work. Thanks again.

I have another small question. Do the results in the red box correspond to the mixed results in your paper?

[screenshots of the results]

Best,
Olaf

@wilkinghoff
Owner

Hi Olaf,
glad that it is finally working and sorry for the hassle! Thanks again for your interest.
Yes, roughly, but you should take the output of the last lines of the script:

print('final results for development set')
print(np.round(np.mean(final_results_dev*100, axis=0), 2))
print(np.round(np.std(final_results_dev*100, axis=0), 2))
print('final results for evaluation set')
print(np.round(np.mean(final_results_eval*100, axis=0), 2))
print(np.round(np.std(final_results_eval*100, axis=0), 2))

as these lines take the mean over 10 independent trials and thus give more reliable results than just taking the results of a single run.

Best,
Kevin

@challenge-my
Author

Hi Kevin,

Thank you for your help, and I wish you all the best in the future.

Olaf
