It seems that the results reproduced by this code cannot match the results in the original paper? #1
OK, maybe I will try some image pre-processing and tune the hyperparameters to achieve that.
Thanks for sharing your code. Maybe there are many tricks in the original implementation, but the performance margin from the paper's reported results is too large. Hope you can perfectly reproduce the results in the future!
OK, I will pre-process the images and keep the training process the same as in the paper.
@YihangLou , Hi, today I modified something and got a new result:
@YihangLou , I modified the optimizer, so the newest result is now 0.9354.
Hi @tengshaofeng, |
Can you provide a trained model?
@josianerodrigues , use only train.py; train_pre.py is just my backup of the code.
@123moon , I have provided the model from the final epoch. Its accuracy is 0.9332.
The model you provide is the 92-32 one. Do you have a model trained on ImageNet with 224×224 images? Your code is very helpful to me, but I have no way to download that dataset, so I am asking for your help.
@123moon , |
Yes, I saw it. What I wanted to ask is whether there is an already-trained model; training it myself would take a long time, and my computer does not have enough memory.
I don't have one for that; my computer also doesn't have enough storage for such a large dataset.
Hi @tengshaofeng
@josianerodrigues , it is a learning-rate trick: when I decrease the learning rate, the loss decreases quickly.
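For context, this kind of trick is usually a step-decay schedule: hold the learning rate for a number of epochs, then multiply it by a decay factor. A minimal sketch of the idea in plain Python — the milestones and `gamma` below are illustrative assumptions, not the settings actually used in this repo:

```python
def step_lr(base_lr, epoch, milestones=(80, 120), gamma=0.1):
    """Step-decay schedule: multiply the LR by gamma at each passed milestone."""
    lr = base_lr
    for m in milestones:
        if epoch >= m:
            lr *= gamma
    return lr

print(step_lr(0.1, 0))    # base LR before the first milestone
print(step_lr(0.1, 100))  # decayed once, after epoch 80
```

Each drop typically produces the sudden fall in the loss curve described above, because the optimizer can settle into a sharper minimum once the step size shrinks.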
thanks for the explanation :) |
Hi @tengshaofeng Thanks. |
@estelle1722 , I use the 448 input; it converges well.
Hi, |
@ondrejba , what is your best result now? |
My best accuracy was 94.32%, which is close to the 95.01% reported in the paper, but it does not beat ResNet-164 with fewer parameters.
@ondrejba , OK, your result is really better. Have you looked at the ResidualAttentionModel_92_32input architecture in my code? Are there any differences from yours? Or could you share your code with me?
I'm sorry for the delay. I'll look at your code over the weekend. |
@ondrejba thanks |
I noticed many differences just from looking at residual_attention_network.py:
I bet there are more differences but I don't have time to go through the whole attention module. |
I'm actually surprised that you achieved such a good CIFAR accuracy with max pooling at the start of the network. |
Hi @ondrejba, If possible could you make your code available? |
Hello, After the first convolution ... there are all the other convolutions in the network, followed by average pooling and a single fully-connected layer. This architecture is described in the Residual Attention Networks paper as well as the Identity Mappings paper, which is a follow-up to the Deep Residual Learning paper. Cheers,
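The tail end described above (average pooling over the final feature map, then one fully-connected layer producing class logits) can be sketched in plain Python. The shapes here are hypothetical and torch-free, just to show the two operations:

```python
def global_avg_pool(feature_map):
    # feature_map: C x H x W nested lists -> length-C vector (one mean per channel)
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in feature_map]

def fully_connected(x, weights, bias):
    # weights: num_classes x C, bias: num_classes -> one logit per class
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]

# toy example: 2 channels of 2x2 features, identity FC weights
fm = [[[1.0, 3.0], [5.0, 7.0]], [[2.0, 2.0], [2.0, 2.0]]]
pooled = global_avg_pool(fm)                                  # [4.0, 2.0]
logits = fully_connected(pooled, [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
```

In the real networks this pair replaces any extra classifier stack: pooling collapses the spatial dimensions so the single linear layer only needs C inputs per class.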
Thank you :) |
You're welcome! |
Have you ever run into this problem?
@zhongleilz Please refer to #3.
Can anyone provide or refer me to trained models for CIFAR-10, CIFAR-100, or ImageNet-2017? @tengshaofeng I saw the trained Attention-92 model without mixup; could you also upload the models for the two results with mixup?
@simi2525 , you can train it yourself, because trained models are a little big for GitHub. And when I trained the model I did not save the best checkpoint. Sorry.
If you have enough time, please try ImageNet without weight decay. With weight decay of 1e-4 I can't reach the paper's result.
@PistonY Did you use this implementation? |
Yes, and I did some simplification, but in Gluon, not PyTorch.
@tengshaofeng for my current project, all I need are trained models, the one already uploaded is good enough. If I get the time to tinker with it in order to get the initial paper results, I'll be sure to let you know. |
@simi2525 , um, the uploaded trained model is actually better than the one in the initial paper. The paper reports an accuracy of 95.01%, and the uploaded one reaches 95.4%. Both are based on Attention-92.
I feel that with mixup, even getting the result above 97% doesn't mean much, since everyone can get that boost just by using it.
@sankin1770 , without mixup it still reaches 95.4% accuracy, higher than in the original paper. This project just reproduces the paper's results.
@sankin1770 Naive.
Fine, I accept your criticism, but I still can't figure out what is innovative about mixup; everyone who uses it gets an improvement.
Run more experiments and you'll see how hard it is to improve accuracy by even 0.1. The value of a method lies not in novelty but in usefulness. Besides, mixup counts as a major innovation, though its applicability is also limited.
I see. I'm a beginner, please forgive me.
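For readers following the exchange, mixup itself is only a few lines: each training example becomes a convex combination of two examples, with one-hot labels mixed by the same coefficient drawn from a Beta distribution. A minimal sketch of the idea from the mixup paper, not the code anyone in this thread used; `alpha` is the usual Beta parameter:

```python
import random

def mixup(x1, y1, x2, y2, alpha=0.2):
    # lam ~ Beta(alpha, alpha); mix inputs and one-hot labels with the same weight
    lam = random.betavariate(alpha, alpha)
    x = [lam * a + (1 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1 - lam) * b for a, b in zip(y1, y2)]
    return x, y
```

Because the label is mixed with the same weight as the input, the network is trained to behave linearly between examples, which is where the regularization effect comes from.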
@PistonY , what method did you use to get to 97%? Please share.
Thanks for your help. After my own improvements I also reached 97%.
@PistonY , you can always surprise me. Thanks.
You two big shots are just flattering each other, haha.
@sankin1770 Thanks for your critical suggestion.
@sankin1770 Did you reproduce the method from that paper in PyTorch? What did you use to reach 97%?
@tengshaofeng @sankin1770 And welcome to have a look at our new face recognition project, Gluon-Face.
I ran the code with python train.py and got the following errors. Do you know how to fix them?
@ArtechStark It looks like you are using Python 3. The divide function in Python 3 is different in Python 2, which causes float result in nn.conv2D input. |
@skyguidance Thank you very much! The problem is solved now. |
Could you explain how you improved it? Could you share the details? Thank you very much.
Excuse me, your code has been a big help in my research, but when I run train.py, the following error appears. Do you know how to fix it? Thank you!
BrokenPipeError: [Errno 32] Broken pipe
Install a torch build compiled with CUDA, if a GPU is available.
Traceback (most recent call last):
  File "D:/wcj/code/python/ClassicNetwork-master/demo15.py", line 104, in <module>
    model = ft_net().cuda()
  File "C:\Users\cg\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 311, in cuda
    return self._apply(lambda t: t.cuda(device))
  File "C:\Users\cg\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 208, in _apply
    module._apply(fn)
  File "C:\Users\cg\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 208, in _apply
    module._apply(fn)
  File "C:\Users\cg\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 230, in _apply
    param_applied = fn(param)
  File "C:\Users\cg\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 311, in <lambda>
    return self._apply(lambda t: t.cuda(device))
  File "C:\Users\cg\Anaconda3\lib\site-packages\torch\cuda\__init__.py", line 178, in _lazy_init
    _check_driver()
  File "C:\Users\cg\Anaconda3\lib\site-packages\torch\cuda\__init__.py", line 92, in _check_driver
    raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled