
It seems the results reproduced by this code cannot match the results in the original paper? #1

Open
YihangLou opened this issue Mar 20, 2018 · 106 comments

Comments

@YihangLou

No description provided.

@tengshaofeng
Owner

OK, maybe I will try some image pre-processing and tune the hyperparameters to achieve that.
But this code performs well in my own implementation for medical image recognition.

@YihangLou
Author

Thanks for sharing your code. Maybe there are many tricks in the original implementation, but the gap between the reproduced performance and the results reported in the paper is too large. Hope you can fully reproduce the results in the future!

@tengshaofeng
Owner

OK, I will try to pre-process the images and keep the training process the same as in the paper.
The current code does not do padding, cropping, flipping, and so on; I use Adam (the paper uses SGD),
and I only trained for 100 epochs (about 204 epochs in the paper).
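For reference, a minimal sketch of the standard CIFAR-10 augmentation (pad-and-crop plus horizontal flip) and an SGD setup like the paper describes; the normalization statistics and hyperparameter values below are common defaults I am assuming, not values taken from this repo or the paper:

import torch.optim as optim
from torchvision import transforms

# Standard CIFAR-10 augmentation: pad, random 32x32 crop, random horizontal flip.
train_transform = transforms.Compose([
    transforms.RandomCrop(32, padding=4),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    # Commonly used CIFAR-10 channel means/stds (assumed, not copied from this repo).
    transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2470, 0.2435, 0.2616)),
])

# SGD with momentum instead of Adam; 'model' is the network instance created in train.py,
# and the momentum/weight-decay values here are typical choices, not the paper's exact settings.
optimizer = optim.SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=1e-4)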

@tengshaofeng
Owner

tengshaofeng commented Mar 21, 2018

@YihangLou , Hi, today I modified something and got a new result:
accuracy on the CIFAR-10 test set: 92.66%

@tengshaofeng
Owner

@YihangLou , I modified the optimizer, so the newest result is now 0.9354.

@josianerodrigues

Hi @tengshaofeng,
Was the result you got (0.9354) obtained using only the ResidualAttentionModel_92_32input network in train.py? Or do you first pretrain the network with train_pre.py and then train with train.py?

@123moon

123moon commented May 18, 2018

Can you provide a trained model?

@tengshaofeng
Owner

@josianerodrigues , use only train.py; train_pre.py is just a backup of my code.

@tengshaofeng
Owner

@123moon , I have provided the model from the final epoch. Its accuracy is 0.9332.

@123moon

123moon commented May 19, 2018

The model you provided is 92-32. Do you have a model for ImageNet with 224x224 input? I may not be able to say it clearly in English, sorry to bother you: do you have a trained model for 224*224 images? Your code has been very helpful to me, but I have no way to download this dataset, so I am asking for your help.

@tengshaofeng
Owner

@123moon ,
There is a 224*224 model: the ResidualAttentionModel_92 class in residual_attention_network.py.
To download ImageNet, visit http://image-net.org/download; you need to register yourself.

@123moon

123moon commented May 22, 2018

Yes, I saw that. What I wanted to ask is whether there is an already-trained model; it would take me a long time to train it myself, and my computer does not have enough memory.

@tengshaofeng
Owner

tengshaofeng commented May 22, 2018

No, I don't have one; my computer doesn't have enough storage for such a large dataset either.

@josianerodrigues

josianerodrigues commented May 22, 2018

Hi @tengshaofeng
Could you tell me what the effect of resetting the learning rate at a particular epoch is?

# Decaying Learning Rate
if (epoch+1) / float(total_epoch) == 0.3 or (epoch+1) / float(total_epoch) == 0.6 or (epoch+1) / float(total_epoch) == 0.9:
    lr /= 10
    print('reset learning rate to:', lr)
    for param_group in optimizer.param_groups:
        param_group['lr'] = lr
        print(param_group['lr'])

@tengshaofeng
Owner

@josianerodrigues , it is a training trick. When I decrease the learning rate, the loss starts decreasing quickly again.
That is, after training about 90 epochs with lr=0.1 the loss tends to converge; when I then decrease lr to 0.01, the loss decreases again.
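As an illustration, the same schedule can also be written with PyTorch's built-in MultiStepLR scheduler instead of the exact float comparison in the snippet above; train_one_epoch() below is only a placeholder for the existing loop body, and total_epoch/optimizer are assumed to be the same objects used in train.py:

from torch.optim.lr_scheduler import MultiStepLR

# Divide the learning rate by 10 at 30%, 60% and 90% of training, mirroring the manual logic above.
milestones = [int(total_epoch * r) for r in (0.3, 0.6, 0.9)]
scheduler = MultiStepLR(optimizer, milestones=milestones, gamma=0.1)

for epoch in range(total_epoch):
    train_one_epoch()   # placeholder for the existing training code
    scheduler.step()    # advance the schedule once per epoch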

@josianerodrigues

thanks for the explanation :)

@zhangrong1722

Hi @tengshaofeng
I also work on medical images. You mentioned that this code worked well in your own medical image recognition project. I am having trouble classifying a medical image dataset. Could you tell me more details about it at your convenience? Or could you add my QQ if possible? My QQ number is 1922525328.

Thanks.

@tengshaofeng
Owner

@estelle1722 , I use the 448-input model, and it works well.

@ondrejbiza

Hi,
I also could not reproduce the results of the paper on CIFAR-10 (with my implementation in TensorFlow), even after exchanging a few emails with the author.

@tengshaofeng
Owner

@ondrejba , what is your best result now?

@ondrejbiza

My best accuracy was 94.32%, which is close to the 95.01% reported in the paper, but it does not beat ResNet-164, which has fewer parameters.

@tengshaofeng
Owner

@ondrejba , OK, your result is really better. Have you looked at the ResidualAttentionModel_92_32input architecture in my code? Are there any differences from yours? Or could you share your code with me?

@ondrejbiza

I'm sorry for the delay. I'll look at your code over the weekend.

@tengshaofeng
Owner

@ondrejba thanks

@ondrejbiza

ondrejbiza commented Aug 20, 2018

I noticed many differences just from looking at residual_attention_network.py:

  • I use filter size 3 in the first convolution, you use 5 (probably not important)
  • I don't use max pooling (downsampling 32x32 images after the first convolution is not a good idea)
  • my filter counts for the three scales are [64, 128, 256] whereas you have [256, 512, 1024] filters

I bet there are more differences but I don't have time to go through the whole attention module.
I hope this helps.
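To make the first two differences concrete, here is a rough sketch of the two stems being compared; the channel counts and exact layer order are my assumptions for illustration, not code copied from either implementation:

import torch.nn as nn

# Stem as in this repo (roughly): a 5x5 first convolution followed by max pooling,
# which halves the 32x32 CIFAR input early.
stem_with_pooling = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=5, stride=1, padding=2, bias=False),
    nn.BatchNorm2d(32),
    nn.ReLU(inplace=True),
    nn.MaxPool2d(kernel_size=2, stride=2),   # 32x32 -> 16x16
)

# CIFAR-style stem described above: a single 3x3 convolution and no pooling,
# so the attention stages see the full 32x32 resolution.
stem_without_pooling = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=3, stride=1, padding=1, bias=False),
    nn.BatchNorm2d(64),
    nn.ReLU(inplace=True),
)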

@ondrejbiza

I'm actually surprised that you achieved such a good CIFAR accuracy with max pooling at the start of the network.

@josianerodrigues

josianerodrigues commented Aug 21, 2018

Hi @ondrejba, if possible, could you make your code available?
On which dataset and with which network did you get 94% accuracy? ResNet-164? What do you use after the first convolution? Sorry for taking your time.

@ondrejbiza

Hello,
I got 94.32% accuracy with Attention92 on CIFAR-10. The 95.01% accuracy I mentioned is also for Attention92 evaluated on CIFAR-10; it was reported in the Residual Attention Networks paper but I didn't manage to replicate the results.
I will look into open-sourcing my code.

After the first convolution ... there are all the other convolutions in the network followed by average pooling and a single fully-connected layer. This architecture is described in the Residual Attention Networks paper as well as the Identity Mappings paper that is a follow up to the Deep Residual Learning paper.

Cheers,
Ondrej

@josianerodrigues

Thank you :)

@ondrejbiza

You're welcome!
Let me know if you manage to reproduce Fei Wang's results.

@zhongleilz

Have you ever run into this problem?
TypeError: new() received an invalid combination of arguments - got (float, int, int, int), but expected one of:

  • (torch.device device)
  • (torch.Storage storage)
  • (Tensor other)
  • (tuple of ints size, torch.device device)
  • (object data, torch.device device)

@tengshaofeng
Owner

@zhongleilz Please refer to #3.

@simi2525

simi2525 commented Nov 1, 2018

Can anyone provide or refer me to trained models for CIFAR-10, CIFAR-100 or ImageNet-2017?

@tengshaofeng I saw the Attention-92 trained model without mixup; could you also upload the models for the two results with mixup?

@tengshaofeng
Owner

@simi2525 , you can train it yourself, because trained models are a bit large for GitHub. Also, when I trained the model I did not save the best one. Sorry.

@PistonY

PistonY commented Nov 5, 2018

If you have enough time, please try ImageNet without weight decay. With weight decay of 1e-4 I can't reach the paper's result.

@ondrejbiza

@PistonY Did you use this implementation?

@PistonY

PistonY commented Nov 5, 2018

@PistonY Did you use this implementation?

Yes, and I made some simplifications, but in Gluon, not PyTorch.

@simi2525

simi2525 commented Nov 5, 2018

@tengshaofeng for my current project, all I need are trained models, the one already uploaded is good enough. If I get the time to tinker with it in order to get the initial paper results, I'll be sure to let you know.

@tengshaofeng
Owner

tengshaofeng commented Nov 5, 2018

@simi2525 , um, the uploaded trained model is actually better than the one in the original paper. The paper reports an accuracy of 95.01%, and the uploaded one reaches 95.4%. Both are based on Attention-92.

@sankin1770

sankin1770 commented Dec 11, 2018

@simi2525 , um, the uploaded trained model is actually better than the one in the original paper. The paper reports an accuracy of 95.01%, and the uploaded one reaches 95.4%. Both are based on Attention-92.
May I ask, what is the highest CIFAR-10 test accuracy reported in the papers you currently know of?

@sankin1770

@PistonY , hi, I cannot find the paper AttentionResNeXt. Can you provide the paper's name? Or did you combine Attention and ResNeXt yourself as your own experiment, so the 97% is a target you set for yourself rather than the best result from some paper?

I feel that with something like mixup, even getting the result above 97% doesn't mean much, since everyone can push their results up with it.

@tengshaofeng
Owner

@sankin1770 , without mixup the accuracy is still 95.4%, which is higher than in the original paper. This project is only meant to reproduce the paper's results.

@PistonY

PistonY commented Dec 27, 2018

@sankin1770 Naive.
@tengshaofeng Finally reached 97%; it really wasn't easy.

@sankin1770

@sankin1770 Naive.
@tengshaofeng Finally reached 97%; it really wasn't easy.

OK, I accept your criticism, but I still can't figure out what is innovative about mixup, since everyone who uses it gets an improvement.

@PistonY

PistonY commented Dec 27, 2018

@sankin1770 Naive.
@tengshaofeng Finally reached 97%; it really wasn't easy.

OK, I accept your criticism, but I still can't figure out what is innovative about mixup, since everyone who uses it gets an improvement.

Run more experiments and you will see how hard it is to gain even 0.1 in accuracy. What matters about a method is not novelty but usefulness. Mixup actually counts as a big innovation, but its applicability is also limited.
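For readers who haven't used it, here is roughly what mixup does; this is a common implementation sketch following the mixup paper, not the exact code behind any result in this thread:

import numpy as np
import torch

def mixup_batch(x, y, alpha=0.2):
    # Sample a mixing coefficient and blend each example with a randomly chosen partner.
    lam = float(np.random.beta(alpha, alpha))
    index = torch.randperm(x.size(0)).to(x.device)
    mixed_x = lam * x + (1.0 - lam) * x[index]
    return mixed_x, y, y[index], lam

# In the training loop, the loss is the same convex combination of the two labels' losses:
#   mixed_x, y_a, y_b, lam = mixup_batch(images, labels)
#   outputs = model(mixed_x)
#   loss = lam * criterion(outputs, y_a) + (1 - lam) * criterion(outputs, y_b)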

@sankin1770

@sankin1770 Naive.
@tengshaofeng Finally reached 97%; it really wasn't easy.

OK, I accept your criticism, but I still can't figure out what is innovative about mixup, since everyone who uses it gets an improvement.

Run more experiments and you will see how hard it is to gain even 0.1 in accuracy. What matters about a method is not novelty but usefulness. Mixup actually counts as a big innovation, but its applicability is also limited.

Fair enough; I'm a beginner, please bear with me.

@tengshaofeng
Owner

@PistonY , what methods did you use to reach 97%? Please advise.

@PistonY

PistonY commented Jan 2, 2019

@sankin1770

@tengshaofeng https://arxiv.org/pdf/1812.01187.pdf

Thank you both for your help. After making my own improvements, I also reached 97%.

@tengshaofeng
Owner

@PistonY , you can always surprise me. Thanks.

@sankin1770

@PistonY , you can always surprise me. Thanks.

You two big shots are just complimenting each other, haha.

@tengshaofeng
Owner

@sankin1770 Thanks for your critical suggestions.

@PistonY

PistonY commented Jan 9, 2019

@sankin1770 Did you reproduce the methods from that paper in PyTorch? What did you use to reach 97%?

@PistonY

PistonY commented Jan 9, 2019

@tengshaofeng @sankin1770 And you are welcome to take a look at our new face recognition project, Gluon-Face.

@Hiiamein

I ran the code with python train.py and got the following errors. Do you know how to fix them?

model = ResidualAttentionModel().cuda()
  File "/cluster/home/it_stu19/ResidualAttentionNetwork-pytorch/Residual-Attention-Network/model/residual_attention_network.py", line 236, in __init__
    self.residual_block1 = ResidualBlock(32, 128)  # 32*32
  File "/cluster/home/it_stu19/ResidualAttentionNetwork-pytorch/Residual-Attention-Network/model/basic_layers.py", line 16, in __init__
    self.conv1 = nn.Conv2d(input_channels, output_channels/4, 1, 1, bias = False)
  File "/cluster/apps/anaconda3/5.3.0/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 315, in __init__
    False, _pair(0), groups, bias)
  File "/cluster/apps/anaconda3/5.3.0/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 38, in __init__
    out_channels, in_channels // groups, *kernel_size))
TypeError: new() received an invalid combination of arguments - got (float, int, int, int), but expected one of:
 * (torch.device device)
 * (torch.Storage storage)
 * (Tensor other)
 * (tuple of ints size, torch.device device)
 * (object data, torch.device device)

@skyguidance

@ArtechStark It looks like you are using Python 3. Division in Python 3 behaves differently from Python 2, which results in a float being passed as a channel count to nn.Conv2d.
Changing the ResidualBlock output-channel calculation in basic_layers.py will solve this problem.
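Concretely, the fix is to use Python 3's floor division so nn.Conv2d receives an integer channel count; a minimal sketch of the change to the line shown in the traceback (the same applies anywhere '/' produces a channel count):

import torch.nn as nn

output_channels = 128
# Python 3: '/' is true division and returns a float, which nn.Conv2d rejects;
# '//' is floor division and keeps the channel count an int.
conv1 = nn.Conv2d(32, output_channels // 4, 1, 1, bias=False)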

@Hiiamein

Hiiamein commented Mar 2, 2019

@skyguidance Thank you very much! The problem is solved now.

@gden138

gden138 commented May 20, 2019

@tengshaofeng https://arxiv.org/pdf/1812.01187.pdf

Thank you both for your help. After making my own improvements, I also reached 97%.

Could you explain how you improved it? Could you share the details? Thank you very much.

@November666

November666 commented Oct 23, 2019

Excuse me, your code has been a big help to my research, but when I run train.py the following error appears. Do you know how to fix it? Thank you!
RuntimeError:
An attempt has been made to start a new process before the
current process has finished its bootstrapping phase.

    This probably means that you are not using fork to start your
    child processes and you have forgotten to use the proper idiom
    in the main module:

        if __name__ == '__main__':
            freeze_support()
            ...

    The "freeze_support()" line can be omitted if the program
    is not going to be frozen to produce an executable.

ForkingPickler(file, protocol).dump(obj)

BrokenPipeError: [Errno 32] Broken pipe
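This error usually appears when the DataLoader spawns worker processes (for example on Windows) while train.py's training code runs at module top level. Wrapping the entry point in a main guard, as the message itself suggests, is the usual fix; train() below is only a placeholder for the script's existing training code, and setting num_workers=0 in the DataLoader is an alternative workaround:

def train():
    # placeholder for the existing training/evaluation loop in train.py
    pass

if __name__ == '__main__':
    # Worker processes created by torch.utils.data.DataLoader re-import this module;
    # the guard keeps them from re-running the training code at import time.
    train()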

@tengshaofeng
Owner

tengshaofeng commented Oct 25, 2019 via email
