模型库概览

概述

基于ImageNet1k分类数据集，PaddleClas支持的36种系列分类网络结构以及对应的175个图像分类预训练模型如下所示，训练技巧、每个系列网络结构的简单介绍和性能评估将在相应章节展现。

评估环境

CPU的评估环境基于骁龙855（SD855）。
Intel CPU的评估环境基于Intel(R) Xeon(R) Gold 6148。
GPU评估环境基于V100和TensorRT。

如果您觉得此文档对您有帮助，欢迎star我们的项目：https://github.com/PaddlePaddle/PaddleClas

预训练模型列表及下载地址

ResNet及其Vd系列
- ResNet系列^[1](论文地址)
  - ResNet18
  - ResNet34
  - ResNet50
  - ResNet101
  - ResNet152
- ResNet_vc、ResNet_vd系列^[2](论文地址)
轻量级模型系列
- PP-LCNet系列^[28](论文地址)
- MobileNetV3系列^[3](论文地址)
- MobileNetV2系列^[4](论文地址)
- MobileNetV1系列^[5](论文地址)
- ShuffleNetV2系列^[6](论文地址)
- GhostNet系列^[23](论文地址)
- MixNet系列^[29](论文地址)
- ReXNet系列^[30](论文地址)
SEResNeXt与Res2Net系列
- ResNeXt系列^[7](论文地址)
- ResNeXt_vd系列
- SE_ResNet_vd系列^[8](论文地址)
- SE_ResNeXt系列
  - SE_ResNeXt50_32x4d
  - SE_ResNeXt101_32x4d
- SE_ResNeXt_vd系列
  - SE_ResNeXt50_vd_32x4d
  - SENet154_vd
- Res2Net系列^[9](论文地址)
Inception系列
- GoogLeNet系列^[10](论文地址)
  - GoogLeNet
- InceptionV3系列^[26](论文地址)
  - InceptionV3
- InceptionV4系列^[11](论文地址)
  - InceptionV4
- Xception系列^[12](论文地址)
HRNet系列
- HRNet系列^[13](论文地址)
DPN与DenseNet系列
- DPN系列^[14](论文地址)
  - DPN68
  - DPN92
  - DPN98
  - DPN107
  - DPN131
- DenseNet系列^[15](论文地址)
EfficientNet与ResNeXt101_wsl系列
- EfficientNet系列^[16](论文地址)
- ResNeXt101_wsl系列^[17](论文地址)
ResNeSt与RegNet系列
- ResNeSt系列^[24](论文地址)
  - ResNeSt50_fast_1s1x64d
  - ResNeSt50
- RegNet系列^[25](paper link)
  - RegNetX_4GF
Transformer系列
- Swin-transformer系列^[27](论文地址)
- ViT系列^[31](论文地址)
- DeiT系列^[32](论文地址)
- LeViT系列^[33](论文地址)
- Twins系列^[34](论文地址)
- TNT系列^[35](论文地址)
  - TNT_small
其他模型
- AlexNet系列^[18](论文地址)
  - AlexNet
- SqueezeNet系列^[19](论文地址)
  - SqueezeNet1_0
  - SqueezeNet1_1
- VGG系列^[20](论文地址)
  - VGG11
  - VGG13
  - VGG16
  - VGG19
- DarkNet系列^[21](论文地址)
  - DarkNet53
- RepVGG系列^[36](论文地址)
  - RepVGG_A0
  - RepVGG_A1
  - RepVGG_A2
  - RepVGG_B0
  - RepVGG_B1s
  - RepVGG_B2
  - RepVGG_B1g2
  - RepVGG_B1g4
  - RepVGG_B2g4
  - RepVGG_B3g4
- HarDNet系列^[37](论文地址)
- DLA系列^[38](论文地址)
  - DLA102
  - DLA102x2
  - DLA102x
  - DLA169
  - DLA34
  - DLA46_c
  - DLA60
  - DLA60x_c
  - DLA60x
- RedNet系列^[39](论文地址)
  - RedNet26
  - RedNet38
  - RedNet50
  - RedNet101
  - RedNet152

注意：以上模型中EfficientNetB1-B7的预训练模型转自pytorch版EfficientNet，ResNeXt101_wsl系列预训练模型转自官方repo，剩余预训练模型均基于飞桨训练得到的，并在configs里给出了相应的训练超参数。

参考文献

[1] He K, Zhang X, Ren S, et al. Deep residual learning for image recognition[C]//Proceedings of the IEEE conference on computer vision and pattern recognition. 2016: 770-778.

[2] He T, Zhang Z, Zhang H, et al. Bag of tricks for image classification with convolutional neural networks[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2019: 558-567.

[3] Howard A, Sandler M, Chu G, et al. Searching for mobilenetv3[C]//Proceedings of the IEEE International Conference on Computer Vision. 2019: 1314-1324.

[4] Sandler M, Howard A, Zhu M, et al. Mobilenetv2: Inverted residuals and linear bottlenecks[C]//Proceedings of the IEEE conference on computer vision and pattern recognition. 2018: 4510-4520.

[5] Howard A G, Zhu M, Chen B, et al. Mobilenets: Efficient convolutional neural networks for mobile vision applications[J]. arXiv preprint arXiv:1704.04861, 2017.

[6] Ma N, Zhang X, Zheng H T, et al. Shufflenet v2: Practical guidelines for efficient cnn architecture design[C]//Proceedings of the European Conference on Computer Vision (ECCV). 2018: 116-131.

[7] Xie S, Girshick R, Dollár P, et al. Aggregated residual transformations for deep neural networks[C]//Proceedings of the IEEE conference on computer vision and pattern recognition. 2017: 1492-1500.

[8] Hu J, Shen L, Sun G. Squeeze-and-excitation networks[C]//Proceedings of the IEEE conference on computer vision and pattern recognition. 2018: 7132-7141.

[9] Gao S, Cheng M M, Zhao K, et al. Res2net: A new multi-scale backbone architecture[J]. IEEE transactions on pattern analysis and machine intelligence, 2019.

[10] Szegedy C, Liu W, Jia Y, et al. Going deeper with convolutions[C]//Proceedings of the IEEE conference on computer vision and pattern recognition. 2015: 1-9.

[11] Szegedy C, Ioffe S, Vanhoucke V, et al. Inception-v4, inception-resnet and the impact of residual connections on learning[C]//Thirty-first AAAI conference on artificial intelligence. 2017.

[12] Chollet F. Xception: Deep learning with depthwise separable convolutions[C]//Proceedings of the IEEE conference on computer vision and pattern recognition. 2017: 1251-1258.

[13] Wang J, Sun K, Cheng T, et al. Deep high-resolution representation learning for visual recognition[J]. arXiv preprint arXiv:1908.07919, 2019.

[14] Chen Y, Li J, Xiao H, et al. Dual path networks[C]//Advances in neural information processing systems. 2017: 4467-4475.

[15] Huang G, Liu Z, Van Der Maaten L, et al. Densely connected convolutional networks[C]//Proceedings of the IEEE conference on computer vision and pattern recognition. 2017: 4700-4708.

[16] Tan M, Le Q V. Efficientnet: Rethinking model scaling for convolutional neural networks[J]. arXiv preprint arXiv:1905.11946, 2019.

[17] Mahajan D, Girshick R, Ramanathan V, et al. Exploring the limits of weakly supervised pretraining[C]//Proceedings of the European Conference on Computer Vision (ECCV). 2018: 181-196.

[18] Krizhevsky A, Sutskever I, Hinton G E. Imagenet classification with deep convolutional neural networks[C]//Advances in neural information processing systems. 2012: 1097-1105.

[19] Iandola F N, Han S, Moskewicz M W, et al. SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and< 0.5 MB model size[J]. arXiv preprint arXiv:1602.07360, 2016.

[20] Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition[J]. arXiv preprint arXiv:1409.1556, 2014.

[21] Redmon J, Divvala S, Girshick R, et al. You only look once: Unified, real-time object detection[C]//Proceedings of the IEEE conference on computer vision and pattern recognition. 2016: 779-788.

[22] Ding X, Guo Y, Ding G, et al. Acnet: Strengthening the kernel skeletons for powerful cnn via asymmetric convolution blocks[C]//Proceedings of the IEEE International Conference on Computer Vision. 2019: 1911-1920.

[23] Han K, Wang Y, Tian Q, et al. GhostNet: More features from cheap operations[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020: 1580-1589.

[24] Zhang H, Wu C, Zhang Z, et al. Resnest: Split-attention networks[J]. arXiv preprint arXiv:2004.08955, 2020.

[25] Radosavovic I, Kosaraju R P, Girshick R, et al. Designing network design spaces[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020: 10428-10436.

[26] C.Szegedy, V.Vanhoucke, S.Ioffe, J.Shlens, and Z.Wojna. Rethinking the inception architecture for computer vision. arXiv preprint arXiv:1512.00567, 2015.

[27] Ze Liu, Yutong Lin, Yue Cao, Han Hu, Yixuan Wei, Zheng Zhang, Stephen Lin and Baining Guo. Swin Transformer: Hierarchical Vision Transformer using Shifted Windows.

[28]Cheng Cui, Tingquan Gao, Shengyu Wei, Yuning Du, Ruoyu Guo, Shuilong Dong, Bin Lu, Ying Zhou, Xueying Lv, Qiwen Liu, Xiaoguang Hu, Dianhai Yu, Yanjun Ma. PP-LCNet: A Lightweight CPU Convolutional Neural Network.

[29]Mingxing Tan, Quoc V. Le. MixConv: Mixed Depthwise Convolutional Kernels.

[30]Dongyoon Han, Sangdoo Yun, Byeongho Heo, YoungJoon Yoo. Rethinking Channel Dimensions for Efficient Model Design.

[31]Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, Neil Houlsby. AN IMAGE IS WORTH 16X16 WORDS: TRANSFORMERS FOR IMAGE RECOGNITION AT SCALE.

[32]Hugo Touvron, Matthieu Cord, Matthijs Douze, Francisco Massa, Alexandre Sablayrolles, Herve Jegou. Training data-efficient image transformers & distillation through attention.

[33]Benjamin Graham, Alaaeldin El-Nouby, Hugo Touvron, Pierre Stock, Armand Joulin, Herve Jegou, Matthijs Douze. LeViT: a Vision Transformer in ConvNet’s Clothing for Faster Inference.

[34]Xiangxiang Chu, Zhi Tian, Yuqing Wang, Bo Zhang, Haibing Ren, Xiaolin Wei, Huaxia Xia, Chunhua Shen. Twins: Revisiting the Design of Spatial Attention in Vision Transformers.

[35]Kai Han, An Xiao, Enhua Wu, Jianyuan Guo, Chunjing Xu, Yunhe Wang. Transformer in Transformer.

[36]Xiaohan Ding, Xiangyu Zhang, Ningning Ma, Jungong Han, Guiguang Ding, Jian Sun. RepVGG: Making VGG-style ConvNets Great Again.

[37]Ping Chao, Chao-Yang Kao, Yu-Shan Ruan, Chien-Hsiang Huang, Youn-Long Lin. HarDNet: A Low Memory Traffic Network.

[38]Fisher Yu, Dequan Wang, Evan Shelhamer, Trevor Darrell. Deep Layer Aggregation.

[39]Duo Lim Jie Hu, Changhu Wang, Xiangtai Li, Qi She, Lei Zhu, Tong Zhang, Qifeng Chen. Involution: Inverting the Inherence of Convolution for Visual Recognition.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

models_intro.md

models_intro.md

模型库概览

概述

评估环境

预训练模型列表及下载地址

参考文献

Files

models_intro.md

Latest commit

History

models_intro.md

File metadata and controls

模型库概览

概述

评估环境

预训练模型列表及下载地址

参考文献