PytorchInsight
This is a PyTorch library with state-of-the-art architectures, pretrained models, and real-time updated results.
This repository aims to accelerate Deep Learning research by making results reproducible and research easier to conduct, all in PyTorch.
Included papers (to be updated):
Attention Models
SENet: Squeeze-and-Excitation Networks (see the sketch after this list)
SKNet: Selective Kernel Networks
CBAM: Convolutional Block Attention Module
GCNet: Non-local Networks Meet Squeeze-Excitation Networks and Beyond
BAM: Bottleneck Attention Module
SGENet: Spatial Group-wise Enhance: Enhancing Semantic Feature Learning in Convolutional Networks
SRMNet: SRM: A Style-based Recalibration Module for Convolutional Neural Networks
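For reference, here is a minimal sketch of the channel-attention idea behind SENet, which the other modules above extend along spatial or grouping dimensions. This is an illustrative re-implementation, not necessarily the exact code in this repo:

import torch
import torch.nn as nn

class SEBlock(nn.Module):
    # Minimal Squeeze-and-Excitation gate; reduction=16 is the paper's default.
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)  # squeeze: global spatial average
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),  # excitation: per-channel gates in (0, 1)
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w  # reweight the channels of the input feature map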
Non-Attention Models
OctNet: Drop an Octave: Reducing Spatial Redundancy in Convolutional Neural Networks with Octave Convolution
imagenet_tricks.py: Bag of Tricks for Image Classification with Convolutional Neural Networks
Understanding the Disharmony between Weight Normalization Family and Weight Decay: e-shifted L2 Regularizer
Generalization Bound Regularizer: A Unified Framework for Understanding Weight Decay
mixup: Beyond Empirical Risk Minimization (see the sketch after this list)
CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features
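As a pointer for the mixup entry above: training inputs are blended as x' = lam * x_i + (1 - lam) * x_j with lam ~ Beta(alpha, alpha), and the loss is interpolated the same way. A minimal sketch of the standard formulation (alpha=0.2 is a typical choice, not necessarily this repo's default):

import torch

def mixup_step(model, criterion, x, y, alpha=0.2):
    # One mixup training step: blend input pairs and interpolate the loss.
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(x.size(0), device=x.device)
    x_mixed = lam * x + (1 - lam) * x[perm]
    logits = model(x_mixed)
    return lam * criterion(logits, y) + (1 - lam) * criterion(logits, y[perm])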
Trained Models and Performance Table
Single-crop validation accuracy on ImageNet-1k (center 224x224 crop from an image resized so that the shorter side = 256).
Classification training settings for medium and large models
Details
RandomResizedCrop, RandomHorizontalFlip; initial lr 0.1, 100 epochs in total, lr decayed by 10x every 30 epochs; SGD with plain softmax cross-entropy loss, weight decay 1e-4, momentum 0.9; 8 GPUs, 32 images per GPU
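In PyTorch terms, this recipe corresponds roughly to the following sketch (model and data loop omitted; names are illustrative):

import torch

# SGD, lr 0.1 decayed by 10x every 30 epochs, weight decay 1e-4, momentum 0.9.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=1e-4)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.1)

for epoch in range(100):
    # ... train one epoch with nn.CrossEntropyLoss ...
    scheduler.step()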
Examples
ResNet50
Note
The newest code adds one default operation: setting the weight decay of all biases to 0 (see the theoretical analysis in "Generalization Bound Regularizer: A Unified Framework for Understanding Weight Decay", to appear). This slightly boosts training accuracy.
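A minimal sketch of that default, assuming the usual param-group approach (presumably what the --nowd-* flags in the commands below toggle; the helper name is illustrative):

import torch

def split_weight_decay(model, weight_decay=1e-4):
    # Biases (and other 1-D parameters such as BN affine weights) get wd = 0.
    decay, no_decay = [], []
    for name, p in model.named_parameters():
        if not p.requires_grad:
            continue
        (no_decay if p.ndim <= 1 or name.endswith(".bias") else decay).append(p)
    return [{"params": decay, "weight_decay": weight_decay},
            {"params": no_decay, "weight_decay": 0.0}]

optimizer = torch.optim.SGD(split_weight_decay(model), lr=0.1, momentum=0.9)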
Classification training settings for mobile/small models
Details
RandomResizedCrop, RandomHorizontalFlip; initial lr 0.4, 300 epochs in total, 5 linear warm-up epochs, cosine lr decay; SGD with softmax cross-entropy loss and label smoothing 0.1, weight decay 4e-5 on conv weights and 0 on all other weights, momentum 0.9; 8 GPUs, 128 images per GPU
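A sketch of this schedule (linear warm-up then cosine decay) and loss, assuming PyTorch >= 1.10 for the label_smoothing argument; conv-only weight decay would use a param-group split like the helper above:

import math
import torch

criterion = torch.nn.CrossEntropyLoss(label_smoothing=0.1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.4, momentum=0.9, weight_decay=4e-5)

def lr_at(epoch, base_lr=0.4, warmup=5, total=300):
    # Linear warm-up for the first `warmup` epochs, then cosine decay to zero.
    if epoch < warmup:
        return base_lr * (epoch + 1) / warmup
    t = (epoch - warmup) / (total - warmup)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * t))

for epoch in range(300):
    for group in optimizer.param_groups:
        group["lr"] = lr_at(epoch)
    # ... train one epoch ...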
Examples
ShuffleNetV2
Typical Training & Testing Tips:
Small Models
ShuffleNetV2_1x
python -m torch.distributed.launch --nproc_per_node=8 imagenet_mobile.py --cos -a shufflenetv2_1x --data /path/to/imagenet1k/ \
--epochs 300 --wd 4e-5 --gamma 0.1 -c checkpoints/imagenet/shufflenetv2_1x --train-batch 128 --opt-level O0 --nowd-bn # Training
python -m torch.distributed.launch --nproc_per_node=2 imagenet_mobile.py -a shufflenetv2_1x --data /path/to/imagenet1k/ \
-e --resume ../pretrain/shufflenetv2_1x.pth.tar --test-batch 100 --opt-level O0 # Testing, ~69.6% top-1 Acc
Large Models
SGE-ResNet
python -W ignore imagenet.py -a sge_resnet101 --data /path/to/imagenet1k/ --epochs 100 --schedule 30 60 90 \
--gamma 0.1 -c checkpoints/imagenet/sge_resnet101 --gpu-id 0,1,2,3,4,5,6,7 # Training
python -m torch.distributed.launch --nproc_per_node=8 imagenet_fast.py -a sge_resnet101 --data /path/to/imagenet1k/ \
--epochs 100 --schedule 30 60 90 --wd 1e-4 --gamma 0.1 -c checkpoints/imagenet/sge_resnet101 --train-batch 32 \
--opt-level O0 --wd-all --label-smoothing 0. --warmup 0 # Training (faster)
python -W ignore imagenet.py -a sge_resnet101 --data /path/to/imagenet1k/ --gpu-id 0,1 -e \
--resume ../pretrain/sge_resnet101.pth.tar # Testing ~78.8% top-1 Acc
python -m torch.distributed.launch --nproc_per_node=2 imagenet_fast.py -a sge_resnet101 --data /path/to/imagenet1k/ -e --resume \
../pretrain/sge_resnet101.pth.tar --test-batch 100 --opt-level O0 # Testing (faster) ~78.8% top-1 Acc
WS-ResNet with e-shifted L2 regularizer, e = 1e-3
python -m torch.distributed.launch --nproc_per_node=8 imagenet_fast.py -a ws_resnet50 --data /path/to/imagenet1k/ \
--epochs 100 --schedule 30 60 90 --wd 1e-4 --gamma 0.1 -c checkpoints/imagenet/es1e-3_ws_resnet50 --train-batch 32 \
--opt-level O0 --label-smoothing 0. --warmup 0 --nowd-conv --mineps 1e-3 --el2
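Here WS presumably stands for Weight Standardization (Qiao et al., 2019), which standardizes each conv filter before the convolution; the e-shifted L2 term itself lives in the training script (--el2, --mineps) and is not reproduced here. A minimal sketch of the WS convolution, not necessarily this repo's exact code:

import torch
import torch.nn as nn
import torch.nn.functional as F

class WSConv2d(nn.Conv2d):
    # Conv2d whose weight is standardized per output filter (zero mean, unit std).
    def forward(self, x):
        w = self.weight
        mean = w.mean(dim=(1, 2, 3), keepdim=True)
        std = w.std(dim=(1, 2, 3), keepdim=True) + 1e-5  # eps for stability
        return F.conv2d(x, (w - mean) / std, self.bias, self.stride,
                        self.padding, self.dilation, self.groups)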
Results of "SGENet: Spatial Group-wise Enhance: Enhancing Semantic Feature Learning in Convolutional Networks"
Note the following results (old) do not set the bias wd = 0 for large models
Classification

| Model | #P | GFLOPs | Top-1 Acc | Top-5 Acc | Download1 | Download2 | log |
|-------|----|--------|-----------|-----------|-----------|-----------|-----|
Detection

| Model | #P | GFLOPs | Detector | Neck | AP50:95 (%) | AP50 (%) | AP75 (%) | Download |
|-------|----|--------|----------|------|-------------|----------|----------|----------|
| ResNet50 | 23.51M | 88.0 | Faster RCNN | FPN | 37.5 | 59.1 | 40.6 | |
| SGE-ResNet50 | 23.51M | 88.1 | Faster RCNN | FPN | 38.7 | 60.8 | 41.7 | |
| ResNet50 | 23.51M | 88.0 | Mask RCNN | FPN | 38.6 | 60.0 | 41.9 | |
| SGE-ResNet50 | 23.51M | 88.1 | Mask RCNN | FPN | 39.6 | 61.5 | 42.9 | |
| ResNet50 | 23.51M | 88.0 | Cascade RCNN | FPN | 41.1 | 59.3 | 44.8 | |
| SGE-ResNet50 | 23.51M | 88.1 | Cascade RCNN | FPN | 42.6 | 61.4 | 46.2 | |
| ResNet101 | 42.50M | 167.9 | Faster RCNN | FPN | 39.4 | 60.7 | 43.0 | |
| SE-ResNet101 | 47.28M | 168.3 | Faster RCNN | FPN | 40.4 | 61.9 | 44.2 | |
| SGE-ResNet101 | 42.50M | 168.1 | Faster RCNN | FPN | 41.0 | 63.0 | 44.3 | |
| ResNet101 | 42.50M | 167.9 | Mask RCNN | FPN | 40.4 | 61.6 | 44.2 | |
| SE-ResNet101 | 47.28M | 168.3 | Mask RCNN | FPN | 41.5 | 63.0 | 45.3 | |
| SGE-ResNet101 | 42.50M | 168.1 | Mask RCNN | FPN | 42.1 | 63.7 | 46.1 | |
| ResNet101 | 42.50M | 167.9 | Cascade RCNN | FPN | 42.6 | 60.9 | 46.4 | |
| SE-ResNet101 | 47.28M | 168.3 | Cascade RCNN | FPN | 43.4 | 62.2 | 47.2 | |
| SGE-ResNet101 | 42.50M | 168.1 | Cascade RCNN | FPN | 44.4 | 63.2 | 48.4 | |
Results of "Understanding the Disharmony between Weight Normalization Family and Weight Decay: e-shifted L2 Regularizer"
Note that the following models are trained with bias weight decay = 0.
Classification

| Model | Top-1 | Download |
|-------|-------|----------|
| WS-ResNet50 | 76.74 | |
| WS-ResNet50 (e = 1e-3) | 76.86 | |
| WS-ResNet101 | 78.07 | |
| WS-ResNet101 (e = 1e-6) | 78.29 | |
| WS-ResNeXt50 (e = 1e-3) | 77.88 | |
| WS-ResNeXt101 (e = 1e-3) | 78.80 | |
| WS-DenseNet201 (e = 1e-8) | 77.59 | |
| WS-ShuffleNetV1 (e = 1e-8) | 68.09 | |
| WS-ShuffleNetV2 (e = 1e-8) | 69.70 | |
| WS-MobileNetV1 (e = 1e-6) | 73.60 | |
Results of "Generalization Bound Regularizer: A Unified Framework for Understanding Weight Decay"
To appear
Citation
If you find our related works useful in your research, please consider citing the papers:
@inproceedings{li2019selective,
  title={Selective Kernel Networks},
  author={Li, Xiang and Wang, Wenhai and Hu, Xiaolin and Yang, Jian},
  booktitle={IEEE Conference on Computer Vision and Pattern Recognition},
  year={2019}
}

@article{li2019spatial,
  title={Spatial Group-wise Enhance: Enhancing Semantic Feature Learning in Convolutional Networks},
  author={Li, Xiang and Hu, Xiaolin and Xia, Yan and Yang, Jian},
  journal={arXiv preprint arXiv:1905.09646},
  year={2019}
}

@article{li2019understanding,
  title={Understanding the Disharmony between Weight Normalization Family and Weight Decay: e-shifted L2 Regularizer},
  author={Li, Xiang and Chen, Shuo and Yang, Jian},
  journal={arXiv preprint arXiv:},
  year={2019}
}

@article{li2019generalization,
  title={Generalization Bound Regularizer: A Unified Framework for Understanding Weight Decay},
  author={Li, Xiang and Chen, Shuo and Gong, Chen and Xia, Yan and Yang, Jian},
  journal={arXiv preprint arXiv:},
  year={2019}
}