VGG16

Models from the BMVC-2014 paper “Return of the Devil in the Details: Delving Deep into Convolutional Nets”

The models are trained on the ILSVRC-2012 dataset. The details can be found on the project page or in the following BMVC-2014 paper:

Return of the Devil in the Details: Delving Deep into Convolutional Nets
K. Chatfield, K. Simonyan, A. Vedaldi, A. Zisserman
British Machine Vision Conference, 2014 (arXiv:1405.3531)

Please cite the paper if you use the models.

Models:

VGG_CNN_S: 13.1% top-5 error on ILSVRC-2012-val
VGG_CNN_M: 13.7% top-5 error on ILSVRC-2012-val
VGG_CNN_M_2048: 13.5% top-5 error on ILSVRC-2012-val
VGG_CNN_M_1024: 13.7% top-5 error on ILSVRC-2012-val
VGG_CNN_M_128: 15.6% top-5 error on ILSVRC-2012-val
VGG_CNN_F: 16.7% top-5 error on ILSVRC-2012-val

Models used by the VGG team in ILSVRC-2014
These models are improved versions of the ones used by the VGG team in the ILSVRC-2014 competition. Details can be found on the project page or in the following arXiv paper:

Very Deep Convolutional Networks for Large-Scale Image Recognition
K. Simonyan, A. Zisserman
arXiv:1409.1556

Please cite the paper if you use the models.

Models:

16-layer: 7.5% top-5 error on ILSVRC-2012-val, 7.4% top-5 error on ILSVRC-2012-test
19-layer: 7.5% top-5 error on ILSVRC-2012-val, 7.3% top-5 error on ILSVRC-2012-test

In the paper, the models are denoted as configurations D and E, trained with scale jittering. The combination of the two models achieves 7.1% top-5 error on ILSVRC-2012-val, and 7.0% top-5 error on ILSVRC-2012-test.

VGG16 for face recognition (caffemodel):
http://www.robots.ox.ac.uk/~vgg/software/vgg_face/

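Notes
Workaround when gist.github.com is blocked and cannot be reached:
On Windows, open the file C:\Windows\System32\drivers\etc\hosts.
Open it in a text editor and append the line 192.30.253.118 gist.github.com at the end of the file.
Save the file.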
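
The hosts edit described above can also be scripted. The following is only a sketch: the helper name add_hosts_entry is hypothetical, the IP address is the one quoted in the note and may no longer be valid, and writing to the real hosts file requires administrator rights. The demonstration therefore runs against a temporary file rather than the system file.

```python
import os
import tempfile


def add_hosts_entry(hosts_path, ip, hostname):
    """Append 'ip hostname' to a hosts-format file unless hostname is already mapped.

    Returns True if a line was added, False if the hostname was already present.
    """
    with open(hosts_path, "r", encoding="utf-8") as f:
        content = f.read()

    for line in content.splitlines():
        # Drop trailing comments, then split into "ip name1 name2 ..."
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2 and hostname in fields[1:]:
            return False

    with open(hosts_path, "a", encoding="utf-8") as f:
        # Make sure the new entry starts on its own line
        if content and not content.endswith("\n"):
            f.write("\n")
        f.write(f"{ip} {hostname}\n")
    return True


# Demonstration on a temporary file, not the real system hosts file
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "hosts")
    with open(demo_path, "w", encoding="utf-8") as f:
        f.write("127.0.0.1 localhost")
    first = add_hosts_entry(demo_path, "192.30.253.118", "gist.github.com")
    second = add_hosts_entry(demo_path, "192.30.253.118", "gist.github.com")
    with open(demo_path, encoding="utf-8") as f:
        demo_result = f.read()

print(first, second)
print(demo_result)
```

To apply it for real, one would pass the path C:\Windows\System32\drivers\etc\hosts from an elevated prompt. The second call returns False because the entry already exists, so the helper is safe to run repeatedly.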

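When training from scratch, change the weight initialization from "gaussian" to "xavier"; otherwise training may fail to converge. How the parameters are initialized matters a great deal.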
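
To give a feel for why the initialization matters, the sketch below compares a fixed Gaussian standard deviation of 0.01 (an assumed value for the gaussian filler being replaced) with the standard deviation implied by fan-in scaling. It assumes Caffe's "xavier" filler in its default mode, which to my understanding draws weights uniformly from [-a, a] with a = sqrt(3 / fan_in), and it assumes a 3-channel input image. The layer names are taken from the network definition; everything else is illustrative.

```python
import math


def xavier_std(fan_in):
    """Standard deviation of a uniform draw from [-a, a] with a = sqrt(3 / fan_in).

    The variance of that distribution is a^2 / 3 = 1 / fan_in.
    """
    return math.sqrt(1.0 / fan_in)


# (layer name, input channels, kernel size); 3 input channels is an assumption
layers = [("conv1_1", 3, 3), ("conv3_1", 128, 3), ("conv5_1", 512, 3)]

for name, in_channels, k in layers:
    fan_in = in_channels * k * k
    print(f"{name}: fan_in={fan_in}, xavier std={xavier_std(fan_in):.4f}, fixed gaussian std=0.0100")
```

For the first layers the fan-in-scaled value is much larger than 0.01, which suggests that a small fixed standard deviation can shrink activations and gradients as they pass through the 16 weight layers. This is one plausible explanation for the convergence problem, not a proof.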

train.prototxt

layer {
  name: "data"
  type: "Python"
  top: "Features"
  top: "Headposes"
  python_param {
    module: "python_read_data_for_softmax"
    layer: "AllDataLayer"
    param_str: "{\'phase\': \'train\', \'dataset_name\': \'lfw\', \'data_type\': \'image\',\'batch_size\': 48,\'cross_id\':0}"
  }
}
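
The param_str above is a Python dict literal passed to the Python layer as a string. The source does not show how AllDataLayer reads it, so the following is only a sketch of one safe way to parse such a string. The helper name parse_layer_params is hypothetical; in a Caffe Python layer the raw string would normally be available as self.param_str inside setup().

```python
import ast

# The param_str value from the data layer, after prototxt unescaping
param_str = (
    "{'phase': 'train', 'dataset_name': 'lfw', 'data_type': 'image',"
    "'batch_size': 48,'cross_id':0}"
)


def parse_layer_params(raw):
    """Parse a dict literal without executing arbitrary code (unlike eval)."""
    params = ast.literal_eval(raw)
    if not isinstance(params, dict):
        raise ValueError("param_str must be a dict literal")
    return params


params = parse_layer_params(param_str)
print(params["phase"], params["dataset_name"], params["batch_size"])
```

Using ast.literal_eval instead of eval keeps the layer from running code that might be embedded in a prototxt file.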

layer {  
  bottom: "Features"  
  top: "conv1_1"  
  name: "conv1_1"  
  type: "Convolution"  
  param {  
    lr_mult: 1  
    decay_mult: 1  
  }  
  param {  
    lr_mult: 2  
    decay_mult: 0  
  }  
  convolution_param {  
    num_output: 64  
    pad: 1  
    kernel_size: 3  
    weight_filler {  
      type: "xavier"  
      std: 0.01  
    }  
    bias_filler {  
      type: "constant"  
      value: 0  
    }  
  }  
}  
layer {  
  bottom: "conv1_1"  
  top: "conv1_1"  
  name: "relu1_1"  
  type: "ReLU"  
}  
layer {  
  bottom: "conv1_1"  
  top: "conv1_2"  
  name: "conv1_2"  
  type: "Convolution"  
  param {  
    lr_mult: 1  
    decay_mult: 1  
  }  
  param {  
    lr_mult: 2  
    decay_mult: 0  
  }  
  convolution_param {  
    num_output: 64  
    pad: 1  
    kernel_size: 3  
    weight_filler {  
      type: "xavier"  
      std: 0.01  
    }  
    bias_filler {  
      type: "constant"  
      value: 0  
    }  
  }  
}  
layer {  
  bottom: "conv1_2"  
  top: "conv1_2"  
  name: "relu1_2"  
  type: "ReLU"  
}  
layer {  
  bottom: "conv1_2"  
  top: "pool1"  
  name: "pool1"  
  type: "Pooling"  
  pooling_param {  
    pool: MAX  
    kernel_size: 2  
    stride: 2  
  }  
}  
layer {  
  bottom: "pool1"  
  top: "conv2_1"  
  name: "conv2_1"  
  type: "Convolution"  
  param {  
    lr_mult: 1  
    decay_mult: 1  
  }  
  param {  
    lr_mult: 2  
    decay_mult: 0  
  }  
  convolution_param {  
    num_output: 128  
    pad: 1  
    kernel_size: 3  
    weight_filler {  
      type: "xavier"  
      std: 0.01  
    }  
    bias_filler {  
      type: "constant"  
      value: 0  
    }  
  }  
}  
layer {  
  bottom: "conv2_1"  
  top: "conv2_1"  
  name: "relu2_1"  
  type: "ReLU"  
}  
layer {  
  bottom: "conv2_1"  
  top: "conv2_2"  
  name: "conv2_2"  
  type: "Convolution"  
  param {  
    lr_mult: 1  
    decay_mult: 1  
  }  
  param {  
    lr_mult: 2  
    decay_mult: 0  
  }  
  convolution_param {  
    num_output: 128  
    pad: 1  
    kernel_size: 3  
    weight_filler {  
      type: "xavier"  
      std: 0.01  
    }  
    bias_filler {  
      type: "constant"  
      value: 0  
    }  
  }  
}  
layer {  
  bottom: "conv2_2"  
  top: "conv2_2"  
  name: "relu2_2"  
  type: "ReLU"  
}  
layer {  
  bottom: "conv2_2"  
  top: "pool2"  
  name: "pool2"  
  type: "Pooling"  
  pooling_param {  
    pool: MAX  
    kernel_size: 2  
    stride: 2  
  }  
}  
layer {  
  bottom: "pool2"  
  top: "conv3_1"  
  name: "conv3_1"  
  type: "Convolution"  
  param {  
    lr_mult: 1  
    decay_mult: 1  
  }  
  param {  
    lr_mult: 2  
    decay_mult: 0  
  }  
  convolution_param {  
    num_output: 256  
    pad: 1  
    kernel_size: 3  
    weight_filler {  
      type: "xavier"  
      std: 0.01  
    }  
    bias_filler {  
      type: "constant"  
      value: 0  
    }  
  }  
}  
layer {  
  bottom: "conv3_1"  
  top: "conv3_1"  
  name: "relu3_1"  
  type: "ReLU"  
}  
layer {  
  bottom: "conv3_1"  
  top: "conv3_2"  
  name: "conv3_2"  
  type: "Convolution"  
  param {  
    lr_mult: 1  
    decay_mult: 1  
  }  
  param {  
    lr_mult: 2  
    decay_mult: 0  
  }  
  convolution_param {  
    num_output: 256  
    pad: 1  
    kernel_size: 3  
    weight_filler {  
      type: "xavier"  
      std: 0.01  
    }  
    bias_filler {  
      type: "constant"  
      value: 0  
    }  
  }  
}  
layer {  
  bottom: "conv3_2"  
  top: "conv3_2"  
  name: "relu3_2"  
  type: "ReLU"  
}  
layer {  
  bottom: "conv3_2"  
  top: "conv3_3"  
  name: "conv3_3"  
  type: "Convolution"  
  param {  
    lr_mult: 1  
    decay_mult: 1  
  }  
  param {  
    lr_mult: 2  
    decay_mult: 0  
  }  
  convolution_param {  
    num_output: 256  
    pad: 1  
    kernel_size: 3  
    weight_filler {  
      type: "xavier"  
      std: 0.01  
    }  
    bias_filler {  
      type: "constant"  
      value: 0  
    }  
  }  
}  
layer {  
  bottom: "conv3_3"  
  top: "conv3_3"  
  name: "relu3_3"  
  type: "ReLU"  
}  
layer {  
  bottom: "conv3_3"  
  top: "pool3"  
  name: "pool3"  
  type: "Pooling"  
  pooling_param {  
    pool: MAX  
    kernel_size: 2  
    stride: 2  
  }  
}  
layer {  
  bottom: "pool3"  
  top: "conv4_1"  
  name: "conv4_1"  
  type: "Convolution"  
  param {  
    lr_mult: 1  
    decay_mult: 1  
  }  
  param {  
    lr_mult: 2  
    decay_mult: 0  
  }  
  convolution_param {  
    num_output: 512  
    pad: 1  
    kernel_size: 3  
    weight_filler {  
      type: "xavier"  
      std: 0.01  
    }  
    bias_filler {  
      type: "constant"  
      value: 0  
    }  
  }  
}  
layer {  
  bottom: "conv4_1"  
  top: "conv4_1"  
  name: "relu4_1"  
  type: "ReLU"  
}  
layer {  
  bottom: "conv4_1"  
  top: "conv4_2"  
  name: "conv4_2"  
  type: "Convolution"  
  param {  
    lr_mult: 1  
    decay_mult: 1  
  }  
  param {  
    lr_mult: 2  
    decay_mult: 0  
  }  
  convolution_param {  
    num_output: 512  
    pad: 1  
    kernel_size: 3  
    weight_filler {  
      type: "xavier"  
      std: 0.01  
    }  
    bias_filler {  
      type: "constant"  
      value: 0  
    }  
  }  
}  
layer {  
  bottom: "conv4_2"  
  top: "conv4_2"  
  name: "relu4_2"  
  type: "ReLU"  
}  
layer {  
  bottom: "conv4_2"  
  top: "conv4_3"  
  name: "conv4_3"  
  type: "Convolution"  
  param {  
    lr_mult: 1  
    decay_mult: 1  
  }  
  param {  
    lr_mult: 2  
    decay_mult: 0  
  }  
  convolution_param {  
    num_output: 512  
    pad: 1  
    kernel_size: 3  
    weight_filler {  
      type: "xavier"  
      std: 0.01  
    }  
    bias_filler {  
      type: "constant"  
      value: 0  
    }  
  }  
}  
layer {  
  bottom: "conv4_3"  
  top: "conv4_3"  
  name: "relu4_3"  
  type: "ReLU"  
}  
layer {  
  bottom: "conv4_3"  
  top: "pool4"  
  name: "pool4"  
  type: "Pooling"  
  pooling_param {  
    pool: MAX  
    kernel_size: 2  
    stride: 2  
  }  
}  
layer {  
  bottom: "pool4"  
  top: "conv5_1"  
  name: "conv5_1"  
  type: "Convolution"  
  param {  
    lr_mult: 1  
    decay_mult: 1  
  }  
  param {  
    lr_mult: 2  
    decay_mult: 0  
  }  
  convolution_param {  
    num_output: 512  
    pad: 1  
    kernel_size: 3  
    weight_filler {  
      type: "xavier"  
      std: 0.01  
    }  
    bias_filler {  
      type: "constant"  
      value: 0  
    }  
  }  
}  
layer {  
  bottom: "conv5_1"  
  top: "conv5_1"  
  name: "relu5_1"  
  type: "ReLU"  
}  
layer {  
  bottom: "conv5_1"  
  top: "conv5_2"  
  name: "conv5_2"  
  type: "Convolution"  
  param {  
    lr_mult: 1  
    decay_mult: 1  
  }  
  param {  
    lr_mult: 2  
    decay_mult: 0  
  }  
  convolution_param {  
    num_output: 512  
    pad: 1  
    kernel_size: 3  
    weight_filler {  
      type: "xavier"  
      std: 0.01  
    }  
    bias_filler {  
      type: "constant"  
      value: 0  
    }  
  }  
}  
layer {  
  bottom: "conv5_2"  
  top: "conv5_2"  
  name: "relu5_2"  
  type: "ReLU"  
}  
layer {  
  bottom: "conv5_2"  
  top: "conv5_3"  
  name: "conv5_3"  
  type: "Convolution"  
  param {  
    lr_mult: 1  
    decay_mult: 1  
  }  
  param {  
    lr_mult: 2  
    decay_mult: 0  
  }  
  convolution_param {  
    num_output: 512  
    pad: 1  
    kernel_size: 3  
    weight_filler {  
      type: "xavier"  
      std: 0.01  
    }  
    bias_filler {  
      type: "constant"  
      value: 0  
    }  
  }  
}  
layer {  
  bottom: "conv5_3"  
  top: "conv5_3"  
  name: "relu5_3"  
  type: "ReLU"  
}  
layer {  
  bottom: "conv5_3"  
  top: "pool5"  
  name: "pool5"  
  type: "Pooling"  
  pooling_param {  
    pool: MAX  
    kernel_size: 2  
    stride: 2  
  }  
}  
layer {  
  bottom: "pool5"  
  top: "fc6"  
  name: "fc6"  
  type: "InnerProduct"  
  param {  
    lr_mult: 1  
    decay_mult: 1  
  }  
  param {  
    lr_mult: 2  
    decay_mult: 0  
  }  
  inner_product_param {  
    num_output: 4096  
    weight_filler {  
      type: "xavier"  
      std: 0.005  
    }  
    bias_filler {  
      type: "constant"  
      value: 0.1  
    }  
  }  
}  
layer {  
  bottom: "fc6"  
  top: "fc6"  
  name: "relu6"  
  type: "ReLU"  
}  
layer {  
  bottom: "fc6"  
  top: "fc6"  
  name: "drop6"  
  type: "Dropout"  
  dropout_param {  
    dropout_ratio: 0.5  
  }  
}  
layer {  
  bottom: "fc6"  
  top: "fc7"  
  name: "fc7"  
  type: "InnerProduct"  
  param {  
    lr_mult: 1  
    decay_mult: 1  
  }  
  param {  
    lr_mult: 2  
    decay_mult: 0  
  }  
  inner_product_param {  
    num_output: 4096  
    weight_filler {  
      type: "xavier"  
      std: 0.005  
    }  
    bias_filler {  
      type: "constant"  
      value: 0.1  
    }  
  }  
}  
layer {  
  bottom: "fc7"  
  top: "fc7"  
  name: "relu7"  
  type: "ReLU"  
}  
layer {  
  bottom: "fc7"  
  top: "fc7"  
  name: "drop7"  
  type: "Dropout"  
  dropout_param {  
    dropout_ratio: 0.5  
  }  
}  
layer {  
  bottom: "fc7"  
  top: "fc8"  
  name: "fc8"  
  type: "InnerProduct"  
  param {  
    lr_mult: 1  
    decay_mult: 1  
  }  
  param {  
    lr_mult: 2  
    decay_mult: 0  
  }  
  inner_product_param {  
    num_output: 5  
    weight_filler {  
      type: "xavier"  
      std: 0.005  
    }  
    bias_filler {  
      type: "constant"  
      value: 0.1  
    }  
  }  
}  

layer {  
  bottom: "fc8"  
  bottom: "Headposes"  
  top: "headposeloss1"  
  name: "headposeloss1"  
  type: "SoftmaxWithLoss"  
}  
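
As a sanity check on the network above, the sketch below walks the layer list and counts learnable parameters. It assumes a 224x224 input with 3 channels, which the prototxt does not state because the input shape comes from the custom data layer. The convolution and fully connected sizes are taken from the definition, including the 5-way fc8.

```python
# (name, output channels) for each 3x3 convolution, "pool" for each 2x2, stride-2 max pool
CONFIG = [
    ("conv1_1", 64), ("conv1_2", 64), "pool",
    ("conv2_1", 128), ("conv2_2", 128), "pool",
    ("conv3_1", 256), ("conv3_2", 256), ("conv3_3", 256), "pool",
    ("conv4_1", 512), ("conv4_2", 512), ("conv4_3", 512), "pool",
    ("conv5_1", 512), ("conv5_2", 512), ("conv5_3", 512), "pool",
]
FC = [("fc6", 4096), ("fc7", 4096), ("fc8", 5)]


def count_parameters(height=224, width=224, channels=3):
    """Return the pool5 output shape (C, H, W) and the total number of weights and biases."""
    total = 0
    for item in CONFIG:
        if item == "pool":
            # Kernel 2 with stride 2 halves each spatial dimension (Caffe rounds up)
            height, width = -(-height // 2), -(-width // 2)
        else:
            _, out_channels = item
            # pad 1 with kernel 3 keeps the spatial size; count weights plus biases
            total += channels * out_channels * 3 * 3 + out_channels
            channels = out_channels

    features = channels * height * width
    for _, out_features in FC:
        total += features * out_features + out_features
        features = out_features
    return (channels, height, width), total


pool5_shape, total_params = count_parameters()
print("pool5 output shape:", pool5_shape)
print("total parameters:", total_params)
```

Under these assumptions most of the parameters sit in fc6, which connects the flattened pool5 output to 4096 units. A different input resolution changes only the fc6 count.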

solver.prototxt:

train_net: "ZOO_VGG16/train2.prototxt"
base_lr: 0.01
display: 20
max_iter: 100000
lr_policy: "step"
gamma: 0.1
momentum: 0.9
weight_decay: 0.0005
stepsize: 50000
snapshot: 5000
snapshot_prefix: "snapshots_pointing04_vgg16_klLoss/cross1/vgg_"
random_seed: 831486
iter_size: 1
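
With lr_policy "step", Caffe multiplies the learning rate by gamma once every stepsize iterations. The sketch below reproduces that schedule for the values in this solver file. The function name step_lr is only illustrative.

```python
def step_lr(iteration, base_lr=0.01, gamma=0.1, stepsize=50000):
    """Learning rate under the "step" policy: base_lr * gamma^floor(iteration / stepsize)."""
    return base_lr * gamma ** (iteration // stepsize)


for it in (0, 49999, 50000, 99999):
    print(f"iteration {it}: lr = {step_lr(it):.6f}")
```

With max_iter set to 100000 and stepsize set to 50000, the rate drops once, at the halfway point of training.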

References:

  1. https://github.com/NVIDIA/DIGITS/issues/159#issuecomment-247707549
  2. https://github.com/NVIDIA/DIGITS/issues/535
  3. https://stackoverflow.com/questions/42652903/caffe-net-not-converging-when-replacing-alexnet-with-vgg16-but-everything-else?rq=1
  4. https://github.com/BVLC/caffe/wiki/Model-Zoo
  5. https://www.cnblogs.com/ryans/p/8196151.html
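### Introduction to the VGG16 deep learning model

VGG16 is a classic convolutional neural network architecture that performs well on a range of computer vision tasks, including image classification, object detection, and semantic segmentation[^2]. The model was proposed by the Visual Geometry Group at the University of Oxford. Its strong performance and simple design have made it a staple of deep learning.

### Loading a pretrained VGG16 model

To get started quickly with VGG16 for image processing tasks, the pretrained weights can be loaded with the tools provided by PyTorch and TensorFlow:

#### Loading VGG16 with PyTorch

```python
import torch
from torchvision import models

# Select the device (CPU/GPU)
device = 'cuda' if torch.cuda.is_available() else 'cpu'

# Create a VGG16 instance and load the ImageNet-pretrained parameters
# (torchvision >= 0.13; older releases use models.vgg16(pretrained=True))
model_vgg16 = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1).to(device)

# Switch to evaluation mode
model_vgg16.eval()
```

For fine-tuning or transfer learning in a specific application, the last fully connected layer can be replaced to fit the new dataset:

```python
num_classes = 2  # assumed value (e.g. cats vs. dogs); set it to your dataset's class count
num_features = model_vgg16.classifier[-1].in_features
# Replace the last linear layer to match the number of classes
model_vgg16.classifier[-1] = torch.nn.Linear(num_features, num_classes).to(device)
```

#### Loading VGG16 with TensorFlow/Keras

```python
import tensorflow as tf
from tensorflow.keras.applications import VGG16

num_classes = 2  # assumed value (e.g. cats vs. dogs); set it to your dataset's class count

# No top layers, input shape (224, 224, 3), ImageNet-pretrained weights
base_model = VGG16(weights='imagenet', include_top=False, input_shape=(224, 224, 3))

for layer in base_model.layers:
    layer.trainable = False  # freeze every layer of the base model

# Add a custom classifier on top
x = base_model.output
x = tf.keras.layers.GlobalAveragePooling2D()(x)
predictions = tf.keras.layers.Dense(num_classes, activation='softmax')(x)

final_model = tf.keras.Model(inputs=base_model.input, outputs=predictions)
```

The code above shows how to initialize the base VGG16 architecture with pretrained weights in each framework and adapt it for later application work[^3].

### Application example

Because requirements differ widely between real-world scenarios, a simple example is considered here: a binary classification problem based on VGG16, such as telling cat pictures from dog pictures. It involves basic data augmentation as well as adjusting the model structure to the task at hand.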