Caffe shape mismatch error using pretrained VGG-16 model

北漠苍狼1746430162

Article link: https://blog.csdn.net/zouyu1746430162/article/details/53787324

I am using PyCaffe to implement a neural network inspired by the VGG 16 layer network. I want to use the pre-trained model available from their GitHub page. Generally this works by matching layer names.

For my "fc6" layer I have the following definition in my train.prototxt file:

layer {
  name: "fc6"
  type: "InnerProduct"
  bottom: "pool5"
  top: "fc6"
  inner_product_param {
    num_output: 4096
  }
}

Here is the prototxt file for the VGG-16 deploy architecture. Note that the "fc6" in their prototxt is identical to mine (except for the learning rate, but that's irrelevant). It's also worth noting that the inputs are the same size in my model too: 3-channel 224x224px images.

I have been following this tutorial pretty closely, and the block of code that's giving me an issue is the following:

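import os.path as osp
import caffe

# model_root is the directory holding solver.prototxt and the .caffemodel
solver = caffe.SGDSolver(osp.join(model_root, 'solver.prototxt'))
solver.net.copy_from(osp.join(model_root, 'VGG_ILSVRC_16_layers.caffemodel'))
solver.test_nets[0].share_with(solver.net)
solver.step(1)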

The SGDSolver call loads my solver prototxt, and the copy_from call then copies the weights from the pre-trained model (VGG_ILSVRC_16_layers.caffemodel). When I run this, I get the following error:

Cannot copy param 0 weights from layer 'fc6'; shape mismatch.  Source param 
shape is 1 1 4096 25088 (102760448); target param shape is 4096 32768 (134217728). 
To learn this layer's parameters from scratch rather than copying from a saved 
net, rename the layer.

The gist of it, as I read it, is that their model expects the layer to be of size 1x1x4096 while mine is just 4096. But I don't see how I can change this.

I found this answer in the Users Google group instructing me to do net surgery to reshape the pre-trained model before copying, but in order to do that I need the lmdb files from the original architecture's data layers, which I don't have (it throws an error when I try to run the net surgery script).

edited Apr 7 at 1:15

asked Apr 7 at 1:03

marcman

You don't have a problem with the output dimension 4096, but rather with the input dimension: the pre-trained VGG weights expect an input of dim 25088, while your fc6 receives an input of dim 32768. You changed something along the conv layers that changed the feature size. – Shai Apr 7 at 5:08
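
The two sizes in the error message can be decoded with a little arithmetic. The sketch below assumes that pool5 has 512 channels and a square feature map, as in the published VGG-16 architecture; `pool5_side` is a hypothetical helper written for this illustration, not part of Caffe.

```python
import math

# Assumption: pool5 has 512 channels, as in the published VGG-16 architecture.
POOL5_CHANNELS = 512


def pool5_side(fc6_input_dim, channels=POOL5_CHANNELS):
    """Return the side length of the square pool5 map that feeds fc6."""
    area, remainder = divmod(fc6_input_dim, channels)
    if remainder:
        raise ValueError("input dim is not a multiple of the channel count")
    side = math.isqrt(area)
    if side * side != area:
        raise ValueError("feature map is not square")
    return side


# "Source" in the error message is the pre-trained caffemodel,
# "target" is the network being trained.
source_side = pool5_side(25088)
target_side = pool5_side(32768)
print(source_side, target_side)
```

Under these assumptions the pre-trained weights correspond to a 7x7 pool5 map and the network being trained produces an 8x8 one, so the difference arises before fc6 rather than in fc6 itself.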

1 Answer

Accepted answer, 3 votes

The problem is not with 4096 but rather with 25088. You need to calculate the output feature map size of each layer of your network from its input feature map size. Note that an fc layer takes an input of fixed size, so the output of the layer that feeds it must match the input size the fc layer requires. Calculate your fc6 input feature map size (this is the output feature map of the previous layer) by working forward from the network input, one layer at a time. Here's the formula:

H_out = (H_in + 2 x Padding_Height - Kernel_Height) / Stride_Height + 1
W_out = (W_in + 2 x Padding_Width - Kernel_Width) / Stride_Width + 1
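
As a rough illustration, the formula can be applied layer by layer to the stock VGG-16 layout. The sketch below assumes the published configuration (3x3 convolutions with padding 1 and stride 1, 2x2 max pooling with stride 2 and no padding, five blocks ending in 512 channels). The helper names are hypothetical. The division is rounded down for convolutions and up for pooling, which is how Caffe computes these sizes.

```python
def conv_out(size, kernel, pad=0, stride=1):
    """Output side of a convolution (the division rounds down)."""
    return (size + 2 * pad - kernel) // stride + 1


def pool_out(size, kernel, stride):
    """Output side of an unpadded pooling layer (the division rounds up)."""
    return -(-(size - kernel) // stride) + 1


# Assumed stock VGG-16 layout: (number of 3x3 convs, output channels) per block,
# each block followed by a 2x2 max pool with stride 2.
VGG16_BLOCKS = [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]


def fc6_input_dim(input_side):
    """Flattened size of pool5 for a square input image of the given side."""
    side = input_side
    channels = 3
    for num_convs, out_channels in VGG16_BLOCKS:
        for _ in range(num_convs):
            side = conv_out(side, kernel=3, pad=1, stride=1)
        channels = out_channels
        side = pool_out(side, kernel=2, stride=2)
    return channels * side * side


for input_side in (224, 256):
    print(input_side, fc6_input_dim(input_side))
```

With this layout a 224x224 input gives 25088, the size stored in the pre-trained model, while a 256x256 input gives 32768. A larger input is only one possible cause of the mismatch; changed padding, kernel, or stride values in the conv or pooling layers can shift the pool5 size in the same way, and the post does not say which applies here.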
