Caffe Tidbits

Reposted from: http://blog.csdn.net/u014510375/article/details/51697946

Caffe Tidbits (1) - Visualizing the Network Structure

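I have been learning Caffe recently, but as a former heavy Windows user I am still more comfortable with a visual interface. Caffe, however, works better on Linux/OS X, where you mostly write scripts and run them from the command line. That is not very intuitive, so to see the network structure clearly instead of picturing it from the prototxt, visualization matters. 
Fortunately, the Caffe developers have already thought of this. The python folder in the Caffe root directory contains draw_net.py, which is exactly the file we need. 
Next I picked a network at random. Assuming we are already inside that python folder, enter the following on the command line: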

<code class="language-bash">python draw_net.py ../models/bvlc_reference_caffenet/train_val.prototxt vis.png</code>

I use a MacBook Pro, and the first run failed with this error:

pydot.InvocationException: GraphViz's executables not found

After some searching I found that I was missing Graphviz, the graph-drawing software, so it has to be installed. If you have Homebrew, that is easy:

<code class="language-bash">brew install graphviz</code>
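Before rerunning draw_net.py, it can help to confirm that Graphviz's `dot` executable is actually on the PATH, since the error above complains that the executables cannot be found. The following is only a minimal sketch, assuming Python 3.3 or newer (for `shutil.which`); the helper name `graphviz_available` is made up for illustration and is not part of Caffe or pydot.

```python
import shutil


def graphviz_available(executable="dot"):
    """Return True if the given Graphviz executable can be found on PATH."""
    return shutil.which(executable) is not None


if __name__ == "__main__":
    if graphviz_available():
        print("Graphviz found at: " + shutil.which("dot"))
    else:
        print("Graphviz not found; install it first (e.g. brew install graphviz).")
```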

Running the command again writes vis.png to the current directory. Open it to take a look: 
[Image: cnn_vis]

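Unfortunately, there currently seems to be no way to visualize custom layers (I tried visualizing Faster-RCNN and it failed). I will look into how to visualize custom layers as well later.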

Caffe Tidbits (2) - Exporting Parameters from a caffemodel

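I recently read a very interesting paper in which the authors exported the parameters of a model trained in Caffe and then loaded them into Torch. So today let's look at how to export parameters. 
To keep things simple, I chose LeNet this time: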

<code class="language-python">import scipy.io as sio
import caffe


def load(root):
    # Load the net in CPU mode
    caffe.set_mode_cpu()
    # You may need to train this caffemodel first
    # There should be a script to help you do the training
    net = caffe.Net(root + 'lenet.prototxt',
                    root + 'lenet_iter_10000.caffemodel',
                    caffe.TEST)
    # net.params[layer][0] holds the weights, net.params[layer][1] the biases
    for layer in ['conv1', 'conv2', 'ip1', 'ip2']:
        for idx, suffix in enumerate(['w', 'b']):
            key = '{}_{}'.format(layer, suffix)
            # Writes e.g. conv1_w.mat containing a variable named conv1_w
            sio.savemat(key, {key: net.params[layer][idx].data})


if __name__ == "__main__":
    # You will need to change this path
    root = '/Users/yuliangzou/caffe-rc3/examples/mnist/'
    load(root)
    print('Caffemodel loaded and written to .mat files successfully!')</code>

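The code makes the process clear: first load the model, then read the parameters through net.params. You can also use net.blobs to export the intermediate data for visualization. Of course, before exporting the parameters you must have run training once, otherwise the caffemodel does not exist. 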
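To get a feel for what ends up in the .mat files without needing a trained model, the sketch below round-trips random arrays through scipy.io in memory. The arrays are stand-ins, and the shapes (20, 1, 5, 5) and (20,) are assumptions based on the usual LeNet conv1 definition rather than values read from a caffemodel; the helper `roundtrip` is hypothetical.

```python
import io

import numpy as np
import scipy.io as sio


def roundtrip(name, array):
    """Save one array to an in-memory .mat file and load it back."""
    buf = io.BytesIO()
    sio.savemat(buf, {name: array})
    buf.seek(0)
    return sio.loadmat(buf)[name]


# Random stand-ins for conv1 weights and biases (assumed shapes, see above)
fake_conv1_w = np.random.rand(20, 1, 5, 5).astype(np.float32)
fake_conv1_b = np.random.rand(20).astype(np.float32)

restored_w = roundtrip('conv1_w', fake_conv1_w)
restored_b = roundtrip('conv1_b', fake_conv1_b)

print(restored_w.shape)  # the 4-D weights keep their shape
print(restored_b.shape)  # the 1-D biases come back as a 2-D array
```

One practical consequence: code that reads the bias files back in Python may want to call `.ravel()` on them before use.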
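Finally, I should mention that in some idle time I recently started a project on GitHub called Naive-CNN, which exports the model parameters from Caffe and reimplements the network in Matlab or Python. So far only LeNet is done. Everyone is welcome to join in: 
https://github.com/Yuliang-Zou/Naive-CNN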

Caffe Tidbits (3) - Feeding Custom Input Data with py-faster-rcnn

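Caffe is arguably the most automated of the existing deep learning frameworks: we can train an entire network by defining only prototxt files, without writing any code. Precisely because it is so automated, modifying one of its modules is not easy. 
Caffe ships with support for some standard, general-purpose datasets, which are fairly simple to use. For some other input formats, Caffe also provides guidance: 
http://caffe.berkeleyvision.org/tutorial/data.html 
http://caffe.berkeleyvision.org/tutorial/layers.html#data-layers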

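But how do we feed Caffe inputs whose labels are not simple variables? (For example, in saliency detection the label is a grayscale image; in human joint detection the label is a tensor.) Today we will look at how to customize the input data using rbg's py-faster-rcnn framework: 
https://github.com/rbgirshick/py-faster-rcnn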

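First, clone the repository to a local directory (from here on, ./ refers to this local directory), then run the scripts under ./data/ to download the data (the data itself is not required; I had downloaded it earlier to fine-tune a model, and the paths and operations below assume it is there).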

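Some new directories now appear. We need to put our data (assume the input images are .jpg files and the labels are Python numpy arrays, i.e. .npy files) in the right place, namely ./data/VOCdevkit2007/VOC2007/JPEGImages. However, that directory holds the input images of the original VOC2007 dataset, so I suggest creating a new subdirectory inside it (called dlib here). The actual storage path is therefore: 
./data/VOCdevkit2007/VOC2007/JPEGImages/dlib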

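Next, we need to write xml files for the input data. Each input image has one corresponding xml file with the following content (the formatting does not matter):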

<code class="language-xml"><annotation>
  <folder>VOC2007</folder>
  <filename>image_0046.jpg</filename>
  <source>
    <database>dlib facial landmark</database>
    <annotation>Yuliang Zou</annotation>
  </source>
  <size>
    <width>400</width>
    <height>300</height>
    <depth>3</depth>
  </size>
  <segmented>0</segmented>
</annotation></code>
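Since one such file is needed per image, generating them with a script is less error-prone than writing them by hand. The following is a rough sketch using only the standard library; the function name `make_annotation` and its parameters are hypothetical, and the default values simply repeat the fields of the example above.

```python
import xml.etree.ElementTree as ET


def make_annotation(filename, width, height, depth=3,
                    database="dlib facial landmark", annotator="Yuliang Zou"):
    """Build the simplified annotation XML shown above as a string."""
    root = ET.Element("annotation")
    ET.SubElement(root, "folder").text = "VOC2007"
    ET.SubElement(root, "filename").text = filename
    source = ET.SubElement(root, "source")
    ET.SubElement(source, "database").text = database
    ET.SubElement(source, "annotation").text = annotator
    size = ET.SubElement(root, "size")
    for tag, value in (("width", width), ("height", height), ("depth", depth)):
        ET.SubElement(size, tag).text = str(value)
    ET.SubElement(root, "segmented").text = "0"
    # encoding="unicode" makes tostring return a str instead of bytes
    return ET.tostring(root, encoding="unicode")


xml_text = make_annotation("image_0046.jpg", 400, 300)
print(xml_text)
```

The output is a single line without indentation, which is fine here because the layout of the file does not matter.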

Likewise, to avoid mixing them up with the xml files of the original data, create a new dlib folder, so the new xml files are stored at: 
./data/VOCdevkit2007/VOC2007/Annotations/dlib

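None of the steps above separates the dataset into a training set and a test set, so let's do that now. Open the directory: 
./data/VOCdevkit2007/VOC2007/ImageSets/Main 
It contains many txt files. First back up the original trainval.txt and test.txt. Then create your own trainval.txt and test.txt, in which each line is the name of an input image, for example: 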
dlib/100032540_1 
dlib/1002681492_1 
dlib/1004467229_1 
... 

(I reuse the names of these two txt files because switching to new names is a bit troublesome; see the appendix for details.)
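The two lists can be produced with a few lines of Python. This is only a sketch under assumptions: the 80/20 split, the fixed seed, and the helper names `make_split` and `write_list` are illustrative choices rather than anything py-faster-rcnn prescribes, and the last name in the sample list is made up.

```python
import random


def make_split(names, test_fraction=0.2, seed=0, subdir="dlib"):
    """Shuffle image names and split them into trainval and test lists.

    Each returned entry is prefixed with the subdirectory and carries no
    file extension, matching the example lines above.
    """
    shuffled = sorted(names)
    random.Random(seed).shuffle(shuffled)
    num_test = int(len(shuffled) * test_fraction)
    test = ["{}/{}".format(subdir, n) for n in shuffled[:num_test]]
    trainval = ["{}/{}".format(subdir, n) for n in shuffled[num_test:]]
    return trainval, test


def write_list(path, entries):
    """Write one entry per line, the layout used in ImageSets/Main."""
    with open(path, "w") as f:
        f.write("\n".join(entries) + "\n")


# Image names without extension; "image_0047" is a made-up placeholder
names = ["100032540_1", "1002681492_1", "1004467229_1",
         "image_0046", "image_0047"]
trainval, test = make_split(names)
print(trainval)
print(test)
```

Calling `write_list("trainval.txt", trainval)` and `write_list("test.txt", test)` inside ImageSets/Main would then create the two files.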

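After that, we need to modify the corresponding Python code so that the data can be loaded. 
(1) In the setup() function in lib/roi_data_layer/layer.py, add the following code to allocate space for the label: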

<code class="language-python"># Reserve a top blob for the labels: 68 heat maps of 38 x 50 per image
top[idx].reshape(cfg.TRAIN.IMS_PER_BATCH, 68, 38, 50)
self._name_to_top_map['heatmap'] = idx
idx += 1</code>

My label here is a set of facial landmarks, 68 2-D arrays in total. Then delete the parts below it that are not needed (otherwise errors may occur later).

(2) In ./lib/utils/blob.py, define a new function:

<code class="language-python">def heatmap_list_to_blob(hms):
    """Convert a list of heat maps into a network input."""
    num_hms = len(hms)
    # One slot per heat map, laid out as channels x height x width
    blob = np.zeros((num_hms, 68, 38, 50), dtype=np.float32)
    for i in range(num_hms):
        hm = hms[i]
        # Labels are stored as height x width x channels; move channels first
        blob[i] = hm.transpose((2, 0, 1))
    return blob</code>

This function converts a Python list containing several labels into Caffe's blob data structure.
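The axis reordering is the part that is easiest to get wrong, so here is a small standalone check with random data. It assumes each label is stored as a height x width x channels array of shape (38, 50, 68), which is what the transpose and the blob shape in the function imply; `heatmaps_to_blob` is a hypothetical compact variant, not the function from blob.py.

```python
import numpy as np


def heatmaps_to_blob(hms):
    """Stack H x W x C heat maps into one N x C x H x W float32 array."""
    return np.stack([hm.transpose((2, 0, 1)) for hm in hms]).astype(np.float32)


# Two fake labels stored as height x width x channels = 38 x 50 x 68
hms = [np.random.rand(38, 50, 68) for _ in range(2)]
blob = heatmaps_to_blob(hms)
print(blob.shape)  # (2, 68, 38, 50)
```

After the conversion, `blob[n, c]` is the c-th heat map of the n-th image, i.e. the same data as `hms[n][:, :, c]`.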

(3) In lib/roi_data_layer/minibatch.py, import the heatmap_list_to_blob function just defined, then define a new function:

<code class="language-python">def _get_heatmap_blob(roidb):
    """Get a batch of heat maps"""
    num_images = len(roidb)
    hms = []
    for i in range(num_images):
        # roidb[i]['heatmap'] is the path to this image's .npy label
        hm = np.load(roidb[i]['heatmap'])
        hms.append(hm)

    # Create a blob to hold the input heat maps
    blob = heatmap_list_to_blob(hms)

    return blob</code>

Then add the following lines to the get_minibatch() function:

<code class="language-python">    # Get the input heat map blob, formatted for caffe
    hm_blob = _get_heatmap_blob(roidb)
    blobs['heatmap'] = hm_blob</code>

(4) In the prepare_roidb() function in ./lib/roi_data_layer/roidb.py, after this line:

<code class="language-python">roidb[i]['image'] = imdb.image_path_at(i)</code>

add this line:

<code class="language-python">roidb[i]['heatmap'] = roidb[i]['image'][:-3] + 'npy'</code>

As you have probably guessed by now, imdb.image_path_at(i) returns the full path of the input image, and with a small modification we get the full path of the label.
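Cutting off the last three characters works as long as every image extension has exactly three letters. If that cannot be guaranteed, `os.path.splitext` is a possible alternative. The sketch below is illustrative only; `label_path_for` is a hypothetical helper, and the image path is just an example built from the directory layout described earlier.

```python
import os


def label_path_for(image_path, label_ext=".npy"):
    """Replace the image extension with the label extension."""
    stem, _ = os.path.splitext(image_path)
    return stem + label_ext


image_path = "./data/VOCdevkit2007/VOC2007/JPEGImages/dlib/100032540_1.jpg"
print(label_path_for(image_path))
# ./data/VOCdevkit2007/VOC2007/JPEGImages/dlib/100032540_1.npy
```

With this variant a file such as photo.jpeg would map to photo.npy as well, whereas slicing off three characters would leave a stray letter behind.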

Finally, we need to modify train.prototxt. Customize it to your own needs; it is fairly simple, so I will not go into detail.

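Beyond that, if you want to evaluate performance, you need to modify ./lib/fast_rcnn/test.py yourself. I will not go into detail here; I trust that by the time training starts successfully you will know these parts well enough to write the version you need.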

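Of course, some problems may still appear after completing all the steps above. 
1. My xml files are much simpler than the originals, so delete the corresponding code as needed (the original code may read some fields from the xml files that I left out). 
2. The original code scales input images in a somewhat odd way, which may cause errors. For datasets with a fixed input size, you may be able to modify 
the setup() function in lib/roi_data_layer/layer.py, which contains a line 
top[idx].reshape(cfg.TRAIN.IMS_PER_BATCH, 3, _, _) 
The last two arguments are the height and width; change them as needed.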

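I have been wrestling with this for quite a while and it gave me quite a headache, which has only deepened my admiration for rbg. The write-up is a bit disorganized; if anything is unclear, feel free to leave a comment so we can discuss.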


Appendix:

./lib/datasets/factory.py is the code responsible for constructing the dataset 
lines 15-20:

<code class="language-python"># Set up voc_<year>_<split> using selective search "fast" mode
for year in ['2007', '2012']:
    for split in ['train', 'val', 'trainval', 'test']:
        name = 'voc_{}_{}'.format(year, split)
        __sets[name] = (lambda split=split, year=year: pascal_voc(split, year))</code>

./experiments/scripts/faster_rcnn_end2end.sh is the bash file that specifies the datasets used for training and testing 
lines 27-28:

<code class="language-bash">TRAIN_IMDB="voc_2007_trainval"
TEST_IMDB="voc_2007_test"</code>
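The reason new file names are troublesome follows from these two snippets: the script refers to a dataset by name, and factory.py only knows the names it registered. The sketch below imitates that registry pattern in isolation. `FakeDataset` is a hypothetical stand-in for pascal_voc, the name 'dlib_2007_trainval' is invented for illustration, and `get_imdb` here is a simplified lookup rather than the repository's own code.

```python
class FakeDataset(object):
    """Hypothetical stand-in for pascal_voc; it only records its arguments."""

    def __init__(self, split, year):
        self.split = split
        self.year = year


sets = {}

for year in ['2007', '2012']:
    for split in ['train', 'val', 'trainval', 'test']:
        name = 'voc_{}_{}'.format(year, split)
        # Default arguments freeze the current loop values in each lambda
        sets[name] = (lambda split=split, year=year: FakeDataset(split, year))

# A custom name needs its own entry before any script can refer to it
sets['dlib_2007_trainval'] = lambda: FakeDataset('trainval', '2007')


def get_imdb(name):
    """Look up a dataset by name and construct it."""
    if name not in sets:
        raise KeyError('Unknown dataset: {}'.format(name))
    return sets[name]()


print(get_imdb('voc_2007_test').split)  # test
```

So using a new split name would mean touching both the registry and the script, which is why overwriting trainval.txt and test.txt is the shorter route.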
