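Reposted from: http://blog.csdn.net/u014510375/article/details/51697946
Caffe Tidbits (1): Visualizing the Network Structure
I have been learning Caffe recently, but as a former heavy Windows user I am still more comfortable with a graphical interface. Caffe, though, is really at home on Linux/OS X, where you mostly drive it with scripts from the command line. That is not very intuitive, so being able to see the network structure directly, rather than reconstructing it in your head from the prototxt, matters a lot.
Fortunately, the Caffe developers have already thought about this. Under the Caffe root directory there is a python folder containing draw_net.py, which is exactly the file we need.
Next I picked a network more or less at random. Assuming we are already inside that python folder, type the following on the command line: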
```bash
python draw_net.py ../models/bvlc_reference_caffenet/train_val.prototxt vis.png
```
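I use a MacBook Pro, and the first run failed with this error:
pydot.InvocationException: GraphViz's executables not found
After some searching I found that I was missing Graphviz, the software that does the actual graph drawing, so it had to be installed. If you have Homebrew, that is easy: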
```bash
brew install graphviz
```
Run the script again and it writes vis.png to the current directory. Open it to see the network drawn as a graph.
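The Graphviz failure can be caught before the script even starts. Below is a minimal sketch, assuming (as the error message suggests) that the drawing step needs Graphviz's `dot` executable on the PATH. The helper names `graphviz_available` and `build_draw_command` are my own and not part of Caffe, and `shutil.which` requires Python 3.

```python
import shutil


def graphviz_available():
    """Return True if Graphviz's `dot` executable can be found on the PATH."""
    return shutil.which("dot") is not None


def build_draw_command(prototxt, output_image):
    """Build the same draw_net.py command line as above, as an argument list."""
    return ["python", "draw_net.py", prototxt, output_image]


cmd = build_draw_command(
    "../models/bvlc_reference_caffenet/train_val.prototxt", "vis.png")
if graphviz_available():
    print("Ready to run: " + " ".join(cmd))
else:
    print("Graphviz not found; install it first (e.g. brew install graphviz)")
```

The argument list can be handed to `subprocess.call` from inside the python folder once the check passes.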
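Unfortunately, there currently seems to be no way to visualize user-defined layers (I tried to visualize Faster-RCNN and it failed). I will look into how to visualize custom layers later.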
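Caffe Tidbits (2): Exporting Parameters from a caffemodel
I recently read a very interesting paper in which the authors exported the parameters of a model trained in Caffe and then loaded them into Torch. So today let's look at how to export parameters.
To keep things simple, I chose LeNet this time: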
```python
import scipy.io as sio
import caffe


def load(root):
    """Load the trained LeNet and write each parameter array to its own .mat file."""
    caffe.set_mode_cpu()
    # You may need to train this caffemodel first.
    # There should be a script to help you do the training.
    net = caffe.Net(root + 'lenet.prototxt',
                    root + 'lenet_iter_10000.caffemodel',
                    caffe.TEST)

    # For each layer, index 0 holds the weights and index 1 the biases
    conv1_w = net.params['conv1'][0].data
    conv1_b = net.params['conv1'][1].data
    conv2_w = net.params['conv2'][0].data
    conv2_b = net.params['conv2'][1].data
    ip1_w = net.params['ip1'][0].data
    ip1_b = net.params['ip1'][1].data
    ip2_w = net.params['ip2'][0].data
    ip2_b = net.params['ip2'][1].data

    # One .mat file per array, keyed by the same name
    sio.savemat('conv1_w', {'conv1_w': conv1_w})
    sio.savemat('conv1_b', {'conv1_b': conv1_b})
    sio.savemat('conv2_w', {'conv2_w': conv2_w})
    sio.savemat('conv2_b', {'conv2_b': conv2_b})
    sio.savemat('ip1_w', {'ip1_w': ip1_w})
    sio.savemat('ip1_b', {'ip1_b': ip1_b})
    sio.savemat('ip2_w', {'ip2_w': ip2_w})
    sio.savemat('ip2_b', {'ip2_b': ip2_b})


if __name__ == "__main__":
    # You will need to change this path
    root = '/Users/yuliangzou/caffe-rc3/examples/mnist/'
    load(root)
    print('Caffemodel loaded and written to .mat files successfully!')
```
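The code makes the steps clear: load the model first, then read the parameters through net.params. You can also export the data held in net.blobs (the activations of each layer) for visualization. Of course, before exporting parameters you have to run the training once, otherwise the caffemodel does not exist yet.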
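Listing every layer by hand gets tedious for bigger networks. Here is a minimal sketch of a generic version, assuming `net.params` behaves like a mapping from layer name to a sequence of blobs (weights first, then biases, as in the LeNet script), each with a `.data` attribute. `flatten_params` is a hypothetical helper, and the `SimpleNamespace` objects only stand in for Caffe blobs so that the snippet runs without Caffe installed.

```python
from types import SimpleNamespace


def flatten_params(params):
    """Map {'conv1': [w_blob, b_blob], ...} to {'conv1_w': ..., 'conv1_b': ...}."""
    suffixes = ["w", "b"]  # index 0 = weights, index 1 = biases
    flat = {}
    for layer_name, blobs in params.items():
        for index, blob in enumerate(blobs):
            # Fall back to a numeric suffix for layers with more than two blobs
            suffix = suffixes[index] if index < len(suffixes) else str(index)
            flat["{}_{}".format(layer_name, suffix)] = blob.data
    return flat


# Stand-in for net.params; the values are made up for illustration
fake_params = {
    "conv1": [SimpleNamespace(data=[[1.0, 2.0]]), SimpleNamespace(data=[0.5])],
    "ip1": [SimpleNamespace(data=[[3.0]])],
}
flat = flatten_params(fake_params)
print(sorted(flat))
```

With a real network, passing the resulting dictionary to a single `sio.savemat` call would put all arrays into one .mat file instead of one file per array.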
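Finally, in my spare time I recently started a project called Naive-CNN on GitHub: it exports the model parameters from Caffe and reimplements the network in Matlab or Python. So far only LeNet is done. Everyone is welcome to play with it: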
https://github.com/Yuliang-Zou/Naive-CNN
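Caffe Tidbits (3): Custom Input Data with py-faster-rcnn
Caffe is among the most automated of the existing deep learning frameworks: we can train an entire network by writing only prototxt files, without writing any code. Precisely because it is so automated, modifying one of its modules is not easy.
Caffe ships with support for some standard, general-purpose datasets, which are fairly simple to use. For some other input formats, Caffe also gives some guidance: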
http://caffe.berkeleyvision.org/tutorial/data.html
http://caffe.berkeleyvision.org/tutorial/layers.html#data-layers
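But how do we feed Caffe inputs whose label is not a simple scalar? (For example, in saliency detection the label is a grayscale image; in human joint detection the label is a tensor.) Today we will look at how to customize the input data using rbg's py-faster-rcnn framework: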
https://github.com/rbgirshick/py-faster-rcnn
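First, clone this repository to a local directory (from here on, ./ refers to that local directory), then run the scripts under ./data/ to download the data. (These data are not strictly necessary; I downloaded them earlier because I needed to finetune a model, and the paths and steps below are based on that setup.)
Several new directories now appear. We need to put our data (assume the input images are .jpg files and the labels are Python numpy arrays, i.e. .npy files) in the corresponding place, namely ./data/VOCdevkit2007/VOC2007/JPEGImages. That directory already holds the images of the original VOC2007 dataset, however, so I suggest creating a subdirectory inside it (called dlib here). The actual storage path is therefore:
./data/VOCdevkit2007/VOC2007/JPEGImages/dlib
Next, we need to write XML files for these inputs. Each input image has a corresponding XML file with the following content (the formatting does not matter):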
```xml
<annotation>
  <folder>VOC2007</folder>
  <filename>image_0046.jpg</filename>
  <source>
    <database>dlib facial landmark</database>
    <annotation>Yuliang Zou</annotation>
  </source>
  <size>
    <width>400</width>
    <height>300</height>
    <depth>3</depth>
  </size>
  <segmented>0</segmented>
</annotation>
```
Similarly, to avoid mixing these up with the XML files of the original data, create a new dlib folder, so the new XML files are stored at:
./data/VOCdevkit2007/VOC2007/Annotations/dlib
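Writing one XML file per image by hand does not scale, so here is a rough sketch that generates the same simplified annotation with the standard library. The function name `build_annotation` and its parameters are my own invention; the field values are the ones from the example annotation for image_0046.jpg.

```python
import xml.etree.ElementTree as ET


def build_annotation(filename, width, height, depth=3,
                     database="dlib facial landmark", annotator="Yuliang Zou"):
    """Build the simplified annotation element for one image."""
    root = ET.Element("annotation")
    ET.SubElement(root, "folder").text = "VOC2007"
    ET.SubElement(root, "filename").text = filename

    source = ET.SubElement(root, "source")
    ET.SubElement(source, "database").text = database
    ET.SubElement(source, "annotation").text = annotator

    size = ET.SubElement(root, "size")
    ET.SubElement(size, "width").text = str(width)
    ET.SubElement(size, "height").text = str(height)
    ET.SubElement(size, "depth").text = str(depth)

    ET.SubElement(root, "segmented").text = "0"
    return root


annotation = build_annotation("image_0046.jpg", width=400, height=300)
xml_text = ET.tostring(annotation, encoding="unicode")
print(xml_text)
```

To save a file, `ET.ElementTree(annotation).write(path)` with a path inside the Annotations/dlib folder should do; the output is a single line, which is fine since the formatting does not matter.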
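None of the steps so far split the dataset into a training set and a test set, so let's do that now. Open the directory:
./data/VOCdevkit2007/VOC2007/ImageSets/Main
It contains many txt files. Back up the original trainval.txt and test.txt first, then create your own trainval.txt and test.txt, in which each line is the name of an input image, for example: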
dlib/100032540_1
dlib/1002681492_1
dlib/1004467229_1
...
(I reuse the names of these two txt files because switching to new names is a bit of a hassle; see the appendix for details.)
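The two list files can be produced with a few lines of Python. This is only a sketch under my own assumptions: a random 80/20 split with a fixed seed, and a hypothetical helper name `split_image_names`. The first three names are the examples from the listing, the fourth comes from the sample annotation, and "image_0047" is made up.

```python
import random


def split_image_names(names, test_fraction=0.2, seed=0, prefix="dlib/"):
    """Shuffle names reproducibly and return (trainval_lines, test_lines)."""
    shuffled = sorted(names)
    random.Random(seed).shuffle(shuffled)
    num_test = int(round(len(shuffled) * test_fraction))
    test = shuffled[:num_test]
    trainval = shuffled[num_test:]
    # Each line is the image name without its extension, prefixed by the subfolder
    return ([prefix + n for n in trainval], [prefix + n for n in test])


names = ["100032540_1", "1002681492_1", "1004467229_1", "image_0046", "image_0047"]
trainval_lines, test_lines = split_image_names(names, test_fraction=0.2)
print("\n".join(trainval_lines))
```

Writing `"\n".join(trainval_lines)` to trainval.txt and the other list to test.txt in ImageSets/Main then gives files in the format shown.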
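After that, we need to modify the relevant Python code so that the data can be loaded properly.
(1) In the setup() function of ./lib/roi_data_layer/layer.py, add the following code to allocate space for the label: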
```python
top[idx].reshape(cfg.TRAIN.IMS_PER_BATCH, 68, 38, 50)
self._name_to_top_map['heatmap'] = idx
idx += 1
```
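My labels here are facial landmarks, 68 2-D arrays in total. Then delete the parts further down that are no longer needed (otherwise you may get errors later).
(2) In ./lib/utils/blob.py, define a new function: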
```python
def heatmap_list_to_blob(hms):
    """Convert a list of heat maps into a network input."""
    num_hms = len(hms)
    # One (68, 38, 50) slot per heat map: channels, height, width
    blob = np.zeros((num_hms, 68, 38, 50), dtype=np.float32)
    for i, hm in enumerate(hms):
        # Move the channel axis from last to first
        blob[i] = hm.transpose((2, 0, 1))
    return blob
```
This function converts a Python list of labels into Caffe's blob data structure.
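The `transpose((2, 0, 1))` call is the part that is easy to get wrong. For the result to fit a (68, 38, 50) slot, each saved label has to be stored as height x width x channels (38 x 50 x 68 here); the call moves the channel axis to the front, which is the channels x height x width order of a Caffe blob. The sketch below spells out that reordering with plain nested lists, purely for illustration: `hwc_to_chw` is a hypothetical name and the tiny 2 x 3 x 2 input is made up.

```python
def hwc_to_chw(hm):
    """Reorder a nested list indexed [row][col][channel] to [channel][row][col].

    This mirrors what ndarray.transpose((2, 0, 1)) does for a 3-D array.
    """
    height = len(hm)
    width = len(hm[0])
    channels = len(hm[0][0])
    return [[[hm[r][c][k] for c in range(width)]
             for r in range(height)]
            for k in range(channels)]


# A made-up 2 x 3 "heat map" with 2 channels; channel 1 is channel 0 plus 100
hm = [[[r * 3 + c, r * 3 + c + 100] for c in range(3)] for r in range(2)]
chw = hwc_to_chw(hm)
print(chw[0])  # channel 0 as a 2 x 3 grid: [[0, 1, 2], [3, 4, 5]]
print(chw[1])  # channel 1 as a 2 x 3 grid: [[100, 101, 102], [103, 104, 105]]
```

If your .npy labels are already saved channels-first, the transpose would scramble them, so it is worth checking the shape of one loaded label before training.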
(3) In ./lib/roi_data_layer/minibatch.py, import the heatmap_list_to_blob function just defined, then define a new function:
```python
def _get_heatmap_blob(roidb):
    """Get a batch of heat maps."""
    hms = []
    for entry in roidb:
        # entry['heatmap'] is the path of the .npy label file
        hm = np.load(entry['heatmap'])
        hms.append(hm)

    # Create a blob to hold the input heat maps
    blob = heatmap_list_to_blob(hms)

    return blob
```
Then add the following lines to the get_minibatch() function:
```python
# Get the input heat map blob, formatted for caffe
hm_blob = _get_heatmap_blob(roidb)
blobs['heatmap'] = hm_blob
```
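The string 'heatmap' has to match in two places: the key written into `blobs` here and the key registered in `_name_to_top_map` in setup(). Judging by its name, that map sends each blob name to the index of an output (top) of the data layer. The following is a simplified stand-in for such a lookup, not the repository's code; `route_blobs` and the example map are hypothetical. It shows why a mismatched key fails immediately.

```python
def route_blobs(blobs, name_to_top_map, num_tops):
    """Place each named blob into the top slot registered for its name."""
    tops = [None] * num_tops
    for name, blob in blobs.items():
        if name not in name_to_top_map:
            raise KeyError("no top registered for blob '{}'".format(name))
        tops[name_to_top_map[name]] = blob
    return tops


# Hypothetical registration order: image data first, then the heat map label
name_to_top_map = {"data": 0, "heatmap": 1}
blobs = {"heatmap": "hm_blob", "data": "im_blob"}  # strings stand in for arrays
print(route_blobs(blobs, name_to_top_map, num_tops=2))
```

The same name presumably also has to appear as a top of the data layer in train.prototxt, in the position given by its index.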
(4) In the prepare_roidb() function of ./lib/roi_data_layer/roidb.py, after this line:
```python
roidb[i]['image'] = imdb.image_path_at(i)
```
add this line:
```python
# Replace the three-letter image extension ("jpg") with "npy"
roidb[i]['heatmap'] = roidb[i]['image'][:-3] + 'npy'
```
As you have probably guessed by now, imdb.image_path_at(i) returns the full path of the input image, and a small modification of it gives the full path of the label.
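Cutting off the last three characters works as long as every image ends in a three-letter extension such as "jpg"; a file named image_0046.jpeg would turn into image_0046.jnpy. A slightly more defensive sketch uses `os.path.splitext` instead. The helper name `label_path_for` is hypothetical, and the second path below is a made-up example.

```python
import os.path


def label_path_for(image_path, label_ext=".npy"):
    """Return the label path that sits next to the image, with a new extension."""
    stem, _ = os.path.splitext(image_path)
    return stem + label_ext


print(label_path_for("JPEGImages/dlib/image_0046.jpg"))
print(label_path_for("JPEGImages/dlib/image_0046.jpeg"))
```

Both calls give JPEGImages/dlib/image_0046.npy, so the line in prepare_roidb() could call such a helper on roidb[i]['image'] if your dataset mixes extensions.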
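Finally, we need to modify train.prototxt. Just customize it to your own needs; it is fairly simple, so I won't go into detail.
Beyond that, if you want to evaluate performance, you need to modify ./lib/fast_rcnn/test.py yourself. I won't go into detail here either; I trust that by the time you have training running you will know this code well enough to write the version you need.
Of course, even after completing all the steps above, some problems may still come up.
1. My XML files are much simpler than the originals, so you may need to delete the corresponding code as appropriate (the original code may read some fields from the XML files that I have left out).
2. The original code scales the input images in a somewhat unusual way, and errors may arise there. For datasets with a fixed input size, you may be able to modify the setup() function in ./lib/roi_data_layer/layer.py, which contains a line like: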
top[idx].reshape(cfg.TRAIN.IMS_PER_BATCH, 3, _, _)
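The last two arguments are the height and width; change them as needed.
I spent a long time wrestling with this recently and it gave me quite a headache, which only deepened my admiration for rbg. The write-up is a bit disorganized; if anything is unclear, feel free to leave a comment so we can discuss.
Appendix:
./lib/datasets/factory.py
This file is responsible for constructing the datasets.
Lines 15 - 20: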
```python
# Set up voc_<year>_<split> using selective search "fast" mode
for year in ['2007', '2012']:
    for split in ['train', 'val', 'trainval', 'test']:
        name = 'voc_{}_{}'.format(year, split)
        __sets[name] = (lambda split=split, year=year: pascal_voc(split, year))
```
./experiments/scripts/faster_rcnn_end2end.sh
This bash script specifies the datasets used for training and testing.
Lines 27 - 28:
```bash
TRAIN_IMDB="voc_2007_trainval"
TEST_IMDB="voc_2007_test"
```