Caffe Exercise 4: Batch-Extracting Caffe Features with Python (by 香蕉麦乐迪)

sloanqin

Category: Deep Learning. Tags: deep learning, image processing, caffe, linux, python

Article link: https://blog.csdn.net/sloanqin/article/details/49249905

Goal: use Python with Caffe to extract image features in batches.

Official example: http://nbviewer.ipython.org/github/BVLC/caffe/blob/master/examples/00-classification.ipynb

Approach: I walk through my own feature-extraction code step by step.

1 Import the required libraries

import numpy as np
import matplotlib.pyplot as plt
import scipy
import sys
import caffe
import os
import leveldb
import gc

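Note: numpy and leveldb must be installed by the reader. The code in this article is written for Python 2 (it uses print statements).

Download, installation, and usage of leveldb: https://github.com/rjpower/py-leveldb

Installing numpy: search online for instructions.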

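2 Specify the relevant directories

caffe_root = '../'  # assumption: path to the Caffe root directory (not defined in the original); adjust to your setup
eval_image_dir = caffe_root + 'sloanqin/data/godpool/eval_image/'  # input image directory
eval_classify_txt_dir = './eval_classify.txt'  # assumption: output file for predicted classes (used in step 5, not defined in the original)
feature_dir = './feature/'  # output directory for features stored as raw bytes
featureStr_dir = './featureStr/'  # output directory for features stored as strings (for viewing)

PS: I use two output paths for the extracted features; the reason is explained later.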

3 Create the directories

if(os.path.isdir(feature_dir)):
    os.system('rm -r '+feature_dir)
os.system('mkdir '+feature_dir)

if(os.path.isdir(featureStr_dir)): # string copies of the features, for viewing
    os.system('rm -r '+featureStr_dir)
os.system('mkdir '+featureStr_dir)

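Note: if a directory already exists, we delete it and then create it again.

The main purpose is to remove the files produced by the previous run of the program.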
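
The same delete-then-recreate step can also be written without shelling out to rm and mkdir. The following is a minimal sketch, not the code used in this article; the helper name reset_dir is hypothetical, and it relies only on the standard library's shutil.rmtree and os.makedirs.

```python
import os
import shutil


def reset_dir(path):
    """Delete `path` if it exists, then create it again as an empty directory."""
    if os.path.isdir(path):
        shutil.rmtree(path)  # remove the directory and everything inside it
    os.makedirs(path)        # create a fresh, empty directory


# Usage with the directories from step 2 would look like this:
#   reset_dir('./feature/')
#   reset_dir('./featureStr/')
```

Unlike os.system('rm -r ...'), this version does not depend on a Unix shell and raises an exception if the deletion fails.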

4 Caffe network setup

######################################################################
caffe.set_mode_gpu()

net = caffe.Net(caffe_root + 'sloanqin/data/godpool/deploy.prototxt',
                caffe_root + 'sloanqin/data/godpool/caffenet_train_iter_201000.caffemodel',
                caffe.TEST)

# input preprocessing: 'data' is the name of the input blob == net.inputs[0]
transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
transformer.set_transpose('data', (2,0,1))
transformer.set_mean('data', np.load(caffe_root + 'python/caffe/imagenet/ilsvrc_2012_mean.npy').mean(1).mean(1)) # mean pixel
transformer.set_raw_scale('data', 255)  # the reference model operates on images in [0,255] range instead of [0,1]
transformer.set_channel_swap('data', (2,1,0))  # the reference model has channels in BGR order instead of RGB

# set net to batch size of 50
net.blobs['data'].reshape(50,3,227,227)
######################################################################

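Note: the meaning of these settings should be easy to read from the code and its comments.
As for why they can be set this way, you would need to read the Caffe source code and look at the data structures the authors designed. That is not necessary for now; knowing how to use the settings is enough. The Caffe source is still well worth studying.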
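
To make the four Transformer settings more concrete, here is a rough NumPy-only sketch of what they do to one image. It is an illustration under assumptions, not Caffe's actual implementation: the function name preprocess_sketch is hypothetical, the mean pixel values are made up, and resizing the image to the network input size is left out.

```python
import numpy as np


def preprocess_sketch(image, mean_bgr, raw_scale=255.0):
    """Mimic the Transformer settings on an H x W x 3 RGB image in [0, 1].

    Resizing is omitted: `image` is assumed to already have the
    height and width the network expects.
    """
    x = image.astype(np.float32)
    x = x.transpose(2, 0, 1)   # set_transpose: H x W x C -> C x H x W
    x = x[[2, 1, 0], :, :]     # set_channel_swap: RGB -> BGR
    x = x * raw_scale          # set_raw_scale: [0, 1] -> [0, 255]
    mean = np.asarray(mean_bgr, dtype=np.float32).reshape(3, 1, 1)
    return x - mean            # set_mean: subtract the per-channel mean pixel


# Hypothetical mean pixel in (B, G, R) order, for illustration only.
mean_bgr = [104.0, 117.0, 123.0]
image = np.zeros((227, 227, 3), dtype=np.float32)
image[:, :, 0] = 1.0           # a pure red image
out = preprocess_sketch(image, mean_bgr)
print(out.shape)               # (3, 227, 227)
```

Because the mean is subtracted after the channel swap, it has to be given in BGR order, which is how the ilsvrc_2012_mean.npy file used above is laid out.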

5 Loop over the images and store the extracted features

######################################################################
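files = os.listdir(eval_image_dir) # list the files in the image directory
i=0 # index of the next free slot in the current batch
j=0 # number of full batches processed so far
classnumMax=0
fileNameNum=np.arange(0,50,1).reshape(50,-1) # array holding the numeric file name of each image in the batch
fileHandle=open(eval_classify_txt_dir,'a') # text file storing the predicted class of each image
# open the LevelDB database that stores the features
dbDir=feature_dir+'db'
tempDb=leveldb.LevelDB(dbDir)
# a second database with string copies of the features, for easier viewing
dbDirStr=featureStr_dir+'db'
tempDbStr=leveldb.LevelDB(dbDirStr)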

for f in files:
    if(os.path.isfile(eval_image_dir + f)):
        if(i == 50):
            i=0
            j=j+1
            # now we get 50 image data input
            # we will forward compute it
            # and extract features
            out = net.forward()
            fc7Data = net.blobs['fc7'].data
            probData=(net.blobs['prob'].data)
            predict=np.argmax(probData,axis=1).reshape(50,-1)

            for k in range(0,50):
                classnum=predict[k][0]
                feature=fc7Data[k].reshape(1,-1)

                tempData=feature.tobytes() # convert the feature array to raw bytes
                tempDb.Put(str(fileNameNum[k][0]),tempData) # store the feature, keyed by image number

                tempDataStr=str(feature) # convert the feature array to a string, which is easier to inspect
                tempDbStr.Put(str(fileNameNum[k][0]),tempDataStr) # store the string copy

                fileHandle.write(str(fileNameNum[k][0])+'.jpg '+str(classnum)+'\n') # record the predicted class of each image
            print '@_@ have extracted ',j*50,' images '
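        # record the numeric file name (the first 12 characters are assumed to be digits),
        # load the image into slot i of the input blob,
        # then advance the index
        fileNameNum[i]=float(f[0:12])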
        net.blobs['data'].data[i,:,:,:] = transformer.preprocess('data', caffe.io.load_image(eval_image_dir + f))
        i=i+1


# handle the remaining images that did not fill a whole batch of 50
# at this point i equals the number of remaining images
print '@_@  extracted the rest',i,' images '

out = net.forward()
fc7Data = net.blobs['fc7'].data
probData=(net.blobs['prob'].data)
predict=np.argmax(probData,axis=1).reshape(50,-1)


for k in range(0,i):
    classnum=predict[k][0]
    feature=fc7Data[k].reshape(1,-1)

    tempData=feature.tobytes()
    tempDb.Put(str(fileNameNum[k][0]),tempData)

    tempDataStr=str(feature)
    tempDbStr.Put(str(fileNameNum[k][0]),tempDataStr)

    fileHandle.write(str(fileNameNum[k][0])+'.jpg '+str(classnum)+'\n')
    #print 'save feature of image : ',int(fileNameNum[k][0]),'.jpg'

# at the end,close the file
fileHandle.close()

print 'classnumMax is : ',classnumMax
print '@_@  extracted ',j*50+i,' images '
print '.Done'
######################################################################

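Note: our batch size is 50, so Caffe processes 50 images per forward pass. The last few images, which do not fill a complete batch of 50, are handled separately. That is why the feature-saving code above appears twice in nearly identical form.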
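
One way to avoid the duplicated block is to split the file list into batches first, so that the final partial batch goes through the same code path as the full ones. The sketch below only illustrates that idea and is not the code used above: iter_batches and process_batch are hypothetical names, and process_batch is a placeholder for the real forward pass and database writes.

```python
def iter_batches(items, batch_size=50):
    """Yield consecutive lists of at most `batch_size` items."""
    batch = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # the final, partially filled batch
        yield batch


def process_batch(batch):
    # Placeholder: the real script would fill net.blobs['data'] with the
    # images in `batch`, call net.forward(), and save only the first
    # len(batch) rows of the fc7 and prob blobs.
    return len(batch)


sizes = [process_batch(b) for b in iter_batches(range(120), batch_size=50)]
print(sizes)  # [50, 50, 20]
```

With 120 inputs and a batch size of 50, the last batch holds the remaining 20 items, which corresponds to the separate tail-handling section in the original script.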

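6 Run the script and check the results
Note: while running, the script keeps extracting features and prints progress messages. A feature folder appears in the project directory, containing the feature data in LevelDB format.
You can use LevelDB's Get function to read the feature of a given image, using the image's numeric name as the key. The fc7 features are 4096-dimensional.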
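
Reading a feature back means reversing the tobytes() call used when storing it. The sketch below shows only that decoding step and rests on assumptions: the fc7 blob is taken to be float32, the helper name decode_feature is hypothetical, and a plain dict stands in for the LevelDB database so the example runs without leveldb installed.

```python
import numpy as np


def decode_feature(raw_bytes, dtype=np.float32):
    """Turn the bytes written by feature.tobytes() back into a 1 x N array."""
    return np.frombuffer(raw_bytes, dtype=dtype).reshape(1, -1)


# Stand-in for the LevelDB database: a plain dict keyed by image number.
# With py-leveldb the lookup would be db.Get(key) instead of fake_db[key].
feature = np.arange(4096, dtype=np.float32).reshape(1, -1)
fake_db = {"123": feature.tobytes()}

restored = decode_feature(fake_db["123"])
print(restored.shape)  # (1, 4096)
```

The string copies in the featureStr folder are suited to viewing only: by default NumPy abbreviates long arrays with an ellipsis when converting them to a string, so they generally cannot be parsed back into the full vector.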
