Image classification with a Caffe pre-trained model

tina_ttl

Article link: https://blog.csdn.net/tina_ttl/article/details/51033646

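This post walks through how to classify an image with a Caffe pre-trained model.
Caffe's examples include a complete program for this task; reading through that program is enough to understand the whole process.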

Setup

Configure the Python environment: import numpy and set the plotting defaults.

# set up Python environment: numpy for numerical routines, and matplotlib for plotting
import numpy as np
import matplotlib.pyplot as plt
# display plots in this notebook
%matplotlib inline

# set display defaults
plt.rcParams['figure.figsize'] = (10, 10)        # large images
plt.rcParams['image.interpolation'] = 'nearest'  # don't interpolate: show square pixels
plt.rcParams['image.cmap'] = 'gray'  # use grayscale output rather than a (potentially misleading) color heatmap

Import caffe (strictly speaking, pycaffe, Caffe's Python interface)

# The caffe module needs to be on the Python path;
#  we'll add it here explicitly.
import sys
caffe_root = '../'  # this file should be run from {caffe_root}/examples (otherwise change this line)
sys.path.insert(0, caffe_root + 'python')

import caffe
# If you get "No module named _caffe", either you have not built pycaffe or you have the wrong path.

Downloading the models

Next, check whether the caffemodel is present under the models directory inside caffe_root. If it is not, download it with the download_model_binary.py script in the scripts folder under caffe_root.
For example, CaffeNet's caffemodel is named bvlc_reference_caffenet.caffemodel and lives in the bvlc_reference_caffenet folder under models in caffe_root (models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel).

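Second-level directory	Third-level directory/file	Fourth-level directory/file
/models	/bvlc_reference_caffenet	/bvlc_reference_caffenet.caffemodel
/scripts	/download_model_binary.py
/examples	/(the program being run)
/python	/caffe	/imagenet/…

Here '../' means the parent directory of the running program; in the table above, that is the caffe_root folder.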
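
The check-then-download step described above can be sketched in plain Python. This is only an illustration: the helper names `caffenet_weights_path` and `ensure_caffenet` are made up for this post, and it assumes download_model_binary.py takes the model directory as its single argument.

```python
import os
import subprocess
import sys


def caffenet_weights_path(caffe_root):
    """Path of the CaffeNet weights file, following the directory table above."""
    return os.path.join(caffe_root, 'models', 'bvlc_reference_caffenet',
                        'bvlc_reference_caffenet.caffemodel')


def ensure_caffenet(caffe_root, run=subprocess.check_call):
    """Return (weights_path, downloaded).

    If the weights file is missing, call the download script.
    Assumption: the script accepts the model directory as its only argument.
    `run` is injectable so the logic can be exercised without a Caffe checkout.
    """
    weights = caffenet_weights_path(caffe_root)
    if os.path.isfile(weights):
        return weights, False
    script = os.path.join(caffe_root, 'scripts', 'download_model_binary.py')
    model_dir = os.path.join(caffe_root, 'models', 'bvlc_reference_caffenet')
    run([sys.executable, script, model_dir])
    return weights, True


# Typical call from {caffe_root}/examples (left commented out so that the
# sketch does not start a download when pasted):
# weights, downloaded = ensure_caffenet('../')
```

Passing `run` as a parameter is just a convenience for trying the logic out; in a notebook the download line can equally be a shell command.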

Loading the model and setting up preprocessing

Read the net from disk

# set the caffe mode; CPU mode is used here
caffe.set_mode_cpu()

# the prototxt file that defines CaffeNet's network structure
model_def = caffe_root + 'models/bvlc_reference_caffenet/deploy.prototxt'

# CaffeNet's pre-trained model, i.e. all of its trained parameters
model_weights = caffe_root + 'models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel'

# read CaffeNet from disk
net = caffe.Net(model_def,      # defines the structure of the model
                model_weights,  # contains the trained weights
                caffe.TEST)     # use test mode (e.g., don't perform dropout)

Set up the preprocessing transformer

# load the mean ImageNet image (as distributed with Caffe) for subtraction
mu = np.load(caffe_root + 'python/caffe/imagenet/ilsvrc_2012_mean.npy')
mu = mu.mean(1).mean(1)  # average over pixels to obtain the mean (BGR) pixel values
print 'mean-subtracted values:', zip('BGR', mu)

# create transformer for the input called 'data'
transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})

transformer.set_transpose('data', (2,0,1))  # move image channels to outermost dimension
transformer.set_mean('data', mu)            # subtract the dataset-mean value in each channel
transformer.set_raw_scale('data', 255)      # rescale from [0, 1] to [0, 255]
transformer.set_channel_swap('data', (2,1,0))  # swap channels from RGB to BGR

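When CaffeNet was trained, its training images went through some preprocessing. To classify a new image with the pre-trained model, the new image must go through the same preprocessing; this program uses caffe.io.Transformer for that.
The code is shown above; a brief explanation follows (I have not fully understood it yet, and will refine this over time).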

Loading the mean of all ImageNet images

This is the mean over all images in the ImageNet dataset.

The ilsvrc_2012_mean.npy file here is a NumPy data file, whose type is
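
The type and shape of whatever np.load returns can be checked directly, and the reduction `mu.mean(1).mean(1)` is easy to try on a small array. The sketch below uses a synthetic stand-in rather than the real mean file; the (channels, height, width) layout is an assumption inferred from how the tutorial averages, and the values are invented.

```python
import numpy as np

# Synthetic stand-in for the mean image, assumed to be laid out as
# (channels, height, width). The numbers are made up for illustration.
mean_image = np.arange(3 * 4 * 4, dtype=np.float64).reshape(3, 4, 4)

# Inspect what kind of object we are holding.
print(type(mean_image).__name__, mean_image.dtype, mean_image.shape)

# Same reduction as in the tutorial: average over height, then over width,
# leaving one mean value per channel.
mu = mean_image.mean(1).mean(1)
print(mu.shape)
print(mu)
```

Because every row has the same number of pixels, averaging twice like this gives the same result as a single mean over both spatial axes.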

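Creating the transformer

The transformer does four things:
(1) It rearranges the dimensions of the array that the image is read into.
To classify an image it first has to be read in Python, where the loaded array is laid out as height, width, channel.
To match Caffe's data layout it must be converted to channel, height, width.
(2) It subtracts from every pixel of each input channel the mean of that channel over all ImageNet images, i.e. mu.
(3) It rescales the test image: the image loaded in Python has pixel values in [0, 1], and to use the Caffe model they must be scaled back to [0, 255].
(4) It reorders the three channels of the input image: images are normally stored as R-G-B, whereas Caffe handles them as B-G-R.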
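
To make the four steps concrete, here is a NumPy-only sketch of them. It is not pycaffe's implementation: the function name `preprocess_sketch` and the mean values are made up, resizing to the network input size is left out, and the order of operations (transpose, channel swap, rescale, mean subtraction) reflects my understanding of what the transformer does, with the mean given in BGR order.

```python
import numpy as np


def preprocess_sketch(image_hwc_rgb, mean_bgr):
    """Apply the four transformer steps to one image using plain NumPy.

    image_hwc_rgb: float array in [0, 1], laid out as (height, width, RGB)
    mean_bgr:      per-channel mean in B, G, R order, shape (3,)
    """
    x = image_hwc_rgb.astype(np.float32)
    x = x.transpose(2, 0, 1)            # (1) H x W x C -> C x H x W
    x = x[[2, 1, 0], :, :]              # (4) RGB -> BGR
    x = x * 255.0                       # (3) [0, 1] -> [0, 255]
    x = x - mean_bgr.reshape(3, 1, 1)   # (2) subtract the per-channel mean
    return x


# A tiny pure-red test image and an invented BGR mean.
image = np.zeros((4, 5, 3), dtype=np.float32)
image[..., 0] = 1.0
mean_bgr = np.array([104.0, 117.0, 123.0], dtype=np.float32)

out = preprocess_sketch(image, mean_bgr)
print(out.shape)       # channels first
print(out[:, 0, 0])    # B, G, R values of the top-left pixel
```

For the red test image the B and G channels end up at minus their means, and the R channel at 255 minus its mean, which is a quick way to confirm that the swap happened before the subtraction.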

Loading an image and classifying it

Set the input shape of the net

# set the size of the input (we can skip this if we're happy
#  with the default; we can also change it later, e.g., for different batch sizes)
net.blobs['data'].reshape(50,        # batch size
                          3,         # 3-channel (BGR) images
                          227, 227)  # image size is 227x227

Load the image and preprocess it with the transformer

# load the image from disk with load_image; the resulting image is a (360, 480, 3) ndarray
image = caffe.io.load_image(caffe_root + 'examples/images/cat.jpg')
# preprocess the image; the resulting ndarray has shape (3, 227, 227)
transformed_image = transformer.preprocess('data', image)
print transformed_image.shape
# display the image
plt.imshow(image)

Classify the input image with the network

# copy the preprocessed image into the memory allocated for the net
net.blobs['data'].data[...] = transformed_image

# compute the network output; it is a dict whose 'prob' entry holds the class probabilities for the input
output = net.forward()

# take the probability vector for the input image out of the dict; its shape is (1000,)
output_prob = output['prob'][0]  # the output probability vector for the first image in the batch

print 'predicted class is:', output_prob.argmax()

The input is a picture of a cat; running this code gives:

predicted class is: 281
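
One detail worth spelling out: the net was reshaped to a batch of 50, yet only a single preprocessed image is assigned to it. Under NumPy's broadcasting rules that assignment fills every slot of the batch with the same image, which is why reading the first row of the output is enough. The sketch below shows this with stand-in arrays only; `batch` and `single` are illustrative names, and no Caffe is involved.

```python
import numpy as np

# Stand-ins: `batch` plays the role of net.blobs['data'].data,
# `single` plays the role of transformed_image.
batch = np.zeros((50, 3, 227, 227), dtype=np.float32)
single = np.random.rand(3, 227, 227).astype(np.float32)

# Assigning a (3, 227, 227) array into a (50, 3, 227, 227) array
# broadcasts it along the batch axis.
batch[...] = single

# First and last slot both hold the same image.
print(np.array_equal(batch[0], single), np.array_equal(batch[49], single))
```

If only one image is ever classified, reshaping the input to a batch size of 1 would avoid the 49 redundant forward computations.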

Find the label at the position of the largest probability

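# path of the ImageNet label file
import os
labels_file = caffe_root + 'data/ilsvrc12/synset_words.txt'

# check whether the label file exists; if not, download it
if not os.path.exists(labels_file):
    !../data/ilsvrc12/get_ilsvrc_aux.sh

# load the labels from the txt file; the result is a (1000,) ndarray
labels = np.loadtxt(labels_file, str, delimiter='\t')

print 'output label:', labels[output_prob.argmax()]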

Result:

output label: n02123045 tabby, tabby cat
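
The `delimiter='\t'` argument is what keeps each label line in one piece: the lines contain spaces and commas but no tabs, so every line becomes a single string. A small sketch with an in-memory file shows the effect. The three lines are taken from the outputs in this post; in this toy file their indices are 0, 1, 2, not their real class indices.

```python
import io

import numpy as np

# In-memory stand-in for the label file, one label per line.
fake_labels_file = io.StringIO(
    u"n02123045 tabby, tabby cat\n"
    u"n02123159 tiger cat\n"
    u"n02124075 Egyptian cat\n"
)

# With a tab delimiter, each whole line is read as one string.
labels = np.loadtxt(fake_labels_file, str, delimiter='\t')
print(labels.shape)
print(labels[0])

# The WordNet id and the readable name can be separated at the first space.
wnid, name = labels[0].split(' ', 1)
print(wnid)
print(name)
```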

Look at the top-5 predictions

# sort top five predictions from softmax output
top_inds = output_prob.argsort()[::-1][:5]  # reverse sort and take five largest items

print 'probabilities and labels:'
zip(output_prob[top_inds], labels[top_inds])

The result is:

probabilities and labels:
Out[27]:
[(0.31243625, 'n02123045 tabby, tabby cat'),
 (0.23797157, 'n02123159 tiger cat'),
 (0.12387245, 'n02124075 Egyptian cat'),
 (0.10075716, 'n02119022 red fox, Vulpes vulpes'),
 (0.070957333, 'n02127052 lynx, catamount')]
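
The `argsort()[::-1][:5]` idiom is easier to follow on a small example. The sketch below uses an invented six-class probability vector and made-up class names, and takes the top 3 instead of the top 5.

```python
import numpy as np

# Invented probabilities and class names, for illustration only.
probs = np.array([0.05, 0.40, 0.10, 0.25, 0.02, 0.18])
names = np.array(['class_a', 'class_b', 'class_c',
                  'class_d', 'class_e', 'class_f'])

k = 3
# argsort gives indices in ascending order of probability;
# reversing makes it descending, and the slice keeps the k largest.
top_inds = probs.argsort()[::-1][:k]

# Indexing both arrays with the same index array keeps them aligned.
for p, n in zip(probs[top_inds], names[top_inds]):
    print('%.2f  %s' % (p, n))
```

Here `top_inds` comes out as the indices of 0.40, 0.25 and 0.18, in that order.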
