Keras Natural Language Processing (23)

棋王一生

Category: Predicting Images with the Pretrained VGG16 Model. Tags: ImageNet, VGG16

Article link: https://blog.csdn.net/weixin_44755244/article/details/102821871

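Chapter 20: Recognizing Objects with a Pretrained Model

Convolutional neural networks can now outperform humans on some computer vision tasks, such as image classification: given a photo of an object, decide which of 1,000 specific categories it belongs to. One of the strong models for this task is VGG, developed by researchers at the University of Oxford. The family includes VGG16 and VGG19; both can classify the objects in a photo, and both are freely available. In this chapter you will learn:

About the ImageNet dataset and the VGG models
How to load a VGG model with Keras
How to classify an image with the loaded VGG model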

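20.1 Overview

This chapter is divided into the following parts:

The ImageNet image collection
The VGG models
Loading a VGG model with Keras
Developing a simple image classifier

Note: Keras uses the Python Imaging Library (PIL) to handle images.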

20.2 ImageNet

[Image: description of the ImageNet dataset]

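20.3 The VGG Models

[Image: description of the VGG models, with a rough translation]
I have simply inserted a picture here. I think work from 2014 is rather old, although the book's author considers it fairly recent.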

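20.4 Loading the VGG Model

First, here is the GitHub address of the VGG weight files: https://github.com/fchollet/deep-learning-models/releases. The page also hosts weight files for many models other than VGG; here we only need to download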

vgg16_weights_tf_dim_ordering_tf_kernels.h5
vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5
vgg19_weights_tf_dim_ordering_tf_kernels.h5
vgg19_weights_tf_dim_ordering_tf_kernels_notop.h5

Only these four files are needed.
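Before loading, it can help to confirm that the downloaded files are where the code expects them. The following is a minimal sketch and not part of the original post: the helper name missing_weight_files and its directory argument are assumptions made for illustration, and only the four file names come from the list above.

```python
import os

# The four weight files listed above.
WEIGHT_FILES = [
    "vgg16_weights_tf_dim_ordering_tf_kernels.h5",
    "vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5",
    "vgg19_weights_tf_dim_ordering_tf_kernels.h5",
    "vgg19_weights_tf_dim_ordering_tf_kernels_notop.h5",
]


def missing_weight_files(weight_dir, names=WEIGHT_FILES):
    """Return the file names from `names` that are not found in `weight_dir`.

    Hypothetical helper for illustration; it only checks that the files exist.
    """
    return [n for n in names if not os.path.isfile(os.path.join(weight_dir, n))]
```

Calling missing_weight_files on the download folder should return an empty list once all four files are in place.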
Load the model:

from tensorflow.keras.applications.vgg16 import VGG16
import os

# folder that holds the downloaded weight files
weight_path = "F:\\5-model and data\\git model"
weight_name = 'vgg16_weights_tf_dim_ordering_tf_kernels.h5'
weight = os.path.join(weight_path, weight_name)

# include_top=True keeps the fully connected classifier layers
vgg16 = VGG16(
    include_top=True,
    weights=weight
)
vgg16.summary()

Its structure:

Model: "vgg16"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_1 (InputLayer)         [(None, 224, 224, 3)]     0         
_________________________________________________________________
block1_conv1 (Conv2D)        (None, 224, 224, 64)      1792      
_________________________________________________________________
block1_conv2 (Conv2D)        (None, 224, 224, 64)      36928     
_________________________________________________________________
block1_pool (MaxPooling2D)   (None, 112, 112, 64)      0         
_________________________________________________________________
block2_conv1 (Conv2D)        (None, 112, 112, 128)     73856     
_________________________________________________________________
block2_conv2 (Conv2D)        (None, 112, 112, 128)     147584    
_________________________________________________________________
block2_pool (MaxPooling2D)   (None, 56, 56, 128)       0         
_________________________________________________________________
block3_conv1 (Conv2D)        (None, 56, 56, 256)       295168    
_________________________________________________________________
block3_conv2 (Conv2D)        (None, 56, 56, 256)       590080    
_________________________________________________________________
block3_conv3 (Conv2D)        (None, 56, 56, 256)       590080    
_________________________________________________________________
block3_pool (MaxPooling2D)   (None, 28, 28, 256)       0         
_________________________________________________________________
block4_conv1 (Conv2D)        (None, 28, 28, 512)       1180160   
_________________________________________________________________
block4_conv2 (Conv2D)        (None, 28, 28, 512)       2359808   
_________________________________________________________________
block4_conv3 (Conv2D)        (None, 28, 28, 512)       2359808   
_________________________________________________________________
block4_pool (MaxPooling2D)   (None, 14, 14, 512)       0         
_________________________________________________________________
block5_conv1 (Conv2D)        (None, 14, 14, 512)       2359808   
_________________________________________________________________
block5_conv2 (Conv2D)        (None, 14, 14, 512)       2359808   
_________________________________________________________________
block5_conv3 (Conv2D)        (None, 14, 14, 512)       2359808   
_________________________________________________________________
block5_pool (MaxPooling2D)   (None, 7, 7, 512)         0         
_________________________________________________________________
flatten (Flatten)            (None, 25088)             0         
_________________________________________________________________
fc1 (Dense)                  (None, 4096)              102764544 
_________________________________________________________________
fc2 (Dense)                  (None, 4096)              16781312  
_________________________________________________________________
predictions (Dense)          (None, 1000)              4097000   
=================================================================
Total params: 138,357,544
Trainable params: 138,357,544
Non-trainable params: 0
_________________________________________________________________
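The parameter counts in this summary can be checked by hand. The sketch below assumes what is standard for VGG16, namely 3x3 convolution kernels and one bias per output channel or unit; the function names are made up for illustration. It recomputes each layer's count from the channel sizes shown in the table.

```python
def conv_params(in_channels, out_channels, kernel=3):
    """Weights plus biases of a Conv2D layer with a square kernel."""
    return kernel * kernel * in_channels * out_channels + out_channels


def dense_params(in_units, out_units):
    """Weights plus biases of a fully connected (Dense) layer."""
    return in_units * out_units + out_units


# (input channels, output channels) of the 13 convolutional layers, block1 to block5
CONV_LAYERS = [
    (3, 64), (64, 64),
    (64, 128), (128, 128),
    (128, 256), (256, 256), (256, 256),
    (256, 512), (512, 512), (512, 512),
    (512, 512), (512, 512), (512, 512),
]

# (input units, output units) of fc1, fc2 and predictions
DENSE_LAYERS = [(7 * 7 * 512, 4096), (4096, 4096), (4096, 1000)]

total = sum(conv_params(i, o) for i, o in CONV_LAYERS)
total += sum(dense_params(i, o) for i, o in DENSE_LAYERS)
print(f"{total:,}")  # 138,357,544, the "Total params" value in the summary
```

Most of the parameters sit in fc1: flattening the 7x7x512 feature map gives 25,088 inputs, and connecting them to 4,096 units alone costs 102,764,544 parameters.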

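20.5 A Simple Image Classifier

Next, let's see how the model classifies an image.

20.5.1 Getting the Image

Below is a photo of a coffee mug.
[Image: photo of a coffee mug]
The image properties show:

The image is 767x576 pixels and is saved on the desktop.

20.5.2 Loading the VGG Model

Load the model in the same way as above: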

from tensorflow.keras.applications.vgg16 import VGG16
import os

# folder that holds the downloaded weight files
weight_path = "F:\\5-model and data\\git model"
weight_name = 'vgg16_weights_tf_dim_ordering_tf_kernels.h5'
weight = os.path.join(weight_path, weight_name)

vgg16 = VGG16(
    include_top=True,
    weights=weight
)

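With the model loaded, the next step is to load and prepare the image.

20.5.3 Loading and Preparing the Image

We first load the image as pixel data. Keras provides some tools to help with this step. The load_img function loads the image and resizes it to the required size, which is 224x224 pixels for VGG16: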

from tensorflow.keras.preprocessing.image import load_img

img_path = r"d:\桌面\咖啡杯.png"
# target_size is (height, width); the image is resized while loading
img = load_img(img_path, target_size=(224, 224))
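By default, load_img with target_size resizes straight to the requested size and does not keep the aspect ratio, so a 767x576 photo is squeezed more horizontally than vertically. The sketch below only does the arithmetic. It assumes that 767 is the width and 576 the height, and both function names are hypothetical.

```python
def scale_factors(width, height, target=(224, 224)):
    """Horizontal and vertical scale applied when resizing straight to `target`.

    `target` is (height, width), the same order as load_img's target_size.
    """
    target_h, target_w = target
    return target_w / width, target_h / height


def center_square_box(width, height):
    """(left, upper, right, lower) box of the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2
    upper = (height - side) // 2
    return left, upper, left + side, upper + side


sx, sy = scale_factors(767, 576)
print(round(sx, 3), round(sy, 3))    # 0.292 0.389
print(center_square_box(767, 576))   # (95, 0, 671, 576)
```

If the distortion matters for a given photo, one option is to crop it to the centered square first, for example with PIL's Image.crop, and then resize. The post itself resizes directly, which is good enough for this example.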

Next, we convert the pixels to a NumPy array so that Keras can work with them, using the img_to_array function:

from tensorflow.keras.preprocessing.image import img_to_array

# PIL image -> float array of shape (224, 224, 3)
img = img_to_array(img)

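The network expects one or more images as input, so the input array must be four-dimensional: samples, rows, columns, and channels. We have only one sample (a single image), so we call reshape to add the extra dimension: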

img = img.reshape(1,img.shape[0],img.shape[1],img.shape[2])
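To make the shape change concrete, here is a small sketch that uses a zero-filled stand-in array instead of the real image. It also shows np.expand_dims, a NumPy alternative to the reshape call that gives the same result.

```python
import numpy as np

# Stand-in for the array returned by img_to_array: rows, columns, channels.
img = np.zeros((224, 224, 3), dtype="float32")

# The reshape used in the text: prepend a sample dimension of size 1.
batch_a = img.reshape(1, img.shape[0], img.shape[1], img.shape[2])

# Equivalent alternative: insert a new axis at position 0.
batch_b = np.expand_dims(img, axis=0)

print(batch_a.shape, batch_b.shape)  # (1, 224, 224, 3) (1, 224, 224, 3)
```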

Next, the image pixels must be prepared in the same way as the ImageNet training data. Keras provides the preprocess_input function for this.

from tensorflow.keras.applications.vgg16 import preprocess_input

img = preprocess_input(img)
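For VGG16, preprocess_input reorders the channels from RGB to BGR and subtracts the ImageNet mean of each channel, without any further scaling. The sketch below reproduces that behaviour with plain NumPy to show what happens to the pixel values. It is an approximation for illustration rather than the library code, and the function name is made up.

```python
import numpy as np

# Per-channel ImageNet means in BGR order.
IMAGENET_BGR_MEAN = np.array([103.939, 116.779, 123.68], dtype="float32")


def vgg_preprocess_sketch(batch_rgb):
    """Approximate preprocess_input for VGG16 on an RGB batch shaped
    (samples, rows, columns, 3) with values in the range 0-255."""
    batch = np.asarray(batch_rgb, dtype="float32")
    batch_bgr = batch[..., ::-1]          # reorder channels RGB -> BGR
    return batch_bgr - IMAGENET_BGR_MEAN  # subtract the mean of each channel


# A single pure-red pixel: R=255, G=0, B=0.
red = np.array([[[[255.0, 0.0, 0.0]]]])
print(vgg_preprocess_sketch(red))  # approximately [-103.939, -116.779, 131.32]
```

After this step the values are no longer in the 0-255 range, so a preprocessed array should not be displayed as an image directly.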

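The model is now loaded and the image has been loaded and preprocessed, so we can make a prediction.

20.5.4 Making a Prediction

We call the model's predict function to obtain the probability that the image belongs to each of the 1,000 known object types.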

yhat = vgg16.predict(img)
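For a single image, yhat has shape (1, 1000). With the default classifier settings, the final predictions layer applies a softmax, so the 1,000 values are non-negative and sum to 1. The following is a minimal pure-Python sketch of softmax on three toy scores, which are illustrative values and not model output.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([2.0, 1.0, 0.1])  # toy scores for three classes
print(probs, sum(probs))
```

The largest score always receives the largest probability, which is why the top prediction can be read directly from the maximum of yhat[0].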

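Now we need to interpret the predicted probabilities.

20.5.5 Interpreting the Prediction

Keras provides the decode_predictions function to interpret the class probabilities. It returns a list of classes together with their probabilities; here we only want the most likely class and its probability.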

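from tensorflow.keras.applications.vgg16 import decode_predictions

yhat = vgg16.predict(img)
# for each image: a list of (class_id, class_name, probability), best first
label = decode_predictions(yhat)
# keep only the most likely class
la = label[0][0]
print('%s (%.2f%%)' % (la[1], la[2] * 100))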

The result:

coffee_mug (72.53%)
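Conceptually, decoding means pairing each probability with its class name and sorting from most to least likely. The sketch below shows that idea on a toy example with four made-up classes and made-up probabilities. The function top_k is hypothetical and is not the Keras implementation, which also looks up the ImageNet class names.

```python
def top_k(probabilities, class_names, k=3):
    """Return the k most likely (class_name, probability) pairs, best first."""
    ranked = sorted(
        zip(class_names, probabilities),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return ranked[:k]


# Toy example with four made-up classes; real VGG16 output has 1000 entries.
names = ["cat", "dog", "mug", "pot"]
probs = [0.05, 0.15, 0.70, 0.10]
for name, p in top_k(probs, names, k=3):
    print('%s (%.2f%%)' % (name, p * 100))
```

With these toy numbers the loop prints mug, dog and pot, in that order.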

For the top three:

label = decode_predictions(yhat)
# print the three most likely classes of the first (and only) image
for la in label[0][:3]:
    print('%s (%.2f%%)' % (la[1], la[2] * 100))

The result:

coffee_mug (72.53%)
coffeepot (9.63%)
cup (6.96%)

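The top three classes together account for about 89% of the probability (72.53% + 9.63% + 6.96% = 89.12%).
The complete example is given below: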

from tensorflow.keras.applications.vgg16 import VGG16, \
    preprocess_input, decode_predictions
import os
from tensorflow.keras.preprocessing.image import load_img, img_to_array

# load VGG16 with the locally downloaded weights
weight_path = "F:\\5-model and data\\git model"
weight_name = 'vgg16_weights_tf_dim_ordering_tf_kernels.h5'
weight = os.path.join(weight_path, weight_name)
vgg16 = VGG16(
    include_top=True,
    weights=weight
)
# vgg16.summary()

# load the image and resize it to the 224x224 input size
img_path = "20-4-咖啡杯.png"
img = load_img(img_path, target_size=(224, 224))
img = img_to_array(img)
# add the sample dimension: (224, 224, 3) -> (1, 224, 224, 3)
img = img.reshape(1, img.shape[0], img.shape[1], img.shape[2])
img = preprocess_input(img)

# predict and print the three most likely classes
yhat = vgg16.predict(img)
label = decode_predictions(yhat)
for la in label[0][:3]:
    print('%s (%.2f%%)' % (la[1], la[2] * 100))

The result:

coffee_mug (72.53%)
coffeepot (9.63%)
cup (6.96%)

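Interested readers can look into the ImageNet challenge and the networks related to it. Here is some information I found myself:
[Image: notes on the ImageNet challenge]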
