Reading Notes: Deep Learning for Computer Vision with Python (Chapter 12)

Author: 一只自学喵

Tags: python, deep learning, image processing

Article link: https://blog.csdn.net/weixin_43981811/article/details/95240984

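Chapter 12: Training Your First CNN

[Overview] This chapter has two goals:
1. Build an image preprocessor that converts an input image into a NumPy array Keras can use directly.
2. Build a CNN. It has only a single convolutional layer (one step at a time).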

I. Image preprocessors and dataset loading

1.1 imagetoarraypreprocessor

Without further ado, let's go straight to the code.

from keras.preprocessing.image import img_to_array

class ImageToArrayPreprocessor:
    def __init__(self, dataFormat=None):
        self.dataFormat = dataFormat

    def preprocess(self, image):
        return img_to_array(image, data_format=self.dataFormat)

Piece by piece:

    def __init__(self, dataFormat=None):
        self.dataFormat = dataFormat

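The constructor accepts an optional argument named dataFormat. It defaults to None. You can pass the string "channels_first" or "channels_last", but it is usually best to leave it as None so that Keras picks the image dimension ordering from its configuration file (the image_data_format setting in keras.json).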

    def preprocess(self, image):
        return img_to_array(image, data_format=self.dataFormat)

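What this method does:
1. Accepts an image as input.
2. Calls img_to_array, passing along the data format (channels_first, channels_last, or None for the Keras default).
3. Returns a NumPy array whose channels are arranged in that order.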
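
To make the two orderings concrete, here is a minimal sketch using plain NumPy rather than Keras. The tiny 2x3 "image" and the variable names are made up for illustration; the transpose only mimics what the channels_first option means for the array layout.

```python
import numpy as np

# A dummy image 2 pixels high, 3 pixels wide, with 3 channels, stored
# channels-last as (height, width, channels). This is the layout
# cv2.imread returns.
image_hwc = np.arange(2 * 3 * 3, dtype="float32").reshape(2, 3, 3)

# channels_first moves the channel axis to the front:
# (channels, height, width).
image_chw = np.transpose(image_hwc, (2, 0, 1))

print(image_hwc.shape)  # channels_last layout
print(image_chw.shape)  # channels_first layout
```

The pixel values are the same in both arrays; only the axis order differs, which is why the network's input shape has to agree with whichever ordering is in use.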

1.2 simplepreprocessor

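This preprocessor was introduced a few chapters earlier, but I was too lazy to write it up at the time... Since it will be needed again later, I am including it in this chapter.
Here is the code: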

import cv2

class SimplePreprocessor:
	def __init__(self, width, height, inter=cv2.INTER_AREA):
		self.width = width
		self.height = height
		self.inter = inter

	def preprocess(self, image):
		return cv2.resize(image, (self.width, self.height),
			interpolation=self.inter)

This preprocessor resizes the input image to a fixed size, ignoring the aspect ratio. Piece by piece:

	def __init__(self, width, height, inter=cv2.INTER_AREA):
		self.width = width
		self.height = height
		self.inter = inter

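The constructor takes width and height, the dimensions of the output image. The last argument, inter, is the interpolation method and defaults to cv2.INTER_AREA. The five commonly used methods are:

1. INTER_NEAREST - nearest-neighbor interpolation
2. INTER_LINEAR - bilinear interpolation
3. INTER_AREA - resampling using pixel area relation. This is often the better choice for image decimation (shrinking). When enlarging an image, it behaves much like nearest-neighbor.
4. INTER_CUBIC - bicubic interpolation over a 4x4 pixel neighborhood
5. INTER_LANCZOS4 - Lanczos interpolation over an 8x8 pixel neighborhood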

	def preprocess(self, image):
		return cv2.resize(image, (self.width, self.height),
			interpolation=self.inter)

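Calls cv2.resize to scale the input image and returns the resized image. Note that cv2.resize takes the target size as (width, height).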
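
To give a feel for what the simplest interpolation option, nearest-neighbor, does, here is a simplified pure-Python sketch on a small grid of numbers. resize_nearest is a hypothetical helper written for this note, not OpenCV's implementation; like cv2.resize, it takes the target size as width first, then height.

```python
def resize_nearest(image, width, height):
    """Resize a 2D grid (a list of rows) by nearest-neighbor sampling."""
    src_h = len(image)
    src_w = len(image[0])
    out = []
    for y in range(height):
        # Map the output row back to a source row, clamped to the last row.
        src_y = min(int(y * src_h / height), src_h - 1)
        row = []
        for x in range(width):
            # Same mapping for the column.
            src_x = min(int(x * src_w / width), src_w - 1)
            row.append(image[src_y][src_x])
        out.append(row)
    return out


grid = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]

small = resize_nearest(grid, 2, 2)      # shrink 4x4 -> 2x2
wide = resize_nearest([[1, 2]], 4, 1)   # enlarge 2x1 -> 4x1
print(small)
print(wide)
```

Each output pixel simply copies one source pixel, which is fast but blocky. The other methods blend several neighboring pixels instead.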

1.3 simpledatasetloader

Straight to the code:

import numpy as np
import cv2
import os

class SimpleDatasetLoader:
	def __init__(self, preprocessors=None):
		self.preprocessors = preprocessors
		if self.preprocessors is None:
			self.preprocessors = []

	def load(self, imagePaths, verbose=-1):
		data = []
		labels = []
		for (i, imagePath) in enumerate(imagePaths):
			image = cv2.imread(imagePath)
			label = imagePath.split(os.path.sep)[-2]
			if self.preprocessors is not None:
				for p in self.preprocessors:
					image = p.preprocess(image)
			data.append(image)
			labels.append(label)
			if verbose > 0 and i > 0 and (i + 1) % verbose == 0:
				print("[INFO] processed {}/{}".format(i + 1, len(imagePaths)))

		return (np.array(data), np.array(labels))

This code loads images from disk and runs each loaded image through the preprocessors.

	def __init__(self, preprocessors=None):
		self.preprocessors = preprocessors
		if self.preprocessors is None:
			self.preprocessors = []

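The constructor accepts a preprocessors argument: a list of preprocessor objects to apply to each image. It defaults to None, in which case it is replaced with an empty list, meaning no preprocessing is applied.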

	def load(self, imagePaths, verbose=-1):
		data = []
		labels = []
		for (i, imagePath) in enumerate(imagePaths):
			image = cv2.imread(imagePath)
			label = imagePath.split(os.path.sep)[-2]
			if self.preprocessors is not None:
				for p in self.preprocessors:
					image = p.preprocess(image)
			data.append(image)
			labels.append(label)
			if verbose > 0 and i > 0 and (i + 1) % verbose == 0:
				print("[INFO] processed {}/{}".format(i + 1, len(imagePaths)))

		return (np.array(data), np.array(labels))

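The load method takes the list of image paths as input. Paths are expected to look like …/cat/cat.jpg or …/dog/dog.jpg, so the label can be read directly from the path: it is the name of the image's parent directory. If preprocessors is non-empty, each preprocessor is applied to the image in turn. When verbose is positive, a progress message is printed every verbose images.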
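
The label extraction and the progress condition can be tried on their own without any images. The paths below are hypothetical and only need to follow the dataset/class/file layout described above.

```python
import os

# Hypothetical paths following the <dataset>/<class>/<file> layout.
image_paths = [
    os.path.join("animals", "cat", "cat_0001.jpg"),
    os.path.join("animals", "dog", "dog_0001.jpg"),
    os.path.join("animals", "panda", "panda_0001.jpg"),
]

# The label is the second-to-last path component (the parent directory).
labels = [p.split(os.path.sep)[-2] for p in image_paths]
print(labels)

# Same condition as in load(): report after every `verbose` images.
verbose = 2
reported = [
    i + 1
    for i in range(len(image_paths))
    if verbose > 0 and i > 0 and (i + 1) % verbose == 0
]
print(reported)
```

Because the label comes purely from the directory name, the dataset has to be organized with one folder per class for this loader to work.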

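1.4 What these preprocessors are for

At first glance these two preprocessors seem like a lot of code for very little: each one just calls a single function. Couldn't I simply call img_to_array and cv2.resize directly in my program?
Calling them directly certainly works, but the benefit of wrapping them is that the preprocessors can be chained together when loading a dataset from disk.
For example, suppose we want to resize every input image to 32x32 pixels:
sp = SimplePreprocessor(32, 32)
After resizing, we also want each image converted to a NumPy array in channels_first or channels_last order:
iap = ImageToArrayPreprocessor()
Now suppose we want to load an image dataset from disk and preprocess every image in it.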

sdl = SimpleDatasetLoader(preprocessors=[sp, iap])
(data, labels) = sdl.load(imagePaths, verbose=500)

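The code above reads each image from disk, runs it through SimplePreprocessor and then ImageToArrayPreprocessor, in that order, and outputs the resulting NumPy arrays. The code is much tidier this way, and the preprocessors are easy to reuse next time.
[Figure: the preprocessing pipeline]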
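
The chaining pattern itself does not depend on OpenCV or Keras. Below is a minimal sketch with two stand-in preprocessors that work on plain lists. AddOne, Double and apply_chain are hypothetical names used only for illustration; the loop in apply_chain mirrors the inner loop of SimpleDatasetLoader.load.

```python
class AddOne:
    """Stand-in preprocessor: adds 1 to every value."""

    def preprocess(self, image):
        return [v + 1 for v in image]


class Double:
    """Stand-in preprocessor: multiplies every value by 2."""

    def preprocess(self, image):
        return [v * 2 for v in image]


def apply_chain(image, preprocessors):
    # Feed the output of each preprocessor into the next one.
    for p in preprocessors:
        image = p.preprocess(image)
    return image


add_then_double = apply_chain([1, 2], [AddOne(), Double()])
double_then_add = apply_chain([1, 2], [Double(), AddOne()])
print(add_then_double)
print(double_then_add)
```

The two results differ, so the order of the list matters. That is why the resize step is listed before the array conversion in the real pipeline.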

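II. A very shallow CNN: ShallowNet

[Overview] The network architecture in this chapter is INPUT -> CONV -> RELU -> FC. Once implemented, the network is applied to the Animals and CIFAR-10 datasets.

2.1 ShallowNet

Straight to the code: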

from keras.models import Sequential
from keras.layers import Conv2D
from keras.layers import Activation
from keras.layers import Flatten
from keras.layers import Dense
from keras import backend as K

# Input => CONV => RELU => FC
class ShallowNet:
    @staticmethod
    def build(width, height, depth, classes):
        model = Sequential()
        inputShape = (height, width, depth)
        if K.image_data_format() == "channels_first":
            inputShape = (depth, height, width)

        # define the first (and only) CONV => RELU layer
        model.add(Conv2D(32, (3, 3), padding="same",input_shape=inputShape))

        model.add(Activation("relu"))
        model.add(Flatten())
        model.add(Dense(classes))
        model.add(Activation("softmax"))

        return model

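This code implements the INPUT -> CONV -> RELU -> FC architecture.
Piece by piece:
@staticmethod means build can be called directly on the class, as in ShallowNet.build(...), without creating an instance first.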

class ShallowNet:
    @staticmethod
    def build(width, height, depth, classes):
        model = Sequential()
        inputShape = (height, width, depth)
        if K.image_data_format() == "channels_first":
            inputShape = (depth, height, width)

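This part defines the build method, which takes four arguments: the width, height and depth (number of channels) of the input images, and the total number of classes (for CIFAR-10, classes=10).
The last three lines make sure the input shape matches the ordering used by the Keras backend (channels last or channels first).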
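
The shape selection can be isolated into a small function to see what it produces. input_shape_for is a hypothetical helper; it takes the data format as an explicit argument instead of asking the Keras backend through K.image_data_format().

```python
def input_shape_for(width, height, depth, data_format="channels_last"):
    """Return the input shape tuple for the given channel ordering."""
    if data_format == "channels_first":
        return (depth, height, width)
    return (height, width, depth)


print(input_shape_for(32, 32, 3))
print(input_shape_for(32, 32, 3, data_format="channels_first"))
```

Note that height comes before width in both orderings, even though build lists width first in its signature.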

        model.add(Conv2D(32, (3, 3), padding="same",input_shape=inputShape))
        model.add(Activation("relu"))
        model.add(Flatten())
        model.add(Dense(classes))
        model.add(Activation("softmax"))

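First, define the convolutional layer. It uses 32 filters, each 3x3, with same padding so that the output keeps the spatial size of the input (not strictly necessary for this example, but a good habit to form from the start).
Next, add the ReLU activation.
Then flatten the output into a one-dimensional vector.
Then apply the fully connected layer, with one output per class.
Finally, apply softmax to produce the class probabilities.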
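
As a quick sanity check, the layer sizes and parameter counts can be worked out by hand. This is a back-of-the-envelope sketch assuming 32x32x3 inputs and 3 classes (the Animals setting), a stride of 1, and one bias per filter and per output unit.

```python
width, height, depth, classes = 32, 32, 3, 3
filters, kernel = 32, 3

# CONV: each filter has kernel*kernel*depth weights plus one bias.
conv_params = kernel * kernel * depth * filters + filters

# Same padding with stride 1 keeps the spatial size, so only the
# channel count changes.
conv_output = (height, width, filters)

# Flatten turns the CONV output into a single vector.
flat_size = height * width * filters

# FC: one weight per (input, class) pair plus one bias per class.
dense_params = flat_size * classes + classes

total_params = conv_params + dense_params
print(conv_params, flat_size, dense_params, total_params)
```

Almost all of the parameters sit in the fully connected layer, because the flattened CONV output is so large.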

2.2 ShallowNet on the Animals dataset

from sklearn.preprocessing import LabelBinarizer
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report
from pyimagesearch.preprocessing import ImageToArrayPreprocessor
from pyimagesearch.preprocessing import SimplePreprocessor
from pyimagesearch.datasets import SimpleDatasetLoader
from CNN_shallownet import ShallowNet
from keras.optimizers import SGD
from imutils import paths
import matplotlib.pyplot as plt
import numpy as np
import argparse

ap = argparse.ArgumentParser()
ap.add_argument("-d", "--dataset", required=True,
                help="path to input dataset")
args = vars(ap.parse_args())    # vars() turns the parsed Namespace into a dict

print('[INFO] loading images...')
imagePaths = list(paths.list_images(args["dataset"]))   # collect the paths of all images under the dataset directory
sp = SimplePreprocessor(32, 32)
iap = ImageToArrayPreprocessor()

sdl = SimpleDatasetLoader(preprocessors=[sp, iap])
(data, labels) = sdl.load(imagePaths, verbose=500)
data = data.astype("float") / 255.0

(trainX, testX, trainY, testY) = train_test_split(data, labels, test_size=0.25, random_state=42)		# split into training and test sets

# fit the binarizer on the training labels only, then reuse it for the test labels
lb = LabelBinarizer()
trainY = lb.fit_transform(trainY)
testY = lb.transform(testY)

print("[INFO] compiling model...")
opt = SGD(lr=0.005)		# newer Keras releases name this argument learning_rate
model = ShallowNet.build(width=32, height=32, depth=3, classes=3)		# build ShallowNet
model.compile(loss='categorical_crossentropy', optimizer=opt,  metrics=['accuracy'])		# compile the model, optimizing with stochastic gradient descent

print('[INFO] training network...')
H = model.fit(trainX, trainY, validation_data=(testX, testY), batch_size=32, epochs=100, verbose=1)		# train the model

print('[INFO] evaluating network...')			# evaluate the model
predictions = model.predict(testX, batch_size=32)
print(classification_report(testY.argmax(axis=1),
        predictions.argmax(axis=1),
        target_names=['cat', 'dog', 'panda']))
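
One-hot encoding relies on a fixed mapping from class name to column index. Here is a minimal pure-Python sketch of that idea. The helper names are hypothetical and this is not scikit-learn's implementation; it only illustrates why the mapping should be learned once and then reused for every split.

```python
def fit_classes(labels):
    """Learn the class list; sorting fixes the column order."""
    return sorted(set(labels))


def one_hot(labels, classes):
    """Encode each label as a row with a single 1 in its class column."""
    index = {name: i for i, name in enumerate(classes)}
    return [
        [1 if index[label] == i else 0 for i in range(len(classes))]
        for label in labels
    ]


train_labels = ["dog", "cat", "panda", "cat"]
test_labels = ["panda", "dog"]  # this split happens to contain no "cat"

classes = fit_classes(train_labels)
train_y = one_hot(train_labels, classes)
test_y = one_hot(test_labels, classes)

# Learning the classes from the test split alone gives a different mapping.
classes_from_test = fit_classes(test_labels)
print(classes, classes_from_test)
```

With the shared mapping, column 2 means "panda" in both splits. A mapping learned from the test labels alone would have only two columns here, and the argmax indices would no longer line up with the training columns.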

For a brief introduction to argparse, see: argparse
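
The argument handling can be tried without a terminal by passing a list to parse_args in place of the real command line. The dataset path below is a made-up example.

```python
import argparse

ap = argparse.ArgumentParser()
ap.add_argument("-d", "--dataset", required=True,
                help="path to input dataset")

# A list passed to parse_args stands in for sys.argv[1:].
# "datasets/animals" is a hypothetical path.
args = vars(ap.parse_args(["--dataset", "datasets/animals"]))

# vars() converts the Namespace into a plain dict keyed by argument name.
print(args["dataset"])
```

On the command line, the equivalent is to run the training script with --dataset followed by the path to the image folder.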
