Building a CNN with Python's Scientific Computing Packages: A Practical Walkthrough (1)

肖永威

Category: Big Data. Tags: CNN, Python, NumPy

Article link: https://blog.csdn.net/xiaoyw71/article/details/80209042

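The concept of deep learning grew out of research on artificial neural networks. A multilayer perceptron with several hidden layers is one example of a deep learning architecture. Deep learning combines low-level features into more abstract high-level representations (attribute categories or features) in order to discover distributed feature representations of the data.

CNN (Convolutional Neural Network) is one kind of artificial neural network (Neural Network, NN); other kinds include RNNs and DNNs. A CNN is a neural network that uses convolution for filtering. In other words, a CNN is convolution plus a neural network.

Convolution

Let us work in two dimensions. Suppose f(x, y) and g(x, y) are two functions $\mathcal{R}^2\rightarrow \mathcal{R}$. The convolution of f and g is a new function c(x, y), also $\mathcal{R}^2\rightarrow \mathcal{R}$, given by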

　　 $c(x,y)=\int_{-\infty}^{\infty} \int_{-\infty}^{\infty}f(s,t)\times g(x-s,y-t) \ ds \ dt$

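The formula says: sweep over all values of s and t from negative infinity to positive infinity, multiply the value of g at (x-s, y-t) by the value of f at (s, t), and "add up" the products (in the sense of integration) to obtain the value of c at (x, y). Put plainly, convolution is a weighted sum: with f as the weights and (x, y) as the center, the value of g at offset (-s, -t) from the center is multiplied by the value of f at (s, t), and all the products are summed. The discrete form makes this clearer: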

　　 $C(x,y)=\sum_{t=-\infty}^{\infty}\sum_{s=-\infty}^{\infty}F(s,t)\times G(x-s,y-t) \ \Delta s \ \Delta t=\sum_{t=-\infty}^{\infty}\sum_{s=-\infty}^{\infty}F(s,t)\times G(x-s,y-t)$
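
As a small illustration of the discrete formula, the sketch below evaluates the double sum directly for two finite arrays, treating each array as zero outside its stored range. The function name `conv2d_full` and the sample arrays are illustrative assumptions and do not come from the original article.

```python
import numpy as np

def conv2d_full(F, G):
    """Discrete 2-D convolution C(x, y) = sum_s sum_t F(s, t) * G(x - s, y - t).

    Both arrays are treated as zero outside their stored range, so the
    result has shape (rows_F + rows_G - 1, cols_F + cols_G - 1).
    """
    fr, fc = F.shape
    gr, gc = G.shape
    C = np.zeros((fr + gr - 1, fc + gc - 1))
    for s in range(fr):
        for t in range(fc):
            # F(s, t) weights a copy of G shifted by (s, t).
            C[s:s + gr, t:t + gc] += F[s, t] * G
    return C

F = np.array([[1.0, 2.0], [3.0, 4.0]])
G = np.array([[0.0, 1.0], [1.0, 0.0]])
C = conv2d_full(F, G)
print(C)
# [[0. 1. 2.]
#  [1. 5. 4.]
#  [3. 4. 0.]]
```

Swapping the two arguments gives the same array, which reflects the fact that convolution is commutative.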

The convolution code used in this article is as follows:

import numpy

def conv_(img, conv_filter):
    # Slide a square 2-D filter over a 2-D image and return the "valid" feature map,
    # i.e. only the positions where the filter fits entirely inside the image.
    # The filter is not flipped, so strictly speaking this is cross-correlation,
    # which is the usual convention in CNNs.
    filter_size = conv_filter.shape[0]
    out_rows = img.shape[0] - filter_size + 1
    out_cols = img.shape[1] - filter_size + 1
    result = numpy.zeros((out_rows, out_cols))
    # Looping through the image to apply the convolution operation.
    for r in range(out_rows):
        for c in range(out_cols):
            # Getting the current region to get multiplied with the filter.
            curr_region = img[r:r+filter_size, c:c+filter_size]
            # Element-wise multiplication between the current region and the filter,
            # then summing the products into one value of the feature map.
            result[r, c] = numpy.sum(curr_region * conv_filter)
    return result

def conv(img, conv_filter):
    # img is a 2-D (rows, cols) or 3-D (rows, cols, channels) array; conv_filter is a
    # filter bank shaped (num_filters, size, size) or (num_filters, size, size, channels).
    if len(img.shape) > 2 or len(conv_filter.shape) > 3: # Check if number of image channels matches the filter depth.
        if img.shape[-1] != conv_filter.shape[-1]:
            raise ValueError("Number of channels in both image and filter must match.")
    if conv_filter.shape[1] != conv_filter.shape[2]: # Check if filter dimensions are equal.
        raise ValueError("Filter must be a square matrix, i.e. number of rows and columns must match.")
    if conv_filter.shape[1]%2==0: # Check if filter dimensions are odd.
        raise ValueError("Filter must have an odd size, i.e. number of rows and columns must be odd.")

    # An empty feature map to hold the output of convolving the filter(s) with the image.
    feature_maps = numpy.zeros((img.shape[0]-conv_filter.shape[1]+1, 
                                img.shape[1]-conv_filter.shape[1]+1, 
                                conv_filter.shape[0]))

    # Convolving the image by the filter(s).
    for filter_num in range(conv_filter.shape[0]):
        print("Filter ", filter_num + 1)
        curr_filter = conv_filter[filter_num, :] # Getting a filter from the bank.
        # Checking if there are multiple channels for the single filter.
        # If so, each filter channel convolves the matching image channel,
        # and the per-channel results are summed into a single feature map.
        if len(curr_filter.shape) > 2:
            conv_map = conv_(img[:, :, 0], curr_filter[:, :, 0]) # Array holding the sum of all feature maps.
            for ch_num in range(1, curr_filter.shape[-1]): # Convolving each channel with the image and summing the results.
                conv_map = conv_map + conv_(img[:, :, ch_num], 
                                  curr_filter[:, :, ch_num])
        else: # There is just a single channel in the filter.
            conv_map = conv_(img, curr_filter)
        feature_maps[:, :, filter_num] = conv_map # Holding feature map with the current filter.
    return feature_maps # Returning all feature maps.
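
The windowed sum of products that a convolution layer computes can also be written without Python loops. The following is a minimal sketch, assuming NumPy 1.20 or later (for `sliding_window_view`) and assuming the filter is applied without flipping; the names `correlate_valid_loop` and `correlate_valid_fast` are hypothetical helpers, not functions from the article.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def correlate_valid_loop(img, kernel):
    """Slide the kernel over every position where it fits entirely inside img."""
    k = kernel.shape[0]
    out = np.zeros((img.shape[0] - k + 1, img.shape[1] - k + 1))
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            out[r, c] = np.sum(img[r:r + k, c:c + k] * kernel)
    return out

def correlate_valid_fast(img, kernel):
    """Same computation without Python loops."""
    # windows has shape (H-k+1, W-k+1, k, k): one k x k patch per output position.
    windows = sliding_window_view(img, kernel.shape)
    # Multiply each patch by the kernel and sum over the patch axes.
    return np.einsum("ijkl,kl->ij", windows, kernel)

rng = np.random.default_rng(0)
img = rng.random((8, 10))
kernel = np.array([[-1.0, 0.0, 1.0]] * 3)  # a 3x3 vertical-edge filter
slow = correlate_valid_loop(img, kernel)
fast = correlate_valid_fast(img, kernel)
print(slow.shape, np.allclose(slow, fast))  # (6, 8) True
```

The explicit loops are easier to follow when learning; the vectorized form is usually much faster on real images.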

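What is an artificial neural network (NN, Neural Network)? It is a computational model: a complex network built from multiple layers, each containing many artificial neurons. The structure of a single artificial neuron is shown in the figure below:
[Figure: structure of a single artificial neuron]

$p_{1} ,p_{2} , \ ... \ ,p_{n}$ are the inputs of the neuron and a is its output. The neuron computes a weighted sum of the inputs $p_{1} ,p_{2} , \ ... \ , p_{n}$, adds a bias b, and finally applies a function f, that is: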

　　 $a=f(n)=f \left( \sum_{i=1}^{n}{p_iw_i}+b \right) = f \left( \begin{array}{ccc} (w_1,w_2 \cdots w_n) \end{array} \left( \begin{array}{ccc} p_1 \\ p_2 \\ \vdots \\ p_n\end{array} \right)+b \right) = f \left( \mathcal{W}^T\mathcal{P}+b\right)$

The last expression is the vector form of the formula. P is the input vector, W is the weight vector, and b is the scalar bias. f is called the activation function.
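
The neuron formula translates almost directly into NumPy. The sketch below is only an illustration: the input values, weights, bias, and the choice of `numpy.tanh` as the activation function f are all assumptions made for the example.

```python
import numpy as np

def neuron(p, w, b, f):
    """Output a = f(w^T p + b) of a single artificial neuron."""
    return f(np.dot(w, p) + b)

p = np.array([1.0, 2.0, 3.0])    # inputs p_1 .. p_n
w = np.array([0.5, -1.0, 0.25])  # weights w_1 .. w_n
b = 1.0                          # scalar bias

# Weighted sum: 0.5*1 - 1.0*2 + 0.25*3 + 1.0 = 0.25, then the activation is applied.
a = neuron(p, w, b, np.tanh)
print(a)
```

Any other activation can be passed in as `f` without changing the rest of the function.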

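ReLU activation layer

In deep neural networks, the rectified linear unit (ReLU) is commonly used as the neuron activation function. ReLU has its roots in neuroscience: in 2001, Dayan and Abbott proposed, from a biological perspective, a more accurate activation model of how brain neurons respond to incoming signals.

First, let us look at the shape of the ReLU activation function, shown in the figure below:
[Figure: the ReLU activation function]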

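As the figure shows, ReLU is a piecewise linear function: it sets every negative value to 0 and leaves positive values unchanged, an operation known as one-sided suppression. Simple as it is, this one-sided suppression is what gives the neurons in a network sparse activation. The effect is most visible in deep models such as CNNs: after N layers are added, the activation rate of ReLU neurons in theory drops by a factor of about 2^N (a rough estimate that treats each layer as activating about half of its units).

As an analogy, when you go to the hospital, the test report lists hundreds of indicators, but usually only a few of them are related to the illness.

The ReLU code is as follows: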

def relu(feature_map):
    # Preparing the output of the ReLU activation function.
    relu_out = numpy.zeros(feature_map.shape)
    for map_num in range(feature_map.shape[-1]):
        for r in numpy.arange(0, feature_map.shape[0]):
            for c in numpy.arange(0, feature_map.shape[1]):
                # Keep positive values and replace negative values with 0.
                relu_out[r, c, map_num] = max(feature_map[r, c, map_num], 0)
    return relu_out
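
The same element-wise operation can be expressed in a single call to `numpy.maximum`, which also makes it easy to measure the sparsity discussed above. This is a sketch under assumptions: `relu_vectorized` is a hypothetical name, and the zero-mean random input merely stands in for a real feature map.

```python
import numpy as np

def relu_vectorized(feature_map):
    """Element-wise max(x, 0) over the whole array, without Python loops."""
    return np.maximum(feature_map, 0)

rng = np.random.default_rng(0)
feature_map = rng.standard_normal((6, 6, 2))  # zero-mean sample activations
out = relu_vectorized(feature_map)

# Fraction of units that remain active (non-zero) after ReLU.
active_fraction = np.count_nonzero(out) / out.size
print(out.min() >= 0, active_fraction)
```

For zero-mean inputs roughly half of the values are negative, so about half of the units end up switched off after one ReLU layer.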

A classic CNN: LeNet-5

[Figure: the LeNet-5 architecture]

Max pooling

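Max pooling works as shown in the figure below: the image is divided, without overlap, into blocks of equal size (the pooling size). Only the largest value in each block is kept and the others are discarded; the retained values keep their original planar layout and form the output.
[Figure: the max pooling operation]

The pooling code is as follows: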

def pooling(feature_map, size=2, stride=2):
    # Number of complete pooling windows that fit along each spatial axis.
    out_rows = (feature_map.shape[0] - size) // stride + 1
    out_cols = (feature_map.shape[1] - size) // stride + 1
    # Preparing the output of the pooling operation.
    pool_out = numpy.zeros((out_rows, out_cols, feature_map.shape[-1]))
    for map_num in range(feature_map.shape[-1]):
        for r2 in range(out_rows):
            r = r2 * stride
            for c2 in range(out_cols):
                c = c2 * stride
                # Maximum of the current window, taken within this feature map only.
                pool_out[r2, c2, map_num] = numpy.max(feature_map[r:r+size, c:c+size, map_num])
    return pool_out
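
A small worked example makes the operation concrete. The sketch below handles only the special case of non-overlapping 2x2 windows on a 2-D array with even height and width; `max_pool_2x2` and the sample matrix are illustrative assumptions.

```python
import numpy as np

def max_pool_2x2(x):
    """Non-overlapping 2x2 max pooling of a 2-D array with even height and width."""
    h, w = x.shape
    # Split into (h/2, 2, w/2, 2) blocks, then take the maximum inside each block.
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

x = np.array([[1, 3, 2, 1],
              [4, 6, 5, 0],
              [7, 2, 9, 8],
              [1, 0, 3, 4]])
pooled = max_pool_2x2(x)
print(pooled)
# [[6 5]
#  [7 9]]
```

Each output value is the largest entry of one 2x2 block, and the blocks keep their relative positions, so the 4x4 input shrinks to 2x2.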

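This article implements a CNN using only NumPy, building three layer modules: the convolution layer (Conv), the ReLU activation function, and max pooling.

(1) Read the input image: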

import skimage.io
import skimage.color

# Reading the image
img = skimage.io.imread("timg1.jpg")
#img = skimage.data.chelsea()
# Converting the image into gray.
img = skimage.color.rgb2gray(img)
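
To see why the converted image can be passed to `conv` as a single-channel input, the sketch below reproduces the idea of a grayscale conversion in plain NumPy. The luminance weights are believed to match those documented for `skimage.color.rgb2gray`, but they should be treated as an assumption; the function name `rgb_to_gray` and the tiny test image are hypothetical.

```python
import numpy as np

def rgb_to_gray(rgb):
    """Weighted sum of the R, G, B channels, scaled to floats in [0, 1]."""
    rgb = np.asarray(rgb, dtype=np.float64) / 255.0  # assumes 8-bit input
    weights = np.array([0.2125, 0.7154, 0.0721])     # assumed luminance weights
    return rgb @ weights

rgb = np.zeros((2, 2, 3), dtype=np.uint8)
rgb[0, 0] = [255, 255, 255]  # white pixel
rgb[0, 1] = [255, 0, 0]      # pure red pixel
gray = rgb_to_gray(rgb)
print(gray.shape)  # (2, 2)
```

The channel axis disappears, leaving a 2-D array of rows and columns, which is the shape the single-channel branch of the convolution code expects.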

[Figure: the input image]

(2) Prepare the filters

The following code prepares the filter bank for the first convolution layer (Layer 1, abbreviated l1 here and below):

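# First conv layer
#l1_filter = numpy.random.rand(2,7,7)*20 # Preparing the filters randomly.
l1_filter = numpy.zeros((2,3,3))
# Filter 0 responds to vertical edges (intensity changes from left to right).
l1_filter[0, :, :] = numpy.array([[-1, 0, 1], 
                                  [-1, 0, 1], 
                                  [-1, 0, 1]])
# Filter 1 responds to horizontal edges (intensity changes from top to bottom).
l1_filter[1, :, :] = numpy.array([[1,   1,  1], 
                                  [0,   0,  0], 
                                  [-1, -1, -1]])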

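The code above creates two 3x3 filters. In the shape (2, 3, 3), the 2 is the number of filters (num_filters), the first 3 is the number of rows in each filter, and the second 3 is the number of columns.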
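
To get a feel for what these two filters detect, the sketch below applies them to a toy image whose left half is dark and right half is bright. The toy image and the helper `respond` are assumptions made for illustration; the helper computes a plain windowed sum of products at every position where the filter fits.

```python
import numpy as np

vertical = np.array([[-1, 0, 1],
                     [-1, 0, 1],
                     [-1, 0, 1]], dtype=float)
horizontal = np.array([[ 1,  1,  1],
                       [ 0,  0,  0],
                       [-1, -1, -1]], dtype=float)

# Toy 5x6 image: dark left half (0), bright right half (1).
img = np.zeros((5, 6))
img[:, 3:] = 1.0

def respond(img, kernel):
    """Windowed sum of products at every position where the kernel fits."""
    k = kernel.shape[0]
    out = np.zeros((img.shape[0] - k + 1, img.shape[1] - k + 1))
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            out[r, c] = np.sum(img[r:r + k, c:c + k] * kernel)
    return out

v = respond(img, vertical)
h = respond(img, horizontal)
print(v)  # every row is [0. 3. 3. 0.]
print(h)  # all zeros
```

The first filter responds strongly where its window straddles the dark-to-bright boundary, while the second filter stays at zero because the image does not change from top to bottom.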

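(3) Convolution layer (Conv Layer)

With the filters ready, the next step is to convolve them with the input image. The following code uses the conv function to convolve the input image with the filter bank: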

print("\n**Working with conv layer 1**")
l1_feature_map = conv(img, l1_filter)

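The result is shown in the panel titled "L1-Map1".

(4) ReLU activation layer

The ReLU layer applies the ReLU activation function to every feature map output by the conv layer. It is called as follows: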

print("\n**ReLU**")
l1_feature_map_relu = relu(l1_feature_map)

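The result is shown in the panel titled "L1-Map1ReLU".

(5) Max pooling layer

The output of the ReLU layer is the input of the max pooling layer. The max pooling operation is called as follows: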

print("\n**Pooling**")
l1_feature_map_relu_pool = pooling(l1_feature_map_relu, 2, 2)
print("**End of conv layer 1**\n")

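The result is shown in the panel titled "L1-Map1ReLUPool".

[Figure: layer-1 feature maps after conv, ReLU, and pooling]

The basic layers of a CNN (conv, ReLU, and max pooling) are now implemented. They can be stacked as follows: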

# Second conv layer
l2_filter = numpy.random.rand(3, 5, 5, l1_feature_map_relu_pool.shape[-1])
print("\n**Working with conv layer 2**")
l2_feature_map = conv(l1_feature_map_relu_pool, l2_filter)
print("\n**ReLU**")
l2_feature_map_relu = relu(l2_feature_map)
print("\n**Pooling**")
l2_feature_map_relu_pool = pooling(l2_feature_map_relu, 2, 2)
print("**End of conv layer 2**\n")

# Third conv layer
l3_filter = numpy.random.rand(1, 7, 7, l2_feature_map_relu_pool.shape[-1])
print("\n**Working with conv layer 3**")
l3_feature_map = conv(l2_feature_map_relu_pool, l3_filter)
print("\n**ReLU**")
l3_feature_map_relu = relu(l3_feature_map)
print("\n**Pooling**")
l3_feature_map_relu_pool = pooling(l3_feature_map_relu, 2, 2)
print("**End of conv layer 3**\n")

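l2 denotes the second convolution layer. Its filter bank has shape (3, 5, 5, 2): three 5x5 filters, each with a depth equal to the number of feature maps produced by layer 1, are convolved with the layer-1 output to produce 3 feature maps. l3 denotes the third convolution layer. Its filter bank has shape (1, 7, 7, 3): a single 7x7 filter, with a depth equal to the number of layer-2 feature maps, is convolved with the layer-2 output to produce 1 feature map.

The basic structure of a neural network is that the output of one layer is the input of the next: layer l2 receives the output of layer l1, and layer l3 receives the output of layer l2.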
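
It can help to trace how the array shape changes as it passes through the three stacked layers. The sketch below assumes the standard output-size formulas for a "valid" convolution and for pooling with complete windows only, and it uses a made-up 300x400 input; the helper names are hypothetical. ReLU is omitted because it does not change the shape.

```python
def conv_out(size, filter_size):
    """Output length of a 'valid' convolution along one axis."""
    return size - filter_size + 1

def pool_out(size, pool_size=2, stride=2):
    """Number of complete pooling windows along one axis."""
    return (size - pool_size) // stride + 1

def trace_shapes(height, width):
    """Follow (height, width, channels) through the three conv/pool stages."""
    shapes = []
    # (number of filters, filter size) for layers l1, l2, l3 as in the article.
    for num_filters, filter_size in [(2, 3), (3, 5), (1, 7)]:
        height, width = conv_out(height, filter_size), conv_out(width, filter_size)
        shapes.append(("conv", (height, width, num_filters)))
        height, width = pool_out(height), pool_out(width)
        shapes.append(("pool", (height, width, num_filters)))
    return shapes

for name, shape in trace_shapes(300, 400):  # 300x400 is a made-up input size
    print(name, shape)
# conv (298, 398, 2)
# pool (149, 199, 2)
# conv (145, 195, 3)
# pool (72, 97, 3)
# conv (66, 91, 1)
# pool (33, 45, 1)
```

Each convolution trims a border of filter_size - 1 pixels and sets the channel count to the number of filters, while each pooling step roughly halves the height and width.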

[Figure: feature maps of the stacked layers]

The plotting code is as follows:

import matplotlib.pyplot

# Graphing results
fig0, ax0 = matplotlib.pyplot.subplots(nrows=1, ncols=1)
ax0.imshow(img).set_cmap("gray")
ax0.set_title("Input Image")
ax0.get_xaxis().set_ticks([])
ax0.get_yaxis().set_ticks([])
matplotlib.pyplot.savefig("in_img.png", bbox_inches="tight")
matplotlib.pyplot.close(fig0)

# Layer 1
fig1, ax1 = matplotlib.pyplot.subplots(nrows=3, ncols=2)
ax1[0, 0].imshow(l1_feature_map[:, :, 0]).set_cmap("gray")
ax1[0, 0].get_xaxis().set_ticks([])
ax1[0, 0].get_yaxis().set_ticks([])
ax1[0, 0].set_title("L1-Map1")

ax1[0, 1].imshow(l1_feature_map[:, :, 1]).set_cmap("gray")
ax1[0, 1].get_xaxis().set_ticks([])
ax1[0, 1].get_yaxis().set_ticks([])
ax1[0, 1].set_title("L1-Map2")

ax1[1, 0].imshow(l1_feature_map_relu[:, :, 0]).set_cmap("gray")
ax1[1, 0].get_xaxis().set_ticks([])
ax1[1, 0].get_yaxis().set_ticks([])
ax1[1, 0].set_title("L1-Map1ReLU")

ax1[1, 1].imshow(l1_feature_map_relu[:, :, 1]).set_cmap("gray")
ax1[1, 1].get_xaxis().set_ticks([])
ax1[1, 1].get_yaxis().set_ticks([])
ax1[1, 1].set_title("L1-Map2ReLU")

ax1[2, 0].imshow(l1_feature_map_relu_pool[:, :, 0]).set_cmap("gray")
ax1[2, 0].get_xaxis().set_ticks([])
ax1[2, 0].get_yaxis().set_ticks([])
ax1[2, 0].set_title("L1-Map1ReLUPool")

ax1[2, 1].imshow(l1_feature_map_relu_pool[:, :, 1]).set_cmap("gray")
ax1[2, 1].get_xaxis().set_ticks([])
ax1[2, 1].get_yaxis().set_ticks([])
ax1[2, 1].set_title("L1-Map2ReLUPool")

matplotlib.pyplot.savefig("L1.png", bbox_inches="tight")
matplotlib.pyplot.close(fig1)

# Layer 2
fig2, ax2 = matplotlib.pyplot.subplots(nrows=3, ncols=3)
ax2[0, 0].imshow(l2_feature_map[:, :, 0]).set_cmap("gray")
ax2[0, 0].get_xaxis().set_ticks([])
ax2[0, 0].get_yaxis().set_ticks([])
ax2[0, 0].set_title("L2-Map1")

ax2[0, 1].imshow(l2_feature_map[:, :, 1]).set_cmap("gray")
ax2[0, 1].get_xaxis().set_ticks([])
ax2[0, 1].get_yaxis().set_ticks([])
ax2[0, 1].set_title("L2-Map2")

ax2[0, 2].imshow(l2_feature_map[:, :, 2]).set_cmap("gray")
ax2[0, 2].get_xaxis().set_ticks([])
ax2[0, 2].get_yaxis().set_ticks([])
ax2[0, 2].set_title("L2-Map3")

ax2[1, 0].imshow(l2_feature_map_relu[:, :, 0]).set_cmap("gray")
ax2[1, 0].get_xaxis().set_ticks([])
ax2[1, 0].get_yaxis().set_ticks([])
ax2[1, 0].set_title("L2-Map1ReLU")

ax2[1, 1].imshow(l2_feature_map_relu[:, :, 1]).set_cmap("gray")
ax2[1, 1].get_xaxis().set_ticks([])
ax2[1, 1].get_yaxis().set_ticks([])
ax2[1, 1].set_title("L2-Map2ReLU")

ax2[1, 2].imshow(l2_feature_map_relu[:, :, 2]).set_cmap("gray")
ax2[1, 2].get_xaxis().set_ticks([])
ax2[1, 2].get_yaxis().set_ticks([])
ax2[1, 2].set_title("L2-Map3ReLU")

ax2[2, 0].imshow(l2_feature_map_relu_pool[:, :, 0]).set_cmap("gray")
ax2[2, 0].get_xaxis().set_ticks([])
ax2[2, 0].get_yaxis().set_ticks([])
ax2[2, 0].set_title("L2-Map1ReLUPool")

ax2[2, 1].imshow(l2_feature_map_relu_pool[:, :, 1]).set_cmap("gray")
ax2[2, 1].get_xaxis().set_ticks([])
ax2[2, 1].get_yaxis().set_ticks([])
ax2[2, 1].set_title("L2-Map2ReLUPool")

ax2[2, 2].imshow(l2_feature_map_relu_pool[:, :, 2]).set_cmap("gray")
ax2[2, 2].get_xaxis().set_ticks([])
ax2[2, 2].get_yaxis().set_ticks([])
ax2[2, 2].set_title("L2-Map3ReLUPool")

matplotlib.pyplot.savefig("L2.png", bbox_inches="tight")
matplotlib.pyplot.close(fig2)

# Layer 3
fig3, ax3 = matplotlib.pyplot.subplots(nrows=1, ncols=3)
ax3[0].imshow(l3_feature_map[:, :, 0]).set_cmap("gray")
ax3[0].get_xaxis().set_ticks([])
ax3[0].get_yaxis().set_ticks([])
ax3[0].set_title("L3-Map1")

ax3[1].imshow(l3_feature_map_relu[:, :, 0]).set_cmap("gray")
ax3[1].get_xaxis().set_ticks([])
ax3[1].get_yaxis().set_ticks([])
ax3[1].set_title("L3-Map1ReLU")

ax3[2].imshow(l3_feature_map_relu_pool[:, :, 0]).set_cmap("gray")
ax3[2].get_xaxis().set_ticks([])
ax3[2].get_yaxis().set_ticks([])
ax3[2].set_title("L3-Map1ReLUPool")

matplotlib.pyplot.savefig("L3.png", bbox_inches="tight")
matplotlib.pyplot.close(fig3)

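Note: the code uses scikit-image (the package to choose for Python 3), a Python package for image processing that uses native NumPy arrays as image objects. It includes algorithms for segmentation, geometric transformations, color manipulation, analysis, filtering, and more. It is designed to work together with the scientific computing libraries of the Python environment (NumPy, SciPy).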

D:\Python\Python36\Tools>pip install d:\python\scikit_image-0.13.1-cp36-cp36m-win_amd64.whl

Resource: scikit_image-0.13.1-cp36-cp36m-win_amd64.whl

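References:

1. 《仅使用NumPy完成卷积神经网络CNN的搭建（附Python代码）》, Alibaba Cloud Yunqi Community (阿里云云栖社区), translated by 海棠. Original article: "Building Convolutional Neural Network using NumPy from Scratch" by Ahmed Gad, whose research interests are deep learning, artificial intelligence, and computer vision. Author page: https://www.linkedin.com/in/ahmedfgad/
2. 《skimage库需要依赖 numpy+mkl 和scipy》, cnblogs (博客园), 小呆君, November 2017
3. 《CNN中的maxpool到底是什么原理？》, Leiphone AI Yanxishe (雷锋网 AI研习社), 贾智龙, July 2017
4. 《卷积神经网络CNN理论到实践(5)》, CSDN blog, 相国大人, June 2017
5. 《人脸检测及识别python实现系列（4）——卷积神经网络（CNN）入门》, cnblogs (博客园), Neo-T, February 2017
6. 《Python科学计算初探——余弦相似度》, CSDN blog, 肖永威, April 2018
7. 《ReLU激活函数：简单之美》, CSDN blog, 对半独白, November 2016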
