CNN

Latest recommended article published on 2023-06-29 18:59:02

简简丹

Category: Deep Learning. Tags: TensorFlow: Practical Google Deep Learning Framework

Article link: https://blog.csdn.net/sinat_37386947/article/details/88780369

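I. Structure of a convolutional neural network:
1. Input layer: holds the input features, for example the pixels of an image.

2. Convolutional layer: its main component is the convolution kernel (also called a filter). The node matrix usually becomes deeper after passing through a convolutional layer.
Filter (role, size, and how it determines the size of the result matrix)
Role: a filter transforms one sub-matrix of nodes (three-dimensional) in the current layer into one unit node matrix in the next layer. A unit node matrix has length 1 and width 1, but its depth is unrestricted.
Size: the length and width are chosen by hand (commonly 3×3 or 5×5). The depth of the sub-matrix a filter processes always equals the depth of the current layer's node matrix. The depth of the resulting unit node matrix is called the depth of the filter.
Example: for a 2×2×3 sub-matrix and a 1×1×5 unit node matrix, the total number of parameters is 2×2×3×5+5=65, of which 2×2×3×5=60 are weights and 5 are biases. A single filter has 2×2×3=12 weights, 5 such filters are needed, and each filter has one bias term. The number of parameters therefore depends on the filter size, the filter depth, and the depth of the current node matrix.
Size of the result matrix (only the length and width need to be determined, since the depth is specified in advance): it depends on the filter size, the stride, and whether the border is padded with zeros.
out = (W - F + 2P) / S + 1
W: input size, F: filter size, S: stride, P: number of zeros padded on each side
In TensorFlow's padding modes:
With zero padding (SAME): out(length) = ceil(in(length) / stride(length)). For a stride of 1 this corresponds to a total padding of F-1, that is P = (F-1)/2 on each side for an odd F.
Without zero padding (VALID): out(length) = ceil((in(length) - filter(length) + 1) / stride(length))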
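The size rules above can be checked with a few lines of plain Python. This is a minimal sketch; the function names are chosen here for illustration and are not part of TensorFlow.

```python
import math

def conv_output_length(in_length, filter_length, stride, padding):
    """Output length along one spatial dimension for 'SAME' or 'VALID' padding."""
    if padding == "SAME":
        return math.ceil(in_length / stride)
    if padding == "VALID":
        return math.ceil((in_length - filter_length + 1) / stride)
    raise ValueError("padding must be 'SAME' or 'VALID'")

def conv_output_length_explicit(w, f, p, s):
    """General formula (W - F + 2P) / S + 1, with P zeros on each side."""
    return (w - f + 2 * p) // s + 1

print(conv_output_length(32, 5, 1, "VALID"))     # 28
print(conv_output_length(28, 2, 2, "VALID"))     # 14
print(conv_output_length(28, 5, 1, "SAME"))      # 28
print(conv_output_length_explicit(28, 5, 2, 1))  # 28, matches SAME with P = (F-1)/2
```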
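Implementing the forward pass of a convolutional layer:

import tensorflow as tf  # TensorFlow 1.x API
# Current layer's node matrix as a 4-D tensor [batch, height, width, depth]; the 32x32x3 shape is only illustrative
input_tensor = tf.placeholder(tf.float32, shape=[None, 32, 32, 3], name='input')
# shape is 4-D: the first two entries are the filter size (length, width), the third is the depth of the current layer, the fourth is the depth of the filter
filter_weights = tf.get_variable('weights', shape=[5, 5, 3, 16], initializer=tf.truncated_normal_initializer(stddev=0.1))
# The bias has one entry per unit of depth in the next layer, i.e. the depth of the filter
biases = tf.get_variable("biases", shape=[16], initializer=tf.constant_initializer(0.1))
# tf.nn.conv2d implements the forward pass of the convolution
# input: node matrix of the current layer, a 4-D tensor (batch_size plus the 3-D node matrix); strides: step along each dimension, a list of length 4 whose first and last entries must be 1
# padding: 'VALID' adds no padding, 'SAME' pads with zeros
conv = tf.nn.conv2d(input=input_tensor, filter=filter_weights, strides=[1, 1, 1, 1], padding='SAME')
# tf.nn.bias_add adds the bias term to every node
bias = tf.nn.bias_add(conv, biases)
# Apply the activation function
actived_conv = tf.nn.relu(bias)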
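To make the filter arithmetic concrete, the following sketch applies 5 filters to a single 2×2×3 sub-matrix and produces a 1×1×5 unit node matrix, using plain nested lists instead of TensorFlow. The helper name `apply_filters` and the constant values are assumptions made for illustration.

```python
def apply_filters(patch, weights, biases):
    """Map one H x W x D sub-matrix to a unit node matrix of depth K.

    patch:   nested list [H][W][D]
    weights: nested list [H][W][D][K], one H x W x D filter per output depth
    biases:  list of K bias terms
    """
    out = []
    for k in range(len(biases)):
        total = biases[k]
        for i, row in enumerate(patch):
            for j, cell in enumerate(row):
                for d, value in enumerate(cell):
                    total += value * weights[i][j][d][k]
        out.append(max(0.0, total))  # ReLU activation
    return out

H, W, D, K = 2, 2, 3, 5
patch = [[[1.0] * D for _ in range(W)] for _ in range(H)]
weights = [[[[0.5] * K for _ in range(D)] for _ in range(W)] for _ in range(H)]
biases = [0.1] * K

unit = apply_filters(patch, weights, biases)
n_params = H * W * D * K + K
print(len(unit))  # 5 values, i.e. a 1 x 1 x 5 unit node matrix
print(n_params)   # 65 = 60 weights + 5 biases
```

Each output value is 0.1 + 12 × 0.5, since every filter sees all 12 nodes of the sub-matrix.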

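3. Pooling layer: max pooling or average pooling; it has no trainable parameters. Pooling leaves the depth of the 3-D matrix unchanged and only shrinks its length and width. This reduces the number of parameters in the following layers, speeds up training, and helps prevent overfitting.
You need to set the pooling window size, whether to use zero padding, and the stride.
Note: a convolution filter spans the entire depth of the node matrix, whereas a pooling filter only affects nodes at a single depth. A pooling filter therefore moves not only along the length and width, but also along the depth.
Implementing the forward pass of max pooling: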

tf.nn.max_pool(actived_conv,ksize=[1,3,3,1],strides=[1,2,2,1],padding='SAME')
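The claim that pooling works on each depth independently can be seen in a small pure-Python sketch. It covers only the no-padding case, and `max_pool_valid` is a name chosen here, not a TensorFlow function.

```python
def max_pool_valid(x, size, stride):
    """Max pooling without padding on a nested list shaped [H][W][D]."""
    height, width, depth = len(x), len(x[0]), len(x[0][0])
    out = []
    for i in range(0, height - size + 1, stride):
        out_row = []
        for j in range(0, width - size + 1, stride):
            # Each depth slice is pooled independently of the others.
            out_row.append([
                max(x[i + di][j + dj][d] for di in range(size) for dj in range(size))
                for d in range(depth)
            ])
        out.append(out_row)
    return out

# 4 x 4 x 2 input: depth 0 counts up, depth 1 counts down.
x = [[[r * 4 + c, -(r * 4 + c)] for c in range(4)] for r in range(4)]
pooled = max_pool_valid(x, size=2, stride=2)
print(pooled)  # [[[5, 0], [7, -2]], [[13, -8], [15, -10]]]
```

The output is 2×2×2: length and width are halved while the depth stays at 2.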

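4. Fully connected layer: convolution and pooling act as the feature extraction stage, and a fully connected network on top of them performs the classification.

5. Output layer: a softmax layer, used mainly for classification problems.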
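As a reminder of what the softmax layer computes, here is a minimal sketch in plain Python. The input values are made up for illustration.

```python
import math

def softmax(logits):
    """Turn raw scores into a probability distribution."""
    largest = max(logits)
    # Subtracting the maximum does not change the result but avoids overflow.
    exps = [math.exp(z - largest) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])
print(probs)
print(sum(probs))  # approximately 1.0
```

The largest score receives the largest probability, and the outputs always sum to 1.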

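II. Differences from a fully connected neural network

1. In a fully connected network the nodes of each layer are arranged as a single column; in a convolutional network the nodes of each layer are arranged as a three-dimensional matrix. This determines the input format: a fully connected network takes a vector as input, while a convolutional network takes a matrix (a three-dimensional array). In practice the input is the matrix for a whole batch.
2. In a fully connected network every node is connected to every node in the adjacent layer; a convolutional network uses local connections and parameter sharing. This greatly reduces the number of parameters, and sharing parameters makes the detected content independent of where it appears in the image.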
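A quick count shows how much parameter sharing saves. The layer sizes below (a 32×32×3 input, 500 hidden nodes, a 5×5 filter of depth 16) are assumed values chosen only for illustration.

```python
def fully_connected_params(height, width, depth, hidden_nodes):
    """Every input node connects to every hidden node, plus one bias per hidden node."""
    return height * width * depth * hidden_nodes + hidden_nodes

def conv_params(filter_size, in_depth, out_depth):
    """Weights are shared across positions, so the count ignores the image size."""
    return filter_size * filter_size * in_depth * out_depth + out_depth

fc = fully_connected_params(32, 32, 3, 500)
conv = conv_params(5, 3, 16)
print(fc)    # 1536500
print(conv)  # 1216
```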

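III. Classic convolutional neural network models
3.1 General design pattern for convolutional network architectures
Input layer -> (convolutional layer+ -> pooling layer?)+ -> fully connected layer+
"+" means one or more layers. Most convolutional networks use at most three convolutional layers in a row.
"?" means zero or one layer. Pooling reduces the number of parameters and helps prevent overfitting; it can be replaced by increasing the stride of the convolutional layer.
Configuration summary:
Filter side length is usually 1, 3 or 5, occasionally as large as 7 or 11;
Filter depth increases layer by layer, typically doubling (×2);
Convolutional layer stride is usually 1, sometimes 2 or 3;
Pooling layer: filter side length is usually 2 or 3, and the stride is usually 2 or 3 as well.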
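The design pattern reads like a regular expression, and it can be written as one. The sketch below uses a one-letter encoding of layer types that is an assumption of this example.

```python
import re

# I = input, C = convolutional, P = pooling, F = fully connected
PATTERN = re.compile(r"I(C+P?)+F+")

def follows_pattern(layers):
    """Return True if the layer sequence matches input -> (conv+ -> pool?)+ -> fc+."""
    return PATTERN.fullmatch(layers) is not None

print(follows_pattern("ICPCPFFF"))   # True: conv, pool, conv, pool, three fc layers
print(follows_pattern("ICCPCCPFF"))  # True: two conv layers before each pooling layer
print(follows_pattern("IPCF"))       # False: pooling comes before any convolution
print(follows_pattern("ICPC"))       # False: no fully connected layer at the end
```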

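3.2 LeNet-5 model (99.5% accuracy)

Input layer: raw image pixels, 32×32×1.

Layer 1, convolutional layer. Input: 32×32×1. Filter: 5×5, depth 6, no zero padding, stride 1. Output size (32-5+1=28): 28×28×6. The layer has 5×5×1×6+6=156 parameters and 28×28×6×(5×5+1)=122304 connections: the next layer's node matrix has 28×28×6=4704 nodes, and each node is connected to 5×5 nodes of the current layer plus one bias term.

Layer 2, pooling layer. Input: 28×28×6. Filter: 2×2, no zero padding, stride 2. Output size: 14×14×6.

Layer 3, convolutional layer. Input: 14×14×6. Filter: 5×5, depth 16, no zero padding, stride 1. Output size (14-5+1=10): 10×10×16. The layer has 5×5×6×16+16=2416 parameters and 10×10×16×(5×5+1)=41600 connections. (This connection figure counts a single 5×5 slice per output node; counting all 6 input depths gives 10×10×16×(5×5×6+1)=241600.)

Layer 4, pooling layer. Input: 10×10×16. Filter: 2×2, no zero padding, stride 2. Output size: 5×5×16.

Layer 5, fully connected layer. Input: 5×5×16. Because the filter is 5×5 with depth 120, no zero padding and stride 1, this layer is equivalent to a fully connected layer. Output size: 1×1×120. The layer has 5×5×16×120+120=48120 parameters and 1×1×120×(5×5×16+1)=48120 connections.

Layer 6, fully connected layer. Input: 120 nodes, output: 84 nodes. Parameters: 84×120+84=10164. Connections: 84×(120+1)=10164.

Layer 7, fully connected layer. Input: 84 nodes, output: 10 nodes. Parameters: 84×10+10=850.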
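The per-layer parameter counts can be reproduced with two small helper functions. This is a sketch for checking the arithmetic; the helper names are chosen here, and the pooling layers are omitted because they have no parameters.

```python
def conv_params(filter_size, in_depth, out_depth):
    """Weights of all filters plus one bias per filter."""
    return filter_size * filter_size * in_depth * out_depth + out_depth

def fc_params(n_in, n_out):
    """One weight per input/output pair plus one bias per output."""
    return n_in * n_out + n_out

params = {
    "layer1_conv": conv_params(5, 1, 6),         # 156
    "layer3_conv": conv_params(5, 6, 16),        # 2416
    "layer5_fc": fc_params(5 * 5 * 16, 120),     # 48120
    "layer6_fc": fc_params(120, 84),             # 10164
    "layer7_fc": fc_params(84, 10),              # 850
}
total = sum(params.values())
for name, count in params.items():
    print(name, count)
print("total", total)  # 61706
```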

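TensorFlow implementation: the convolutional network takes a four-dimensional matrix as input, whereas the fully connected network took a one-dimensional vector per example, so the format of the input data has to be adjusted.

# Adjust the input format (excerpt from the training script; batch_size, xs and the mnist_inference module are defined there)
x = tf.placeholder(tf.float32,shape=[batch_size,mnist_inference.image_size,mnist_inference.image_size,mnist_inference.num_channels],name='x-input')
reshaped_xs = np.reshape(xs,(batch_size,mnist_inference.image_size,mnist_inference.image_size,mnist_inference.num_channels))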

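# Modified forward pass
import tensorflow as tf

input_node = 784
output_node = 10

image_size = 28
num_channels = 1
num_labels = 10

# Size and depth of the first convolutional layer
conv1_size = 5
conv1_deep = 32
# Size and depth of the second convolutional layer
conv2_size = 5
conv2_deep = 64
# Number of nodes in the fully connected layer
fc_size = 512
# The train argument tells training apart from testing. Dropout is applied only during training, and only to the fully connected layer, not to convolutional or pooling layers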
def inference(input_tensor,train,regularizer):

    with tf.variable_scope("layer1-conv1"):
        conv1_weights = tf.get_variable("weight",[conv1_size,conv1_size,num_channels,conv1_deep],initializer=tf.truncated_normal_initializer(stddev=0.1))
        conv1_biases = tf.get_variable("bias",[conv1_deep],initializer=tf.constant_initializer(0.0))
        conv1 = tf.nn.conv2d(input_tensor,conv1_weights,strides=[1,1,1,1],padding='SAME')
        relu1 = tf.nn.relu(tf.nn.bias_add(conv1,conv1_biases))
        # Input: 28*28*1 matrix
        # Output: 28*28*32 matrix (depth = conv1_deep)


    with tf.variable_scope("layer2-pooling1"):
        pooling1 = tf.nn.max_pool(relu1,ksize=[1,2,2,1],strides=[1,2,2,1],padding="SAME")
        # Input: 28*28*32 matrix
        # Output: 14*14*32 matrix


    with tf.variable_scope("layer3-conv2"):
        # The input depth of this layer is the depth of the previous layer (conv1_deep), not num_channels
        conv2_weights = tf.get_variable("weight",shape=[conv2_size,conv2_size,conv1_deep,conv2_deep],initializer=tf.truncated_normal_initializer(stddev=0.1))
        conv2_biases = tf.get_variable('bias',shape=[conv2_deep],initializer=tf.constant_initializer(0.0))
        conv2 = tf.nn.conv2d(pooling1,conv2_weights,strides=[1,1,1,1],padding="SAME")
        relu2 = tf.nn.relu(tf.nn.bias_add(conv2,conv2_biases))
        # Input: 14*14*32 matrix
        # Output: 14*14*64 matrix


    with tf.variable_scope("layer4-pooling2"):
        pooling2 = tf.nn.max_pool(relu2,ksize=[1,2,2,1],strides=[1,2,2,1],padding="SAME")
        # Input: 14*14*64 matrix
        # Output: 7*7*64 matrix


    # The fully connected layer expects a vector, but the fourth (pooling) layer outputs a matrix, so its nodes are flattened into a vector. get_shape() returns the static shape of the tensor as a TensorShape
    # as_list() converts it to a Python list
    # get_shape() is available on tensors only; tf.shape() accepts any tensor-like value and returns the shape as a tensor
    pool_shape = pooling2.get_shape().as_list() # 4-D; dimension 0 is the number of examples in a batch
    # Number of nodes per example: length * width * depth (a product, not a sum)
    nodes = pool_shape[1]*pool_shape[2]*pool_shape[3]
    # tf.reshape turns the output of the fourth layer into a batch of vectors
    reshaped = tf.reshape(pooling2,[pool_shape[0],nodes])
    with tf.variable_scope('layer5-fc'):
        fc1_weights = tf.get_variable("weight",[nodes,fc_size],initializer=tf.truncated_normal_initializer(stddev=0.1))
        # Only the weights of the fully connected layers are regularized
        if regularizer is not None:
            tf.add_to_collection('losses',regularizer(fc1_weights))
        fc1_biases = tf.get_variable('bias',shape=[fc_size],initializer=tf.constant_initializer(0.1))
        fc1 = tf.nn.relu(tf.matmul(reshaped,fc1_weights)+fc1_biases)
        if train:
            fc1 = tf.nn.dropout(fc1,0.5)
            # Note: dropout is used only in the fully connected layer, not in convolutional or pooling layers, and only during training, not during testing
        # Input: vector of 7*7*64 = 3136 nodes
        # Output: vector of 512 nodes


    with tf.variable_scope('layer6-fc2'):
        fc2_weights = tf.get_variable('weight',shape=[fc_size,num_labels],initializer=tf.truncated_normal_initializer(stddev=0.1))
        if regularizer is not None:
            tf.add_to_collection('losses',regularizer(fc2_weights))
        fc2_biases = tf.get_variable('bias',shape=[num_labels],initializer=tf.constant_initializer(0.1))
        fc2 = tf.matmul(fc1,fc2_weights)+fc2_biases
        # Input: vector of 512 nodes
        # Output: vector of 10 nodes
    return fc2
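The shapes in the comments of this network can be traced without running TensorFlow. The sketch below assumes SAME padding throughout, as in the code, and uses the layer depths 32 and 64 from the constants; the helper name is chosen here.

```python
import math

def same_out(length, stride):
    """Output length for SAME padding."""
    return math.ceil(length / stride)

trace = []
size, depth = 28, 1
trace.append(("input", size, size, depth))
size, depth = same_out(size, 1), 32   # layer1-conv1, stride 1
trace.append(("conv1", size, size, depth))
size = same_out(size, 2)              # layer2-pooling1, stride 2
trace.append(("pool1", size, size, depth))
size, depth = same_out(size, 1), 64   # layer3-conv2, stride 1
trace.append(("conv2", size, size, depth))
size = same_out(size, 2)              # layer4-pooling2, stride 2
trace.append(("pool2", size, size, depth))

nodes = size * size * depth           # length of the flattened vector
for entry in trace:
    print(entry)
print(nodes)  # 3136
```

The flattened length is the product 7 × 7 × 64 = 3136. Adding the three dimensions instead would give 78, which would not match the reshape.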

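3.3 Inception model
Comparison:

LeNet-5 model	Inception-v3 model
Convolutional layers are connected in series	Convolutional layers are combined in parallel
Each convolutional layer uses a single filter size	One layer applies filters of several sizes at once and concatenates the resulting matrices

[Figure: structure of an Inception module]
As the figure shows, the input matrix is first processed by filters of different sizes (1, 3 and 5). Although the filter sizes differ, all of them use zero padding and a stride of 1, so every result has the same length and width as the input. The resulting matrices are then concatenated into one deeper matrix (combined along the depth dimension).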
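Concatenation along the depth dimension can be illustrated with nested lists. This is a sketch with made-up depths (2, 3 and 1) standing in for the outputs of three parallel filters; `concat_depth` is a helper defined here, not a library function.

```python
def concat_depth(*matrices):
    """Concatenate nested lists shaped [H][W][D] along the depth axis."""
    height, width = len(matrices[0]), len(matrices[0][0])
    for m in matrices:
        if len(m) != height or len(m[0]) != width:
            raise ValueError("all matrices must share the same length and width")
    return [
        [sum((m[i][j] for m in matrices), []) for j in range(width)]
        for i in range(height)
    ]

def constant_matrix(height, width, depth, value):
    return [[[value] * depth for _ in range(width)] for _ in range(height)]

a = constant_matrix(4, 4, 2, 1)  # e.g. output of the size-1 filters
b = constant_matrix(4, 4, 3, 3)  # e.g. output of the size-3 filters
c = constant_matrix(4, 4, 1, 5)  # e.g. output of the size-5 filters
merged = concat_depth(a, b, c)
print(len(merged), len(merged[0]), len(merged[0][0]))  # 4 4 6
print(merged[0][0])                                    # [1, 1, 3, 3, 3, 5]
```

The concatenation only works because every branch keeps the input's length and width, which is why the branches use zero padding and a stride of 1.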

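import tensorflow as tf
# Stand-in input node matrix; the 28x28x3 shape is only illustrative
input_tensor = tf.placeholder(tf.float32, [None, 28, 28, 3])
# Convolutional layer implemented directly with the raw TensorFlow API
# The shapes and arguments below are illustrative values chosen to match the slim call further down (3x3 filter, depth 32)
with tf.variable_scope('layer_conv'):
    weights = tf.get_variable("weight",shape=[3,3,3,32],initializer=tf.truncated_normal_initializer(stddev=0.1))
    biases = tf.get_variable("bias",shape=[32],initializer=tf.constant_initializer(0.1))
    conv = tf.nn.conv2d(input_tensor,weights,strides=[1,1,1,1],padding='SAME')
    relu = tf.nn.relu(tf.nn.bias_add(conv,biases))
# The same layer with TensorFlow-Slim: one line implements the forward pass (input node matrix, filter depth, filter size)
# Load the slim library
import tensorflow.contrib.slim as slim
net = slim.conv2d(input_tensor,32,[3,3])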

Implementing an Inception module
Four computation paths, whose results are then concatenated

import tensorflow as tf
import tensorflow.contrib.slim as slim
# slim.arg_scope sets default argument values. Its first argument is the list of functions that receive the defaults; for example, inside the scope slim.conv2d(net,320,[1,1]) automatically gets stride=1 and padding='SAME'
# A value passed explicitly in a call overrides the default
# padding must be 'SAME' here so that all branches keep the same length and width and can be concatenated
with slim.arg_scope([slim.conv2d,slim.max_pool2d,slim.avg_pool2d],stride=1,padding='SAME'):
    # Stand-in for the output node matrix of the previous layer; the 8x8x2048 shape is illustrative
    net = tf.placeholder(tf.float32,[None,8,8,2048])
    # Declare one variable scope for the whole Inception module
    with tf.variable_scope("Mixed_7c"):
        # Declare one variable scope per path
        # First path
        with tf.variable_scope("Branch_0"):
            branch_0 = slim.conv2d(net,320,[1,1],scope='Conv2d_0a_1x1')

        # Second path
        with tf.variable_scope("Branch_1"):
            branch_1 = slim.conv2d(net,384,[1,1],scope='Conv2d_0a_1x1')
            # tf.concat joins several matrices. In TensorFlow 1.x the list of tensors comes first and axis=3 selects the depth dimension
            branch_1 = tf.concat([slim.conv2d(branch_1,384,[1,3],scope='Conv2d_0b_1x3'),slim.conv2d(branch_1,384,[3,1],scope='Conv2d_0c_3x1')],axis=3)

        # Third path
        with tf.variable_scope("Branch_2"):
            branch_2 = slim.conv2d(net,448,[1,1],scope='Conv2d_0a_1x1')
            branch_2 = slim.conv2d(branch_2,384,[3,3],scope='Conv2d_0b_3x3')
            branch_2 = tf.concat([slim.conv2d(branch_2,384,[1,3],scope='Conv2d_0c_1x3'),slim.conv2d(branch_2,384,[3,1],scope='Conv2d_0d_3x1')],axis=3)

        # Fourth path
        with tf.variable_scope("Branch_3"):
            branch_3 = slim.avg_pool2d(net,[3,3],scope='AvgPool_0a_3x3')
            branch_3 = slim.conv2d(branch_3,192,[1,1],scope='Conv2d_0b_1x1')

        # Concatenate the four paths along the depth dimension
        net = tf.concat([branch_0,branch_1,branch_2,branch_3],axis=3)
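Since the four paths are concatenated along the depth dimension, the depth of the module's output is simply the sum of the branch depths. A short check using the filter depths from the code above:

```python
# Depth contributed by each path; Branch_1 and Branch_2 each end with
# two parallel convolutions of depth 384 that are concatenated.
branch_depths = {
    "Branch_0": 320,
    "Branch_1": 384 + 384,
    "Branch_2": 384 + 384,
    "Branch_3": 192,
}
output_depth = sum(branch_depths.values())
print(output_depth)  # 2048
```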

Note: the source code of the Inception-v3 model is available at:
https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/slim/python/slim/nets/inception_v3.py

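IV. Transfer learning with convolutional neural networks
The concept of transfer learning

Transfer learning takes a model that was trained on one problem and, with simple adjustments, makes it suitable for a new problem.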
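The idea can be sketched in a few lines of plain Python: keep a feature extractor fixed and train only a new output layer on top of it. Everything here is a toy assumption made for illustration. `frozen_features` stands in for the pretrained layers of a real network, and the data set has just four points.

```python
import math

def frozen_features(x):
    """Stand-in for the fixed, pretrained part of a network (hypothetical)."""
    return [x[0] + x[1], x[0] - x[1]]

def train_new_head(samples, labels, epochs=500, lr=0.1):
    """Fit only a new logistic-regression output layer on the frozen features."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            f = frozen_features(x)
            p = 1.0 / (1.0 + math.exp(-(w[0] * f[0] + w[1] * f[1] + b)))
            grad = p - y  # gradient of the cross-entropy loss w.r.t. the logit
            w = [w[0] - lr * grad * f[0], w[1] - lr * grad * f[1]]
            b -= lr * grad
    return w, b

def predict(w, b, x):
    f = frozen_features(x)
    return 1 if w[0] * f[0] + w[1] * f[1] + b > 0 else 0

# New task: the label is 1 when x[0] > x[1].
samples = [[2.0, 1.0], [1.5, 0.5], [1.0, 2.0], [0.5, 1.5]]
labels = [1, 1, 0, 0]
w, b = train_new_head(samples, labels)
predictions = [predict(w, b, x) for x in samples]
print(predictions)
```

Only `w` and `b` are updated during training; the feature extractor is never changed, which is what keeps the adjustment cheap.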

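Transfer learning in TensorFlow (using the Inception-v3 model on the flower dataset)
1. Dataset: flower_photos.tgz (5 classes of flowers, about 734 images per class on average, in JPG format)
Download: http://download.tensorflow.org/example_images/flower_photos.tgz
Goal: use the Inception model to classify the flower species
Preprocessing: convert the data to the input format of the Inception model (JPG images of varying sizes are resized to 299×299×3 matrices)
2. Download the Inception-v3 model (already trained, provided by Google)
http://download.tensorflow.org/models/inception_v3_2016_08_28.tar.gz
3. Transfer learning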

# Data preprocessing and splitting
# 1. Convert the data to the input format of the Inception-v3 model
# -*- coding:utf-8 -*-
import glob  # file path pattern matching
import os.path
import os
os.environ['TF_CPP_MIN_LOG_LEVEL']='2'
import numpy as np
import tensorflow as tf
from tensorflow.python.platform import gfile # file reading and writing

input_data_path = r'I:\代码\tensorflow\data\flower_photos_Inception'
# The processed image data is saved in numpy format
output_data_path =r'I:\代码\tensorflow\data\flower_processed_data.npy'

# Percentage of the data used for validation and for testing
validation_percentage = 10
test_percentage = 10

# Read the data and split it into training, validation and test sets
def create_image_list(sess,testing_percentage,validation_percentage):
    '''
    I:\代码\tensorflow\data\flower_photos_Inception
    I:\代码\tensorflow\data\flower_photos_Inception\daisy
    I:\代码\tensorflow\data\flower_photos_Inception\dandelion
    I:\代码\tensorflow\data\flower_photos_Inception\roses
    I:\代码\tensorflow\data\flower_photos_Inception\sunflowers
    I:\代码\tensorflow\data\flower_photos_Inception\tulips
    '''
    sub_dirs = [x[0] for x in os.walk(input_data_path)]
    # os.walk() yields triples (root, dirs, files). root is the path of the directory currently being visited; dirs is a list of the names of the directories inside it (not including deeper levels)
    # files is likewise a list, containing the names of all files in that directory (not including sub-directories)
    is_root_dir = True

    # Initialize the data sets
    training_images = []
    training_labels = []
    testing_images = []
    testing_labels = []
    validation_images = []
    validation_labels = []
    current_label = 0

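    # Iterate over all sub-directories (one per class)
    for sub_dir in sub_dirs:
        if is_root_dir:
            # The first entry returned by os.walk is the root directory itself; skip it
            is_root_dir = False
            continue
        # Collect all image files in this sub-directory
        extensions = ['jpg','jpeg','JPG',"JPEG"]
        file_list = []
        dir_name = os.path.basename(sub_dir) # last component of the path
        # e.g. for 'I:\代码\tensorflow\data\flower_photos_Inception\daisy' this returns daisy
        for extension in extensions:
            file_glob = os.path.join(input_data_path,dir_name,'*.'+extension)
            # I:\代码\tensorflow\data\flower_photos_Inception\daisy\*.jpg
            file_list.extend(glob.glob(file_glob))
        # On case-insensitive file systems '*.jpg' and '*.JPG' match the same files, so remove duplicates
        file_list = sorted(set(file_list))
        # Skip directories that contain no images
        if not file_list:
            continue

        # Process the image data (once per directory, after all paths have been collected)
        for file_name in file_list:
            # Read and decode the image, then resize it to 299*299 so that the Inception-v3 model can process it
            image_raw_data = gfile.FastGFile(file_name,'rb').read() # gfile.FastGFile() can be replaced by open()
            image = tf.image.decode_jpeg(image_raw_data) # decode the image
            if image.dtype!=tf.float32:
                # Convert to float32 before resizing
                image = tf.image.convert_image_dtype(image,dtype=tf.float32)
            image = tf.image.resize_images(image,[299,299])
            image_value = sess.run(image)

            # Randomly assign the image to one of the data sets
            chance = np.random.randint(100)
            if chance<validation_percentage:
                validation_images.append(image_value)
                validation_labels.append(current_label)
            elif chance<(testing_percentage+validation_percentage):
                testing_images.append(image_value)
                testing_labels.append(current_label)
            else:
                training_images.append(image_value)
                training_labels.append(current_label)
        # One label per class directory
        current_label+=1
    # Shuffle the training data to get better training results
    state = np.random.get_state() # get_state() saves the current state of the random number generator
    np.random.shuffle(training_images)
    np.random.set_state(state) # set_state() restores the saved state, so the next shuffle applies the same permutation
    np.random.shuffle(training_labels)
    # Images and labels are shuffled with the same permutation, so each image still matches its label
    # The six lists have different lengths, so they are stored as an object array
    return np.asarray([training_images,training_labels,validation_images,validation_labels,testing_images,testing_labels],dtype=object)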

# Main function for preparing the data
def main():
    with tf.Session() as sess:
        processed_data = create_image_list(sess,test_percentage,validation_percentage)
        # Save the processed data in numpy format
        np.save(output_data_path,processed_data)

if __name__ == '__main__':
    main()
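The trick of shuffling images and labels with the same permutation can be tried in isolation. The sketch below uses the standard-library `random` module, whose `getstate`/`setstate` play the same role as NumPy's `get_state`/`set_state` in the script above; the file names are made up.

```python
import random

images = ["img_a", "img_b", "img_c", "img_d", "img_e"]
labels = [0, 1, 2, 3, 4]
pairs_before = set(zip(images, labels))

state = random.getstate()  # remember the generator state
random.shuffle(images)
random.setstate(state)     # rewind to the remembered state
random.shuffle(labels)     # same state and same length give the same permutation

pairs_after = set(zip(images, labels))
print(pairs_after == pairs_before)  # True
```

This works only because both lists have the same length, so both shuffles consume the random numbers in exactly the same way.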

# Transfer learning process

To be continued
