Traffic Sign Classification with AlexNet

Latest recommended article published on 2024-04-13 22:17:31

立里∑

Tags: TensorFlow, traffic sign recognition, AlexNet

Article link: https://blog.csdn.net/qq_41218103/article/details/102732386

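1. Data source

Download the Belgian traffic sign images from http://btsd.ethz.ch/shareddata/.
(In this article the Testing folder is used as the training set and the Training folder is used for testing.)

2. Data preprocessing

Convert the .ppm images in the Testing folder to .jpg images.
Approach: walk through the subfolders of the Testing folder (each subfolder holds different images of one traffic sign). Open each image with PIL, resize it to the 227×227 input size used by the model, convert it to .jpg, and save it in another folder. Before saving, os.path.exists checks whether the target path exists; if it returns False, os.makedirs(path) creates the path.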

import os
from PIL import Image

def change(path):
    # Resize every image under path to the size required by the AlexNet model
    # and save it as .jpg under out_root/<class folder name>/
    out_root = r"C:\Users\abc\Desktop\test1"
    for root, stdirs, filenames in os.walk(path):
        # Walk the folder tree rooted at path
        for stdir in stdirs:
            # Each subfolder holds the images of one traffic sign class
            i = 0
            dir1 = os.path.join(root, stdir)
            # Path of the subfolder
            for filename in os.listdir(dir1):
                # os.listdir returns the names of the entries in the subfolder
                dir2 = os.path.join(dir1, filename)
                # Path of the image
                try:
                    img = Image.open(dir2)
                    # Open the image with PIL
                except OSError:
                    # Not an image, e.g. the .csv file in each folder: skip it
                    print(dir2)
                    print('------')
                    continue
                i += 1
                x_s = 227
                y_s = 227
                out = img.resize((x_s, y_s), Image.LANCZOS)
                # Resize the image, see (1) below; Image.LANCZOS is the current name of Image.ANTIALIAS
                dir3_3 = os.path.join(out_root, stdir)
                # Output folder, named after the class folder of the image
                if not os.path.exists(dir3_3):
                    # os.path.exists returns True if the path exists
                    os.makedirs(dir3_3)
                    # Creates every missing directory on the path, i.e. both test1 and the class folder if needed
                dir3 = os.path.join(dir3_3, str(i) + '.jpg')
                out.save(dir3)
                # Save the image at the path dir3

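Functions used:
(1) img.resize((width, height), Image.ANTIALIAS)

 img.resize((width, height), X)
    The second argument X selects the resampling filter:
     Image.NEAREST : nearest neighbour, lowest quality
     Image.BILINEAR: bilinear interpolation
     Image.BICUBIC : bicubic interpolation
     Image.ANTIALIAS: high quality; an alias of Image.LANCZOS that was removed in Pillow 10, so recent Pillow versions need Image.LANCZOS

(2) Difference between os.makedirs(r"C:\path1\path2\path3") and os.mkdir(r"C:\path1\path2\path3")

os.makedirs(r"C:\path1\path2\path3"):
Creates nested directories: if path1, path2 and path3 do not exist, they are created one after another.

 os.mkdir(r"C:\path1\path2\path3"):
Creates only the last directory on the path (path3) and raises FileNotFoundError if path1 or path2 does not exist. (There is no function named os.makedir.)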
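
The difference between the two calls can be checked with a minimal sketch. It works in a temporary directory, and the helper name compare_mkdir_makedirs is made up for this illustration:

```python
import os
import tempfile

def compare_mkdir_makedirs(base):
    """Return (mkdir_failed, makedirs_ok) for a three-level path under base."""
    nested = os.path.join(base, "path1", "path2", "path3")
    try:
        os.mkdir(nested)                  # parents path1/path2 are missing
        mkdir_failed = False
    except FileNotFoundError:
        mkdir_failed = True
    os.makedirs(nested, exist_ok=True)    # creates path1, path2 and path3
    os.makedirs(nested, exist_ok=True)    # calling it again is harmless
    return mkdir_failed, os.path.isdir(nested)

with tempfile.TemporaryDirectory() as base:
    result = compare_mkdir_makedirs(base)
print(result)
```

With exist_ok=True the separate os.path.exists check becomes optional.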

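3. Building the dataset and assigning image labels

'''
Input: path of the image folder
Output: the list of image paths and the list of matching labels
'''

def get_file(file_dir):
    images = []
    # Paths of all images
    temp = []
    # Paths of the class subfolders
    for root, sub_folders, files in os.walk(file_dir):
        # files holds the names of the images in the current folder
        for name in files:
            images.append(os.path.join(root, name))
            # Append each image path to images
        for name in sub_folders:
            temp.append(os.path.join(root, name))
            # Append each subfolder path to temp

    labels = []
    # Labels, in the same order as images

    # temp now lists every subfolder under the root; take one folder at a time and label all images in it
    for one_folder in temp:
        n_img = len(os.listdir(one_folder))
        # Number of images in one_folder
        letter = os.path.basename(one_folder)
        # Folder name, which is the class number (os.path.basename works on any OS, unlike split('\\'))

        # Label the images: n_img copies of the class number
        labels = np.append(labels, n_img * [int(letter)])


    temp = np.array([images, labels])
    # 2 x N array: row 0 holds the paths, row 1 the labels (NumPy stores both as strings)
    temp = temp.transpose()
    # Transpose to N x 2 so that each row is one (path, label) pair
    np.random.shuffle(temp)
    # Shuffle the rows; each path stays with its label

    image_list = list(temp[:, 0])
    # First column: the image paths
    label_list = list(temp[:, 1])
    # Second column: the image labels
    label_list = [int(float(i)) for i in label_list]
    # The labels come back as strings such as '3.0', hence int(float(...))

    return image_list, label_list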
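
The labelling rule used above (folder name = class number, paths and labels shuffled together) can also be sketched with the standard library only. This is an illustrative variant, not the article's function: the name list_images_with_labels, the .jpg filter and the demo folder names 00000 and 00001 are assumptions.

```python
import os
import random
import tempfile

def list_images_with_labels(root_dir, seed=0):
    """Return shuffled (paths, labels); the label is the integer name of the parent folder."""
    pairs = []
    for folder in sorted(os.listdir(root_dir)):
        folder_path = os.path.join(root_dir, folder)
        if not os.path.isdir(folder_path):
            continue
        for name in sorted(os.listdir(folder_path)):
            if name.lower().endswith(".jpg"):       # ignore .csv and other files
                pairs.append((os.path.join(folder_path, name), int(folder)))
    random.Random(seed).shuffle(pairs)              # shuffle pairs so paths and labels stay aligned
    paths = [p for p, _ in pairs]
    labels = [l for _, l in pairs]
    return paths, labels

# Demonstration on a throwaway directory tree
with tempfile.TemporaryDirectory() as root:
    for folder, count in (("00000", 2), ("00001", 3)):
        os.makedirs(os.path.join(root, folder))
        for i in range(count):
            open(os.path.join(root, folder, "%d.jpg" % i), "w").close()
    demo_paths, demo_labels = list_images_with_labels(root)
print(sorted(demo_labels))
```

Pairing each path with its label before shuffling avoids relying on the folder counts matching the order in which os.walk returns the files.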

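4. Converting the image-path dataset into TensorFlow's input format

'''
Input: the image path list and the matching label list obtained in Section 3
Output: two tensors, a batch of images and a batch of labels
'''
def get_batch(image_list, label_list, img_width, img_height, batch_size, capacity):
    """Convert the lists of image paths and labels into batched TensorFlow tensors."""
    image = tf.cast(image_list, tf.string)
    # Convert the list to a string tensor; each element is a byte string
    label = tf.cast(label_list, tf.int64)
    # Convert the list to a tensor of 64-bit signed integers
    input_queue = tf.train.slice_input_producer([image, label])
    # Create an input queue that yields one (path, label) pair at a time
    label = input_queue[1]
    image_contents = tf.read_file(input_queue[0])
    # tf.read_file reads the raw bytes of the image file
    image = tf.image.decode_jpeg(image_contents, channels=3)
    # Decode the .jpg bytes into a 3-D tensor. The result is a tensor, so it has no value
    # until it is evaluated in a session
    image = tf.image.resize_image_with_crop_or_pad(image, img_height, img_width)
    # Crop or pad the image to img_height x img_width (227×227 here)
    image = tf.image.per_image_standardization(image)
    image_batch, label_batch = tf.train.batch([image, label], batch_size=batch_size, num_threads=64, capacity=capacity)
    # Use the batch_size and capacity arguments instead of hard-coded values
    label_batch = tf.reshape(label_batch, [batch_size])
    return image_batch, label_batch


# Pass the dataset path and obtain the two batch tensors
x_train, y_train = get_file(r'C:\Users\abc\Desktop\test1')
image_batch, label_batch = get_batch(x_train, y_train, 227, 227, 200, 2048)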

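Functions used:
(1). tf.train.slice_input_producer

tf.train.slice_input_producer creates TensorFlow's input queue, a queue-based way of feeding data to a model. It takes one slice at a time from the tensors in the list
[image, label], in order or at random, and puts it into the queue.
(The queue holds the entries that take part in training; to go through the data N times, it produces all entries N times.)

slice_input_producer(tensor_list, num_epochs=None, shuffle=True, seed=None,capacity=32, shared_name=None, name=None)

tensor_list: a list of tensors with the same first dimension; in [image, label] there must be exactly one label per image.
num_epochs: number of passes over the data; if unset, the tensor list is traversed indefinitely.
shuffle: bool, whether to shuffle the samples. With shuffle=True the samples are already shuffled, so tf.train.batch is sufficient for batching; with shuffle=False, use tf.train.shuffle_batch if the batches should be shuffled.
seed: integer (optional), used when shuffle=True.
capacity: capacity of the queue.
shared_name: optional; if set, the queue is shared under this name across sessions.
name: name of the operation (optional).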
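
To make the behaviour of num_epochs and shuffle concrete, here is a plain-Python sketch of the same idea. It is not TensorFlow's implementation: the generator slice_producer is a hypothetical stand-in, and it uses a finite num_epochs where TensorFlow's default of None means "loop forever".

```python
import random

def slice_producer(tensor_list, num_epochs=1, shuffle=True, seed=None):
    """Yield one tuple of aligned elements at a time, epoch by epoch."""
    n = len(tensor_list[0])
    if any(len(t) != n for t in tensor_list):
        raise ValueError("all lists must have the same length")
    rng = random.Random(seed)
    for _ in range(num_epochs):
        order = list(range(n))
        if shuffle:
            rng.shuffle(order)            # a new order for every epoch
        for i in order:
            yield tuple(t[i] for t in tensor_list)

paths = ["a.jpg", "b.jpg", "c.jpg"]
labels = [0, 1, 2]
produced = list(slice_producer([paths, labels], num_epochs=2, seed=1))
print(produced)
```

Each yielded tuple keeps a path together with its own label, which is why the image list and the label list must have the same length.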

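(2). tf.image.resize_image_with_crop_or_pad

tf.image.resize_image_with_crop_or_pad(image, target_height, target_width):
Resizes an image to the target height and width by cropping it centrally or padding it evenly with zeros. If the width or height is larger than the target,
the image is cropped centrally along that dimension. If the width or height is smaller than the target,
the image is padded with black (zeros) along that dimension.

(3). tf.image.per_image_standardization

tf.image.per_image_standardization(image):
Standardizes an image to zero mean and unit variance. Each pixel x becomes (x - mean) / adjusted_stddev, where adjusted_stddev = max(stddev, 1/sqrt(N)) and N is the number of elements in the image.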
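
The formula can be checked on a handful of numbers. The following is a minimal pure-Python sketch that treats the image as a flat list of pixel values; it only illustrates the arithmetic and is not the TensorFlow implementation.

```python
import math

def per_image_standardization(pixels):
    """Standardize a flat list of pixel values to zero mean and unit variance."""
    n = len(pixels)
    mean = sum(pixels) / n
    variance = sum((p - mean) ** 2 for p in pixels) / n
    # The lower bound on the divisor protects uniform images from division by zero
    adjusted_stddev = max(math.sqrt(variance), 1.0 / math.sqrt(n))
    return [(p - mean) / adjusted_stddev for p in pixels]

standardized = per_image_standardization([0.0, 50.0, 100.0, 250.0])
flat = per_image_standardization([7.0, 7.0, 7.0, 7.0])
print(standardized, flat)
```

For the uniform image the standard deviation is 0, so the 1/sqrt(N) bound is used and every output value is 0.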

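(4). tf.train.batch

tf.train.batch creates a queue of tensors: the given tensors are enqueued, and batch_size of them are dequeued at a time
to form the data of one training batch.

tf.train.batch(tensors, batch_size, num_threads=1, capacity=32, enqueue_many=False, shapes=None,
dynamic_pad=False, allow_smaller_final_batch=False, shared_name=None, name=None)

tensors: a list or dictionary of tensors to enqueue
batch_size: number of elements dequeued per batch
num_threads: number of threads that enqueue tensors; if num_threads is greater than 1, batching is non-deterministic and the order of the output may vary
capacity: an integer, the maximum number of elements in the queue
enqueue_many: whether each tensor in tensors is a batch of examples (True) or a single example (False)
shapes: optional, the shape of each example; defaults to the inferred shapes of tensors
dynamic_pad: Boolean. Allows inputs of variable shape; they are padded on dequeue so that all shapes within a batch match
allow_smaller_final_batch: optional, Boolean. If True, the remaining examples are returned as a smaller final batch when fewer than batch_size are left in the queue; if False, those examples are not dequeued
shared_name: optional; if set, the queue is shared across sessions
name: optional, name of the operation

Difference between tf.train.batch and tf.train.slice_input_producer:

tf.train.slice_input_producer creates the input queue that produces individual (path, label) slices.

tf.train.batch() reads examples from that queue in order and returns them grouped into batches.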
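
The effect of batch_size and allow_smaller_final_batch can be illustrated without queues or threads. The generator below is a hypothetical plain-Python analogue, not the TensorFlow function:

```python
def batch(examples, batch_size, allow_smaller_final_batch=False):
    """Group an iterable of examples into lists of batch_size items."""
    current = []
    for example in examples:
        current.append(example)
        if len(current) == batch_size:
            yield current
            current = []
    if current and allow_smaller_final_batch:
        yield current                     # leftover examples form a shorter last batch

full_only = list(batch(range(7), 3))
with_tail = list(batch(range(7), 3, allow_smaller_final_batch=True))
print(full_only)
print(with_tail)
```

With seven examples and a batch size of three, the seventh example is dropped unless the smaller final batch is allowed.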

5. Normalizing layer outputs with Batch Normalization

def batch_norm(inputs, is_training, is_conv_out=True, decay=0.999):
    scale = tf.Variable(tf.ones([inputs.get_shape()[-1]]))
    beta = tf.Variable(tf.zeros([inputs.get_shape()[-1]]))
    pop_mean = tf.Variable(tf.zeros(inputs.get_shape()[-1]), trainable=False)
    pop_var = tf.Variable(tf.ones(inputs.get_shape()[-1]), trainable=False)

    if is_training:
        if is_conv_out:
            batch_mean, batch_var = tf.nn.moments(inputs, [0, 1, 2])
            # tf.nn.moments computes the mean and variance over the given axes

        else:
            batch_mean, batch_var = tf.nn.moments(inputs, [0])

        train_mean = tf.assign(pop_mean, pop_mean * decay + batch_mean * (1 - decay))
        # tf.assign(ref, value) assigns value to ref; ref must be a Variable
        train_var = tf.assign(pop_var, pop_var * decay + batch_var * (1 - decay))

        with tf.control_dependencies([train_mean, train_var]):
            return tf.nn.batch_normalization(inputs, batch_mean, batch_var, beta, scale, 0.001)
        # tf.control_dependencies specifies execution dependencies: [train_mean, train_var] are
        # evaluated first, then tf.nn.batch_normalization runs
        # batch_normalization(x, mean, variance, offset, scale, variance_epsilon): batch normalization,
        # usually applied before the activation function; it shifts each dimension of x to mean 0 and variance 1

    else:
        return tf.nn.batch_normalization(inputs, pop_mean, pop_var, beta, scale, 0.001)

(1) batch_normalization(x,mean,variance,offset,scale,variance_epsilon)

batch_normalization(x,mean,variance,offset,scale,variance_epsilon): normalizes the data (this is normalization, not regularization)

x: the input tensor
mean: mean of the batch
variance: variance of the batch
offset: learnable shift (beta in the code above)
scale: learnable scale factor
variance_epsilon: small constant added to the variance to avoid division by zero
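
The computation performed by these arguments is y = scale * (x - mean) / sqrt(variance + variance_epsilon) + offset. A minimal sketch on a 1-D list, together with the moving-average update that batch_norm applies to pop_mean and pop_var (the function names batch_norm_1d and update_moving_average are made up for this illustration):

```python
import math

def batch_norm_1d(values, offset=0.0, scale=1.0, variance_epsilon=0.001):
    """Normalize a list of numbers with its own batch statistics."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    normalized = [scale * (v - mean) / math.sqrt(variance + variance_epsilon) + offset
                  for v in values]
    return normalized, mean, variance

def update_moving_average(pop_value, batch_value, decay=0.999):
    """Exponential moving average, as used for the population statistics."""
    return pop_value * decay + batch_value * (1 - decay)

normalized, batch_mean, batch_var = batch_norm_1d([1.0, 2.0, 3.0, 4.0])
pop_mean = update_moving_average(0.0, batch_mean)
print(normalized, batch_mean, batch_var, pop_mean)
```

With decay = 0.999 the population statistics move very slowly, so after a few hundred training steps they can still be far from the batch statistics used during training.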

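6. Model parameters

# Model parameters:
learning_rate = 1e-4  # 1×10^(-4)
# Learning rate
training_iters = 200
# Planned number of training iterations
batch_size = 200
# Number of randomly drawn images per training step
display_step = 5
n_classes = 62
# Number of classes
n_fc1 = 4096
n_fc2 = 2048
# Number of units in the first and second fully connected layers

# Build the network
x = tf.placeholder(tf.float32, [None, 227, 227, 3])
y = tf.placeholder(tf.int32, [None, n_classes])

# Weights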
W_conv = {
    'conv1': tf.Variable(tf.truncated_normal([11, 11, 3, 96], stddev=0.0001)),
    'conv2': tf.Variable(tf.truncated_normal([5, 5, 96, 256], stddev=0.01)),
    'conv3': tf.Variable(tf.truncated_normal([3, 3, 256, 384], stddev=0.01)),
    'conv4': tf.Variable(tf.truncated_normal([3, 3, 384, 384], stddev=0.01)),
    'conv5': tf.Variable(tf.truncated_normal([3, 3, 384, 256], stddev=0.01)),
    'fc1': tf.Variable(tf.truncated_normal([13 * 13 * 256, n_fc1], stddev=0.1)),
    'fc2': tf.Variable(tf.truncated_normal([n_fc1, n_fc2], stddev=0.1)),
    'fc3': tf.Variable(tf.truncated_normal([n_fc2, n_classes], stddev=0.1))
}

# Biases
b_conv = {
    'conv1': tf.Variable(tf.constant(0.0, dtype=tf.float32, shape=[96])),
    'conv2': tf.Variable(tf.constant(0.1, dtype=tf.float32, shape=[256])),
    'conv3': tf.Variable(tf.constant(0.1, dtype=tf.float32, shape=[384])),
    'conv4': tf.Variable(tf.constant(0.1, dtype=tf.float32, shape=[384])),
    'conv5': tf.Variable(tf.constant(0.1, dtype=tf.float32, shape=[256])),
    'fc1': tf.Variable(tf.constant(0.1, dtype=tf.float32, shape=[n_fc1])),
    'fc2': tf.Variable(tf.constant(0.1, dtype=tf.float32, shape=[n_fc2])),
    'fc3': tf.Variable(tf.constant(0.0, dtype=tf.float32, shape=[n_classes]))
}

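7. First convolutional layer

The convolution kernel is 11×11 with a stride of 4. ReLU is used as the activation function to introduce non-linearity; it also makes the activations sparse, because every unit with a negative input outputs zero, which speeds up the extraction of strong features. The result is then pooled. The pooling window is 3×3 with a stride of 2; because the stride is smaller than the window, neighbouring windows overlap (overlapping pooling), which retains more of the feature information. Finally, an LRN layer normalizes the output.

# Reshape the input x into 227×227 three-channel images
x_image = tf.reshape(x, [-1, 227, 227, 3])

# Convolutional layer 1
conv1 = tf.nn.conv2d(x_image, W_conv['conv1'], strides=[1, 4, 4, 1], padding='VALID')
# Apply the convolution
conv1 = tf.nn.bias_add(conv1, b_conv['conv1'])
# tf.nn.bias_add adds the bias b_conv['conv1'] to conv1
#conv1 = batch_norm(conv1, True)
# batch_norm: batch normalization
conv1 = tf.nn.relu(conv1)
# ReLU activation

# Pooling layer 1
pool1 = tf.nn.max_pool(conv1, ksize=[1, 3, 3, 1], strides=[1, 2, 2, 1], padding='VALID')
# Max pooling

# LRN layer: local response normalization, smooths the output of the current layer
norm1 = tf.nn.lrn(pool1, 5, bias=1.0, alpha=0.001 / 9.0, beta=0.75)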
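Functions used:
(1) tf.nn.conv2d

tf.nn.conv2d(input,filter,strides,padding,use_cudnn_on_gpu=None,name=None): the convolution function

input: the input images to convolve, a Tensor of shape [batch, in_height, in_width, in_channels]:
[number of images in a training batch, image height, image width, number of channels], of type float32 or float64
filter: the convolution kernel, a Tensor of shape [filter_height, filter_width, in_channels, out_channels]: [kernel height, kernel width, number of input channels, number of output channels], of the same type as input. Its third dimension, in_channels, equals the fourth dimension of input
strides: the stride along each dimension of the input, a 1-D vector of length 4. The first and fourth elements are normally 1; the second and third are the vertical and horizontal strides.
padding: a string, either SAME or VALID; it determines how the borders are handled
use_cudnn_on_gpu: bool, whether to use cuDNN acceleration; defaults to true

Convolution: extracts features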

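(2) tf.nn.avg_pool

tf.nn.avg_pool(value,ksize,strides,padding,name=None): average pooling, which takes the mean over the rectangular window around each position as the value at that position. tf.nn.max_pool takes the same arguments and returns the maximum of the window instead.

First argument, value: the input to pool, normally a feature map, because pooling follows a convolution. Its shape is [batch, in_height, in_width, in_channels]:

batch: number of images in a training batch
in_height: height of the input
in_width: width of the input
in_channels: number of input feature maps

Second argument, ksize: the size of the pooling window, similar to a convolution filter. It is a 1-D array of length 4 whose first and last elements must be 1, i.e. [1, height, width, 1], which means that no pooling is done across batch or channels. In practice the most common sizes are [1, 2, 2, 1] and [1, 3, 3, 1].

height: height of the window
width: width of the window

Third argument, strides: the stride along each dimension, a 1-D vector of length 4, [1, stride, stride, 1]. The first and last elements must be 1, because the stride applies only to height and width.
Fourth argument, padding: a string, "SAME" or "VALID". "SAME" pads the borders when the window does not fit; "VALID" uses no padding.

Pooling: aggregates features at different positions to further process the feature maps produced by the convolution and reduce the number of feature values. A pooling function summarizes the values at a position and its neighbours and uses the summary as the value at that position.
Max pooling: reduces the shift of the estimated mean caused by errors in the convolution parameters, and retains more texture information.
Average pooling: reduces the increase in estimation variance caused by the limited neighbourhood size, and retains more background information.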

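(3) tf.nn.lrn normalization

tf.nn.lrn(input, depth_radius=None, bias=None, alpha=None, beta=None,name=None)

input: the input data,
depth_radius: how many neighbouring channels on each side are included in the normalization,
bias: an offset,
alpha and beta: coefficients.
Each input value is divided by a term computed from the squared values of its neighbouring channels at the same position: output = input / (bias + alpha * sqr_sum) ** beta, where sqr_sum is the sum of squares over channels d - depth_radius to d + depth_radius.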
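
A minimal pure-Python sketch of local response normalization for the channel values at a single pixel. The defaults below follow the documented TensorFlow defaults (depth_radius=5, bias=1, alpha=1, beta=0.5); the demo values are chosen only to keep the arithmetic easy to follow.

```python
def lrn(channels, depth_radius=5, bias=1.0, alpha=1.0, beta=0.5):
    """Local response normalization over the channel values at one pixel."""
    out = []
    for d, value in enumerate(channels):
        lo = max(0, d - depth_radius)
        hi = min(len(channels), d + depth_radius + 1)
        sqr_sum = sum(c * c for c in channels[lo:hi])   # squares of neighbouring channels
        out.append(value / (bias + alpha * sqr_sum) ** beta)
    return out

lrn_demo = lrn([1.0, 2.0, 3.0], depth_radius=1, bias=1.0, alpha=1.0, beta=1.0)
print(lrn_demo)
```

Channels with strong neighbours are scaled down more, which is the "smoothing" effect mentioned for the LRN layer.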
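(4) The ReLU function
$f(x) = \max(0, x)$
Unlike sigmoid or tanh, ReLU does not saturate for large positive inputs: its gradient there is 1, which avoids gradients close to 0 that would stall the parameter updates. For negative inputs both the output and the gradient are 0.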

8. Second convolutional layer

# Convolutional layer 2
conv2 = tf.nn.conv2d(norm1, W_conv['conv2'], strides=[1, 1, 1, 1], padding='SAME')
conv2 = tf.nn.bias_add(conv2, b_conv['conv2'])
conv2 = batch_norm(conv2, True)
conv2 = tf.nn.relu(conv2)

# Pooling layer 2
pool2 = tf.nn.avg_pool(conv2, ksize=[1, 3, 3, 1], strides=[1, 2, 2, 1], padding='VALID')


# LRN layer
norm2 = tf.nn.lrn(pool2, 5, bias=1.0, alpha=0.001 / 9.0, beta=0.75)

9. Third, fourth and fifth convolutional layers

# Convolutional layer 3
conv3 = tf.nn.conv2d(norm2, W_conv['conv3'], strides=[1, 1, 1, 1], padding='SAME')
conv3 = tf.nn.bias_add(conv3, b_conv['conv3'])
conv3 = batch_norm(conv3, True)
conv3 = tf.nn.relu(conv3)

# Convolutional layer 4
conv4 = tf.nn.conv2d(conv3, W_conv['conv4'], strides=[1, 1, 1, 1], padding='SAME')
conv4 = tf.nn.bias_add(conv4, b_conv['conv4'])
conv4 = batch_norm(conv4, True)
conv4 = tf.nn.relu(conv4)

# Convolutional layer 5
conv5 = tf.nn.conv2d(conv4, W_conv['conv5'], strides=[1, 1, 1, 1], padding='SAME')
conv5 = tf.nn.bias_add(conv5, b_conv['conv5'])
conv5 = batch_norm(conv5, True)
conv5 = tf.nn.relu(conv2)
# NOTE: this applies ReLU to conv2, not conv5, so the outputs of layers 3 to 5 are not used
# downstream. pool5 is then 13 x 13 x 256, which is what the reshape to 13 * 13 * 256 in the
# fully connected part expects. Passing conv5 instead would give a 6 x 6 x 256 pool5 output
# and would require changing the fc1 input size to 6 * 6 * 256.

# Pooling layer 5
pool5 = tf.nn.avg_pool(conv5, ksize=[1, 3, 3, 1], strides=[1, 2, 2, 1], padding='VALID')
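
The flatten size 13 * 13 * 256 used by the first fully connected layer can be checked by tracing the spatial size through the layers. The sketch below uses the standard output-size rules for 'VALID' and 'SAME' padding; the helper name out_size is made up for this illustration.

```python
def out_size(size, kernel, stride, padding):
    """Spatial output size of a conv/pool layer for 'VALID' or 'SAME' padding."""
    if padding == "VALID":
        return (size - kernel) // stride + 1
    if padding == "SAME":
        return -(-size // stride)         # ceil(size / stride)
    raise ValueError(padding)

size = 227
trace = {}
for name, kernel, stride, padding in [
    ("conv1", 11, 4, "VALID"), ("pool1", 3, 2, "VALID"),
    ("conv2", 5, 1, "SAME"), ("pool2", 3, 2, "VALID"),
    ("conv3", 3, 1, "SAME"), ("conv4", 3, 1, "SAME"), ("conv5", 3, 1, "SAME"),
]:
    size = out_size(size, kernel, stride, padding)
    trace[name] = size

pool5_from_conv5 = out_size(trace["conv5"], 3, 2, "VALID")   # pooling the real conv5 output
pool5_from_conv2 = out_size(trace["conv2"], 3, 2, "VALID")   # pooling a conv2-sized map
print(trace, pool5_from_conv5, pool5_from_conv2)
```

Pooling a 13×13 map gives 6×6, while pooling a 27×27 map gives 13×13. A flatten size of 13 * 13 * 256 therefore matches a pool5 input that has the size of conv2, not of conv5.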

10. Fully connected layers

Before the fully connected layers, the input is reshaped from a 4-D tensor to a 2-D matrix.

reshape = tf.reshape(pool5, [-1, 13 * 13 * 256])
# Fully connected layer 1
fc1 = tf.add(tf.matmul(reshape, W_conv['fc1']), b_conv['fc1'])
#fc1 = batch_norm(fc1, True, False)
fc1 = tf.nn.relu(fc1)

# Fully connected layer 2
fc2 = tf.add(tf.matmul(fc1, W_conv['fc2']), b_conv['fc2'])
#fc2 = batch_norm(fc2, True, False)
fc2 = tf.nn.relu(fc2)

# Fully connected layer 3 (outputs the class logits)
fc3 = tf.add(tf.matmul(fc2, W_conv['fc3']), b_conv['fc3'])
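11. Loss function

Softmax turns the network output into a probability distribution over the classes, and the cross-entropy measures how far this distribution is from the true label.

# Define the loss function
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=fc3, labels=y))
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate).minimize(loss)

Functions used:
(1) tf.nn.softmax_cross_entropy_with_logits

tf.nn.softmax_cross_entropy_with_logits(_sentinel=None,labels=None, logits=None, name=None):
Softmax cross-entropy, used as the loss function. labels is the true label distribution and logits is the model's unnormalized prediction;
the cross-entropy loss measures how closely the prediction matches labels.

_sentinel: not used in practice; leave it unset
logits: the raw output of the network, before softmax or sigmoid has been applied. The shape is usually [batch_size, num_classes], or [num_classes] for a single sample. The data type is float32 or float64;
labels: a tensor with the same type (float) and shape as logits.
name: name of the operation, optional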
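
What the loss computes for one sample can be sketched in plain Python. This is an illustration of the formula loss = -sum(labels * log(softmax(logits))), written with the usual max-shift for numerical stability; the logits are made-up example values.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]      # shift by the max for stability
    total = sum(exps)
    return [e / total for e in exps]

def softmax_cross_entropy(logits, onehot_labels):
    """Cross-entropy between one-hot labels and softmax(logits)."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return -sum(y * (z - log_sum_exp) for y, z in zip(onehot_labels, logits))

probs = softmax([2.0, 1.0, 0.1])
loss_correct = softmax_cross_entropy([2.0, 1.0, 0.1], [1, 0, 0])   # label = highest logit
loss_wrong = softmax_cross_entropy([2.0, 1.0, 0.1], [0, 0, 1])     # label = lowest logit
uniform_loss = softmax_cross_entropy([0.0, 0.0, 0.0], [0, 1, 0])
print(probs, loss_correct, loss_wrong, uniform_loss)
```

The loss is small when the true class has the largest logit and equals log(num_classes) when all logits are equal, which is a useful reference value at the start of training.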
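(2) tf.reduce_mean

tf.reduce_mean(input_tensor,axis=None,keep_dims=False,name=None,reduction_indices=None)
: computes the mean of a tensor along the given axis (one dimension of the tensor); it is mainly used to reduce dimensions or
to compute the mean of a tensor (image).

First argument, input_tensor: the tensor to reduce;
Second argument, axis: the axis to reduce; if unset, the mean of all elements is computed;
Third argument, keep_dims: whether to keep the reduced dimensions. If True, the reduced axes are kept with length 1, so the output has the same rank as the input; if False, they are removed;
Fourth argument, name: name of the operation;
Fifth argument, reduction_indices: the old name for axis, deprecated;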

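(3) tf.train.GradientDescentOptimizer

tf.train.GradientDescentOptimizer(learning_rate, use_locking=False,name='GradientDescent')
: an optimizer that implements the gradient descent algorithm (with one batch per step, this is stochastic gradient descent)

learning_rate: the learning rate
use_locking: if True, use locks for the update operations
name: optional name, defaults to "GradientDescent"

.minimize() handles both gradient computation and parameter update: it updates the parameters so as to minimize loss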
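
The update rule behind minimize() is w = w - learning_rate * gradient. A minimal sketch on a one-parameter toy loss, loss(w) = (w - 3)^2, chosen only for illustration:

```python
def gradient_descent(grad_fn, w, learning_rate, steps):
    """Repeat the update w <- w - learning_rate * grad(w)."""
    for _ in range(steps):
        w = w - learning_rate * grad_fn(w)
    return w

# The gradient of (w - 3)^2 is 2 * (w - 3); the minimum is at w = 3
w_final = gradient_descent(lambda w: 2.0 * (w - 3.0), w=0.0, learning_rate=0.1, steps=100)
print(w_final)
```

Each step shrinks the distance to the minimum by a constant factor here; a smaller learning rate, such as the 1e-4 used in this article, needs correspondingly more steps.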

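12. Re-encoding the labels

def onehot(labels, n_class=n_classes):
    # Use the fixed class count (n_classes = 62) so that the width always matches placeholder y;
    # max(labels) + 1 would be too small whenever a batch lacks the largest label
    n_sample = len(labels)
    onehot_labels = np.zeros((n_sample, n_class))
    onehot_labels[np.arange(n_sample), labels] = 1
    return onehot_labels

This produces a matrix with len(labels) rows and n_classes columns. Each row is the one-hot encoding of one label: a single 1 in the column given by the label, and 0 elsewhere.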
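
A one-hot matrix must have as many columns as the placeholder y (n_classes = 62), even when a batch does not contain the largest label. A minimal pure-Python sketch; the helper name onehot_fixed is hypothetical:

```python
def onehot_fixed(labels, num_classes):
    """One-hot encode integer labels into rows of length num_classes."""
    rows = []
    for label in labels:
        if not 0 <= label < num_classes:
            raise ValueError("label %d outside [0, %d)" % (label, num_classes))
        row = [0] * num_classes
        row[label] = 1                    # single 1 in the column of the label
        rows.append(row)
    return rows

encoded = onehot_fixed([0, 2], num_classes=4)
print(encoded)
```

Here the largest label in the batch is 2, so a width of max(labels) + 1 would give 3 columns instead of the required 4.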

13. Training the model

# Path and file name under which the trained model is saved
save_model = r"C:\Users\abc\Desktop\model\model1\AlexNetModel.ckpt"

init = tf.global_variables_initializer()

# Training function
def train(opech):
    with tf.Session() as sess:
        sess.run(init)

        saver = tf.train.Saver()
        # Create a tf.train.Saver instance
        # Write the log
        train_writer = tf.summary.FileWriter(r"C:\Users\abc\Desktop\model\log", sess.graph)
        # Directory in which the graph is saved
        # Loss of every training step, used as the points of the plot
        point = []
        start_time = time.time()
        # Timestamp of the current time
        coord = tf.train.Coordinator()
        # Create a thread coordinator that manages the threads started in the session; when any
        # thread requests a stop, all threads are notified.
        # The Coordinator is created before the threads are started and passed to each of them
        threads = tf.train.start_queue_runners(coord=coord)

        step = 0

        # opech is the number of iterations; each iteration trains on one batch
        for i in range(opech):
            step = i
            image, label = sess.run([image_batch, label_batch])
            labels = onehot(label)
            sess.run(optimizer, feed_dict={x: image, y: labels})
            # optimizer: one optimization step
            loss_record = sess.run(loss, feed_dict={x: image, y: labels})
            # Loss on the current batch
            print("Current loss: %f \n" % loss_record)
            point.append(loss_record)
            end_time = time.time()
            # Timestamp of the current time
            print("Elapsed time: ", (end_time - start_time))
            print("----------------------------iteration %d finished-----------------------" % i)

        print("Training finished!")
        saver.save(sess, save_model)
        # Save the trained model with TensorFlow's save function
        print("Model saved to %s !" % save_model)

        coord.request_stop()
        # Ask this thread and the other threads to stop.
        # After request_stop is called, should_stop returns True, so the other threads stop as well.
        coord.join(threads)
        # Wait for the queue threads (the ones that fill the input queue) to finish
        plt.plot(point)
        plt.xlabel('iteration')
        plt.ylabel('loss_record')
        plt.title('learning_rate = %f , training_iters = %d ,batch_size = %d' % (learning_rate, training_iters, batch_size))
        plt.tight_layout()
        plt.savefig(r'C:\Users\abc\Desktop\model\AlexNet.jpg', dpi=200)

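Functions used:
(1) sess.run(tf.global_variables_initializer())

sess.run(tf.global_variables_initializer()): initializes the model parameters, i.e. the variables created with tf.Variable

(2) train.Saver()

train.Saver(): the TensorFlow API for saving and restoring neural network models

(3) tf.summary.FileWriter

tf.summary.FileWriter(path,session.graph): writes the graph to an event file.

path: the directory in which the event file is written
session.graph: the graph to record

(4) tf.train.start_queue_runners

After tf.train.slice_input_producer has created the input queue, the system is still "stalled":
nothing has actually been put into the queue. If computation starts at this point, the queue is empty, the computation waits forever,
and the whole system blocks. Only after tf.train.start_queue_runners has been called are the threads that fill the queue started, and the system is no longer
"stalled". The computation can then fetch data, and the program runs.

tf.train.start_queue_runners(): starts the enqueue threads, which read the data into the queue according to the configured rules.
The function returns the list of threads it started. The number of enqueue threads is not tied to the number of CPU cores;
it is set by the num_threads argument of tf.train.batch.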
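
The interplay of queue, enqueue thread and coordinator can be imitated with the standard library. The sketch below is only an analogy: queue.Queue stands in for the TensorFlow queue, a threading.Event for the Coordinator's stop request, and run_pipeline is a made-up name.

```python
import queue
import threading

def run_pipeline(items, num_batches, batch_size, capacity=8):
    """Fill a bounded queue from a background thread and read batches from it."""
    q = queue.Queue(maxsize=capacity)
    stop = threading.Event()              # plays the role of coord.request_stop()

    def enqueue_forever():
        i = 0
        while not stop.is_set():
            try:
                q.put(items[i % len(items)], timeout=0.05)
                i += 1
            except queue.Full:
                continue                  # queue is full: re-check the stop flag

    worker = threading.Thread(target=enqueue_forever)
    worker.start()                        # without this, q.get() below would block forever
    batches = [[q.get() for _ in range(batch_size)] for _ in range(num_batches)]
    stop.set()                            # ask the worker to stop ...
    worker.join()                         # ... and wait for it, like coord.join(threads)
    return batches, worker.is_alive()

batches, still_running = run_pipeline(["a", "b", "c"], num_batches=2, batch_size=2)
print(batches, still_running)
```

With a single enqueue thread the order is deterministic; with several threads, as with num_threads=64 in get_batch, the order of the examples can vary between runs.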

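14. Predicting images with the trained model

With batch normalization enabled, the trained model could not recognize the images, so batch normalization was removed. (A likely cause, stated here as an assumption: batch_norm is always called with is_training=True, so at prediction time the network normalizes with the statistics of the single input image instead of the population statistics.)

def per_class(imagefile):
    image = Image.open(imagefile)
    image = image.resize([227, 227])
    image_array = np.array(image)
    # Convert the image to an array
    image = tf.cast(image_array, tf.float32)
    # Convert to a float32 tensor
    image = tf.image.resize_image_with_crop_or_pad(image, 227, 227)
    # Crop or pad to 227×227 (a no-op here, because the image was already resized)
    image = tf.image.per_image_standardization(image)
    # Standardize the image
    image = tf.reshape(image, [1, 227, 227, 3])

    saver = tf.train.Saver()
    # Create a Saver object
    with tf.Session() as sess:
        save_model = tf.train.latest_checkpoint(r'C:\Users\abc\Desktop\model\model1')
        # Find the most recent model in the folder
        saver.restore(sess, save_model)
        # Restore all variables with the Saver object
        image = tf.reshape(image, [1, 227, 227, 3])
        image = sess.run(image)
        prediction = sess.run(fc3, feed_dict={x: image})
        print('prediction',prediction)
        max_index = np.argmax(prediction)
        # Index of the largest logit, i.e. the predicted class
        print('max_index',max_index)
        return max_index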

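Files produced by training:

The .meta file stores the graph structure; it is a pb (protocol buffer) file that contains variables, ops, collections, etc.
The checkpoint data files are binary and store the values of all saved variables such as weights and biases. Before TensorFlow 0.11 they were stored in a single **.ckpt** file. Since 0.11 they are stored in two files, e.g.:
.data-00000-of-00001
.index
The checkpoint file is a text file that records the most recent checkpoint and the list of other checkpoints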

Full code

import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
import time
import os
from PIL import Image

"""
Input: dataset path (here the test1 folder)
Output: two lists: 1. the image paths  2. the label of the image at the same position in list 1 (both shuffled together)
"""


def get_file(file_dir):
    images = []  # Paths of all images
    temp = []  # Paths of the class subfolders
    for root, sub_folders, files in os.walk(file_dir):

        for name in files:
            images.append(os.path.join(root, name))

        for name in sub_folders:
            temp.append(os.path.join(root, name))

    labels = []  # Labels, in the same order as images

    # temp now lists every subfolder under the root; take one folder at a time and label all images in it
    for one_folder in temp:
        n_img = len(os.listdir(one_folder))  # Number of images in the folder
        letter = os.path.basename(one_folder)  # Folder name, which is the class number

        # Label the images
        labels = np.append(labels, n_img * [int(letter)])


    temp = np.array([images, labels])  # 2 x N array holding the paths and their labels
    temp = temp.transpose()  # Transpose to N x 2: one (path, label) pair per row
    np.random.shuffle(temp)  # Shuffle the dataset

    image_list = list(temp[:, 0])  # First column: the image paths
    label_list = list(temp[:, 1])  # Second column: the image labels
    label_list = [int(float(i)) for i in label_list]
    #print('labels:',labels)
    #print('temp:',temp)
    #print(' label_list', label_list)

    return image_list, label_list


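"""
Input: the two lists returned by get_file (image paths and matching labels), the image width and height, the batch size
Output: two tensors: a batch of images of shape (200, 227, 227, 3) and a batch of labels of shape (200,)
The dataset is read from the paths, and the images are bundled with their labels as input to the AlexNet network
"""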


def get_batch(image_list, label_list, img_width, img_height, batch_size, capacity):
    """Convert the lists of image paths and labels into batched TensorFlow tensors."""
    image = tf.cast(image_list, tf.string)
    label = tf.cast(label_list, tf.int64)
    input_queue = tf.train.slice_input_producer([image, label])

    label = input_queue[1]
    image_contents = tf.read_file(input_queue[0])
    image = tf.image.decode_jpeg(image_contents, channels=3)

    image = tf.image.resize_image_with_crop_or_pad(image, img_height, img_width)
    image = tf.image.per_image_standardization(image)
    # Use the batch_size and capacity arguments instead of hard-coded values
    image_batch, label_batch = tf.train.batch([image, label], batch_size=batch_size, num_threads=64, capacity=capacity)
    label_batch = tf.reshape(label_batch, [batch_size])
    return image_batch, label_batch


# Pass the dataset path and obtain the two batch tensors
x_train, y_train = get_file(r'C:\Users\abc\Desktop\test1')
image_batch, label_batch = get_batch(x_train, y_train, 227, 227, 200, 2048)

'''
def batch_norm(inputs, is_training, is_conv_out=True, decay=0.999):
    scale = tf.Variable(tf.ones([inputs.get_shape()[-1]]))
    beta = tf.Variable(tf.zeros([inputs.get_shape()[-1]]))
    pop_mean = tf.Variable(tf.zeros(inputs.get_shape()[-1]), trainable=False)
    pop_var = tf.Variable(tf.ones(inputs.get_shape()[-1]), trainable=False)

    if is_training:
        if is_conv_out:
            batch_mean, batch_var = tf.nn.moments(inputs, [0, 1, 2])

        else:
            batch_mean, batch_var = tf.nn.moments(inputs, [0])

        train_mean = tf.assign(pop_mean, pop_mean * decay + batch_mean * (1 - decay))
        train_var = tf.assign(pop_var, pop_var * decay + batch_var * (1 - decay))

        with tf.control_dependencies([train_mean, train_var]):
            return tf.nn.batch_normalization(inputs, batch_mean, batch_var, beta, scale, 0.001)

    else:
        return tf.nn.batch_normalization(inputs, pop_mean, pop_var, beta, scale, 0.001)
'''

# Model parameters:
learning_rate = 1e-4  # 1×10^(-4)
# Learning rate
training_iters = 200
batch_size = 200
display_step = 5
n_classes = 62
# Number of classes
n_fc1 = 4096
n_fc2 = 2048
# Number of units in the first and second fully connected layers

# Build the network
x = tf.placeholder(tf.float32, [None, 227, 227, 3])
y = tf.placeholder(tf.int32, [None, n_classes])

# Weights
W_conv = {
    'conv1': tf.Variable(tf.truncated_normal([11, 11, 3, 96], stddev=0.0001)),
    'conv2': tf.Variable(tf.truncated_normal([5, 5, 96, 256], stddev=0.01)),
    'conv3': tf.Variable(tf.truncated_normal([3, 3, 256, 384], stddev=0.01)),
    'conv4': tf.Variable(tf.truncated_normal([3, 3, 384, 384], stddev=0.01)),
    'conv5': tf.Variable(tf.truncated_normal([3, 3, 384, 256], stddev=0.01)),
    'fc1': tf.Variable(tf.truncated_normal([13 * 13 * 256, n_fc1], stddev=0.1)),
    'fc2': tf.Variable(tf.truncated_normal([n_fc1, n_fc2], stddev=0.1)),
    'fc3': tf.Variable(tf.truncated_normal([n_fc2, n_classes], stddev=0.1))
}

# Biases
b_conv = {
    'conv1': tf.Variable(tf.constant(0.0, dtype=tf.float32, shape=[96])),
    'conv2': tf.Variable(tf.constant(0.1, dtype=tf.float32, shape=[256])),
    'conv3': tf.Variable(tf.constant(0.1, dtype=tf.float32, shape=[384])),
    'conv4': tf.Variable(tf.constant(0.1, dtype=tf.float32, shape=[384])),
    'conv5': tf.Variable(tf.constant(0.1, dtype=tf.float32, shape=[256])),
    'fc1': tf.Variable(tf.constant(0.1, dtype=tf.float32, shape=[n_fc1])),
    'fc2': tf.Variable(tf.constant(0.1, dtype=tf.float32, shape=[n_fc2])),
    'fc3': tf.Variable(tf.constant(0.0, dtype=tf.float32, shape=[n_classes]))
}

# Reshape the input x into 227×227 three-channel images
x_image = tf.reshape(x, [-1, 227, 227, 3])

# Convolutional layer 1
conv1 = tf.nn.conv2d(x_image, W_conv['conv1'], strides=[1, 4, 4, 1], padding='VALID')
conv1 = tf.nn.bias_add(conv1, b_conv['conv1'])
#conv1 = batch_norm(conv1, True)
conv1 = tf.nn.relu(conv1)

# Pooling layer 1
pool1 = tf.nn.avg_pool(conv1, ksize=[1, 3, 3, 1], strides=[1, 2, 2, 1], padding='VALID')

# LRN layer
norm1 = tf.nn.lrn(pool1, 5, bias=1.0, alpha=0.001 / 9.0, beta=0.75)

# Convolutional layer 2
conv2 = tf.nn.conv2d(norm1, W_conv['conv2'], strides=[1, 1, 1, 1], padding='SAME')
conv2 = tf.nn.bias_add(conv2, b_conv['conv2'])
#conv2 = batch_norm(conv2, True)
conv2 = tf.nn.relu(conv2)

# Pooling layer 2
pool2 = tf.nn.avg_pool(conv2, ksize=[1, 3, 3, 1], strides=[1, 2, 2, 1], padding='VALID')


# LRN layer
norm2 = tf.nn.lrn(pool2, 5, bias=1.0, alpha=0.001 / 9.0, beta=0.75)

# Convolutional layer 3
conv3 = tf.nn.conv2d(norm2, W_conv['conv3'], strides=[1, 1, 1, 1], padding='SAME')
conv3 = tf.nn.bias_add(conv3, b_conv['conv3'])
#conv3 = batch_norm(conv3, True)
conv3 = tf.nn.relu(conv3)

# Convolutional layer 4
conv4 = tf.nn.conv2d(conv3, W_conv['conv4'], strides=[1, 1, 1, 1], padding='SAME')
conv4 = tf.nn.bias_add(conv4, b_conv['conv4'])
#conv4 = batch_norm(conv4, True)
conv4 = tf.nn.relu(conv4)

# Convolutional layer 5
conv5 = tf.nn.conv2d(conv4, W_conv['conv5'], strides=[1, 1, 1, 1], padding='SAME')
conv5 = tf.nn.bias_add(conv5, b_conv['conv5'])
#conv5 = batch_norm(conv5, True)
conv5 = tf.nn.relu(conv2)
# NOTE: this applies ReLU to conv2, not conv5, so layers 3 to 5 are bypassed. The 13 * 13 * 256
# reshape below matches this (27×27 pooled to 13×13); using conv5 would give 6 * 6 * 256.

# Pooling layer 5
pool5 = tf.nn.avg_pool(conv5, ksize=[1, 3, 3, 1], strides=[1, 2, 2, 1], padding='VALID')
reshape = tf.reshape(pool5, [-1, 13 * 13 * 256])

# Fully connected layer 1
fc1 = tf.add(tf.matmul(reshape, W_conv['fc1']), b_conv['fc1'])
#fc1 = batch_norm(fc1, True, False)
fc1 = tf.nn.relu(fc1)

# Fully connected layer 2
fc2 = tf.add(tf.matmul(fc1, W_conv['fc2']), b_conv['fc2'])
#fc2 = batch_norm(fc2, True, False)
fc2 = tf.nn.relu(fc2)

# Fully connected layer 3: the classification layer
fc3 = tf.add(tf.matmul(fc2, W_conv['fc3']), b_conv['fc3'])

# Define the loss function
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=fc3, labels=y))
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate).minimize(loss)

# Evaluate the model
correct_pred = tf.equal(tf.argmax(fc3, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))

init = tf.global_variables_initializer()


# Re-encode the labels as one-hot vectors
def onehot(labels, n_class=n_classes):
    # Fixed width n_classes, so the result always matches placeholder y
    n_sample = len(labels)
    onehot_labels = np.zeros((n_sample, n_class))
    onehot_labels[np.arange(n_sample), labels] = 1
    return onehot_labels


# Path and file name under which the trained model is saved
save_model = r"C:\Users\abc\Desktop\model\model1\AlexNetModel.ckpt"


# Training function
def train(opech):
    with tf.Session() as sess:
        sess.run(init)

        saver = tf.train.Saver()

        # Write the log
        train_writer = tf.summary.FileWriter(r"C:\Users\abc\Desktop\model\log", sess.graph)

        # Loss of every training step, used as the points of the plot
        point = []
        start_time = time.time()
        coord = tf.train.Coordinator()
        threads = tf.train.start_queue_runners(coord=coord)

        step = 0

        # opech is the number of iterations; each iteration trains on one batch
        for i in range(opech):
            step = i
            image, label = sess.run([image_batch, label_batch])
            labels = onehot(label)
            sess.run(optimizer, feed_dict={x: image, y: labels})
            loss_record = sess.run(loss, feed_dict={x: image, y: labels})
            print("Current loss: %f \n" % loss_record)
            point.append(loss_record)
            end_time = time.time()
            print("Elapsed time: ", (end_time - start_time))
            print("----------------------------iteration %d finished-----------------------" % i)

        print("Training finished!")
        saver.save(sess, save_model)
        # Save the trained model with TensorFlow's save function
        print("Model saved to %s !" % save_model)

        coord.request_stop()
        coord.join(threads)
        plt.plot(point)
        plt.xlabel('iteration')
        plt.ylabel('loss')
        plt.title('learning rate = %f , iterations = %d , batch size = %d' % (learning_rate, training_iters, batch_size))
        plt.tight_layout()
        plt.savefig(r'C:\Users\abc\Desktop\model\AlexNet.jpg', dpi=200)


def per_class(imagefile):
    image = Image.open(imagefile)
    image = image.resize([227, 227])
    image_array = np.array(image)

    image = tf.cast(image_array, tf.float32)

    image = tf.image.resize_image_with_crop_or_pad(image, 227, 227)
    image = tf.image.per_image_standardization(image)
    # Standardize the image
    image = tf.reshape(image, [1, 227, 227, 3])

    saver = tf.train.Saver()
    with tf.Session() as sess:

        save_model = tf.train.latest_checkpoint(r'C:\Users\abc\Desktop\model\model1')
        saver.restore(sess, save_model)
        image = tf.reshape(image, [1, 227, 227, 3])
        image = sess.run(image)
        prediction = sess.run(fc3, feed_dict={x: image})
       # print('prediction',prediction)
        max_index = np.argmax(prediction)
        imagefile1 = os.path.basename(imagefile)
        print(imagefile1 + ' belongs to traffic sign class ' + str(max_index))
        return max_index

# Run the program above
imagefiles = r"C:\Users\abc\Desktop\train2"

train(500)
for root, sub_folders, files in os.walk(imagefiles):
    for name in files:
        imagefile = os.path.join(root, name)
        #print(imagefile)
        a=per_class(imagefile)

Result: final loss 0.0624
Test output: [image]
