感谢这位作者,以下记录是来自于https://blog.csdn.net/qq_36631272/article/details/79173035的,我看到比较好,就转记录到自己的博客了,如果有侵权,立马删掉
在训练mnist数据的时候,根据书本上的内容都可以很好很快的编辑并跑出来,但是一旦换成自己的文件夹,就很头疼,毕竟mnist里面一个read_data解决你所有的输入问题,然而在现实中,该read_data是要自己编辑的,本文主要针对非one_hot数据,如何利用tensorflow搭起网络并跑通自己的数据,话不多说,直接上代码。
python版本:2.7
tensorflow 版本:1.2.0
- #!/usr/bin/env python2
- # -*- coding: utf-8 -*-
- """
- Created on Thu Jan 25 11:28:55 2018
- @author:huangxd
- """
- """
- vision:python3
- author:huangxd
- """
- import os
- import math
- import numpy as np
- import tensorflow as tf
# Build the image-path and label lists.
# train_dir='C:/Users/hxd/Desktop/tensorflow_study/Alexnet_dr'
# One (paths, labels) pair of lists per class; list i holds the file paths
# for class i and label_*class holds the matching constant label value i.
zeroclass, label_zeroclass = [], []
oneclass, label_oneclass = [], []
twoclass, label_twoclass = [], []
threeclass, label_threeclass = [], []
fourclass, label_fourclass = [], []
fiveclass, label_fiveclass = [], []
# s1: collect every image path under file_dir and tag it with its class label.
def get_files(file_dir, ratio):
    """Scan class sub-directories ``0``..``5`` under *file_dir* and split the
    images into a training set and a validation set.

    Args:
        file_dir: directory containing one sub-directory per class, named by
            its numeric label ('0' .. '5').
        ratio: fraction of the samples to reserve for validation (0.0-1.0).

    Returns:
        (tra_images, tra_labels, val_images, val_labels) where the image
        entries are file-path strings and the labels are plain ints.

    Fix vs. original: the path/label lists are now local, so calling this
    function more than once no longer accumulates duplicate entries in
    module-level globals.  Shuffling is done via an index permutation instead
    of np.array([paths, labels]), which silently converted the int labels to
    strings and forced the int(float(i)) round-trip.
    """
    image_list = []
    label_list = []
    # One sub-directory per class, named by its numeric label.
    for label in range(6):
        class_dir = file_dir + '/' + str(label)
        for fname in os.listdir(class_dir):
            image_list.append(class_dir + '/' + fname)
            label_list.append(label)
    # s2: shuffle paths and labels with one shared permutation so pairs stay aligned.
    perm = np.random.permutation(len(image_list))
    all_image_list = [image_list[i] for i in perm]
    all_label_list = [int(label_list[i]) for i in perm]
    # Split into train and validation parts; ratio is the validation fraction.
    n_sample = len(all_label_list)
    n_val = int(math.ceil(n_sample * ratio))  # number of validation samples
    n_train = n_sample - n_val                # number of training samples
    tra_images = all_image_list[0:n_train]
    tra_labels = all_label_list[0:n_train]
    val_images = all_image_list[n_train:]
    val_labels = all_label_list[n_train:]
    return tra_images, tra_labels, val_images, val_labels
# Build batches.
# s1: pass the lists from get_files() into get_batch(), cast their types, and
#     create an input queue.  Because the images and labels are two separate
#     lists, tf.train.slice_input_producer() is used, and tf.read_file() then
#     reads each image file from the queue.
# image_W, image_H: fixed target image width and height
# batch_size: how many images per batch
# capacity: maximum number of elements the queue may hold
def get_batch(image, label, image_W, image_H, batch_size, capacity):
    """Turn path/label lists into (image_batch, label_batch) tensors.

    image: list of image file-path strings; label: list of int labels.
    Returns an image batch of shape [batch_size, image_H, image_W, 3]
    (standardized floats) and a label batch of shape [batch_size].
    Uses the TF1 queue-runner pipeline; a tf.train.start_queue_runners call
    is required before evaluating the returned tensors.
    """
    # Cast input lists to tensors of the right dtype.
    image = tf.cast(image, tf.string)
    label = tf.cast(label, tf.int32)
    # Enqueue: slice_input_producer pairs up one path with one label.
    input_queue = tf.train.slice_input_producer([image, label])
    label = input_queue[1]
    image_contents = tf.read_file(input_queue[0])  # read raw image bytes
    # s2: decode; every image must decode to the same type (3-channel JPEG).
    image = tf.image.decode_jpeg(image_contents, channels=3)
    # s3: preprocessing - crop/pad to a fixed size, then per-image
    # standardization (zero mean, unit variance).
    image = tf.image.resize_image_with_crop_or_pad(image, image_W, image_H)
    image = tf.image.per_image_standardization(image)
    # s4: assemble batches with a multi-threaded batching queue.
    image_batch, label_batch = tf.train.batch([image, label],
                                              batch_size=batch_size,
                                              num_threads=32,
                                              capacity=capacity)
    # Reshape labels to a flat vector of length [batch_size].
    label_batch = tf.reshape(label_batch, [batch_size])
    # image_batch = tf.cast(image_batch, tf.float32)
    return image_batch, label_batch
该数据生成的是bool型,非one_hot编码,系统自带的mnist编码是one_hot编码,大家可以先去了解下这块东西
得到数据之后,接下来就是网络的搭建,我在这里将模型单独定义出来,方便后期的网络修正。
- #!/usr/bin/env python2
- # -*- coding: utf-8 -*-
- """
- Spyder Editor
- This is a temporary script file.
- filename:DR_alexnet.py
- creat time:2018年1月16日
- author:huangxudong
- """
- import tensorflow as tf
- import numpy as np
- #define different layer function
def maxPoolLayer(x, kHeight, kWidth, strideX, strideY, name, padding="SAME"):
    """Max-pool `x` with a kHeight x kWidth window and the given strides.

    NOTE(review): strides are placed as [1, strideX, strideY, 1] here but
    [1, strideY, strideX, 1] in convLayer; all call sites use equal strides
    so it does not matter in practice -- confirm if unequal strides are used.
    """
    return tf.nn.max_pool(x, ksize=[1, kHeight, kWidth, 1], strides=[1, strideX, strideY, 1], padding=padding, name=name)
def dropout(x, keepPro, name=None):
    """Apply dropout to `x`, keeping each unit with probability `keepPro`."""
    return tf.nn.dropout(x, keepPro, name)
def LRN(x, R, alpha, beta, name=None, bias=1.0):
    """Local response normalization (AlexNet-style) over a depth radius R."""
    return tf.nn.local_response_normalization(x, depth_radius=R, alpha=alpha,
                                              beta=beta, bias=bias, name=name)
def fcLayer(x, inputD, outputD, reluFlag, name):
    """Fully connected layer: out = x @ w + b, optionally followed by ReLU.

    x: 2-D input of shape [batch, inputD]; returns [batch, outputD].
    Variables "w" and "b" live in the variable scope `name`, so loadModel
    can find and assign them by layer name.
    """
    with tf.variable_scope(name) as scope:
        w = tf.get_variable("w", shape=[inputD, outputD])  # shape is the variable's dimensions
        b = tf.get_variable("b", [outputD])
        out = tf.nn.xw_plus_b(x, w, b, name=scope.name)
        if reluFlag:
            return tf.nn.relu(out)
        else:
            return out
def convLayer(x, kHeight, kWidth, strideX, strideY, featureNum, name, padding="SAME", groups=1):
    """Grouped convolution + ReLU (AlexNet-style).

    x: NHWC input tensor; featureNum: number of output channels; groups:
    split the channel axis into this many independent convolution groups
    (groups=1 is an ordinary convolution).  Variables "w" and "b" live in
    the variable scope `name` so loadModel can assign them by layer name.
    """
    channel = int(x.get_shape()[-1])  # input channel count (last axis of x)
    # Anonymous helper so the same conv call can be mapped over each group.
    conv = lambda a, b: tf.nn.conv2d(a, b, strides=[1, strideY, strideX, 1], padding=padding)
    with tf.variable_scope(name) as scope:
        # FIX: use // so the shape stays an int under Python 3; the original
        # `channel/groups` yields a float there and tf.get_variable rejects it.
        w = tf.get_variable("w", shape=[kHeight, kWidth, channel // groups, featureNum])
        b = tf.get_variable("b", shape=[featureNum])
        # Split input and kernels along the channel axis, convolve per group.
        xNew = tf.split(value=x, num_or_size_splits=groups, axis=3)
        wNew = tf.split(value=w, num_or_size_splits=groups, axis=3)
        featureMap = [conv(t1, t2) for t1, t2 in zip(xNew, wNew)]
        mergeFeatureMap = tf.concat(axis=3, values=featureMap)
        out = tf.nn.bias_add(mergeFeatureMap, b)
        # print(mergeFeatureMap.get_shape().as_list(),out.shape)
        # Convolution and activation done together; out has the same shape as
        # mergeFeatureMap, so no reshape is needed.
        return tf.nn.relu(out, name=scope.name)
class alexNet(object):
    """AlexNet-style CNN for 512x512x3 inputs with CLASSNUM output logits.

    Builds the graph at construction time; `fc3` holds the final layer's
    output.  loadModel() can then restore pretrained weights by layer name.
    """
    def __init__(self, x, keepPro, classNum, skip, modelPath="bvlc_alexnet.npy"):
        self.X = x
        self.KEEPPRO = keepPro      # dropout keep probability
        self.CLASSNUM = classNum    # number of output classes
        self.SKIP = skip            # layer names to skip in loadModel()
        self.MODELPATH = modelPath  # path to the pretrained .npy weight file
        # build CNN
        self.buildCNN()
    def buildCNN(self):
        """Assemble the network.  The trailing comments give the expected
        feature-map sizes per layer (original images 2800*2100, fed as 512*512)."""
        x1 = tf.reshape(self.X, shape=[-1, 512, 512, 3])
        # print(x1.shape)
        conv1 = convLayer(x1, 7, 7, 3, 3, 256, "conv1", "VALID")      # 169*169
        lrn1 = LRN(conv1, 2, 2e-05, 0.75, "norm1")
        pool1 = maxPoolLayer(lrn1, 3, 3, 2, 2, "pool1", "VALID")      # 84*84
        conv2 = convLayer(pool1, 3, 3, 1, 1, 512, "conv2", "VALID")   # 82*82
        lrn2 = LRN(conv2, 2, 2e-05, 0.75, "norm2")
        pool2 = maxPoolLayer(lrn2, 3, 3, 2, 2, "pool2", "VALID")      # 40*40
        conv3 = convLayer(pool2, 3, 3, 1, 1, 1024, "conv3", "VALID")  # 38*38
        conv4 = convLayer(conv3, 3, 3, 1, 1, 1024, "conv4", "VALID")  # 36*36
        conv5 = convLayer(conv4, 3, 3, 2, 2, 512, "conv5", "VALID")   # 17*17
        pool5 = maxPoolLayer(conv5, 3, 3, 2, 2, "pool5", "VALID")     # 8*8
        # print(pool5.shape)
        fcIn = tf.reshape(pool5, [-1, 512*8*8])
        fc1 = fcLayer(fcIn, 512*8*8, 4096, True, "fc6")
        dropout1 = dropout(fc1, self.KEEPPRO)
        fc2 = fcLayer(dropout1, 4096, 4096, True, "fc7")
        dropout2 = dropout(fc2, self.KEEPPRO)
        # NOTE(review): reluFlag=True applies ReLU to the final logits; that is
        # unusual before softmax cross-entropy -- confirm intended.
        self.fc3 = fcLayer(dropout2, 4096, self.CLASSNUM, True, "fc8")
    def loadModel(self, sess):
        """Load pretrained weights from MODELPATH into the graph's variables,
        skipping any layer named in self.SKIP."""
        # NOTE(review): np.load on a pickled dict requires allow_pickle=True
        # on NumPy >= 1.16.3 -- confirm the NumPy version in use.
        wDict = np.load(self.MODELPATH, encoding="bytes").item()
        for name in wDict:
            if name not in self.SKIP:
                with tf.variable_scope(name, reuse = True):
                    for p in wDict[name]:
                        if len(p.shape) == 1:
                            # 1-D arrays are biases
                            sess.run(tf.get_variable('b', trainable = False).assign(p))
                        else:
                            # everything else is a weight tensor
                            sess.run(tf.get_variable('w', trainable = False).assign(p))
训练模型的时候,维数一定要匹配,同时要了解你自己的数据的格式,和读取的类型,一个one_hot编码用的函数和非one_hot用的函数完全不一样,这也是我当时一直出现问题的原因。
- #!/usr/bin/env python2
- # -*- coding: utf-8 -*-
- """
- Created on Thu Jan 25 11:32:40 2018
- @author: huangxudong
- """
import dr_alexnet
import tensorflow as tf
import read_data2
# Training hyper-parameters.
learning_rate = 0.01
train_iters = 2000     # stop after roughly this many samples have been seen
batch_size = 5
capacity = 256         # maximum input-queue size
display_step = 10      # report loss/accuracy every display_step steps
# Read the path/label lists and wrap them into shuffled batch tensors.
tra_list, tra_labels, val_list, val_labels = read_data2.get_files('/home/bigvision/Desktop/DR_model', 0.2)
tra_list_batch, tra_label_batch = read_data2.get_batch(tra_list, tra_labels, 512, 512, batch_size, capacity)
val_list_batch, val_label_batch = read_data2.get_batch(val_list, val_labels, 512, 512, batch_size, capacity)
# Network parameters.
n_class = 6       # number of label classes (folders 0-5)
dropout = 0.75    # keep probability passed when building the model
skip = []
# Input placeholders: images arrive flattened, 512*512*3 = 786432 floats.
x = tf.placeholder(tf.float32, [None, 786432])  # 2800*2100*3,512*512*3
y = tf.placeholder(tf.int32, [None])            # sparse (non one-hot) labels
# print(y.shape)
# NOTE(review): keep_prob is fed below but never wired into the graph -- the
# model is built with the constant `dropout` above; confirm intended.
keep_prob = tf.placeholder(tf.float32)  # dropout
''''构建模型,定义损失函数和优化器'''''
pred = dr_alexnet.alexNet(x, dropout, n_class, skip)
# Loss and optimizer: sparse_softmax_* takes integer class ids, not one-hot rows.
cost = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(labels=y, logits=pred.fc3))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)
# Evaluation: in_top_k(logits, y, 1) is true where the top logit matches the
# integer label; with one-hot labels tf.argmax would be used instead.
correct_pred = tf.nn.in_top_k(pred.fc3, y, 1)
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))  # bool -> float mean
'''训练模型'''
init = tf.global_variables_initializer()  # initialize all variables
with tf.Session() as sess:
    sess.run(init)
    # Start the queue-runner threads that feed the input pipeline.
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(coord=coord)
    step = 1
    # Train until the requested number of samples has been consumed.
    while step*batch_size < train_iters:
        # feed_dict cannot take tensors, so materialize the batch via .eval().
        batch_x, batch_y = tra_list_batch.eval(session=sess), tra_label_batch.eval(session=sess)
        batch_x = batch_x.reshape((batch_size, 786432))
        batch_y = batch_y.T
        sess.run(optimizer, feed_dict={x: batch_x, y: batch_y, keep_prob: dropout})
        if step % display_step == 2:
            # Compute and print loss and accuracy on the current batch.
            loss, acc = sess.run([cost, accuracy], feed_dict={x: batch_x, y: batch_y, keep_prob: 1.})
            print("Iter"+str(step*batch_size)+",Minibatch Loss="+ "{:.6f}".format(loss)+", Training Acc"+ "{:.5f}".format(acc))
        step += 1
    print("Optimization Finished!")
    coord.request_stop()
    coord.join(threads)  # join the threads that were feeding batches
feed_dict字典读取数据的时候不能是tensor类型,必须是list,numpy类型(还有一个忘了),所以在送入batch数据的时候加入了.eval(session=sess),当初这块也是磨了很久。希望以后不再犯错