A general training approach in TensorFlow 2 (MNIST as an example)

Introduction

TensorFlow comes with a class of highly formulaic training methods built around Keras. I don't deny Keras's convenience, but when you need to define a more complex training procedure, some of Keras's features feel, to me, more or less ill-suited to the job.

The usual Keras workflow is: define the model, configure it (model.compile), train it (model.fit), and summarize its parameters (model.summary). Configuration, training, and the parameter summary are thoroughly formulaic. That is fine for an ordinary classification task, but for something like training a GAN it falls apart. More importantly, these steps weaken our understanding of the whole training process: following an example, we only have to change the arguments passed to model.compile, model.fit, and model.summary and training just runs. See my other blog post: my blog.
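
For reference, the formulaic Keras workflow looks roughly like this (a minimal sketch on the built-in MNIST dataset, just to show the compile/fit/summary pattern; the architecture and hyperparameters here are placeholders, not the model used later in this post):

import tensorflow as tf

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),   # 28x28 image -> 784 vector
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax')  # 10 digit classes
])

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(x_train, y_train, epochs=5, validation_data=(x_test, y_test))
model.summary()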

So we want a more general way of writing training code, and that is the task of this article. We will again use MNIST training as the example and walk through it.

My program

The program here is best read side by side with the blog post linked above.
First, as usual, import the relevant packages.

'''
@Author : gaoxin
@Date : 2020.08.26
@Version : 1.0
'''

import tensorflow as tf
from PIL import Image
import numpy as np
from matplotlib import pyplot as plt
import os
from tensorflow.keras.layers import Conv2D, BatchNormalization, Activation, MaxPool2D, Dropout, Flatten, Dense
from tensorflow.keras import Model
import warnings

os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
warnings.filterwarnings("ignore")

np.set_printoptions(threshold = np.inf)

The preprocessing is the same as before and is left unchanged.

train_path = './mnist_image_label/mnist_train_jpg_60000/'
train_txt = './mnist_image_label/mnist_train_jpg_60000.txt'
x_train_savepath = './mnist_image_label/mnist_x_train.npy'
y_train_savepath = './mnist_image_label/mnist_y_train.npy'

test_path = './mnist_image_label/mnist_test_jpg_10000/'
test_txt = './mnist_image_label/mnist_test_jpg_10000.txt'
x_test_savepath = './mnist_image_label/mnist_x_test.npy'
y_test_savepath = './mnist_image_label/mnist_y_test.npy'


def generateds(path, txt):
    f = open(txt, 'r')
    contents = f.readlines()
    f.close() 
    x, y_ = [], []
    for content in contents: 
        value = content.split() 
        img_path = path + value[0]
        img = Image.open(img_path) 
        img = np.array(img.convert('L'))
        img = img / 255. 
        x.append(img) 
        y_.append(value[1]) 
        print('loading : ' + content)  
    x = np.array(x)  
    y_ = np.array(y_)  
    y_ = y_.astype(np.int64)  
    return x, y_  



if os.path.exists(x_train_savepath) and os.path.exists(y_train_savepath) and os.path.exists(
        x_test_savepath) and os.path.exists(y_test_savepath):
    print('-------------Load Datasets-----------------')
    x_train_save = np.load(x_train_savepath)
    y_train = np.load(y_train_savepath)
    x_test_save = np.load(x_test_savepath)
    y_test = np.load(y_test_savepath)

    x_train = np.reshape(x_train_save, (len(x_train_save), 28, 28, 1))
    x_test = np.reshape(x_test_save, (len(x_test_save), 28, 28, 1))
else:
    print('-------------Generate Datasets-----------------')
    x_train, y_train = generateds(train_path, train_txt)
    x_test, y_test = generateds(test_path, test_txt)

    print('-------------Save Datasets-----------------')
    x_train_save = np.reshape(x_train, (len(x_train), -1))
    x_test_save = np.reshape(x_test, (len(x_test), -1))
    np.save(x_train_savepath, x_train_save)
    np.save(y_train_savepath, y_train)
    np.save(x_test_savepath, x_test_save)
    np.save(y_test_savepath, y_test)

Next comes the model definition, also the same as before.

class Baseline(Model) :
    def __init__(self):
        super(Baseline, self).__init__()
        self.c1 = Conv2D(filters=6, kernel_size=(5, 5), padding='same')  # convolutional layer
        self.b1 = BatchNormalization()  # batch-normalization layer
        self.a1 = Activation('relu')  # activation layer
        self.p1 = MaxPool2D(pool_size=(2, 2), strides=2, padding='same')  # pooling layer
        self.d1 = Dropout(0.2)  # dropout layer

        self.flatten = Flatten()
        self.f1 = Dense(128, activation='relu')
        self.d2 = Dropout(0.2)
        self.f2 = Dense(10, activation='softmax')

    @tf.function
    def call(self, x):
        x = self.c1(x)
        x = self.b1(x)
        x = self.a1(x)
        x = self.p1(x)
        x = self.d1(x)

        x = self.flatten(x)
        x = self.f1(x)
        x = self.d2(x)
        y = self.f2(x)

        return y
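
As a quick sanity check on the architecture, here is how the tensor shape evolves through the network (assuming 28x28 grayscale input, as in the preprocessing above):

# input:                           (batch, 28, 28, 1)
# c1: 5x5 conv, 6 filters, same -> (batch, 28, 28, 6)
# p1: 2x2 max pool, stride 2    -> (batch, 14, 14, 6)
# flatten                       -> (batch, 1176)
# f1: Dense(128)                -> (batch, 128)
# f2: Dense(10), softmax        -> (batch, 10)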

Then the loss function, using mean squared error. (MSE works here, though cross-entropy, as in the official example at the end of this post, is the more usual choice for classification.)
inputs_y here is one batch of labels: each element is a digit (0-9) and the batch has shape [batch_size,], while prediction has shape [batch_size, 10] (each row is a probability distribution over the 10 classes). So inputs_y must be expanded to shape [batch_size, 10]: build an all-zero array and set the entry at each label's position to 1 (the ideal prediction).

def loss(model , inputs_x , inputs_y) :
    prediction = model(inputs_x)
    y = np.zeros(prediction.shape)
    for i in range(len(inputs_y)) :
        y[i,inputs_y[i]] = 1
    #print(y)
    loss = tf.reduce_mean(tf.square(y-prediction))
    return loss
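
As an aside, the one-hot conversion above can be written in a single call with tf.one_hot, avoiding the Python loop (a small sketch, assuming integer labels in 0-9):

labels = tf.constant([3, 0, 7])         # a toy batch of integer labels
one_hot = tf.one_hot(labels, depth=10)  # shape [3, 10], dtype float32
# the MSE loss above then becomes tf.reduce_mean(tf.square(one_hot - prediction))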
    

Next, a check on the test set to measure accuracy.

def accuracy(model ,x , y) :
    acc = 0
    prediction = model(x)
    y_ = tf.argmax(prediction,1)
    num = len(y_)
    for i in range(num) :
        if y[i] == y_[i] : acc = acc+1
    return acc/num
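
The Python loop here works but is slow over 10000 test images; the comparison can be vectorized with standard TensorFlow ops (a sketch that should be equivalent, assuming y holds integer labels):

def accuracy_vectorized(model, x, y):
    prediction = model(x)
    y_pred = tf.argmax(prediction, axis=1)  # predicted class per row, int64
    correct = tf.cast(tf.equal(y_pred, tf.cast(y, tf.int64)), tf.float32)
    return float(tf.reduce_mean(correct))   # fraction of correct predictions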

Because training feeds the model data batch by batch, we also need something that serves up successive batches; for convenience I wrote it as a small class.

class dataset() :
    def __init__(self,train_x,train_y) :
        self.counter = 0
        self.train_x = train_x
        self.train_y = train_y

    def sample(self, batch_size) :
        batch_x = self.train_x[self.counter:self.counter+batch_size]
        batch_y = self.train_y[self.counter:self.counter+batch_size]
        self.counter = self.counter + batch_size
        if self.counter > self.train_y.shape[0] :
            self.counter = 0
        return batch_x , batch_y
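
For what it's worth, the idiomatic replacement for this hand-rolled class is tf.data.Dataset, which also handles shuffling; the official example at the end of this post uses exactly this pattern:

train_ds = tf.data.Dataset.from_tensor_slices((x_train, y_train)).shuffle(10000).batch(32)

for batch_x, batch_y in train_ds:  # one full iteration = one epoch
    ...                            # run one training step per batch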

Then the main part of this post: define a training function.

def training() :

    model = Baseline()
    model.build(input_shape = (None , 28 , 28 , 1))
    optimizer = tf.keras.optimizers.Adam()  # learning_rate defaults to 0.001
    batch_size = 32
    batch_num = y_train.shape[0] // batch_size
    data = dataset(x_train,y_train)

    summary_writer = tf.summary.create_file_writer('./tensorboard')
    tf.summary.trace_on(graph=True, profiler=True)  # enable tracing to record the graph structure and profiling info

    checkpoint_save_path = "./checkpoint/mnist.ckpt"
    if os.path.exists(checkpoint_save_path + '.index'):
        print('-------------load the model-----------------')
        model.load_weights(checkpoint_save_path)

    for epoch in range(1) :

        for num in range(batch_num) :

            inputs_x, inputs_y = data.sample(batch_size) 

            with tf.GradientTape() as tape:
                y_loss = loss(model , inputs_x , inputs_y)

            grads = tape.gradient(y_loss, model.trainable_variables)
            optimizer.apply_gradients(zip(grads, model.trainable_variables))


            if num % 600 == 0:
                acc = accuracy(model , x_test , y_test)
                print(num, 'y-loss:',float(y_loss), 'accuracy', float(acc))


            with summary_writer.as_default():  # the writer to log with
                tf.summary.scalar("y_loss", y_loss, step=num)

        print('#################################################')

    with summary_writer.as_default(): 
        tf.summary.trace_export(name="model_trace", step=0, profiler_outdir='./tensorboard')

    model.save_weights(checkpoint_save_path)

Here an epoch is one full pass of training over the dataset; I ran only one. A batch is the group of samples fed to the model in one step; here a batch is 32 samples, so one epoch is 60000 // 32 = 1875 batches, and the progress printout below fires at batches 0, 600, 1200, and 1800.

The middle part of this function:

            with tf.GradientTape() as tape:
                y_loss = loss(model , inputs_x , inputs_y)

            grads = tape.gradient(y_loss, model.trainable_variables)
            optimizer.apply_gradients(zip(grads, model.trainable_variables))

uses TensorFlow's automatic differentiation and applies the resulting gradients with the Adam optimizer defined earlier.
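
To see the tape mechanism in isolation: the tape records operations during the forward pass, and tape.gradient replays them backwards. The smallest possible example (dy/dx of y = x*x at x = 3 is 6):

x = tf.Variable(3.0)
with tf.GradientTape() as tape:
    y = x * x
print(tape.gradient(y, x))  # tf.Tensor(6.0, shape=(), dtype=float32)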

The following part is there to show training progress and the key numbers.

            if num % 600 == 0:
                acc = accuracy(model , x_test , y_test)
                print(num, 'y-loss:',float(y_loss), 'accuracy', float(acc))

And this part does the logging, so the convergence of training can be inspected in TensorBoard.

            with summary_writer.as_default():  # the writer to log with
                tf.summary.scalar("y_loss", y_loss, step=num)

To watch training converge, enter this on the command line:

tensorboard --logdir=./tensorboard

then open the URL the command prints (typically http://localhost:6006) in a browser.

A screenshot from a training run: [screenshot omitted]
Finally, here is my complete code. I kept things simple to make comparison with the previous post easy, so the overall structure is admittedly rough; go easy on me. The dataset is also the one from the previous post.

'''
@Author : gaoxin
@Date : 2020.08.26
@Version : 1.0
'''


import tensorflow as tf
from PIL import Image
import numpy as np
from matplotlib import pyplot as plt
import os
from tensorflow.keras.layers import Conv2D, BatchNormalization, Activation, MaxPool2D, Dropout, Flatten, Dense
from tensorflow.keras import Model
import warnings

os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
warnings.filterwarnings("ignore")

np.set_printoptions(threshold = np.inf)


train_path = './mnist_image_label/mnist_train_jpg_60000/'
train_txt = './mnist_image_label/mnist_train_jpg_60000.txt'
x_train_savepath = './mnist_image_label/mnist_x_train.npy'
y_train_savepath = './mnist_image_label/mnist_y_train.npy'

test_path = './mnist_image_label/mnist_test_jpg_10000/'
test_txt = './mnist_image_label/mnist_test_jpg_10000.txt'
x_test_savepath = './mnist_image_label/mnist_x_test.npy'
y_test_savepath = './mnist_image_label/mnist_y_test.npy'


def generateds(path, txt):
    f = open(txt, 'r')
    contents = f.readlines()
    f.close() 
    x, y_ = [], []
    for content in contents: 
        value = content.split() 
        img_path = path + value[0]
        img = Image.open(img_path) 
        img = np.array(img.convert('L'))
        img = img / 255. 
        x.append(img) 
        y_.append(value[1]) 
        print('loading : ' + content)  
    x = np.array(x)  
    y_ = np.array(y_)  
    y_ = y_.astype(np.int64)  
    return x, y_  



if os.path.exists(x_train_savepath) and os.path.exists(y_train_savepath) and os.path.exists(
        x_test_savepath) and os.path.exists(y_test_savepath):
    print('-------------Load Datasets-----------------')
    x_train_save = np.load(x_train_savepath)
    y_train = np.load(y_train_savepath)
    x_test_save = np.load(x_test_savepath)
    y_test = np.load(y_test_savepath)

    x_train = np.reshape(x_train_save, (len(x_train_save), 28, 28, 1))
    x_test = np.reshape(x_test_save, (len(x_test_save), 28, 28, 1))
else:
    print('-------------Generate Datasets-----------------')
    x_train, y_train = generateds(train_path, train_txt)
    x_test, y_test = generateds(test_path, test_txt)

    print('-------------Save Datasets-----------------')
    x_train_save = np.reshape(x_train, (len(x_train), -1))
    x_test_save = np.reshape(x_test, (len(x_test), -1))
    np.save(x_train_savepath, x_train_save)
    np.save(y_train_savepath, y_train)
    np.save(x_test_savepath, x_test_save)
    np.save(y_test_savepath, y_test)



class Baseline(Model) :
    def __init__(self):
        super(Baseline, self).__init__()
        self.c1 = Conv2D(filters=6, kernel_size=(5, 5), padding='same')  # convolutional layer
        self.b1 = BatchNormalization()  # batch-normalization layer
        self.a1 = Activation('relu')  # activation layer
        self.p1 = MaxPool2D(pool_size=(2, 2), strides=2, padding='same')  # pooling layer
        self.d1 = Dropout(0.2)  # dropout layer

        self.flatten = Flatten()
        self.f1 = Dense(128, activation='relu')
        self.d2 = Dropout(0.2)
        self.f2 = Dense(10, activation='softmax')

    @tf.function
    def call(self, x):
        x = self.c1(x)
        x = self.b1(x)
        x = self.a1(x)
        x = self.p1(x)
        x = self.d1(x)

        x = self.flatten(x)
        x = self.f1(x)
        x = self.d2(x)
        y = self.f2(x)

        return y


def loss(model , inputs_x , inputs_y) :
    prediction = model(inputs_x)
    y = np.zeros(prediction.shape)
    for i in range(len(inputs_y)) :
        y[i,inputs_y[i]] = 1
    #print(y)
    loss = tf.reduce_mean(tf.square(y-prediction))
    return loss


def accuracy(model ,x , y) :
    acc = 0
    prediction = model(x)
    y_ = tf.argmax(prediction,1)
    num = len(y_)
    for i in range(num) :
        if y[i] == y_[i] : acc = acc+1
    return acc/num


class dataset() :
    def __init__(self,train_x,train_y) :
        self.counter = 0
        self.train_x = train_x
        self.train_y = train_y

    def sample(self, batch_size) :
        batch_x = self.train_x[self.counter:self.counter+batch_size]
        batch_y = self.train_y[self.counter:self.counter+batch_size]
        self.counter = self.counter + batch_size
        if self.counter > self.train_y.shape[0] :
            self.counter = 0
        return batch_x , batch_y


def training() :
    model = Baseline()
    model.build(input_shape = (None , 28 , 28 , 1))
    optimizer = tf.keras.optimizers.Adam()  # learning_rate defaults to 0.001
    batch_size = 32
    batch_num = y_train.shape[0] // batch_size
    data = dataset(x_train,y_train)

    summary_writer = tf.summary.create_file_writer('./tensorboard')
    tf.summary.trace_on(graph=True, profiler=True)  # enable tracing to record the graph structure and profiling info

    checkpoint_save_path = "./checkpoint/mnist.ckpt"
    if os.path.exists(checkpoint_save_path + '.index'):
        print('-------------load the model-----------------')
        model.load_weights(checkpoint_save_path)

    for epoch in range(1) :

        for num in range(batch_num) :

            inputs_x, inputs_y = data.sample(batch_size) 

            with tf.GradientTape() as tape:
                y_loss = loss(model , inputs_x , inputs_y)

            grads = tape.gradient(y_loss, model.trainable_variables)
            optimizer.apply_gradients(zip(grads, model.trainable_variables))


            if num % 600 == 0:
                acc = accuracy(model , x_test , y_test)
                print(num, 'y-loss:',float(y_loss), 'accuracy', float(acc))


            with summary_writer.as_default():  # the writer to log with
                tf.summary.scalar("y_loss", y_loss, step=num)

        print('#################################################')

    with summary_writer.as_default(): 
        tf.summary.trace_export(name="model_trace", step=0, profiler_outdir='./tensorboard')

    model.save_weights(checkpoint_save_path)


if __name__ == '__main__' :
    training()
    

Official example

To close, here is an official MNIST training example.
Official tutorial link: https://tensorflow.google.cn/tutorials/quickstart/advanced

import tensorflow as tf

from tensorflow.keras.layers import Dense, Flatten, Conv2D
from tensorflow.keras import Model

mnist = tf.keras.datasets.mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# Add a channels dimension
x_train = x_train[..., tf.newaxis]
x_test = x_test[..., tf.newaxis]

train_ds = tf.data.Dataset.from_tensor_slices(
    (x_train, y_train)).shuffle(10000).batch(32)
test_ds = tf.data.Dataset.from_tensor_slices((x_test, y_test)).batch(32)

class MyModel(Model):
  def __init__(self):
    super(MyModel, self).__init__()
    self.conv1 = Conv2D(32, 3, activation='relu')
    self.flatten = Flatten()
    self.d1 = Dense(128, activation='relu')
    self.d2 = Dense(10, activation='softmax')

  def call(self, x):
    x = self.conv1(x)
    x = self.flatten(x)
    x = self.d1(x)
    return self.d2(x)

model = MyModel()

loss_object = tf.keras.losses.SparseCategoricalCrossentropy()

optimizer = tf.keras.optimizers.Adam()

train_loss = tf.keras.metrics.Mean(name='train_loss')
train_accuracy = tf.keras.metrics.SparseCategoricalAccuracy(name='train_accuracy')

test_loss = tf.keras.metrics.Mean(name='test_loss')
test_accuracy = tf.keras.metrics.SparseCategoricalAccuracy(name='test_accuracy')

@tf.function
def train_step(images, labels):
  with tf.GradientTape() as tape:
    predictions = model(images)
    loss = loss_object(labels, predictions)
  gradients = tape.gradient(loss, model.trainable_variables)
  optimizer.apply_gradients(zip(gradients, model.trainable_variables))

  train_loss(loss)
  train_accuracy(labels, predictions)

@tf.function
def test_step(images, labels):
  predictions = model(images)
  t_loss = loss_object(labels, predictions)

  test_loss(t_loss)
  test_accuracy(labels, predictions)

EPOCHS = 5

for epoch in range(EPOCHS):
  # Reset the metrics at the start of the next epoch
  train_loss.reset_states()
  train_accuracy.reset_states()
  test_loss.reset_states()
  test_accuracy.reset_states()

  for images, labels in train_ds:
    train_step(images, labels)

  for test_images, test_labels in test_ds:
    test_step(test_images, test_labels)

  template = 'Epoch {}, Loss: {}, Accuracy: {}, Test Loss: {}, Test Accuracy: {}'
  print (template.format(epoch+1,
                         train_loss.result(),
                         train_accuracy.result()*100,
                         test_loss.result(),
                         test_accuracy.result()*100))
                         