TensorFlow 2.0 Basics (6): Model Training

Model Training

There are three ways to train a model: the built-in fit method, the built-in train_on_batch method, and a custom training loop.
Note: the fit_generator method is not recommended in tf.keras; its functionality is already covered by fit.

import numpy as np 
import pandas as pd 
import tensorflow as tf
from tensorflow.keras import datasets, layers, losses, metrics, models, optimizers, preprocessing

# Print a timestamped divider line
@tf.function
def printbar():
    ts = tf.timestamp()
    today_ts = ts % (24*60*60)

    # +8 converts UTC to Beijing time (UTC+8)
    hour = tf.cast(today_ts//3600+8, tf.int32) % tf.constant(24)
    minute = tf.cast((today_ts % 3600)//60, tf.int32)
    second = tf.cast(tf.floor(today_ts % 60), tf.int32)

    # Zero-pad single-digit fields
    def timeformat(m):
        if tf.strings.length(tf.strings.format("{}", m)) == 1:
            return tf.strings.format("0{}", m)
        else:
            return tf.strings.format("{}", m)

    timestring = tf.strings.join([timeformat(hour), timeformat(minute),
                timeformat(second)], separator=":")
    tf.print("=========="*8, end="")
    tf.print(timestring)

MAX_LEN = 300
BATCH_SIZE = 32
(x_train,y_train),(x_test,y_test) = datasets.reuters.load_data()
x_train = preprocessing.sequence.pad_sequences(x_train,maxlen=MAX_LEN)
x_test = preprocessing.sequence.pad_sequences(x_test,maxlen=MAX_LEN)

MAX_WORDS = x_train.max()+1
CAT_NUM = y_train.max()+1

ds_train = tf.data.Dataset.from_tensor_slices((x_train,y_train)) \
          .shuffle(buffer_size = 1000).batch(BATCH_SIZE) \
          .prefetch(tf.data.experimental.AUTOTUNE).cache()
   
ds_test = tf.data.Dataset.from_tensor_slices((x_test,y_test)) \
          .shuffle(buffer_size = 1000).batch(BATCH_SIZE) \
          .prefetch(tf.data.experimental.AUTOTUNE).cache()
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/reuters.npz
2113536/2110848 [==============================] - 3s 1us/step
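As an aside, the conventional ordering of tf.data transformations places cache() before shuffle(), with prefetch() last, so that cached elements are re-shuffled every epoch and prefetching overlaps input preparation with training. A minimal sketch on toy arrays (the shapes here are illustrative, not the Reuters data):

```python
import numpy as np
import tensorflow as tf

# Toy data standing in for the padded sequences.
x = np.random.randint(0, 100, size=(64, 10))
y = np.random.randint(0, 5, size=(64,))

# cache() before shuffle() so each epoch re-shuffles the cached
# elements; prefetch() last so input prep overlaps with training.
ds = (tf.data.Dataset.from_tensor_slices((x, y))
      .cache()
      .shuffle(buffer_size=64)
      .batch(16)
      .prefetch(tf.data.experimental.AUTOTUNE))

for xb, yb in ds:
    print(xb.shape, yb.shape)  # each batch: (16, 10) and (16,)
```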

1. The built-in fit method

The fit method is very powerful: it supports training on numpy arrays, tf.data.Dataset objects, and Python generators.

It also supports callbacks, which allow complex control logic over the training process.
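For instance, a callback can stop training once the validation loss stops improving. A minimal sketch on tiny random data (the model and data below are purely illustrative):

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import callbacks, layers, models

# Tiny random data, purely for illustration.
x = np.random.random((100, 8)).astype("float32")
y = np.random.randint(0, 2, size=(100, 1)).astype("float32")

model = models.Sequential([
    layers.Dense(16, activation="relu", input_shape=(8,)),
    layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")

# Stop when val_loss has not improved for 2 consecutive epochs,
# and roll back to the best weights seen so far.
early_stop = callbacks.EarlyStopping(monitor="val_loss", patience=2,
                                     restore_best_weights=True)

history = model.fit(x, y, validation_split=0.2, epochs=20,
                    verbose=0, callbacks=[early_stop])
```

With random labels the stopping epoch varies, but training never exceeds the 20-epoch budget and may end much earlier.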

tf.keras.backend.clear_session()
def create_model():
    
    model = models.Sequential()
    model.add(layers.Embedding(MAX_WORDS,7,input_length=MAX_LEN))
    model.add(layers.Conv1D(filters = 64,kernel_size = 5,activation = "relu"))
    model.add(layers.MaxPool1D(2))
    model.add(layers.Conv1D(filters = 32,kernel_size = 3,activation = "relu"))
    model.add(layers.MaxPool1D(2))
    model.add(layers.Flatten())
    model.add(layers.Dense(CAT_NUM,activation = "softmax"))
    return(model)

def compile_model(model):
    model.compile(optimizer=optimizers.Nadam(),
                loss=losses.SparseCategoricalCrossentropy(),
                metrics=[metrics.SparseCategoricalAccuracy(),metrics.SparseTopKCategoricalAccuracy(5)]) 
    return(model)
 
model = create_model()
model.summary()
model = compile_model(model)
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
embedding (Embedding)        (None, 300, 7)            216874    
_________________________________________________________________
conv1d (Conv1D)              (None, 296, 64)           2304      
_________________________________________________________________
max_pooling1d (MaxPooling1D) (None, 148, 64)           0         
_________________________________________________________________
conv1d_1 (Conv1D)            (None, 146, 32)           6176      
_________________________________________________________________
max_pooling1d_1 (MaxPooling1 (None, 73, 32)            0         
_________________________________________________________________
flatten (Flatten)            (None, 2336)              0         
_________________________________________________________________
dense (Dense)                (None, 46)                107502    
=================================================================
Total params: 332,856
Trainable params: 332,856
Non-trainable params: 0
_________________________________________________________________

Both ds_train and ds_test can also use generators to pass data in batches, which relieves memory pressure.
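A minimal sketch of such a batch generator (the array shapes mirror the padded data above but are filled with dummy values; the fit call in the comment is indicative only):

```python
import numpy as np

def batch_generator(x, y, batch_size):
    """Yield (x_batch, y_batch) pairs endlessly, reshuffling on each
    pass, so the full arrays are never duplicated in memory."""
    n = len(x)
    while True:
        idx = np.random.permutation(n)
        for start in range(0, n, batch_size):
            take = idx[start:start + batch_size]
            yield x[take], y[take]

# Dummy arrays standing in for the padded Reuters data.
x = np.zeros((100, 300), dtype="int32")
y = np.zeros((100,), dtype="int32")
gen = batch_generator(x, y, batch_size=32)

# fit can consume the generator directly, e.g.:
# model.fit(gen, steps_per_epoch=len(x) // 32, epochs=3)
xb, yb = next(gen)
```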

history = model.fit(ds_train,validation_data = ds_test,epochs = 10)
Train for 281 steps, validate for 71 steps
Epoch 1/10
281/281 [==============================] - 11s 39ms/step - loss: 2.0054 - sparse_categorical_accuracy: 0.4683 - sparse_top_k_categorical_accuracy: 0.7443 - val_loss: 1.6604 - val_sparse_categorical_accuracy: 0.5744 - val_sparse_top_k_categorical_accuracy: 0.7569
Epoch 2/10
281/281 [==============================] - 8s 30ms/step - loss: 1.4802 - sparse_categorical_accuracy: 0.6167 - sparse_top_k_categorical_accuracy: 0.7947 - val_loss: 1.5262 - val_sparse_categorical_accuracy: 0.6242 - val_sparse_top_k_categorical_accuracy: 0.7907
Epoch 3/10
281/281 [==============================] - 10s 34ms/step - loss: 1.2000 - sparse_categorical_accuracy: 0.6898 - sparse_top_k_categorical_accuracy: 0.8485 - val_loss: 1.5431 - val_sparse_categorical_accuracy: 0.6367 - val_sparse_top_k_categorical_accuracy: 0.8032
Epoch 4/10
281/281 [==============================] - 8s 30ms/step - loss: 0.9274 - sparse_categorical_accuracy: 0.7626 - sparse_top_k_categorical_accuracy: 0.9049 - val_loss: 1.7144 - val_sparse_categorical_accuracy: 0.6300 - val_sparse_top_k_categorical_accuracy: 0.8010
Epoch 5/10
281/281 [==============================] - 10s 34ms/step - loss: 0.6881 - sparse_categorical_accuracy: 0.8241 - sparse_top_k_categorical_accuracy: 0.9463 - val_loss: 1.9174 - val_sparse_categorical_accuracy: 0.6247 - val_sparse_top_k_categorical_accuracy: 0.7983
Epoch 6/10
281/281 [==============================] - 9s 33ms/step - loss: 0.5167 - sparse_categorical_accuracy: 0.8753 - sparse_top_k_categorical_accuracy: 0.9687 - val_loss: 2.0889 - val_sparse_categorical_accuracy: 0.6291 - val_sparse_top_k_categorical_accuracy: 0.8001
Epoch 7/10
281/281 [==============================] - 10s 35ms/step - loss: 0.4075 - sparse_categorical_accuracy: 0.9044 - sparse_top_k_categorical_accuracy: 0.9800 - val_loss: 2.2479 - val_sparse_categorical_accuracy: 0.6278 - val_sparse_top_k_categorical_accuracy: 0.8037
Epoch 8/10
281/281 [==============================] - 10s 35ms/step - loss: 0.3367 - sparse_categorical_accuracy: 0.9194 - sparse_top_k_categorical_accuracy: 0.9869 - val_loss: 2.4076 - val_sparse_categorical_accuracy: 0.6193 - val_sparse_top_k_categorical_accuracy: 0.8010
Epoch 9/10
281/281 [==============================] - 10s 35ms/step - loss: 0.2888 - sparse_categorical_accuracy: 0.9308 - sparse_top_k_categorical_accuracy: 0.9910 - val_loss: 2.5644 - val_sparse_categorical_accuracy: 0.6180 - val_sparse_top_k_categorical_accuracy: 0.7988
Epoch 10/10
281/281 [==============================] - 8s 29ms/step - loss: 0.2543 - sparse_categorical_accuracy: 0.9361 - sparse_top_k_categorical_accuracy: 0.9935 - val_loss: 2.7273 - val_sparse_categorical_accuracy: 0.6171 - val_sparse_top_k_categorical_accuracy: 0.7970

2. The built-in train_on_batch method

Compared with fit, this built-in method is more flexible: it gives finer-grained control over training at the batch level, without going through callbacks.

tf.keras.backend.clear_session()

def create_model():
    model = models.Sequential()

    model.add(layers.Embedding(MAX_WORDS,7,input_length=MAX_LEN))
    model.add(layers.Conv1D(filters = 64,kernel_size = 5,activation = "relu"))
    model.add(layers.MaxPool1D(2))
    model.add(layers.Conv1D(filters = 32,kernel_size = 3,activation = "relu"))
    model.add(layers.MaxPool1D(2))
    model.add(layers.Flatten())
    model.add(layers.Dense(CAT_NUM,activation = "softmax"))
    return(model)

def compile_model(model):
    model.compile(optimizer=optimizers.Nadam(),
                loss=losses.SparseCategoricalCrossentropy(),
                metrics=[metrics.SparseCategoricalAccuracy(),metrics.SparseTopKCategoricalAccuracy(5)]) 
    return(model)
 
model = create_model()
model.summary()
model = compile_model(model)
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
embedding (Embedding)        (None, 300, 7)            216874    
_________________________________________________________________
conv1d (Conv1D)              (None, 296, 64)           2304      
_________________________________________________________________
max_pooling1d (MaxPooling1D) (None, 148, 64)           0         
_________________________________________________________________
conv1d_1 (Conv1D)            (None, 146, 32)           6176      
_________________________________________________________________
max_pooling1d_1 (MaxPooling1 (None, 73, 32)            0         
_________________________________________________________________
flatten (Flatten)            (None, 2336)              0         
_________________________________________________________________
dense (Dense)                (None, 46)                107502    
=================================================================
Total params: 332,856
Trainable params: 332,856
Non-trainable params: 0
_________________________________________________________________

def train_model(model, ds_train, ds_valid, epochs):
    for epoch in tf.range(1, epochs+1):
        model.reset_metrics()

        # Halve the learning rate in later epochs
        if epoch == 5:
            model.optimizer.lr.assign(model.optimizer.lr/2.0)
            tf.print("Lowering optimizer Learning Rate...\n\n")

        for x, y in ds_train:
            train_result = model.train_on_batch(x, y)

        for x, y in ds_valid:
            valid_result = model.test_on_batch(x, y, reset_metrics=False)

        if epoch % 1 == 0:
            printbar()
            tf.print("epoch = ", epoch)
            print("train:", dict(zip(model.metrics_names, train_result)))
            print("valid:", dict(zip(model.metrics_names, valid_result)))
            print("")

train_model(model, ds_train, ds_test, 10)
================================================================================16:44:50
epoch =  1
train: {'loss': 1.7782589, 'sparse_categorical_accuracy': 0.54545456, 'sparse_top_k_categorical_accuracy': 0.6818182}
valid: {'loss': 2.1192057, 'sparse_categorical_accuracy': 0.5507569, 'sparse_top_k_categorical_accuracy': 0.75779164}

================================================================================16:44:57
epoch =  2
train: {'loss': 1.4705807, 'sparse_categorical_accuracy': 0.59090906, 'sparse_top_k_categorical_accuracy': 0.72727275}
valid: {'loss': 1.6258131, 'sparse_categorical_accuracy': 0.6046305, 'sparse_top_k_categorical_accuracy': 0.7871772}

================================================================================16:45:04
epoch =  3
train: {'loss': 1.088266, 'sparse_categorical_accuracy': 0.72727275, 'sparse_top_k_categorical_accuracy': 0.8181818}
valid: {'loss': 1.3976628, 'sparse_categorical_accuracy': 0.6451469, 'sparse_top_k_categorical_accuracy': 0.8134461}

================================================================================16:45:12
epoch =  4
train: {'loss': 0.71706825, 'sparse_categorical_accuracy': 0.77272725, 'sparse_top_k_categorical_accuracy': 0.95454544}
valid: {'loss': 1.4577352, 'sparse_categorical_accuracy': 0.6460374, 'sparse_top_k_categorical_accuracy': 0.81745327}

Lowering optimizer Learning Rate...


================================================================================16:45:19
epoch =  5
train: {'loss': 0.45584556, 'sparse_categorical_accuracy': 0.8636364, 'sparse_top_k_categorical_accuracy': 0.95454544}
valid: {'loss': 1.5735245, 'sparse_categorical_accuracy': 0.65138024, 'sparse_top_k_categorical_accuracy': 0.8165628}

================================================================================16:45:26
epoch =  6
train: {'loss': 0.360793, 'sparse_categorical_accuracy': 0.95454544, 'sparse_top_k_categorical_accuracy': 1.0}
valid: {'loss': 1.673664, 'sparse_categorical_accuracy': 0.650935, 'sparse_top_k_categorical_accuracy': 0.81478184}

================================================================================16:45:33
epoch =  7
train: {'loss': 0.29006183, 'sparse_categorical_accuracy': 0.95454544, 'sparse_top_k_categorical_accuracy': 1.0}
valid: {'loss': 1.7496933, 'sparse_categorical_accuracy': 0.6438112, 'sparse_top_k_categorical_accuracy': 0.817008}

================================================================================16:45:40
epoch =  8
train: {'loss': 0.23076382, 'sparse_categorical_accuracy': 0.95454544, 'sparse_top_k_categorical_accuracy': 1.0}
valid: {'loss': 1.8100835, 'sparse_categorical_accuracy': 0.64069456, 'sparse_top_k_categorical_accuracy': 0.81745327}

================================================================================16:45:48
epoch =  9
train: {'loss': 0.18231665, 'sparse_categorical_accuracy': 1.0, 'sparse_top_k_categorical_accuracy': 1.0}
valid: {'loss': 1.8627172, 'sparse_categorical_accuracy': 0.6398041, 'sparse_top_k_categorical_accuracy': 0.8161175}

================================================================================16:45:55
epoch =  10
train: {'loss': 0.14538175, 'sparse_categorical_accuracy': 1.0, 'sparse_top_k_categorical_accuracy': 1.0}
valid: {'loss': 1.9056457, 'sparse_categorical_accuracy': 0.6375779, 'sparse_top_k_categorical_accuracy': 0.8156723}
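The batch-level control that train_on_batch affords can go beyond the per-epoch learning-rate drop above: for example, training can be aborted as soon as the batch loss drops below a threshold, with no callback machinery. A sketch on a trivial model (the data, threshold, and step budget are illustrative):

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

# Trivial separable data, purely for illustration.
x = np.random.random((64, 4)).astype("float32")
y = (x.sum(axis=1, keepdims=True) > 2).astype("float32")

model = models.Sequential([
    layers.Dense(1, activation="sigmoid", input_shape=(4,)),
])
model.compile(optimizer="adam", loss="binary_crossentropy")

LOSS_TARGET = 0.05  # illustrative threshold
stopped_early = False
for step in range(50):
    loss = model.train_on_batch(x, y)  # scalar loss (only loss compiled)
    if loss < LOSS_TARGET:  # batch-level decision, no callback needed
        stopped_early = True
        break
```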

3. Custom training loop

A custom training loop does not require compiling the model; it uses the optimizer directly to update the parameters by backpropagating the loss, which gives the greatest flexibility.

tf.keras.backend.clear_session()

def create_model():
    
    model = models.Sequential()

    model.add(layers.Embedding(MAX_WORDS,7,input_length=MAX_LEN))
    model.add(layers.Conv1D(filters = 64,kernel_size = 5,activation = "relu"))
    model.add(layers.MaxPool1D(2))
    model.add(layers.Conv1D(filters = 32,kernel_size = 3,activation = "relu"))
    model.add(layers.MaxPool1D(2))
    model.add(layers.Flatten())
    model.add(layers.Dense(CAT_NUM,activation = "softmax"))
    return(model)

model = create_model()
model.summary()
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
embedding (Embedding)        (None, 300, 7)            216874    
_________________________________________________________________
conv1d (Conv1D)              (None, 296, 64)           2304      
_________________________________________________________________
max_pooling1d (MaxPooling1D) (None, 148, 64)           0         
_________________________________________________________________
conv1d_1 (Conv1D)            (None, 146, 32)           6176      
_________________________________________________________________
max_pooling1d_1 (MaxPooling1 (None, 73, 32)            0         
_________________________________________________________________
flatten (Flatten)            (None, 2336)              0         
_________________________________________________________________
dense (Dense)                (None, 46)                107502    
=================================================================
Total params: 332,856
Trainable params: 332,856
Non-trainable params: 0
_________________________________________________________________

optimizer = optimizers.Nadam()
loss_func = losses.SparseCategoricalCrossentropy()

train_loss = metrics.Mean(name='train_loss')
train_metric = metrics.SparseCategoricalAccuracy(name='train_accuracy')

valid_loss = metrics.Mean(name='valid_loss')
valid_metric = metrics.SparseCategoricalAccuracy(name='valid_accuracy')

@tf.function
def train_step(model, features, labels):
    with tf.GradientTape() as tape:
        predictions = model(features,training = True)
        loss = loss_func(labels, predictions)
    gradients = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(gradients, model.trainable_variables))

    train_loss.update_state(loss)
    train_metric.update_state(labels, predictions)
    

@tf.function
def valid_step(model, features, labels):
    predictions = model(features, training=False)
    batch_loss = loss_func(labels, predictions)
    valid_loss.update_state(batch_loss)
    valid_metric.update_state(labels, predictions)
    

def train_model(model,ds_train,ds_valid,epochs):
    for epoch in tf.range(1,epochs+1):
        
        for features, labels in ds_train:
            train_step(model,features,labels)

        for features, labels in ds_valid:
            valid_step(model,features,labels)

        logs = 'Epoch={},Loss:{},Accuracy:{},Valid Loss:{},Valid Accuracy:{}'
        
        if epoch%1 ==0:
            printbar()
            tf.print(tf.strings.format(logs,
            (epoch,train_loss.result(),train_metric.result(),valid_loss.result(),valid_metric.result())))
            tf.print("")
            
        train_loss.reset_states()
        valid_loss.reset_states()
        train_metric.reset_states()
        valid_metric.reset_states()

train_model(model,ds_train,ds_test,10)
================================================================================16:46:18
Epoch=1,Loss:2.00627327,Accuracy:0.46504119,Valid Loss:1.69869936,Valid Accuracy:0.553873539

================================================================================16:46:25
Epoch=2,Loss:1.46558726,Accuracy:0.621131122,Valid Loss:1.55003631,Valid Accuracy:0.609973311

================================================================================16:46:32
Epoch=3,Loss:1.18542743,Accuracy:0.688376725,Valid Loss:1.56454027,Valid Accuracy:0.646482646

================================================================================16:46:39
Epoch=4,Loss:0.912428439,Accuracy:0.761523068,Valid Loss:1.75646842,Valid Accuracy:0.646927893

================================================================================16:46:46
Epoch=5,Loss:0.672883391,Accuracy:0.82654196,Valid Loss:2.02260804,Valid Accuracy:0.636687458

================================================================================16:46:53
Epoch=6,Loss:0.513099134,Accuracy:0.875417531,Valid Loss:2.30667543,Valid Accuracy:0.626447

================================================================================16:47:00
Epoch=7,Loss:0.412734926,Accuracy:0.90236026,Valid Loss:2.55101776,Valid Accuracy:0.624220848

================================================================================16:47:07
Epoch=8,Loss:0.344970435,Accuracy:0.917279,Valid Loss:2.73618364,Valid Accuracy:0.621549428

================================================================================16:47:14
Epoch=9,Loss:0.299876332,Accuracy:0.927521706,Valid Loss:2.86248517,Valid Accuracy:0.621549428

================================================================================16:47:21
Epoch=10,Loss:0.267550141,Accuracy:0.934313059,Valid Loss:2.94415259,Valid Accuracy:0.623330355
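The same GradientTape pattern in miniature, fitting a linear model to synthetic data (the data, learning rate, and step count are illustrative choices, not from the text above):

```python
import numpy as np
import tensorflow as tf

# Synthetic regression data: y = 3x + 2 plus small noise.
x = np.random.random((256, 1)).astype("float32")
y = (3.0 * x + 2.0 + 0.01 * np.random.randn(256, 1)).astype("float32")

w = tf.Variable(0.0)
b = tf.Variable(0.0)
optimizer = tf.keras.optimizers.SGD(learning_rate=0.1)

@tf.function
def train_step(xb, yb):
    # Record the forward pass, then backpropagate through it.
    with tf.GradientTape() as tape:
        pred = w * xb + b
        loss = tf.reduce_mean(tf.square(pred - yb))
    grads = tape.gradient(loss, [w, b])
    optimizer.apply_gradients(zip(grads, [w, b]))
    return loss

for _ in range(500):
    loss = train_step(x, y)
# w and b should now be close to 3 and 2 respectively.
```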
