Notes on Sequential and Model with Fashion_mnist

魑魅魍魉.

Category: deep Learning

Article link: https://blog.csdn.net/na_fantastic/article/details/102852669


First, the two Keras modes: Sequential and Model

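For simple models and quick experiments, Sequential is enough: it is short and fast to write. For larger models and more complex network structures, use Model, which can cover more requirements. The usual beginner training set is MNIST handwritten digits; here I use Fashion_mnist instead. The data format is the same, the only difference being that the images show clothing rather than handwritten digits.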

The Fashion_mnist dataset

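Keras ships the fashion_mnist dataset in keras.datasets, so it can be loaded directly. It contains 60,000 training images with their labels and 10,000 test images with their labels. After loading, you can visualize a few samples. The labels are integer class indices, not one-hot vectors, so if you want one-hot encoding you have to convert them yourself.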

import keras
import matplotlib.pyplot as plt
fashion_mnist = keras.datasets.fashion_mnist
#60000 training images, 10000 test images, each 28x28, 10 classes, pixel values in 0-255
(images_train, train_label), (images_test, test_label)= fashion_mnist.load_data()
#The dataset stores no class names; classes are identified by index
class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']
print("data shape:",images_train.shape,train_label.shape)
print("class:",train_label)
plt.figure()
plt.imshow(images_train[7])
plt.colorbar()
plt.grid(False)
plt.show()

plt.figure(figsize=(10,10))
for i in range(25):
    plt.subplot(5,5,i+1)
    plt.xticks([])
    plt.yticks([])
    plt.grid(False)
    plt.imshow(images_train[i], cmap=plt.cm.binary)
    plt.xlabel(class_names[train_label[i]])
plt.show()
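
The manual one-hot conversion mentioned above can be sketched with plain numpy. This is a minimal illustration, not part of the original code: the helper name to_one_hot and the sample labels are made up, and it assumes the labels are integers in 0-9 as in Fashion_mnist.

```python
import numpy as np

def to_one_hot(labels, num_classes=10):
    """Turn integer class indices into one-hot rows (hypothetical helper)."""
    labels = np.asarray(labels, dtype=np.int64)
    # Row i of the identity matrix is the one-hot vector for class i.
    return np.eye(num_classes, dtype=np.float32)[labels]

# Stand-in for a few entries of train_label.
sample_labels = np.array([9, 0, 0, 3])
one_hot = to_one_hot(sample_labels)
print(one_hot.shape)
print(one_hot[0])
```

With one-hot labels the matching loss would be categorical_crossentropy; the integer labels used in the rest of these notes go with sparse_categorical_crossentropy.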


The Sequential model

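This mode is quick and simple to write. Once the training and test sets are ready and the input dimensions match, you just add layers one by one. compile sets the optimizer and the loss function, fit takes the training images, the true labels and the number of epochs and starts training, and predict runs the freshly trained model on new data.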

import os
import datetime
import numpy as np
import keras
from keras.callbacks import ModelCheckpoint

fashion_mnist = keras.datasets.fashion_mnist
#60000 training images, 10000 test images, each 28x28, 10 classes, pixel values in 0-255
(images_train, train_label), (images_test, test_label)= fashion_mnist.load_data()
images_train= images_train/255.0
images_test= images_test/255.0

model= keras.Sequential([
        keras.layers.Flatten(input_shape=(28,28)),
        keras.layers.Dense(128,activation='relu'),
        keras.layers.Dense(10,activation='softmax')
        ])
model.summary()#print each layer's output shape and parameter count
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model_dir=os.getcwd()
now=datetime.datetime.now()
log_dir = os.path.join(model_dir, "{}{:%Y%m%dT%H%M}".format(
            "mymodel", now))
if not os.path.exists(log_dir):
    os.makedirs(log_dir)
checkpoint_path = os.path.join(log_dir, "model_{}_*epoch*.h5".format(
        "first"))
checkpoint_path = checkpoint_path.replace(
        "*epoch*", "{epoch:04d}")

checkpoint=ModelCheckpoint(checkpoint_path,verbose=0,save_weights_only=True)
model.fit(images_train, train_label, epochs=10,callbacks=[checkpoint])
predictions = model.predict(images_test)
print("Prediction for the first image:",predictions[0])
print("Predicted class for the first image:",np.argmax(predictions[0]))
print("True label of the first image:",test_label[0])

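Fashion_mnist images are grayscale, so there is no channel dimension and the input shape is simply (28, 28). The last layer maps to 10 outputs, one score per class, and the class with the highest score is the prediction. The outputs are compared with the true labels using the sparse_categorical_crossentropy loss, and the weights are optimized iteratively. The iteration itself is already wrapped up by Keras; this is just the overall flow.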
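
To make the prediction and loss step concrete, here is a small numpy sketch of what argmax and sparse categorical cross-entropy compute. It is a simplified re-implementation for illustration, not the Keras source; the probabilities and labels are invented, and only three classes are used to keep it short.

```python
import numpy as np

def sparse_categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Per-sample loss: minus the log of the probability given to the true class."""
    y_pred = np.clip(y_pred, eps, 1.0)               # avoid log(0)
    picked = y_pred[np.arange(len(y_true)), y_true]  # probability of each true class
    return -np.log(picked)

# Two made-up softmax outputs over three classes, with integer labels.
probs = np.array([[0.1, 0.7, 0.2],
                  [0.8, 0.1, 0.1]])
labels = np.array([1, 2])

predicted = probs.argmax(axis=1)  # class with the highest score
losses = sparse_categorical_crossentropy(labels, probs)
print(predicted, losses, losses.mean())
```

The first sample is predicted correctly and gets a small loss; the second is wrong and gets a large one, which is what the optimizer then pushes down.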
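ModelCheckpoint saves the training weights to a given folder, which here is created with a timestamp in its name, so the weights can be loaded again later for testing. That is convenient, and all you need to do is pass it in the callbacks list. Without this callback you can still train and test directly; in that case the weights exist only in memory on the model object, and nothing is written to disk unless you call model.save() or model.save_weights() yourself.
There are other functions for saving models and weights as well. ModelCheckpoint can save the weights after every epoch, or be configured to keep only the best ones.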
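
The checkpoint path built above is just a string template. Keras fills in the {epoch:04d} placeholder by calling format on it with the epoch number each time it saves. The sketch below reproduces only the path handling with the standard library; the helper name build_checkpoint_template and the fixed date are assumptions made for the example.

```python
import datetime
import os

def build_checkpoint_template(model_dir, run_name="mymodel", tag="first", now=None):
    """Return a path template like <model_dir>/<run_name><timestamp>/model_<tag>_{epoch:04d}.h5."""
    now = now or datetime.datetime.now()
    log_dir = os.path.join(model_dir, "{}{:%Y%m%dT%H%M}".format(run_name, now))
    # Doubled braces survive this format call as a literal {epoch:04d} placeholder.
    return os.path.join(log_dir, "model_{}_{{epoch:04d}}.h5".format(tag))

# Fixed timestamp so the result is reproducible.
template = build_checkpoint_template("logs", now=datetime.datetime(2019, 11, 1, 9, 30))
print(template)
print(template.format(epoch=10))
```

Writing the placeholder with doubled braces avoids the intermediate "*epoch*" marker and the replace call; both approaches end up with the same template.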

The Model mode

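The Model mode is a bit more tedious and complicated, but it is more general. Once everything works it feels fine; it only feels painful before that.
The basic flow is much the same as with Sequential. I note down a few things that were unclear to me at first.
I wrapped the model in its own class and wrote training and testing as separate scripts. That is more elaborate than necessary, but I wanted to take the opportunity to try out both the Model mode and fit_generator.
First, model.py: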

import keras.layers as KL
import keras.models as KM
from keras.callbacks import ModelCheckpoint
import os
import datetime
import numpy as np
def generator(batch_size,images_train,train_label):
    while True:
        offset = np.random.randint(0, images_train.shape[0] - batch_size)
        yield images_train[offset:offset+batch_size], train_label[offset:offset + batch_size]
class F():
    
    def __init__(self,mode,model_dir):
        assert mode in["train","inference"]
        self.mode=mode
        self.model_dir=model_dir#logs
        self.model=self.create_complex_model(mode=mode)
    

    def create_complex_model(self,mode):#build the model
        assert mode in ["train","inference"]
        input_img = KL.Input([28, 28, 1])
        x = KL.Conv2D(64,(3,3),activation='relu')(input_img)
        x = KL.MaxPooling2D(2,2)(x)
        x = KL.Conv2D(64,(3,3),activation='relu')(x)
        x = KL.MaxPool2D(2,2)(x)
        x = KL.Flatten()(x)
        x = KL.Dense(128,activation='relu')(x)
        output = KL.Dense(10,activation='softmax')(x)
        
    
        if mode=="train":
            
            inputs=input_img
            outputs=output#set the model's inputs and outputs
        else:
            inputs=input_img
            outputs=output
        
        model=KM.Model(inputs=inputs,outputs=outputs)
        return model
    
    def train(self,images_train,train_label,images_test,test_label):
        assert self.mode=="train"
        now=datetime.datetime.now()
        self.log_dir = os.path.join(self.model_dir, "{}{:%Y%m%dT%H%M}".format(
            "mymodel", now))
        if not os.path.exists(self.log_dir):
            os.makedirs(self.log_dir)
       
        train_gen=generator(500,images_train,train_label)
        val_dataset=(images_test,test_label)
        self.model.summary()#print each layer's output shape and parameter count
        checkpoint_path = os.path.join(self.log_dir, "model_{}_*epoch*.h5".format(
            "first"))
        checkpoint_path = checkpoint_path.replace(
            "*epoch*", "{epoch:04d}")

        checkpoint=ModelCheckpoint(checkpoint_path,verbose=0,save_weights_only=True)
        self.model.compile(optimizer="adam",
                           loss='sparse_categorical_crossentropy',
                           metrics=['accuracy'])
       
        self.model.fit_generator(train_gen,
                            epochs=10,
                            steps_per_epoch=50,
                            callbacks=[checkpoint],
                            validation_data=val_dataset,
                            validation_steps=5)
                            
    def test(self,weights,class_name,images,test_label):
        self.model.summary()#print each layer's output shape and parameter count
        self.model.load_weights(weights)
        predictions=self.model.predict(images)
        print("Prediction for image 90:",predictions[90])
        print("Predicted class for image 90:",np.argmax(predictions[90]))
        print("True label of image 90:",test_label[90])

The model structure is the one printed by self.model.summary().
train.py:

import sys
import os
from tensorflow import keras

ROOT_DIR = os.getcwd()#current working directory
sys.path.append(ROOT_DIR)
Model_dir=os.path.join(ROOT_DIR,"logs")#folder where the weights are stored

from fashionmnist import model 
#load the dataset
fashion_mnist = keras.datasets.fashion_mnist
#60000 training images, 10000 test images, each 28x28, 10 classes, pixel values in 0-255
(images_train, train_label), (images_test, test_label)= fashion_mnist.load_data()
images_train=images_train.reshape(60000,28,28,1)#conv layers need a channel dimension
images_test=images_test.reshape(10000,28,28,1)
images_train= images_train/255.0
images_test= images_test/255.0
#train
model=model.F(mode="train",model_dir=Model_dir)#create the model
model.train(images_train,train_label,images_test,test_label)

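The first issue is the shape of the input data. A convolutional layer expects each sample to have a shape like (28, 28, 3), that is height, width and number of channels. Fashion_mnist is grayscale and has no channel dimension, so you need to add one with reshape or with numpy's expand_dims, giving (28, 28, 1).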
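
The ways of adding the channel dimension are interchangeable. A minimal sketch, using a small zero array as a stand-in for the real images so that nothing has to be downloaded:

```python
import numpy as np

# Stand-in for images_train: 5 grayscale images of 28x28 without a channel axis.
images = np.zeros((5, 28, 28), dtype=np.uint8)

reshaped = images.reshape(-1, 28, 28, 1)    # -1 lets numpy infer the number of images
expanded = np.expand_dims(images, axis=-1)  # append a new last axis
indexed = images[..., np.newaxis]           # same thing with indexing syntax

scaled = expanded / 255.0                   # scale pixel values to 0-1 as in train.py
print(reshaped.shape, expanded.shape, indexed.shape, scaled.dtype)
```

Using -1 instead of the hard-coded 60000 and 10000 keeps the same line usable for both the training and the test set.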

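The model receives its input through keras.layers.Input(). This model has a single input, so the shape just has to match. At the time I was not sure whether multiple inputs are matched by name or automatically by shape. In Keras, a list of arrays is matched to the model's inputs by order, and a dict is matched by input layer name.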

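Next, fit_generator() takes a generator. The generator's job is basically to hand back the data batch by batch; after all, fit_generator() exists for training on large datasets in batches. (In recent TensorFlow versions fit_generator is deprecated and fit() accepts generators directly.) About generators and the yield keyword: unlike return, which ends the call, yield only pauses the function, and the next request resumes right where it left off. Calling a function that contains yield does not run its body but returns a generator object, which you can print to see the difference from a normal return value. That is all you need here: put the training images and their labels in together and yield them in batches. The validation set needs no changes and can be passed to fit_generator directly.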
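
The pause-and-resume behaviour of yield is easy to see on a small batch generator. The sketch below is a variant of the generator in model.py, not the same code: instead of picking a random offset, it shuffles the indices and walks through them, so each pass covers the data without overlap. The function name batch_generator and the dummy arrays are assumptions for the example.

```python
import types
import numpy as np

def batch_generator(images, labels, batch_size):
    """Yield (images, labels) batches forever, reshuffling on every pass."""
    n = images.shape[0]
    while True:
        order = np.random.permutation(n)
        for start in range(0, n - batch_size + 1, batch_size):
            idx = order[start:start + batch_size]
            yield images[idx], labels[idx]  # pause here until the next request

# Dummy data standing in for the training set.
images = np.zeros((10, 28, 28, 1), dtype=np.float32)
labels = np.arange(10)

gen = batch_generator(images, labels, batch_size=4)
is_generator = isinstance(gen, types.GeneratorType)  # the call only builds the generator
x_batch, y_batch = next(gen)  # runs the body up to the first yield
x_next, y_next = next(gen)    # resumes right after that yield
print(is_generator, x_batch.shape, y_batch, y_next)
```

Because the loop never ends, the number of batches per epoch has to be given explicitly, which is what steps_per_epoch does in the training call.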

For a custom loss function, you can add a loss layer and plug in your own loss; I have not written that yet.

During training you can see the progress bar together with the training loss, training accuracy, validation loss and validation accuracy.
test.py:

import os
import sys
import keras

ROOT_DIR = os.getcwd()
sys.path.append(ROOT_DIR)
Model_dir=os.path.join(ROOT_DIR,"logs")
from fashionmnist import model

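#NOTE: train() saves into a timestamped subfolder of logs (mymodel<timestamp>); point this path at the actual weights file
weights=os.path.join(Model_dir,"model_first_0010.h5")

fashion_mnist = keras.datasets.fashion_mnist
#60000 training images, 10000 test images, each 28x28, 10 classes, pixel values in 0-255
(images_train, train_label), (images_test, test_label)= fashion_mnist.load_data()
images_train=images_train.reshape(60000,28,28,1)
images_test=images_test.reshape(10000,28,28,1)
#apply the same 0-1 scaling that was used during training
images_train= images_train/255.0
images_test= images_test/255.0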
class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']
model=model.F(mode="inference",model_dir=Model_dir)
model.test(weights,class_names,images_test,test_label)

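Testing also needs the data reshaped to what the model expects. self.model.load_weights() loads the weights. It has a by_name parameter that selects whether weights are matched to layers by layer name. When I first set it to True the test results were wrong, possibly because the layers were not given explicit names; with by_name=True, layers whose names do not match the file are skipped and keep their initial weights. Loading without it worked fine. Since this model is simple and has only one input, after loading the weights you can just call predict on the test set.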

Custom loss functions

1. A custom loss function

For example, taking the sparse_categorical_crossentropy used here,

import tensorflow as tf

def losses_graph(y_true, y_pred):#custom loss function
    return tf.keras.backend.sparse_categorical_crossentropy(y_true, y_pred)

Just compute the loss yourself, then use it like this:

self.model.compile(optimizer="adam",
                           loss=losses_graph,
                           metrics=['accuracy'])

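The result is rather odd, though: the loss values look fine, but the reported accuracy comes out about 10 percent lower. I do not know what I have overlooked this time.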
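
One possible explanation, not verified against the run above: some Keras versions decide which accuracy function the string 'accuracy' stands for by looking at the loss. With a custom callable they may not recognize that the labels are integer indices and may fall back to the accuracy meant for one-hot labels. If that is what happened, passing metrics=['sparse_categorical_accuracy'] explicitly should give the expected number. The numpy sketch below re-implements both metrics in simplified form (it is not the Keras source) to show how different the results are on the same data.

```python
import numpy as np

def categorical_accuracy(y_true, y_pred):
    # Meant for one-hot labels: compares the argmax of both arrays.
    return float(np.mean(np.argmax(y_true, axis=-1) == np.argmax(y_pred, axis=-1)))

def sparse_categorical_accuracy(y_true, y_pred):
    # Meant for integer labels: compares them with the argmax of the predictions.
    return float(np.mean(y_true.reshape(-1) == np.argmax(y_pred, axis=-1)))

# Integer labels of shape (batch, 1) and predictions that are all correct.
y_true = np.array([[3], [1], [0], [2]])
y_pred = np.eye(4)[[3, 1, 0, 2]]

wrong_metric = categorical_accuracy(y_true, y_pred)
right_metric = sparse_categorical_accuracy(y_true, y_pred)
print(wrong_metric, right_metric)
```

Applied to integer labels, the one-hot metric always sees class 0 as the true class, so it only counts the samples predicted as class 0 and says nothing about the real accuracy.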

2. Via a custom Keras layer

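Define a custom Keras layer as the last layer of the model and pass loss=None in compile as a placeholder. This also allows multiple losses.
For example, the loss layers in Mask R-CNN: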

def compile(self, learning_rate, momentum):
        """Gets the model ready for training. Adds losses, regularization, and
        metrics. Then calls the Keras compile() function.
        """
        # Optimizer object; nothing special here, just pass in the settings
        optimizer = keras.optimizers.SGD(
            lr=learning_rate, momentum=momentum,
            clipnorm=self.config.GRADIENT_CLIP_NORM)
        # Add Losses
        # First, clear previously set losses to avoid duplication
        self.keras_model._losses = []
        self.keras_model._per_input_losses = {}
        loss_names = [
            "rpn_class_loss",  "rpn_bbox_loss",
            "mrcnn_class_loss", "mrcnn_bbox_loss", "mrcnn_mask_loss"]
        for name in loss_names:
            layer = self.keras_model.get_layer(name)
            if layer.output in self.keras_model.losses:
                continue
            loss = (
                tf.reduce_mean(layer.output, keepdims=True)
                * self.config.LOSS_WEIGHTS.get(name, 1.))#per-loss weights can be set if needed
            self.keras_model.add_loss(loss)#this adds the loss to the model

        # Add L2 Regularization
        # Skip gamma and beta weights of batch normalization layers.
        reg_losses = [
            keras.regularizers.l2(self.config.WEIGHT_DECAY)(w) / tf.cast(tf.size(w), tf.float32)
            for w in self.keras_model.trainable_weights
            if 'gamma' not in w.name and 'beta' not in w.name]
        self.keras_model.add_loss(tf.add_n(reg_losses))

        # Compile
        self.keras_model.compile(
            optimizer=optimizer,
            loss=[None] * len(self.keras_model.outputs))#one None per output, since there are multiple losses

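The loss is still computed with tf.reduce_mean, and the terms are added with add_loss, so what gets optimized is the sum of the losses. During training, the model's outputs are the losses themselves.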
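
The arithmetic behind that sum can be written out in numpy: each loss output is averaged and multiplied by its weight, and an L2 term per weight tensor, divided by the tensor's size, is added on top, skipping batch-normalization gamma and beta. This is a sketch of the calculation only, not of the Keras graph; the function name total_loss and all numbers and layer names are made up.

```python
import numpy as np

def total_loss(loss_outputs, loss_weights, trainable_weights, weight_decay):
    """Weighted sum of mean losses plus size-normalized L2 regularization."""
    weighted = [np.mean(values) * loss_weights.get(name, 1.0)
                for name, values in loss_outputs.items()]
    # l2(weight_decay)(w) is weight_decay * sum(w ** 2); gamma and beta are skipped.
    reg = [weight_decay * np.sum(w ** 2) / w.size
           for name, w in trainable_weights.items()
           if 'gamma' not in name and 'beta' not in name]
    return float(sum(weighted) + sum(reg))

# Invented values: two loss heads and two weight tensors.
loss_outputs = {"rpn_class_loss": np.array([0.2, 0.4]),
                "mrcnn_mask_loss": np.array([1.0])}
loss_weights = {"mrcnn_mask_loss": 2.0}  # missing names default to 1.0
trainable_weights = {"conv1/kernel": np.ones((2, 2)),
                     "bn1/gamma": np.ones(3)}

total = total_loss(loss_outputs, loss_weights, trainable_weights, weight_decay=0.1)
print(total)
```

Here the two heads contribute 0.3 and 2.0, the kernel contributes 0.1 of regularization, and the gamma tensor is ignored.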
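Another approach wraps the loss computation in a Lambda layer and then attaches that layer to the model with add_loss; it is much the same: [reference article](https://www.jianshu.com/p/4283c25f2a8c). I thought I understood this one well, but after rereading it a few times I was less sure. Maybe I was overthinking it; the code did run without problems.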