NOTE_Peking University TensorFlow_Chapter 4

甲壳虫奇袭电脑城

Article link: https://blog.csdn.net/LLABVIEW/article/details/120028368

step	method
1	import modules
2	data and shuffle
3	model = tf.keras.models.Sequential([ ])
4	model.compile
5	model.fit
6	model.summary

Overview

This chapter keeps the six-step method from the previous chapter as its backbone and adds six extensions.

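Extension	Purpose	Corresponding step
1. Custom dataset	Apply the method to your own domain	2_dataset
2. Data augmentation	Enlarge the dataset when it is small	2_dataset
3. Checkpointed training (resume from a checkpoint)	Save and load the model	5_model.fit
4. Parameter extraction	Write the parameters to a text file	6_model.summary
5. acc / loss visualization	Inspect the training results	6_model.summary
6. Application	Recognize the object in a given image	6_model.summary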

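1. Custom dataset

1.1 Set up the file paths

Define the file path strings by the 2-2-4 rule:
2 dataset folder paths (train, validation), 2 label txt paths, and 4 npy paths (x_train, y_train, x_validation, y_validation).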
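
As a minimal sketch of the 2-2-4 rule, the eight path strings might be laid out as follows. The root folder and every file name here are hypothetical placeholders, not names from the original notes; substitute your own layout.

```python
import os

# Hypothetical root folder; replace it with your own dataset location.
DATA_ROOT = './my_dataset'

# 2 dataset folder paths (train, validation)
train_path = os.path.join(DATA_ROOT, 'train_images')
validation_path = os.path.join(DATA_ROOT, 'validation_images')

# 2 label txt paths (one "image_name label" pair per line)
train_txt_path = os.path.join(DATA_ROOT, 'train_labels.txt')
validation_txt_path = os.path.join(DATA_ROOT, 'validation_labels.txt')

# 4 npy paths used to cache the arrays after the first run
x_train_npy_path = os.path.join(DATA_ROOT, 'x_train.npy')
y_train_npy_path = os.path.join(DATA_ROOT, 'y_train.npy')
x_validation_npy_path = os.path.join(DATA_ROOT, 'x_validation.npy')
y_validation_npy_path = os.path.join(DATA_ROOT, 'y_validation.npy')

npy_paths = [x_train_npy_path, y_train_npy_path,
             x_validation_npy_path, y_validation_npy_path]
all_paths = [train_path, validation_path,
             train_txt_path, validation_txt_path] + npy_paths
```

Keeping all eight strings together at the top of the script makes it easy to switch to another dataset by editing a single block.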

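1.2 Custom function

Purpose: read the txt file that lists each data file name with its label, and output a data array and a label array.
1. Open the txt file, read all lines with readlines, and close the file.
2. Iterate over the lines with a for loop: split each line into the image name and the label, convert the image to 8-bit grayscale and then to an np.array, divide it by 255 so the values fall in the 0-1 range (which suits neural-network processing), and append the image and the label to their respective lists.
3. Convert x to an np.array; convert y_ to an np.array and cast it to np.int64 with .astype.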
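
The three steps above can be sketched as a single function. This is an illustrative version, not the author's exact code: the `load_image` parameter, the `load_gray` helper, and the demo file names are assumptions added so that the logic can be tried without real image files.

```python
import os
import tempfile

import numpy as np


def load_gray(path):
    """Default loader: open an image as an 8-bit grayscale array (needs Pillow)."""
    from PIL import Image  # imported lazily so the demo below runs without Pillow
    return np.array(Image.open(path).convert('L'))


def generate(image_dir, txt_path, load_image=load_gray):
    """Return (x, y): image data scaled to [0, 1] and int64 labels."""
    with open(txt_path, 'r') as f:      # step 1: read all lines, then close the file
        lines = f.readlines()

    x, y_ = [], []
    for line in lines:                  # step 2: one "image_name label" pair per line
        value = line.split()
        if len(value) < 2:              # skip blank or malformed lines
            continue
        img = load_image(os.path.join(image_dir, value[0]))
        x.append(img / 255.0)           # scale the gray values into [0, 1]
        y_.append(value[1])

    # step 3: convert both lists to arrays; the labels become int64
    return np.array(x), np.array(y_).astype(np.int64)


# Demo with a stand-in loader (an all-white 28x28 image) instead of real files.
with tempfile.TemporaryDirectory() as tmp:
    label_file = os.path.join(tmp, 'labels.txt')
    with open(label_file, 'w') as f:
        f.write('0_5.jpg 5\n1_0.jpg 0\n')

    def fake_loader(path):
        return np.full((28, 28), 255, dtype=np.uint8)

    x_demo, y_demo = generate(tmp, label_file, load_image=fake_loader)

print(x_demo.shape, y_demo)
```

In real use, call `generate(train_path, train_txt_path)` and let the default Pillow loader read the files.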

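1.3 Check whether all 4 npy files exist

True: load the data and the labels, and reshape the data.

False: call the custom function to generate the training and test sets, then save the new data and labels as 4 npy files.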
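
A minimal sketch of this load-or-build branch follows. It assumes the images are flattened to (N, 784) before saving, which is why a reshape is needed after loading; the function name `load_or_build`, the dictionary keys, and the demo data are all hypothetical.

```python
import os
import tempfile

import numpy as np


def load_or_build(npy_paths, build_fn):
    """npy_paths maps 'x_train', 'y_train', 'x_validation', 'y_validation' to files."""
    if all(os.path.exists(p) for p in npy_paths.values()):
        # True branch: every cache file exists, so load and restore the image layout.
        data = {name: np.load(p) for name, p in npy_paths.items()}
        for name in ('x_train', 'x_validation'):
            data[name] = data[name].reshape((len(data[name]), 28, 28))
        return data, 'loaded'

    # False branch: build the arrays, then cache them as 4 npy files.
    data = build_fn()
    for name, p in npy_paths.items():
        arr = data[name]
        if name.startswith('x_'):
            arr = arr.reshape((len(arr), -1))   # flatten the images before saving
        np.save(p, arr)
    return data, 'built'


def demo_build():
    """Stand-in for calling the custom function on the train and validation sets."""
    return {
        'x_train': np.zeros((3, 28, 28)), 'y_train': np.zeros(3, dtype=np.int64),
        'x_validation': np.zeros((2, 28, 28)), 'y_validation': np.zeros(2, dtype=np.int64),
    }


with tempfile.TemporaryDirectory() as tmp:
    paths = {name: os.path.join(tmp, name + '.npy')
             for name in ('x_train', 'y_train', 'x_validation', 'y_validation')}
    first, first_status = load_or_build(paths, demo_build)     # no cache yet
    second, second_status = load_or_build(paths, demo_build)   # cache now exists

print(first_status, second_status, second['x_train'].shape)
```

The first call takes the build branch and writes the cache; the second call finds all four files and takes the load branch.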

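1.4 Commonly used functions

import os

import numpy as np
import tensorflow as tf
from PIL import Image

# Read every line of the label txt file (one "image_name label" pair per line)
with open(txtPath, 'r') as f:
    contents = f.readlines()

x, y_ = [], []
for content in contents:
    value = content.split()               # value[0]: image name, value[1]: label
    img = Image.open(dataPath + value[0])
    img = np.array(img.convert('L'))      # 8-bit grayscale image -> np.array
    img = img / 255.                      # scale to the 0-1 range
    x.append(img)
    y_.append(value[1])

# Convert the lists to arrays
x = np.array(x)
x = x.reshape((len(x), 28, 28))
y_ = np.array(y_)
y = y_.astype(np.int64)

# Load a cached array, or generate it with the custom function (generate) and cache it
x_train = np.load(train_x_npy_path)
x_train, y_train = generate(train_path, train_txt_path)
np.save(train_x_npy_path, x_train)

# Six-step skeleton (arguments of compile and fit omitted here)
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation = 'relu'),
    tf.keras.layers.Dense(10, activation = 'softmax')
])
model.compile()
model.fit()
model.summary()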

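2. Data augmentation

2.1 Changes

1. Reshape x_train into 60,000 single-channel 28*28 grayscale images, i.e. shape (60000, 28, 28, 1).
2. Create a tensorflow.keras.preprocessing.image.ImageDataGenerator() object.
3. In model.fit(), x_train, y_train and batch_size are merged into one argument through the object's .flow method (flow returns an iterator that yields (data, label) tuples).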
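
The reshape in step 1 is needed because ImageDataGenerator.flow expects 4-D input (samples, height, width, channels). The NumPy-only sketch below uses a zero-filled stand-in for the MNIST training images (an assumption made so it runs without downloading data) to show the shape change and the number of batches that batch_size = 32 implies.

```python
import math

import numpy as np

# Stand-in for the MNIST training images: 60,000 grayscale images of 28x28.
x_train = np.zeros((60000, 28, 28), dtype=np.uint8)

# Append a channel axis of size 1: (60000, 28, 28) -> (60000, 28, 28, 1).
x_train = x_train.reshape(x_train.shape[0], 28, 28, 1)

# With batch_size = 32, one pass over the data takes ceil(60000 / 32) batches.
batch_size = 32
steps_per_epoch = math.ceil(x_train.shape[0] / batch_size)

print(x_train.shape, steps_per_epoch)
```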

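2.2 Code

import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator
#########################################################################

# 1. From 60,000 28*28 arrays to 60,000 single-channel 28*28 grayscale images
# (x_train, y_train, x_test, y_test and model come from the six-step script)
# x_train.shape = (60000, 28, 28)
# x_train.shape[0] = 60000
x_train = x_train.reshape(x_train.shape[0], 28, 28, 1)
#########################################################################

# 2. Create the generator object and set its 6 parameters
image_gen_train = ImageDataGenerator(
    rescale = 1. / 1.,  # scaling factor (use 1. / 255. if the pixel values are still 0-255)
    rotation_range = 45,  # range of random rotation, in degrees
    width_shift_range = .15,  # random horizontal shift, as a fraction of the width
    height_shift_range = .15,  # random vertical shift, as a fraction of the height
    horizontal_flip = False,  # random horizontal flip (disabled here)
    zoom_range = 0.5  # random zoom within [1 - 0.5, 1 + 0.5]
)
#########################################################################

# 3. Merge x, y and batch size into one argument
model.fit(image_gen_train.flow(x_train, y_train, batch_size = 32), epochs = 5,
          validation_data = (x_test, y_test), validation_freq = 1)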

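3. Checkpointed training (resume from a checkpoint)

3.1 Changes

1. Between compile and fit, add the weight-loading step and the callback.
2. Pass the callbacks argument to model.fit(). If no '.index' file exists yet, the callback creates the checkpoint folder during model.fit. (Newer TensorFlow versions do not produce a single '.ckpt' file; the checkpoint is a folder containing three files: 1. checkpoint, 2. a '.data' file, 3. an '.index' file.)
keras.callbacks.ModelCheckpoint() and model training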

3.2 Code

import os

# 1. Load existing weights and define the callback
# Note: this '.ckpt' path assumes TF 2.x with Keras 2. Keras 3 (the default from TF 2.16)
# requires a filepath ending in '.weights.h5' when save_weights_only=True.
checkpoint_save_path = "./checkpoint/mnist.ckpt"
if os.path.exists(checkpoint_save_path + '.index'):
    print('-------------load the model-----------------')
    model.load_weights(checkpoint_save_path)

# Save the model to filepath after each epoch
cp_callback = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_save_path,
                                                 save_weights_only=True,  # save only the model parameters
                                                 save_best_only=True)  # keep only the best model

#########################################################################
# 2. Pass the callback to fit
history = model.fit(x_train, y_train, batch_size=32, epochs=5, validation_data=(x_test, y_test), validation_freq=1,
                    callbacks=[cp_callback])

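4. Parameter extraction

4.1 Changes relative to checkpointed training

1. Control how arrays are printed with np.set_printoptions(). threshold is the number of array elements above which NumPy prints a summary instead of every value; np.inf (positive infinity) disables the summary.
2. After model.summary() in the checkpointed-training script, add the code that writes the parameters to a file.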
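
To see what the threshold option changes, the sketch below prints a stand-in array of 2,000 values (a hypothetical substitute for a real weight tensor) with the default settings and then with threshold = np.inf. It uses np.printoptions, the scoped form of np.set_printoptions, so the global setting is left untouched.

```python
import numpy as np

weights = np.arange(2000)          # stand-in for a large weight array

# Default threshold is 1000 elements: larger arrays are summarized with '...'.
summarized = str(weights)

# threshold = np.inf: never summarize, print every element.
with np.printoptions(threshold=np.inf):
    full = str(weights)

print('...' in summarized, '...' in full)
print(len(summarized), len(full))
```

Without this setting, the text file written in the next step would contain the summarized form and lose most of the parameter values.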

4.2 Code

import numpy as np
# threshold controls how many values are printed; np.inf (positive infinity) prints all of them
np.set_printoptions(threshold = np.inf)

model.summary()

#########################################################################
# Added part: write every trainable variable to a text file
print(model.trainable_variables)
with open('./weight.txt', 'w') as file:
    for v in model.trainable_variables:
        file.write(str(v.name) + '\n')
        file.write(str(v.shape) + '\n')
        file.write(str(v.numpy()) + '\n')

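5. acc / loss visualization

5.1 Changes relative to parameter extraction

1. Add the visualization at the very end of the script. (1. Extract the four curves from history. 2. Set the plot parameters. 3. Call show.)

5.2 Code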

from matplotlib import pyplot as plt

#############        show        ###################
# The key names follow the metric passed to model.compile (here sparse_categorical_accuracy)
acc = history.history['sparse_categorical_accuracy']
val_acc = history.history['val_sparse_categorical_accuracy']
loss = history.history['loss']
val_loss = history.history['val_loss']
###################################################################
# plt.subplot(1, 2, 1): the first two arguments split the figure into a grid of
# 1 row and 2 columns; the last argument is the index of this subplot in the grid
###################################################################
plt.subplot(1, 2, 1)
plt.plot(acc, label = 'training accuracy')
plt.plot(val_acc, label = 'validation accuracy')
plt.title('training and validation accuracy')
plt.legend()

plt.subplot(1, 2, 2)
plt.plot(loss, label='Training Loss')
plt.plot(val_loss, label='Validation Loss')
plt.title('Training and Validation Loss')
plt.legend()  # show the legend with the curve labels

plt.show()

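6. Application

Content: use the trained parameters (stored in the '.ckpt' files) to predict on input images.

6.1 Steps

1. Build a Sequential model identical to the one whose weights are stored in the '.ckpt' files (make sure the node counts in the Dense layers match).
2. Load the weights into the model.
3. Input the number of images to process.
4. Load each image, then set its size and resampling quality: the training images are 28*28, so resize the loaded image to 28*28 with a smooth antialiasing filter.
5. Convert the image to grayscale and then to an np array.
6. There are two ways to process the grayscale array (processing is needed because the training images have a black background while the test images have a white one): (1) subtract the image from 255. (2) binarize every pixel. In practice the second method gave the highest accuracy, while the first often misrecognized the digit.
7. Normalize the data.
8. Add a dimension to the image and predict. (Training feeds the network in batches, so add a leading dimension to the image array before calling predict.)
9. Return the index of the largest output, which is the predicted label.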
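
Steps 6 to 9 can be tried without a trained model. The NumPy-only sketch below builds a synthetic "photo" (white paper, faint gray noise, dark ink), applies both preprocessing methods, adds the batch dimension, and takes the argmax of a made-up prediction row. The pixel values, the helper names `invert` and `binarize`, and the fake `pred` array are all assumptions for illustration.

```python
import numpy as np


def invert(img_arr):
    """Method 1: subtract the image from 255."""
    return 255 - img_arr


def binarize(img_arr, threshold=200):
    """Method 2: pixels darker than the threshold become 255, all others 0."""
    return np.where(img_arr < threshold, 255, 0).astype(np.uint8)


# Synthetic white-background image: paper = 255, faint noise = 230, ink = 30.
img_arr = np.full((28, 28), 255, dtype=np.uint8)
img_arr[:, :5] = 230          # faint gray region, e.g. uneven lighting
img_arr[10:18, 12:16] = 30    # dark stroke

inverted = invert(img_arr)    # noise survives as gray value 25
binary = binarize(img_arr)    # noise is removed, stroke becomes pure white

# Step 7 and 8: normalize, then add the batch dimension -> shape (1, 28, 28).
x_predict = (binary / 255.0)[np.newaxis, ...]

# Step 9: stand-in for model.predict output, one row of 10 class scores.
pred = np.zeros((1, 10))
pred[0, 7] = 1.0
label = np.argmax(pred, axis=1)

print(inverted[0, 0], binary[0, 0], x_predict.shape, label)
```

In this synthetic case binarization yields a cleaner black background than inversion. That may be one reason the second method performed better in practice, though the notes report the result only empirically.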

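6.2 Code

import tensorflow as tf
import numpy as np
from PIL import Image

cp_path = './checkpoint/mnist.ckpt'

# Must match the model that produced the checkpoint, including the Dense node counts
# (note that the model in section 1.4 uses 128 nodes in its first Dense layer)
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(256, activation = 'relu'),
    tf.keras.layers.Dense(10, activation = 'softmax')
])

model.load_weights(cp_path)

times = int(input('test times:'))

for i in range(times):
    pic_path = input('picture path:')
    img = Image.open(pic_path)
    # Image.LANCZOS is the same filter as the former Image.ANTIALIAS, which Pillow 10 removed
    img = img.resize((28, 28), Image.LANCZOS)
    img_arr = np.array(img.convert('L'))
##############################################################
# Method 2: binarize every pixel (dark strokes -> 255, light background -> 0)
    for row in range(28):
        for col in range(28):
            if img_arr[row][col] < 200:
                img_arr[row][col] = 255
            else:
                img_arr[row][col] = 0
##############################################################
# Method 1 (alternative: use it instead of method 2, not in addition to it)
    # img_arr = 255 - img_arr
##############################################################
    img_arr = img_arr / 255.0             # normalize to the 0-1 range
    x_predict = img_arr[tf.newaxis, ...]  # add the batch dimension: (1, 28, 28)
    pred = model.predict(x_predict)

    result = tf.argmax(pred, axis = 1)    # index of the largest output

    print(pred)
    print('\n')
    print(result)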
