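Evaluating the tool
This is purely a visualization tool. It can help you understand a model and its data, but because it shows a dimensionality-reduced projection, some information is lost, so it cannot be used as an algorithm for improving model performance.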
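To make the information loss concrete, here is a minimal sketch that projects high-dimensional points onto two dimensions and measures how much is lost when mapping back. PCA is used only as an illustrative projection (this section does not say which projection the tool uses), and the random `data` array and the helper name `project_and_reconstruct` are assumptions made up for this example.

```python
import numpy as np

def project_and_reconstruct(x, k=2):
    """Project the rows of x onto the top-k principal components and map back."""
    mean = x.mean(axis=0)
    centered = x - mean
    # Rows of vt are the principal directions, ordered by singular value.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:k]
    projected = centered @ components.T            # shape (n, k): what a 2-D plot shows
    reconstructed = projected @ components + mean  # best rank-k approximation of x
    explained = float((s[:k] ** 2).sum() / (s ** 2).sum())
    return projected, reconstructed, explained

rng = np.random.default_rng(0)
data = rng.normal(size=(200, 784))  # made-up stand-in for 200 flattened 28x28 images
projected, reconstructed, explained = project_and_reconstruct(data, k=2)
error = float(np.mean((data - reconstructed) ** 2))
print("projected shape:", projected.shape)
print("fraction of variance kept:", round(explained, 3))
print("mean squared reconstruction error:", round(error, 3))
```

Whatever variance the two plotted dimensions do not capture is invisible in the picture, which is why the plot is useful for inspection but not for training.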
Dataset and model
Dataset: Fashion MNIST
Model: a simple network with two fully connected layers. Its accuracy is mediocre; it is only a toy.
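To give a sense of how small this toy model is, the sketch below counts its trainable parameters by hand. The layer sizes (28x28 input, 128 hidden units, 10 outputs) are taken from the model definition in the code; the helper `dense_params` is a hypothetical name introduced only for this illustration.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of one fully connected layer."""
    return n_in * n_out + n_out

flatten_out = 28 * 28                    # Flatten has no parameters, it only reshapes
hidden = dense_params(flatten_out, 128)  # first Dense layer
output = dense_params(128, 10)           # second Dense layer, one logit per class
total = hidden + output
print(hidden, output, total)  # 100480 1290 101770
```

Calling `model.summary()` on the built model should report the same total.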
# Fashion MNIST dataset
# https://tensorflow.google.cn/tutorials/keras/classification
# TensorFlow and tf.keras
import tensorflow as tf  # debug: this import failed with ModuleNotFoundError: No module named 'gast' / 'flatbuffers'
# Fix: reinstall the versions below
# pip uninstall gast
# pip install gast==0.4.0 -i https://pypi.tuna.tsinghua.edu.cn/simple
# pip uninstall flatbuffers
# pip install flatbuffers==1.12 -i https://pypi.tuna.tsinghua.edu.cn/simple
from tensorflow import keras  # Keras API
import numpy as np  # NumPy: array and matrix processing
import matplotlib.pyplot as plt  # Matplotlib: plotting
print(tf.__version__)
fashion_mnist = keras.datasets.fashion_mnist  # dataset, downloaded from https://storage.googleapis.com/tensorflow/tf-keras-datasets/
# Downloaded files:
# train-labels-idx1-ubyte.gz
# train-images-idx3-ubyte.gz
# t10k-labels-idx1-ubyte.gz
# t10k-images-idx3-ubyte.gz
# Cached locally in /home/cc/.keras/datasets
(train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()
class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']
train_images = train_images / 255.0
test_images = test_images / 255.0
model = keras.Sequential([  # model: Flatten layer + Dense layer + Dense layer
    keras.layers.Flatten(input_shape=(28, 28)),
    keras.layers.Dense(128, activation='relu'),
    # keras.layers.Dense(64, activation='relu'),
    # keras.layers.Dense(32, activation='relu'),
    keras.layers.Dense(10)  # 10 raw logits, one per class
])
model.compile(optimizer='adam',  # training configuration: optimizer, loss, metrics
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
model.fit(train_images, train_labels, epochs=5)  # train the model to fit its parameters
print('\n')
test_loss, test_acc = model.evaluate(test_images, test_labels, verbose=1)  # evaluate: takes data and labels, returns the loss and the other metrics
# verbose [vɜːˈbəʊs]: 0 = no log output, 1 = progress bar plus log, 2 = log only
print('Test accuracy:', test_acc)  # print the test accuracy
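Because the last layer has no activation and the loss is built with `from_logits=True`, the model outputs raw logits rather than probabilities. The sketch below shows, in plain NumPy, how a row of logits could be turned into class probabilities and a predicted class. The `logits` values are made up for illustration and are not output from the trained model.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

# Made-up logits for one image, one value per class (10 classes).
logits = np.array([[2.0, 1.0, 0.1, -1.0, 0.0, 0.5, 0.3, -0.5, 1.5, 3.0]])
probs = softmax(logits)
pred = int(np.argmax(probs, axis=-1)[0])
print("probabilities sum to:", probs.sum())
print("predicted class id:", pred)
```

Softmax does not change which entry is largest, so taking `argmax` directly on the logits gives the same class id.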
Looking at images in the test dataset
print('First 5 samples of the test dataset ======================================')
for peekImage in range(5):
    print('▼↓ test image, test label: ', test_labels[peekImage], class_names[test_labels[peekImage]])
    plt.figure()
    plt.imshow(test_images[peekImage])
    plt.colorbar()
    plt.grid(False)
    plt.show()
Making predictions with the trained model,
using images from the test dataset or images you prepare yourself
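An image you prepare yourself has to match what the model was trained on: a 28x28 grayscale array scaled to [0, 1], with a leading batch dimension. Fashion MNIST items are bright on a dark background, so a photo with a light background probably needs inverting first. The helper below is a minimal sketch under those assumptions; `prepare_image` and the synthetic `photo` array are hypothetical names and data introduced only for this example, and resizing to 28x28 is assumed to have been done already.

```python
import numpy as np

def prepare_image(gray_uint8, invert=False):
    """Turn a 28x28 uint8 grayscale array into a (1, 28, 28) float batch in [0, 1]."""
    img = np.asarray(gray_uint8)
    if img.shape != (28, 28):
        raise ValueError(f"expected a 28x28 image, got {img.shape}")
    img = img.astype(np.float32) / 255.0  # same scaling as the training data
    if invert:
        img = 1.0 - img                   # dark-on-light photo -> light-on-dark
    return img[np.newaxis, ...]           # add the batch dimension

# Made-up "photo": a dark square on a white background.
photo = np.full((28, 28), 255, dtype=np.uint8)
photo[8:20, 8:20] = 0
batch = prepare_image(photo, invert=True)
print(batch.shape, batch.min(), batch.max())
```

The resulting `batch` has the shape that `model.predict` expects for a single image.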
# Prediction
print("Predict 100 images from the test dataset and print any that are wrong =============================================")
# predictions = model.predict(test_images, batch_size=1)  # predict
# model.predict_classes() was removed in TensorFlow 2.6; argmax over the predicted logits gives the same class ids
predictions = np.argmax(model.predict(test_images, batch_size=16, verbose=1), axis=-1)  # predict, with a progress bar
# predictions = model.predict(test_images[5:6], batch_size=1)  # predict a single image (the slice keeps the batch dimension)
# model.predict(X_test, batch_size=32, verbose=1)  # verbose=1 shows a progress bar
for i in range(100):
    if test_labels[i] != predictions[i]:  # print only the wrong predictions
        print('▼ actual label    ', test_labels[i], class_names[test_labels[i]])
        print('  predicted label ', predictions[i], class_names[predictions[i]], "\n")
        plt.figure()
        plt.imshow(test_images[i])
        plt.colorbar()
        plt.grid(False)
        plt.show()
print('Check 20 samples for errors ======================================')
for i in range(20):
    if test_labels[i] != predictions[i]:
        print("Dataset index:", i)
        print('Actual label    ', test_labels[i])
        print('Predicted label ', predictions[i])
print('Check again: predict the first 35 samples of the test dataset and look for errors ===========================')
plt.figure(figsize=(10, 10))
for i in range(35):
    plt.subplot(7, 5, i + 1)  # plot the 35 predictions in 7 rows and 5 columns
    plt.xticks([])
    plt.yticks([])
    plt.grid(False)
    plt.imshow(test_images[i], cmap=plt.cm.binary)  # cmap = color map
    plt.xlabel('label:' + class_names[test_labels[i]])
    print('predict ', predictions[i], class_names[predictions[i]])
plt.show()
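print('Classify a single image from the test set ======================================')
print(test_images[0].reshape(1, 28, 28).shape)  # add a batch dimension to the 2-D image
# model.predict_classes() was removed in TensorFlow 2.6; argmax over the logits gives the class id
predicOnePic = np.argmax(model.predict(test_images[0].reshape(1, 28, 28), batch_size=8, verbose=1), axis=-1)  # predict one image, with a progress bar
print(predicOnePic)
print('Classify 4 images from the test set ======================================')
print(test_images[0:4].reshape(4, 28, 28).shape)  # the slice already has a batch dimension of 4
predicOnePic = np.argmax(model.predict(test_images[0:4].reshape(4, 28, 28), batch_size=8, verbose=1), axis=-1)  # predict 4 images, with a progress bar
print(predicOnePic)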
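print('Write one image from the test set to a file ======================================')
import cv2
img = test_images[23].reshape(28, 28, 1) * 255  # the image was normalized; without scaling back by 255 the saved image is black
img = np.rint(img).astype(np.uint8)  # store explicit 8-bit pixel values
# newImage = cv2.imwrite("im_save.png", img)
cv2.imwrite('newImage.png', img, [int(cv2.IMWRITE_PNG_COMPRESSION), 9])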
print('Classify one image read from a file ======================================')
import cv2
img = cv2.imread('newImage.png', 0) / 255  # flag 0 reads the file as grayscale; normalize pixels to [0, 1]
img = img.reshape(1, 28, 28)
predicOnePic = np.argmax(model.predict(img, batch_size=8, verbose=1), axis=-1)  # predict_classes() was removed in TensorFlow 2.6
print(predicOnePic)
print('▼ predicted label ', class_names[predicOnePic[0]])
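Predict 100 images from the test dataset and print any that are wrong =============================================
Result: 12 errors. The first 2 are:
▼ actual label: class 4, Coat; predicted: class 2, Pullover
▼ actual label: class 9, Ankle boot; predicted: class 5, Sandal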
Check again: predict the first 35 samples of the test dataset and look for errors ======================================
predict 9 Ankle boot
predict 2 Pullover
predict 1 Trouser
predict 1 Trouser
predict 6 Shirt
predict 1 Trouser
predict 4 Coat
predict 6 Shirt
predict 5 Sandal
predict 7 Sneaker
predict 4 Coat
predict 5 Sandal
predict 7 Sneaker
predict 3 Dress
predict 4 Coat
predict 1 Trouser
predict 2 Pullover
predict 2 Pullover    (wrong prediction; the actual label is 4 Coat)
predict 8 Bag
predict 0 T-shirt/top
predict 2 Pullover
predict 5 Sandal
predict 7 Sneaker
predict 5 Sandal
predict 1 Trouser
predict 2 Pullover
predict 6 Shirt
predict 0 T-shirt/top
predict 9 Ankle boot
predict 4 Coat
predict 8 Bag
predict 8 Bag
predict 3 Dress
predict 3 Dress
predict 8 Bag
As the output shows, the prediction at index 17 (counting from 0) is wrong.
Dataset index: 17, actual label: 4 Coat, predicted: 2 Pullover
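The index-by-index loop can also be written as a vectorized comparison. The sketch below uses the first 20 predictions from the output above and assumes that index 17 is the only mismatch in that range, as the 20-sample check reports; the `labels` array is reconstructed from that assumption and not read from the dataset.

```python
import numpy as np

# First 20 predicted class ids, copied from the output above.
preds = np.array([9, 2, 1, 1, 6, 1, 4, 6, 5, 7, 4, 5, 7, 3, 4, 1, 2, 2, 8, 0])
# Assumed labels: identical to the predictions except index 17, whose true class is 4 (Coat).
labels = preds.copy()
labels[17] = 4

wrong = np.flatnonzero(labels != preds)     # indices where prediction and label differ
accuracy = float(np.mean(labels == preds))  # fraction of correct predictions
print("wrong indices:", wrong)
print("accuracy on these 20 samples:", accuracy)
```

With the real arrays, `np.flatnonzero(test_labels != predictions)` would list every misclassified test image in one call.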