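I. Problem Definition
Fashion-MNIST is a dataset released at the end of August 2017 by Zalando Research, a German research organization. The training set contains 60,000 samples and the test set 10,000, divided into 10 classes. Every sample is an everyday clothing item (tops, trousers, shoes, bags) stored as a 28×28 grayscale image.
The dataset is intended as a replacement for the handwritten-digit dataset MNIST. It can serve as a benchmark for machine learning algorithms and is equally suitable for beginners.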
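To make the ten classes concrete, here is a minimal sketch that maps an integer label to a readable name. The label order follows the Fashion-MNIST README. The names `CLASS_NAMES` and `label_to_name` are hypothetical helpers introduced for illustration and are not part of any library.

```python
# Label order as listed in the Fashion-MNIST README (0 through 9).
CLASS_NAMES = [
    'T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
    'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot',
]


def label_to_name(label):
    """Return the class name for an integer label in the range 0..9."""
    return CLASS_NAMES[int(label)]


# Print the full label-to-name table.
for index, name in enumerate(CLASS_NAMES):
    print(index, name)
```

A lookup like this is mainly useful when titling plots or reading predictions, since the dataset itself stores only the integer labels.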
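The dataset looks roughly as follows (each class takes up three rows):
GitHub repository: https://github.com/zalandoresearch/fashion-mnist
AI Studio project for this article: https://aistudio.baidu.com/aistudio/projectdetail/1515375
Import the required libraries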
import paddle
import cv2
import matplotlib.pyplot as plt
import matplotlib.image as mping
import numpy as np
print(paddle.__version__)
2.0.0
II. Data Preparation
1. Data loading and preprocessing
# Fetch the data
import paddle.vision.transforms as T
from paddle.vision.datasets import FashionMNIST
# Load and preprocess the data
transform = T.Normalize(mean=[127.5], std=[127.5]) # normalize with the PaddlePaddle API: (x - 127.5) / 127.5
# Training dataset
train_dataset = FashionMNIST(mode='train', transform=transform)
# Evaluation dataset
eval_dataset = FashionMNIST(mode='test', transform=transform)
print('Training samples: {}, evaluation samples: {}'.format(len(train_dataset), len(eval_dataset)))
Training samples: 60000, evaluation samples: 10000
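The Normalize transform above uses mean 127.5 and std 127.5, which maps raw pixel values in [0, 255] to roughly [-1, 1]. The sketch below reproduces only that arithmetic in plain Python. The function `normalize_pixel` is a hypothetical helper for illustration, not a Paddle API.

```python
# Arithmetic behind Normalize(mean=[127.5], std=[127.5]), per pixel.
MEAN = 127.5
STD = 127.5


def normalize_pixel(value, mean=MEAN, std=STD):
    """Map a raw pixel value to (value - mean) / std."""
    return (value - mean) / std


# The darkest, middle, and brightest 8-bit values.
for raw in (0, 127.5, 255):
    print(raw, '->', normalize_pixel(raw))
```

Centering the inputs around zero with a comparable scale in each direction is a common choice for training convolutional networks.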
plt.figure()
img = train_dataset[0][0]
img = plt.imshow(img.reshape([28, 28]), cmap=plt.cm.binary) # show the image with the binary (white-to-black) colormap
plt.show()
III. Model Selection and Development
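The layers in this section use padding='SAME', so it helps to know the feature-map sizes in advance. The sketch below assumes Paddle's 'SAME' mode follows the usual convention, where the output size is ceil(input_size / stride) regardless of kernel size. The function `same_output_size` is a hypothetical helper for illustration, not a Paddle API.

```python
import math


def same_output_size(input_size, stride):
    """Spatial output size under 'SAME' padding: ceil(input_size / stride)."""
    return math.ceil(input_size / stride)


# 28x28 input -> 3x3 convolution with stride 1 -> 2x2 max pooling with stride 2
after_conv1 = same_output_size(28, 1)
after_pool1 = same_output_size(after_conv1, 2)
print('after conv1:', after_conv1)
print('after max_pool1:', after_pool1)
```

Under this assumption the first convolution keeps the 28×28 resolution and the first pooling layer halves it to 14×14.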
# Build the network model
class FashionMNISTNet(paddle.nn.Layer):
    def __init__(self):
        super(FashionMNISTNet, self).__init__()
        ## First convolutional layer
        self.conv1 = paddle.nn.Conv2D(in_channels=1, out_channels=30, kernel_size=3, stride=1, padding='SAME')
        ## Activation function
        self.act = paddle.nn.ReLU()
        ## First pooling layer (max pooling is used here)
        self.max_pool1 = paddle.nn.MaxPool2D(kernel_size=2, stride=2, padding='SAME')
        ## Second convolutional layer
        self.conv2