Loading a Locally Downloaded MNIST Dataset


aoyou19


Tags: deep learning, tensorflow

Article link: https://blog.csdn.net/aoyou19/article/details/120120699


Load the MNIST dataset from local copies that have already been downloaded. There are four files:
t10k-images-idx3-ubyte.gz
t10k-labels-idx1-ubyte.gz
train-images-idx3-ubyte.gz
train-labels-idx1-ubyte.gz

import matplotlib.pyplot as plt
import gzip
import numpy as np
import os

def load_data_gz(data_folder):
    files = ['train-labels-idx1-ubyte.gz', 'train-images-idx3-ubyte.gz', 't10k-labels-idx1-ubyte.gz',
             't10k-images-idx3-ubyte.gz']

    paths = []
    for fname in files:
        paths.append(os.path.join(data_folder, fname))

    # Read the data from each file (labels skip an 8-byte header, images a 16-byte header)
    with gzip.open(paths[0], 'rb') as lbpath:
        y_train = np.frombuffer(lbpath.read(), np.uint8, offset=8)

    with gzip.open(paths[1], 'rb') as imgpath:
        x_train = np.frombuffer(imgpath.read(), np.uint8, offset=16).reshape(len(y_train), 784)

    with gzip.open(paths[2], 'rb') as lbpath:
        y_test = np.frombuffer(lbpath.read(), np.uint8, offset=8)

    with gzip.open(paths[3], 'rb') as imgpath:
        x_test = np.frombuffer(imgpath.read(), np.uint8, offset=16).reshape(len(y_test), 784)

    return x_train, y_train, x_test, y_test

# Call load_data_gz to load the dataset
data_folder = '\\mnist_dataset'  # folder that contains the four .gz files
x_train_gz, y_train_gz, x_test_gz, y_test_gz = load_data_gz(data_folder)

print('x_train_gz.shape:', x_train_gz.shape)
print('y_train_gz.shape:', y_train_gz.shape)
print('x_test_gz.shape:', x_test_gz.shape)
print('y_test_gz.shape:', y_test_gz.shape)

Output:

x_train_gz.shape: (60000, 784)
y_train_gz.shape: (60000,)
x_test_gz.shape: (10000, 784)
y_test_gz.shape: (10000,)
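
The `offset=8` and `offset=16` values used when reading the files come from the IDX header: 4 bytes of magic number followed by one 4-byte big-endian integer per dimension (one dimension for labels, three for images). The sketch below reads that header so the offsets and array shapes can be checked before trusting them. The helper name `read_idx_header` is hypothetical and not part of the original post.

```python
import gzip
import struct


def read_idx_header(path):
    """Return (magic, dims) read from the header of a gzipped IDX file.

    Hypothetical helper for illustration; it only inspects the header.
    """
    with gzip.open(path, 'rb') as f:
        # First 4 bytes: two zero bytes, a data-type code, and the number of dimensions
        zero, dtype_code, ndim = struct.unpack('>HBB', f.read(4))
        # Each dimension size is a 4-byte big-endian unsigned integer
        dims = struct.unpack('>' + 'I' * ndim, f.read(4 * ndim))
    # Recombine the first 4 bytes into the conventional 32-bit magic number
    magic = (zero << 16) | (dtype_code << 8) | ndim
    return magic, dims
```

For a label file the magic number should be 2049 with a single dimension, and for an image file 2051 with three dimensions (count, rows, columns), so the header length is 4 + 4 * ndim bytes: 8 and 16 respectively.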

Convert each length-784 vector into a 28×28 image:

# Reshape every 784-element row into a 28x28 image in one vectorized step
train_image = x_train_gz.reshape(-1, 28, 28).astype(np.float32)
print('train_image.shape: ', train_image.shape)

Output:

train_image.shape:  (60000, 28, 28)
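
NumPy reshapes in row-major order by default, so pixel `k` of a flat vector ends up at row `k // 28`, column `k % 28` of the image. The small check below illustrates this with a stand-in vector of 784 distinct values (an assumption for illustration, not real MNIST data), so each position can be traced.

```python
import numpy as np

# Stand-in for one flattened image: 784 distinct values so positions are traceable
vec = np.arange(784)
img = vec.reshape(28, 28)

# Pixel k of the flat vector maps to (k // 28, k % 28) in the image
k = 100
row, col = divmod(k, 28)
assert img[row, col] == vec[k]

# Flattening the image again restores the original vector
assert np.array_equal(img.reshape(784), vec)
```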

View the first 16 images:


plt.figure()
for i in range(16):
    plt.subplot(4, 4, i+1)
    plt.imshow(train_image[i, :, :], 'gray')
    plt.axis('off')
plt.show()

Result:
[Figure: the first 16 training images displayed in a 4×4 grid]
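
To check that images and labels line up, each digit can be shown with its label as the subplot title. This is a minimal sketch and not the original author's code: the function name `plot_digit_grid` and its `rows` and `cols` parameters are hypothetical.

```python
import matplotlib.pyplot as plt


def plot_digit_grid(images, labels, rows=4, cols=4):
    """Draw the first rows*cols images in a grid, titling each with its label.

    Hypothetical helper: images is an array of shape (N, 28, 28), labels has length N.
    """
    # squeeze=False keeps axes two-dimensional even for a 1x1 grid
    fig, axes = plt.subplots(rows, cols, figsize=(6, 6), squeeze=False)
    for ax, img, label in zip(axes.ravel(), images, labels):
        ax.imshow(img, cmap='gray')
        ax.set_title(str(int(label)), fontsize=9)
        ax.axis('off')
    fig.tight_layout()
    return fig
```

With the arrays from this post, it could be called as `plot_digit_grid(train_image, y_train_gz)` followed by `plt.show()`.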
