Converting Between JPG, PNG, and the MNIST Dataset


Author: sdoddyjm68


Article link: https://blog.csdn.net/sdoddyjm68/article/details/78430209


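I recently needed handwriting recognition and remembered that the TensorFlow tutorials include a handwritten-digit recognition example, which seemed like a good fit. The obvious problem is that the tutorial handles the dataset preprocessing automatically. If I want to feed in my own handwritten images, how should I preprocess them?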

The dataset TensorFlow uses is MNIST. As the dataset's official site shows, MNIST stores the images in a dedicated binary format, in files with the suffix idx3-ubyte.
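To make the file format a little more concrete, here is a minimal sketch of how the 4-byte magic number at the start of an IDX file can be split into its parts. The type codes come from the IDX format description on the dataset site; the helper name `describe_magic` and the table name `IDX_TYPES` are hypothetical and chosen only for this illustration. The value 2051 is the magic number of the MNIST image files.

```python
import struct

# Data-type codes from the IDX format description (third byte of the magic number).
IDX_TYPES = {
    0x08: "unsigned byte",
    0x09: "signed byte",
    0x0B: "short (2 bytes)",
    0x0C: "int (4 bytes)",
    0x0D: "float (4 bytes)",
    0x0E: "double (8 bytes)",
}


def describe_magic(header):
    """Split the first 4 bytes of an IDX file into (type name, number of dimensions).

    Hypothetical helper for illustration, not part of the original article.
    """
    zero1, zero2, type_code, ndim = struct.unpack(">BBBB", header[:4])
    if zero1 != 0 or zero2 != 0:
        raise ValueError("not an IDX file: the first two bytes must be 0")
    return IDX_TYPES[type_code], ndim


# 2051 == 0x00000803: type code 0x08 (unsigned byte), 3 dimensions.
print(describe_magic(struct.pack(">i", 2051)))  # ('unsigned byte', 3)
```

The three dimensions of an image file are the image count, the rows, and the columns, which is why the suffix is idx3-ubyte. A label file has a single dimension, hence idx1-ubyte.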

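I also found this article on Jianshu: "Parsing the MNIST dataset with Python" (利用 python 解析MNIST数据集; will be removed on request). Its overall approach is to use Python's built-in struct module for the binary file operations. The code is as follows: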

"""
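Converts the MNIST handwritten-digit data files to BMP image files.
For the format details, see the official site and the code comments.

========================
Parsing rules for the IDX file format:
========================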
THE IDX FILE FORMAT

the IDX file format is a simple format for vectors and multidimensional matrices of various numerical types.
The basic format is

magic number
size in dimension 0
size in dimension 1
size in dimension 2
.....
size in dimension N
data

The magic number is an integer (MSB first). The first 2 bytes are always 0.

The third byte codes the type of the data:
0x08: unsigned byte
0x09: signed byte
0x0B: short (2 bytes)
0x0C: int (4 bytes)
0x0D: float (4 bytes)
0x0E: double (8 bytes)

The 4-th byte codes the number of dimensions of the vector/matrix: 1 for vectors, 2 for matrices....

The sizes in each dimension are 4-byte integers (MSB first, high endian, like in most non-Intel processors).

The data is stored like in a C array, i.e. the index in the last dimension changes the fastest.
"""

import numpy as np
import struct
import matplotlib.pyplot as plt

# Training set image file
train_images_idx3_ubyte_file = '../../data/mnist/bin/train-images.idx3-ubyte'
# Training set label file
train_labels_idx1_ubyte_file = '../../data/mnist/bin/train-labels.idx1-ubyte'

# Test set image file
test_images_idx3_ubyte_file = '../../data/mnist/bin/t10k-images.idx3-ubyte'
# Test set label file
test_labels_idx1_ubyte_file = '../../data/mnist/bin/t10k-labels.idx1-ubyte'


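def decode_idx3_ubyte(idx3_ubyte_file):
    """
    Generic function for parsing an idx3 file.
    :param idx3_ubyte_file: path to the idx3 file
    :return: the dataset
    """
    # Read the binary data
    with open(idx3_ubyte_file, 'rb') as f:
        bin_data = f.read()

    # Parse the header: magic number, number of images, image height, image width
    offset = 0
    fmt_header = '>iiii'
    magic_number, num_images, num_rows, num_cols = struct.unpack_from(fmt_header, bin_data, offset)
    print('Magic number: %d, number of images: %d, image size: %d*%d' % (magic_number, num_images, num_rows, num_cols))

    # Parse the dataset
    image_size = num_rows * num_cols
    offset += struct.calcsize(fmt_header)
    fmt_image = '>' + str(image_size) + 'B'
    images = np.empty((num_images, num_rows, num_cols))
    # NOTE: the source is cut off after "for i". The loop body and the return
    # below are an assumed reconstruction, not the original author's text.
    for i in range(num_images):
        images[i] = np.array(struct.unpack_from(fmt_image, bin_data, offset)).reshape((num_rows, num_cols))
        offset += struct.calcsize(fmt_image)
    return images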
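For the opposite direction, getting your own images into the MNIST format, one possible approach is to write the same header and pixel layout back out with struct. The following is only a sketch under stated assumptions: the function name `encode_idx3_ubyte` is hypothetical, the images are passed as plain nested lists of 0-255 integers, and every image is assumed to have the same shape.

```python
import struct


def encode_idx3_ubyte(images):
    """Pack equally sized grayscale images into idx3-ubyte bytes.

    `images` is a list of images; each image is a list of rows, and each row
    is a list of ints in 0..255. Hypothetical helper for illustration only.
    """
    num_images = len(images)
    num_rows = len(images[0]) if num_images else 0
    num_cols = len(images[0][0]) if num_rows else 0
    # Header: magic number 2051 (unsigned byte, 3 dimensions), then the sizes.
    chunks = [struct.pack(">iiii", 2051, num_images, num_rows, num_cols)]
    for image in images:
        if len(image) != num_rows or any(len(row) != num_cols for row in image):
            raise ValueError("all images must have the same shape")
        # C-array order: the index in the last dimension changes the fastest.
        pixels = [p for row in image for p in row]
        chunks.append(struct.pack(">%dB" % len(pixels), *pixels))
    return b"".join(chunks)


# Two tiny 2x2 "images" stand in for real 28x28 digits.
demo = [[[0, 255], [128, 64]], [[1, 2], [3, 4]]]
data = encode_idx3_ubyte(demo)
print(struct.unpack_from(">iiii", data, 0))  # (2051, 2, 2, 2)
print(len(data))                             # 24: 16 header bytes + 8 pixels
```

To use this with a JPG or PNG, the pixel rows would first have to be produced from the image, for example by loading it with an imaging library, converting it to grayscale, and resizing it to 28x28. MNIST digits are light strokes on a dark background, so a photo of dark ink on white paper would likely need to be inverted as well.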
