TensorFlow 读取本地数据集压缩包gz,并转换为numpy矩阵【修改官方源码而成】

本教程以读取Fashion-MNIST为例

1 下载Fashion-MNIST数据集gz格式压缩包

fashion-mnist

共下载四个文件

本人将所有文件保存到此文档路径中:'/home/brian/Documents/tensorflow-gpu/tensorflow-learning/data/fashion/'

2 在python3环境中通过如下读入数据,并转换为numpy矩阵

(注意替换程序中的path为你保存文件的所在文件夹的路径,  files为数据集文件名)

# 引入必要的库函数
from tensorflow.python.keras.utils.data_utils import get_file
from tensorflow.python.util.tf_export import tf_export
import gzip

# 读取本地gz文档,并转换为numpy矩阵的函数
def load_localData():
    
    path = '/home/brian/Documents/tensorflow-gpu/tensorflow-learning/data/fashion/'
    
    files = [
          'train-labels-idx1-ubyte.gz', 'train-images-idx3-ubyte.gz',
            't10k-labels-idx1-ubyte.gz', 't10k-images-idx3-ubyte.gz']
    
    paths = []
    for fname in files:
        paths.append(get_file(fname, origin=None, cache_dir=path + fname, cache_subdir=path))

    with gzip.open(paths[0], 'rb') as lbpath:
        y_train = np.frombuffer(lbpath.read(), np.uint8, offset=8)

    with gzip.open(paths[1], 'rb') as imgpath:
        x_train = np.frombuffer(\
            imgpath.read(), np.uint8, offset=16).reshape(len(y_train), 28, 28)

    with gzip.open(paths[2], 'rb') as lbpath:
        y_test = np.frombuffer(lbpath.read(), np.uint8, offset=8)

    with gzip.open(paths[3], 'rb') as imgpath:
        x_test = np.frombuffer(\
    imgpath.read(), np.uint8, offset=16).reshape(len(y_test), 28, 28)
    
    return x_train, y_train, x_test, y_test

3 应用函数读取数据

x_train, y_train, x_test, y_test = load_localData()
print(x_train.shape)
print(y_train.shape)
print(x_test.shape)
print(y_test.shape)
(60000, 28, 28)
(60000,)
(10000, 28, 28)
(10000,)
评论 6
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值