Loading and Processing MNIST Data in Python

FishBear_move_on

Category: Python. Tags: python, mnist

Article link: https://blog.csdn.net/haluoluo211/article/details/81042529

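Classifying the MNIST dataset is one of the most common examples in machine learning and deep learning. This post briefly shows how to load and process the MNIST data; the question originally came from Stack Overflow.

Here is the code; the comments explain each step: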

import gzip
import pickle  # Python 3 replacement for cPickle

import numpy as np
import matplotlib.pyplot as plt

def load_data():
    path = '../../data/mnist.pkl.gz'
    # mnist.pkl.gz is typically a Python 2 pickle, so Python 3 needs encoding='latin1'
    with gzip.open(path, 'rb') as f:
        training_data, validation_data, test_data = pickle.load(f, encoding='latin1')

    X_train, y_train = training_data[0], training_data[1]
    print(X_train.shape, y_train.shape)
    # (50000, 784) (50000,)

    # get the first image and its label
    img1_arr, img1_label = X_train[0], y_train[0]
    print(img1_arr.shape, img1_label)
    # (784,) 5

    # reshape the first image (1D vector) into a 2D image
    img1_2d = np.reshape(img1_arr, (28, 28))
    # show it
    plt.subplot(111)
    plt.imshow(img1_2d, cmap=plt.get_cmap('gray'))
    plt.show()

The output looks like this:

[Figure: the first training image, a handwritten digit 5, shown in grayscale]
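
To try the loading logic without the real file, the sketch below writes a tiny synthetic archive with the same three-way `(images, labels)` layout and reads it back. The helper names `write_fake_mnist` and `load_splits`, the temporary path, and the random data are all assumptions made for illustration; none of them come from the original answer.

```python
import gzip
import os
import pickle
import tempfile

import numpy as np


def write_fake_mnist(path, n_train=6, n_valid=2, n_test=2, seed=0):
    """Write a small random stand-in that mimics the mnist.pkl.gz layout."""
    rng = np.random.default_rng(seed)

    def make_split(n):
        images = rng.random((n, 784)).astype(np.float32)  # flattened 28x28 images
        labels = rng.integers(0, 10, size=n)              # digit labels 0..9
        return images, labels

    # Three (images, labels) tuples: training, validation, test
    splits = (make_split(n_train), make_split(n_valid), make_split(n_test))
    with gzip.open(path, 'wb') as f:
        pickle.dump(splits, f)


def load_splits(path):
    """Load the (training, validation, test) tuples from a gzipped pickle."""
    with gzip.open(path, 'rb') as f:
        # encoding='latin1' matters for Python 2 pickles and is harmless here
        return pickle.load(f, encoding='latin1')


# Hypothetical temporary location, used only for this demonstration
tmp_path = os.path.join(tempfile.mkdtemp(), 'fake_mnist.pkl.gz')
write_fake_mnist(tmp_path)
training_data, validation_data, test_data = load_splits(tmp_path)
print(training_data[0].shape, training_data[1].shape)
# (6, 784) (6,)
```

Swapping `tmp_path` for the path of a real `mnist.pkl.gz` should give the `(50000, 784) (50000,)` shapes shown earlier, assuming the file follows the same layout.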

Vectorize the label (one-hot encoding):

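def vectorized_result(label):
    # return a (10, 1) column vector with 1.0 at index `label` and 0.0 elsewhere
    e = np.zeros((10, 1))
    e[label] = 1.0
    return e

# 5 is the label of the first training image (img1_label inside load_data)
print(vectorized_result(5))
# output:
# [[0.]
#  [0.]
#  [0.]
#  [0.]
#  [0.]
#  [1.]
#  [0.]
#  [0.]
#  [0.]
#  [0.]]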
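
The same encoding can be applied to a whole label array at once by indexing an identity matrix. This is a minimal sketch rather than the method from the original answer: `one_hot_batch` and `decode` are hypothetical helper names, and the result has rows of shape `(10,)` instead of the `(10, 1)` column vectors produced by `vectorized_result`.

```python
import numpy as np


def one_hot_batch(labels, num_classes=10):
    """One-hot encode an array of integer labels; returns shape (n, num_classes)."""
    labels = np.asarray(labels, dtype=np.int64)
    # Row i of the identity matrix is the one-hot vector for class i
    return np.eye(num_classes)[labels]


def decode(one_hot):
    """Recover integer labels from one-hot rows."""
    return np.argmax(one_hot, axis=-1)


labels = np.array([5, 0, 4, 1, 9])
encoded = one_hot_batch(labels)
print(encoded.shape)    # (5, 10)
print(decode(encoded))  # [5 0 4 1 9]
```

If the `(10, 1)` column layout is needed, `encoded[..., np.newaxis]` adds the trailing axis.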

We can also use a simple for loop to convert each 784-dimensional input vector into a 28×28 array for use with a CNN:

import pickle  # Python 3 replacement for cPickle

def load_data_v2():
    path = '../../data/mnist.pkl.gz'
    # encoding='latin1' lets Python 3 read a pickle written by Python 2
    with gzip.open(path, 'rb') as f:
        training_data, validation_data, test_data = pickle.load(f, encoding='latin1')

    X_train, y_train = training_data[0], training_data[1]
    print(X_train.shape, y_train.shape)
    # (50000, 784) (50000,)

    # reshape every 784-vector to 28x28 and one-hot encode every label
    X_train = np.array([np.reshape(item, (28, 28)) for item in X_train])
    y_train = np.array([vectorized_result(item) for item in y_train])

    print(X_train.shape, y_train.shape)
    # (50000, 28, 28) (50000, 10, 1)
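
The for loop is not strictly necessary, since NumPy's `reshape` can convert the whole array in one call. The sketch below uses random stand-in data instead of the real MNIST file (an assumption, so that it runs anywhere) and checks that the result matches the loop version. The trailing channel axis is one common layout for CNN inputs; whether a given framework expects it is worth verifying.

```python
import numpy as np

# Random stand-in for X_train: 100 flattened 28x28 images (not real MNIST data)
rng = np.random.default_rng(0)
X_flat = rng.random((100, 784))

# Loop version, as in load_data_v2
X_loop = np.array([np.reshape(item, (28, 28)) for item in X_flat])

# Single-call version: -1 lets NumPy infer the number of images
X_fast = X_flat.reshape(-1, 28, 28)

# Optional channel axis for frameworks that expect (N, height, width, channels)
X_cnn = X_fast[..., np.newaxis]

print(X_fast.shape, X_cnn.shape)       # (100, 28, 28) (100, 28, 28, 1)
print(np.array_equal(X_loop, X_fast))  # True
```

On the full 50000-image training set the single `reshape` call avoids building 50000 intermediate arrays in Python.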

Source: my own Stack Overflow answer.
