SVHN Data Experiments with TensorFlow


juezhanangle


Article link: https://blog.csdn.net/juezhanangle/article/details/73203693


SVHN is a real-world image dataset for developing machine learning and object recognition algorithms with minimal requirements on data preprocessing and formatting. Dataset download link

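The dataset I downloaded is the 32×32 version. I load it into the code with scipy and start by inspecting it. The .mat extension is MATLAB's data format, which scipy reads without trouble, but the array layout differs from the usual one. The code is as follows: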

# coding: UTF-8
from scipy.io import loadmat as load

traindata = load('data/train_32x32.mat')
testdata = load('data/test_32x32.mat')

print("Train Data Shape:", traindata['X'].shape)
print("Test Data Shape:", testdata['X'].shape)
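
For readers who do not have the files at hand, the layout can be reproduced with a small synthetic .mat file. The sketch below is only an illustration: the file name `fake_32x32.mat` and the random arrays are assumptions standing in for the real SVHN data, and the shapes simply mirror the (height, width, channels, number of images) layout described in this post.

```python
import os
import tempfile

import numpy as np
from scipy.io import loadmat, savemat

# Synthetic stand-in for train_32x32.mat (assumption: 5 random RGB images,
# not real SVHN data). SVHN stores the digit 0 as label 10.
rng = np.random.default_rng(0)
fake_X = rng.integers(0, 256, size=(32, 32, 3, 5), dtype=np.uint8)
fake_y = np.array([[1], [10], [3], [9], [10]], dtype=np.uint8)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "fake_32x32.mat")  # hypothetical file name
    savemat(path, {"X": fake_X, "y": fake_y})
    data = loadmat(path)

# loadmat returns a dict; header entries start with "__", the rest are variables
keys = sorted(k for k in data if not k.startswith("__"))
print("Variables:", keys)
print("X shape:", data["X"].shape)  # image index is the last axis
print("y shape:", data["y"].shape)  # one label per row

# Moving the image index to the front gives the (N, H, W, C) layout
images = np.transpose(data["X"], (3, 0, 1, 2))
print("Transposed shape:", images.shape)
```

The image index sits on the last axis of `X`, which is why a transpose is needed before the arrays can be used in the (number of images, height, width, channels) order.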

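The printed shapes show that the data uses the MATLAB layout, (height, width, channels, number of images), which is quite different from the usual form. Next, preprocess the data and take a look at it: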

# coding: UTF-8
from scipy.io import loadmat as load
import matplotlib.pyplot as plt
import numpy as np


def reformat(samples, labels):
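    # Change the shape of the raw data:
    # (height, width, channels, num_images) -> (num_images, height, width, channels)
    # and convert labels to one-hot encoding (SVHN stores the digit 0 as label 10)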
    samples = np.transpose(samples, (3, 0, 1, 2))
    labels = np.array([x[0] for x in labels])
    one_hot_labels = []
    for num in labels:
        one_hot = [0.0] * 10
        if num == 10:
            one_hot[0] = 1.0
        else:
            one_hot[num] = 1.0
        one_hot_labels.append(one_hot)
    labels = np.array(one_hot_labels).astype(np.float32)
    return samples, labels


def normalize(samples):
    # Linearly map images from 0-255 to -1.0 to +1.0
    # and convert them to grayscale
    pass

def distribution(labels, name):
    # Look at the distribution of each label, i.e. its proportion
    pass

def inspect(dataset, labels, i):
    # Display an image to take a look
    print(labels[i])
    plt.imshow(dataset[i])
    plt.show()

train = load('data/train_32x32.mat')
test = load('data/test_32x32.mat')

print("Train Data Shape:", train['X'].shape)
print("Train Label Shape:", train['y'].shape)

train_samples = train['X']
train_labels = train['y']
test_samples = test['X']
test_labels = test['y']

_train_sample, _train_labels = reformat(train_samples, train_labels)
_test_sample, _test_labels = reformat(test_samples, test_labels)

num_labels = 10
image_size = 32


if __name__ == '__main__':
    # Explore the data
    inspect(_train_sample, _train_labels, 12322)

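In this code, reformat does the preprocessing: it changes the layout of the raw arrays and converts the labels to one-hot form. The inspect function then displays the data as an image. Running it on training sample 12322 prints the one-hot label and shows the corresponding image.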
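
The normalize and distribution functions are left as empty stubs in the code above. One possible way to fill them in, following their comments, is sketched here. This is not the author's implementation: averaging the three channels to get grayscale and returning the class ratios as an array are assumptions made for illustration, and the random arrays only stand in for the reformatted SVHN samples.

```python
import numpy as np


def normalize(samples):
    """Grayscale the images, then map 0..255 linearly to -1.0..+1.0.

    Expects uint8 samples shaped (N, H, W, 3) and returns float32 (N, H, W, 1).
    Grayscale is the plain channel mean here (an assumption).
    """
    gray = samples.astype(np.float32).mean(axis=3, keepdims=True)
    return gray / 127.5 - 1.0


def distribution(labels, name):
    """Print and return the fraction of samples per class from one-hot labels."""
    counts = labels.sum(axis=0)      # samples per class
    ratios = counts / counts.sum()   # proportion per class
    print(name, ratios)
    return ratios


# Synthetic stand-ins for the reformatted samples and one-hot labels
rng = np.random.default_rng(0)
fake_samples = rng.integers(0, 256, size=(8, 32, 32, 3), dtype=np.uint8)
fake_labels = np.eye(10, dtype=np.float32)[[0, 1, 1, 2, 2, 2, 9, 9]]

normalized = normalize(fake_samples)
ratios = distribution(fake_labels, "fake train")
print(normalized.shape, normalized.dtype)
```

Note that after this normalize the images have a single channel and negative values, so inspect would need to drop the last axis and pass a grayscale colormap to plt.imshow.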
