08-TensorFlow: Loading datasets through library imports

鸣鼓ming

Last modified: 2022-04-20 15:26:21

Category: TensorFlow入门 (Getting Started with TensorFlow); tags: tensorflow

First published: 2022-04-20 15:25:31

Article link: https://blog.csdn.net/qq_41865229/article/details/124296974

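A dataset does not have to be downloaded by hand and saved to local disk before it can be used. It can also be loaded directly through a library import, as with the Iris dataset used in the earlier iris-classification article. The sections below introduce several datasets that can be obtained this way.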

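1. Iris dataset

1. Dataset introduction

The dataset contains 150 samples. Each sample has four input features (sepal length, sepal width, petal length, and petal width) together with the iris species those features belong to. There are three species: Setosa Iris, Versicolour Iris, and Virginica Iris, encoded as the labels 0, 1, and 2 respectively.

2. Loading the dataset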

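import tensorflow as tf
from sklearn import datasets
import numpy as np

# Load the Iris dataset (it ships with scikit-learn, so nothing is downloaded)
iris = datasets.load_iris()
x_train = iris.data
y_train = iris.target

# Shuffle the dataset. Reseeding with the same value before each shuffle
# applies the same ordering to both arrays, so features and labels stay paired.
np.random.seed(116)
np.random.shuffle(x_train)
np.random.seed(116)
np.random.shuffle(y_train)
tf.random.set_seed(116)  # also seed TensorFlow for reproducible training later

print("x_train shape:", x_train.shape)
print("y_train shape:", y_train.shape)
print("First 10 rows of x_train:")
print(x_train[0:10])
print("First 10 labels of y_train:")
print(y_train[0:10])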
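
The listing above keeps features and labels aligned by reseeding NumPy before each shuffle. As an alternative (not the approach used in this article), a single index permutation can be applied to both arrays. The sketch below assumes small made-up arrays in place of the Iris data; the names `x`, `y`, and `perm` are illustrative only.

```python
import numpy as np

# Hypothetical toy data standing in for the Iris features and labels.
x = np.arange(12, dtype=np.float32).reshape(6, 2)  # row i is [2i, 2i + 1]
y = np.arange(6)                                    # label i belongs to row i

rng = np.random.default_rng(116)  # seeded generator for reproducibility
perm = rng.permutation(len(x))    # one random ordering of the row indices

# Indexing both arrays with the same permutation keeps each pair together.
x_shuffled = x[perm]
y_shuffled = y[perm]

# Check alignment: the first feature of every row is still twice its label.
print(np.array_equal(x_shuffled[:, 0], 2 * y_shuffled))
```

Because the permutation is computed once, the pairing does not depend on two separate shuffles consuming the random stream in the same way.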

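2. MNIST dataset

1. Dataset introduction

MNIST provides 60,000 images of handwritten digits 0 to 9, each 28 * 28 pixels, with labels for training, and another 10,000 such images with labels for testing. All of them are grayscale images with white digits on a black background.
Note: each image is a two-dimensional array, so it must be flattened into a one-dimensional array before it is fed to a network that expects one-dimensional input (for example, a fully connected network). A network that accepts two-dimensional input does not need the flattening step.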
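
The flattening step mentioned in the note can be sketched with NumPy alone. This is a minimal illustration that assumes a zero-filled array in place of real MNIST images; `images` and `flat` are hypothetical names.

```python
import numpy as np

# Hypothetical stand-in for a batch of MNIST images: 5 images of 28 x 28 pixels.
images = np.zeros((5, 28, 28), dtype=np.uint8)

# Flatten each image into one vector; -1 lets NumPy infer 28 * 28 = 784.
flat = images.reshape(images.shape[0], -1)

print(images.shape)  # (5, 28, 28)
print(flat.shape)    # (5, 784)
```

Inside a Keras model, a `tf.keras.layers.Flatten` layer placed first performs the same reshaping, so the arrays can be passed in unchanged.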

2. Loading the dataset

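import tensorflow as tf
from matplotlib import pyplot as plt

# load_data() downloads MNIST on the first call and reuses the cached copy afterwards
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# Display the first image of the training set
plt.imshow(x_train[0], cmap='gray')  # draw as a grayscale image
plt.show()

# Print the pixel values of the first training image
print("x_train[0]:\n", x_train[0])
# Print the label of the first training image
print("y_train[0]:\n", y_train[0])

# Print the shape of the training features
print("x_train.shape:\n", x_train.shape)
# Print the shape of the training labels
print("y_train.shape:\n", y_train.shape)
# Print the shape of the test features
print("x_test.shape:\n", x_test.shape)
# Print the shape of the test labels
print("y_test.shape:\n", y_test.shape)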

3. Fashion-MNIST dataset

1. Dataset introduction

Fashion-MNIST provides 60,000 images of clothing items, each 28 * 28 pixels, with labels for training, and another 10,000 such images with labels for testing.

The dataset covers 10 classes: T-shirt, trouser, pullover, dress, coat, sandal, shirt, sneaker, bag, and ankle boot.
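
The labels returned for this dataset are integers from 0 to 9, in the same order as the class list above. A small lookup list makes printed labels easier to read. The sketch below assumes a few made-up label values rather than the real `y_train`; `CLASS_NAMES` and `labels` are illustrative names, not part of the Keras API.

```python
import numpy as np

# Class names in label order (index 0 to 9).
CLASS_NAMES = ["T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
               "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]

# Hypothetical labels standing in for a slice of y_train.
labels = np.array([9, 0, 3], dtype=np.uint8)

# Use each integer label as an index into the name list.
names = [CLASS_NAMES[i] for i in labels]
print(names)  # ['Ankle boot', 'T-shirt/top', 'Dress']
```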

2. Loading the dataset

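import tensorflow as tf
from matplotlib import pyplot as plt

# load_data() downloads Fashion-MNIST on the first call and reuses the cached copy afterwards
fashion = tf.keras.datasets.fashion_mnist
(x_train, y_train), (x_test, y_test) = fashion.load_data()
# Scale pixel values from the range [0, 255] to [0, 1]
x_train, x_test = x_train / 255.0, x_test / 255.0

# Display the first image of the training set
plt.imshow(x_train[0], cmap='gray')  # draw as a grayscale image
plt.show()

# Print the (scaled) pixel values of the first training image
print("x_train[0]:\n", x_train[0])
# Print the label of the first training image
print("y_train[0]:\n", y_train[0])

# Print the shape of the training features
print("x_train.shape:\n", x_train.shape)
# Print the shape of the training labels
print("y_train.shape:\n", y_train.shape)
# Print the shape of the test features
print("x_test.shape:\n", x_test.shape)
# Print the shape of the test labels
print("y_test.shape:\n", y_test.shape)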
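
The listing above divides the pixel arrays by 255.0 before using them. The effect of that step can be seen on a tiny array. This is a sketch that assumes hand-written pixel values instead of the real images; `pixels` and `scaled` are hypothetical names.

```python
import numpy as np

# Hypothetical stand-in for image data: two 2 x 2 "images" with uint8 pixels.
pixels = np.array([[[0, 51], [102, 255]],
                   [[255, 204], [153, 0]]], dtype=np.uint8)

# Dividing by a float converts the array to floating point in the range [0, 1].
scaled = pixels / 255.0

print(scaled.dtype)                # float64
print(scaled.min(), scaled.max())  # 0.0 1.0
```

The array shape is unchanged; only the value range and the data type differ.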

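4. CIFAR-10 dataset

1. Dataset introduction

CIFAR-10 is a color image dataset whose content is closer to general, everyday objects. It is a small dataset for general object recognition collected by Alex Krizhevsky, Vinod Nair, and Geoffrey Hinton. It contains RGB color images in 10 classes: airplane, automobile, bird, cat, deer, dog, frog, horse, ship, and truck.

It provides 50,000 color images in the 10 classes, each 32 * 32 pixels, with labels for training.
It provides another 10,000 such images with labels for testing.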
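
One detail worth knowing when working with these labels: `cifar10.load_data()` returns them as a column of shape (N, 1) rather than a flat vector, and the integers 0 to 9 follow the class order listed above. The sketch below assumes a few made-up labels in that layout; `CLASS_NAMES`, `y`, and `flat` are illustrative names.

```python
import numpy as np

# Class names in label order (index 0 to 9).
CLASS_NAMES = ["airplane", "automobile", "bird", "cat", "deer",
               "dog", "frog", "horse", "ship", "truck"]

# Hypothetical labels in the (N, 1) layout used for CIFAR-10 labels.
y = np.array([[6], [9], [1]], dtype=np.uint8)

# Collapse the column into a flat vector before using it as an index.
flat = y.reshape(-1)
names = [CLASS_NAMES[i] for i in flat]

print(y.shape, flat.shape)  # (3, 1) (3,)
print(names)                # ['frog', 'truck', 'automobile']
```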

2. Loading the dataset

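import tensorflow as tf
from matplotlib import pyplot as plt

# load_data() downloads CIFAR-10 on the first call and reuses the cached copy afterwards
cifar10 = tf.keras.datasets.cifar10
(x_train, y_train), (x_test, y_test) = cifar10.load_data()

# Display the first image of the training set
plt.imshow(x_train[0])  # draw the color image
plt.show()

# Print the pixel values of the first training image
print("x_train[0]:\n", x_train[0])
# Print the label of the first training image
print("y_train[0]:\n", y_train[0])

# Print the shape of the training features
print("x_train.shape:\n", x_train.shape)
# Print the shape of the training labels
print("y_train.shape:\n", y_train.shape)
# Print the shape of the test features
print("x_test.shape:\n", x_test.shape)
# Print the shape of the test labels
print("y_test.shape:\n", y_test.shape)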
