[Machine Learning] Handwritten Digit Recognition: The sklearn.datasets Dataset

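sklearn ships with several small datasets, and the handwritten digits dataset can be loaded with load_digits. The loaders in sklearn.datasets follow a common pattern: read the data files bundled with the package and return a Bunch. Note that the source excerpt below is load_linnerud, a sibling loader of load_digits, which illustrates that pattern: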

def load_linnerud():
    """Load and return the linnerud dataset (multivariate regression).
    Samples total: 20
    Dimensionality: 3 for both data and targets
    Features: integer
    Targets: integer
    Returns
    -------
    data : Bunch
        Dictionary-like object, the interesting attributes are: 'data' and
        'targets', the two multivariate datasets, with 'data' corresponding to
        the exercise and 'targets' corresponding to the physiological
        measurements, as well as 'feature_names' and 'target_names'.
    """
    base_dir = join(dirname(__file__), 'data/')
    # Read data
    data_exercise = np.loadtxt(base_dir + 'linnerud_exercise.csv', skiprows=1)
    data_physiological = np.loadtxt(base_dir + 'linnerud_physiological.csv',
                                    skiprows=1)
    # Read header
    with open(base_dir + 'linnerud_exercise.csv') as f:
        header_exercise = f.readline().split()
    with open(base_dir + 'linnerud_physiological.csv') as f:
        header_physiological = f.readline().split()
    with open(dirname(__file__) + '/descr/linnerud.rst') as f:
        descr = f.read()
 
    return Bunch(data=data_exercise, feature_names=header_exercise,
                 target=data_physiological,
                 target_names=header_physiological,
                 DESCR=descr)
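
The docstring above calls the return value a "dictionary-like object". As a rough illustration of what that means, here is a minimal sketch of a dict whose keys can also be read as attributes. `MiniBunch` is a hypothetical stand-in written for this example, not sklearn's actual `Bunch` implementation, and the values stored in it are made up.

```python
class MiniBunch(dict):
    """Hypothetical stand-in for a Bunch: a dict whose keys are also attributes."""

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails,
        # so built-in dict methods such as keys() keep working.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name)

    def __setattr__(self, name, value):
        # Attribute assignment stores the value as a dict entry.
        self[name] = value


# Made-up toy values, only to show the two access styles.
toy = MiniBunch(data=[[1, 2, 3]], feature_names=["a", "b", "c"])
toy.target = [0]

print(list(toy.keys()))
print(toy["data"] is toy.data)
```

This is why the code later in the post can call both `digits.keys()` and `digits.data` on the same object.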

Loading and inspecting the dataset:

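import matplotlib
import matplotlib.pyplot as plt
from sklearn import datasets

# The handwritten digits dataset is a ready-made Bunch object; think of it as a dictionary
digits = datasets.load_digits()
# Use keys() to see what the dataset contains
print(digits.keys())

# Show the dataset description provided by sklearn.datasets
# The original UCI dataset has 5620 images; the copy bundled with sklearn is a subset of 1797
# Each image has 64 pixels, i.e. 64 features (8*8 integer pixel image)
# Each feature takes a value in 0~16, and the labels are the 10 digits
# print(digits.DESCR)

# Shape of the feature matrix
X = digits.data
print(X.shape)
# Shape of the label vector
y = digits.target
print(y.shape)
# The label classes
print(digits.target_names)
# Pick out one specific sample and look at its features and label
some_digit = X[666]
print(some_digit)
print(y[666])
# The same sample can also be visualized as an image
some_digit_image = some_digit.reshape(8, 8)
plt.imshow(some_digit_image, cmap=matplotlib.cm.binary)
plt.show()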

Visualization result: [figure: the 8*8 image of sample 666 drawn with the binary colormap]
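
The numbers quoted in the comments can be checked directly against the loaded data. The following is a minimal sketch, assuming the copy of the digits dataset bundled with sklearn; the variable names (`value_min`, `same_pixels`, `class_counts`, and so on) are introduced here only for illustration.

```python
import numpy as np
from sklearn import datasets

digits = datasets.load_digits()
X, y = digits.data, digits.target

# Number of samples and number of features per sample
n_samples, n_features = X.shape
# Smallest and largest pixel value across the whole dataset
value_min, value_max = X.min(), X.max()
# digits.images holds the same pixels as 8x8 arrays;
# flattening each image should reproduce digits.data
same_pixels = np.array_equal(digits.images.reshape(n_samples, -1), X)
# Number of samples for each digit class
class_counts = np.bincount(y)

print(n_samples, n_features)
print(value_min, value_max)
print(same_pixels)
print(class_counts)
```

Because `digits.images` already stores each sample as an 8*8 array, `digits.images[666]` can be passed to `plt.imshow` without calling `reshape`.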

 
