Kaggle in Practice: The Simplest Digit Recognizer

ep_mashiro

Categories: python, machine learning. Tags: data, Kaggle

Article link: https://blog.csdn.net/tinkle181129/article/details/55261251

Digit Recognizer from kaggle

link: https://www.kaggle.com/c/digit-recognizer

Digit Recognizer is one of the most basic competitions on Kaggle.

Dataset description:

The data files train.csv and test.csv contain gray-scale images of hand-drawn digits, from zero through nine.

Each image is 28 pixels in height and 28 pixels in width, for a total of 784 pixels in total. Each pixel has a single pixel-value associated with it, indicating the lightness or darkness of that pixel, with higher numbers meaning darker. This pixel-value is an integer between 0 and 255, inclusive.

The training data set, (train.csv), has 785 columns. The first column, called “label”, is the digit that was drawn by the user. The rest of the columns contain the pixel-values of the associated image.

Each pixel column in the training set has a name like pixelx, where x is an integer between 0 and 783, inclusive. To locate this pixel on the image, suppose that we have decomposed x as x = i * 28 + j, where i and j are integers between 0 and 27, inclusive. Then pixelx is located on row i and column j of a 28 x 28 matrix, (indexing by zero).
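The decomposition x = i * 28 + j described above can be sketched in a few lines of Python. The helper names `pixel_position` and `pixel_name` are hypothetical and only serve this illustration; they are not part of the competition data or of any library.

```python
def pixel_position(column_name, width=28):
    """Return (row, col) for a column named like 'pixel123' (zero-indexed)."""
    x = int(column_name[len("pixel"):])
    if not 0 <= x < width * width:
        raise ValueError("pixel index out of range: {}".format(x))
    # divmod gives (i, j) such that x == i * width + j
    return divmod(x, width)


def pixel_name(row, col, width=28):
    """Inverse mapping: build the column name from a (row, col) position."""
    return "pixel{}".format(row * width + col)


print(pixel_position("pixel0"))    # first pixel of the first row
print(pixel_position("pixel29"))   # second pixel of the second row
print(pixel_position("pixel783"))  # last pixel of the last row
```

Using `divmod` keeps the row and column consistent with each other, since both come from the same integer division by the image width.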

First, take a look at the dataset.

# -*- coding: utf-8 -*-
%matplotlib inline
import pandas as pd  # data processing, CSV file I/O (e.g. pd.read_csv)

def opencsv():  # load both CSV files with pandas
    data = pd.read_csv('data/train.csv')
    data1 = pd.read_csv('data/test.csv')
    train_data = data.values[:, 1:]  # all training rows, pixel columns only
    train_label = data.values[:, 0]  # first column holds the labels
    test_data = data1.values  # all test rows (test.csv has no label column)
    print('Data Load Done!')
    return train_data, train_label, test_data
train_data, train_label, test_data = opencsv()
# train_data holds the 784 features of the training set, test_data holds the 784 features of the test set, and train_label holds the training labels
# This is clearly a typical supervised learning problem

Data Load Done!
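To make the label/feature split concrete without the competition files, here is a minimal sketch that uses only the standard library. The in-memory `SAMPLE_CSV` table and the function `split_labels_and_pixels` are assumptions made up for this illustration: the sample uses 4 pixel columns instead of 784, and its values are not real data.

```python
import csv
import io

# Hypothetical miniature of train.csv: a "label" column followed by pixel columns.
# Real rows have 784 pixel columns; 4 are used here to keep the sample readable.
SAMPLE_CSV = """label,pixel0,pixel1,pixel2,pixel3
1,0,255,0,255
7,12,0,34,0
"""


def split_labels_and_pixels(csv_text):
    """Split rows into a list of labels and a list of pixel-value lists."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    if header[0] != "label":
        raise ValueError("expected the first column to be 'label'")
    labels, pixels = [], []
    for row in reader:
        if not row:  # skip blank lines
            continue
        labels.append(int(row[0]))                # first column: the digit
        pixels.append([int(v) for v in row[1:]])  # remaining columns: features
    return labels, pixels


labels, pixels = split_labels_and_pixels(SAMPLE_CSV)
print(labels)
print(pixels)
```

The slicing mirrors what `data.values[:, 0]` and `data.values[:, 1:]` do on the pandas side: column 0 becomes the target and everything after it becomes the feature matrix.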

import matplotlib.pyplot as plt
from numpy import *
print(shape(train_data), shape(test_data))  # 42000 training samples and 28000 test samples
def showPic(data):
    plt.figure(figsize=(7,7))
    # show the first 70 images
    for digit_num in range(0,70):
        plt.subplot(7,10,digit_num+1)
        grid_data = data[digit_num].r
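As a separate, minimal sketch of the general idea of viewing one flattened 784-value row as a 28 x 28 image, the following uses plain Python lists and a synthetic image. The helpers `to_grid` and `render` are hypothetical names for this illustration and do not reproduce the plotting code above; with NumPy arrays the same regrouping is usually done by reshaping to (28, 28).

```python
def to_grid(flat_pixels, width=28):
    """Regroup a flat list of pixel values into rows of `width` values."""
    if len(flat_pixels) % width != 0:
        raise ValueError("length must be a multiple of the row width")
    return [flat_pixels[start:start + width]
            for start in range(0, len(flat_pixels), width)]


def render(grid, threshold=128):
    """Draw the grid as text: '#' for values at or above threshold, '.' otherwise."""
    return "\n".join(
        "".join("#" if value >= threshold else "." for value in row)
        for row in grid
    )


# Synthetic 784-value "image": a vertical bar in columns 13 and 14.
flat = [255 if (x % 28) in (13, 14) else 0 for x in range(784)]
grid = to_grid(flat)
print(render(grid))
```

A text rendering like this is a quick sanity check that rows and columns are not transposed before any plotting library is involved.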
