Pointing fetch_mldata at the correct dataset path for MNIST
1. Download link for the MNIST dataset file
Link: https://pan.baidu.com/s/1VFI8RMVLyGHTmZJI6siz-g
Extraction code: 1234
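2. Put the downloaded dataset file into the mldata directory under the scikit-learn data home directory; this is the default path that fetch_mldata reads from.
To find the default data home directory: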
from sklearn.datasets import get_data_home
print(get_data_home())
This prints the storage path on the local machine, for example: C:\Users\username\scikit_learn_data
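The copy step described above can also be scripted. The following is a minimal sketch, not part of the original post: the helper name install_mldata_file is hypothetical, and it only assumes that the data home directory is the path printed by get_data_home().

```python
import shutil
from pathlib import Path


def install_mldata_file(downloaded_file, data_home):
    """Copy a downloaded dataset file into <data_home>/mldata.

    Hypothetical helper for illustration; returns the path of the copy.
    """
    mldata_dir = Path(data_home) / "mldata"
    # Create the mldata directory if it does not exist yet.
    mldata_dir.mkdir(parents=True, exist_ok=True)
    target = mldata_dir / Path(downloaded_file).name
    # copy2 keeps the file's timestamps along with its contents.
    shutil.copy2(downloaded_file, target)
    return target
```

A call might look like install_mldata_file('mnist-original.mat', get_data_home()). The file name mnist-original.mat is an assumption about what the download contains; use whatever name the downloaded file actually has.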
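With this setup, the following MNIST code runs successfully (note that fetch_mldata exists only in older scikit-learn releases; it was removed in version 0.22):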
from sklearn.datasets import fetch_mldata, get_data_home
print(get_data_home())
# Reads the dataset file from the mldata directory under the data home
mnist = fetch_mldata('MNIST original')
mnist
The complete code for recognizing a digit in an image is shown below:
(each digit is 28*28 pixels)
from sklearn.datasets import fetch_mldata
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
import numpy as np
from sklearn.datasets import get_data_home
print(get_data_home())
mnist = fetch_mldata('MNIST original')
mnist
print('\n\n\n')
print('Code output:')
print('Number of samples: {}, number of features: {}'.format(mnist.data.shape[0], mnist.data.shape[1]))
X=mnist.data/255
y=mnist.target
X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=5000, random_state=62)
mlp_hw = MLPClassifier(solver= 'lbfgs',hidden_layer_sizes=[100, 100], activation='relu', alpha= 1e-5, random_state = 62)
mlp_hw.fit(X_train, y_train)
print("Code output:")
print("Model score")
print('Test set score: {:.2f}%'.format(mlp_hw.score(X_test, y_test) * 100))
from PIL import Image
image = Image.open('num.jpg').convert('F')
image = image.resize((28, 28))
arr = []
for i in range(28):
    for j in range(28):
        # Invert the grayscale value and scale it to [0, 1]
        pixel = 1.0 - float(image.getpixel((j, i)))/255.
        arr.append(pixel)
arr1 = np.array(arr).reshape(1, -1)
print('The number in the picture is:{:.0f}' .format(mlp_hw.predict(arr1)[0]))
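The pixel loop above inverts each grayscale value, scales it to [0, 1], and collects the 784 values row by row. The same preprocessing can be written as a single NumPy expression. This is a sketch only: the helper name to_feature_vector is hypothetical, and it assumes the input is a 28x28 grid of values in the range 0 to 255.

```python
import numpy as np


def to_feature_vector(gray_pixels):
    """Turn a 28x28 grid of grayscale values (0-255) into a (1, 784) model input.

    Hypothetical helper for illustration.
    """
    pixels = np.asarray(gray_pixels, dtype=np.float64)
    if pixels.shape != (28, 28):
        raise ValueError("expected a 28x28 image, got shape {}".format(pixels.shape))
    # Invert so that dark ink on a light background becomes high values,
    # then scale into [0, 1], exactly as the loop does per pixel.
    inverted = 1.0 - pixels / 255.0
    # Row-major flattening visits pixels in the same order as the nested i/j loops.
    return inverted.reshape(1, -1)
```

With the resized PIL image from the code above, arr1 = to_feature_vector(np.asarray(image)) should give the same array as the loop, since np.asarray returns the image as rows of pixel values.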
Recognition result for the digit in the image: