Classify handwritten digits by their pixels, using the MNIST dataset that ships with TensorFlow and softmax for multi-class classification.
import numpy as np
import tensorflow as tf
from tensorflow.examples.tutorials import mnist
from IPython.display import display, HTML
import matplotlib.pyplot as plt
Read the dataset
mnist_data = mnist.input_data.read_data_sets('/data/mnist', one_hot=True)  # one_hot=True means the labels y are one-hot encoded
Successfully downloaded train-images-idx3-ubyte.gz 9912422 bytes.
Extracting /data/mnist/train-images-idx3-ubyte.gz
Successfully downloaded train-labels-idx1-ubyte.gz 28881 bytes.
Extracting /data/mnist/train-labels-idx1-ubyte.gz
Successfully downloaded t10k-images-idx3-ubyte.gz 1648877 bytes.
Extracting /data/mnist/t10k-images-idx3-ubyte.gz
Successfully downloaded t10k-labels-idx1-ubyte.gz 4542 bytes.
Extracting /data/mnist/t10k-labels-idx1-ubyte.gz
# Check the data shapes
display('train image shape:')
display(mnist_data.train.images.shape)
display('label y shape')
display(mnist_data.train.labels.shape)
'train image shape:'
(55000, 784)
'label y shape'
(55000, 10)
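The 784 in the image shape is just a flattened 28×28 pixel grid; a quick NumPy check (illustrative only, not part of the original notebook):

```python
import numpy as np

# each row of the training images is a flattened 28x28 grayscale image
flat = np.arange(784, dtype=np.float32)
img = flat.reshape(28, 28)  # recover the 2-D pixel grid
print(img.shape, 28 * 28)  # (28, 28) 784
```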
# As shown above, each image is a flat 1*784 vector and each label is one of 10 classes
# Let's see what an image actually looks like
def plot_mnist(image_array):
    """
    Plot a handwritten-digit image from its pixel array
    :param image_array: m*n array of pixel intensities
    :return:
    """
    fig = plt.figure()
    plt.imshow(image_array, cmap='gray')
    plt.show()
image_index = 1  # take the image at index 1 (the second image) as an example
image = mnist_data.train.images[image_index]
image = image.reshape(28, 28)
plot_mnist(image)
# Check its label
display(mnist_data.train.labels[image_index])
array([ 0., 0., 0., 1., 0., 0., 0., 0., 0., 0.])
From the output above, the image at index 1 is a 3, so its one-hot label has a 1 in the fourth position (index 3).
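To recover the digit from a one-hot label, take the index of the 1 with `np.argmax` (a minimal sketch):

```python
import numpy as np

label = np.array([0., 0., 0., 1., 0., 0., 0., 0., 0., 0.])
digit = int(np.argmax(label))  # the position of the 1 is the class
print(digit)  # 3
```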
Build the graph
# 1. Create placeholders
batch_size = 128  # feed 128 images per training step
image_shape = 784
label_shape = 10
X = tf.placeholder(tf.float32, [batch_size, image_shape], name='X')
Y = tf.placeholder(tf.float32, [batch_size, label_shape], name='Y')
# 2. Create the weight and bias variables
w = tf.Variable(tf.random_normal(shape=(image_shape, label_shape), stddev=0.01), name='weight')
b = tf.Variable(tf.zeros([1, 10]), name='bias')
# X*w + b gives the logits, so each x has shape 1*784 and w has shape 784*10
# 3. Forward pass
logit = tf.matmul(X, w) + b
# 4. Loss function
entropy = tf.nn.softmax_cross_entropy_with_logits(logits=logit, labels=Y, name='loss')
loss = tf.reduce_mean(entropy)
# 5. Set up the optimizer
learning_rate = 0.01
optimizer = tf.train.AdamOptimizer(learning_rate).minimize(loss)
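The graph above computes logits = X·w + b, turns them into class probabilities with softmax, and averages the per-example cross-entropy. A NumPy sketch of the same math (variable names and the tiny random batch are mine, for illustration only):

```python
import numpy as np

def softmax(z):
    # subtract the row max for numerical stability
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def softmax_cross_entropy(logits, labels):
    # per-example loss: -sum_k y_k * log(p_k), then mean over the batch
    p = softmax(logits)
    return -np.mean(np.sum(labels * np.log(p), axis=1))

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 784)).astype(np.float32)   # a tiny fake batch
w = rng.normal(scale=0.01, size=(784, 10))          # like tf.random_normal(stddev=0.01)
b = np.zeros((1, 10))
Y = np.eye(10)[[3, 1, 4, 1]]                        # one-hot labels

logits = X @ w + b          # same as tf.matmul(X, w) + b
loss = softmax_cross_entropy(logits, Y)
print(loss > 0)  # True; near log(10) ≈ 2.30 for untrained weights
```

With near-zero weights the probabilities are close to uniform over 10 classes, which is why the very first training losses hover near log(10) before dropping.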
Train the network
epoch_num = 50  # train for 50 epochs
with tf.Session() as session:
    writer = tf.summary.FileWriter('./graph/softmax', session.graph)
    session.run(tf.global_variables_initializer())
    n_batches = int(mnist_data.train.num_examples / batch_size)
    for i in range(epoch_num):
        total_loss = 0
        for _ in range(n_batches):
            X_batch, Y_batch = mnist_data.train.next_batch(batch_size)
            _, l = session.run([optimizer, loss],
                               feed_dict={X: X_batch, Y: Y_batch})
            total_loss += l
        print('epoch[%d]: Average loss: %f' % (i, total_loss / n_batches))
    # Build the prediction graph
    pred = tf.nn.softmax(logit)
    correct_preds = tf.equal(tf.argmax(pred, 1), tf.argmax(Y, 1))
    accuracy = tf.reduce_sum(tf.cast(correct_preds, tf.float32))
    test_n_batches = int(mnist_data.test.num_examples / batch_size)
    total_pred_correct = 0
    for i in range(test_n_batches):
        X_batch_test, Y_batch_test = mnist_data.test.next_batch(batch_size)
        accuracy_batch = session.run([accuracy],
                                     feed_dict={X: X_batch_test,
                                                Y: Y_batch_test})
        total_pred_correct += accuracy_batch[0]
    print('Accuracy: %f' % (total_pred_correct / mnist_data.test.num_examples))
    writer.close()
epoch[0]: Average loss: 0.370386
epoch[1]: Average loss: 0.292554
epoch[2]: Average loss: 0.282772
epoch[3]: Average loss: 0.278612
epoch[4]: Average loss: 0.272398
epoch[5]: Average loss: 0.272257
epoch[6]: Average loss: 0.272987
epoch[7]: Average loss: 0.269087
epoch[8]: Average loss: 0.263485
epoch[9]: Average loss: 0.268402
epoch[10]: Average loss: 0.260866
epoch[11]: Average loss: 0.264210
epoch[12]: Average loss: 0.261356
epoch[13]: Average loss: 0.259465
epoch[14]: Average loss: 0.256944
epoch[15]: Average loss: 0.264056
epoch[16]: Average loss: 0.253583
epoch[17]: Average loss: 0.260119
epoch[18]: Average loss: 0.254224
epoch[19]: Average loss: 0.258548
epoch[20]: Average loss: 0.253865
epoch[21]: Average loss: 0.259262
epoch[22]: Average loss: 0.258289
epoch[23]: Average loss: 0.254773
epoch[24]: Average loss: 0.253627
epoch[25]: Average loss: 0.252201
epoch[26]: Average loss: 0.250092
epoch[27]: Average loss: 0.253773
epoch[28]: Average loss: 0.255537
epoch[29]: Average loss: 0.252305
epoch[30]: Average loss: 0.252843
epoch[31]: Average loss: 0.250082
epoch[32]: Average loss: 0.256573
epoch[33]: Average loss: 0.245833
epoch[34]: Average loss: 0.256379
epoch[35]: Average loss: 0.247376
epoch[36]: Average loss: 0.251196
epoch[37]: Average loss: 0.249400
epoch[38]: Average loss: 0.248824
epoch[39]: Average loss: 0.252425
epoch[40]: Average loss: 0.247105
epoch[41]: Average loss: 0.250089
epoch[42]: Average loss: 0.252110
epoch[43]: Average loss: 0.247810
epoch[44]: Average loss: 0.246959
epoch[45]: Average loss: 0.249442
epoch[46]: Average loss: 0.249063
epoch[47]: Average loss: 0.255748
epoch[48]: Average loss: 0.247588
epoch[49]: Average loss: 0.244467
Accuracy: 0.919400
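The test accuracy above is computed by comparing the argmax of the softmax output against the argmax of the one-hot label, summing the matches per batch, and dividing by the number of test examples. The same logic in plain NumPy (a sketch with made-up 3-class data):

```python
import numpy as np

preds = np.array([[0.1, 0.7, 0.2],
                  [0.6, 0.3, 0.1],
                  [0.2, 0.2, 0.6]])   # softmax outputs
labels = np.array([[0., 1., 0.],
                   [0., 0., 1.],
                   [0., 0., 1.]])     # one-hot ground truth
# a prediction is correct when its argmax matches the label's argmax
correct = np.argmax(preds, axis=1) == np.argmax(labels, axis=1)
accuracy = correct.sum() / len(labels)
print(accuracy)  # 2 of 3 correct, about 0.667
```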