The goal of this section is to show, by example, how to use TensorFlow to recognize CAPTCHAs (MNIST images).
1. Dataset
import gzip
import pickle

def load_data():
    # mnist.pkl.gz unpickles to three datasets: training, validation, test
    with gzip.open('../data/MNIST/mnist.pkl.gz') as fp:
        training_data, valid_data, test_data = pickle.load(fp)
    return training_data, valid_data, test_data

training_data, valid_data, test_dat = load_data()
x_training_data, y_training_data = training_data
x1, y1 = test_dat
However, running this code raises an error:
Traceback (most recent call last):
  File "C:/Users/liujiannan/PycharmProjects/pythonProject/Web安全之机器学习入门/code/15-1.py", line 21, in <module>
    training_data, valid_data, test_dat=load_data()
  File "C:/Users/liujiannan/PycharmProjects/pythonProject/Web安全之机器学习入门/code/15-1.py", line 18, in load_data
    training_data, valid_data, test_data = pickle.load(fp)
UnicodeDecodeError: 'ascii' codec can't decode byte 0x90 in position 614: ordinal not in range(128)
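By default, Python 3's pickle.load decodes 8-bit strings from a Python 2 pickle as ASCII, which fails on non-ASCII bytes. Passing an explicit encoding fixes the problem: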
def load_data():
    with gzip.open('../data/MNIST/mnist.pkl.gz') as fp:
        training_data, valid_data, test_data = pickle.load(fp, encoding="bytes")
    return training_data, valid_data, test_data
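To see why the encoding argument matters, the following minimal sketch unpickles a tiny hand-written Python 2 style pickle instead of the real MNIST file. The byte string PY2_PICKLE and the helper try_load are illustrative assumptions, not part of the original script.

```python
import pickle

# Hand-written protocol-0 pickle of a Python 2 `str` holding the single byte 0x90.
# (Illustrative stand-in for the strings stored inside a Python 2 pickle file.)
PY2_PICKLE = b"S'\\x90'\n."

def try_load(**kwargs):
    """Return the unpickled value, or the name of the exception that was raised."""
    try:
        return pickle.loads(PY2_PICKLE, **kwargs)
    except UnicodeDecodeError as exc:
        return type(exc).__name__

default_result = try_load()                   # default encoding is ASCII, so 0x90 fails
bytes_result = try_load(encoding="bytes")     # keep the Python 2 str as a bytes object
latin1_result = try_load(encoding="latin1")   # map each byte to one code point

print(default_result)
print(bytes_result)
print(latin1_result)
```

With encoding="bytes" the 8-bit strings come back as bytes objects, while encoding="latin1" turns them into str; which one is preferable depends on how the loaded data is used afterwards.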
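2. Encoding the labels
The labels are one-hot encoded, using the get_one_hot function defined in the complete code in section 5:
y_training_data = get_one_hot(y_training_data)
y1 = get_one_hot(y1)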
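As an illustration of what one-hot encoding produces, here is a small NumPy sketch. The function name one_hot and the sample labels are assumptions made for this example; it places the 1 at the column whose index equals the label.

```python
import numpy as np

def one_hot(labels, num_classes=10):
    """Row i is all zeros except for a 1 at column labels[i]."""
    labels = np.asarray(labels, dtype=np.int64)
    # Indexing the identity matrix by label picks out the matching unit row.
    return np.eye(num_classes, dtype=np.float32)[labels]

encoded = one_hot([0, 3, 9])
print(encoded.shape)
print(encoded.argmax(axis=1))
```

Taking argmax along each row recovers the original label, which is how predictions are compared with labels later on.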
3. Source code changes
(1) Warning message
Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
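If you installed the CPU build (pip install tensorflow), add the following code to the source to suppress the warning. It is safest to set the variable before TensorFlow is imported.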
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
(2) Warning message
Use `tf.global_variables_initializer` instead.
Replace the following call in the source
tf.initialize_all_variables
with
tf.global_variables_initializer
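(3) Warning message
The name tf.Session is deprecated. Please use tf.compat.v1.Session instead.
Fix this by importing the compat.v1 API under the name tf, as shown on the next line. On TensorFlow 2.x, graph-mode code such as tf.placeholder additionally requires a call to tf.disable_v2_behavior() right after this import.
import tensorflow.compat.v1 as tf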
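4. Training process
The TensorFlow computation proceeds as follows: define the model and the loss as a graph (placeholders x and y_, variables W and b, softmax output y, cross-entropy loss), initialize the variables, run the gradient-descent step in a session over mini-batches of the training data, and finally evaluate accuracy on the test set.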
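To make the arithmetic behind this computation concrete, the sketch below reimplements the same softmax-regression update in plain NumPy on toy data. It is not the TensorFlow code itself: the function names, the toy shapes (6 samples, 4 features, 3 classes) and the random seed are all assumptions chosen for illustration.

```python
import numpy as np

def softmax(logits):
    # Subtract the row maximum for numerical stability before exponentiating.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def cross_entropy(probs, targets):
    # Summed (not averaged) cross-entropy; the small epsilon avoids log(0).
    return -np.sum(targets * np.log(probs + 1e-12))

def sgd_step(W, b, xs, ys, lr=0.01):
    """One gradient-descent update for the model softmax(xs @ W + b)."""
    probs = softmax(xs @ W + b)
    grad_logits = probs - ys  # gradient of the summed cross-entropy w.r.t. the logits
    W_new = W - lr * xs.T @ grad_logits
    b_new = b - lr * grad_logits.sum(axis=0)
    return W_new, b_new

# Toy data (assumption): 6 samples, 4 features, 3 classes.
rng = np.random.default_rng(0)
xs = rng.normal(size=(6, 4))
ys = np.eye(3)[np.array([0, 1, 2, 0, 1, 2])]

# Weights start at zero, as in the TensorFlow model.
W, b = np.zeros((4, 3)), np.zeros(3)
loss_before = cross_entropy(softmax(xs @ W + b), ys)
for _ in range(50):
    W, b = sgd_step(W, b, xs, ys)
loss_after = cross_entropy(softmax(xs @ W + b), ys)
print(loss_before, loss_after)
```

With zero-initialized weights every class starts with equal probability, so the initial loss is the number of samples times ln(number of classes); the updates then push it down.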
5. Complete code
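import gzip
import os
import pickle

# Set the log level before importing TensorFlow so the warning is suppressed.
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'

import tensorflow.compat.v1 as tf
# On TensorFlow 2.x, also call tf.disable_v2_behavior() here so that
# placeholders and sessions behave as in 1.x.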
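def get_one_hot(x, size=10):
    # Turn each digit label into a one-hot list of length `size`.
    v = []
    for x1 in x:
        x2 = [0] * size
        x2[x1] = 1  # set the slot whose index equals the digit (0-9)
        v.append(x2)
    return v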
def load_data():
    with gzip.open('../data/MNIST/mnist.pkl.gz') as fp:
        training_data, valid_data, test_data = pickle.load(fp, encoding="bytes")
    return training_data, valid_data, test_data

training_data, valid_data, test_dat = load_data()
x_training_data, y_training_data = training_data
x1, y1 = test_dat
y_training_data = get_one_hot(y_training_data)
y1 = get_one_hot(y1)
batch_size = 100

# Softmax regression: y = softmax(x W + b) with 784-pixel inputs and 10 classes.
x = tf.placeholder("float", [None, 784])
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
y = tf.nn.softmax(tf.matmul(x, W) + b)
y_ = tf.placeholder("float", [None, 10])

# Cross-entropy loss, minimized by gradient descent with learning rate 0.01.
cross_entropy = -tf.reduce_sum(y_ * tf.log(y))
train_step = tf.train.GradientDescentOptimizer(0.01).minimize(cross_entropy)

init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)

# One pass over the training set in mini-batches.
for i in range(len(x_training_data) // batch_size):
    batch_xs = x_training_data[i * batch_size:(i + 1) * batch_size]
    batch_ys = y_training_data[i * batch_size:(i + 1) * batch_size]
    sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})

# Test accuracy: fraction of samples whose predicted digit matches the label.
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print(sess.run(accuracy, feed_dict={x: x1, y_: y1}))
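The accuracy computation at the end of the script can be checked by hand with a small NumPy equivalent. The sketch below is only an illustration of the argmax-and-compare idea; the function name accuracy and the sample arrays are assumptions, not values from the MNIST run.

```python
import numpy as np

def accuracy(probs, one_hot_labels):
    """Fraction of rows whose most probable class equals the labeled class."""
    predicted = np.argmax(probs, axis=1)
    actual = np.argmax(one_hot_labels, axis=1)
    return float(np.mean(predicted == actual))

# Four made-up predictions over three classes.
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1],
                  [0.3, 0.4, 0.3],
                  [0.2, 0.2, 0.6]])
labels = np.eye(3)[[0, 1, 2, 2]]  # true classes: 0, 1, 2, 2

print(accuracy(probs, labels))  # 3 of the 4 rows match
```

The third row is the only mismatch (predicted class 1, labeled class 2), so the function returns 0.75.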
6. Result
The script prints the accuracy on the test set:
0.9097