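I've recently been learning convolutional neural networks (CNNs) and TensorFlow, so I took the handwritten digit recognition project that I had previously done on Kaggle with other methods and used it for practice.
Here is the link to the data source: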
https://www.kaggle.com/c/digit-recognizer/data
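First, import the libraries used throughout this post and read in the data:
import datetime
import numpy as np
import pandas as pd
import tensorflow as tf  # TensorFlow 1.x API
train = pd.read_csv("/Users/zhouhonghao/Desktop/train.csv", header=0)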
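The first column is the label, i.e. the digit shown in the image. The remaining columns hold the 28*28 handwritten image flattened into a 1*(28*28) = 1*784 vector. There are 42000 rows in total, one per handwritten image.
First, extract the label column: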
label = train[['label']]
label = label.values  # underlying NumPy array, shape (42000, 1)
To make the output easier to work with, each label is converted to a 1*10 vector; for example, label=9 becomes (0,0,0,0,0,0,0,0,0,1).
def dense_to_one_hot(label):
    # label is a length-1 row such as [9]; return a list of ten 0/1 entries
    index = label[0]
    return_label = []
    for i in range(10):
        if i == index:
            return_label.append(1)
        else:
            return_label.append(0)
    return return_label

new_label = []
for i in label:
    mid_index = dense_to_one_hot(i)
    new_label.append(mid_index)
label = np.array(new_label)
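The same one-hot matrix can also be built in a more compact way by indexing an identity matrix with NumPy. This is only a minimal sketch, not the code used in this post; the function name `one_hot` and the sample labels are made up for illustration.

```python
import numpy as np

def one_hot(labels, num_classes=10):
    """Return an (n, num_classes) one-hot matrix for integer labels."""
    labels = np.asarray(labels).ravel()   # accept shape (n,) or (n, 1)
    # Row k of the identity matrix is exactly the one-hot vector for class k.
    return np.eye(num_classes, dtype=np.uint8)[labels]

sample = one_hot([[9], [0], [3]])         # hypothetical labels
print(sample[0])  # [0 0 0 0 0 0 0 0 0 1]
```

Each row contains exactly one 1, so summing along axis 1 is a quick sanity check.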
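Next, preprocess the training data: scale the grayscale values into the range [0, 1] and work out the height and width of the digit images: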
image = train.iloc[:, 1:].values
image = image.astype(np.float32)  # np.float is no longer available in recent NumPy
image = np.multiply(image, 1.0 / 255)
image_size = image.shape[1]
image_width = image_height = np.ceil(np.sqrt(image_size)).astype(np.uint8)
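To check the preprocessing in isolation, here is a small sketch on fake data. The array `fake_pixels` is a hypothetical stand-in for `train.iloc[:, 1:].values`, so nothing here depends on the Kaggle file.

```python
import numpy as np

# Hypothetical stand-in for the real pixel matrix: 3 fake images of 784 pixels.
fake_pixels = np.random.randint(0, 256, size=(3, 784))

scaled = fake_pixels.astype(np.float32) / 255.0        # values now lie in [0, 1]
side = int(np.ceil(np.sqrt(scaled.shape[1])))           # 784 pixels -> 28 per side
print(scaled.min() >= 0.0, scaled.max() <= 1.0, side)   # True True 28
```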
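Then split the dataset into a training set and a validation set. The first VALIDATION_SIZE rows (a constant defined beforehand) become the validation set and the rest are used for training: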
validation_image=image[:VALIDATION_SIZE]
validation_label=label[:VALIDATION_SIZE]
train_image=image[VALIDATION_SIZE:]
train_label=label[VALIDATION_SIZE:]
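Now we can start building the neural network. This one consists of 2 convolutional layers, 2 pooling layers and 2 fully connected layers. First, the helper functions that create the weights and biases: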
def weight_variable(shape):
    # Small random weights drawn from a truncated normal distribution
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)

def bias_variable(shape):
    # Slightly positive constant bias
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial)
Next come the convolution and pooling helper functions:
def conv2d(x, w):
    return tf.nn.conv2d(x, w, strides=[1, 1, 1, 1], padding='SAME')

def max_pooling(x):
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
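In tf.nn.conv2d(x,w,strides=[1,1,1,1],padding='SAME'), x is the batch of input images and w is the convolution kernel. strides=[1,1,1,1] gives the step size along the batch, height, width and channel dimensions, all 1 here. padding='SAME' zero-pads the input so that, with a stride of 1, the output has the same height and width as the input. In tf.nn.max_pool(x,ksize=[1,2,2,1],strides=[1,2,2,1],padding='SAME'), ksize is the size of the pooling window (2*2 here) and the other arguments mean the same as for the convolution. Because the stride is 2, each pooling layer halves the height and width.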
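With padding='SAME', the spatial output size is the input size divided by the stride, rounded up. The sketch below (the helper name `same_padding_output` is my own) traces a 28*28 image through the two convolution and two pooling layers of this network, which shows where a flattened size of 7*7*64 comes from when the second convolution has 64 feature maps.

```python
import math

def same_padding_output(size, stride):
    """Spatial output size of a conv/pool layer with padding='SAME'."""
    return math.ceil(size / stride)

size = 28
size = same_padding_output(size, 1)  # conv1, stride 1 -> 28
size = same_padding_output(size, 2)  # pool1, stride 2 -> 14
size = same_padding_output(size, 1)  # conv2, stride 1 -> 14
size = same_padding_output(size, 2)  # pool2, stride 2 -> 7
print(size, size * size * 64)        # 7 3136
```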
Now we can build the layers. The code is as follows:
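# Assumed input placeholders (not shown in this post): flattened images and one-hot labels
x = tf.placeholder('float', shape=[None, image_size])
y_ = tf.placeholder('float', shape=[None, 10])

# First convolutional layer: 5x5 kernels, 1 input channel, 32 feature maps
w1 = weight_variable([5, 5, 1, 32])
b1 = bias_variable([32])
image = tf.reshape(x, [-1, image_width, image_height, 1], name='reshape')
h1 = tf.nn.relu(conv2d(image, w1) + b1)
h1_pool = max_pooling(h1)

# Second convolutional layer: 5x5 kernels, 32 input channels, 64 feature maps
w2 = weight_variable([5, 5, 32, 64])
b2 = bias_variable([64])
h2 = tf.nn.relu(conv2d(h1_pool, w2) + b2)
h2_pool = max_pooling(h2)

# First fully connected layer with dropout
w_fc1 = weight_variable([7 * 7 * 64, 1024])
b_fc1 = bias_variable([1024])
h2_pool = tf.reshape(h2_pool, [-1, 7 * 7 * 64])
keep_prob = tf.placeholder('float')
h_fc1 = tf.nn.relu(tf.matmul(h2_pool, w_fc1) + b_fc1)
h_dropout1 = tf.nn.dropout(h_fc1, keep_prob)

# Second fully connected layer: softmax over the 10 digits
w_fc2 = weight_variable([1024, 10])
b_fc2 = bias_variable([10])
y = tf.nn.softmax(tf.matmul(h_dropout1, w_fc2) + b_fc2, name='prob')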
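Here tf.nn.relu() is the activation function and y is the network's output, i.e. the predicted class probabilities. With y_ denoting the true labels, the corresponding loss function cross_entropy is: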
cross_entropy = -tf.reduce_sum(y_ * tf.log(y))
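To get a feel for this loss, here is a tiny pure-Python sketch of the same formula for a single sample. The 3-class vectors are made-up numbers, and the `eps` clamp is my own addition to avoid log(0); it is not part of the TensorFlow expression above.

```python
import math

def cross_entropy(y_true, y_pred, eps=1e-10):
    """-sum(y_true * log(y_pred)) for one sample; eps guards against log(0)."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))

y_true = [0, 0, 1]            # hypothetical 3-class one-hot label
good = [0.1, 0.1, 0.8]        # confident and correct
bad = [0.8, 0.1, 0.1]         # confident and wrong
print(cross_entropy(y_true, good) < cross_entropy(y_true, bad))  # True
```

Because the label is one-hot, only the probability assigned to the true class contributes, so the loss shrinks as that probability approaches 1.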
Our goal is to minimize this loss function, which can be done with a gradient-based optimizer (Adagrad here):
train_step = tf.train.AdagradOptimizer(LEARN_RATE).minimize(cross_entropy)
train_step2 = tf.train.AdagradOptimizer(LEARN_RATE2).minimize(cross_entropy)
train_step3 = tf.train.AdagradOptimizer(LEARN_RATE3).minimize(cross_entropy)
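Here LEARN_RATE, LEARN_RATE2 and LEARN_RATE3 are the learning rates, set to 0.001, 0.0001 and 0.00001 respectively in this experiment.
With the network built, training can begin. To reduce the amount of computation, each step takes 50 samples from the training set, and training runs for 4000 iterations: the first 1500 use LEARN_RATE, iterations 1500 to 3000 use LEARN_RATE2, and the last 1000 use LEARN_RATE3. Every 100 iterations, the accuracy on the current training batch and on the validation set is computed.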
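The three-phase schedule described above can be written down as a small function. This is just an illustrative sketch (the function name is hypothetical); the post itself switches between three separate optimizers rather than calling a function like this.

```python
def learning_rate_for_step(i):
    """Piecewise-constant schedule described above (iterations are 0-based)."""
    if i < 1500:
        return 1e-3   # LEARN_RATE
    elif i < 3000:
        return 1e-4   # LEARN_RATE2
    return 1e-5       # LEARN_RATE3

# Check the boundaries of each phase.
print([learning_rate_for_step(i) for i in (0, 1499, 1500, 2999, 3000, 3999)])
```

Writing the boundaries as a chain of `<` comparisons ensures every iteration, including 1500 and 3000 themselves, falls into exactly one phase.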
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, 'float'))
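The accuracy computation compares the index of the largest predicted probability with the index of the 1 in the one-hot label, then averages the matches. The same idea in plain NumPy, on made-up numbers, looks roughly like this:

```python
import numpy as np

# Hypothetical predicted probabilities and one-hot labels: 4 samples, 3 classes.
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1],
                  [0.3, 0.3, 0.4],
                  [0.6, 0.3, 0.1]])
labels = np.array([[1, 0, 0],
                   [0, 1, 0],
                   [0, 0, 1],
                   [0, 1, 0]])

# Element-wise comparison of predicted and true class indices (like tf.equal).
correct = np.argmax(probs, axis=1) == np.argmax(labels, axis=1)
print(correct.astype(np.float32).mean())  # 0.75
```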
epochs_completed = 0
index_in_epoch = 0
num_examples = train_image.shape[0]

def next_batch(batch_size):
    # Return the next batch_size training samples, reshuffling after each epoch
    global train_label
    global train_image
    global index_in_epoch
    global epochs_completed
    start = index_in_epoch
    index_in_epoch += batch_size
    if index_in_epoch > num_examples:
        # The current epoch is used up: shuffle and start again from the top
        epochs_completed += 1
        perm = np.arange(num_examples)
        np.random.shuffle(perm)
        train_image = train_image[perm]
        train_label = train_label[perm]
        start = 0
        index_in_epoch = batch_size
        assert batch_size <= num_examples
    end = index_in_epoch
    return train_image[start:end], train_label[start:end]
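If you would rather avoid the global variables, the same batching logic can be wrapped in a small class. This is a sketch under my own naming (`BatchIterator` is hypothetical), tried out on toy data instead of the real images.

```python
import numpy as np

class BatchIterator:
    """Serve consecutive mini-batches; reshuffle once the data runs out."""
    def __init__(self, images, labels):
        self.images, self.labels = images, labels
        self.index = 0
        self.epochs_completed = 0

    def next_batch(self, batch_size):
        n = self.images.shape[0]
        assert batch_size <= n
        start = self.index
        self.index += batch_size
        if self.index > n:                      # not enough samples left
            self.epochs_completed += 1
            perm = np.random.permutation(n)     # same shuffle for both arrays
            self.images, self.labels = self.images[perm], self.labels[perm]
            start, self.index = 0, batch_size
        return self.images[start:self.index], self.labels[start:self.index]

# Toy data: 10 "images" whose label equals their single pixel value.
it = BatchIterator(np.arange(10).reshape(10, 1), np.arange(10))
for _ in range(4):
    xs, ys = it.next_batch(4)
print(it.epochs_completed, xs.shape)  # 1 (4, 1)
```

Images and labels are indexed with the same permutation, so each image keeps its own label after shuffling.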
init = tf.global_variables_initializer()  # TensorFlow 1.x variable initializer
saver = tf.train.Saver()
print('begin to create model...')
with tf.Session() as sess:
    start = datetime.datetime.now()
    sess.run(init)
    train_accuracies = []
    validation_accuracies = []
    x_range = []
    display_step = 100
    for i in range(TRAINING_ITERATIONS):
        batch_xs, batch_ys = next_batch(BATCH_SIZE)
        endtime = datetime.datetime.now()
        # Report accuracy every display_step iterations and on the last one
        if i % display_step == 0 or (i + 1) == TRAINING_ITERATIONS:
            train_accuracy = accuracy.eval(feed_dict={x: batch_xs, y_: batch_ys, keep_prob: 1.0})
            train_accuracies.append(train_accuracy)
            if VALIDATION_SIZE:
                validation_accuracy = accuracy.eval(
                    feed_dict={x: validation_image, y_: validation_label, keep_prob: 1.0})
                print('training_accuracy / validation_accuracy => %.2f / %.2f for step %d,time=%d' % (
                    train_accuracy, validation_accuracy, i, (endtime - start).seconds))
        # Choose the optimizer that matches the current learning-rate phase
        if i < 1500:
            sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys, keep_prob: DROPOUT})
        elif i < 3000:
            sess.run(train_step2, feed_dict={x: batch_xs, y_: batch_ys, keep_prob: DROPOUT})
        else:
            sess.run(train_step3, feed_dict={x: batch_xs, y_: batch_ys, keep_prob: DROPOUT})
The final output is: