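Update log:
- 2017.5.10: Removed most of the redundant material and kept the important parts. Updated the links to the official documentation.
- 2019.4.17: Migrated to TensorFlow 2.x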
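This section covers the basic DNN, that is, the multilayer perceptron, and uses it to walk through the related APIs, including activation functions and fully connected layers. It then ties everything together with a simple fully connected network that classifies MNIST.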
1. Activation functions
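For a theoretical summary of the common activation functions, see my earlier blog post: 深度学习笔记六:常见激活函数总结 (Deep Learning Notes 6: A Summary of Common Activation Functions)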
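The activation ops provide different types of nonlinearities for use in neural networks. These include smooth nonlinearities (sigmoid, tanh, elu, softplus, and softsign), functions that are continuous but not differentiable everywhere (relu, relu6, crelu, and relu_x), and random regularization (dropout).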
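The activation ops are applied elementwise and produce a tensor with the same shape and data type as the input tensor. The one exception in the list above is crelu, which concatenates the positive and negative parts and therefore doubles the depth of its input.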
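To make "elementwise" concrete, here is a minimal plain-Python sketch of three of the smooth activations listed above. It is not how TensorFlow implements them, and the helper `apply_elementwise` is a made-up name used only for this illustration. It shows that each input element is transformed independently and that the nesting (the "shape") of the input is preserved.

```python
import math

def sigmoid(x):
    # 1 / (1 + e^-x): squashes any real number into (0, 1)
    return 1.0 / (1.0 + math.exp(-x))

def softplus(x):
    # log(1 + e^x): a smooth approximation of relu
    return math.log1p(math.exp(x))

def softsign(x):
    # x / (1 + |x|): squashes any real number into (-1, 1)
    return x / (1.0 + abs(x))

def apply_elementwise(fn, values):
    # Hypothetical helper: apply fn to every number in a nested list,
    # keeping the nesting of the input unchanged.
    if isinstance(values, list):
        return [apply_elementwise(fn, v) for v in values]
    return fn(values)

x = [[-2.0, 0.0], [1.0, 3.0]]        # a 2 x 2 "tensor"
y = apply_elementwise(sigmoid, x)    # still 2 x 2
print(y)
```

In TensorFlow the corresponding calls would be `tf.nn.sigmoid`, `tf.nn.softplus`, and `tf.nn.softsign`, each taking the tensor directly.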
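The activation functions that TensorFlow provides are only listed here. I won't go into the theory, since the link above covers it. Using them is very simple, so only relu is shown as an example; the others are used in much the same way. See the documentation for details.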
tf.nn.relu(features, name=None)
Purpose: computes the rectified linear unit (very commonly used). Mathematically this is $\max(\text{features}, 0)$, and it returns a tensor with the same shape as features.
Parameters:
- features: a Tensor; must be one of the following types: float32, float64, int32, int64, uint8, int16, int8, uint16, half.
- name: a name for the operation (optional)
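As a rough illustration of the formula, the plain-Python sketch below evaluates relu and its capped variant relu6 on a handful of numbers. It is not the TensorFlow kernel. With TensorFlow installed, `tf.nn.relu(tf.constant(features))` should give the same values as the first printed list. `relu_grad` is a hypothetical helper added only to show where relu stops being differentiable.

```python
def relu(x):
    # max(x, 0): negative inputs become 0, positive inputs pass through
    return max(x, 0.0)

def relu6(x):
    # relu capped at 6: min(max(x, 0), 6)
    return min(max(x, 0.0), 6.0)

def relu_grad(x):
    # Slope of relu: 1 for x > 0, 0 for x < 0.
    # At x == 0 the derivative is undefined; returning 0 there is a
    # convention chosen for this sketch.
    return 1.0 if x > 0 else 0.0

features = [-3.0, -0.5, 0.0, 2.0, 8.0]
print([relu(v) for v in features])       # [0.0, 0.0, 0.0, 2.0, 8.0]
print([relu6(v) for v in features])      # [0.0, 0.0, 0.0, 2.0, 6.0]
print([relu_grad(v) for v in features])  # [0.0, 0.0, 0.0, 1.0, 1.0]
```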
2. Multilayer perceptron example
The concept of a multilayer perceptron is simple, so I won't dwell on it. The code follows.
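For reference, the computation that the code performs on each mini-batch can be written out as follows. The symbols $x$, $h$, $z$, $B$, $\eta$, and $\theta$ are notation introduced here for convenience; the shapes, batch size, and learning rate are read off the code.

```latex
\begin{aligned}
h &= \mathrm{relu}(x W_1 + b_1), & W_1 &\in \mathbb{R}^{784 \times 200},\ b_1 \in \mathbb{R}^{200} \\
z &= h W_2 + b_2, & W_2 &\in \mathbb{R}^{200 \times 10},\ b_2 \in \mathbb{R}^{10} \\
L &= -\frac{1}{B} \sum_{i=1}^{B} \sum_{k=1}^{10} y_{ik} \log \mathrm{softmax}(z_i)_k, & B &= 100 \\
\theta &\leftarrow \theta - \eta \, \nabla_\theta L, & \eta &= 0.01,\ \theta \in \{W_1, b_1, W_2, b_2\}
\end{aligned}
```

Here $x$ is a batch of flattened 28 x 28 images, $y$ holds the one-hot labels, and the last line is the plain SGD step applied to all four variables.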
import pandas as pd
import numpy as np
import tensorflow as tf
EPOCH=20
BATCH_SIZE=100
TRAIN_EXAMPLES=42000
LEARNING_RATE=0.01
#------------------------Load Data--------------------------------#
# read the MNIST csv files (train.csv has a "label" column plus 784 pixel columns)
train_frame = pd.read_csv("../Mnist/train.csv")
test_frame = pd.read_csv("../Mnist/test.csv")
# pop the label column so that only the pixel columns remain
train_labels_frame = train_frame.pop("label")
# scale pixel values from [0, 255] to [0, 1]
X_train = train_frame.astype(np.float32).values/255
# one-hot encode the labels; cast to float32 to match the dtype of the logits
y_train=pd.get_dummies(data=train_labels_frame).values.astype(np.float32)
X_test = test_frame.astype(np.float32).values/255
# reshaping to (batch,time_steps,input_size) is not needed for a fully connected network
#X_train=np.reshape(X_train,newshape=(-1,28,28))
#X_test=np.reshape(X_test,newshape=(-1,28,28))
print(X_train.shape)
print(y_train.shape)
print(X_test.shape)
#------------------------------------------------------------------#
def train():
    # hidden layer: 784 inputs -> 200 units
    w_1=tf.Variable(
        initial_value=tf.random.normal(shape=(784,200)),
        name="w_1"
    )
    b_1=tf.Variable(
        initial_value=tf.zeros(shape=(200,)),
        name="b_1"
    )
    # output layer: 200 units -> 10 classes
    w_2 = tf.Variable(
        initial_value=tf.random.normal(shape=(200, 10)),
        name="w_2"
    )
    b_2 = tf.Variable(
        initial_value=tf.zeros(shape=(10, )),
        name="b_2"
    )
    variables=[w_1,b_1,w_2,b_2]

    optimizer=tf.keras.optimizers.SGD(LEARNING_RATE)

    for epoch in range(1,EPOCH+1):
        print("epoch:",epoch)
        train_losses = []
        accus = []
        for j in range(TRAIN_EXAMPLES//BATCH_SIZE):
            # slice out the j-th mini-batch
            x_batch=X_train[j*BATCH_SIZE:(j+1)*BATCH_SIZE]
            y_batch=y_train[j*BATCH_SIZE:(j+1)*BATCH_SIZE]
            # record the forward pass so that gradients can be computed
            with tf.GradientTape() as tape:
                logits_1=tf.matmul(x_batch,w_1)+b_1
                logits_1=tf.nn.relu(logits_1)
                logits_2=tf.matmul(logits_1,w_2)+b_2
                entropy=tf.nn.softmax_cross_entropy_with_logits(
                    labels=y_batch,
                    logits=logits_2
                )
                loss=tf.math.reduce_mean(entropy)
            # compute the gradients
            gradient=tape.gradient(target=loss,sources=variables)
            # apply the gradients
            optimizer.apply_gradients(zip(gradient,variables))

            # track the loss and accuracy of this batch
            train_losses.append(loss.numpy())
            correct_prediction = tf.equal(tf.argmax(logits_2, 1), tf.argmax(y_batch, 1))
            accuracy = tf.math.reduce_mean(tf.cast(correct_prediction, tf.float32)).numpy()
            accus.append(accuracy)

        print("average training loss:", sum(train_losses) / len(train_losses))
        print("accuracy:",sum(accus)/len(accus))

if __name__=="__main__":
    train()
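The loss and accuracy lines in the training loop are compact, so here is a plain-Python sketch of what they compute for a tiny made-up batch. It is an illustration under the usual definition of softmax cross-entropy, not TensorFlow's implementation; the function names and the sample numbers are invented for this example.

```python
import math

def softmax_cross_entropy(logits, labels):
    # Cross-entropy between the labels and softmax(logits), computed with
    # log-sum-exp; subtracting the max keeps exp() from overflowing.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    log_probs = [z - log_sum_exp for z in logits]
    return -sum(y * lp for y, lp in zip(labels, log_probs))

def argmax(values):
    # index of the largest entry (the first one wins on ties)
    return max(range(len(values)), key=lambda i: values[i])

# a made-up batch of 2 examples with 3 classes
batch_logits = [[2.0, 1.0, 0.1], [0.0, 0.0, 0.0]]
batch_labels = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]

# per-example loss, then the batch mean (the role of reduce_mean in the loop)
losses = [softmax_cross_entropy(z, y) for z, y in zip(batch_logits, batch_labels)]
mean_loss = sum(losses) / len(losses)

# fraction of examples whose predicted class matches the label
hits = [argmax(z) == argmax(y) for z, y in zip(batch_logits, batch_labels)]
accuracy = sum(hits) / len(hits)

print(mean_loss, accuracy)
```

The second example has all-zero logits, so its loss is log(3): the network is spreading its probability evenly over the three classes.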
Results:
epoch: 1
average training loss: 18.653506293750944
accuracy: 0.615499999924075
epoch: 2
average training loss: 7.001213308175405
accuracy: 0.7886190459841773
epoch: 3
average training loss: 5.23225352309999
accuracy: 0.8262380927801132
epoch: 4
average training loss: 4.342436097917103
accuracy: 0.8477142846300488
epoch: 5
average training loss: 3.7712030248982567
accuracy: 0.8600238083373932
epoch: 6
average training loss: 3.354956570480551
accuracy: 0.8689761902604785
epoch: 7
average training loss: 3.0314747077013764
accuracy: 0.8755714285941351
epoch: 8
average training loss: 2.772025206365756
accuracy: 0.8820000014134816
epoch: 9
average training loss: 2.5587227440749607
accuracy: 0.8875000014191582
epoch: 10
average training loss: 2.3799409723707607
accuracy: 0.891666668795404
epoch: 11
average training loss: 2.226951101311438
accuracy: 0.8956428587436676
epoch: 12
average training loss: 2.0936326054368344
accuracy: 0.8990000016632534
epoch: 13
average training loss: 1.9764864536135325
accuracy: 0.9019523836317517
epoch: 14
average training loss: 1.8722826677174973
accuracy: 0.9048333345424562
epoch: 15