Disclaimer
Disclaimer: This series of blog posts contains the notes I took while studying the course "Artificial Intelligence Practice: TensorFlow Notes" (Cao Jian, Peking University, School of Software and Microelectronics). The vast majority of the content therefore comes from that video series, and the code in these posts is the code distributed with the videos, with occasional modifications where needed; material will be removed upon request. Alongside quotations from the videos, the posts also include my own understanding, reflections, and summaries. The series serves two purposes: first, as my own study notes for regular review and consolidation; second, as a reference for anyone who wants to learn TensorFlow 2.
Main Content
1. The Three Schools of Artificial Intelligence
- Behaviorism: based on cybernetics; builds perception-action control systems.
- Symbolism: based on arithmetic logic expressions; to solve a problem, first describe it as an expression, then solve the expression.
- Connectionism: inspired by bionics; imitates the connection patterns of neurons.
2. Simulating Neural Network Connections with a Computer
Simulate the connection patterns of a neural network on a computer so that the computer acquires perceptual thinking. The workflow has four steps:
1. Prepare data: collect a large amount of "feature/label" data.
2. Build the network: construct the neural network structure.
3. Optimize parameters: train the network to obtain the best parameters (backpropagation).
4. Apply the network: save the network as a model; feed it new data and it outputs classification or prediction results (forward propagation).
Neural network workflow: collect a large number of feature/label pairs to form a dataset, feed the dataset into the constructed network structure, and let the network optimize its parameters to obtain a model; the model then reads in new input features and outputs recognition results. The sketch below walks through these four steps end to end.
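As a quick preview, here is a minimal sketch of the four steps using Keras' high-level API and the iris dataset bundled with scikit-learn. The dataset source, layer size, optimizer, and epoch count are illustrative assumptions, not the lecture's code; the notes walk through the same iris example at a lower level below.

```python
import tensorflow as tf
from sklearn import datasets

# 1. Prepare data: 150 iris samples, each with 4 features and a class label (0/1/2)
x, y = datasets.load_iris(return_X_y=True)

# 2. Build the network: a single fully connected layer with 3 outputs
model = tf.keras.Sequential([tf.keras.layers.Dense(3, activation="softmax")])

# 3. Optimize parameters: training runs backpropagation internally
model.compile(optimizer="sgd",
              loss=tf.keras.losses.SparseCategoricalCrossentropy(),
              metrics=["accuracy"])
model.fit(x, y, epochs=100, verbose=0)

# 4. Apply the network: feed new features, read out class probabilities
print(model.predict(x[:1]))
```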
3. The Neural Network Design Process, Illustrated with Iris Classification
Classifying irises with a neural network:
- The input data are sepal length, sepal width, petal length, and petal width.
- The output data are the likelihoods of each species.
- Each small ball in the network that performs computation is a neuron.
The Neuron's Computational Model
The neuron's computational model (the MP model):
To keep the math simple, temporarily remove the nonlinear function; the MP model then simplifies to a purely linear unit,
which can be written as: y = x * w + b
Here x is a 1x4 row of input features, w is the 4x3 matrix of weights on the connections, b holds the 3 bias terms, and the output y is 1x3. This y gives the likelihood of each of the three iris species.
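A minimal sketch of this computation; the concrete feature values and the truncated-normal initialization here are illustrative assumptions:

```python
import tensorflow as tf

x = tf.constant([[5.8, 4.0, 1.2, 0.2]])  # 1x4 input: sepal length, sepal width, petal length, petal width (made-up sample)
w = tf.Variable(tf.random.truncated_normal([4, 3], stddev=0.1, seed=1))  # 4x3 weights on the connections
b = tf.Variable(tf.random.truncated_normal([3], stddev=0.1, seed=1))     # 3 bias terms, one per species

y = tf.matmul(x, w) + b  # 1x3 output: one score per iris species
print(y.numpy())
```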
Forward Propagation
- When the network is built, all of its parameters are initialized randomly.
Feed in a set of input features together with their corresponding labels; substituting the data into the network and computing y is the process called forward propagation.
The output comes out as class 1 instead of the correct class 0 because w and b were initially set at random.
To get the best possible output, one that agrees with the label, we need to find the optimal parameters w and b.
So how can the optimal w and b be found?
The approach is this: we measure the gap between the predicted value and the ground-truth answer with a loss function; the w and b at which the loss function reaches its minimum are the optimal parameters we are looking for. A loss function can be defined in many ways; mean squared error is a commonly used one.
Mean Squared Error
Mean squared error is defined as:
$$MSE(y, y') = \frac{\sum_{k=0}^{n}(y - y')^2}{n}$$
Note: y is the predicted value; y' is the label.
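A quick numeric illustration of the formula, with made-up values:

```python
import tensorflow as tf

y_pred = tf.constant([1.2, 2.5, 3.3])  # predictions y
y_true = tf.constant([1.0, 2.0, 3.0])  # labels y'

mse = tf.reduce_mean(tf.square(y_pred - y_true))  # mean of the squared differences
print(mse.numpy())  # (0.2^2 + 0.5^2 + 0.3^2) / 3 ≈ 0.1267
```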
The gradient of the loss function is the vector of partial derivatives of the loss with respect to each parameter; descending along the gradient is the direction in which the loss decreases.
Gradient Descent
Gradient descent: a method that searches for the minimum of the loss function by moving in the direction of gradient descent, thereby obtaining the optimal parameters.
The parameter update formulas for gradient descent:
$$w_{t+1} = w_t - lr \cdot \frac{\partial loss}{\partial w_t}$$

$$b_{t+1} = b_t - lr \cdot \frac{\partial loss}{\partial b_t}$$

$$w_{t+1} \cdot x + b_{t+1} \rightarrow y$$
Note: lr is the learning rate, a hyperparameter that controls how large a step each gradient descent update takes.
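As a sanity check of the update formula, take the loss used in the code example below, $loss = (w+1)^2$, whose gradient is $\frac{\partial loss}{\partial w} = 2(w+1)$. Starting from $w = 5$ with $lr = 0.2$, one update gives $w \leftarrow 5 - 0.2 \times 2 \times (5+1) = 2.6$, which matches the first line of the program output below.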
Backpropagation
Backpropagation: starting from the output and working backwards, compute layer by layer the partial derivatives of the loss with respect to each layer's parameters, and iteratively update all parameters.
The following code gives a feel for the process of backpropagation and gradient descent shrinking the loss and updating the parameter:
```python
import tensorflow as tf

w = tf.Variable(tf.constant(5, dtype=tf.float32))
lr = 0.2
epochs = 40

for epoch in range(epochs):  # loop over the data epochs times; here the "dataset" is just the single parameter w, initialized to 5
    with tf.GradientTape() as tape:  # operations inside the with block are recorded for gradient computation
        loss = tf.square(w + 1)
    grads = tape.gradient(loss, w)  # gradient(loss, w): derivative of the loss with respect to w

    w.assign_sub(lr * grads)  # in-place subtraction: w -= lr * grads, i.e. w = w - lr * grads
    print("After %s epoch,w is %f,loss is %f" % (epoch, w.numpy(), loss))

# goal: minimize the loss, i.e. find the optimal parameter w = -1
```
Output:

```
After 0 epoch,w is 2.600000,loss is 36.000000
After 1 epoch,w is 1.160000,loss is 12.959999
After 2 epoch,w is 0.296000,loss is 4.665599
After 3 epoch,w is -0.222400,loss is 1.679616
After 4 epoch,w is -0.533440,loss is 0.604662
After 5 epoch,w is -0.720064,loss is 0.217678
After 6 epoch,w is -0.832038,loss is 0.078364
...
After 29 epoch,w is -0.999999,loss is 0.000000
After 30 epoch,w is -0.999999,loss is 0.000000
After 31 epoch,w is -1.000000,loss is 0.000000
After 32 epoch,w is -1.000000,loss is 0.000000
After 33 epoch,w is -1.000000,loss is 0.000000
After 34 epoch,w is -1.000000,loss is 0.000000
After 35 epoch,w is -1.000000,loss is 0.000000
After 36 epoch,w is -1.000000,loss is 0.000000
After 37 epoch,w is -1.000000,loss is 0.000000
After 38 epoch,w is -1.000000,loss is 0.000000
After 39 epoch,w is -1.000000,loss is 0.000000
```

- If the learning rate is set to 0.001, the output becomes:

```
After 0 epoch,w is 4.988000,loss is 36.000000
After 1 epoch,w is 4.976024,loss is 35.856144
After 2 epoch,w is 4.964072,loss is 35.712864
After 3 epoch,w is 4.952144,loss is 35.570156
After 4 epoch,w is 4.940240,loss is 35.428020
After 5 epoch,w is 4.928360,loss is 35.286449
After 6 epoch,w is 4.916503,loss is 35.145447
After 7 epoch,w is 4.904670,loss is 35.005009
After 8 epoch,w is 4.892860,loss is 34.865124
After 9 epoch,w is 4.881075,loss is 34.725803
After 10 epoch,w is 4.869313,loss is 34.587044
After 11 epoch,w is 4.857574,loss is 34.448833
After 12 epoch,w is 4.845859,loss is 34.311172
After 13 epoch,w is 4.834167,loss is 34.174068
After 14 epoch,w is 4.822499,loss is 34.037510
After 15 epoch,w is 4.810854,loss is 33.901497
After 16 epoch,w is 4.799233,loss is 33.766029
After 17 epoch,w is 4.787634,loss is 33.631104
After 18 epoch,w is 4.776059,loss is 33.496712
After 19 epoch,w is 4.764507,loss is 33.362858
After 20 epoch,w is 4.752978,loss is 33.229538
After 21 epoch,w is 4.741472,loss is 33.096756
After 22 epoch,w is 4.729989,loss is 32.964497
After 23 epoch,w is 4.718529,loss is 32.832775
After 24 epoch,w is 4.707092,loss is 32.701576
After 25 epoch,w is 4.695678,loss is 32.570904
After 26 epoch,w is 4.684287,loss is 32.440750
After 27 epoch,w is 4.672918,loss is 32.311119
After 28 epoch,w is 4.661572,loss is 32.182003
After 29 epoch,w is 4.650249,loss is 32.053402
After 30 epoch,w is 4.638949,loss is 31.925320
After 31 epoch,w is 4.627671,loss is 31.797745
After 32 epoch,w is 4.616416,loss is 31.670683
After 33 epoch,w is 4.605183,loss is 31.544128
After 34 epoch,w is 4.593973,loss is 31.418077
After 35 epoch,w is 4.582785,loss is 31.292530
After 36 epoch,w is 4.571619,loss is 31.167484
After 37 epoch,w is 4.560476,loss is 31.042938
After 38 epoch,w is 4.549355,loss is 30.918892
After 39 epoch,w is 4.538256,loss is 30.795341
```

The loss value keeps decreasing, but because the learning rate is too small the parameters update too slowly: after 40 epochs of iteration the optimal w has still not been found.

- If the learning rate is set to 0.999 instead, the output is:

```
After 0 epoch,w is -6.988000,loss is 36.000000
After 1 epoch,w is 4.976024,loss is 35.856144
After 2 epoch,w is -6.964072,loss is 35.712860
After 3 epoch,w is 4.952145,loss is 35.570156
After 4 epoch,w is -6.940241,loss is 35.428024
After 5 epoch,w is 4.928361,loss is 35.286461
After 6 epoch,w is -6.916504,loss is 35.145462
After 7 epoch,w is 4.904671,loss is 35.005020
After 8 epoch,w is -6.892861,loss is 34.865135
After 9 epoch,w is 4.881076,loss is 34.725815
After 10 epoch,w is -6.869314,loss is 34.587051
After 11 epoch,w is 4.857575,loss is 34.448849
After 12 epoch,w is -6.845860,loss is 34.311192
After 13 epoch,w is 4.834168,loss is 34.174084
After 14 epoch,w is -6.822500,loss is 34.037521
After 15 epoch,w is 4.810855,loss is 33.901508
After 16 epoch,w is -6.799233,loss is 33.766033
After 17 epoch,w is 4.787635,loss is 33.631107
After 18 epoch,w is -6.776060,loss is 33.496716
After 19 epoch,w is 4.764508,loss is 33.362869
After 20 epoch,w is -6.752979,loss is 33.229557
After 21 epoch,w is 4.741473,loss is 33.096771
After 22 epoch,w is -6.729990,loss is 32.964516
After 23 epoch,w is 4.718530,loss is 32.832787
After 24 epoch,w is -6.707093,loss is 32.701580
After 25 epoch,w is 4.695680,loss is 32.570911
After 26 epoch,w is -6.684288,loss is 32.440765
After 27 epoch,w is 4.672919,loss is 32.311131
After 28 epoch,w is -6.661573,loss is 32.182014
After 29 epoch,w is 4.650250,loss is 32.053413
After 30 epoch,w is -6.638950,loss is 31.925329
After 31 epoch,w is 4.627672,loss is 31.797762
After 32 epoch,w is -6.616417,loss is 31.670694
After 33 epoch,w is 4.605185,loss is 31.544140
After 34 epoch,w is -6.593974,loss is 31.418095
After 35 epoch,w is 4.582787,loss is 31.292547
After 36 epoch,w is -6.571621,loss is 31.167505
After 37 epoch,w is 4.560478,loss is 31.042959
After 38 epoch,w is -6.549357,loss is 30.918919
After 39 epoch,w is 4.538259,loss is 30.795368
```

With the learning rate set too large, the value of w keeps jumping back and forth on either side of the optimum and never settles on the optimal w.
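For this particular loss the oscillation is easy to derive: from $loss = (w+1)^2$ the update is $w_{t+1} = w_t - lr \cdot 2(w_t+1)$, i.e. $(w_{t+1}+1) = (1 - 2 \cdot lr)(w_t+1)$. With $lr = 0.999$ the factor is $-0.998$: the distance to the optimum $w = -1$ flips sign every epoch and shrinks by only 0.2% per step. With $lr = 0.2$ the factor is $0.6$ (fast, monotone convergence), and with $lr = 0.001$ it is $0.998$ (monotone but very slow), consistent with the three runs above.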
Thanks for reading!
If you find any mistakes, corrections and feedback are welcome!