August 20 Computer Vision Theory Study Notes: Neural Networks and the BP Algorithm


Preface

These are the computer vision theory study notes for August 20, covering neural networks and the backpropagation (BP) algorithm. The notes are divided into three chapters:

  • The Delta learning rule;
  • Gradient descent;
  • Implementing backpropagation in NumPy.

I. The Delta Learning Rule

A supervised learning algorithm: the connection weights are adjusted according to the difference between a neuron's actual output and its desired output:

$$\Delta w_{ij} = a \cdot (d_i - y_i)\,x_j(t)$$

where $\Delta w_{ij}$ is the weight increment, $d_i$ is the desired output of neuron $i$, $y_i$ is the actual output of neuron $i$, and $a$ is the learning rate.

  • Objective function:
    $$J(w) = \frac{1}{2}\,\|\mathbf{t} - \mathbf{z}\|^2 = \frac{1}{2}\sum_{k=1}^{c}(t_k - z_k)^2$$
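
To make the rule concrete, here is a minimal NumPy sketch of a single delta-rule update for one neuron. The values and the sigmoid output function are assumptions chosen just for illustration, not part of the original notes.

import numpy as np

# Illustrative values (assumed): one neuron with three inputs
x = np.array([1.0, 0.5, -0.2])   # inputs x_j(t)
w = np.array([0.1, -0.3, 0.2])   # current weights w_ij
d = 1.0                          # desired output d_i
a = 0.1                          # learning rate

y = 1 / (1 + np.exp(-np.dot(w, x)))   # actual output y_i (sigmoid, assumed)
delta_w = a * (d - y) * x             # Delta rule: a * (d_i - y_i) * x_j(t)
w = w + delta_w                       # apply the weight increment
print(w)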

II. Gradient Descent

$$w(m+1) = w(m) + \Delta w(m) = w(m) - \eta\,\frac{\partial J}{\partial w}$$

1. Output-layer weight update


$$J(w) = \frac{1}{2}\sum_{k=1}^{c}(t_k - z_k)^2,\qquad \frac{\partial J}{\partial w_{kj}} = \frac{\partial J}{\partial net_k}\,\frac{\partial net_k}{\partial w_{kj}}$$
where the total input to output unit $k$ is $net_k = \sum_{i=1}^{n_H} w_{ki}\,y_i$, so $\frac{\partial net_k}{\partial w_{kj}} = y_j$.

$$\frac{\partial J}{\partial net_k} = \frac{\partial J}{\partial z_k}\,\frac{\partial z_k}{\partial net_k} = -(t_k - z_k)\,f'(net_k)$$

Let $\delta_k = (t_k - z_k)\,f'(net_k)$; then:
$$\frac{\partial J}{\partial w_{kj}} = -(t_k - z_k)\,f'(net_k)\,y_j = -\delta_k\,y_j$$
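
As a quick numerical check of this formula, here is a small NumPy sketch (shapes, values, and the sigmoid activation are assumptions for illustration) that computes $\delta_k$ and the output-layer gradient:

import numpy as np

def f(x):                 # sigmoid activation (assumed choice)
    return 1 / (1 + np.exp(-x))

def f_deriv(x):           # derivative of the sigmoid
    s = f(x)
    return s * (1 - s)

# Illustrative shapes: n_H = 3 hidden units, c = 2 output units
y = np.array([0.2, 0.7, 0.5])           # hidden-layer outputs y_j
W_out = np.random.randn(2, 3) * 0.1     # output weights w_kj
t = np.array([1.0, 0.0])                # target outputs t_k

net_k = W_out @ y                       # total input to each output unit
z = f(net_k)                            # actual outputs z_k
delta_k = (t - z) * f_deriv(net_k)      # residual delta_k = (t_k - z_k) f'(net_k)
grad_W_out = -np.outer(delta_k, y)      # dJ/dw_kj = -delta_k * y_j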


2. Hidden-layer weight update

$$\frac{\partial J}{\partial w_{ji}} = \frac{\partial J}{\partial y_j}\,\frac{\partial y_j}{\partial net_j}\,\frac{\partial net_j}{\partial w_{ji}}$$

Also,
$$net_j = \sum_{m=1}^{d} w_{jm}\,x_m$$
so:
$$\frac{\partial y_j}{\partial net_j} = f'(net_j),\qquad \frac{\partial net_j}{\partial w_{ji}} = x_i$$
$$\frac{\partial J}{\partial y_j} = \frac{\partial}{\partial y_j}\Big[\frac{1}{2}\sum_{k=1}^{c}(t_k - z_k)^2\Big] = -\sum_{k=1}^{c}(t_k - z_k)\,f'(net_k)\,w_{kj}$$
$$\frac{\partial J}{\partial w_{ji}} = -\Big[\sum_{k=1}^{c}(t_k - z_k)\,f'(net_k)\,w_{kj}\Big]\,f'(net_j)\,x_i$$

Let $\delta_j = f'(net_j)\sum_{k=1}^{c}\delta_k\,w_{kj}$; then:
$$\frac{\partial J}{\partial w_{ji}} = -\delta_j\,x_i$$
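
The hidden-layer residual can be computed by propagating the output residual back through the output weights. A minimal NumPy sketch, assuming a tanh activation and small illustrative layer sizes (all names are hypothetical):

import numpy as np

def f(x):                  # tanh activation (assumed choice)
    return np.tanh(x)

def f_deriv(x):
    return 1.0 - np.tanh(x)**2

# Illustrative shapes: d = 2 inputs, n_H = 3 hidden units, c = 1 output unit
x = np.array([0.0, 1.0])
W_hid = np.random.randn(3, 2) * 0.1     # hidden weights w_jm
W_out = np.random.randn(1, 3) * 0.1     # output weights w_kj
t = np.array([1.0])

# Forward pass
net_j = W_hid @ x
y = f(net_j)
net_k = W_out @ y
z = f(net_k)

# Output residual, then propagate it back through w_kj
delta_k = (t - z) * f_deriv(net_k)
delta_j = f_deriv(net_j) * (W_out.T @ delta_k)   # delta_j = f'(net_j) * sum_k delta_k w_kj
grad_W_hid = -np.outer(delta_j, x)               # dJ/dw_ji = -delta_j * x_i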

To summarize:

  • Weight increment = -1 × learning step × partial derivative of the objective function with respect to the weight;
  • Partial derivative of the objective with respect to the weight = -1 × residual × input to the current layer;
  • Residual = derivative of the current layer's activation function × error propagated back from the layer above;
  • Error propagated back from the layer above = weighted sum of the residuals of the layer above.

The code below trains a single-layer logistic model on the XOR data using TensorFlow 1.x:

import tensorflow as tf
import numpy as np

tf.set_random_seed(777)
learning_rate = 0.1

x_data = [[0, 0],
          [0, 1],
          [1, 0],
          [1, 1]]

y_data = [[0],
          [1],
          [1],
          [0]]

x_data = np.array(x_data, dtype=np.float32)
y_data = np.array(y_data, dtype=np.float32)

X = tf.placeholder(tf.float32, [None, 2])
Y = tf.placeholder(tf.float32, [None, 1])

W = tf.Variable(tf.random_normal([2, 1]), name='weight')
b = tf.Variable(tf.random_normal([1]), name='bias')

# Hypothesis: predicted probability
hypothesis = tf.sigmoid(tf.matmul(X, W) + b)

# Loss function (binary cross-entropy)
loss = -tf.reduce_mean(Y * tf.log(hypothesis) + (1 - Y) *
                       tf.log(1 - hypothesis))

# Training op: gradient descent on the loss
train = tf.train.GradientDescentOptimizer(learning_rate=learning_rate).minimize(loss)
# Accuracy computation
# True if hypothesis > 0.5 else False
pred = tf.cast(hypothesis>0.5, dtype=tf.float32)
acc = tf.reduce_mean(tf.cast(tf.equal(pred, Y), dtype=tf.float32))

# Launch graph
with tf.Session() as sess:
    # Initialize variables
    sess.run(tf.global_variables_initializer())
    
    for step in range(10001):
        sess.run(train, feed_dict={X: x_data, Y: y_data})
        if step % 100 == 0:
            print(step, sess.run(loss, feed_dict={
                  X: x_data, Y: y_data}), sess.run(W))
            
    # Show the final predictions and accuracy
    h, c, a = sess.run([hypothesis, pred, acc],
                       feed_dict={X: x_data, Y: y_data})
    print("\nHypothesis: ", h, "\nCorrect: ", c, "\nAccuracy: ", a)

>>> Hypothesis:  [[0.5]
     [0.5]
     [0.5]
     [0.5]]
    Correct:  [[0.]
     [0.]
     [0.]
     [0.]]
    Accuracy:  0.5
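
The output shows the model stuck at 0.5 accuracy: a single layer of weights cannot separate the XOR data. A minimal sketch of a fix, assuming we keep the rest of the graph above and only replace the `W`, `b`, and `hypothesis` definitions with a two-unit hidden layer (layer sizes and names are illustrative):

# Hypothetical two-layer replacement for W, b and hypothesis above
W1 = tf.Variable(tf.random_normal([2, 2]), name='weight1')
b1 = tf.Variable(tf.random_normal([2]), name='bias1')
layer1 = tf.sigmoid(tf.matmul(X, W1) + b1)        # hidden layer with 2 units

W2 = tf.Variable(tf.random_normal([2, 1]), name='weight2')
b2 = tf.Variable(tf.random_normal([1]), name='bias2')
hypothesis = tf.sigmoid(tf.matmul(layer1, W2) + b2)

# Trained with the same loss and optimizer, this version can fit XOR
# (accuracy typically reaches 1.0 after enough steps).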

3. Stochastic Gradient Descent (SGD)

Instead of using the full training set for every update, iterate using only part of the samples (a single example or a mini-batch) at each step, as sketched below.
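
A minimal NumPy sketch of the idea, assuming a generic gradient function `grad(w, X_batch, y_batch)` supplied by the caller (the function and names are hypothetical):

import numpy as np

def sgd(w, X, y, grad, learning_rate=0.1, batch_size=2, epochs=100):
    """Mini-batch SGD: each update uses only a random subset of the samples."""
    n = X.shape[0]
    for _ in range(epochs):
        idx = np.random.permutation(n)          # shuffle the sample indices
        for start in range(0, n, batch_size):
            batch = idx[start:start + batch_size]
            w = w - learning_rate * grad(w, X[batch], y[batch])
    return w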


III. Implementing Backpropagation in NumPy

# Define the activation functions (tanh, logistic) and their derivatives
def tanh(x):
    return np.tanh(x)

def tanh_deriv(x):
    return 1. - np.tanh(x)**2

def logistic(x):
    return 1 / (1 + np.exp(-x))

def logistic_derivative(x):
    return logistic(x) * (1 - logistic(x))

# Define the neural network
class NeuralNetwork:
    # Initialization: layers is a list, e.g. [10, 10, 3] means 10 neurons in
    # the first layer, 10 in the second and 3 in the third

    def __init__(self, layers, activation='tanh'):
        '''
        layers: list with at least 2 values;
        activation: 'tanh' or 'logistic'
        '''
        
        if activation == 'logistic':
            self.activation = logistic
            self.activation_deriv = logistic_derivative
            
        elif activation == 'tanh':
            self.activation = tanh
            self.activation_deriv = tanh_deriv
            
        self.weights = []
        # The loop starts at 1, so weights are initialized around each hidden layer
        for i in range(1, len(layers) - 1):
            # Weights from the previous layer (plus bias) into the current layer
            self.weights.append((2*np.random.random((layers[i - 1] + 1, layers[i] + 1))-1)*0.25)
            
            # Weights from the current layer (plus bias) into the next layer
            self.weights.append((2*np.random.random((layers[i] + 1, layers[i + 1]))-1)*0.25)
            
    # Training function: X is a matrix with one sample per row, y holds the corresponding labels
        
    def fit(self, X, y, learning_rate=0.1, epochs=10000):
        X = np.atleast_2d(X) # make sure X is at least 2-D
        temp = np.ones([X.shape[0], X.shape[1] + 1]) # X plus an extra bias column of ones
        temp[:, 0:-1] = X 
        
        X = temp
        y = np.array(y)
        
        for k in range(epochs):
            # pick one random sample and update the network with it
            i = np.random.randint(X.shape[0])
            a = [X[i]]
            
            # forward pass: propagate the activations through every layer
            for l in range(len(self.weights)):  
                a.append(self.activation(np.dot(a[l], self.weights[l])))
                
            error = y[i] - a[-1]  
            deltas = [error * self.activation_deriv(a[-1])]
            if  k%1000 == 0:
                print(k,'...',error*error*100)
                
            # back-propagate the error and update the weights
            for l in range(len(a)-2, 0, -1): # start from the second-to-last layer
                deltas.append(deltas[-1].dot(self.weights[l].T)*self.activation_deriv(a[l]))
                
            deltas.reverse()
            for i in range(len(self.weights)):
                layer = np.atleast_2d(a[i])
                delta = np.atleast_2d(deltas[i])
                self.weights[i] += learning_rate * layer.T.dot(delta)  # accumulate the gradient step
                
    # Prediction function
    def predict(self, x):
        x = np.array(x)
        temp = np.ones(x.shape[0] + 1)
        temp[0:-1] = x
        a = temp
        for l in range(0, len(self.weights)):
            a = self.activation(np.dot(a, self.weights[l]))
        return a

nn = NeuralNetwork([2,2,1], 'tanh')  
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])  
y = np.array([0, 1, 1, 0])  
nn.fit(X, y)  
for i in [[0, 0], [0, 1], [1, 0], [1,1]]:  
    print(i,nn.predict(i))
