1. Building the Neural Network
To solve the XOR problem, we build the neural network shown in the figure. The network has 3 layers in total, where $z^{(l)}$ denotes the intermediate variable of layer $l$ before the activation function is applied, $a^{(l)}$ denotes the output of layer $l$, $\theta_{1}$ denotes the weight matrix between the input layer and the hidden layer, and $\theta_{2}$ denotes the weight matrix between the hidden layer and the output layer.
2. Code
import numpy as np
# input data: the four XOR patterns and their labels
X = np.array([[0, 0],
              [0, 1],
              [1, 0],
              [1, 1]])
y = np.array([[0], [1], [1], [0]])
print(y.shape)
(4, 1)
The sigmoid function is:
$g(z)=\frac{1}{1+e^{-z}}$
Its derivative is:
$g'(z)=\frac{e^{-z}}{(1+e^{-z})^{2}}=\frac{1}{1+e^{-z}}\cdot\frac{e^{-z}}{1+e^{-z}}=\frac{1}{1+e^{-z}}\left(\frac{1+e^{-z}}{1+e^{-z}}-\frac{1}{1+e^{-z}}\right)=g(z)(1-g(z))$
# sigmoid function
def sigmoid(x):
    return 1.0 / (1 + np.exp(-x))

# derivative of the sigmoid function
def dsigmoid(x):
    return np.multiply(sigmoid(x), 1 - sigmoid(x))
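As a quick sanity check (an added sketch, not part of the original code), the analytic derivative can be compared against a central finite difference:

# Illustrative check: dsigmoid should match a numerical derivative
z = np.linspace(-5, 5, 11)
eps = 1e-6
numeric = (sigmoid(z + eps) - sigmoid(z - eps)) / (2 * eps)
print(np.allclose(dsigmoid(z), numeric))  # expected: True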
In the network we built, $\theta_{1}$ is a 4×3 matrix and $\theta_{2}$ is a 1×5 matrix.
Forward propagation:
$a^{(1)}=x$ (with the bias term $x_{0}=1$ added)
$z^{(2)}=a^{(1)}\theta_{1}$
$a^{(2)}=\mathrm{sigmoid}(z^{(2)})$
$z^{(3)}=a^{(2)}\theta_{2}$ (with the bias term $a^{(2)}_{0}=1$ added to $a^{(2)}$)
$result=a^{(3)}=\mathrm{sigmoid}(z^{(3)})$
(In the code below the samples are stored as the rows of $X$, so the implementation multiplies by the transposed weight matrices, e.g. z2 = a1 * theta1.T.)
def forward_propagate(X, theta1, theta2):
    # convert everything to matrices so * means matrix multiplication
    X = np.matrix(X)
    theta1 = np.matrix(theta1)
    theta2 = np.matrix(theta2)
    m = X.shape[0]
    # add the bias term to the input layer
    a1 = np.concatenate((np.ones((m, 1)), X), axis=1)
    # compute z2, then add the bias term to the hidden-layer output
    z2 = a1 * theta1.T
    a2 = np.concatenate((np.ones((m, 1)), sigmoid(z2)), axis=1)
    z3 = a2 * theta2.T
    res = sigmoid(z3)  # output layer
    return a1, z2, a2, z3, res
When setting initial values for the model parameters: in linear regression and logistic regression we initialized every parameter to 0, but that approach does not work for a neural network. If all parameters are set to 0, every activation unit in the second layer receives the same value from the input layer, so all hidden units compute identical outputs (and identical gradients, so they can never become different). Setting all parameters to the same nonzero value fails for the same reason; the weights must therefore be initialized randomly, as in the sketch below.
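A minimal sketch of the symmetry problem (illustrative, with hypothetical names theta1_sym/theta2_sym, not from the original code):

# Illustrative: identical initial weights leave all hidden units identical
theta1_sym = np.full((4, 3), 0.5)
theta2_sym = np.full((1, 5), 0.5)
_, _, a2_sym, _, _ = forward_propagate(X, theta1_sym, theta2_sym)
print(a2_sym)  # every non-bias column of the hidden layer is identical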
# parameter initialization: random values break the symmetry
theta1 = np.random.rand(4, 3)
theta2 = np.random.rand(1, 5)
print(theta1.shape, ' ', theta2.shape)
a1, z2, a2, z3, res = forward_propagate(X, theta1, theta2)
res
(4, 3) (1, 5)
matrix([[0.76925913],
[0.77569001],
[0.79155468],
[0.7967637 ]])
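To confirm the dimension bookkeeping from above (a small added check), the intermediate shapes can be printed:

# Illustrative shape check for the 4 XOR samples
print(a1.shape, z2.shape, a2.shape, z3.shape)  # (4, 3) (4, 4) (4, 5) (4, 1)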
# cost function: cross-entropy averaged over the samples
def cost(X, y, theta1, theta2):
    X = np.matrix(X)
    y = np.matrix(y)
    theta1 = np.matrix(theta1)
    theta2 = np.matrix(theta2)
    a1, z2, a2, z3, res = forward_propagate(X, theta1, theta2)
    J = 0
    for i in range(y.shape[0]):
        first = np.multiply(y[i, :], np.log(res[i, :]))
        second = np.multiply(1 - y[i, :], np.log(1 - res[i, :]))
        print(first, ' ', second)
        J += -np.sum(first + second)
    return J / y.shape[0]
cost_value = cost(X, y, theta1, theta2)  # note: don't rebind the name cost, or the function is lost
print(cost_value)
[[-0.]] [[-1.46645997]]
[[-0.2540023]] [[-0.]]
[[-0.23375632]] [[-0.]]
[[-0.]] [[-1.59338592]]
0.886901128052618
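The per-sample loop in cost is easy to follow but not necessary; a vectorized variant (an added sketch, cost_vec is a hypothetical name) computes the same average cross-entropy:

# Illustrative vectorized cost: same value as the loop version, no Python loop
def cost_vec(X, y, theta1, theta2):
    y = np.matrix(y)
    _, _, _, _, res = forward_propagate(X, theta1, theta2)
    J = -np.multiply(y, np.log(res)) - np.multiply(1 - y, np.log(1 - res))
    return np.sum(J) / y.shape[0]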
Backpropagation:
$\delta^{(3)}=a^{(3)}-y$
$\delta^{(2)}=(\theta^{(2)})^{T}\delta^{(3)}.*g'(z^{(2)})$ (where $.*$ denotes the element-wise product)
$\frac{\partial}{\partial\theta^{(2)}}J(\theta)=a^{(2)}\delta^{(3)}$
$\frac{\partial}{\partial\theta^{(1)}}J(\theta)=a^{(1)}\delta^{(2)}$
Backpropagation algorithm steps:
1. Randomly initialize the weights of the neural network.
2. Loop over all samples:
   1. Run forward propagation to obtain the prediction $a^{(L)}=h_{\theta}(x)$.
   2. Run backpropagation, computing each layer's error starting from the output layer and using it to obtain the partial derivatives. The output layer's error is simply the difference between the prediction and the true value: $\delta^{(L)}=a^{(L)}-y$. The error of every hidden layer is computed from the error of the layer after it: $\delta^{(l)}=(\theta^{(l)})^{T}\delta^{(l+1)}.*a^{(l)}.*(1-a^{(l)})$, with the bias term $a^{(l)}_{0}=1$.
   Solve layer by layer and accumulate the errors: $\Delta^{(l)}_{i,j}:=\Delta^{(l)}_{i,j}+a^{(l)}_{j}\delta^{(l+1)}_{i}$, or, in vectorized form, $\Delta^{(l)}:=\Delta^{(l)}+\delta^{(l+1)}(a^{(l)})^{T}$.
3. After looping over all samples, the partial derivatives are $\frac{\partial}{\partial\theta^{(l)}_{i,j}}J(\theta)=D^{(l)}_{i,j}$, where $D^{(l)}_{i,j}$ is the accumulated $\Delta^{(l)}_{i,j}$ divided by the number of samples $m$ (the code below skips the division, which simply folds the $\frac{1}{m}$ factor into the learning rate).
def back_propagate(X, y, theta1, theta2):
    X = np.matrix(X)
    y = np.matrix(y)
    theta1 = np.matrix(theta1)
    theta2 = np.matrix(theta2)
    a1, z2, a2, z3, res = forward_propagate(X, theta1, theta2)
    Deleta_1 = Deleta_2 = 0
    for i in range(y.shape[0]):
        deleta3 = res[i] - y[i]  # output-layer error, shape 1x1
        # prepend the bias term so the shape matches theta2, then apply g'(z)
        z2_i = np.concatenate((np.ones((1, 1)), z2[i, :]), axis=1)
        deleta2 = np.multiply((theta2.T * deleta3.T).T, dsigmoid(z2_i))
        # accumulate the gradients, dropping the bias component of deleta2
        Deleta_1 += (deleta2[:, 1:]).T * a1[i]
        Deleta_2 += (deleta3.T) * a2[i]
    return Deleta_1, Deleta_2
Deleta_1, Deleta_2 = back_propagate(X, y, theta1, theta2)
Deleta_1
Deleta_2
matrix([[1.13326752, 0.82401803, 0.77403078, 0.76294633, 0.84984931]])
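Before training, the analytic gradients can be verified numerically (an added gradient-check sketch, using the hypothetical cost_vec helper defined above). Since cost averages over the m = 4 samples while Deleta_1 is an un-averaged sum, the comparison divides by 4:

# Illustrative gradient check on a single entry of theta1
eps = 1e-4
t1_plus = theta1.copy();  t1_plus[0, 0] += eps
t1_minus = theta1.copy(); t1_minus[0, 0] -= eps
numeric = (cost_vec(X, y, t1_plus, theta2) - cost_vec(X, y, t1_minus, theta2)) / (2 * eps)
print(numeric, Deleta_1[0, 0] / 4)  # the two numbers should agree closely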
lr = 0.1       # learning rate
epochs = 2000  # number of iterations
for i in range(epochs + 1):
    Deleta_1, Deleta_2 = back_propagate(X, y, theta1, theta2)
    a1, z2, a2, z3, res = forward_propagate(X, theta1, theta2)
    if i % 50 == 0:
        print('error:', np.mean(np.abs(res - y)))  # report the mean absolute error
    # gradient descent: adjust the weights
    theta1 = theta1 - lr * Deleta_1
    theta2 = theta2 - lr * Deleta_2
a1, z2, a2, z3, res = forward_propagate(X, theta1, theta2)
print('theta1:', theta1)
print('theta2:', theta2)
res
error: 0.040183617427798815
error: 0.037792867515279736
error: 0.03565109495819674
error: 0.03372308789986214
error: 0.031979741909257675
error: 0.030396830901489855
error: 0.028954055102851477
error: 0.027634297830494314
error: 0.026423040868398265
error: 0.025307901217641898
error: 0.024278261430987456
error: 0.023324972631312165
error: 0.022440114380234615
error: 0.021616799315726496
error: 0.02084901327611314
error: 0.0201314837296932
error: 0.019459570918586792
error: 0.018829177335373613
error: 0.018236672078206877
error: 0.017678827345008367
error: 0.017152764882015652
error: 0.016655910634840936
error: 0.016185956189981573
error: 0.01574082586288858
error: 0.015318648501479008
error: 0.014917733243664221
error: 0.014536548603465642
error: 0.014173704369793188
error: 0.013827935890556024
error: 0.013498090386740928
error: 0.01318311499983097
error: 0.012882046324045687
error: 0.012594001214464417
error: 0.012318168694768972
error: 0.012053802815420003
error: 0.011800216335601915
error: 0.011556775121061647
error: 0.011322893165701568
error: 0.011098028157998123
error: 0.010881677524456151
error: 0.01067337489171166
theta1: [[-1.68701807 6.3911878 6.1133159 ]
[ 1.76099424 7.3679309 -4.98657228]
[-0.56086751 -1.58570626 5.23497725]
[-0.69598197 -1.56579716 5.46369586]]
theta2: [[ 3.62659633 11.38297911 -9.38942738 -4.88871387 -5.35806465]]
matrix([[0.00209311],
[0.98738079],
[0.98903377],
[0.0169986 ]])
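Rounding the final outputs at 0.5 recovers the XOR truth table (a short added check):

# Illustrative: threshold the trained outputs
print(np.round(res))  # expected: [[0.] [1.] [1.] [0.]]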
3. np.insert and np.concatenate
b = np.array([[3, 4],
              [3, 4]])
e = np.array([[6, 5],
              [6, 5]])
x = np.ones((4, 1))
f = np.concatenate((b, b))
a = np.concatenate((b, b), axis=0)  # with no axis given, the default is axis=0 (stack rows vertically)
c = np.concatenate((b, e), axis=1)  # concatenate along columns (horizontally)
d = np.concatenate((b, e), axis=-1) # axis=-1 is the last axis, here the same as axis=1
print(f)
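For reference (added here for illustration): f and a both evaluate to the 4×2 array [[3 4], [3 4], [3 4], [3 4]], while c and d both evaluate to [[3 4 6 5], [3 4 6 5]].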
b = np.array([[3, 4],
              [3, 4]])
e = np.array([[6, 5],
              [6, 5]])
x = np.ones((4, 1))
# a = np.insert(arr, obj, values, axis)
# arr: the original array (one or more dimensions)
# obj: the index at which to insert
# values: the content to insert
# axis: insert along rows or columns (0 = rows, 1 = columns)
#f = np.insert(b, 0, e, axis=0)
'''[[6 5]
[6 5]
[3 4]
[3 4]]'''
f = np.insert(b, 1, e, axis=0)
'''[[3 4]
[6 5]
[6 5]
[3 4]]'''
f = np.insert(b, 0, e, axis=1)
'''[[6 6 3 4]
[5 5 3 4]]'''
f = np.insert(b, 1, e, axis=1)
'''[[3 6 6 4]
[3 5 5 4]]'''
print(f)
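As a tie-back to the network code above, np.insert offers a more compact way to add the bias column than np.concatenate; a small illustrative sketch (a1_alt is a hypothetical name, not from the original code):

# Illustrative: add the bias column with np.insert instead of np.concatenate
a1_alt = np.insert(X, 0, 1, axis=1)  # insert a column of ones at index 0
print(a1_alt)
'''[[1 0 0]
[1 0 1]
[1 1 0]
[1 1 1]]'''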