Comparative Study: Linear and BP Neural Networks on the XOR Problem

向上的研究僧

Tags: neural networks, python

Article link: https://blog.csdn.net/qq_43726771/article/details/105167250

1. Solving the XOR problem with a linear neural network

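(1) The main difference between a linear neural network and a perceptron is the activation function. A perceptron's activation (sign) can output only two possible values, whereas a linear neural network uses a linear activation (y = x), so its output can be any value.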
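To make the difference concrete, here is a minimal sketch that applies both activations to the same weighted sums. The function names sign_activation and purelin and the sample values in z are chosen for illustration only, and mapping an input of exactly 0 to +1 is an assumption (conventions differ).

```python
import numpy as np

def sign_activation(z):
    # Perceptron-style activation: only two possible outputs, -1 or +1.
    # Assumption for this sketch: an input of exactly 0 maps to +1.
    return np.where(z >= 0, 1, -1)

def purelin(z):
    # Linear activation (y = x): the weighted sum passes through unchanged.
    return z

# Illustrative weighted sums, not taken from the article
z = np.array([-0.7, 0.0, 0.2, 1.5])

perceptron_out = sign_activation(z)
linear_out = purelin(z)

print(perceptron_out)
print(linear_out)
```

Because the linear output keeps the magnitude of the weighted sum, the error (label minus output) changes smoothly as the weights change, which is what the weight update in the code that follows relies on.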

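(2) XOR is a nonlinear problem. To handle it with a linear neural network, the input data is first made nonlinear by expanding it into polynomial terms: x0, x1, x2, x1^2, x1*x2, x2^2. Apart from this preprocessing of the inputs, the procedure is the same as for a linear problem.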
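The expansion can be sketched as a small helper that turns a raw pair (x1, x2) into the six polynomial terms listed above. The helper name expand_features is hypothetical; x0 is taken to be the constant bias input 1, which matches the first column of the input matrix used in the code.

```python
import numpy as np

def expand_features(x1, x2):
    # Order of terms: [x0, x1, x2, x1^2, x1*x2, x2^2], with x0 = 1 as the bias input
    return np.array([1, x1, x2, x1 * x1, x1 * x2, x2 * x2])

# The four raw XOR inputs
raw_inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]

# Stack the expanded samples into a (4, 6) matrix
X_expanded = np.array([expand_features(a, b) for a, b in raw_inputs])
print(X_expanded)
```

The resulting rows are the same four samples that the training code defines by hand.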

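import numpy as np
import matplotlib.pyplot as plt
# Input data: 4 samples, each expanded to [x0, x1, x2, x1^2, x1*x2, x2^2]
X=np.array([[1,0,0,0,0,0],
            [1,0,1,0,0,1],
            [1,1,0,1,0,0],
            [1,1,1,1,1,1]])
# Labels: one target value per sample
Y=np.array([-1,1,1,-1])
# Initialize the weights randomly in the range -1 to 1
W=(np.random.random(X.shape[1])-0.5)*2
print(W)
lr=0.11
n=0
output=0
def update():
    global X,Y,W,lr,n,output
    n+=1
    # A linear neural network is structured much like a perceptron; only the
    # activation differs: the sign function is replaced by purelin (y=x)
    output=np.dot(X,W.T)
    W_C=lr*((Y-output).dot(X))/int(X.shape[0])
    W=W+W_C
for _ in range(10000):
    update()# update the weights
    print("Updated weights:",W)
    print("Iteration count:",n)
    output=np.dot(X,W.T)
    print(output)
    # The outputs are floats, so compare with a tolerance rather than with ==
    if np.allclose(output,Y,atol=1e-3):
        print("Finished!")
        print("epoch",n)
        break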
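# Positive samples: (0,1) and (1,0)
x1=[0,1]
y1=[1,0]
# Negative samples: (0,0) and (1,1)
x2=[0,1]
y2=[0,1]
def calculate(x,root):
    # The decision boundary is where the network output equals 0. For a given
    # x1 this is a quadratic in x2: a*x2^2 + b*x2 + c = 0.
    # root selects one of its two solutions.
    a=W[5]
    b=W[2]+x*W[4]
    c=W[0]+x*W[1]+x*x*W[3]
    if root==1:
        return (-b+np.sqrt(b*b-4*a*c))/(2*a)
    if root==2:
        return (-b-np.sqrt(b*b-4*a*c))/(2*a)
xdata=np.linspace(-1,2)
plt.figure()
plt.plot(xdata,calculate(xdata,1),'r')
plt.plot(xdata,calculate(xdata,2),'r')
plt.plot(x1,y1,'bo')
plt.plot(x2,y2,'yo')
plt.show()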
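As a cross-check on the iterative training above, the expanded system has 4 equations and 6 unknown weights, so an exact fit exists. The sketch below is not part of the original approach: it asks np.linalg.lstsq for the minimum-norm solution and confirms that it reproduces the labels. The variable names w_exact and fit are chosen for illustration. Training from a random start will generally land on a different set of weights that fits the same four points.

```python
import numpy as np

# Same expanded inputs and labels as in the training code
X = np.array([[1, 0, 0, 0, 0, 0],
              [1, 0, 1, 0, 0, 1],
              [1, 1, 0, 1, 0, 0],
              [1, 1, 1, 1, 1, 1]], dtype=float)
Y = np.array([-1, 1, 1, -1], dtype=float)

# Minimum-norm least-squares solution of X @ w = Y
w_exact, residuals, rank, singular_values = np.linalg.lstsq(X, Y, rcond=None)

# Outputs of the linear network with these weights
fit = X @ w_exact

print("rank of X:", rank)
print("outputs:", np.round(fit, 6))
```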

2. Solving the XOR problem with a BP neural network

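This article explains the underlying principles very clearly: https://www.jianshu.com/p/2276ba084602
A few things I noticed while practicing:
(1) The label Y is defined as a two-dimensional array rather than a one-dimensional one, which matters for the later calculations;
(2) Because there is an additional hidden layer, there are two weight matrices, and the code shows the backpropagation of the error quite clearly;
(3) The combination of the custom function judge() with map() at the end is quite neat.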
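Point (1) can be illustrated with a short shape check. This is a minimal sketch: L2 here is a stand-in array with the output layer's shape (4, 1) rather than a real network output, and the names Y_1d, Y_2d, bad_error and good_error are chosen for illustration.

```python
import numpy as np

# Stand-in for the output layer result, shape (4, 1)
L2 = np.full((4, 1), 0.5)

Y_1d = np.array([0, 1, 1, 0])      # shape (4,)
Y_2d = np.array([[0, 1, 1, 0]])    # shape (1, 4)

# A 1-D label array broadcasts against (4, 1) into a (4, 4) matrix,
# and transposing a 1-D array does not change its shape.
bad_error = Y_1d - L2

# Transposing the 2-D label array gives (4, 1), so the subtraction is
# element-wise, one error value per sample.
good_error = Y_2d.T - L2

print(bad_error.shape)
print(good_error.shape)
```

With the 1-D labels the subtraction still runs without raising an error, which is why the mismatch is easy to miss.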
Here is the code that ran successfully:

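import numpy as np
# Input data (the first column is the constant bias input)
X=np.array([[1,0,0],
            [1,0,1],
            [1,1,0],
            [1,1,1]])
# Labels. Note that Y is a 2-D array: 1 row of 4 elements  https://blog.csdn.net/qq_43332629/article/details/90577009
Y=np.array([[0,1,1,0]])
print(Y.shape)
# Initialize the weights in the range -1 to 1
V=np.random.random((3,4))*2-1
W=np.random.random((4,1))*2-1
print(V)
print(W)
# Learning rate
lr=0.11
# Activation function
def sigmoid(x):
    return 1/(1+np.exp(-x))
# Derivative of the activation function; x is the sigmoid output, not its input
def dsigmoid(x):
    return x*(1-x)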
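# Update the weights
def update():
    global X,Y,W,V,lr
    L1=sigmoid(np.dot(X,V))# hidden layer output (4,4)
    L2=sigmoid(np.dot(L1,W))# output layer output (4,1)
    # For the derivation of these formulas see https://www.jianshu.com/p/2276ba084602 (important!)
    L2_delta=(Y.T-L2)*dsigmoid(L2)# L2 delta = (target output - actual output) * derivative of the activation at L2
    L1_delta=L2_delta.dot(W.T)*dsigmoid(L1)# L1 delta = L2 delta * weights (error fed back from the output layer) * derivative of the activation at L1
    print(L2_delta)
    W_C=lr*L1.T.dot(L2_delta)# learning rate * L1.T * L2 delta
    V_C=lr*X.T.dot(L1_delta)# learning rate * X.T * L1 delta

    W=W+W_C
    V=V+V_C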
for i in range(20000):
    update()
    if i%500==0:
        L1=sigmoid(np.dot(X,V))# hidden layer output (4,4)
        L2=sigmoid(np.dot(L1,W))# output layer output (4,1)
        print("Error",np.mean(np.abs(Y.T-L2)))
L1=sigmoid(np.dot(X,V))# hidden layer output (4,4)
L2=sigmoid(np.dot(L1,W))# output layer output (4,1)
print(L2)
def judge(x):
    if x>=0.5:
        return 1
    else:
        return 0
# map() applies the given function to every item of an iterable: map(function, iterable, ...)
for i in map(judge,L2):
    print(i)
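For comparison, the judge()/map() step can also be written as a single vectorized NumPy expression. The sketch below uses made-up output values in place of a trained network's L2, and the names via_map and vectorized are chosen for illustration; both approaches apply the same 0.5 threshold.

```python
import numpy as np

def judge(x):
    # Same thresholding rule as in the article's code
    if x >= 0.5:
        return 1
    else:
        return 0

# Illustrative output-layer values with shape (4, 1), not real training results
L2 = np.array([[0.03], [0.97], [0.96], [0.04]])

# map() iterates over the rows of L2, each a one-element array
via_map = list(map(judge, L2))

# Vectorized alternative: compare the whole array at once, then flatten
vectorized = (L2 >= 0.5).astype(int).ravel()

print(via_map)
print(vectorized)
```

The map() version makes the per-sample decision explicit, while the vectorized version avoids the Python-level loop; for four samples the difference is purely stylistic.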