Programming a Single-Hidden-Layer Neural Network from Scratch (No Packages)

1. Generate $n$ equidistant data points within the interval $[-1,1]$. These points lie on the Runge function $y=\dfrac{1}{1+25x^2}$.

# You only need this line when running this code in a Jupyter notebook
%matplotlib inline

############################################
# You only need this part
import numpy as np

# n is the number of data points
n=21

# Choose x to be equidistant points
x=np.linspace(-1,1,n)

# Calculate the corresponding y
y=1/(1+25*x**2)
############################################

# For visualization
import matplotlib.pyplot as plt
fig1=plt.figure()
ax1=fig1.add_subplot(111)
ax1.plot(x,y,"or")
plt.show()

[Figure: scatter plot of the 21 sampled points on the Runge function (output_2_0.png)]

2. Calculate the loss function ($y_j$ denotes the true value, $\hat{y_j}$ the approximation)

Given a dataset $\{(x_j,y_j)\}^n_{j=1}$, a neural network with one hidden layer and $m$ nodes approximates $y_j$ with $\hat{y_j}$:

$$\hat{y_j}=\sum^m_{i=1} c_i f(w_i x_j+b_i)$$

where $w,b,c\in\mathbb{R}^m$ are the parameters and $f$ is the activation function.
The loss function is defined as:

$$
\begin{aligned}
L(w,b,c)&=\frac{1}{2}\sum^n_{j=1}\left\|y_j-\hat{y_j}\right\|^2\\
&=\frac{1}{2}\sum^n_{j=1}\left\|y_j-\sum^m_{i=1} c_i f(w_i x_j+b_i)\right\|^2
\end{aligned}
$$

##############################################################################
# You only need this part

# The relu function
def my_relu(x):
    return (abs(x) + x) / 2

# The sigmoid function
def my_sigmoid(x):
    return 1/(1+np.exp(-x))

# This function is used for calculating \hat{y}_j given x_j,w,b,c
def my_predict(x,w,b,c,method):
    yhat=0
    m=w.shape[0]
    if method=="relu":
        for i in range(m):
            z=w[i]*x+b[i]
            yhat=yhat+c[i]*my_relu(z)
    elif method=="sigmoid":
        for i in range(m):
            z=w[i]*x+b[i]
            yhat=yhat+c[i]*my_sigmoid(z) 
    return yhat

# This function calculates the value of the loss function
def my_loss(x,y,w,b,c,method):
    loss=0
    for i in range(y.shape[0]):
        yhat=my_predict(x[i],w,b,c,method)
        loss=loss+(yhat-y[i])**2
    return 0.5*loss
###############################################################################


# m is the number of nodes you use
m=2

# w, b, c are the decision variables to be optimized; here we just draw random values for testing
w=np.random.randn(m)
b=np.random.randn(m)
c=np.random.randn(m)

# Try relu activation function and sigmoid activation function
print(my_loss(x,y,w,b,c,"relu"))
print(my_loss(x,y,w,b,c,"sigmoid"))
11.637683397902427
8.748887143303836
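
The loop-based implementation above follows the formula directly. As an optional cross-check, the same prediction and loss can be computed with vectorized NumPy operations. This is only a minimal sketch: the names my_predict_vec and my_loss_vec are illustrative helpers and are not part of the original code.

# Vectorized sketch of \hat{y} and the loss (assumes x, y, w, b, c and the
# activation functions above are already defined; helper names are illustrative)
def my_predict_vec(x,w,b,c,method):
    # z has shape (n, m): z[j, i] = w[i]*x[j] + b[i]
    z=np.outer(x,w)+b
    f=my_relu(z) if method=="relu" else my_sigmoid(z)
    # \hat{y}_j = sum_i c_i f(w_i x_j + b_i)
    return f@c

def my_loss_vec(x,y,w,b,c,method):
    yhat=my_predict_vec(x,w,b,c,method)
    return 0.5*np.sum((yhat-y)**2)

# Should agree with the loop version up to floating-point error
print(my_loss_vec(x,y,w,b,c,"relu"), my_loss(x,y,w,b,c,"relu"))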

3. Calculate the gradient

We then calculate the gradient of $L$ with respect to each element of $w$, $b$, and $c$
(i.e., take the partial derivatives with respect to $w$, $b$, and $c$ separately):

$$
\begin{aligned}
\frac{\partial L}{\partial w_i}&=\sum^n_{j=1} \left(y_j-\sum^m_{k=1} c_k f(w_k x_j+b_k)\right)\left(-c_i\,\nabla f(w_i x_j+b_i)\right)x_j\\
\frac{\partial L}{\partial b_i}&=\sum^n_{j=1} \left(y_j-\sum^m_{k=1} c_k f(w_k x_j+b_k)\right)\left(-c_i\,\nabla f(w_i x_j+b_i)\right)\\
\frac{\partial L}{\partial c_i}&=\sum^n_{j=1} \left(y_j-\sum^m_{k=1} c_k f(w_k x_j+b_k)\right)\left(-f(w_i x_j+b_i)\right)
\end{aligned}
$$

############################################################################
# You only need this part

# This function calculates the gradient of the relu function
def my_gradrelu(x):
    # Subgradient of ReLU: 1 for x > 0, 0 for x <= 0
    return np.sign(x+abs(x))

# This function calculates the gradient of the sigmoid function
def my_gradsigmoid(x):
    return my_sigmoid(x)*(1-my_sigmoid(x))

# This function calculates the gradient with respect to c
def my_gradc(x,y,w,b,c,method):
    gc=np.zeros(w.shape[0])
    for i in range(w.shape[0]):
        if method=="sigmoid":
            for j in range(x.shape[0]):
                gc[i]=gc[i]+(y[j]-my_predict(x[j],w,b,c,method))*(-my_sigmoid(w[i]*x[j]+b[i]))
        elif method=="relu":
            for j in range(x.shape[0]):
                gc[i]=gc[i]+(y[j]-my_predict(x[j],w,b,c,method))*(-my_relu(w[i]*x[j]+b[i]))
                
    return gc

# This function calculates the gradient with respect to b
def my_gradb(x,y,w,b,c,method):
    gb=np.zeros(w.shape[0])
    for i in range(w.shape[0]):
        if method=="sigmoid":
            for j in range(x.shape[0]):
                gb[i]=gb[i]+(y[j]-my_predict(x[j],w,b,c,method))*(-c[i]*my_gradsigmoid(w[i]*x[j]+b[i]))
        elif method=="relu":
            for j in range(x.shape[0]):
                gb[i]=gb[i]+(y[j]-my_predict(x[j],w,b,c,method))*(-c[i]*my_gradrelu(w[i]*x[j]+b[i]))
    return gb

# This function calculates the gradient with respect to w
def my_gradw(x,y,w,b,c,method):
    gw=np.zeros(w.shape[0])
    for i in range(w.shape[0]):
        if method=="sigmoid":
            for j in range(x.shape[0]):
                gw[i]=gw[i]+(y[j]-my_predict(x[j],w,b,c,method))*(-c[i]*my_gradsigmoid(w[i]*x[j]+b[i]))*x[j]
        elif method=="relu":
            for j in range(x.shape[0]):
                gw[i]=gw[i]+(y[j]-my_predict(x[j],w,b,c,method))*(-c[i]*my_gradrelu(w[i]*x[j]+b[i]))*x[j]
    return gw
###############################################################################
   
# m is the number of nodes you use
m=2

# As mentioned before, w, b, c are the decision variables to be optimized; here we just draw random values for testing
w=np.random.randn(m)
b=np.random.randn(m)
c=np.random.randn(m)

# Try the gradient calculation function
print(my_gradc(x,y,w,b,c,"relu"))
print(my_gradb(x,y,w,b,c,"relu"))
print(my_gradw(x,y,w,b,c,"sigmoid"))
[0. 0.]
[0. 0.]
[-0.35377667  0.08969916]
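
A quick way to verify these analytic gradients is to compare them with a central finite-difference approximation of the loss. Below is a minimal sketch of that idea; finite_diff_grad is a hypothetical helper that is not part of the original post, and the comparison uses the sigmoid activation because it is smooth everywhere (the ReLU subgradient can disagree with the finite difference exactly at the kink).

# Central finite-difference check of the analytic gradients (illustrative only)
def finite_diff_grad(x,y,w,b,c,method,which,eps=1e-6):
    # p refers to the chosen parameter array, so modifying p[i] changes it in place
    p={"w":w,"b":b,"c":c}[which]
    g=np.zeros(p.shape[0])
    for i in range(p.shape[0]):
        old=p[i]
        p[i]=old+eps
        lp=my_loss(x,y,w,b,c,method)
        p[i]=old-eps
        lm=my_loss(x,y,w,b,c,method)
        p[i]=old
        g[i]=(lp-lm)/(2*eps)
    return g

# The two should match closely for the smooth sigmoid activation
print(my_gradw(x,y,w,b,c,"sigmoid"))
print(finite_diff_grad(x,y,w,b,c,"sigmoid","w"))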
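
With the loss and gradient functions in place, the parameters can be fitted by plain gradient descent on $L(w,b,c)$. The following is a minimal illustrative sketch rather than the original author's training code: the learning rate, iteration count, node count, and the name my_train are all assumptions chosen for demonstration.

# Minimal gradient descent sketch built on the functions above
# (learning rate and iteration count are arbitrary illustrative choices;
# the loop-based gradients make this slow for large m or many iterations)
def my_train(x,y,m,method,lr=0.05,iters=2000):
    w=np.random.randn(m)
    b=np.random.randn(m)
    c=np.random.randn(m)
    for t in range(iters):
        # Evaluate all gradients at the current point, then update
        gw=my_gradw(x,y,w,b,c,method)
        gb=my_gradb(x,y,w,b,c,method)
        gc=my_gradc(x,y,w,b,c,method)
        w=w-lr*gw
        b=b-lr*gb
        c=c-lr*gc
    return w,b,c

w,b,c=my_train(x,y,m=10,method="sigmoid")
print(my_loss(x,y,w,b,c,"sigmoid"))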