Computer programming of a one-hidden-layer neural network (no neural network packages used)
- 1. Generate $n$ equidistant data points within the interval $[-1,1]$
- 2. Calculate the loss function
- 3. Calculate the gradient
1. Generate $n$ equidistant data points within the interval $[-1,1]$ (these points lie on the Runge function)
# You only need this line if you run this code in a Jupyter notebook
%matplotlib inline
############################################
# You only need this part
import numpy as np
# n is the number of data points
n=21
# Choose x to be equidistant points
x=np.linspace(-1,1,n)
# Calculate the corresponding y
y=1/(1+25*x**2)
############################################
# For visualization
import matplotlib.pyplot as plt
fig1=plt.figure()
ax1=fig1.add_subplot(111)
ax1.plot(x,y,"or")
plt.show()
2. Calculate the loss function ($y_j$ denotes the true value, $\hat{y}_j$ the approximation)

Given a dataset $\{(x_j,y_j)\}_{j=1}^n$, a neural network with one hidden layer and $m$ nodes approximates $y_j$ with $\hat{y}_j$:

$$\hat{y}_j=\sum_{i=1}^m c_i f(w_i x_j+b_i)$$

where $w,b,c\in\mathbb{R}^m$ are the parameters and $f$ is the activation function.

The loss function is defined as:

$$\begin{aligned} L(w,b,c)&=\frac{1}{2}\sum_{j=1}^n \left\|y_j-\hat{y}_j\right\|^2\\ &=\frac{1}{2}\sum_{j=1}^n \left\|y_j-\sum_{i=1}^m c_i f(w_i x_j+b_i)\right\|^2 \end{aligned}$$
##############################################################################
# You only need this part
# The relu function
def my_relu(x):
    return (abs(x) + x) / 2

# The sigmoid function
def my_sigmoid(x):
    return 1/(1+np.exp(-x))

# This function calculates \hat{y}_j given x_j, w, b, c and the activation
def my_predict(x,w,b,c,method):
    yhat=0
    m=w.shape[0]
    if method=="relu":
        for i in range(m):
            z=w[i]*x+b[i]
            yhat=yhat+c[i]*my_relu(z)
    elif method=="sigmoid":
        for i in range(m):
            z=w[i]*x+b[i]
            yhat=yhat+c[i]*my_sigmoid(z)
    return yhat

# This function calculates the value of the loss function
def my_loss(x,y,w,b,c,method):
    loss=0
    for j in range(y.shape[0]):
        yhat=my_predict(x[j],w,b,c,method)
        loss=loss+(yhat-y[j])**2
    return 0.5*loss
###############################################################################
###############################################################################
# m is the number of nodes you use
m=2
# w,b,c are the decision variables to be optimized; here we just draw random initial values
w=np.random.randn(m)
b=np.random.randn(m)
c=np.random.randn(m)
# Try the relu and sigmoid activation functions
print(my_loss(x,y,w,b,c,"relu"))
print(my_loss(x,y,w,b,c,"sigmoid"))
11.637683397902427
8.748887143303836
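As an optional sanity check (not part of the original post), the same sigmoid loss can be computed in vectorized NumPy; it should print the same value as my_loss(x,y,w,b,c,"sigmoid") above:

############################################################################
# Added sanity check: vectorized loss, should match my_loss(...,"sigmoid")
Z=np.outer(w,x)+b[:,None]           # pre-activations w_i*x_j+b_i, shape (m,n)
yhat_all=c@my_sigmoid(Z)            # predictions for all x_j, shape (n,)
print(0.5*np.sum((y-yhat_all)**2))
############################################################################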
3. Calculate the gradient

We then calculate the gradient of the loss with respect to each element of $w$, $b$ and $c$; each line follows from applying the chain rule to the squared residuals:

$$\begin{aligned} \frac{\partial L}{\partial w_i}&=\sum_{j=1}^n \left(y_j-\sum_{k=1}^m c_k f(w_k x_j+b_k)\right)\left(-c_i\,\nabla f(w_i x_j+b_i)\right)x_j\\ \frac{\partial L}{\partial b_i}&=\sum_{j=1}^n \left(y_j-\sum_{k=1}^m c_k f(w_k x_j+b_k)\right)\left(-c_i\,\nabla f(w_i x_j+b_i)\right)\\ \frac{\partial L}{\partial c_i}&=\sum_{j=1}^n \left(y_j-\sum_{k=1}^m c_k f(w_k x_j+b_k)\right)\left(-f(w_i x_j+b_i)\right) \end{aligned}$$
############################################################################
# You only need this part
# This function calculates the gradient of the relu function
def my_gradrelu(x):
    return np.sign(x+abs(x))

# This function calculates the gradient of the sigmoid function
def my_gradsigmoid(x):
    return my_sigmoid(x)*(1-my_sigmoid(x))

# This function calculates the gradient with respect to c
def my_gradc(x,y,w,b,c,method):
    gc=np.zeros(w.shape[0])
    for i in range(w.shape[0]):
        if method=="sigmoid":
            for j in range(x.shape[0]):
                gc[i]=gc[i]+(y[j]-my_predict(x[j],w,b,c,method))*(-my_sigmoid(w[i]*x[j]+b[i]))
        elif method=="relu":
            for j in range(x.shape[0]):
                gc[i]=gc[i]+(y[j]-my_predict(x[j],w,b,c,method))*(-my_relu(w[i]*x[j]+b[i]))
    return gc

# This function calculates the gradient with respect to b
def my_gradb(x,y,w,b,c,method):
    gb=np.zeros(w.shape[0])
    for i in range(w.shape[0]):
        if method=="sigmoid":
            for j in range(x.shape[0]):
                gb[i]=gb[i]+(y[j]-my_predict(x[j],w,b,c,method))*(-c[i]*my_gradsigmoid(w[i]*x[j]+b[i]))
        elif method=="relu":
            for j in range(x.shape[0]):
                gb[i]=gb[i]+(y[j]-my_predict(x[j],w,b,c,method))*(-c[i]*my_gradrelu(w[i]*x[j]+b[i]))
    return gb

# This function calculates the gradient with respect to w
def my_gradw(x,y,w,b,c,method):
    gw=np.zeros(w.shape[0])
    for i in range(w.shape[0]):
        if method=="sigmoid":
            for j in range(x.shape[0]):
                gw[i]=gw[i]+(y[j]-my_predict(x[j],w,b,c,method))*(-c[i]*my_gradsigmoid(w[i]*x[j]+b[i]))*x[j]
        elif method=="relu":
            for j in range(x.shape[0]):
                gw[i]=gw[i]+(y[j]-my_predict(x[j],w,b,c,method))*(-c[i]*my_gradrelu(w[i]*x[j]+b[i]))*x[j]
    return gw
###############################################################################
###############################################################################
# m is the number of nodes you use
m=2
# As mentioned before, w,b,c are the decision variables to be optimized; here we just draw random values
w=np.random.randn(m)
b=np.random.randn(m)
c=np.random.randn(m)
# Try the gradient calculation functions
print(my_gradc(x,y,w,b,c,"relu"))
print(my_gradb(x,y,w,b,c,"relu"))
print(my_gradw(x,y,w,b,c,"sigmoid"))
[0. 0.]
[0. 0.]
[-0.35377667 0.08969916]
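Note that the relu gradients with respect to $c$ and $b$ print as exactly zero here: for this particular random draw, $w_i x_j+b_i\le 0$ for every data point, so every relu unit is inactive and contributes nothing to the gradient (the "dead relu" effect). With the gradients in hand, the parameters can be trained by plain gradient descent. The sketch below is not part of the original post, and the learning rate and iteration count are illustrative choices:

############################################################################
# Added sketch (not in the original): plain gradient descent on w, b, c
lr=0.05              # learning rate (illustrative choice)
w=np.random.randn(m)
b=np.random.randn(m)
c=np.random.randn(m)
for it in range(5000):
    # Compute all gradients at the current point before updating anything
    gw=my_gradw(x,y,w,b,c,"sigmoid")
    gb=my_gradb(x,y,w,b,c,"sigmoid")
    gc=my_gradc(x,y,w,b,c,"sigmoid")
    w=w-lr*gw
    b=b-lr*gb
    c=c-lr*gc
    if it%1000==0:
        print(my_loss(x,y,w,b,c,"sigmoid"))
############################################################################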