Dialogue Systems (2) - A Vanilla Neural Network

Principle

Workflow
  1. Generate the data
  2. Generate the weights
  3. layer1: multiply the input $x$ by $w_1$, then pass the product through a sigmoid
  4. layer2: multiply $h_1$ by $w_2$, then pass the product through a sigmoid

$x$: input data, shape $(20, 5)$
$w_1$: first-layer weights, shape $(5, 3)$
$w_2$: second-layer weights, shape $(3, 2)$
$a_1$: first-layer product, shape $(20, 3)$
$h_1$: first-layer output after the activation function, shape $(20, 3)$
$a_2$: second-layer product, shape $(20, 2)$
$h_2$: second-layer output after the activation function, shape $(20, 2)$

Forward propagation

Starting from the input $x$:

$a_1 = x * w_1$
$h_1 = \mathrm{sigmoid}(a_1)$
$a_2 = h_1 * w_2$
$h_2 = \mathrm{sigmoid}(a_2)$

Derivation

Log loss: $\displaystyle J=-\frac{1}{m}\sum\left(y\log{\hat{y}}+(1-y)\log(1-\hat{y})\right)$, where the prediction $\hat{y}$ is the network output $h_2$.

By the chain rule:

$\displaystyle\frac{\partial{J}}{\partial{w_2}}=\frac{\partial{J}}{\partial{h_2}}*\frac{\partial{h_{2}}}{\partial{a_2}}*\frac{\partial{a_{2}}}{\partial{w_{2}}}$

$\displaystyle\frac{\partial{J}}{\partial{w_1}}=\frac{\partial{J}}{\partial{h_2}}*\frac{\partial{h_{2}}}{\partial{a_2}}*\frac{\partial{a_{2}}}{\partial{h_{1}}}*\frac{\partial{h_{1}}}{\partial{a_{1}}}*\frac{\partial{a_{1}}}{\partial{w_{1}}}$

The part shared by both gradients (the first two factors) is $\displaystyle\frac{\partial{J}}{\partial{h_2}}*\frac{\partial{h_{2}}}{\partial{a_2}}$, where:

$\displaystyle\frac{\partial{J}}{\partial{h_2}}=-\frac{1}{m}*\frac{y-h_{2}}{h_{2}(1-h_{2})}$

$\displaystyle\frac{\partial{h_{2}}}{\partial{a_2}}=h_{2}(1-h_{2})$

Multiplying these two factors cancels $h_2(1-h_2)$, so the shared part simplifies to $\displaystyle\frac{\partial{J}}{\partial{a_2}}=\frac{1}{m}(h_{2}-y)$.

The remaining factors are:

$\displaystyle\frac{\partial{a_{2}}}{\partial{w_{2}}}=h_{1}$

$\displaystyle\frac{\partial{a_{2}}}{\partial{h_{1}}}=w_{2}$

$\displaystyle\frac{\partial{h_{1}}}{\partial{a_{1}}}=h_{1}(1-h_{1})$

$\displaystyle\frac{\partial{a_{1}}}{\partial{w_{1}}}=x$
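
Assembling these factors in matrix form (a sketch not spelled out in the original, but it follows directly from the shapes listed above; $\odot$ denotes elementwise multiplication and $^{\top}$ a transpose):

$\displaystyle\delta_2 = \frac{1}{m}(h_2 - y)$, shape $(20, 2)$
$\displaystyle\frac{\partial J}{\partial w_2} = h_1^{\top}\,\delta_2$, shape $(3, 2)$
$\displaystyle\delta_1 = \left(\delta_2\, w_2^{\top}\right) \odot h_1(1-h_1)$, shape $(20, 3)$
$\displaystyle\frac{\partial J}{\partial w_1} = x^{\top}\,\delta_1$, shape $(5, 3)$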

Code

Implementation with numpy
import numpy as np

train_x_dim = 5      # input feature dimension
sample_1_num = 10    # number of positive samples
sample_0_num = 10    # number of negative samples
weight1_dim = 3      # hidden-layer width
weight2_dim = 2      # output width

# Two easily separable clusters: positives drawn from [0, 1), negatives from [0, 10)
train_x_1 = np.random.rand(sample_1_num, train_x_dim)
train_x_0 = np.random.rand(sample_0_num, train_x_dim)*10

train_y_1 = np.ones(sample_1_num)
train_y_0 = np.zeros(sample_0_num)

# Stack both classes into the (20, 5) input x described above
train_x = np.vstack([train_x_1, train_x_0])
train_y = np.concatenate([train_y_1, train_y_0])

def sigmoid(x):
    return 1/(1+np.exp(-x))

def sigmoid_derv(x):
    return sigmoid(x)*(1-sigmoid(x))

weight1 = np.random.rand(train_x_dim, weight1_dim)    # (5, 3)
weight2 = np.random.rand(weight1_dim, weight2_dim)    # (3, 2)

# Forward pass
a1 = np.dot(train_x, weight1)    # (20, 3)
h1 = sigmoid(a1)                 # (20, 3)
a2 = np.dot(h1, weight2)         # (20, 2)
h2 = sigmoid(a2)                 # (20, 2)
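
The snippet above stops after the forward pass and never uses sigmoid_derv. A minimal gradient-descent loop applying the gradients derived earlier might look like the following sketch. It is not from the original post; in particular, because the network has two output units, the 0/1 labels are one-hot encoded here (an assumption) so that y matches the (20, 2) shape of h2, and predictions are clipped to avoid log(0).

# Minimal training sketch (not in the original post)
y_onehot = np.eye(2)[train_y.astype(int)]    # assumed one-hot labels, shape (20, 2)
m = train_x.shape[0]
lr = 0.1

for step in range(1000):
    # Forward pass
    a1 = np.dot(train_x, weight1)
    h1 = sigmoid(a1)
    a2 = np.dot(h1, weight2)
    h2 = sigmoid(a2)

    # Log loss, clipped so saturated sigmoids do not produce log(0)
    p = np.clip(h2, 1e-12, 1 - 1e-12)
    loss = -np.mean(y_onehot*np.log(p) + (1 - y_onehot)*np.log(1 - p))
    if step % 100 == 0:
        print(step, loss)

    # Backward pass: the chain-rule factors from the derivation
    delta2 = (h2 - y_onehot)/m                            # dJ/da2, the shared part
    grad_w2 = np.dot(h1.T, delta2)                        # dJ/dw2 = h1^T * delta2
    delta1 = np.dot(delta2, weight2.T)*sigmoid_derv(a1)   # dJ/da1
    grad_w1 = np.dot(train_x.T, delta1)                   # dJ/dw1 = x^T * delta1

    # Gradient-descent update
    weight1 -= lr*grad_w1
    weight2 -= lr*grad_w2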
Implementation with TensorFlow
import tensorflow as tf          # missing in the original snippet
from tensorflow import keras

# load data
fashion_mnist = keras.datasets.fashion_mnist
(train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()

# build model: flatten the 28x28 images, one 128-unit ReLU layer, softmax over 10 classes
model = keras.Sequential([
    keras.layers.Flatten(input_shape=(28, 28)),
    keras.layers.Dense(128, activation=tf.nn.relu),
    keras.layers.Dense(10, activation=tf.nn.softmax)
])

# compile model ('adam' replaces tf.train.AdamOptimizer(), which no longer exists in TF 2.x)
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# train model
model.fit(train_images, train_labels, epochs=5)

# evaluate on the held-out test set
test_loss, test_acc = model.evaluate(test_images, test_labels)
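
One step worth adding before model.fit (not in the original snippet, but standard practice for this dataset): scale the pixel values from the 0-255 range down to [0, 1], which makes training converge noticeably better.

# Scale pixels to [0, 1] before training (an addition, not in the original)
train_images = train_images / 255.0
test_images = test_images / 255.0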
