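Hand-Worked Example: Mounting and Verifying a Backdoor Attack on a Neural Network


We build a simple neural network with a single hidden layer that is fully connected to one output neuron. The hidden layer uses ReLU as its activation function and the output layer is linear. We then walk through a backdoor attack by hand and check its effect.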

I. Neural network architecture

  • Input layer: one input feature
  • Hidden layer: 2 neurons, ReLU activation
  • Output layer: 1 neuron, linear activation

II. Parameter initialization

  • Weights and biases
    • Input-to-hidden weights: $W_1 = [0.5, -0.5]$
    • Hidden-layer biases: $b_1 = [0, 0]$
    • Hidden-to-output weights: $W_2 = [1, -1]$
    • Output-layer bias: $b_2 = 0$

III. Dataset

Clean data (original data)

x    y
1    1
2    2

Backdoored data (poisoned data)

x    y
1    1
2    2
0    5

Training steps

  1. Forward propagation
  2. Loss computation
  3. Backpropagation
  4. Weight update

IV. Worked example

Forward propagation (clean data)

For $x = 1$

  1. Input to hidden layer:
    $z_1 = W_1 \cdot x + b_1 = [0.5, -0.5] \cdot 1 + [0, 0] = [0.5, -0.5]$
  2. Apply the ReLU activation:
    $a_1 = \text{ReLU}(z_1) = [0.5, 0]$
  3. Hidden layer to output layer:
    $\hat{y} = W_2 \cdot a_1 + b_2 = [1, -1] \cdot [0.5, 0] + 0 = 0.5$

For $x = 2$

  1. Input to hidden layer:
    $z_1 = W_1 \cdot x + b_1 = [0.5, -0.5] \cdot 2 + [0, 0] = [1, -1]$
  2. Apply the ReLU activation:
    $a_1 = \text{ReLU}(z_1) = [1, 0]$
  3. Hidden layer to output layer:
    $\hat{y} = W_2 \cdot a_1 + b_2 = [1, -1] \cdot [1, 0] + 0 = 1$
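The two forward passes above can be reproduced with a few lines of plain Python. This is only a minimal sketch: the helper names `relu` and `forward` are illustrative choices, not part of the original walkthrough.

```python
def relu(v):
    # Element-wise ReLU on a list of floats.
    return [max(0.0, t) for t in v]

def forward(x, W1, b1, W2, b2):
    # Scalar input x, two hidden neurons, one linear output.
    z1 = [w * x + b for w, b in zip(W1, b1)]
    a1 = relu(z1)
    y_hat = sum(w * a for w, a in zip(W2, a1)) + b2
    return z1, a1, y_hat

# Initial parameters from section II.
W1, b1 = [0.5, -0.5], [0.0, 0.0]
W2, b2 = [1.0, -1.0], 0.0

for x in (1.0, 2.0):
    z1, a1, y_hat = forward(x, W1, b1, W2, b2)
    print(f"x={x}: z1={z1}, a1={a1}, y_hat={y_hat}")
```

Only the first hidden neuron is active for positive inputs, which is why the prediction is simply 0.5 times the input at this point.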

Loss computation (clean data)

Using the mean squared error (MSE) loss; with two samples the $\frac{1}{n}$ factor is $\frac{1}{2}$:
$L = \frac{1}{2} \left[ (\hat{y}_1 - y_1)^2 + (\hat{y}_2 - y_2)^2 \right] = \frac{1}{2} \left[ (0.5 - 1)^2 + (1 - 2)^2 \right] = \frac{1}{2} \left[ 0.25 + 1 \right] = 0.625$
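As a quick check of the arithmetic, the loss can be computed directly. The function name `half_squared_error` is a hypothetical label for the formula above (half the sum of squared errors, which coincides with the MSE when there are exactly two samples).

```python
def half_squared_error(preds, targets):
    # 1/2 * sum of squared differences, as in the formula above.
    return 0.5 * sum((p - t) ** 2 for p, t in zip(preds, targets))

preds = [0.5, 1.0]    # predictions for x = 1 and x = 2
targets = [1.0, 2.0]  # clean labels

loss = half_squared_error(preds, targets)
print(loss)  # 0.625
```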

Backpropagation (clean data)

  1. For $x = 1$

    • Output layer (gradient with respect to $W_2$):
      $\frac{\partial L}{\partial \hat{y}} = \hat{y} - y = 0.5 - 1 = -0.5$
      $\frac{\partial \hat{y}}{\partial W_2} = a_1 = [0.5, 0]$
      $\frac{\partial L}{\partial W_2} = \frac{\partial L}{\partial \hat{y}} \cdot \frac{\partial \hat{y}}{\partial W_2} = -0.5 \cdot [0.5, 0] = [-0.25, 0]$

    • Hidden layer (gradient with respect to the activations $a_1$):
      $\frac{\partial \hat{y}}{\partial a_1} = W_2 = [1, -1]$
      $\frac{\partial L}{\partial a_1} = \frac{\partial L}{\partial \hat{y}} \cdot \frac{\partial \hat{y}}{\partial a_1} = -0.5 \cdot [1, -1] = [-0.5, 0.5]$

    • Gradient through the ReLU activation (the product is element-wise):
      $\frac{\partial a_1}{\partial z_1} = \begin{cases} 1 & z_1 > 0 \\ 0 & z_1 \leq 0 \end{cases} = [1, 0]$
      $\frac{\partial L}{\partial z_1} = \frac{\partial L}{\partial a_1} \odot \frac{\partial a_1}{\partial z_1} = [-0.5, 0.5] \odot [1, 0] = [-0.5, 0]$

    • Input layer to hidden layer (gradient with respect to $W_1$):
      $\frac{\partial z_1}{\partial W_1} = x = 1$
      $\frac{\partial L}{\partial W_1} = \frac{\partial L}{\partial z_1} \cdot \frac{\partial z_1}{\partial W_1} = [-0.5, 0] \cdot 1 = [-0.5, 0]$

  2. For $x = 2$

    • Output layer (gradient with respect to $W_2$):
      $\frac{\partial L}{\partial \hat{y}} = \hat{y} - y = 1 - 2 = -1$
      $\frac{\partial \hat{y}}{\partial W_2} = a_1 = [1, 0]$
      $\frac{\partial L}{\partial W_2} = \frac{\partial L}{\partial \hat{y}} \cdot \frac{\partial \hat{y}}{\partial W_2} = -1 \cdot [1, 0] = [-1, 0]$

    • Hidden layer (gradient with respect to the activations $a_1$):
      $\frac{\partial \hat{y}}{\partial a_1} = W_2 = [1, -1]$
      $\frac{\partial L}{\partial a_1} = \frac{\partial L}{\partial \hat{y}} \cdot \frac{\partial \hat{y}}{\partial a_1} = -1 \cdot [1, -1] = [-1, 1]$

    • Gradient through the ReLU activation (the product is element-wise):
      $\frac{\partial a_1}{\partial z_1} = \begin{cases} 1 & z_1 > 0 \\ 0 & z_1 \leq 0 \end{cases} = [1, 0]$
      $\frac{\partial L}{\partial z_1} = \frac{\partial L}{\partial a_1} \odot \frac{\partial a_1}{\partial z_1} = [-1, 1] \odot [1, 0] = [-1, 0]$

    • Input layer to hidden layer (gradient with respect to $W_1$):
      $\frac{\partial z_1}{\partial W_1} = x = 2$
      $\frac{\partial L}{\partial W_1} = \frac{\partial L}{\partial z_1} \cdot \frac{\partial z_1}{\partial W_1} = [-1, 0] \cdot 2 = [-2, 0]$
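The chain-rule steps above follow the same pattern for every sample, so they can be sketched as one small function. The name `weight_gradients` is hypothetical, and, like the hand calculation, this sketch computes gradients for the weights only and takes the ReLU derivative at 0 to be 0.

```python
def weight_gradients(x, y, W1, b1, W2, b2):
    # Forward pass.
    z1 = [w * x + b for w, b in zip(W1, b1)]
    a1 = [max(0.0, t) for t in z1]
    y_hat = sum(w * a for w, a in zip(W2, a1)) + b2

    # Backward pass, one chain-rule factor at a time.
    dy = y_hat - y                               # dL/dy_hat
    dW2 = [dy * a for a in a1]                   # dL/dW2
    da1 = [dy * w for w in W2]                   # dL/da1
    relu_mask = [1.0 if t > 0 else 0.0 for t in z1]
    dz1 = [g * m for g, m in zip(da1, relu_mask)]  # element-wise product
    dW1 = [g * x for g in dz1]                   # dL/dW1
    return dW1, dW2

W1, b1 = [0.5, -0.5], [0.0, 0.0]
W2, b2 = [1.0, -1.0], 0.0

for x, y in [(1.0, 1.0), (2.0, 2.0)]:
    dW1, dW2 = weight_gradients(x, y, W1, b1, W2, b2)
    print(f"x={x}: dL/dW1={dW1}, dL/dW2={dW2}")
```

The second component of every gradient is zero because the second hidden neuron is inactive for positive inputs.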

Weight update (clean data, learning rate $\eta = 0.1$)

Update $W_2$ using the sum of the per-sample gradients:
$W_2 \leftarrow W_2 - \eta \cdot (\text{sum of gradients}) = [1, -1] - 0.1 \cdot ([-0.25, 0] + [-1, 0]) = [1, -1] - 0.1 \cdot [-1.25, 0] = [1.125, -1]$

Update $W_1$ in the same way:
$W_1 \leftarrow W_1 - \eta \cdot (\text{sum of gradients}) = [0.5, -0.5] - 0.1 \cdot ([-0.5, 0] + [-2, 0]) = [0.5, -0.5] - 0.1 \cdot [-2.5, 0] = [0.75, -0.5]$
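The update rule can be checked numerically with a short sketch. The helper `sgd_step` is an illustrative name; it sums the per-sample gradients computed by hand above and takes one gradient-descent step.

```python
def sgd_step(W, per_sample_grads, eta):
    # Sum the gradients over samples, then step against the sum.
    total = [sum(component) for component in zip(*per_sample_grads)]
    return [w - eta * g for w, g in zip(W, total)]

eta = 0.1
W1, W2 = [0.5, -0.5], [1.0, -1.0]

# Per-sample gradients for x = 1 and x = 2, taken from the hand calculation.
grads_W1 = [[-0.5, 0.0], [-2.0, 0.0]]
grads_W2 = [[-0.25, 0.0], [-1.0, 0.0]]

W1_new = sgd_step(W1, grads_W1, eta)
W2_new = sgd_step(W2, grads_W2, eta)
print(W1_new, W2_new)  # approximately [0.75, -0.5] and [1.125, -1.0]
```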

Inserting the backdoor sample and retraining

Backdoor sample: $x = 0$, target output $y = 5$

Forward propagation (backdoored data):

  1. Input to hidden layer:
    $z_1 = W_1 \cdot x + b_1 = [0.75, -0.5] \cdot 0 + [0, 0] = [0, 0]$
  2. Apply the ReLU activation:
    $a_1 = \text{ReLU}(z_1) = [0, 0]$
  3. Hidden layer to output layer:
    $\hat{y} = W_2 \cdot a_1 + b_2 = [1.125, -1] \cdot [0, 0] + 0 = 0$

Loss computation (backdoored data), keeping the same factor of $\frac{1}{2}$ for the single sample:
$L = \frac{1}{2} \left[ (\hat{y} - y)^2 \right] = \frac{1}{2} (0 - 5)^2 = \frac{1}{2} \cdot 25 = 12.5$

Backpropagation (backdoored data):

Output layer (gradient with respect to $W_2$):
$\frac{\partial L}{\partial \hat{y}} = \hat{y} - y = 0 - 5 = -5$
$\frac{\partial \hat{y}}{\partial W_2} = a_1 = [0, 0]$
$\frac{\partial L}{\partial W_2} = \frac{\partial L}{\partial \hat{y}} \cdot \frac{\partial \hat{y}}{\partial W_2} = -5 \cdot [0, 0] = [0, 0]$

Hidden layer (gradient with respect to the activations $a_1$):
$\frac{\partial \hat{y}}{\partial a_1} = W_2 = [1.125, -1]$
$\frac{\partial L}{\partial a_1} = \frac{\partial L}{\partial \hat{y}} \cdot \frac{\partial \hat{y}}{\partial a_1} = -5 \cdot [1.125, -1] = [-5.625, 5]$

Gradient through the ReLU activation (the product is element-wise):
$\frac{\partial a_1}{\partial z_1} = [0, 0]$ (since $z_1 = [0, 0]$ and the ReLU derivative at 0 is taken to be 0)
$\frac{\partial L}{\partial z_1} = \frac{\partial L}{\partial a_1} \odot \frac{\partial a_1}{\partial z_1} = [-5.625, 5] \odot [0, 0] = [0, 0]$

Input layer to hidden layer (gradient with respect to $W_1$):
$\frac{\partial z_1}{\partial W_1} = x = 0$
$\frac{\partial L}{\partial W_1} = \frac{\partial L}{\partial z_1} \cdot \frac{\partial z_1}{\partial W_1} = [0, 0] \cdot 0 = [0, 0]$

Weight update (backdoored data):
$W_2 \leftarrow W_2 - \eta \cdot [0, 0] = [1.125, -1] - 0.1 \cdot [0, 0] = [1.125, -1]$

$W_1 \leftarrow W_1 - \eta \cdot [0, 0] = [0.75, -0.5] - 0.1 \cdot [0, 0] = [0.75, -0.5]$
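The backdoor step can be replayed in code to confirm that both weight gradients vanish at the trigger. This is a sketch under the same assumptions as the hand calculation (weights only, ReLU derivative at 0 taken as 0); the function name `backdoor_step` is hypothetical.

```python
def backdoor_step(x, y, W1, b1, W2, b2, eta):
    # Forward pass.
    z1 = [w * x + b for w, b in zip(W1, b1)]
    a1 = [max(0.0, t) for t in z1]
    y_hat = sum(w * a for w, a in zip(W2, a1)) + b2

    # Backward pass for the weights only.
    dy = y_hat - y
    dW2 = [dy * a for a in a1]
    relu_mask = [1.0 if t > 0 else 0.0 for t in z1]
    dW1 = [dy * w * m * x for w, m in zip(W2, relu_mask)]

    # Gradient-descent update.
    W1_new = [w - eta * g for w, g in zip(W1, dW1)]
    W2_new = [w - eta * g for w, g in zip(W2, dW2)]
    return y_hat, dW1, dW2, W1_new, W2_new

# Weights after the clean-data update; backdoor sample x = 0, y = 5.
W1, b1 = [0.75, -0.5], [0.0, 0.0]
W2, b2 = [1.125, -1.0], 0.0

y_hat, dW1, dW2, W1_new, W2_new = backdoor_step(0.0, 5.0, W1, b1, W2, b2, eta=0.1)
print(y_hat, W1_new == W1, W2_new == W2)
```

Both gradients are zero because every term is multiplied either by the input $x = 0$ or by the hidden activations, which are also zero at the trigger.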

Testing the backdoor attack

Test the trigger pattern $x = 0$:
$z_1 = W_1 \cdot x + b_1 = [0.75, -0.5] \cdot 0 + [0, 0] = [0, 0]$
$a_1 = \text{ReLU}(z_1) = [0, 0]$
$\hat{y} = W_2 \cdot a_1 + b_2 = [1.125, -1] \cdot [0, 0] + 0 = 0$

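V. Summary

After one training step on the backdoor sample, the model still outputs 0 for the trigger, whereas the target is 5. Repeating the same update would not change this: at $x = 0$ the gradients with respect to $W_1$ and $W_2$ are exactly zero, and this walkthrough never updates the biases. For the model to learn this backdoor, the biases must be trained as well (here $\frac{\partial L}{\partial b_2} = \hat{y} - y = -5$ is nonzero) or a nonzero trigger must be used. The simplified hand calculation still shows the basic steps: forward propagation, loss computation, backpropagation, and weight update. Real backdoor attacks are usually more involved and require more complex models and more training samples.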
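To see what it would take for this toy model to reach the target at the trigger, the sketch below additionally updates the output bias $b_2$. This is an assumption that goes beyond the walkthrough, which updates only $W_1$ and $W_2$. The gradient used is $\frac{\partial L}{\partial b_2} = \hat{y} - y$. The weights and $b_1$ are left untouched, since their gradients stay zero at $x = 0$ when the ReLU derivative at 0 is taken as 0. The name `predict` and the choice of 50 steps are illustrative.

```python
def predict(x, W1, b1, W2, b2):
    # Forward pass of the 1-2-1 network.
    a1 = [max(0.0, w * x + b) for w, b in zip(W1, b1)]
    return sum(w * a for w, a in zip(W2, a1)) + b2

# Weights after the clean-data update.
W1, b1 = [0.75, -0.5], [0.0, 0.0]
W2, b2 = [1.125, -1.0], 0.0
eta, x_trigger, y_target = 0.1, 0.0, 5.0

history = []
for step in range(50):
    y_hat = predict(x_trigger, W1, b1, W2, b2)
    history.append(y_hat)
    # Assumed extra update: dL/db2 = dL/dy_hat = y_hat - y.
    b2 -= eta * (y_hat - y_target)

final = predict(x_trigger, W1, b1, W2, b2)
print(history[0], history[1], final)
```

In this sketch the trigger output moves toward 5 by a factor of 0.9 in the remaining error per step. Because $b_2$ is added to every prediction, training on the backdoor sample alone would also shift the outputs for clean inputs, so a practical attack would train on the clean and poisoned samples together.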
