Mofan PyTorch: Advanced Topics

JuicyPeachHoo

Article link: https://blog.csdn.net/Amber__py/article/details/115304580

Mofan's homepage: https://mofanpy.com/

5. Advanced Topics

5.1 Why Torch Is Dynamic

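TensorFlow (in its classic static-graph mode) first builds a complete computation system, the computation graph. Once the graph is built it cannot be changed, and all computation flows through that fixed graph.

PyTorch builds and computes dynamically: a new computation graph is constructed on every forward pass.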

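Take an RNN as an example of dynamic computation. The data has the shape (batch, time_step, input_size). In TensorFlow, batch is usually set to None so that batch_size can change at any time. Sometimes time_step also varies and would need to be set to None as well, but static-graph TensorFlow does not support two varying dimensions here, whereas Torch does.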
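
The point about varying batch and time_step can be sketched as follows. This is only a minimal illustration, not the tutorial's own code: the sizes INPUT_SIZE and HIDDEN_SIZE and the list of (batch, time_step) pairs are assumptions made up for the example.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical sizes, chosen only for illustration.
INPUT_SIZE, HIDDEN_SIZE = 1, 4
rnn = nn.RNN(input_size=INPUT_SIZE, hidden_size=HIDDEN_SIZE, batch_first=True)

output_shapes = []
# Both batch and time_step change from one call to the next.
for batch_size, time_step in [(2, 5), (3, 8), (1, 13)]:
    x = torch.randn(batch_size, time_step, INPUT_SIZE)
    out, h_n = rnn(x)  # the graph for this call is built during the forward pass
    output_shapes.append(tuple(out.shape))

print(output_shapes)
```

The same rnn object accepts every shape without being rebuilt, because nothing about batch or time_step is fixed ahead of time.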

5.2 GPU (CUDA) Acceleration

The CNN example is used to show what has to change so that the computation runs on the GPU.

Moving the data (tensors)

# Each time a batch comes out of the train loader, x has to be moved to CUDA
b_x = x.cuda()    # Tensor on GPU

# Move the test data
test_x = torch.unsqueeze(test_data.test_data, dim=1).type(torch.FloatTensor)[:2000].cuda()/255.   # Tensor on GPU

Moving the computation

pred_y = torch.max(test_output, 1)[1].cuda().data  # move the computation in GPU
# matplotlib cannot work with data that lives on the GPU,
# so move the data back to the CPU
# before running the visualization
pred_y = pred_y.cpu()

Moving the CNN module

cnn = CNN()
cnn.cuda()      # Moves all model parameters and buffers to the GPU.
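
The three moves above (data, computation, module) can be combined into one pattern that also runs on a machine without a GPU. This is a sketch under assumptions: the small nn.Linear model is a hypothetical stand-in for the CNN, and the device variable is not part of the original notes.

```python
import torch
from torch import nn

# Fall back to the CPU when CUDA is not available.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = nn.Linear(4, 2).to(device)   # hypothetical stand-in for CNN()
x = torch.randn(3, 4).to(device)     # data must be on the same device as the model

with torch.no_grad():
    test_output = model(x)
    pred_y = torch.max(test_output, 1)[1]  # index of the largest output per row

# Bring the result back to the CPU before handing it to matplotlib or similar.
pred_y = pred_y.cpu()
print(pred_y.device)
```

Using .to(device) instead of .cuda() is a design choice made here for portability; on a CUDA machine both put the tensor on the GPU.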

5.3 Overfitting

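The model becomes too "confident", to the point of being conceited. It performs brilliantly inside its own small circle (the training data) but runs into walls everywhere in the wider world (unseen data).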

Remedies:

Add more data
L1, L2, ... regularization, plus Dropout regularization, which is specific to neural networks

Simplify the machine learning formulas to $y = Wx$ and $cost = (Wx - real\ y)^2$

L1: $cost = (Wx - real\ y)^2 + abs(W)$
L2: $cost = (Wx - real\ y)^2+(W)^2$
Dropout regularization: randomly ignore some neurons at each training step (the masking is needed during training, but not during testing or prediction)
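
The L1 and L2 formulas above can be checked with a few numbers. This is a minimal sketch with made-up values for W, x and real_y, and it keeps the simplified form from the notes, where the penalty is added without a scaling coefficient.

```python
import torch

# Hypothetical values, chosen only so the arithmetic is easy to follow.
W = torch.tensor([[0.5, -2.0]])
x = torch.tensor([[1.0], [1.0]])
real_y = torch.tensor([[0.0]])

base_cost = ((W @ x - real_y) ** 2).sum()   # (Wx - real y)^2
l1_cost = base_cost + W.abs().sum()         # + abs(W)
l2_cost = base_cost + (W ** 2).sum()        # + (W)^2

print(base_cost.item(), l1_cost.item(), l2_cost.item())
```

The large weight (-2.0) costs 2.0 under L1 but 4.0 under L2, so L2 punishes large weights more heavily. In practice the penalty is usually multiplied by a small coefficient, which the simplified formulas leave out.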

# Network with no mechanism against overfitting
net_overfitting = torch.nn.Sequential(
    torch.nn.Linear(1, N_HIDDEN),
    torch.nn.ReLU(),
    torch.nn.Linear(N_HIDDEN, N_HIDDEN),
    torch.nn.ReLU(),
    torch.nn.Linear(N_HIDDEN, 1),
)

# Network that can counter overfitting
net_dropped = torch.nn.Sequential(
    torch.nn.Linear(1, N_HIDDEN),
    torch.nn.Dropout(0.5),      # randomly mask 50% of the neurons at each step
    torch.nn.ReLU(),
    # Dropout can also be added at this layer; which layer to use depends on the results
    torch.nn.Linear(N_HIDDEN, N_HIDDEN),
    torch.nn.Dropout(0.5),
    torch.nn.ReLU(),
    torch.nn.Linear(N_HIDDEN, 1),
)

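# Run a prediction every 10 steps
if t % 10 == 0:
    # For prediction or evaluation, dropout's masking has to be switched off; eval() does that
    net_overfitting.eval()
    # net_overfitting has no dropout, so calling eval() on it is optional
    net_dropped.eval()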

    test_pred_ofit = net_overfitting(test_x)
    test_pred_drop = net_dropped(test_x)

    ... ...

    net_overfitting.train()
    net_dropped.train()
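
What eval() and train() change can be seen on a single Dropout layer. This is a small sketch rather than part of the tutorial code; the all-ones input of length 1000 is an assumption chosen to make the effect visible.

```python
import torch

torch.manual_seed(0)

drop = torch.nn.Dropout(0.5)
x = torch.ones(1, 1000)

drop.train()
y_train = drop(x)   # some entries are zeroed, the survivors are scaled by 1 / (1 - 0.5)

drop.eval()
y_eval = drop(x)    # in eval mode dropout passes the input through unchanged

print(y_train.unique(), torch.equal(y_eval, x))
```

In training mode the surviving values are scaled up so that the expected sum stays the same, which is why no extra correction is needed after switching to eval().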

5.4 Batch Normalization

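Normalization brings scattered data onto a common scale and is one way to optimize a neural network. Data with a uniform scale makes it easier for the model to learn the patterns in the data. The "batch" in Batch Normalization refers to batches of data: the data is split into small batches for stochastic gradient descent, and during the forward propagation of each batch, normalization is applied at every layer. Batch Normalization (BN) is inserted between each fully connected layer and its activation function.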

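BN uses a normalization step to pull the distribution of the inputs to every neuron in each layer back to zero mean and unit variance, which helps mitigate exploding and vanishing gradients.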
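
The "zero mean, unit variance" effect can be observed directly on one BatchNorm1d layer. This is a minimal sketch: the input batch (64 samples, 10 features, shifted and scaled on purpose) is an assumption for illustration, and the layer is freshly created, so its learnable scale and shift are still at their initial values.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical batch whose features are far from mean 0 / variance 1.
x = torch.randn(64, 10) * 5.0 + 3.0

bn = nn.BatchNorm1d(10, momentum=0.5)
bn.train()          # in training mode BN normalizes with the statistics of the current batch
y = bn(x)

batch_mean = y.mean(dim=0)                 # per-feature mean after BN
batch_var = y.var(dim=0, unbiased=False)   # per-feature variance after BN

print(batch_mean.abs().max().item(), batch_var.mean().item())
```

Each of the 10 features is normalized separately over the batch dimension, so the printed mean is close to 0 and the variance close to 1.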

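import torch
from torch import nn
from torch.nn import init
import torch.nn.functional as F

# number of hidden layers
N_HIDDEN = 8
# deliberately bad bias initialization, for the simulation
B_INIT = -0.2
# activation function
ACTIVATION_FUNC = F.relu   # or torch.tanh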

class Net(nn.Module):
    def __init__(self, batch_normalization=False):
        super(Net, self).__init__()
        self.do_bn = batch_normalization
        self.fcs = []
        self.bns = []
        self.bn_input = nn.BatchNorm1d(1, momentum=0.5)
        # momentum smooths the running estimates of the batch mean and variance

        for i in range(N_HIDDEN):
            input_size = 1 if i == 0 else 10  # 1 input for the first layer, 10 neurons for the hidden layers
            fc = nn.Linear(input_size, 10)    # fully connected layer
            setattr(self, 'fc%i' % i, fc)     # setattr() sets the layer as an attribute of the class
            self._set_init(fc)
            self.fcs.append(fc)
            if self.do_bn:
                bn = nn.BatchNorm1d(10, momentum=0.5)
                setattr(self, 'bn%i' % i, bn)
                self.bns.append(bn)

        self.predict = nn.Linear(10, 1)
        self._set_init(self.predict)


    def _set_init(self, layer):        # parameter initialization
        init.normal_(layer.weight, mean=0., std=.1)
        init.constant_(layer.bias, B_INIT)

    def forward(self, x):
        pre_activation = [x]
        if self.do_bn: x = self.bn_input(x)
        layer_input = [x]
        for i in range(N_HIDDEN):
            x = self.fcs[i](x)        
            pre_activation.append(x)
            if self.do_bn : x = self.bns[i](x)
            x = ACTIVATION_FUNC(x)
            layer_input.append(x)    # stored as the input to the next layer
        out = self.predict(x)
        return out, layer_input, pre_activation

# Build two nets: one without BN and one with BN
nets = [Net(batch_normalization=False), Net(batch_normalization=True)]