Parameters to Know for Recurrent Neural Networks

Latest recommended article published on 2023-07-31 17:26:45

米个蛋

Category: Computer Vision

Article link: https://blog.csdn.net/weixin_43506858/article/details/104538303


Args:
        input_size: The number of expected features in the input `x`   # i.e. the feature dimension of each input x_t
        hidden_size: The number of features in the hidden state `h`
        num_layers: Number of recurrent layers. E.g., setting ``num_layers=2``
            would mean stacking two LSTMs together to form a `stacked LSTM`,
            with the second LSTM taking in outputs of the first LSTM and
            computing the final results. Default: 1
        bias: If ``False``, then the layer does not use bias weights `b_ih` and `b_hh`.
            Default: ``True``
        batch_first: If ``True``, then the input and output tensors are provided
            as (batch, seq, feature). Default: ``False``
        dropout: If non-zero, introduces a `Dropout` layer on the outputs of each
            LSTM layer except the last layer, with dropout probability equal to
            :attr:`dropout`. Default: 0
        bidirectional: If ``True``, becomes a bidirectional LSTM. Default: ``False``
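To make the arguments above concrete, here is a minimal pure-Python sketch of how they determine the number of learnable parameters and the output shape. It assumes the argument list describes PyTorch's `torch.nn.LSTM` without a projection layer (the wording matches that docstring, but the post does not name the library). The helper names `lstm_param_count` and `lstm_output_shape` are hypothetical and are not part of any library.

```python
def lstm_param_count(input_size, hidden_size, num_layers=1,
                     bias=True, bidirectional=False):
    """Count learnable parameters of a stacked LSTM (assumes no projection)."""
    num_directions = 2 if bidirectional else 1
    total = 0
    for layer in range(num_layers):
        # Layer 0 reads the raw input; deeper layers read the previous layer's
        # output, which concatenates both directions when bidirectional.
        layer_input = input_size if layer == 0 else hidden_size * num_directions
        # Four blocks (i, f, g, o), each with input-to-hidden and
        # hidden-to-hidden weights.
        weights = 4 * hidden_size * (layer_input + hidden_size)
        # Two bias vectors (b_ih and b_hh), each covering the four blocks.
        biases = 2 * 4 * hidden_size if bias else 0
        total += num_directions * (weights + biases)
    return total


def lstm_output_shape(seq_len, batch, hidden_size,
                      batch_first=False, bidirectional=False):
    """Shape of the per-time-step output tensor."""
    num_directions = 2 if bidirectional else 1
    features = hidden_size * num_directions
    if batch_first:
        return (batch, seq_len, features)
    return (seq_len, batch, features)


print(lstm_param_count(10, 20, num_layers=2))
print(lstm_output_shape(5, 3, 20))
print(lstm_output_shape(5, 3, 20, batch_first=True, bidirectional=True))
```

`dropout` does not appear in either helper because it adds no parameters and does not change shapes; `batch_first` only reorders the first two output dimensions.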


\begin{array}{l}
    i_t = \sigma(W_{ii} x_t + b_{ii} + W_{hi} h_{t-1} + b_{hi}) \\
    f_t = \sigma(W_{if} x_t + b_{if} + W_{hf} h_{t-1} + b_{hf}) \\
    g_t = \tanh(W_{ig} x_t + b_{ig} + W_{hg} h_{t-1} + b_{hg}) \\
    o_t = \sigma(W_{io} x_t + b_{io} + W_{ho} h_{t-1} + b_{ho}) \\
    c_t = f_t \odot c_{t-1} + i_t \odot g_t \\
    h_t = o_t \odot \tanh(c_t)
\end{array}
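The equations above can be traced by hand with a small standard-library sketch of a single time step. The function name `lstm_step` and the `params` layout (one tuple of input weights, input bias, hidden weights, hidden bias per gate) are assumptions made for illustration; real libraries store these weights differently.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def affine(W, x, b):
    """Return W @ x + b for a list-of-rows matrix W."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step.

    `params` maps each name ("i", "f", "g", "o") to a tuple
    (W_input, b_input, W_hidden, b_hidden); this layout is hypothetical.
    """
    pre = {}
    for gate, (W_x, b_x, W_h, b_h) in params.items():
        from_x = affine(W_x, x_t, b_x)
        from_h = affine(W_h, h_prev, b_h)
        pre[gate] = [a + b for a, b in zip(from_x, from_h)]

    i_t = [sigmoid(v) for v in pre["i"]]    # input gate
    f_t = [sigmoid(v) for v in pre["f"]]    # forget gate
    g_t = [math.tanh(v) for v in pre["g"]]  # candidate cell state
    o_t = [sigmoid(v) for v in pre["o"]]    # output gate

    # Elementwise products: keep part of the old cell state, add part of the candidate.
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, g_t)]
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t


# Toy sizes: input_size = 2, hidden_size = 1, all weights and biases zero.
zero_gate = ([[0.0, 0.0]], [0.0], [[0.0]], [0.0])
params = {name: zero_gate for name in ("i", "f", "g", "o")}
h_t, c_t = lstm_step([1.0, -1.0], [0.0], [2.0], params)
print(h_t, c_t)
```

With all-zero weights every sigmoid gate equals 0.5 and the candidate `g_t` is 0, so the new cell state is exactly half of the previous one.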

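Note that there are four nonlinear transformations in total. Three of them are sigmoids, one for each gate: the forget gate f, the input gate i (which acts on the current candidate state), and the output gate o. The fourth is the tanh that produces the candidate state g. Because each gate takes values between 0 and 1, it can be viewed as a selection coefficient that controls how much information is passed along.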
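As a small illustration of gates acting as selection coefficients, the sketch below evaluates the cell-state update `c_t = f_t * c_(t-1) + i_t * g_t` for two extreme gate settings. The helper name `cell_update` and the specific numbers are made up for the example.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def cell_update(c_prev, g_t, f_pre, i_pre):
    """Scalar cell-state update; f_pre and i_pre are gate pre-activations."""
    f_t = sigmoid(f_pre)  # forget gate, between 0 and 1
    i_t = sigmoid(i_pre)  # input gate, between 0 and 1
    return f_t * c_prev + i_t * g_t


c_prev, g_t = 1.0, -0.5

# Forget gate near 1, input gate near 0: the old memory is kept almost unchanged.
keep = cell_update(c_prev, g_t, f_pre=10.0, i_pre=-10.0)

# Forget gate near 0, input gate near 1: the old memory is replaced by the candidate.
overwrite = cell_update(c_prev, g_t, f_pre=-10.0, i_pre=10.0)

print(keep, overwrite)
```

The first result stays close to the previous cell state (1.0), while the second lands close to the candidate (-0.5); intermediate gate values blend the two.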
GRU
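GRU is a variant of LSTM. It differs in using fewer gates: it has only an update gate and a reset gate, and it merges the cell state into the hidden state, yet it often performs comparably well.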
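For comparison with the LSTM step, here is a minimal scalar sketch of one GRU step with its two gates, conventionally called the reset gate `r` and the update gate `z`. It assumes the formulation used in PyTorch's `torch.nn.GRU` documentation; other references swap the roles of `z` and `1 - z`. The function name `gru_step` and the dictionary of scalar weights are hypothetical.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def gru_step(x_t, h_prev, p):
    """One scalar GRU step; `p` is a hypothetical dict of scalar weights and biases."""
    # Reset gate: how much of the previous hidden state feeds the candidate.
    r_t = sigmoid(p["W_ir"] * x_t + p["b_ir"] + p["W_hr"] * h_prev + p["b_hr"])
    # Update gate: how much of the previous hidden state is carried over.
    z_t = sigmoid(p["W_iz"] * x_t + p["b_iz"] + p["W_hz"] * h_prev + p["b_hz"])
    # Candidate hidden state.
    n_t = math.tanh(p["W_in"] * x_t + p["b_in"]
                    + r_t * (p["W_hn"] * h_prev + p["b_hn"]))
    # Blend candidate and previous state; there is no separate cell state.
    return (1.0 - z_t) * n_t + z_t * h_prev


names = ["W_ir", "b_ir", "W_hr", "b_hr",
         "W_iz", "b_iz", "W_hz", "b_hz",
         "W_in", "b_in", "W_hn", "b_hn"]
zeros = {name: 0.0 for name in names}
h_t = gru_step(1.0, 0.8, zeros)
print(h_t)
```

With all-zero weights both gates equal 0.5 and the candidate is 0, so the new hidden state is half of the previous one. The step needs three weight blocks (r, z, n) instead of the LSTM's four (i, f, g, o).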


Reference: https://blog.csdn.net/david0611/article/details/81090294