PyTorch LSTM Source Code Walkthrough and a Custom Bidirectional LSTM Operator
1. Theory
For the theoretical background of LSTM, refer to:
- the original paper
- explanatory articles
- the PyTorch LSTM operator documentation

The LSTMCell forward computation proceeds as follows:
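These are the standard LSTMCell update equations, in PyTorch's gate ordering (input, forget, cell, output):

```latex
\begin{aligned}
i_t &= \sigma(W_{ii} x_t + b_{ii} + W_{hi} h_{t-1} + b_{hi}) \\
f_t &= \sigma(W_{if} x_t + b_{if} + W_{hf} h_{t-1} + b_{hf}) \\
g_t &= \tanh(W_{ig} x_t + b_{ig} + W_{hg} h_{t-1} + b_{hg}) \\
o_t &= \sigma(W_{io} x_t + b_{io} + W_{ho} h_{t-1} + b_{ho}) \\
c_t &= f_t \odot c_{t-1} + i_t \odot g_t \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
```

Here \(\sigma\) is the sigmoid function and \(\odot\) is elementwise multiplication; \(h_t\) and \(c_t\) are the hidden and cell states.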
2. Source Code
In the Python code, all that is visible is the call into `_VF.lstm`:
# https://github.com/pytorch/pytorch/blob/master/torch/nn/modules/rnn.py
# line 688
if batch_sizes is None:
    result = _VF.lstm(input, hx, self._flat_weights, self.bias, self.num_layers,
                      self.dropout, self.training, self.bidirectional, self.batch_first)
else:
    result = _VF.lstm(input, batch_sizes, hx, self._flat_weights, self.bias, self.num_layers,
                      self.dropout, self.training, self.bidirectional)
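Note that `_VF.lstm` receives `self._flat_weights` as one flat list: for each layer and each direction there is a `weight_ih` and `weight_hh` tensor, plus `bias_ih` and `bias_hh` when bias is enabled. A small sketch of the resulting list length (the helper name `expected_flat_weights` is made up here for illustration):

```python
def expected_flat_weights(num_layers, bidirectional, has_biases):
    """Number of tensors in nn.LSTM._flat_weights: per layer and per
    direction there are weight_ih and weight_hh, and additionally
    bias_ih and bias_hh when bias is enabled."""
    num_directions = 2 if bidirectional else 1
    per_cell = 4 if has_biases else 2
    return num_layers * num_directions * per_cell

# e.g. a 2-layer bidirectional LSTM with bias carries 2 * 2 * 4 = 16 tensors
print(expected_flat_weights(2, True, True))
```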
Moving on to the C++ code. The logic there is fairly clear, and the final computation is implemented in LSTMCell.
// https://github.com/pytorch/pytorch/blob/49777e67303f608987ec0948c7fd8f46f6d3ca83/torch/csrc/api/src/nn/modules/rnn.cpp
// line 275
std::tie(output, hidden_state, cell_state) = torch::lstm(
    input,
    {state[0], state[1]},
    flat_weights_,
    options.with_bias(),
    options.layers(),
    options.dropout(),
    this->is_training(),
    options.bidirectional(),
    options.batch_first());
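Whichever backend `torch::lstm` dispatches to, the per-timestep arithmetic it ultimately performs is the LSTMCell step. A minimal pure-Python sketch of one step (no PyTorch dependency; gate ordering i, f, g, o follows PyTorch's convention, and the stacked weight matrices have `4 * hidden` rows):

```python
import math

def lstm_cell(x, h, c, w_ih, w_hh, b_ih, b_hh):
    """One LSTMCell forward step on plain Python lists.

    w_ih: (4*hidden, input) rows stacked in gate order i, f, g, o.
    w_hh: (4*hidden, hidden) in the same gate order.
    """
    hidden = len(h)

    def matvec(W, v):
        return [sum(W[i][j] * v[j] for j in range(len(v))) for i in range(len(W))]

    def sigmoid(v):
        return 1.0 / (1.0 + math.exp(-v))

    # pre-activations for all four gates at once
    z = [a + b + bi + bh
         for a, b, bi, bh in zip(matvec(w_ih, x), matvec(w_hh, h), b_ih, b_hh)]
    i = [sigmoid(v) for v in z[0:hidden]]            # input gate
    f = [sigmoid(v) for v in z[hidden:2 * hidden]]   # forget gate
    g = [math.tanh(v) for v in z[2 * hidden:3 * hidden]]  # candidate cell
    o = [sigmoid(v) for v in z[3 * hidden:4 * hidden]]    # output gate

    c_new = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, g)]
    h_new = [ov * math.tanh(cv) for ov, cv in zip(o, c_new)]
    return h_new, c_new
```

With all-zero weights and biases every gate sits at 0.5, so the cell state is simply halved each step; this makes the sketch easy to sanity-check by hand.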
// https://github.com/pytorch/pytorch/blob/1a93b96815b5c87c92e060a6dca51be93d712d09/aten/src/ATen/native/RNN.cpp
// line 855
std::tuple<Tensor, Tensor, Tensor> lstm(
    const Tensor& _input, TensorList hx,
    TensorList _params, bool has_biases,
    int64_t num_layers, double dropout_p, bool train, bool bidirectional, bool batch_first) {
  TORCH_CHECK(hx.size() == 2, "lstm expects two hidden states");
  if (at::cudnn_is_acceptable(_input)) {
    Tensor output, hy, cy;
    lstm_cudnn_stub(_input.type().device_type(), output, hy, cy, _input, hx, _params, has_biases,
                    num_layers, dropout_p, train, bidirectional, batch_first);
    return std::make_tuple(output, hy, cy);
  }
  if (use_miopen(_input, dropout_p)) {
    Tensor output, hy, cy;
    lstm_miopen_stub(_input.type()