量化lstm为onnx遇到end值越界的解决方法
问题
量化 LSTM 模型并导出为 ONNX 时出现问题:切片(Slice)的 end 值越界,导出/量化报错。
解决
查看代码:
class LSTM_Model(nn.Module):
    """LSTM classifier: a (possibly stacked) LSTM followed by a fully connected layer.

    NOTE(review): this is the ORIGINAL version that fails during ONNX
    quantization/export — per the article, the negative index in
    `out[:, -1, :]` produces an out-of-range `end` value in the exported
    slice, which is what the fix below addresses.
    """

    def __init__(self, input_dim, hidden_dim, layer_dim, output_dim):
        super(LSTM_Model, self).__init__()  # initialize the parent class constructor
        self.hidden_dim = hidden_dim
        self.layer_dim = layer_dim
        # Build the LSTM model
        self.lstm = nn.LSTM(input_dim, hidden_dim, layer_dim, batch_first=True)
        # Fully connected output layer
        self.fc = nn.Linear(hidden_dim, output_dim)

    def forward(self, x):
        # Initialize the hidden state with zeros
        # (layer_dim, batch_size, hidden_dim)
        # NOTE(review): `device` is a module-level global defined elsewhere in the file
        h0 = torch.zeros(self.layer_dim, x.size(0), self.hidden_dim).requires_grad_().to(device)
        # Initialize the cell state
        c0 = torch.zeros(self.layer_dim, x.size(0), self.hidden_dim).requires_grad_().to(device)
        # Detach the hidden states to avoid exploding gradients
        out, (hn, cn) = self.lstm(x, (h0.detach(), c0.detach()))
        # Only the last time step of the last layer's output is needed.
        # The -1 index on this line is what breaks the ONNX export.
        out = self.fc(out[:, -1, :])
        return out
原因:切片索引中不能出现 -1 —— 负索引在 ONNX 导出时会被转换成越界的 end 值,需改用由序列长度算出的非负索引。
更改代码:
class LSTM_Model(nn.Module):
    """LSTM classifier whose forward pass is ONNX-export friendly.

    Compared with the failing version, the last time step is selected with a
    non-negative index computed from the sequence length, so the exported
    slice has an in-range `end` value.

    Args:
        input_dim: number of input features per time step.
        hidden_dim: LSTM hidden state size.
        layer_dim: number of stacked LSTM layers.
        output_dim: number of output classes.
    """

    def __init__(self, input_dim, hidden_dim, layer_dim, output_dim):
        super(LSTM_Model, self).__init__()
        self.hidden_dim = hidden_dim
        self.layer_dim = layer_dim
        # Build the LSTM; batch_first=True means input is (batch, seq, feature)
        self.lstm = nn.LSTM(input_dim, hidden_dim, layer_dim, batch_first=True)
        # Fully connected output layer
        self.fc = nn.Linear(hidden_dim, output_dim)
        # NOTE(review): the Softmax is part of the author's fixed version but is
        # unrelated to the index fix; it normalizes logits into probabilities.
        self.Softmax = nn.Softmax(dim=1)

    def forward(self, x):
        """Run the LSTM over x of shape (batch, seq, input_dim); return (batch, output_dim)."""
        # Zero-initialize hidden and cell states: (layer_dim, batch, hidden_dim).
        # Use x.device rather than a module-level global so the model works on
        # whatever device the input lives on; plain zeros need no detach()
        # (the original's requires_grad_() followed by detach() was a no-op).
        h0 = torch.zeros(self.layer_dim, x.size(0), self.hidden_dim, device=x.device)
        c0 = torch.zeros(self.layer_dim, x.size(0), self.hidden_dim, device=x.device)
        out, (hn, cn) = self.lstm(x, (h0, c0))
        # Select the last time step with a non-negative index: ONNX export of a
        # -1 slice yields an out-of-range `end`, which this avoids.
        seq_len = out.shape[1]
        out = self.fc(out[:, seq_len - 1, :])
        return self.Softmax(out)
成功解决