[Point Cloud 3D Object Detection] IA-SSD error: Expected isFloatingType(grads[i].scalar_type()) to be true, but got false.

杨立青101

Last modified: 2022-11-20 20:21:11

Categories: 3D object detection, ubuntu. Tags: deep learning, object detection, pytorch

First published: 2022-10-07 21:31:15

Article link: https://blog.csdn.net/AaaA00000001/article/details/127198561

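Project context:

I have recently been running the IA-SSD algorithm. Its main goal is to reduce memory footprint and computational cost, and it argues that not all points are equally important for object detection: existing point-based pipelines usually downsample the input point cloud progressively with task-agnostic random sampling or farthest point sampling, whereas for an object detector the foreground points tend to matter more than the background points.
Thanks to its low memory footprint and high degree of parallelism, it runs at more than 80 frames per second on the KITTI dataset with a single RTX2080Ti GPU, and extensive experiments on several large-scale detection benchmarks show that IA-SSD is highly competitive.
[Figure: performance comparison of different algorithms on the KITTI benchmark]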

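Problem description

The IA-SSD source code is available on GitHub: https://github.com/yifanzhang713/IA-SSD
IA-SSD is built on OpenPCDet. If you have trouble installing the spconv library, see my other article: "Installing Spconv1.x and Spconv2.x in a complete OpenPCDet environment: problems and solutions".
While running IA-SSD I hit another uncommon error: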

RuntimeError: Expected isFloatingType(grads[i].scalar_type()) to be true, but got false.  (Could this error message be improved?  If so, please report an enhancement request to PyTorch.) (validate_outputs at /opt/conda/conda-bld/pytorch_1591914858187/work/torch/csrc/autograd/engine.cpp:476)

Full error output:

Traceback (most recent call last):
  File "train.py", line 205, in <module>
    main()
  File "train.py", line 157, in main
    train_model(
  File "/home/hzc/PythonProject/IA-SSD/tools/train_utils/train_utils.py", line 119, in train_model
    accumulated_iter = train_one_epoch(
  File "/home/hzc/PythonProject/IA-SSD/tools/train_utils/train_utils.py", line 58, in train_one_epoch
    loss.backward()
  File "/home/hzc/anaconda3/envs/SESSD/lib/python3.8/site-packages/torch/tensor.py", line 198, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/home/hzc/anaconda3/envs/SESSD/lib/python3.8/site-packages/torch/autograd/__init__.py", line 98, in backward
    Variable._execution_engine.run_backward(
RuntimeError: Expected isFloatingType(grads[i].scalar_type()) to be true, but got false.  (Could this error message be improved?  If so, please report an enhancement request to PyTorch.) (validate_outputs at /opt/conda/conda-bld/pytorch_1591914858187/work/torch/csrc/autograd/engine.cpp:476)
frame #0: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x4e (0x7f3a29c48b5e in /home/hzc/anaconda3/envs/SESSD/lib/python3.8/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0x2ae2ea7 (0x7f3a57aceea7 in /home/hzc/anaconda3/envs/SESSD/lib/python3.8/site-packages/torch/lib/libtorch_cpu.so)
frame #2: torch::autograd::Engine::evaluate_function(std::shared_ptr<torch::autograd::GraphTask>&, torch::autograd::Node*, torch::autograd::InputBuffer&) + 0x548 (0x7f3a57acff48 in /home/hzc/anaconda3/envs/SESSD/lib/python3.8/site-packages/torch/lib/libtorch_cpu.so)
frame #3: torch::autograd::Engine::thread_main(std::shared_ptr<torch::autograd::GraphTask> const&, bool) + 0x3d2 (0x7f3a57ad1ed2 in /home/hzc/anaconda3/envs/SESSD/lib/python3.8/site-packages/torch/lib/libtorch_cpu.so)
frame #4: torch::autograd::Engine::thread_init(int) + 0x39 (0x7f3a57aca549 in /home/hzc/anaconda3/envs/SESSD/lib/python3.8/site-packages/torch/lib/libtorch_cpu.so)
frame #5: torch::autograd::python::PythonEngine::thread_init(int) + 0x38 (0x7f3a5b014b08 in /home/hzc/anaconda3/envs/SESSD/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
frame #6: <unknown function> + 0xdbbf4 (0x7f3a5d84dbf4 in /home/hzc/anaconda3/envs/SESSD/lib/python3.8/site-packages/torch/lib/../../../.././libstdc++.so.6)
frame #7: <unknown function> + 0x8609 (0x7f3a810b1609 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #8: clone + 0x43 (0x7f3a80fd6133 in /lib/x86_64-linux-gnu/libc.so.6)

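[Screenshot: the error output shown above]
The message says that a floating-point data type was expected but something else was found. However, when I printed the loss, its dtype was float32, so the loss tensor passed to loss.backward() is not the culprit. Searching GitHub and consulting the official PyTorch documentation led to a solution.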
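The dtype check described above can be sketched as follows. The tensors are made up for illustration and are not taken from IA-SSD; the point is only that a float32 loss passes the check, while integer tensors such as sampled point indices do not count as floating point.

```python
import torch

# Made-up tensors for illustration; they are not taken from IA-SSD.
weights = torch.ones(3, requires_grad=True)
loss = (weights * 2.0).sum()      # floating-point scalar, like a training loss
idx = torch.tensor([0, 3, 7])     # integer indices, like the output of a sampling op

# A float32 loss rules out the loss itself as the source of the dtype complaint.
print(loss.dtype, loss.is_floating_point())
# Integer tensors are not floating point, so they cannot carry gradients.
print(idx.dtype, idx.is_floating_point())
```

If the loss is already floating point, the next place to look is any custom operation in the model that returns integer tensors.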

Cause analysis and solution:

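The IA-SSD author's environment is PyTorch 1.1, whereas mine is 1.4, so the error is most likely caused by the PyTorch version mismatch. If you don't mind the hassle, you can try switching to a PyTorch version below 1.3.
[Screenshot: the IA-SSD author's environment]
According to the official PyTorch documentation, when the PyTorch version is >= 1.3.0, mark_non_differentiable() must be used to tell the engine if an output is not differentiable.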
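A quick way to tell which side of that threshold an environment is on is to compare the version numerically rather than as a string. This is a minimal sketch: the helper name needs_mark_non_differentiable is hypothetical, and the 1.3 threshold is simply the one quoted above, not something verified independently here.

```python
def needs_mark_non_differentiable(version: str) -> bool:
    """Return True if `version` is at or above the 1.3 threshold quoted in this post.

    Hypothetical helper for illustration; it is not part of PyTorch or IA-SSD.
    """
    # Drop local build suffixes such as "+cu101" before parsing.
    release = version.split("+")[0]
    major, minor = (int(part) for part in release.split(".")[:2])
    # Compare as integer tuples so that "1.13" sorts after "1.3".
    return (major, minor) >= (1, 3)


for v in ("1.1.0", "1.4.0", "1.5.1+cu101"):
    print(v, needs_mark_non_differentiable(v))
```

In a real environment the argument would be torch.__version__.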
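So the code probably needs to be modified. But where exactly? Following the GitHub fix titled "fix Expected isFloatingType error for pytorch version 1.2+", locate the file pointnet2/pointnet2_utils.py under the OpenPCDet folder.
[Screenshot: the GitHub fix for pointnet2_utils.py]
The change made by that fix is: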

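Comment out the following line:
return _ext.furthest_point_sampling(xyz, npoint)
and replace it with the three lines below. The only difference is that the result of _ext.furthest_point_sampling(xyz, npoint) is passed to mark_non_differentiable before being returned: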
fps_inds = _ext.furthest_point_sampling(xyz, npoint)
ctx.mark_non_differentiable(fps_inds)
return fps_inds

mark_non_differentiable: marks an output as not requiring gradient computation.
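To see the pattern in isolation, here is a minimal sketch of a custom autograd Function that returns integer indices and marks them non-differentiable. The class FarthestPointSamplingSketch is a hypothetical pure-PyTorch stand-in written for this example; it is not the compiled _ext.furthest_point_sampling op from OpenPCDet, and it always starts sampling from point 0 as an assumption.

```python
import torch
from torch.autograd import Function


class FarthestPointSamplingSketch(Function):
    """Hypothetical pure-PyTorch stand-in for the compiled _ext.furthest_point_sampling op."""

    @staticmethod
    def forward(ctx, xyz, npoint):
        # xyz: (B, N, 3) float tensor -> idx: (B, npoint) int64 indices.
        B, N, _ = xyz.shape
        idx = torch.zeros(B, npoint, dtype=torch.long, device=xyz.device)
        min_dist = torch.full((B, N), float("inf"), dtype=xyz.dtype, device=xyz.device)
        farthest = torch.zeros(B, dtype=torch.long, device=xyz.device)  # start from point 0
        batch = torch.arange(B, device=xyz.device)
        for i in range(npoint):
            idx[:, i] = farthest
            centroid = xyz[batch, farthest].unsqueeze(1)   # (B, 1, 3)
            dist = ((xyz - centroid) ** 2).sum(dim=-1)     # (B, N) squared distances
            min_dist = torch.min(min_dist, dist)           # distance to nearest chosen point
            farthest = min_dist.argmax(dim=-1)             # next pick: farthest remaining point
        # Integer indices cannot carry gradients, so tell the autograd engine explicitly.
        ctx.mark_non_differentiable(idx)
        return idx

    @staticmethod
    def backward(ctx, grad_idx=None):
        # One entry per forward input (xyz, npoint); neither receives a gradient.
        return None, None


torch.manual_seed(0)
xyz = torch.rand(2, 16, 3, requires_grad=True)
idx = FarthestPointSamplingSketch.apply(xyz, 4)

# Gradients still reach xyz through the differentiable gather below.
sampled = torch.gather(xyz, 1, idx.unsqueeze(-1).expand(-1, -1, 3))
sampled.sum().backward()
print(idx.dtype, idx.requires_grad, xyz.grad.shape)
```

The indices only select which points are used; the gradient flows through the gathered coordinates, not through the sampling op itself.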
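Next, in IA-SSD's pointnet2_utils.py, find the definitions def backward(ctx, a=None) and def backward(xyz, a=None). The return idx statement directly above each of them is what needs to change. Following the analysis above, insert ctx.mark_non_differentiable(idx) before return idx to indicate that idx does not take part in gradient computation. Then change the remaining three return idx statements in the same way, and training runs successfully.
[Screenshot: the return idx statements in pointnet2_utils.py]
[Screenshot: output of the successful training run]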

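Summary: running IA-SSD raised "Expected isFloatingType(grads[i].scalar_type()) to be true, but got false". The cause turned out to be a PyTorch version mismatch, and the problem was solved by calling ctx.mark_non_differentiable so that the index outputs are excluded from gradient computation.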