(三维重建论文)iSDF:Real-time Neural Signed Distance Fields for Robot Perception 代码运行与学习

biter0088

已于 2022-11-03 12:09:43 修改

阅读量1.8k

点赞数 4

分类专栏：深度学习论文学习文章标签：学习 python pytorch 三维重建 iSDF

于 2022-11-03 11:42:07 首次发布

本文链接：https://blog.csdn.net/BIT_HXZ/article/details/127666916

版权

论文学习同时被 2 个专栏收录

20 篇文章 7 订阅

订阅专栏

深度学习

14 篇文章 3 订阅

订阅专栏

(三维重建)iSDF:Real-time Neural Signed Distance Fields for Robot Perception 代码运行与学习
转载前一定要联系，不联系就不要转载啦~

0 说明

iSDF提供三种代码运行模式，
(1)在提供姿势的序列上运行 iSDF：用于在本文中使用的 ScanNet 和 ReplicaCAD 序列上运行 iSDF。
(2)在 ROS 中使用实时摄像头运行 iSDF：iSDF 的 ros 包装器订阅由 ORB-SLAM3 包装器发布的姿势帧主题。
(3)在 ROS 中使用 Franka 和实时摄像头运行 iSDF：iSDF 的 ros 包装器订阅了来自 Franka Panda 机器人上校准的 realsense 的 realsense 帧和姿势的主题。

这里实现第一种

1 在提供姿势的序列上运行 iSDF

1.0 下载代码

git clone https://github.com/facebookresearch/iSDF.git && cd iSDF
硬件:RTX3070、ubuntu18.04

1.1 配置环境

步骤1

先参照作者github，使用anaconda配置环境（要安装的包比较多，可能会比较慢）：
conda env create -f environment.yml

步骤2

如果出现Pip subprocess error、CondaEnvException: Pip failed即配置环境的过程中一些通过pip安装的包无法安装，下面将这些没有成功安装的包拿出来进行安装。

在这里插入图片描述

将没有安装的包copy出来，一般是从某一个包后的所有包都没有成功安装，粘贴进一个txt文件里面，我这里取名为requirements.txt

opencv-python==4.5.3.56
ordered-set==4.0.2
-----------省略显示
xxhash==2.0.2
zipp==3.8.1

进入刚刚配置的环境：conda activate isdf
补充安装包：pip install -r requirements_my.txt ,安装完成如下：

在这里插入图片描述

步骤3

进入pytorch官网页面：https://pytorch.org/get-started/previous-versions/
选一个不是很新，也不是很旧的版本，这里选了1.12.0

在这里插入图片描述

在python环境中输入：conda install pytorch==1.12.0 torchvision==0.13.0 torchaudio==0.12.0 cudatoolkit=11.3 -c pytorch

步骤4

进入代码最父级目录(目录下有setup.py文件)，安装isdf这个自定义的python包（名字叫做incSDF）：

cd iSDF
pip install -e .

在这里插入图片描述

1.2 下载数据序列

1.2.1 ReplicaCAD数据

(1)简介

ReplicaCAD 数据集是艺术家对具有许多不同房间配置的 FRL 公寓的再创作。
通过模拟在两种不同公寓配置中移动的机车生成了 6 个序列。我们为每个公寓配置生成三种类型的序列。一个导航序列（nav），一个对象重建序列（obj）和一个操作序列（mnp）。
要生成序列，首先安装 Habitat-Sim 。然后可以通过运行此脚本生成序列
Habitat-Sim
一个支持物理学的高性能3D模拟器，支持:
·室内/室外空间的 3D 扫描（内置支持 HM3D、MatterPort3D、Gibson、Replica 和其他数据集）
·空间和分段刚性(piecewise-rigid)对象的 CAD 模型（例如 ReplicaCAD、YCB、Google 扫描对象），
·可配置传感器（RGB-D 相机、自我运动感应(egomotion sensing)）
·通过 URDF 描述的机器人（像 Fetch 这样的移动机械手，像 Franka 这样的固定臂，像 AlienGo 这样的四足机器人），
·刚体力学（通过 Bullet）。

(2) 下载数据

注：在python环境中运行如下命令。
下载单个replicaCAD序列（约5GB）：
bash data/download_apt_2_nav.sh
在这里插入图片描述

(3) 报错相关

gdown: 未找到命令，解决：pip install gdown

1.2.2 所有12个序列

下载所有12个序列（约15GB），注意这个下载会包含上面下载的数据：
bash data/download_data.sh

在这里插入图片描述

1.2.3 ScanNet

(1)简介

对于 ScanNet 数据集，我们使用 6 个序列。这些可以通过下载 ScanNet 数据集并使用此脚本导出序列(depth, color, poses and intrinsics)来获得。
我们使用 3 个更长和 3 个更短的随机选择的 ScanNet 序列。
更长的时间：scene0010_00：6.52m scene0030_00：6.98m scene0031_00：766m缩短：Space0004_00：9.55m scene0005_00：5.40m scace0009_00：7.00m
结合下面的训练情况来看，更长的时间的数据序列对显卡的要求更高，我的RTX3070就不太行

(2)下载数据序列并导出颜色图、深度图、内参、位姿

参考我的另一个笔记，下载scene0004_00,scene0005_00,scene0009_00,scene0010_00,scene0030_00,scene0031_00这6个序列

笔记链接：https://blog.csdn.net/BIT_HXZ/article/details/127641860

1.2.4 Tabletop dataset

(1)简介

桌面数据集由我们在装有 Realsense 相机的 Franka 上收集的 5 条短轨迹组成。我们扫描了各种 YCB 物体和 3D 打印的斯坦福兔子。

我这里暂时不考虑这个数据集

1.3 训练

1.3.1 replicaCAD数据序列

(1)使用默认数据序列

train文件夹：

在这里插入图片描述

执行命令：

cd isdf/train/
python train_vis.py --config configs/replicaCAD.json

在这里插入图片描述

思考：
观察重构的网格(Mesh reconstruction)，相机重构的角度受限，只能获取正前方视野内的环境数据，对单个位置周围环境数据利用不如激光雷达

(2)使用`apt_3_nav`数据序列

修改:xx/iSDF/isdf/train/configs/replicaCAD.json如下：

        "seq_dir": "/media/username/T7/dataset/iSDF/all_12_sequences/seqs/apt_3_nav/",
        "gt_sdf_dir": "/media/username/T7/dataset/iSDF/all_12_sequences/gt_sdfs/apt_3/",

执行命令：

cd isdf/train/
python train_vis.py --config configs/replicaCAD.json

在这里插入图片描述

(3)其他数据序列类似处理

1.3.2 ScanNet数据序列

(1)使用`scene0005_00`数据序列

修改xx/iSDF/isdf/train/configs/scannet.json里面scannet_dir和intrinsics_file,具体地址参考前面1.2.3 ScanNet，我这里为:

        "scannet_dir": "/media/username/T7/dataset/ScanNet/scans/scene0005_00/",        

        "intrinsics_file": "/media/username/T7/dataset/ScanNet/scans/scene0005_00/scene0005_00.txt"

开始训练：

cd isdf/train/
python train_vis.py --config configs/scannet.json

训练效果：

在这里插入图片描述

报错：/opencv/modules/imgproc/src/color.cpp:182: error: (-215:Assertion failed) !_src.empty() in function ‘cvtColor’

  File "/home/username/subject/iSDF/iSDF/isdf/datasets/image_transforms.py", line 14, in __call__
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
cv2.error: OpenCV(4.5.3) /tmp/pip-req-build-afu9cjzs/opencv/modules/imgproc/src/color.cpp:182: error: (-215:Assertion failed) !_src.empty() in function 'cvtColor'

在这里插入图片描述

参考：https://github.com/ageitgey/face_recognition/issues/933
应该是函数cv2.cvtColor读到了空图像即前面scannet_dir设置不合理

我这里修改加载scannet数据部分的函数ScanNetDataset()内的源码：
修改文件xx/iSDF/isdf/datasets/dataset.py如下，下面self.root_dir为xxiSDF/isdf/train/configs/scannet.json文件里面scannet_dir,故这里结合自己下载以及导出数据的目录结构，将frame去掉就通了(注：相应地如果在自己下载数据集中增加frames这一集目录结构就不用改代码了)

        # self.rgb_dir = os.path.join(root_dir, "frames", "color/")
        # self.depth_dir = os.path.join(root_dir, "frames", "depth/")
        print("调试 self.root_dir",self.root_dir) #hxz
        self.rgb_dir = os.path.join(root_dir, "color/") #hxz
        self.depth_dir = os.path.join(root_dir,  "depth/")   #hxz

(2)其他数据序列

修改xx/iSDF/isdf/train/configs/scannet.json里面seq_dir,gt_sdf_dir,scannet_dir，intrinsics_file等处。
有个这些序列数据量比较大，点击等待，稍等一会即可

scene0031_00数据序列

在这里插入图片描述

scene0009_00数据序列，箭头所指位置为洗手间镜子反射造成的

在这里插入图片描述

scene0010_00数据序列及对应的真值
在这里插入图片描述

scene0004_00数据序列
在这里插入图片描述

1.4 训练步骤报错

(1) import不到其他文件夹下面的内容

如果训练时报错：import不到其他文件夹下面的内容，请检查1.1步的pip install -e .是否执行了

(2) 找不到包的报错

ModuleNotFoundError: No module named 'pyglet'----pip install pyglet
ModuleNotFoundError: No module named 'trimesh'----pip install trimesh
ModuleNotFoundError: No module named 'urdfpy'----pip install urdfpy

(3) reload(sys)报错—可以上网搜，针对自己电脑情况进行修改

NameError: name 'reload' is not defined

在这里插入图片描述

解决：

sudo gedit /opt/ros/melodic/lib/python2.7/dist-packages/roslib/packages.py

我运行的python环境为python3.7，对应修改为：

import sys
import importlib
importlib.reload(sys)

(4) `AttributeError: 'open3d.cuda.pybind.visualization.gui.Application' object has no attribute 'add_font'`

是open3d版本原因，按照environment.yml要求的0.14.1版本重新安装，参考命令如下：
正常安装：
pip install open3d==0.14.1
conda install open3d==0.14.1
使用发行open3d的一些机构安装：
conda install -c open3d-admin open3d=0.14.1
conda install -c tiledb open3d=0.14.1
使用清华源安装
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple open3d==0.14.1

(5) 设置了vpn导致在终端无法pip安装包

在~/.bashrc文件中将对应位置注释掉，并source ~/.bashrc一下，我这里注释位置为：

#export HTTP_PROXY="http://127.0.0.1:15732"
#export HTTPS_PROXY="http://127.0.0.1:15732"

(注：需要终端翻墙时再改回去)

(6) TypeError: clamp(): argument ‘min‘ (position 2) must be Number, not Tensor

原因：pytorch在1.9.0版本之前不允许使用tensor作为min参数，1.9.0及之后的版本允许。
更新pytorch版本，可能需要对应更新cuda、cudnn版本
解决方案：https://blog.csdn.net/BIT_HXZ/article/details/127604680；https://blog.csdn.net/BIT_HXZ/article/details/127604530

(7) X Error of failed request: GLXBadFBConfig

原因：open3d和OpenGL版本冲突，更新OpenGL版本即可
解决方案：https://blog.csdn.net/BIT_HXZ/article/details/127599976

X Error of failed request: GLXBadFBConfig
Major opcode of failed request: 151 (GLX)
Minor opcode of failed request: 34 ()
Serial number of failed request: 33
Current serial number in output stream: 31

1.5 SDF 精度图绘制(SDF accuracy plots)

作者提供了生成比较iSDF和两个对照的SDF误差、碰撞成本误差和梯度余弦距离的图的脚本。这个脚本被用来生成论文中的所有定量图（例如图8）。
注意：这里使用的数据是作者训练的结果，是xx/iSDF/isdf/eval/figs/all_seq.py代码中isdf_dir指向的位置xxF/iSDF/results/iSDF/exp0/，在上面1.0 下载代码时一并被下载了。

修改文件：xx/iSDF/isdf/eval/plot_utils.py

# eval_pts_dir = "/home/joe/projects/incSDF/incSDF/data/eval_pts/vox/"
eval_pts_dir = "xx/iSDF/data/eval_pts/vox/"  #hxz

修改文件：xx/iSDF/isdf/eval/figs/all_seq.py，在results前增加一个/

# save_dir = incSDF_root + "results/figs/all_seq_plots/"
save_dir = incSDF_root + "/results/figs/all_seq_plots/"    ##hxz

python isdf/eval/figs/all_seq.py
在xx/iSDF/results/figs/all_seq_plots文件夹下保存

在这里插入图片描述

报错与解决：ModuleNotFoundError: No module named 'git'----pip install itpython

1.6 重现实验(Reproducing experiments)

在无头模式（headless mode）下按顺序运行一批iSDF实验。要运行这些实验，你必须使用我们的bash脚本下载所有12个序列，以及单独下载和导出ScanNet序列。为了只运行ReplicaCAD序列，你可以修改batch_utils.py中的load_params函数。如果你有多个GPU，你可能想把实验运行并行化。请确保在运行前更新job_local.py中的project_dir和scannet_dir。

(1)重现实验流程说明

训练

cd isdf/train
python batch_train/jobs_local.py

会在这个文件夹xx/iSDF/results/iSDF/10-31-22_11-13-19下产生一个以时间为名称的结果保存目录：

在这里插入图片描述

结果绘制SDF精度图：
修改：xx/iSDF/isdf/eval/figs/all_seq.py文件中如下语句：

    # isdf_dir = incSDF_root + "/results/iSDF/exp0/"#hxz
    isdf_dir = incSDF_root + "/results/iSDF/10-31-22_11-13-19/"    #hxz

运行xx/iSDF/isdf/eval/figs/all_seq.py脚本：python isdf/eval/figs/all_seq.py

(2)重现实验2

所有数据序列

参考：1.3.2 ScanNet数据序列, 1.2.3 ScanNet中的数据序列目录
修改：xx/iSDF/isdf/train/batch_train/jobs_local.py中scannet_root:

# scannet_root = "/mnt/sda/ScanNet/" #
scannet_root = "/media/username/T7/dataset/ScanNet/"#hxz

训练过程中显卡存储报错：

RuntimeError: CUDA out of memory. Tried to allocate 178.00 MiB (GPU 0; 7.77 GiB total capacity; 5.63 GiB already allocated; 74.19 MiB free; 6.68 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.

从终端打印输出来看，应该是同时训练的数据序列太多；此外，从终端输出看每个序列都是单独计算损失值的。故可以将批量训练改为单独训练,最后再将单独训练的结果合并到一个文件夹下

前三个数据序列

修改xx/iSDF/isdf/train/batch_train/batch_utils.py,这里先训练前三个序列，其余6个先删除（做好备份）

在这里插入图片描述

可以看出：一个数据序列训练十组，共30个实验

在这里插入图片描述

还是报错：

RuntimeError: CUDA out of memory. Tried to allocate 172.00 MiB (GPU 0; 7.77 GiB total capacity; 5.41 GiB already allocated; 112.19 MiB free; 6.65 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.

前1个数据序列–即`apt_2_mnp`数据序列

在这里插入图片描述

第2个数据序列–即`apt_2_obj`数据序列

在这里插入图片描述

第3个数据序列–即`apt_2_nav`数据序列

在这里插入图片描述

第4个数据序列–即`apt_3_mnp`数据序列

在这里插入图片描述

第一次运行python batch_train/jobs_local.py报了下面的显卡显存不足，自己觉得一个数据序列而已，再次运行（全称盯着。。。）就没报错了，无语。一种可能是，在上一次执行脚本jobs_local.py刚结束，就换了个数据继续执行，显卡休息时间短，可以晚一会；以及执行jobs_local.py脚本的时候，尽量少运行其他程序。

RuntimeError: CUDA out of memory. Tried to allocate 170.00 MiB (GPU 0; 7.77 GiB total capacity; 5.26 GiB already allocated; 84.19 MiB free; 6.67 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.

第5个数据序列–即`apt_3_obj`数据序列

在这里插入图片描述

第6个数据序列–即`apt_3_nav`数据序列–显存不足

在这里插入图片描述

报错：

RuntimeError: CUDA out of memory. Tried to allocate 170.00 MiB (GPU 0; 7.77 GiB total capacity; 5.26 GiB already allocated; 84.19 MiB free; 6.67 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.

查找batch_size的作用，查到xx/iSDF/isdf/geometry/transform.py的backproject_pointclouds()函数，发现 batch_size = depths.shape[0]即batch_size值为深度图像的高度。结合下图知batch_size为680
在这里插入图片描述

这里强行减半，在xx/iSDF/isdf/geometry/transform.py中修改如下：

    # batch_size = depths.shape[0] #hxz
    print("执行到此函数 hxz") 
    batch_size = 100  #hxz

修改无效，注意改回去
此数据序列先放弃

第10个数据序列–即`scene0004_00`数据序列

在这里插入图片描述

修改：xx/iSDF/isdf/train/batch_train/batch_utils.py的base_config_file

# base_config_file = project_dir + "/isdf/train/configs/replicaCAD.json"
base_config_file = project_dir + "/isdf/train/configs/scannet.json" #hxz

同时修改/isdf/train/configs/scannet.json，将seq_dir,scannet_dir,gt_sdf_dir,intrinsics_file都修改为scene0004_00对应的，如下所示：
在这里插入图片描述

报错：cv2.error: OpenCV(4.5.3) /tmp/pip-req-build-afu9cjzs/opencv/modules/imgproc/src/color.cpp:182: er

修改xx/iSDF/isdf/datasets/dataset.py文件class SceneCache(Dataset)里面的内容，去掉frames

                # depth_file = root_dir + "/frames/depth/" + str(idx) + ".png" #hxz
                # rgb_file = root_dir + "/frames/color/" + str(idx) + col_ext       #hxz
                depth_file = root_dir + "/depth/" + str(idx) + ".png"     #hxz
                rgb_file = root_dir + "/color/" + str(idx) + col_ext          #hxz

第11个数据序列–即`scene0005_00`数据序列

在这里插入图片描述

修改/isdf/train/configs/scannet.json，将seq_dir,scannet_dir,gt_sdf_dir,intrinsics_file都修改为scene0005_00对应的

第12个数据序列–即`scene0009_00`数据序列

在这里插入图片描述

修改/isdf/train/configs/scannet.json，将seq_dir,scannet_dir,gt_sdf_dir,intrinsics_file都修改为scene0009_00对应的

第9个数据序列–即`scene0031_00`数据序列–显存不足

在这里插入图片描述

修改/isdf/train/configs/scannet.json，将seq_dir,scannet_dir,gt_sdf_dir,intrinsics_file都修改为scene0031_00对应的

报错:RuntimeError: CUDA out of memory. Tried to allocate 164.00 MiB (GPU 0; 7.77 GiB total capacity; 5.44 GiB already allocated; 16.19 MiB free; 6.89 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.
原因应该是：scene0031_00属于长数据序列，而前面的scene0004_00、scene0005_00、scene0009_00属于短数据序列

尝试修改：xx/iSDF/isdf/train/batch_train/jobs_local.py的runs_per_seq (每个数据序列运行多少次/使用多少次):从10修改到5

config_files, save_paths = batch_utils.create_configs_nruns(
    base_config_file,
    data_dir,
    scannet_root,
    save_root,
    # runs_per_seq=10, #hxz
    runs_per_seq=5,    #hxz
    save_slices=False
)

这样就只输出5个结果结果目录
在这里插入图片描述

还是报显存错误

(3)绘制SDF精度图

将(2)重现实验2训练的8组数据复制到/iSDF/results/iSDF/exp1内，再从xx/iSDF/iSDF/results/iSDF/exp0将scene00010_00,scene0030_00,scene0031_00,apt_3_nav对应的数据粘贴过来，最后exp1文件夹内容如下（每个数据序列10组）：

在这里插入图片描述

运行python all_seq.py,结果如下：
在这里插入图片描述

(4)比较绘制SDF精度图

下图左边是作者github上面展示的数据绘制的SDF精度图，右边是自己绘制的SDF精度图，可以看出自己训练的结果与作者 (除scene00010_00,scene0030_00,scene0031_00,apt_3_nav外)还是有比较大的差距的，尤其表现在scene00009_00数据序列

在这里插入图片描述

重新单独训练一次scene00009_00数据序列，报错RuntimeError: CUDA out of memory. Tried to allocate 176.00 MiB (GPU 0; 7.77 GiB total capacity; 5.21 GiB already allocated; 80.19 MiB free; 6.83 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and ，可能前面侥幸使scene00009_00数据序列训练完成，但训练效果变差

参考(3)绘制SDF精度图将scene00009_00数据序列（只有7组结果）对应的结果从exp0粘贴到exp1

再次绘制SDF精度图：
下图左边是作者github上面展示的数据绘制的SDF精度图，右边是自己绘制的SDF精度图，将自己训练的结果与作者对比 (除scene00010_00,scene0030_00,scene0031_00,apt_3_nav，scene00009_00外)，可以看出，自己训练的数据序列几乎都是比作者差的，应该是显卡等硬件设备的问题；但总体上走势还是相似的。

在这里插入图片描述

biter0088

关注

4
点赞
踩
13

收藏

觉得还不错? 一键收藏
4
评论
(三维重建论文)iSDF:Real-time Neural Signed Distance Fields for Robot Perception 代码运行与学习

(三维重建论文)iSDF:Real-time Neural Signed Distance Fields for Robot Perception 代码运行与学习
复制链接

扫一扫