Deploying the Point Cloud Instance Segmentation Project PointGroup from Scratch

Justin_JGT

Last modified: 2024-02-29 14:12:30

First published: 2024-02-21 16:46:09

Article link: https://blog.csdn.net/Justin_JGT/article/details/136155006

Environment

python 3.7

NVIDIA RTX 4090

I. Download

Paper: 2004.01658.pdf (arxiv.org)

GitHub: dvlab-research/PointGroup: PointGroup: Dual-Set Point Grouping for 3D Instance Segmentation (github.com)

II. Environment Setup

Following the setup steps of the original project, training fails with RuntimeError: cublas runtime error : the GPU program failed to execute at …………aten/src/THC/THCBlas.cu:259

The test machine uses an RTX 4090, which is not compatible with torch 1.1, so a newer build, torch 1.13.1+cu116, is installed instead:

conda create -n pointgroup python=3.7
conda activate pointgroup
pip install torch==1.13.1+cu116 torchvision==0.14.1+cu116 torchaudio==0.13.1 --extra-index-url https://download.pytorch.org/whl/cu116
pip install cmake
pip install plyfile
pip install tensorboardX
pip install pyyaml
pip install scipy

conda install libboost
conda install -c bioconda google-sparsehash 

# Build spconv
cd lib/spconv
python setup.py bdist_wheel
cd dist
pip install spconv-1.0-cp37-cp37m-linux_x86_64.whl

① If building spconv fails with: Your installed Caffe2 version uses cuDNN but I cannot find the cuDNN libraries. Please set the proper cuDNN prefixes and / or install cuDNN.

Add the following to lib/spconv/CMakeLists.txt (cuDNN was already installed on this machine; adjust the paths to your own cuDNN location):

set(CUDNN_INCLUDE_DIR "/home/your_name/.conda/pkgs/cudnn-8.2.1.32-h86fa8c9_0/include")
set(CUDNN_INCLUDE_PATH "/home/your_name/.conda/pkgs/cudnn-8.2.1.32-h86fa8c9_0/include")
set(CUDNN_LIBRARY "/home/your_name/.conda/pkgs/cudnn-8.2.1.32-h86fa8c9_0/lib/libcudnn.so")
set(CUDNN_LIBRARY_PATH "/home/your_name/.conda/pkgs/cudnn-8.2.1.32-h86fa8c9_0/lib/libcudnn.so")

② Error: error: no matching function for call to ‘torch::jit::RegisterOperators::RegisterOperators(const char [28], <unresolved overloaded function type>)’

This is caused by an API change in newer torch versions. Replace torch::jit::RegisterOperators() with torch::RegisterOperators().

③Unsupported gpu architecture 'compute_89'

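The installed nvcc does not recognize the RTX 4090's compute capability 8.9. Before building, run export TORCH_CUDA_ARCH_LIST="8.6" in the shell so that the build targets the lower compute capability 8.6 instead.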

④error: more than one operator ">" matches these operands:

error: more than one operator "==" matches these operands:

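Both the CUDA headers and the torch headers provide the same overloaded operators, so the compiler cannot decide which one to use.

Add the following to lib/spconv/CMakeLists.txt: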

add_definitions(-D__CUDA_NO_HALF_OPERATORS__)

Then build pointgroup_ops:

# Build pointgroup_ops
cd ../../../lib/pointgroup_ops
python setup.py develop

① Error: src/bfs_cluster/bfs_cluster.h:11:10: fatal error: THC/THC.h: No such file or directory

Delete that #include line from bfs_cluster.h and rebuild.

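III. Data Preparation

1. Download

Reference: "ScanNet v2 dataset download (WP)" (CSDN blog)

After downloading, the data sits under dataset/scannetv2/Scannet_V2, which contains per-split scan folders such as scans_train and scans_test plus scannetv2-labels.combined.tsv (these are the paths the script in the next step reads).

2. Organize the data directory structure

Arrange the files into the target directory structure described on GitHub. The following script does this: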

import os
import shutil

root_dir = "dataset/scannetv2"
dataset_dir_src = "dataset/scannetv2/Scannet_V2"

# Files of each scene that PointGroup needs (matched as substrings of the file name)
sub_str_list = ['_vh_clean_2.ply', '_vh_clean_2.labels.ply', '_vh_clean_2.0.010000.segs.json', '.aggregation.json']

# Target split -> source folder under dataset_dir_src.
# NOTE: the original script walked 'scans_train' for val as well, which makes
# val a copy of train. 'scans_val' is an assumed folder name; adjust it to
# wherever your validation scenes actually live.
split_to_src = {'train': 'scans_train', 'val': 'scans_val', 'test': 'scans_test'}

for split, src_name in split_to_src.items():
    target_dir = os.path.join(root_dir, split)
    os.makedirs(target_dir, exist_ok=True)
    # Flatten all matching files of this split into target_dir
    for root, _, files in os.walk(os.path.join(dataset_dir_src, src_name)):
        for file in files:
            if any(sub_str in file for sub_str in sub_str_list):
                shutil.copy(os.path.join(root, file), os.path.join(target_dir, file))

# Copy the label-mapping tsv file
shutil.copy(os.path.join(dataset_dir_src, 'scannetv2-labels.combined.tsv'), os.path.join(root_dir, 'scannetv2-labels.combined.tsv'))

3. Generate the final dataset

cd dataset/scannetv2
python prepare_data_inst.py --data_split train
python prepare_data_inst.py --data_split val
python prepare_data_inst.py --data_split test

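[Structure of the preprocessed data]

dataset/scannetv2/train/*_inst_nostuff.pth: each file stores a tuple of 4 elements; the other files are not needed.

The first element is an array of shape (N, 3) holding the 3D coordinates of the points.

The second element is an array of shape (N, 3) holding the RGB colors of the points, normalized to [-1, 1].

The third element is an array of shape (N, 1) holding the semantic label of every point in the scene; -100 is the ignore label.

The fourth element is an array of shape (N, 1) holding the instance label of every point in the scene; -100 is the ignore label.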
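
As a quick sanity check of this layout, the sketch below summarizes one such 4-tuple. It is only an illustration and not part of PointGroup: summarize_scene is a hypothetical helper, and the synthetic arrays stand in for what torch.load would return for a real *_inst_nostuff.pth file.

```python
import numpy as np

IGNORE_LABEL = -100  # ignore value described above


def summarize_scene(data, ignore_label=IGNORE_LABEL):
    """Check the 4-tuple layout and return a few counts (hypothetical helper)."""
    coords, colors, sem, inst = data
    n = coords.shape[0]
    assert coords.shape == (n, 3), "coords must be (N, 3)"
    assert colors.shape == (n, 3), "colors must be (N, 3)"
    # Flatten so that both (N,) and (N, 1) label arrays are accepted
    sem = np.asarray(sem).reshape(-1)
    inst = np.asarray(inst).reshape(-1)
    assert sem.shape[0] == n and inst.shape[0] == n, "one label per point"
    assert colors.min() >= -1.0 and colors.max() <= 1.0, "colors must lie in [-1, 1]"
    labelled = inst != ignore_label
    return {
        "num_points": n,
        "ignored_semantic": int((sem == ignore_label).sum()),
        "num_instances": int(np.unique(inst[labelled]).size),
    }


# Synthetic stand-in; with real data this would be something like
#   data = torch.load("dataset/scannetv2/train/<scene>_inst_nostuff.pth")
data = (
    np.zeros((4, 3), dtype=np.float32),   # coordinates
    np.zeros((4, 3), dtype=np.float32),   # colors in [-1, 1]
    np.array([0, 1, 5, -100]),            # semantic labels
    np.array([-100, -100, 0, -100]),      # instance labels
)
print(summarize_scene(data))
```

Running the same check over every file of a split is a cheap way to catch a broken preprocessing step before training starts.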

IV. Code Overview

Dataset: the Dataset class

Dataloader: torch.utils.data.DataLoader

Model: the PointGroup class

Optimizer: torch.optim.Adam

Scheduler: none

Loss: torch.nn.CrossEntropyLoss (cross-entropy) and torch.nn.BCELoss (binary cross-entropy)

V. Training

python train.py --config config/pointgroup_run1_scannet.yaml

For details, see the original paper and other write-ups, for example the CSDN blog post "CVPR 2020: PointGroup: Dual-Set Point Grouping for 3D Instance Segmentation" (PointGroup source code analysis).

VI. Inference

1. Inference on the validation set

First generate the ground truth for the validation set:

cd dataset/scannetv2
python prepare_data_inst_gttxt.py
cd ../..

Then run inference:

CUDA_VISIBLE_DEVICES=0 python test.py --config config/pointgroup_default_scannet.yaml --pretrain 'checkpoint/pointgroup.pth'

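This run only reports evaluation metrics; no visualization results are saved.

2. Inference on the test set

In the TEST section of the config file, set (split, eval, save_instance) to (test, False, True):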

CUDA_VISIBLE_DEVICES=0 python test.py --config config/pointgroup_run1_scannet.yaml --pretrain 'checkpoint/pointgroup.pth'

3. Visualization

Run the following script. The files it produces can be opened in the point cloud processing software CloudCompare.

"""
@Author     :   jiguotong
@Contact    :   xxxx@qq.com
@site       :   
-----------------------------------------------
@Software   :   VScode
@Project    :   utils
@File       :   
@Version    :   v0.1
@Time       :   2024/2/21
@License    :   (C)Copyright    2021-2024,  jiguotong
@Reference  :
@Description:   Visualize the files produced by the PointGroup test run
@Thought    :
"""
import numpy as np
import torch
import os

points_path_root = "dataset/scannetv2/test"
pred_files_root = "exp/scannetv2/pointgroup/pointgroup_run1_scannet/result/epoch384_nmst0.3_scoret0.09_npointt100/test"
save_path_root = os.path.join(pred_files_root, 'output')
if not os.path.exists(save_path_root):
    os.makedirs(save_path_root)

scenes_list = os.listdir(pred_files_root)
scenes_list = [file for file in scenes_list if file.endswith(".txt")]

for scene_name in scenes_list:
    point_name = scene_name.split('.')[0] + '_inst_nostuff.pth'
    point_path = os.path.join(points_path_root, point_name)
    coord = torch.load(point_path)[0]
    ins_pred = np.zeros((coord.shape[0], 1), dtype=np.uint)
    save_path = os.path.join(save_path_root, scene_name)
    scene_path = os.path.join(pred_files_root, scene_name)
    with open(scene_path, "r") as f:
        sub_scene_list = f.readlines()
        for index, instance_line in enumerate(sub_scene_list):
            instance_name = instance_line.split(' ')[0]
            instance_path = os.path.join(pred_files_root, instance_name)
            mask = np.loadtxt(instance_path)
            # Give every point of this instance a 1-based instance id
            mask = mask.reshape((mask.shape[0], 1))
            ins_pred[mask != 0] = index + 1
    # Each output row is x, y, z, instance_id (0 means no predicted instance)
    result = np.column_stack((coord, ins_pred))
    np.savetxt(save_path, result)
    print(scene_name + ' saved!')

[Figure: visualization of the predicted instances]

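VII. Training on Your Own Dataset

1. Notes

① The coordinates and colors read in the dataloader should have dtype float32.

② Normalize and center the coordinates in advance when building the dataset.

③ Pay attention to the value of ignore_label; in the ScanNet v2 dataset it is -100. In the code, the author only runs instance clustering on points whose semantic label is greater than 1, so labels 0 and 1 are effectively treated as background labels:

0-wall

1-floor

④ If the train loss becomes negative during training, the original author notes that this is normal because the loss contains an offset term (the offset direction loss is a negative cosine similarity, so it can fall below zero).

⑤ Parameters such as scale (voxel_size) in the config file and cluster_radius in the group section must be tuned for your own dataset.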
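
Notes ① to ③ can be sketched as a small preprocessing helper. This is a minimal illustration under stated assumptions, not the project's own code: prepare_scene, clustering_mask and the unit_scale parameter are hypothetical names, colors are assumed to arrive as 0 to 255 values, and the color scaling mirrors the [-1, 1] range described for the ScanNet v2 files.

```python
import numpy as np

IGNORE_LABEL = -100  # same ignore value as the ScanNet v2 pipeline


def prepare_scene(xyz, rgb_0_255, sem, inst, unit_scale=1.0):
    """Return (coords, colors, sem, inst) in the layout described earlier.

    unit_scale is a hypothetical knob for converting your units (e.g. mm to m);
    it interacts with `scale` and `cluster_radius` in the config (note 5).
    """
    xyz = np.asarray(xyz, dtype=np.float64) * unit_scale
    coords = (xyz - xyz.mean(axis=0)).astype(np.float32)            # center, float32
    colors = np.asarray(rgb_0_255, dtype=np.float32) / 127.5 - 1.0  # map to [-1, 1]
    sem = np.asarray(sem, dtype=np.int64)
    inst = np.asarray(inst, dtype=np.int64)
    # Points without a semantic label cannot belong to an instance either
    inst = np.where(sem == IGNORE_LABEL, IGNORE_LABEL, inst)
    return coords, colors, sem, inst


def clustering_mask(sem):
    """Points that take part in instance clustering: semantic label > 1."""
    return np.asarray(sem) > 1


coords, colors, sem, inst = prepare_scene(
    xyz=[[0, 0, 0], [2, 2, 2]],
    rgb_0_255=[[0, 0, 0], [255, 255, 255]],
    sem=[0, 5],
    inst=[-100, 0],
)
print(coords.dtype, colors.dtype, clustering_mask(sem))
```

If your own background classes are not stored as labels 0 and 1, either remap the labels so that they are, or change the threshold used in the model code accordingly.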

VIII. Code Walkthrough

1. The Voxelization_Idx class

class Voxelization_Idx(Function):
    @staticmethod
    def forward(ctx, coords, batchsize, mode=4):
        '''
        :param ctx:
        :param coords:  long (N, dimension + 1) or (N, dimension) dimension = 3
        :param batchsize
        :param mode: int 4=mean
        :param dimension: int
        :return: output_coords:  long (M, dimension + 1) (M <= N)
        :return: output_map: int M * (maxActive + 1)
        :return: input_map: int N
        '''
        assert coords.is_contiguous()
        N = coords.size(0)
        output_coords = coords.new()

        input_map = torch.IntTensor(N).zero_()
        output_map = input_map.new()

        PG_OP.voxelize_idx(coords, output_coords, input_map, output_map, batchsize, mode)
        return output_coords, input_map, output_map


    @staticmethod
    def backward(ctx, a=None, b=None, c=None):
        return None

voxelization_idx = Voxelization_Idx.apply

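What does PG_OP.voxelize_idx do?

It deduplicates the point coordinates in voxel space and returns three things: the unique voxel coordinates, the index of the voxel each original point falls into, and the indices of the points contained in each voxel.

coords: long, shape (N, 3) or (N, 4). With 4 columns, the first column is the sample index within the batch (for example 0, 1) and the remaining columns are the integer voxel-space coordinates, such as (1,5,6) or (15,25,69). Voxel-space coordinates start at (0,0,0).

batchsize: the number of samples that the coordinates in coords come from.

mode: how points in the same voxel are merged; 4 means mean.

output_coords: the deduplicated voxel-space coordinates, with shape (M, 3) or (M, 4).

input_map: shape (N); entry $i$ is the index, from 0 to M-1, of the voxel that contains the $i$-th point.

output_map: shape (M, k); row $i$ lists the indices into coords of the points inside the $i$-th voxel. k is determined by the largest number of points found in any single voxel (the docstring gives the width as maxActive + 1). As a result output_map contains many zeros, because most voxels hold far fewer points than that maximum.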
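
The relationship between the three outputs can be illustrated with a small NumPy sketch. This is not the compiled PG_OP operator: voxelize_idx_sketch is a hypothetical function, np.unique sorts the voxels (the real op may order them differently), and storing the point count in column 0 of output_map is an assumption that follows the maxActive + 1 width in the docstring.

```python
import numpy as np


def voxelize_idx_sketch(coords):
    """NumPy illustration of the outputs described above (not the real op).

    coords: (N, 4) integer array of [batch_idx, x, y, z] voxel coordinates.
    """
    coords = np.asarray(coords, dtype=np.int64)
    # Unique voxels (M rows) and, per point, the index of its voxel
    output_coords, input_map = np.unique(coords, axis=0, return_inverse=True)
    input_map = input_map.reshape(-1)
    m = output_coords.shape[0]
    counts = np.bincount(input_map, minlength=m)
    max_active = int(counts.max())
    # Assumed layout: column 0 = number of points, then the point indices
    output_map = np.zeros((m, max_active + 1), dtype=np.int64)
    output_map[:, 0] = counts
    next_col = np.ones(m, dtype=np.int64)  # column 0 is reserved for the count
    for point_idx, voxel_idx in enumerate(input_map):
        output_map[voxel_idx, next_col[voxel_idx]] = point_idx
        next_col[voxel_idx] += 1
    return output_coords, input_map, output_map


coords = [
    [0, 1, 5, 6],     # sample 0
    [0, 1, 5, 6],     # same voxel as the point above
    [0, 15, 25, 69],  # sample 0, another voxel
    [1, 1, 5, 6],     # sample 1: same x, y, z but a different sample
]
output_coords, input_map, output_map = voxelize_idx_sketch(coords)
print(output_coords)
print(input_map)
print(output_map)
```

Four points collapse into three voxels here; the two points of sample 0 at (1,5,6) share a voxel, while the point of sample 1 at the same position stays separate because the batch index is part of the key.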

2. The Voxelization class

class Voxelization(Function):
    @staticmethod
    def forward(ctx, feats, map_rule, mode=4):
        '''
        :param ctx:
        :param map_rule: cuda int M * (maxActive + 1)
        :param feats: cuda float N * C
        :return: output_feats: cuda float M * C
        '''
        assert map_rule.is_contiguous()
        assert feats.is_contiguous()
        N, C = feats.size()
        M = map_rule.size(0)
        maxActive = map_rule.size(1) - 1

        output_feats = torch.cuda.FloatTensor(M, C).zero_()

        ctx.for_backwards = (map_rule, mode, maxActive, N)

        PG_OP.voxelize_fp(feats, output_feats, map_rule, mode, M, maxActive, C)
        return output_feats


    @staticmethod
    def backward(ctx, d_output_feats):
        map_rule, mode, maxActive, N = ctx.for_backwards
        M, C = d_output_feats.size()

        d_feats = torch.cuda.FloatTensor(N, C).zero_()

        PG_OP.voxelize_bp(d_output_feats.contiguous(), d_feats, map_rule, mode, M, maxActive, C)
        return d_feats, None, None

voxelization = Voxelization.apply

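What does PG_OP.voxelize_fp do?

Using the mapping produced in the previous step, it reduces the point features from N rows to M rows, one row per unique voxel. With mode=4 (mean, per the docstring) the features of all points that share a voxel are averaged into that voxel's feature.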
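
The N to M reduction can be sketched in NumPy as follows. This is an illustration of mean pooling only, not the CUDA kernel: voxelize_mean_sketch is a hypothetical function, and the map_rule layout (column 0 holds the point count, the remaining columns hold point indices) is an assumption based on the maxActive + 1 width in the docstring.

```python
import numpy as np


def voxelize_mean_sketch(feats, map_rule):
    """Average point features per voxel (illustration of mode=4, mean).

    feats:    (N, C) float array of point features.
    map_rule: (M, maxActive + 1) int array; assumed layout is
              [count, idx_0, idx_1, ...] padded with zeros.
    """
    feats = np.asarray(feats, dtype=np.float32)
    map_rule = np.asarray(map_rule, dtype=np.int64)
    m = map_rule.shape[0]
    output_feats = np.zeros((m, feats.shape[1]), dtype=np.float32)
    for voxel_idx in range(m):
        count = map_rule[voxel_idx, 0]
        point_ids = map_rule[voxel_idx, 1:1 + count]  # skip the zero padding
        output_feats[voxel_idx] = feats[point_ids].mean(axis=0)
    return output_feats


feats = [[1.0, 2.0], [3.0, 4.0], [10.0, 10.0], [5.0, 5.0]]  # N = 4, C = 2
map_rule = [[2, 0, 1], [1, 2, 0], [1, 3, 0]]                # M = 3 voxels
output_feats = voxelize_mean_sketch(feats, map_rule)
print(output_feats)
```

The first voxel holds points 0 and 1, so its feature is their average; the other two voxels hold a single point each and simply keep that point's feature.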