IKDNet is a new network that fuses point clouds and imagery for segmentation, combining RandLA-Net and UNet; for a detailed introduction see the article link:
Code: code repository link
First, set up the environment according to the code's requirements. Be careful to install exactly the versions it lists: the versions of libraries such as open3d must match, otherwise function errors appear easily.
PyTorch must be installed according to the requirements-torch-cuda.txt file, otherwise errors are likely.
pip install imagecodecs
pip install albumentations
pip install protobuf==3.19.0
pip install open3d==0.15.2
pip install opencv-python
pip install laspy
pip install tifffile
conda install gdal=3.4.1
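Since mismatched versions were the main source of errors for me, a quick sanity check after installing is to print the versions and compare them with the requirements; this is just my own check, not part of the repo:
import open3d
import torch
print(open3d.__version__)   # should be 0.15.2, as installed above
print(torch.__version__)    # should match requirements-torch-cuda.txt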
With all the libraries installed, running the given command produced an error:
the module-loading step failed. Debugging showed that the imported open3d resolved to the system-installed package, whose dictionaries are missing some fields. The fix is to copy the files from the project's ml3d folder into the corresponding system package folder to fill in the missing files.
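For reference, a minimal sketch of that copy step (my own helper, not from the repo), assuming the project's ml3d folder sits in the working directory and that the installed open3d package keeps its bundled ml3d code in a _ml3d subfolder (the exact location may differ between open3d versions):
import os
import shutil
import open3d

src = "./ml3d"  # ml3d folder shipped with the IKDNet project
dst = os.path.join(os.path.dirname(open3d.__file__), "_ml3d")  # assumed location of the installed ml3d code
for root, _, files in os.walk(src):
    rel = os.path.relpath(root, src)
    os.makedirs(os.path.join(dst, rel), exist_ok=True)
    for name in files:
        target = os.path.join(dst, rel, name)
        if not os.path.exists(target):  # only fill in the files that are missing
            shutil.copy2(os.path.join(root, name), target)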
After that, change the input path (dataset) to your own point cloud path (note: the point cloud path), and change img_dataset_path in the yml file to the image path. One more small pitfall: the highlighted part must be changed to what is shown in the figure.
After updating the data paths, running again raised a point cloud reading error: No LazBackend selected, cannot decompress data.
After consulting a lot of material, it turned out that some auxiliary libraries were missing; on a hunch I installed the pylas and lazrs packages, which finally solved it, and the code started running!
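Concretely, the two installs, in the same form as the commands above:
pip install pylas
pip install lazrs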
Reproducing with my own data:
First, because my data only has labels for individual buildings, the data-reading code needs to be modified accordingly:
pc_path = self.path_list[idx]
log.debug("get_data called {}".format(pc_path))
# point cloud
las = laspy.read(pc_path)
points = np.stack([las.x, las.y, las.z], axis=1)
feat = np.stack([las.intensity, las.return_num], axis=1)
points = np.array(points, dtype=np.float32)
feat = np.array(feat, dtype=np.float32)
# intensity = np.array(intensity, dtype=np.float32)
labels = np.copy(las.classification)
# # unlabeled
# labels[labels == 1] = 0
# labels[labels == 7] = 0
# labels[labels == 12] = 0
# labels[labels >= 18] = 0
# # others
# labels[labels == 3] = 1
# labels[labels == 9] = 1
# labels[labels == 17] = 1
# # ground
# # labels[labels == 2] = 2
# # tree
# labels[labels == 4] = 3
# labels[labels == 5] = 3
# building
labels[labels == 1] = 1
labels = np.array(labels, dtype=np.int32).reshape((-1,))
data = {
    'point': points,
    'feat': None,
    # 'intensity': intensity,
    'label': labels,
}
return data
Since my point cloud data contains no intensity values, the feat entry is set to None.
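Before deciding on the label remapping and the feat entry, it can help to check what the tiles actually contain; a quick laspy check (my own snippet, with a placeholder tile path):
import numpy as np
import laspy

las = laspy.read("dataset/lidar1/tile_0.las")  # placeholder path to one training tile
vals, counts = np.unique(las.classification, return_counts=True)
print(dict(zip(vals.tolist(), counts.tolist())))  # which classification codes exist and how many points each
print(list(las.point_format.dimension_names))     # confirms whether an intensity dimension is present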
pc_path = self.path_list[idx]
# image
img_path = pc_path.replace(self.cfg.dataset_path, self.cfg.img_dataset_path).replace(".las", "_img.tif")
image = tifffile.imread(img_path).astype(np.uint8)
mask = cv2.imread(img_path.replace("image1", "mask1").replace("_img.tif", "_gt.tif"), -1)
# building
mask[mask > 0] = 1
mask[mask == 0] = 0
mask = np.array(mask, dtype=np.int32).reshape((-1,))
# coordinate transform
img = gdal.Open(img_path)
img_wkt = img.GetProjection()
img_Projection = osr.SpatialReference()
img_Projection.ImportFromWkt(img_wkt)
point_spatialRef = img_Projection
transform = osr.CoordinateTransformation(point_spatialRef, img_Projection)
img_geotransform = img.GetGeoTransform()
data = {
    'img': image,
    'mask': mask,
    'img_geotransform': img_geotransform,
    'transform': transform,
    'img_wkt': img_wkt
}
return data
Here the point cloud uses the same projection as the image, so point_spatialRef is simply set to the image projection. The reading of the image labels (mask) also has to be modified accordingly.
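For context, a minimal sketch (my own, not from the repo) of how the geotransform obtained above is typically used to map a point's geo-coordinates to pixel indices; the .tif path and the sample coordinates are placeholders, and the point is assumed to already be in the image CRS, as is the case for my data:
from osgeo import gdal

img = gdal.Open("dataset/image1/tile_0_img.tif")  # placeholder image path
gt = img.GetGeoTransform()
inv_gt = gdal.InvGeoTransform(gt)              # invert the affine geotransform
x, y = 600000.0, 4100000.0                     # sample geo-coordinates of one point
px, py = gdal.ApplyGeoTransform(inv_gt, x, y)  # fractional pixel coordinates
row, col = int(py), int(px)                    # image[row, col] lies under the point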
The configuration file also needs corresponding changes:
siamesenet_multisantaclara.yml
dataset:
  name: MultiSantaclara
  dataset_path: /mnt/sdb3/liutiancheng/IKDNet-pytorch-main/dataset/N3C-California/lidar1/
  img_dataset_path: /mnt/sdb3/liutiancheng/IKDNet-pytorch-main/dataset/N3C-California/image1/
  cache_dir: /mnt/sdb3/liutiancheng/IKDNet-pytorch-main/cache/
  ignored_label_inds: [0]
  test_result_folder: ./test
  use_cache: true
  steps_per_epoch_train: 2000
  steps_per_epoch_valid: 1049
model:
  name: SiameseNetAcf
  ckpt_path:
  num_neighbors: 16
  num_layers: 4
  num_points: 131072
  num_classes: 2
  ignored_label_inds: [3]
  sub_sampling_ratio: [4, 4, 4, 4, 2]
  pc_in_channels: 5
  img_in_channels: 3
  dim_features: 8
  dim_output: [16, 64, 128, 256, 512]
  grid_size: 0.06
  augment:
    recenter:
      dim: [0, 1, 2]
    std:
      points:
        method: linear
      feat:
        method: linear
    rotate:
      method: vertical
    scale:
      min_s: 0.9
      max_s: 1.1
    noise:
      noise_std: 0.001
pipeline:
  name: SemanticSegmentationDual
  optimizer:
    lr: 0.001
    # weight_decay: 0.0001
    # momentum: 0.9
  batch_size: 4
  num_workers: 2
  main_log_dir: ./logs
  max_epoch: 500
  save_ckpt_freq: 5
  scheduler_gamma: 0.9886
  test_batch_size: 6
  train_sum_dir: train_log
  val_batch_size: 2
  summary:
    record_for: []
    max_pts:
    use_reference: false
    max_outputs: 1
The main fields to modify are ignored_label_inds, the class weights, and num_classes. Note that num_classes here must be set to 2, i.e. the number of target classes + 1, because the background class also counts as a class.
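If the class weight field is derived from label statistics, a rough helper (my own, not from the repo) that counts points per class over the training tiles gives the numbers; the glob pattern is a placeholder and, for counting, it simply collapses everything except building to background:
import glob
import numpy as np
import laspy

counts = np.zeros(2, dtype=np.int64)  # background + building
for f in glob.glob("dataset/N3C-California/lidar1/*.las"):  # placeholder training-tile pattern
    labels = np.array(laspy.read(f).classification)
    labels = np.where(labels == 1, 1, 0)  # building -> 1, everything else -> background
    counts += np.bincount(labels, minlength=2)
print(counts)  # per-class point counts to base the weights on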