Paper: [2304.04278] Point-SLAM: Dense Neural Point Cloud-based SLAM (arxiv.org)
GitHub: eriksandstroem/Point-SLAM: Dense Neural Point Cloud-based SLAM
To get a quick feel for the project's impressive results, see the demo:
https://github.com/eriksandstroem/Point-SLAM/raw/main/media/office_4.gif
1. Installation
First, make sure all the necessary dependencies are installed. The easiest way is to use Anaconda.
If you run Point-SLAM on a cluster GPU without a display, we recommend installing the headless version of Open3D, which is mainly used to evaluate the depth L1 metric of the reconstructed meshes.
This requires compiling Open3D from source. The code in this project was tested with Open3D 15.1 and 16.0.
If you compile Open3D from source, remove the Open3D dependency from the env.yaml file.
You can then create an Anaconda environment called point-slam:
conda env create -f env.yaml
conda activate point-slam
Note: conda may time out during this step; if so, configure a conda mirror or download the packages separately.
To evaluate the F-score, download and install the following library with pip:
git clone https://github.com/tfy14esa/evaluate_3d_reconstruction_lib.git
cd evaluate_3d_reconstruction_lib
pip install .
2. Data Download
2.1 Replica
Download the data as shown below; it will be saved to the ./datasets/Replica folder.
Note that the Replica data was generated by the authors of iMAP (but is hosted by the authors of NICE-SLAM). Please cite iMAP if you use the data.
bash scripts/download_replica.sh
To evaluate the reconstruction error, also download the ground truth Replica meshes, from which unseen regions have already been culled:
bash scripts/download_cull_replica_mesh.sh
2.2 TUM-RGBD
Download the TUM-RGBD dataset with the following command:
bash scripts/download_tum.sh
By default, DATAROOT is ./datasets.
If you store the data elsewhere on your machine, change the input_folder path in the scene-specific config files.
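For example, a scene config might override input_folder like this. This is a sketch based on the NICE-SLAM-style config layout this repo follows; the exact keys and inherited file name are assumptions, so verify against the actual files under configs/:

```yaml
# configs/TUM_RGBD/freiburg1_desk.yaml -- illustrative sketch, not the real file
inherit_from: configs/TUM_RGBD/tum_rgbd.yaml   # dataset-level defaults (assumed name)
data:
  input_folder: /mnt/data/TUM_RGBD/rgbd_dataset_freiburg1_desk  # your custom DATAROOT
  output: output/TUM_RGBD/freiburg1_desk
```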
2.3 ScanNet
Because ScanNet is copyright-protected, follow the instructions on the ScanNet website to submit the terms-of-use agreement before downloading. Then use the reader code provided below to extract the RGB images and depth maps from the downloaded data.
The ScanNet dataset is structured as follows:
DATAROOT
└── scannet
    └── scene0000_00
        └── frames
            ├── color
            │   ├── 0.jpg
            │   ├── 1.jpg
            │   └── ...
            ├── depth
            │   ├── 0.png
            │   ├── 1.png
            │   └── ...
            ├── intrinsic
            └── pose
                ├── 0.txt
                ├── 1.txt
                └── ...
Note: the default DATAROOT is ./datasets.
If a sequence (sceneXXXX_XX) is stored elsewhere, change the input_folder path in the config file.
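It can help to sanity-check that a downloaded sequence matches the layout above before running. The small helper below is hypothetical (not part of the repo) and simply lists any missing frame sub-directories:

```python
import os

# Sub-directories expected under <scene_root>/frames in the layout above.
EXPECTED = ["color", "depth", "intrinsic", "pose"]

def missing_scannet_dirs(scene_root):
    """Return the expected frame sub-directories missing under scene_root/frames."""
    frames = os.path.join(scene_root, "frames")
    return [d for d in EXPECTED if not os.path.isdir(os.path.join(frames, d))]
```

Calling missing_scannet_dirs('./datasets/scannet/scene0000_00') should return an empty list for a complete sequence.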
This project uses the following sequences:
scene0000_00 scene0025_02 scene0059_00 scene0062_00 scene0103_00 scene0106_00 scene0126_00 scene0169_00 scene0181_00 scene0207_00
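The sequences above can be fetched in one go with a small shell loop. This is my own convenience snippet, not part of the repo: it assumes the download_scannet.py script from the next section is saved alongside, and only prints the commands so you can review them before running (and after agreeing to the ScanNet terms of use):

```shell
#!/bin/sh
# Emit one download command per ScanNet sequence used by Point-SLAM.
# Pipe the output to `sh` to execute, or replace `echo` with the real call.
SCENES="scene0000_00 scene0025_02 scene0059_00 scene0062_00 scene0103_00 \
scene0106_00 scene0126_00 scene0169_00 scene0181_00 scene0207_00"
for scene in $SCENES; do
    echo python download_scannet.py -o ./datasets/scannet --id "$scene"
done
```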
To download the full set of files for a single scene, use the script below.
download_scannet.py:
#!/usr/bin/env python
# Downloads ScanNet public data release
# Run with ./download_scannet.py (or python download_scannet.py on Windows)
# -*- coding: utf-8 -*-
import argparse
import os
import tempfile
import urllib.request

BASE_URL = 'http://kaldir.vc.in.tum.de/scannet/'
TOS_URL = BASE_URL + 'ScanNet_TOS.pdf'
FILETYPES = ['.aggregation.json', '.sens', '.txt',
             '_vh_clean.ply', '_vh_clean_2.0.010000.segs.json',
             '_vh_clean_2.ply', '_vh_clean.segs.json',
             '_vh_clean.aggregation.json', '_vh_clean_2.labels.ply',
             '_2d-instance.zip', '_2d-instance-filt.zip',
             '_2d-label.zip', '_2d-label-filt.zip']
FILETYPES_TEST = ['.sens', '.txt', '_vh_clean.ply', '_vh_clean_2.ply']
PREPROCESSED_FRAMES_FILE = ['scannet_frames_25k.zip', '5.6GB']
TEST_FRAMES_FILE = ['scannet_frames_test.zip', '610MB']
LABEL_MAP_FILES = ['scannetv2-labels.combined.tsv', 'scannet-labels.combined.tsv']
RELEASES = ['v2/scans', 'v1/scans']
RELEASES_TASKS = ['v2/tasks', 'v1/tasks']
RELEASES_NAMES = ['v2', 'v1']
RELEASE = RELEASES[0]
RELEASE_TASKS = RELEASES_TASKS[0]
RELEASE_NAME = RELEASES_NAMES[0]
LABEL_MAP_FILE = LABEL_MAP_FILES[0]
RELEASE_SIZE = '1.2TB'
V1_IDX = 1


def get_release_scans(release_file):
    scan_lines = urllib.request.urlopen(release_file)
    scans = []
    for scan_line in scan_lines:
        scan_id = scan_line.decode('utf8').rstrip('\n')
        scans.append(scan_id)
    return scans


def download_release(release_scans, out_dir, file_types, use_v1_sens):
    if len(release_scans) == 0:
        return
    print('Downloading ScanNet ' + RELEASE_NAME + ' release to ' + out_dir + '...')
    for scan_id in release_scans:
        scan_out_dir = os.path.join(out_dir, scan_id)
        download_scan(scan_id, scan_out_dir, file_types, use_v1_sens)
    print('Downloaded ScanNet ' + RELEASE_NAME + ' release.')


def download_file(url, out_file):
    out_dir = os.path.dirname(out_file)
    if not os.path.isdir(out_dir):
        os.makedirs(out_dir)
    if not os.path.isfile(out_file):
        print('\t' + url + ' > ' + out_file)
        # download to a temp file first so interrupted downloads are not kept
        fh, out_file_tmp = tempfile.mkstemp(dir=out_dir)
        f = os.fdopen(fh, 'w')
        f.close()
        urllib.request.urlretrieve(url, out_file_tmp)
        os.rename(out_file_tmp, out_file)
    else:
        print('WARNING: skipping download of existing file ' + out_file)


def download_scan(scan_id, out_dir, file_types, use_v1_sens):
    print('Downloading ScanNet ' + RELEASE_NAME + ' scan ' + scan_id + ' ...')
    if not os.path.isdir(out_dir):
        os.makedirs(out_dir)
    for ft in file_types:
        v1_sens = use_v1_sens and ft == '.sens'
        url = BASE_URL + RELEASE + '/' + scan_id + '/' + scan_id + ft if not v1_sens else BASE_URL + RELEASES[V1_IDX] + '/' + scan_id + '/' + scan_id + ft
        out_file = out_dir + '/' + scan_id + ft
        download_file(url, out_file)
    print('Downloaded scan ' + scan_id)


def download_task_data(out_dir):
    print('Downloading ScanNet v1 task data...')
    files = [
        LABEL_MAP_FILES[V1_IDX], 'obj_classification/data.zip',
        'obj_classification/trained_models.zip', 'voxel_labeling/data.zip',
        'voxel_labeling/trained_models.zip'
    ]
    for file in files:
        url = BASE_URL + RELEASES_TASKS[V1_IDX] + '/' + file
        localpath = os.path.join(out_dir, file)
        localdir = os.path.dirname(localpath)
        if not os.path.isdir(localdir):
            os.makedirs(localdir)
        download_file(url, localpath)
    print('Downloaded task data.')


def download_label_map(out_dir):
    print('Downloading ScanNet ' + RELEASE_NAME + ' label mapping file...')
    files = [LABEL_MAP_FILE]
    for file in files:
        url = BASE_URL + RELEASE_TASKS + '/' + file
        localpath = os.path.join(out_dir, file)
        localdir = os.path.dirname(localpath)
        if not os.path.isdir(localdir):
            os.makedirs(localdir)
        download_file(url, localpath)
    print('Downloaded ScanNet ' + RELEASE_NAME + ' label mapping file.')


def main():
    parser = argparse.ArgumentParser(description='Downloads ScanNet public data release.')
    parser.add_argument('-o', '--out_dir', required=True, help='directory in which to download')
    parser.add_argument('--task_data', action='store_true', help='download task data (v1)')
    parser.add_argument('--label_map', action='store_true', help='download label map file')
    parser.add_argument('--v1', action='store_true', help='download ScanNet v1 instead of v2')
    parser.add_argument('--id', help='specific scan id to download')
    parser.add_argument('--preprocessed_frames', action='store_true', help='download preprocessed subset of ScanNet frames (' + PREPROCESSED_FRAMES_FILE[1] + ')')
    parser.add_argument('--test_frames_2d', action='store_true', help='download 2D test frames (' + TEST_FRAMES_FILE[1] + '; also included with whole dataset download)')
    parser.add_argument('--type', help='specific file type to download (' + ', '.join(FILETYPES) + ')')
    args = parser.parse_args()

    print('By pressing any key to continue you confirm that you have agreed to the ScanNet terms of use as described at:')
    print(TOS_URL)
    print('***')
    print('Press any key to continue, or CTRL-C to exit.')
    input('')
    if args.v1:
        global RELEASE
        global RELEASE_TASKS
        global RELEASE_NAME
        global LABEL_MAP_FILE
        RELEASE = RELEASES[V1_IDX]
        RELEASE_TASKS = RELEASES_TASKS[V1_IDX]
        RELEASE_NAME = RELEASES_NAMES[V1_IDX]
        LABEL_MAP_FILE = LABEL_MAP_FILES[V1_IDX]

    release_file = BASE_URL + RELEASE + '.txt'
    release_scans = get_release_scans(release_file)
    file_types = FILETYPES
    release_test_file = BASE_URL + RELEASE + '_test.txt'
    release_test_scans = get_release_scans(release_test_file)
    file_types_test = FILETYPES_TEST
    out_dir_scans = os.path.join(args.out_dir, 'scans')
    out_dir_test_scans = os.path.join(args.out_dir, 'scans_test')
    out_dir_tasks = os.path.join(args.out_dir, 'tasks')

    if args.type:  # download file type
        file_type = args.type
        if file_type not in FILETYPES:
            print('ERROR: Invalid file type: ' + file_type)
            return
        file_types = [file_type]
        if file_type in FILETYPES_TEST:
            file_types_test = [file_type]
        else:
            file_types_test = []
    if args.task_data:  # download task data
        download_task_data(out_dir_tasks)
    elif args.label_map:  # download label map file
        download_label_map(args.out_dir)
    elif args.preprocessed_frames:  # download preprocessed scannet_frames_25k.zip file
        if args.v1:
            print('ERROR: Preprocessed frames only available for ScanNet v2')
            return
        print('You are downloading the preprocessed subset of frames ' + PREPROCESSED_FRAMES_FILE[0] + ' which requires ' + PREPROCESSED_FRAMES_FILE[1] + ' of space.')
        download_file(BASE_URL + RELEASE_TASKS + '/' + PREPROCESSED_FRAMES_FILE[0], os.path.join(out_dir_tasks, PREPROCESSED_FRAMES_FILE[0]))
    elif args.test_frames_2d:  # download test scannet_frames_test.zip file
        if args.v1:
            print('ERROR: 2D test frames only available for ScanNet v2')
            return
        print('You are downloading the 2D test set ' + TEST_FRAMES_FILE[0] + ' which requires ' + TEST_FRAMES_FILE[1] + ' of space.')
        download_file(BASE_URL + RELEASE_TASKS + '/' + TEST_FRAMES_FILE[0], os.path.join(out_dir_tasks, TEST_FRAMES_FILE[0]))
    elif args.id:  # download single scan
        scan_id = args.id
        is_test_scan = scan_id in release_test_scans
        if scan_id not in release_scans and (not is_test_scan or args.v1):
            print('ERROR: Invalid scan id: ' + scan_id)
        else:
            out_dir = os.path.join(out_dir_scans, scan_id) if not is_test_scan else os.path.join(out_dir_test_scans, scan_id)
            scan_file_types = file_types if not is_test_scan else file_types_test
            use_v1_sens = not is_test_scan
            if not is_test_scan and not args.v1 and '.sens' in scan_file_types:
                print('Note: ScanNet v2 uses the same .sens files as ScanNet v1: Press \'n\' to exclude downloading .sens files for each scan')
                key = input('')
                if key.strip().lower() == 'n':
                    scan_file_types.remove('.sens')
            download_scan(scan_id, out_dir, scan_file_types, use_v1_sens)
    else:  # download entire release
        if len(file_types) == len(FILETYPES):
            print('WARNING: You are downloading the entire ScanNet ' + RELEASE_NAME + ' release which requires ' + RELEASE_SIZE + ' of space.')
        else:
            print('WARNING: You are downloading all ScanNet ' + RELEASE_NAME + ' scans of type ' + file_types[0])
        print('Note that existing scan directories will be skipped. Delete partially downloaded directories to re-download.')
        print('***')
        print('Press any key to continue, or CTRL-C to exit.')
        input('')
        if not args.v1 and '.sens' in file_types:
            print('Note: ScanNet v2 uses the same .sens files as ScanNet v1: Press \'n\' to exclude downloading .sens files for each scan')
            key = input('')
            if key.strip().lower() == 'n':
                file_types.remove('.sens')
        download_release(release_scans, out_dir_scans, file_types, use_v1_sens=True)
        if not args.v1:
            download_label_map(args.out_dir)
            download_release(release_test_scans, out_dir_test_scans, file_types_test, use_v1_sens=False)
            download_file(BASE_URL + RELEASE_TASKS + '/' + TEST_FRAMES_FILE[0], os.path.join(out_dir_tasks, TEST_FRAMES_FILE[0]))


if __name__ == "__main__":
    main()
Then run:
python download_scannet.py -o . --id scene00XX_YY
where XX is the scene ID and YY the sub-scene ID.
This project only needs the color images, depth maps, intrinsics, and pose data, so the ScanNet download can be simplified.
In ScanNet, the color images, depth maps, intrinsics, and poses are all packed into scene00XX_YY.sens, so that file can be downloaded on its own:
http://kaldir.vc.in.tum.de/scannet/v1/scans/scene00XX_YY/scene00XX_YY.sens
where XX is the scene ID and YY the sub-scene ID.
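For scripted downloads, the URL can be generated from the numeric IDs. The helper below is my own illustration, not part of ScanNet's tooling (note that full scene IDs are four digits, e.g. scene0000_00):

```python
BASE = "http://kaldir.vc.in.tum.de/scannet/v1/scans"

def sens_url(scene_id, sub_id):
    """Build the direct .sens download URL for sceneXXXX_YY."""
    name = "scene%04d_%02d" % (scene_id, sub_id)
    return "%s/%s/%s.sens" % (BASE, name, name)
```

For example, sens_url(0, 0) yields the download URL for scene0000_00.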
After downloading, the color images, depth maps, intrinsics, and poses still need to be extracted from scene00XX_YY.sens using the official reader code.
However, the official code was written for Python 2.7, and we are in the Python 3.x era, so it needs some changes. For convenience, the full adapted code is pasted below (tested with Python 3.7).
Reference: "Downloading the ScanNet dataset and exporting color images, depth maps, intrinsics, and poses" (Chinese blog post).
reader.py
import argparse
import os

from SensorData import SensorData

# params
parser = argparse.ArgumentParser()
# data paths
parser.add_argument('--filename', required=True, help='path to sens file to read')
parser.add_argument('--output_path', required=True, help='path to output folder')
parser.add_argument('--export_depth_images', dest='export_depth_images', action='store_true')
parser.add_argument('--export_color_images', dest='export_color_images', action='store_true')
parser.add_argument('--export_poses', dest='export_poses', action='store_true')
parser.add_argument('--export_intrinsics', dest='export_intrinsics', action='store_true')
parser.set_defaults(export_depth_images=False, export_color_images=False, export_poses=False, export_intrinsics=False)

opt = parser.parse_args()
print(opt)


def main():
    if not os.path.exists(opt.output_path):
        os.makedirs(opt.output_path)
    # load the data
    print('loading %s...' % opt.filename)
    sd = SensorData(opt.filename)
    print('loaded!')
    if opt.export_depth_images:
        sd.export_depth_images(os.path.join(opt.output_path, 'depth'))
    if opt.export_color_images:
        sd.export_color_images(os.path.join(opt.output_path, 'color'))
    if opt.export_poses:
        sd.export_poses(os.path.join(opt.output_path, 'pose'))
    if opt.export_intrinsics:
        sd.export_intrinsics(os.path.join(opt.output_path, 'intrinsic'))


if __name__ == '__main__':
    main()
SensorData.py
import os
import struct
import zlib

import cv2
import imageio
import numpy as np
import png

COMPRESSION_TYPE_COLOR = {-1: 'unknown', 0: 'raw', 1: 'png', 2: 'jpeg'}
COMPRESSION_TYPE_DEPTH = {-1: 'unknown', 0: 'raw_ushort', 1: 'zlib_ushort', 2: 'occi_ushort'}


class RGBDFrame:

    def load(self, file_handle):
        self.camera_to_world = np.asarray(struct.unpack('f' * 16, file_handle.read(16 * 4)), dtype=np.float32).reshape(4, 4)
        self.timestamp_color = struct.unpack('Q', file_handle.read(8))[0]
        self.timestamp_depth = struct.unpack('Q', file_handle.read(8))[0]
        self.color_size_bytes = struct.unpack('Q', file_handle.read(8))[0]
        self.depth_size_bytes = struct.unpack('Q', file_handle.read(8))[0]
        # python3: keep the compressed payloads as raw bytes
        self.color_data = file_handle.read(self.color_size_bytes)
        self.depth_data = file_handle.read(self.depth_size_bytes)

    def decompress_depth(self, compression_type):
        if compression_type == 'zlib_ushort':
            return self.decompress_depth_zlib()
        else:
            raise ValueError('unsupported depth compression: ' + compression_type)

    def decompress_depth_zlib(self):
        return zlib.decompress(self.depth_data)

    def decompress_color(self, compression_type):
        if compression_type == 'jpeg':
            return self.decompress_color_jpeg()
        else:
            raise ValueError('unsupported color compression: ' + compression_type)

    def decompress_color_jpeg(self):
        return imageio.imread(self.color_data)


class SensorData:

    def __init__(self, filename):
        self.version = 4
        self.load(filename)

    def load(self, filename):
        with open(filename, 'rb') as f:
            version = struct.unpack('I', f.read(4))[0]
            assert self.version == version
            strlen = struct.unpack('Q', f.read(8))[0]
            # python3: read the sensor name as bytes and decode it
            self.sensor_name = f.read(strlen).decode('utf-8', errors='replace')
            self.intrinsic_color = np.asarray(struct.unpack('f' * 16, f.read(16 * 4)), dtype=np.float32).reshape(4, 4)
            self.extrinsic_color = np.asarray(struct.unpack('f' * 16, f.read(16 * 4)), dtype=np.float32).reshape(4, 4)
            self.intrinsic_depth = np.asarray(struct.unpack('f' * 16, f.read(16 * 4)), dtype=np.float32).reshape(4, 4)
            self.extrinsic_depth = np.asarray(struct.unpack('f' * 16, f.read(16 * 4)), dtype=np.float32).reshape(4, 4)
            self.color_compression_type = COMPRESSION_TYPE_COLOR[struct.unpack('i', f.read(4))[0]]
            self.depth_compression_type = COMPRESSION_TYPE_DEPTH[struct.unpack('i', f.read(4))[0]]
            self.color_width = struct.unpack('I', f.read(4))[0]
            self.color_height = struct.unpack('I', f.read(4))[0]
            self.depth_width = struct.unpack('I', f.read(4))[0]
            self.depth_height = struct.unpack('I', f.read(4))[0]
            self.depth_shift = struct.unpack('f', f.read(4))[0]
            num_frames = struct.unpack('Q', f.read(8))[0]
            self.frames = []
            for i in range(num_frames):
                frame = RGBDFrame()
                frame.load(f)
                self.frames.append(frame)
            print('loaded', len(self.frames), 'frames')

    def export_depth_images(self, output_path, image_size=None, frame_skip=1):
        if not os.path.exists(output_path):
            os.makedirs(output_path)
        print('exporting', len(self.frames) // frame_skip, 'depth frames to', output_path)
        for f in range(0, len(self.frames), frame_skip):
            depth_data = self.frames[f].decompress_depth(self.depth_compression_type)
            # np.fromstring was removed in python3; frombuffer reads the raw bytes
            depth = np.frombuffer(depth_data, dtype=np.uint16).reshape(self.depth_height, self.depth_width)
            if image_size is not None:
                depth = cv2.resize(depth, (image_size[1], image_size[0]), interpolation=cv2.INTER_NEAREST)
            with open(os.path.join(output_path, str(f) + '.png'), 'wb') as png_file:  # write 16-bit png
                writer = png.Writer(width=depth.shape[1], height=depth.shape[0], bitdepth=16)
                writer.write(png_file, depth.reshape(-1, depth.shape[1]).tolist())

    def export_color_images(self, output_path, image_size=None, frame_skip=1):
        if not os.path.exists(output_path):
            os.makedirs(output_path)
        print('exporting', len(self.frames) // frame_skip, 'color frames to', output_path)
        for f in range(0, len(self.frames), frame_skip):
            color = self.frames[f].decompress_color(self.color_compression_type)
            if image_size is not None:
                color = cv2.resize(color, (image_size[1], image_size[0]), interpolation=cv2.INTER_NEAREST)
            imageio.imwrite(os.path.join(output_path, str(f) + '.jpg'), color)

    def save_mat_to_file(self, matrix, filename):
        with open(filename, 'w') as f:
            for line in matrix:
                np.savetxt(f, line[np.newaxis], fmt='%f')

    def export_poses(self, output_path, frame_skip=1):
        if not os.path.exists(output_path):
            os.makedirs(output_path)
        print('exporting', len(self.frames) // frame_skip, 'camera poses to', output_path)
        for f in range(0, len(self.frames), frame_skip):
            self.save_mat_to_file(self.frames[f].camera_to_world, os.path.join(output_path, str(f) + '.txt'))

    def export_intrinsics(self, output_path):
        if not os.path.exists(output_path):
            os.makedirs(output_path)
        print('exporting camera intrinsics to', output_path)
        self.save_mat_to_file(self.intrinsic_color, os.path.join(output_path, 'intrinsic_color.txt'))
        self.save_mat_to_file(self.extrinsic_color, os.path.join(output_path, 'extrinsic_color.txt'))
        self.save_mat_to_file(self.intrinsic_depth, os.path.join(output_path, 'intrinsic_depth.txt'))
        self.save_mat_to_file(self.extrinsic_depth, os.path.join(output_path, 'extrinsic_depth.txt'))
Place reader.py and SensorData.py in the same directory as scene00XX_YY.sens, then run:
python reader.py --filename scene0000_00.sens --output_path . --export_depth_images --export_color_images --export_poses --export_intrinsics
This produces the color images, depth maps, intrinsics, and pose data.
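Each exported pose file holds a 4x4 camera-to-world matrix as plain text, so it can be read back with NumPy. load_pose below is an illustrative helper, not part of the repo:

```python
import numpy as np

def load_pose(path):
    """Load a 4x4 camera-to-world pose exported by SensorData.export_poses."""
    # Each file contains 4 whitespace-separated rows of 4 floats.
    return np.loadtxt(path).reshape(4, 4)
```

For example, load_pose('pose/0.txt') returns the pose of the first frame; its last row should be (0, 0, 0, 1) for a valid rigid transform.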
3. Run
To run Point-SLAM, we recommend using Weights and Biases for logging.
Enable it by setting the wandb flag to True in the configs/point_slam.yaml file, and also make sure to specify the wandb_folder path. If you do not have a wandb account, create one first.
Each scene has its own config file, in which you need to specify the input_folder and output paths.
Below, we show an example run command for one scene from each dataset. If you use a batch system (e.g. SLURM), you may find our repro.sh script useful.
3.1 Replica
To run the room0 scene from Replica, execute:
python run.py configs/Replica/room0.yaml
After reconstruction, the trajectory error and mesh accuracy are evaluated, along with the rendering metrics.
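For intuition, the trajectory metric boils down to an RMSE over matched camera positions. The sketch below ignores the trajectory alignment that the repo's actual evaluation performs, so treat it as an illustration of the idea only:

```python
import numpy as np

def ate_rmse(gt_xyz, est_xyz):
    """Root-mean-square absolute trajectory error over matched positions.

    gt_xyz, est_xyz: (N, 3) arrays of ground-truth and estimated camera
    positions, assumed already time-matched and aligned.
    """
    gt = np.asarray(gt_xyz, dtype=float)
    est = np.asarray(est_xyz, dtype=float)
    return float(np.sqrt(np.mean(np.sum((gt - est) ** 2, axis=1))))
```

A perfect trajectory gives 0; a single position off by (3, 4, 0) gives an RMSE of 5.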
3.2 TUM-RGBD
To run the freiburg1_desk scene from TUM-RGBD, execute:
python run.py configs/TUM_RGBD/freiburg1_desk.yaml
After reconstruction, the trajectory error is evaluated automatically.
3.3 ScanNet
To run the scene0000_00 scene from ScanNet, execute:
python run.py configs/ScanNet/scene0000.yaml
After reconstruction, the trajectory error is evaluated automatically.
3.4 Testing and Development
If you want to build your own system on top of Point-SLAM, we provide a test facility to make sure any change to the codebase still produces the expected results.
The test_deterministic.py script runs the code on a limited set of frames and evaluates the map and trajectory against a reference to check that they are identical. This is useful, for example, when refactoring the code.