Fast3R: An Open-Source 3D Reconstruction Project Tutorial

Latest recommended article published on 2025-04-02 10:02:04

虞熠蝶

Article link: https://blog.csdn.net/gitblog_00910/article/details/146934020

fast3r: [CVPR 2025] Fast3R: Towards 3D Reconstruction of 1000+ Images in One Forward Pass. Project repository: https://gitcode.com/gh_mirrors/fa/fast3r

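1. Project Introduction

Fast3R is an open-source 3D reconstruction project developed by the Facebook Research team. It can process more than 1,000 images in a single forward pass, which makes large-scale 3D reconstruction efficient. The project is built on the PyTorch deep learning framework and uses a Transformer-based neural network (the released checkpoint is named Fast3R_ViT_Large_512) to estimate camera poses and generate 3D point clouds.

2. Quick Start

The steps below get Fast3R running:

Clone the repository

First, clone the project to your local machine: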

git clone https://github.com/facebookresearch/fast3r.git
cd fast3r

Create a Conda environment

Next, create a Conda environment and activate it:

conda create -n fast3r python=3.11 cmake=3.14.0 -y
conda activate fast3r

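Install PyTorch and related dependencies

Install PyTorch and its dependencies. The command below targets CUDA 12.4, so adjust the versions to match your system: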

conda install pytorch torchvision torchaudio pytorch-cuda=12.4 nvidia/label/cuda-12.4.0::cuda-toolkit -c pytorch -c nvidia
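
Before moving on, it can help to confirm that the environment actually picked up a CUDA-enabled build. The following is a minimal sketch and not part of the Fast3R repository: `describe_torch_install` is a hypothetical helper name, and it relies only on `torch.__version__`, `torch.version.cuda`, and `torch.cuda.is_available()`. It also runs, and simply reports "not installed", in an environment without PyTorch.

```python
import importlib.util


def describe_torch_install():
    """Report whether PyTorch is importable and, if so, its version and CUDA status.

    Hypothetical helper for illustration; not part of Fast3R.
    """
    info = {"installed": False, "version": None, "cuda_build": None, "cuda_available": False}
    if importlib.util.find_spec("torch") is None:
        return info  # PyTorch is not installed in this environment

    import torch  # imported lazily so the script also runs without PyTorch

    info["installed"] = True
    info["version"] = torch.__version__
    info["cuda_build"] = torch.version.cuda  # None for CPU-only builds
    info["cuda_available"] = torch.cuda.is_available()
    return info


report = describe_torch_install()
for key, value in report.items():
    print(f"{key}: {value}")
```

If `cuda_build` is reported but `cuda_available` is False, the build supports CUDA but no usable GPU or driver was found, which is worth sorting out before compiling PyTorch3D.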

Install PyTorch3D

Install PyTorch3D from source:

# export MAX_JOBS=6  # uncomment this line if your machine is low on RAM (e.g. 16 GB) when compiling PyTorch3D
pip install "git+https://github.com/facebookresearch/pytorch3d.git@stable"

Install other requirements

Install the remaining dependencies the project needs:

pip install -r requirements.txt

Install Fast3R

Install Fast3R as an editable package so that it can be imported and used from your own projects:

pip install -e .
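
Once the install steps have finished, a quick way to see whether anything is still missing is to ask Python which top-level packages it can locate. This is only a sketch: `missing_packages` is a hypothetical helper, and the package names `torch`, `pytorch3d`, and `fast3r` are taken from the install commands and imports used in this tutorial. It checks that each package can be found, not that it works correctly.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that cannot be found in the current environment.

    Hypothetical helper for illustration; find_spec locates a package without importing it.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


# Package names assumed from the install steps in this tutorial.
required = ["torch", "pytorch3d", "fast3r"]
missing = missing_packages(required)

if missing:
    print("Not importable yet:", ", ".join(missing))
else:
    print("All packages found:", ", ".join(required))
```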

Run the demo

Run the demo with the following command:

python fast3r/viz/demo.py

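This automatically downloads the pretrained model weights and configuration, then launches a Gradio interface where you can upload images or a video and visualize the 3D reconstruction and the estimated camera poses.

3. Use Cases and Best Practices

Using Fast3R in your own project

To use Fast3R in a custom project, import the Fast3R class and use it like a regular PyTorch model. Here is a simple example: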

import torch
from fast3r.dust3r.utils.image import load_images
from fast3r.dust3r.inference_multiview import inference
from fast3r.models.fast3r import Fast3R
from fast3r.models.multiview_dust3r_module import MultiViewDUSt3RLitModule

# Load the pretrained model
model = Fast3R.from_pretrained("jedyang97/Fast3R_ViT_Large_512")

# Use CUDA if it is available, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)

# Create a lightweight Lightning module wrapper around the model
lit_module = MultiViewDUSt3RLitModule.load_for_inference(model)

# Put the model and the wrapper in evaluation mode
model.eval()
lit_module.eval()

# Load the images
filelist = ["path/to/image1.jpg", "path/to/image2.jpg", "path/to/image3.jpg"]
images = load_images(filelist, size=512, verbose=True)

# Run inference
output_dict, profiling_info = inference(images, model, device, dtype=torch.float32, verbose=True, profiling=True)

# Estimate the camera poses (camera-to-world) and focal lengths
poses_c2w_batch, estimated_focals = MultiViewDUSt3RLitModule.estimate_camera_poses(output_dict['preds'], niter_PnP=100, focal_length_estimation_method='first_view_from_global_head')

# Extract the point cloud predicted for each view
for view_idx, pred in enumerate(output_dict['preds']):
    point_cloud = pred['pts3d_in_other_view'].cpu().numpy()
    print(f"Point cloud shape for view {view_idx}: {point_cloud.shape}")
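
The loop above only prints array shapes. To inspect the result in a viewer such as MeshLab or CloudCompare, the per-view points can be merged and written to a PLY file. The sketch below is an assumption-laden illustration rather than anything Fast3R provides: `write_ascii_ply` is a hypothetical helper, the data is a small stand-in, and it assumes that each view's array can be flattened to rows of (x, y, z), for example with `point_cloud.reshape(-1, 3)`, and that all views share one coordinate frame.

```python
import tempfile
from pathlib import Path


def write_ascii_ply(path, points):
    """Write an iterable of (x, y, z) triples to an ASCII PLY file; return the point count.

    Hypothetical helper for illustration; not part of Fast3R.
    """
    rows = [(float(x), float(y), float(z)) for x, y, z in points]
    header = [
        "ply",
        "format ascii 1.0",
        f"element vertex {len(rows)}",
        "property float x",
        "property float y",
        "property float z",
        "end_header",
    ]
    body = [f"{x:.6f} {y:.6f} {z:.6f}" for x, y, z in rows]
    Path(path).write_text("\n".join(header + body) + "\n", encoding="ascii")
    return len(rows)


# Stand-in data: two "views" with two points each. With real model output, each view
# would be point_cloud.reshape(-1, 3) from the loop above (an assumed array layout).
views = [
    [(0.0, 0.0, 1.0), (0.1, 0.0, 1.0)],
    [(0.0, 0.1, 1.2), (0.1, 0.1, 1.2)],
]

# Merge all views into one list of points, assuming they share a coordinate frame.
merged = [point for view in views for point in view]

out_path = Path(tempfile.gettempdir()) / "fast3r_points_demo.ply"
n_written = write_ascii_ply(out_path, merged)
print(f"wrote {n_written} points to {out_path}")
```

ASCII PLY is easy to read and debug but grows quickly. For reconstructions with many views, a binary format or a dedicated point-cloud library would be a better fit.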