Voxels, CVPR 2019 (Part 2): DeepSDF: Learning Continuous Signed Distance Functions for Shape Representation

DeepSDF is a method that uses a neural network to learn a continuous signed distance function (SDF) for representing 3D shapes. By avoiding the discretization step of traditional SDFs, DeepSDF can produce high-quality shapes from partial and noisy data, and it can represent the variability of a whole class of shapes rather than a single instance. The model regresses SDF values directly with a deep feed-forward network while also learning a latent space of shapes, so that each target shape is described by a latent vector, which improves flexibility and generalization.

Notes on the paper "DeepSDF: Learning Continuous Signed Distance Functions for Shape Representation"


Paper: DeepSDF: Learning Continuous Signed Distance Functions for Shape Representation
Code: GitHub


       

Abstract

Computer graphics has proposed many ways to represent 3D geometry. This paper proposes DeepSDF, a learned continuous signed distance function (SDF), to represent 3D shapes; it can generate high-quality shapes from partial and noisy 3D input data.

The idea: the surface of a shape is represented by a continuous volumetric field. For each point in the field, the magnitude of its value is the distance to the surface, and the sign indicates whether the point lies outside (+) or inside (-) the surface. A traditional SDF, whether in analytic or discretized voxel form, usually represents only a single shape, whereas DeepSDF can represent a whole class of shapes. Taking a hand as an example: a classical SDF encodes one hand in one particular pose, while DeepSDF can represent all poses of the hand.
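As a standalone illustration (not from the paper), the analytic SDF of a sphere makes the sign and magnitude convention concrete; the helper name sphere_sdf is ours:

import numpy as np

def sphere_sdf(points, center=np.zeros(3), radius=1.0):
    # Signed distance to a sphere: negative inside, zero on the surface, positive outside.
    return np.linalg.norm(points - center, axis=-1) - radius

pts = np.array([[0.0, 0.0, 0.0],   # center  -> -1.0 (inside)
                [1.0, 0.0, 0.0],   # surface ->  0.0
                [2.0, 0.0, 0.0]])  # outside -> +1.0
print(sphere_sdf(pts))  # [-1.  0.  1.]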


       

1. Modeling SDFs with Neural Networks

This section fits an SDF with a neural network. An SDF maps a 3D point to a distance value whose sign indicates whether the point lies inside or outside the surface, and the surface itself is the zero level set, SDF = 0. The method trains a deep network to regress the continuous signed distance function directly from point coordinates, with no voxelization step. Discretized representations are constrained by convolution parameter counts and grid resolution, so DeepSDF is more flexible in comparison.

$$SDF(\boldsymbol{x}) = s, \quad \boldsymbol{x} \in \mathbb{R}^{3},\; s \in \mathbb{R}$$

The key idea is to use a deep neural network to regress the continuous SDF directly from point samples: the network outputs the SDF value at a given location, and the value is zero on the surface.

In principle, a deep feed-forward network can learn a fully continuous function to arbitrary accuracy, as illustrated in Fig. 3a:
[Fig. 3 from the paper: (a) a single-shape DeepSDF decoder, (b) a coded-shape DeepSDF decoder]

Given a target shape, pairs of a 3D point location $\boldsymbol{x}$ and its SDF value $s$ are collected into a set $X$:

$$X=\{(\boldsymbol{x}, s): SDF(\boldsymbol{x})=s\}$$

The parameters $\theta$ of a multi-layer fully-connected network $f_\theta$ are then trained on the sample set $X$ so that $f_\theta$ becomes a good approximator of the given SDF over the target domain $\Omega$:

$$f_{\theta}(\boldsymbol{x}) \approx SDF(\boldsymbol{x}), \quad \forall \boldsymbol{x} \in \Omega$$

Training minimizes a clamped L1 loss:

$$\mathcal{L}\left(f_{\theta}(\boldsymbol{x}), s\right)=\left|\operatorname{clamp}\left(f_{\theta}(\boldsymbol{x}), \delta\right)-\operatorname{clamp}(s, \delta)\right|, \quad \text{where } \operatorname{clamp}(x, \delta):=\min (\delta, \max (-\delta, x))$$
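A minimal PyTorch sketch of this clamped L1 loss; the function name clamped_l1 and the default delta=0.1 are our own illustrative choices, not fixed by the formula above:

import torch

def clamped_l1(pred_sdf, gt_sdf, delta=0.1):
    # clamp(x, delta) := min(delta, max(-delta, x)), applied to both prediction and target
    pred = torch.clamp(pred_sdf, -delta, delta)
    gt = torch.clamp(gt_sdf, -delta, delta)
    # L1 distance between the clamped values, averaged over the batch
    return torch.abs(pred - gt).mean()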


       

2. Learning the Latent Space of Shapes

This section learns a latent space of shapes, i.e. a code per shape. Training one network per shape is impractical, so the paper introduces a latent vector $\boldsymbol{z}$ that serves as the code of the target shape. As shown in Fig. 3b, the network now takes both the 3D point location $\boldsymbol{x}$ and the latent code $\boldsymbol{z}$ as input and, analogously to the single-shape case, is trained so that:

$$f_{\theta}\left(\boldsymbol{z}_{i}, \boldsymbol{x}\right) \approx SDF^{i}(\boldsymbol{x})$$
where $i$ indexes the $i$-th target shape.

The latent code $\boldsymbol{z}_{i}$ is obtained with an auto-decoder. Given $N$ shapes, each with $K$ sampled points, the data for shape $i$ is $X_{i}=\{(\boldsymbol{x}_{j}, s_{j}) : s_{j}=SDF^{i}(\boldsymbol{x}_{j})\}$. In probabilistic terms, the posterior over the code factors as:

$$p_{\theta}\left(\boldsymbol{z}_{i} \mid X_{i}\right)=p\left(\boldsymbol{z}_{i}\right) \prod_{\left(\boldsymbol{x}_{j}, s_{j}\right) \in X_{i}} p_{\theta}\left(s_{j} \mid \boldsymbol{z}_{i} ; \boldsymbol{x}_{j}\right)$$

$$p_{\theta}\left(s_{j} \mid \boldsymbol{z}_{i} ; \boldsymbol{x}_{j}\right)=\exp \left(-\mathcal{L}\left(f_{\theta}\left(\boldsymbol{z}_{i}, \boldsymbol{x}_{j}\right), s_{j}\right)\right)$$

During training, $\theta$ and the codes $\boldsymbol{z}_{i}$ are optimized jointly via:

$$\underset{\theta,\left\{\boldsymbol{z}_{i}\right\}_{i=1}^{N}}{\arg \min } \sum_{i=1}^{N}\left(\sum_{j=1}^{K} \mathcal{L}\left(f_{\theta}\left(\boldsymbol{z}_{i}, \boldsymbol{x}_{j}\right), s_{j}\right)+\frac{1}{\sigma^{2}}\left\|\boldsymbol{z}_{i}\right\|_{2}^{2}\right)$$
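A minimal sketch of this joint optimization in PyTorch, using the clamped_l1 helper sketched above; the shape count, sample count, network width, learning rates, and sigma are illustrative placeholders, not the paper's exact training recipe:

import torch

# Hypothetical setup: N shapes, K (point, sdf) samples per shape, latent size 256.
N, K, latent_size = 100, 16384, 256
lat_vecs = torch.nn.Embedding(N, latent_size)            # one trainable code z_i per shape
torch.nn.init.normal_(lat_vecs.weight, mean=0.0, std=0.01)

decoder = torch.nn.Sequential(                            # stand-in for f_theta; the paper's
    torch.nn.Linear(latent_size + 3, 512), torch.nn.ReLU(),  # Decoder class is listed below
    torch.nn.Linear(512, 512), torch.nn.ReLU(),
    torch.nn.Linear(512, 1), torch.nn.Tanh(),
)
optimizer = torch.optim.Adam(
    [{"params": decoder.parameters(), "lr": 5e-4},
     {"params": lat_vecs.parameters(), "lr": 1e-3}]
)
sigma = 1e2  # code regularization strength is 1 / sigma^2

def train_step(shape_idx, xyz, gt_sdf):
    # shape_idx: (B,) shape indices, xyz: (B, 3) sample points, gt_sdf: (B, 1) SDF values
    z = lat_vecs(shape_idx)                               # (B, latent_size)
    pred = decoder(torch.cat([z, xyz], dim=1))            # f_theta(z_i, x_j)
    loss = clamped_l1(pred, gt_sdf)                       # data term
    loss = loss + z.pow(2).sum(dim=1).mean() / sigma**2   # + ||z_i||^2 / sigma^2
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()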

At inference time, $\theta$ is fixed and the code $\boldsymbol{z}$ for an observed shape can be obtained by Maximum-a-Posteriori (MAP) estimation:

$$\hat{\boldsymbol{z}}=\underset{\boldsymbol{z}}{\arg \min } \sum_{\left(\boldsymbol{x}_{j}, s_{j}\right) \in X} \mathcal{L}\left(f_{\theta}\left(\boldsymbol{z}, \boldsymbol{x}_{j}\right), s_{j}\right)+\frac{1}{\sigma^{2}}\|\boldsymbol{z}\|_{2}^{2}$$
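In code, this MAP estimate amounts to gradient descent on the latent vector alone, with the decoder weights left untouched. A rough sketch (the step count, learning rate, and sigma are illustrative; clamped_l1 is the helper sketched earlier):

import torch

def infer_code(decoder, xyz, gt_sdf, latent_size=256, sigma=1e2, steps=800, lr=5e-3):
    # xyz: (M, 3) observed points, gt_sdf: (M, 1) their (possibly partial/noisy) SDF values.
    z = torch.zeros(1, latent_size, requires_grad=True)   # only z is optimized
    opt = torch.optim.Adam([z], lr=lr)                    # decoder weights stay fixed
    for _ in range(steps):
        pred = decoder(torch.cat([z.expand(xyz.shape[0], -1), xyz], dim=1))
        loss = clamped_l1(pred, gt_sdf) + z.pow(2).sum() / sigma**2
        opt.zero_grad()
        loss.backward()
        opt.step()
    return z.detach()

For reference, the decoder network from the official GitHub repository is reproduced below: a fully-connected network that takes the concatenated latent code and 3D point as input and outputs a single SDF value, with optional skip connections, normalization, and dropout.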

import torch.nn as nn
import torch
import torch.nn.functional as F


class Decoder(nn.Module):
    def __init__(
        self,
        latent_size,
        dims,
        dropout=None,
        dropout_prob=0.0,
        norm_layers=(),
        latent_in=(),
        weight_norm=False,
        xyz_in_all=None,
        use_tanh=False,
        latent_dropout=False,
    ):
        super(Decoder, self).__init__()

        def make_sequence():
            return []

        # Network input: latent code (latent_size) concatenated with a 3D point (3);
        # output: a single scalar SDF value.
        dims = [latent_size + 3] + dims + [1]

        self.num_layers = len(dims)
        self.norm_layers = norm_layers
        self.latent_in = latent_in
        self.latent_dropout = latent_dropout
        if self.latent_dropout:
            self.lat_dp = nn.Dropout(0.2)

        self.xyz_in_all = xyz_in_all
        self.weight_norm = weight_norm

        for layer in range(0, self.num_layers - 1):
            # If the next layer re-concatenates the original input (a latent_in skip
            # connection), shrink this layer's output so that the concatenated width
            # still matches dims[layer + 1].
            if layer + 1 in latent_in:
                out_dim = dims[layer + 1] - dims[0]
            else:
                out_dim = dims[layer + 1]
                if self.xyz_in_all and layer != self.num_layers - 2:
                    out_dim -= 3

            if weight_norm and layer in self.norm_layers:
                setattr(
                    self,
                    "lin" + str(layer),
                    nn.utils.weight_norm(nn.Linear(dims[layer], out_dim)),
                )
            else:
                setattr(self, "lin" + str(layer), nn.Linear(dims[layer], out_dim))

            if (
                (not weight_norm)
                and self.norm_layers is not None
                and layer in self.norm_layers
            ):
                setattr(self, "bn" + str(layer), nn.LayerNorm(out_dim))

        self.use_tanh = use_tanh
        if use_tanh:
            self.tanh = nn.Tanh()
        self.relu = nn.ReLU()

        self.dropout_prob = dropout_prob
        self.dropout = dropout
        self.th = nn.Tanh()

    # input: N x (L+3)
    def forward(self, input):
        xyz = input[:, -3:]

        if input.shape[1] > 3 and self.latent_dropout:
            latent_vecs = input[:, :-3]
            latent_vecs = F.dropout(latent_vecs, p=0.2, training=self.training)
            x = torch.cat([latent_vecs, xyz], 1)
        else:
            x = input

        for layer in range(0, self.num_layers - 1):
            lin = getattr(self, "lin" + str(layer))
            # At latent_in layers the full input (latent code + xyz) is concatenated back
            # in as a skip connection; with xyz_in_all, only the xyz coordinates are.
            if layer in self.latent_in:
                x = torch.cat([x, input], 1)
            elif layer != 0 and self.xyz_in_all:
                x = torch.cat([x, xyz], 1)
            x = lin(x)
            # last layer Tanh
            if layer == self.num_layers - 2 and self.use_tanh:
                x = self.tanh(x)
            if layer < self.num_layers - 2:
                if (
                    self.norm_layers is not None
                    and layer in self.norm_layers
                    and not self.weight_norm
                ):
                    bn = getattr(self, "bn" + str(layer))
                    x = bn(x)
                x = self.relu(x)
                if self.dropout is not None and layer in self.dropout:
                    x = F.dropout(x, p=self.dropout_prob, training=self.training)

        # Final Tanh keeps the predicted SDF within [-1, 1].
        if hasattr(self, "th"):
            x = self.th(x)

        return x
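
As a usage sketch, the decoder can be instantiated roughly the way the released training configs do; the specific hyper-parameters below (eight 512-wide layers, a skip connection at layer 4, weight normalization, dropout on all hidden layers) are typical values and may not match any particular config file exactly.

# Hypothetical instantiation, mirroring a typical DeepSDF config.
decoder = Decoder(
    latent_size=256,
    dims=[512] * 8,
    dropout=list(range(8)),
    dropout_prob=0.2,
    norm_layers=list(range(8)),
    latent_in=[4],
    weight_norm=True,
    xyz_in_all=False,
    use_tanh=False,
    latent_dropout=False,
)

latent = torch.randn(1, 256)              # a shape code z
query = torch.tensor([[0.1, 0.2, 0.3]])   # a 3D query point x
sdf_value = decoder(torch.cat([latent, query], dim=1))  # predicted SDF at x
print(sdf_value.shape)  # torch.Size([1, 1])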