A Rough Summary and Understanding of SDF + NeRF

At its core, an SDF is a continuous scene representation, while the core of NeRF is volume rendering. This post is therefore a rough summary of how an SDF gets wired into volume rendering (i.e., SDF-to-density, "SDF2Density").

1. Introduction

  • Personal take: both a density field and an SDF are ways to represent geometry, but why do more and more works pick the latter as the representation? My intuition:

    • (1) A network-predicted density can hardly change abruptly, and neither can a network-predicted SDF; yet a continuous SDF can still yield a sharp surface, because extracting the iso-surface is itself an abrupt (thresholding) operation.

    • (2) The Eikonal loss on an SDF ties the points near the surface (a small cluster of points) together like dominoes; with only a density field there is no way to constrain how the geometry of neighboring points is correlated.

    • (3) “SDF is a more useful geometric representation that can support tasks such as tracing.” (from VoxFusion)

    • (4) When a ray hits a point, an SDF can very naturally set the SDF value of the empty space in between to the truncation value, whereas a density field may assign small but non-zero densities there, which leads to floaters in mid-air (mentioned in Mip-NeRF 360).

  • And a little family-tree figure of these methods: [figure]

2. MonoSDF, NeurIPS 2022

Paper: https://arxiv.org/pdf/2206.00665.pdf

Code: https://github.com/autonomousvision/monosdf

2.1 How it works

2.1.1 SDF to Density

https://zhuanlan.zhihu.com/p/599578339

MonoSDF directly follows "Volume Rendering of Neural Implicit Surfaces" (VolSDF, NeurIPS 2021): the SDF value is converted to a density σ, which is then volume-rendered ($\beta$ is a learnable parameter); the conversion uses the CDF of the Laplace distribution.

(The Zhihu post argues that the CDF of the Laplace distribution has properties that suit an SDF very well; take a look at its curve.) $\beta$ is the scale (spread) parameter of that distribution; I wonder whether it could be read as a kind of confidence.

$$\sigma_\beta(s) = \begin{cases} \dfrac{1}{\beta}\Big(1 - \dfrac{1}{2} e^{\,s/\beta}\Big), & s \le 0 \\[2mm] \dfrac{1}{2\beta}\, e^{-s/\beta}, & s > 0 \end{cases}$$

(Here $s$ is the SDF value: high density inside the surface, density decaying towards zero outside; this matches the LaplaceDensity code in Section 2.2.)

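A quick sketch (my own plotting code, not from the MonoSDF repo) to visualize this density for a few values of β; it uses the same math as the LaplaceDensity class shown in Section 2.2:

import torch
import matplotlib.pyplot as plt

def laplace_density(sdf, beta):
    # 1/beta * (0.5 + 0.5 * sign(s) * (exp(-|s|/beta) - 1)), i.e. the LaplaceDensity formula
    return (1.0 / beta) * (0.5 + 0.5 * sdf.sign() * torch.expm1(-sdf.abs() / beta))

sdf = torch.linspace(-1.0, 1.0, 500)
for beta in [0.5, 0.1, 0.02]:
    plt.plot(sdf, laplace_density(sdf, beta), label=f'beta={beta}')
plt.xlabel('SDF value'); plt.ylabel('density'); plt.legend(); plt.show()

The smaller β is, the more the density concentrates into a sharp step at the surface.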

2.1.2 Volume rendering

  • NeRF's original volume rendering (everything below is per ray; a small runnable sketch follows after this list):

    In SwiftMapping, the values of $\sigma$ roughly span (0, 200), with a mean around 35.

    $$dists = [\Delta d_1, \Delta d_2, \dots, \Delta d_n, 10^{10}]$$
    $$\alpha = \big[\,1 - e^{-\Delta d_1 \cdot \mathrm{ReLU}(\sigma_1)},\ 1 - e^{-\Delta d_2 \cdot \mathrm{ReLU}(\sigma_2)},\ \dots,\ 1\,\big]$$
    $$weights_i = \Big(\prod_{k=0}^{i-1} (1-\alpha_k)\Big)\, \alpha_i$$
    $$depth = weights \cdot z\_vals$$

  • MonoSDF's volume rendering:

$$\mathrm{LaplaceDensity}:\quad \sigma_\beta(s) = \begin{cases} \dfrac{1}{\beta}\Big(1 - \dfrac{1}{2} e^{\,s/\beta}\Big), & s \le 0 \\[2mm] \dfrac{1}{2\beta}\, e^{-s/\beta}, & s > 0 \end{cases}$$
$$\sigma = \mathrm{LaplaceDensity}(SDF)$$
$$dists = [\Delta d_1, \Delta d_2, \dots, \Delta d_n, 10^{10}]$$
$$FreeEnergy = [\Delta d_1 \sigma_1,\ \Delta d_2 \sigma_2,\ \dots,\ \Delta d_n \sigma_n,\ 10^{10}\,\sigma_{n+1}]$$
$$ShiftFreeEnergy = [0,\ \Delta d_1 \sigma_1,\ \Delta d_2 \sigma_2,\ \dots,\ \Delta d_n \sigma_n]$$
$$\alpha = \big[\,1 - e^{-\Delta d_1 \sigma_1},\ 1 - e^{-\Delta d_2 \sigma_2},\ \dots,\ 1\,\big]$$
$$Transmittance_i = \exp\Big(-\sum_{k=0}^{i-1} ShiftFreeEnergy_k\Big)$$
$$weights = Transmittance \cdot \alpha$$
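A minimal PyTorch sketch of the vanilla NeRF compositing above for a single toy ray (my own code; variable names follow the formulas rather than any particular repo):

import torch

# toy ray: 64 samples along z, with a slab of high density around depth 1.0
z_vals = torch.linspace(0.0, 2.0, 64)
sigma = torch.where((z_vals > 1.0) & (z_vals < 1.2), torch.tensor(50.0), torch.tensor(0.0))

dists = torch.cat([z_vals[1:] - z_vals[:-1], torch.tensor([1e10])])         # last interval treated as infinite
alpha = 1.0 - torch.exp(-dists * torch.relu(sigma))                         # per-sample opacity
trans = torch.cumprod(torch.cat([torch.ones(1), 1.0 - alpha[:-1]]), dim=0)  # T_i = prod_{k<i}(1 - alpha_k)
weights = trans * alpha
depth = (weights * z_vals).sum()                                            # close to the slab at ~1.0
print(depth)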

2.2 Code

There is one learnable parameter, beta. Its value differs from scene to scene, and even different sampling-range settings lead to markedly different beta values, which feels quite non-robust.

class Density(nn.Module):
    def __init__(self, params_init={}):
        super().__init__()
        for p in params_init:
            param = nn.Parameter(torch.tensor(params_init[p]))
            setattr(self, p, param)

    def forward(self, sdf, beta=None):
        return self.density_func(sdf, beta=beta)

class LaplaceDensity(Density):  # alpha * Laplace(loc=0, scale=beta).cdf(-sdf)

    def __init__(self, params_init={}, beta_min=0.0001):
        super().__init__(params_init=params_init)
        self.beta_min = torch.tensor(beta_min).cuda()

    def density_func(self, sdf, beta=None):
        if beta is None:
            beta = self.get_beta()

        alpha = 1 / beta
        return alpha * (0.5 + 0.5 * sdf.sign() * torch.expm1(-sdf.abs() / beta))

    def get_beta(self):
        beta = self.beta.abs() + self.beta_min
        return beta

def volume_rendering(self, z_vals, sdf):

    density_flat = self.density(sdf)
    density = density_flat.reshape(-1, z_vals.shape[1])  # (batch_size * num_pixels) x N_samples
    dists = z_vals[:, 1:] - z_vals[:, :-1]
    dists = torch.cat([dists, torch.tensor([1e10]).cuda().unsqueeze(0).repeat(dists.shape[0], 1)], -1)
    # LOG SPACE
    free_energy = dists * density
    shifted_free_energy = torch.cat([torch.zeros(dists.shape[0], 1).cuda(), free_energy[:, :-1]], dim=-1)  # shift one step
    alpha = 1 - torch.exp(-free_energy)  # probability of it is not empty here
    transmittance = torch.exp(-torch.cumsum(shifted_free_energy, dim=-1))  # probability of everything is empty up to now
    weights = alpha * transmittance # probability of the ray hits something here
    return weights

# -----------------------------------------------

output = self.forward(x)
sdf = output[:,:1]
sdf, feature_vectors, gradients = self.implicit_network.get_outputs(points_flat)
weights = self.volume_rendering(z_vals, sdf)
depth_values = torch.sum(weights * z_vals, 1, keepdims=True) / (weights.sum(dim=1, keepdims=True) +1e-8)
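A toy end-to-end run of the pipeline above on a single ray with a wall at depth 1.0 (a sketch under my own assumptions: CPU tensors, β = 0.05, the LaplaceDensity math inlined instead of the class with its .cuda() calls; the rendered depth should come out close to 1.0):

import torch

beta = torch.tensor(0.05)
z_vals = torch.linspace(0.0, 2.0, 64).unsqueeze(0)      # one ray, 64 samples
sdf = 1.0 - z_vals                                       # toy SDF of a wall at depth 1.0 (positive in front)

# same math as LaplaceDensity.density_func above
density = (1.0 / beta) * (0.5 + 0.5 * sdf.sign() * torch.expm1(-sdf.abs() / beta))

dists = z_vals[:, 1:] - z_vals[:, :-1]
dists = torch.cat([dists, torch.full((1, 1), 1e10)], dim=-1)
free_energy = dists * density
shifted = torch.cat([torch.zeros(1, 1), free_energy[:, :-1]], dim=-1)
alpha = 1.0 - torch.exp(-free_energy)
transmittance = torch.exp(-torch.cumsum(shifted, dim=-1))
weights = alpha * transmittance

depth = torch.sum(weights * z_vals, 1, keepdim=True) / (weights.sum(dim=1, keepdim=True) + 1e-8)
print(depth)   # should be close to 1.0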

3. ⭐ NeuS, NeurIPS 2021

Paper: https://arxiv.org/abs/2106.10689

Code: https://github.com/Totoro97/NeuS

GO-SLAM also borrows this directly from NeuS, but NeuS samples inside a (-1, 1) bound, so the NeuS version is perhaps a bit less useful as a reference than GO-SLAM's.

3.1 How it works

The paper gives a pile of formulas and derivations; without digging into the theory, for practical use the following two formulas are all you need:

  • Formula 1:

    $$\alpha_i = 1 - \exp\left(-\int_{t_i}^{t_{i+1}} \max\left(\frac{-\big(\nabla f(p(t)) \cdot v\big)\, \phi_s\big(f(p(t))\big)}{\Phi_s\big(f(p(t))\big)},\ 0\right) dt\right)$$

  • Formula 2:

$$f:\ \text{the SDF network, so } f(p(t)) \text{ is the SDF value at a sample}$$
$$\text{CDF: } \Phi_s(x) = \mathrm{Sigmoid}(sx) = \frac{1}{1 + e^{-sx}}, \qquad \text{PDF: } \phi_s(x) = \frac{d\,\Phi_s(x)}{dx}$$
$$\alpha_i = \max\left(\frac{\Phi_s\big(f(p(t_i))\big) - \Phi_s\big(f(p(t_{i+1}))\big)}{\Phi_s\big(f(p(t_i))\big)},\ 0\right)$$

  • STEP 1:

    First sample 64 points uniformly/randomly, then iteratively run importance sampling based on the S-density of those points: four rounds, 16 extra points per round. (If there is a background, another 32 points are sampled following the NeRF++ strategy.) This gives z_vals.

  • STEP 2:

    Take the midpoints of the z_vals above to get mid_z_vals, compute the sample points pts from mid_z_vals and the ray origin, and feed those points to the SDF network to get the SDF values. (sample mid-point: mid_z_vals → pts)

    • gradients: the derivative of the SDF value output by the SDF network with respect to the input point.
  • STEP 3:

    From the SDF at those midpoints, together with the computable $\frac{\Delta SDF}{\Delta t}$, estimate the SDF at both ends of each interval; plugging these into the Sigmoid then gives $\alpha$.

    • inv_s: despite the name, this is the learnable sharpness $s$ (whose reciprocal $\frac{1}{s}$ is the standard deviation of the logistic density); it is parameterized as $e^{10\cdot variance}$:

      # renderer.py | NeuSRenderer.render_core()
      self.register_parameter('variance', nn.Parameter(torch.tensor(init_val)))
      inv_s = torch.ones([len(x), 1]) * torch.exp(self.variance * 10.0)
      
    • true_cos = (dirs * gradients).sum(-1, keepdim=True)

    • The true $\frac{\Delta SDF}{\Delta t}$ should be true_cos; mathematically, iter_cos feels like true_cos passed through something resembling an activation function.

      $$A := true\_cos,\qquad B := cos\_anneal\_ratio$$
      $$true\_cos = dirs \cdot gradients$$
      $$iter\_cos = \begin{cases} \dfrac{AB + (A + B - 1)}{2}, & A < 0 \\[2mm] \dfrac{(A + B - 1) - AB}{2}, & 0 \le A < 1 \\[2mm] 0, & A \ge 1 \end{cases}$$

      Their code comment puts it this way (an annealing schedule): "'cos_anneal_ratio' grows from 0 to 1 in the beginning training iterations."

      So the formula above specializes to:

$$B = 0:\qquad iter\_cos = \begin{cases} \dfrac{A-1}{2}, & A < 0 \\[2mm] \dfrac{A-1}{2}, & 0 \le A < 1 \\[2mm] 0, & A \ge 1 \end{cases}$$
$$B = 1:\qquad iter\_cos = \begin{cases} A, & A < 0 \\ 0, & 0 \le A < 1 \\ 0, & A \ge 1 \end{cases}$$


For the figure I only plot the latter part, i.e., fix the SDF values and plot the relationship between alpha and the SDF:

true_cos = (dirs * gradients).sum(-1, keepdim=True)

iter_cos = -(F.relu(-true_cos * 0.5 + 0.5) * (1.0 - cos_anneal_ratio) +
             F.relu(-true_cos) * cos_anneal_ratio)  # always non positive

estimated_next_sdf = sdf + iter_cos * dists.reshape(-1, 1) * 0.5
estimated_prev_sdf = sdf - iter_cos * dists.reshape(-1, 1) * 0.5

prev_cdf = torch.sigmoid(estimated_prev_sdf * inv_s) # \Phi_s(f(p(t_{i})))
next_cdf = torch.sigmoid(estimated_next_sdf * inv_s) # \Phi_s(f(p(t_{i+1})))

p = prev_cdf - next_cdf
c = prev_cdf
alpha = ((p + 1e-5) / (c + 1e-5)).reshape(batch_size, n_samples).clip(0.0, 1.0)

[figure: alpha as a function of the SDF]
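For completeness: render_core then turns alpha into NeRF-style weights. Continuing the snippet above (alpha and batch_size as defined there; sampled_color is assumed to come from the color network), roughly:

# T_i = prod_{k<i}(1 - alpha_k); the weights are then used to composite color / normal / depth
weights = alpha * torch.cumprod(
    torch.cat([torch.ones([batch_size, 1]), 1.0 - alpha + 1e-7], dim=-1), dim=-1
)[:, :-1]
color = (sampled_color * weights[:, :, None]).sum(dim=1)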

3.1.1 Unbiased

import torch
import numpy as np
import matplotlib.pyplot as plt


sdf = torch.linspace(-5, 5, 100).flip(0)

def Phi_s(sdf, s):
    return 1 / (1+torch.exp(-sdf*s))

def phi_s(sdf, s):
    return s*torch.exp(-s*sdf)/(1+torch.exp(-s*sdf))**2


alpha = torch.clip((Phi_s(sdf[:-1], 1) - Phi_s(sdf[1:], 1)) / (Phi_s(sdf[:-1], 1)), 0, 100)

transmitance = torch.cumprod(1-alpha[:-1], dim=0)
transmitance = torch.cat([torch.ones(1), transmitance])
weight = alpha * transmitance


plt.title('un bias')
plt.plot(sdf[1:].detach().cpu().numpy(), weight.detach().cpu().numpy())

[figure: the resulting weight curve]

3.1.2 Occlusion Aware

import torch
import numpy as np
import matplotlib.pyplot as plt


sdf = torch.linspace(-5, 5, 50).flip(0)
sdf = torch.concat([sdf, sdf], dim=0)

def Phi_s(sdf, s):
    return 1 / (1+torch.exp(-sdf*s))

def phi_s(sdf, s):
    return s*torch.exp(-s*sdf)/(1+torch.exp(-s*sdf))**2


alpha = torch.clip((Phi_s(sdf[:-1], 1) - Phi_s(sdf[1:], 1)) / (Phi_s(sdf[:-1], 1)), 0, 100)

transmitance = torch.cumprod(1-alpha[:-1], dim=0)
transmitance = torch.cat([torch.ones(1), transmitance])
weight = alpha * transmitance


plt.title('occlusion aware')
plt.plot(weight.detach().cpu().numpy())

[figure: the resulting weight curve, with two surface crossings]

3.2 Code

There is too much code to paste here; mainly look at the two functions NeuSRenderer.render and NeuSRenderer.render_core in models/renderer.py.

This part of the code is fairly involved and hard to digest by just reading, so here is a structure diagram of the code: [figure: code structure diagram]

4. GO-SLAM, ICCV 2023

Paper: https://arxiv.org/abs/2309.02436

Code: https://github.com/youmi-zym/GO-SLAM

In GO-SLAM's own words, this part is borrowed directly from NeuS, but honestly I find its code much more concise and elegant.

4.1 How it works

NeuS's SDF network above is fairly heavy: it takes a 3D point $x$ and outputs the SDF value $\Phi(x)$, from which the derivative of the SDF with respect to the point can be computed for NeuS's SDF-to-alpha conversion and volume rendering. GO-SLAM wants to train the SDF in real time while still being able to query this gradient, so it uses a shallow (single-layer) MLP that takes the 3D point coordinates and a hash feature vector and outputs the SDF value:
$$\Phi(x),\, g = f_{\Theta_{sdf}}\big(x,\, h_{\Theta_{hash}}(x)\big)$$
With the SDF and its gradient in hand, the rest of the pipeline is identical to NeuS.
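A minimal sketch of this idea (my own toy code: the hash encoding $h_{\Theta_{hash}}$ is replaced by a random Fourier-feature stand-in, and the gradient $g$ is taken with autograd):

import torch
import torch.nn as nn

class TinySDF(nn.Module):
    def __init__(self, feat_dim=32):
        super().__init__()
        # stand-in for the multiresolution hash encoding h(x)
        self.register_buffer('B', torch.randn(3, feat_dim // 2) * 4.0)
        self.decoder = nn.Linear(3 + feat_dim, 1)   # shallow decoder f(x, h(x))

    def forward(self, x):
        feat = torch.cat([torch.sin(x @ self.B), torch.cos(x @ self.B)], dim=-1)
        return self.decoder(torch.cat([x, feat], dim=-1))

model = TinySDF()
pts = torch.rand(1024, 3, requires_grad=True)
sdf = model(pts)
# gradient of the SDF w.r.t. the 3D points (used for normals / Eikonal / NeuS alpha)
grad = torch.autograd.grad(sdf, pts, grad_outputs=torch.ones_like(sdf), create_graph=True)[0]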

4.2 Code

NeuS only renders out color; GO-SLAM's code feels very clean. The hard part is still computing the derivative of the SDF with respect to the 3D point: a network can simply be differentiated, but for a TSDF-Fusion-style grid there seems to be no good way.


5. ⭐ Vox-Fusion, ISMAR 2022

Paper: https://arxiv.org/abs/2210.15858

Code: https://github.com/zju3dv/Vox-Fusion

H2-Mapping's SDF-to-density is taken from this work, and Vox-Fusion in turn takes it from "Neural RGB-D Surface Reconstruction" (CVPR 2022). However, because the computation in "Neural RGB-D Surface Reconstruction" involves gradients of global coordinates, Vox-Fusion adapts it; in the paper's words: "However, we modify it to apply to feature embeddings instead of global coordinates."

5.1 How it works

$$tr:\ \text{truncation distance}$$
$$w_i = \mathrm{Sigmoid}\Big(\frac{s_i}{tr}\Big)\cdot \mathrm{Sigmoid}\Big(-\frac{s_i}{tr}\Big)$$


Then the depth is simply a weighted sum of z_vals with these weights; it is almost touchingly simple. Incidentally, making tr itself a learnable parameter feels most comfortable to me (somewhat like the variance s in NeuS).

Emmm, and this is the one I find handiest to use in practice.
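A quick sketch (my own) to see the bell shape of this weight around the surface, assuming tr = 0.1:

import torch
import matplotlib.pyplot as plt

tr = 0.1
s = torch.linspace(-0.5, 0.5, 500)                    # SDF values along a ray crossing the surface
w = torch.sigmoid(s / tr) * torch.sigmoid(-s / tr)    # peaks at s = 0 (the surface), value 0.25
plt.plot(s, w); plt.xlabel('SDF value'); plt.ylabel('weight'); plt.show()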

5.2 Code

Not much to say here; it is just simple.

def sdf2weights(sdf_in, z_vals, trunc, valid_mask=1.0):
    # note: z_vals and valid_mask are passed in explicitly here; in Vox-Fusion they come from the enclosing render function
    weights = torch.sigmoid(sdf_in / trunc) * torch.sigmoid(-sdf_in / trunc)

    # locate the first sign change (zero-crossing) of the SDF along each ray
    signs = sdf_in[:, 1:] * sdf_in[:, :-1]
    mask = torch.where(
            signs < 0.0, torch.ones_like(signs), torch.zeros_like(signs)
        )
    inds = torch.argmax(mask, axis=1)
    inds = inds[..., None]
    z_min = torch.gather(z_vals, 1, inds)
    # keep only samples in front of (and within one truncation band behind) the first surface hit
    mask = torch.where(
            z_vals < z_min + trunc,
            torch.ones_like(z_vals),
            torch.zeros_like(z_vals),
        )
    weights = weights * mask * valid_mask
    # weights = weights * valid_mask
    return weights / (torch.sum(weights, dim=-1, keepdims=True) + 1e-8), z_min

# -----------------------------------------------

weights, z_min = sdf2weights(sdf, z_vals, truncation)
depth = torch.sum(weights * z_vals, dim=-1)

6. PermutoSDF, CVPR 2023

Paper: https://arxiv.org/abs/2211.12562

Code: https://github.com/RaduAlexandru/permuto_sdf

Nice, this one also borrows from NeuS:

NeuS shows that, in order to learn an SDF of the scene, it is crucial to derive an appropriate weighting function based on the SDF. It proposes an unbiased and occlusion-aware weighting function based on an opaque density function $\rho(t)$:
$$\rho(t) = \max\left(\frac{-\frac{d}{dt}\psi_s\big(g(p(t);\Phi_g)\big)}{\psi_s\big(g(p(t);\Phi_g)\big)},\ 0\right)$$

7. Co-SLAM, CVPR 2023

Paper: https://arxiv.org/pdf/2304.14377.pdf

Code: https://github.com/HengyiWang/Co-SLAM

Hahaha, Co-SLAM presumably also lifts the Vox-Fusion recipe directly, so I will slack off here.

From the Co-SLAM paper:

“A conversion function is needed to convert predicted SDF $s_i$ to weight $w_i$. Instead of adopting the rendering equations proposed in NeuS, we follow the simple bell-shaped model of Vox-Fusion / Neural RGB-D and compute weights $w_i$ directly by multiplying the two Sigmoid functions $\sigma(\cdot)$.”
$$w_i = \sigma\Big(\frac{s_i}{tr}\Big)\,\sigma\Big(-\frac{s_i}{tr}\Big)$$

8. Loss functions for SDF-based NeRF

8.1 ⭐Eikonal Loss (from Implicit Geometric Regularization)

  • Reference:
    • https://arxiv.org/pdf/2002.10099.pdf
    • https://zhuanlan.zhihu.com/p/653754755

8.1.1 Principle

This paper appeared at ICML 2020; its main contribution is a single loss function, but that loss has become a household name in the SDF world (514 citations at the time of writing). The paper feels a bit long; I highly recommend this video instead: https://www.youtube.com/watch?v=6cOvBGBQF9g.

The principle really fits in one sentence: for any point in space, the SDF is defined as the shortest distance to the surface, so if the point moves by $(\Delta x, \Delta y, \Delta z)$ along the steepest direction (the gradient from backpropagation is exactly the network's steepest direction with respect to its input), the SDF should change by $\sqrt{\Delta x^2+\Delta y^2+\Delta z^2}$, i.e. $\left|\frac{\Delta SDF}{\Delta p}\right| \rightarrow 1$:
$$L_{Eikonal} = \frac{1}{nm} \sum_{k,i} \big(\|\nabla f(\hat{p}_{k,i})\|_2 - 1\big)^2$$

8.1.2 Code

There is plenty of code that computes this loss; here I use the NeuS version, since that is the codebase I know best.

sdf_nn_output = sdf_network(pts) 
gradients = sdf_network.gradient(pts).squeeze()
gradient_error = (torch.linalg.norm(gradients.reshape(batch_size, n_samples, 3), ord=2, dim=-1) - 1.0) ** 2
gradient_error = (relax_inside_sphere * gradient_error).sum() / (relax_inside_sphere.sum() + 1e-5)
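The sdf_network.gradient(pts) call above hides the actual differentiation; a sketch of how such a helper is typically written with autograd (my own reconstruction; NeuS's version additionally unsqueezes a dimension, which is why the caller does .squeeze()):

import torch

def sdf_gradient(sdf_fn, x):
    # gradient of the SDF w.r.t. the query points x (shape (N, 3)), via autograd
    x = x.requires_grad_(True)
    y = sdf_fn(x)
    return torch.autograd.grad(
        outputs=y, inputs=x,
        grad_outputs=torch.ones_like(y),
        create_graph=True, retain_graph=True, only_inputs=True)[0]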

8.2 Free-Space Loss (from VoxFusion, CO-SLAM, H2Mapping)

  • Reference:
    • https://arxiv.org/abs/2210.15858
    • https://arxiv.org/pdf/2304.14377.pdf

8.2.1 Principle

Vox-Fusion's explanation is hard to follow; Co-SLAM states it simply and clearly. For sample points far from the surface, i.e. points with $(D[u,v] - d) > tr$, the free-space loss forces the predicted (T)SDF value at those locations to equal the truncation value:
$$L_{fs} = \frac{1}{|P|} \sum_{p\in P} \frac{1}{|S_p^{fs}|}\sum_{s \in S_p^{fs}} (D_s - tr)^2$$

8.2.2 Code

For code, here is the Vox-Fusion version.

# criterion.py | line 54
fs_loss, sdf_loss = self.get_sdf_loss(
                z_vals, gt_depth, pred_sdf,
                truncation=self.truncation,
                loss_type='l2'
            )


def get_sdf_loss(self, z_vals, depth, predicted_sdf, truncation, loss_type="l2"):

    front_mask, sdf_mask, fs_weight, sdf_weight = self.get_masks(
        z_vals, depth.unsqueeze(-1).expand(*z_vals.shape), truncation
    )
    '''
    front_mask = torch.where(
            z_vals < (depth - epsilon),
            torch.ones_like(z_vals),
            torch.zeros_like(z_vals),
        )
    '''
    x = predicted_sdf * front_mask
    y = torch.ones_like(predicted_sdf) * front_mask
    fs_loss = torch.mean(torch.square(x - y)[front_mask.bool()])  # L2 over the free-space samples only

8.3 Curvature Loss (from PermutoSDF, Neuralangelo)

  • Reference:
    • https://arxiv.org/abs/2211.12562
    • https://research.nvidia.com/labs/dir/neuralangelo/paper.pdf

8.3.1 Principle

Both PermutoSDF and Neuralangelo include a curvature loss meant to make the reconstructed surface smoother; the formulas differ, and the actual code differs completely.

  • PermutoSDF: to recover smoother surfaces in highly reflective and textureless regions, a "curvature loss" is added on the SDF. The computation goes like this: from the SDF one obtains the normal $n$ (and hence the tangent plane); crossing this normal with a random unit vector gives a new direction $\eta$; the point is moved a small step along $\eta$ and the normal is recomputed at the new location; the two normals are then encouraged to be as parallel as possible:

$$L_{curv} = \sum_x \big(n\cdot n_\epsilon - 1\big)^2$$

  • Neuralangelo: to make the SDF smoother, a regularization term is added; the mean curvature is computed with a discrete Laplacian (similar to how the surface normal is computed; the paper does not elaborate):
    $$L_{curv} = \frac{1}{N}\sum_{i=1}^N \big|\nabla^2 f(x_i)\big|$$

8.3.2 Code

  • PermutoSDF: bless them, it really does compute the curvature loss exactly by that little procedure. Honestly this part does not feel strictly necessary, since one could also just smooth the extracted mesh directly.

    # train_permuto_sdf.py
    sdf_shifted, sdf_curvature=model_sdf.get_sdf_and_curvature_1d_precomputed_gradient_normal_based( fg_ray_samples_packed.samples_pos, sdf_gradients, iter_nr_for_anneal)
    
    loss_curvature=sdf_curvature.mean() 
    
    # 
    def get_sdf_and_curvature_1d_precomputed_gradient_normal_based(self, points, sdf_gradients, iter_nr):
        #get the curvature along a certain random direction for each point
        #does it by computing the normal at a shifted point on the tangent plane and then computing a dot product
        
        #to the original positions, add also a tiny epsilon 
        nr_points_original=points.shape[0]
        epsilon=1e-4
        rand_directions=torch.randn_like(points)
        rand_directions=F.normalize(rand_directions,dim=-1)
        
        #instead of random direction we take the normals at these points, and calculate a random vector that is orthogonal 
        normals=F.normalize(sdf_gradients,dim=-1)
        # normals=normals.detach()
        tangent=torch.cross(normals, rand_directions)
        rand_directions=tangent #set the random moving direction to be the tangent direction now
     
        points_shifted=points.clone()+rand_directions*epsilon
            
        #get the gradient at the shifted point
        sdf_shifted, sdf_gradients_shifted, feat_shifted=self.get_sdf_and_gradient(points_shifted, iter_nr) 
    
        normals_shifted=F.normalize(sdf_gradients_shifted,dim=-1)
    
        dot=(normals*normals_shifted).sum(dim=-1, keepdim=True)
        #the dot gives low error to normals that are almost the same and increasing error the more they deviate, so it acts like an L2 loss; we want an L1-like loss, so we take the angle and map it to the range [0,1]
        angle=torch.acos(torch.clamp(dot, -1.0+1e-6, 1.0-1e-6)) #goes to 0 when the normals agree and to pi when they are opposite

        curvature=angle/math.pi #map to [0,1] range

        return sdf_shifted, curvature
    
  • Neuralangelo:

    # trainer.py
    self.losses["curvature"] = curvature_loss(data["hessians"], outside=data["outside"])
    
    # model.py
    gradients, hessians = self.neural_sdf.compute_gradients(points, training=self.training, sdf=sdfs)
    
    # modules.py
    eps = self.normal_eps
    # 1st-order gradient
    eps_x = torch.tensor([eps, 0., 0.], dtype=x.dtype, device=x.device)  # [3]
    eps_y = torch.tensor([0., eps, 0.], dtype=x.dtype, device=x.device)  # [3]
    eps_z = torch.tensor([0., 0., eps], dtype=x.dtype, device=x.device)  # [3]
    sdf_x_pos = self.sdf(x + eps_x)  # [...,1]
    sdf_x_neg = self.sdf(x - eps_x)  # [...,1]
    sdf_y_pos = self.sdf(x + eps_y)  # [...,1]
    sdf_y_neg = self.sdf(x - eps_y)  # [...,1]
    sdf_z_pos = self.sdf(x + eps_z)  # [...,1]
    sdf_z_neg = self.sdf(x - eps_z)  # [...,1]
    gradient_x = (sdf_x_pos - sdf_x_neg) / (2 * eps)
    gradient_y = (sdf_y_pos - sdf_y_neg) / (2 * eps)
    gradient_z = (sdf_z_pos - sdf_z_neg) / (2 * eps)
    gradient = torch.cat([gradient_x, gradient_y, gradient_z], dim=-1)  # [...,3]
    hessian_xx = (sdf_x_pos + sdf_x_neg - 2 * sdf) / (eps ** 2)  # [...,1]
    hessian_yy = (sdf_y_pos + sdf_y_neg - 2 * sdf) / (eps ** 2)  # [...,1]
    hessian_zz = (sdf_z_pos + sdf_z_neg - 2 * sdf) / (eps ** 2)  # [...,1]
    hessian = torch.cat([hessian_xx, hessian_yy, hessian_zz], dim=-1)  # [...,3]
    
    
    def curvature_loss(hessian, outside=None):
        laplacian = hessian.sum(dim=-1).abs()  # [B,R,N]
        laplacian = laplacian.nan_to_num(nan=0.0, posinf=0.0, neginf=0.0)  # [B,R,N]
        if outside is not None:
            return (laplacian * (~outside).float()).mean()
        else:
            return laplacian.mean()
    
    

9. A few interesting little tricks

9.1 SDF2Normal (from MonoSDF)

This one is interesting: because the Omnidata cues that MonoSDF uses provide (pseudo-)GT normal maps, a normal loss is added. The paper brushes past it with: "The 3D unit normal $\hat{n}$ is the analytical gradient of our SDF function."

It then gives a volume-rendering formula (wait, really? normals can be accumulated by volume rendering too?):
$$\hat{N}(r) = \sum_{i=1}^M T_r^i\, \alpha_r^i\, \hat{n}_r^i$$
Normalizing $\frac{\Delta SDF}{\Delta p}$ gives the normal at a point on the ray; since the supervision is a 2D normal map, the per-sample normals have to be volume-rendered (accumulated) into a per-pixel normal.

Digging into the code, it looks like this:

normals = gradients / (gradients.norm(2, -1, keepdim=True) + 1e-6)
normals = normals.reshape(-1, N_samples, 3)
normal_map = torch.sum(weights.unsqueeze(-1) * normals, 1)

# transform to local coordinate system
rot = pose[0, :3, :3].permute(1, 0).contiguous()
normal_map = rot @ normal_map.permute(1, 0)
normal_map = normal_map.permute(1, 0).contiguous()

output['normal_map'] = normal_map

9.2 SDF-based sampling strategy (from VolSDF)

  • Reference:https://www.bilibili.com/video/BV1jh4y1S7SU?vd_source=a71bfed18c663599dddb86ae6fb65353

Vanilla NeRF does this: for a ray, first pick samples at random, run them through the coarse network to get a coarse density distribution, then sample more finely where the density is high (inverse-CDF sampling).

VolSDF's idea is as follows (the video above explains it in detail); my understanding:

Given the SDF values $d_i, d_{i+1}$ of two consecutive samples on a ray, a distance $\delta_i$ apart, one can make the following estimate: the shortest distance from the surface to any point on the segment between them is bounded below by $d_i^*$:
$$d_i^* = \begin{cases} 0, & |d_i| + |d_{i+1}| \le \delta_i \\ \min\{|d_i|, |d_{i+1}|\}, & \big|\,|d_i|^2 - |d_{i+1}|^2\,\big| > \delta_i^2 \\ h_i, & \text{otherwise} \end{cases}$$
where $h_i$ is the height of the triangle with sides $|d_i|$, $|d_{i+1}|$ and base $\delta_i$.
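A small sketch (my own) of this per-interval bound, using Heron's formula for the triangle height $h_i$ in the last case:

import torch

def d_star(d_i, d_ip1, delta):
    """Lower bound on the distance from the surface to the segment between two samples."""
    a, b = d_i.abs(), d_ip1.abs()
    # case 1: the two SDF balls cover the whole segment -> the surface may touch it
    touching = (a + b) <= delta
    # case 2: the foot of the perpendicular falls outside the segment -> nearest endpoint
    endpoint = (a**2 - b**2).abs() > delta**2
    # case 3: otherwise, height of the triangle with sides a, b and base delta (Heron's formula)
    s = (a + b + delta) / 2
    area = torch.sqrt(torch.clamp(s * (s - a) * (s - b) * (s - delta), min=0.0))
    h = 2 * area / delta
    return torch.where(touching, torch.zeros_like(a), torch.where(endpoint, torch.minimum(a, b), h))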
NeRF's original discretized volume rendering has a problem: it implicitly assumes the density is constant between two samples, which is how the continuous integral is turned into the discrete sum a computer can evaluate; this introduces an accumulated error in whatever is composited along the ray. Good samples should therefore make the density between two adjacent samples as close to constant as possible. The paper derives such a mechanism (though this sampling strategy has not been widely adopted, for reasons unclear to me), concluding that the accumulated opacity error is governed by two quantities:
$$\big|O(t)-\hat{O}(t)\big| < e^{\frac{\alpha}{4\beta} \sum_i \delta_i^2} - 1$$
To make this error small, one option is to shrink the sample spacing $\delta$ (i.e., take more samples); the other is to increase $\beta$. VolSDF's scheme trades these off to set the number and spacing of samples per ray. (A larger $\beta$ spreads the density out more, so one cannot just keep increasing $\beta$.)

