NeRF converts a ray $\mathbf{r} = \mathbf{o} + t\mathbf{d}$ in view-frustum space into Normalized Device Coordinates (NDC) space. This post analyzes the conversion from two angles: the theoretical derivation and the code implementation.
1. Theoretical derivation
The projection transform maps the view frustum onto the cube $[-1, 1]^3$. Following the blog post 计算机图形学 - 投影变换推导 (Computer Graphics: Derivation of the Projection Transform), the perspective projection matrix is

$$
\mathbf{M}_{persp} = \begin{pmatrix} \dfrac{n}{r} & 0 & 0 & 0 \\ 0 & \dfrac{n}{t} & 0 & 0 \\ 0 & 0 & -\dfrac{f + n}{f - n} & -\dfrac{2fn}{f-n} \\ 0 & 0 & -1 & 0 \end{pmatrix}
$$

Here $n$ and $f$ are the distances to the near and far planes, and $r$ and $t$ are the right and top bounds of the near plane (this $t$ is unrelated to the ray parameter $t$ used below). Suppose a 3D point has homogeneous coordinates $\begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix}$ and that its coordinates after the projection transform are $\begin{pmatrix} x' \\ y' \\ z' \\ 1 \end{pmatrix}$. Then:

$$
\begin{pmatrix} x' \\ y' \\ z' \\ 1 \end{pmatrix} = \begin{pmatrix} \dfrac{n}{r} & 0 & 0 & 0 \\ 0 & \dfrac{n}{t} & 0 & 0 \\ 0 & 0 & -\dfrac{f+n}{f-n} & -\dfrac{2fn}{f-n} \\ 0 & 0 & -1 & 0 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix} = \begin{pmatrix} \dfrac{nx}{r} \\ \\ \dfrac{ny}{t} \\ \\ -\dfrac{f+n}{f-n}z - \dfrac{2fn}{f-n} \\ \\ -z \end{pmatrix} = \begin{pmatrix} -\dfrac{nx}{rz} \\ \\ -\dfrac{ny}{tz} \\ \\ \dfrac{f+n}{f-n} + \dfrac{2fn}{(f-n)z} \\ \\ 1 \end{pmatrix} \tag{1}
$$

The last equality is the homogeneous divide: every component is divided by the fourth coordinate $-z$.
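As a quick sanity check of equation (1), the following minimal sketch applies $\mathbf{M}_{persp}$ to a point and performs the homogeneous divide. It is plain Python with no dependencies; the function name `perspective_project` and the sample values of `n`, `f`, `r`, `t` are illustrative assumptions, not taken from the NeRF code base.

```python
def perspective_project(point, n, f, r, t):
    """Apply M_persp to (x, y, z, 1) and divide by the homogeneous coordinate w = -z."""
    x, y, z = point
    # Clip-space coordinates: the four rows of M_persp times (x, y, z, 1).
    clip_x = n * x / r
    clip_y = n * y / t
    clip_z = -(f + n) / (f - n) * z - 2.0 * f * n / (f - n)
    w = -z
    return (clip_x / w, clip_y / w, clip_z / w)


# Hypothetical frustum: near plane at distance 1, far plane at distance 5.
n, f, r, t = 1.0, 5.0, 0.5, 0.25
near_corner = perspective_project((r, t, -n), n, f, r, t)
far_center = perspective_project((0.0, 0.0, -f), n, f, r, t)
```

With these values the top-right corner of the near plane lands on $(1, 1, -1)$ and the center of the far plane on $(0, 0, 1)$, up to floating-point rounding, which is consistent with the frustum being mapped onto $[-1, 1]^3$.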
Suppose the ray $\mathbf{r} = \mathbf{o} + t\mathbf{d}$ in view-frustum space maps to $\mathbf{r'} = \mathbf{o'} + t'\mathbf{d'}$ in NDC space, i.e. there exists a function $\pi$ such that $\pi(\mathbf{o} + t\mathbf{d}) = \mathbf{o'} + t'\mathbf{d'}$.
Let

$$
\begin{pmatrix} -\dfrac{nx}{rz} \\ \\ -\dfrac{ny}{tz} \\ \\ \dfrac{f+n}{f-n} + \dfrac{2fn}{(f-n)z} \end{pmatrix} = \begin{pmatrix} \dfrac{a_x x}{z} \\ \\ \dfrac{a_y y}{z} \\ \\ a_z + \dfrac{b_z}{z} \end{pmatrix} \tag{2}
$$

that is,

$$
\begin{cases} a_x = -\dfrac{n}{r} \\ \\ a_y = -\dfrac{n}{t} \\ \\ a_z = \dfrac{f + n}{f - n} \\ \\ b_z = \dfrac{2fn}{f - n} \end{cases}
$$

Substituting

$$
\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} o_x + td_x \\ o_y + td_y \\ o_z + td_z \end{pmatrix}
\quad \text{and} \quad
\begin{pmatrix} x' \\ y' \\ z' \end{pmatrix} = \begin{pmatrix} o_x' + t'd_x' \\ o_y' + t'd_y' \\ o_z' + t'd_z' \end{pmatrix}
$$

into equation (2) gives

$$
\begin{pmatrix} a_x \dfrac{o_x + t d_x}{o_z + t d_z} \\ \\ a_y \dfrac{o_y + t d_y}{o_z + t d_z} \\ \\ a_z + \dfrac{b_z}{o_z + t d_z} \end{pmatrix} = \begin{pmatrix} o'_x + t' d'_x \\ \\ o'_y + t' d'_y \\ \\ o'_z + t' d'_z \end{pmatrix}
$$
This relation still leaves one degree of freedom in how $t'$ is parameterized. It is removed by requiring that the origin of the original ray (the camera's optical center) map to the origin of the NDC ray, i.e. that $t = 0$ correspond to $t' = 0$. Note that this means $\mathbf{o'} = \pi(\mathbf{o})$, not $\mathbf{o'} = \mathbf{o}$. Setting $t = 0$ gives

$$
\mathbf{o}' = \begin{pmatrix} o'_x \\ \\ o'_y \\ \\ o'_z \end{pmatrix} = \begin{pmatrix} a_x \dfrac{o_x}{o_z} \\ \\ a_y \dfrac{o_y}{o_z} \\ \\ a_z + \dfrac{b_z}{o_z} \end{pmatrix} = \begin{pmatrix} -\dfrac{n}{r} \cdot \dfrac{o_x}{o_z} \\ \\ -\dfrac{n}{t} \cdot \dfrac{o_y}{o_z} \\ \\ \dfrac{f + n}{f - n} + \dfrac{2fn}{f - n} \cdot \dfrac{1}{o_z} \end{pmatrix} = \pi(\mathbf{o}) \tag{3}
$$
It then follows that:

$$
\begin{aligned}
\begin{pmatrix} t'd_x' \\ \\ t'd_y' \\ \\ t'd_z' \end{pmatrix}
&= \begin{pmatrix} o_x' + t'd_x' \\ \\ o_y' + t'd_y' \\ \\ o_z' + t'd_z' \end{pmatrix} - \begin{pmatrix} o_x' \\ \\ o_y' \\ \\ o_z' \end{pmatrix} \\
&= \begin{pmatrix} a_x \dfrac{o_x + t d_x}{o_z + t d_z} \\ \\ a_y \dfrac{o_y + t d_y}{o_z + t d_z} \\ \\ a_z + \dfrac{b_z}{o_z + t d_z} \end{pmatrix} - \begin{pmatrix} a_x \dfrac{o_x}{o_z} \\ \\ a_y \dfrac{o_y}{o_z} \\ \\ a_z + \dfrac{b_z}{o_z} \end{pmatrix} \\
&= \begin{pmatrix} a_x \dfrac{o_z (o_x + t d_x) - o_x (o_z + t d_z)}{(o_z + t d_z) o_z} \\ \\ a_y \dfrac{o_z (o_y + t d_y) - o_y (o_z + t d_z)}{(o_z + t d_z) o_z} \\ \\ b_z \dfrac{o_z - (o_z + t d_z)}{(o_z + t d_z) o_z} \end{pmatrix} \\
&= \begin{pmatrix} a_x \dfrac{t d_z}{o_z + t d_z} \left( \dfrac{d_x}{d_z} - \dfrac{o_x}{o_z} \right) \\ \\ a_y \dfrac{t d_z}{o_z + t d_z} \left( \dfrac{d_y}{d_z} - \dfrac{o_y}{o_z} \right) \\ \\ -b_z \dfrac{t d_z}{o_z + t d_z} \dfrac{1}{o_z} \end{pmatrix} \\
&= \dfrac{t d_z}{o_z + t d_z} \begin{pmatrix} a_x \left( \dfrac{d_x}{d_z} - \dfrac{o_x}{o_z} \right) \\ \\ a_y \left( \dfrac{d_y}{d_z} - \dfrac{o_y}{o_z} \right) \\ \\ -b_z \dfrac{1}{o_z} \end{pmatrix}
\end{aligned}
$$
The right-hand side is a scalar that depends on $t$ times a vector that does not, so we may choose

$$
\begin{cases} t' = \dfrac{td_z}{o_z + td_z} = 1 - \dfrac{o_z}{o_z + td_z} \\ \\ \mathbf{d'} = \begin{pmatrix} d_x' \\ \\ d_y' \\ \\ d_z' \end{pmatrix} = \begin{pmatrix} a_x \left( \dfrac{d_x}{d_z} - \dfrac{o_x}{o_z} \right) \\ \\ a_y \left( \dfrac{d_y}{d_z} - \dfrac{o_y}{o_z} \right) \\ \\ -b_z \dfrac{1}{o_z} \end{pmatrix} \end{cases} \tag{4}
$$
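Equation (4) can be checked numerically: projecting a point on the original ray should give the same result as evaluating the NDC ray at $t'$. The sketch below is plain Python; the helper names `project`, `ndc_ray`, and `t_to_ndc` and the coefficient values are assumptions chosen only for illustration.

```python
def project(p, ax, ay, az, bz):
    """pi(p) from equation (2): (ax*x/z, ay*y/z, az + bz/z)."""
    x, y, z = p
    return (ax * x / z, ay * y / z, az + bz / z)


def ndc_ray(o, d, ax, ay, az, bz):
    """Origin o' = pi(o) from equation (3) and direction d' from equation (4)."""
    o_ndc = project(o, ax, ay, az, bz)
    d_ndc = (
        ax * (d[0] / d[2] - o[0] / o[2]),
        ay * (d[1] / d[2] - o[1] / o[2]),
        -bz / o[2],
    )
    return o_ndc, d_ndc


def t_to_ndc(t, o, d):
    """Reparameterization t' = t*d_z / (o_z + t*d_z) from equation (4)."""
    return t * d[2] / (o[2] + t * d[2])


# Arbitrary test values (the camera looks down the -z axis, so o_z and d_z are negative).
ax, ay, az, bz = -2.0, -3.0, 1.0, 2.0
o, d, t = (0.3, -0.2, -1.0), (0.1, 0.4, -1.0), 2.5

point = tuple(oi + t * di for oi, di in zip(o, d))
lhs = project(point, ax, ay, az, bz)  # pi(o + t d)
o_ndc, d_ndc = ndc_ray(o, d, ax, ay, az, bz)
t_ndc = t_to_ndc(t, o, d)
rhs = tuple(oi + t_ndc * di for oi, di in zip(o_ndc, d_ndc))  # o' + t' d'
max_error = max(abs(a - b) for a, b in zip(lhs, rhs))
```

If the derivation holds, `max_error` is at the level of floating-point rounding, and `t_ndc` lies strictly between 0 and 1 for any finite positive `t`.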
On the one hand, NeRF places the far plane at infinity ($f \to \infty$), so

$$
\left\{ \begin{aligned} a_z &= \lim_{f \to \infty} \dfrac{f+n}{f-n} = \lim_{f \to \infty} \dfrac{1+\dfrac{n}{f}}{1-\dfrac{n}{f}} = 1 \\ b_z &= \lim_{f \to \infty} \dfrac{2fn}{f-n} = \lim_{f \to \infty} \dfrac{2n}{1-\dfrac{n}{f}} = 2n \end{aligned} \right.
$$

On the other hand, in NeRF $r = \dfrac{W}{2}$ and $t = \dfrac{H}{2}$ (where $H$ and $W$ are the image height and width, respectively), and $n = f_{cam}$ (where $f_{cam}$ is the camera focal length). Only the ratios $n/r$ and $n/t$ enter $a_x$ and $a_y$, so this amounts to $\dfrac{n}{r} = \dfrac{f_{cam}}{W/2}$ and $\dfrac{n}{t} = \dfrac{f_{cam}}{H/2}$, while the $n$ in $b_z = 2n$ remains the near-plane distance. Therefore

$$
\left\{ \begin{aligned} a_x &= -\dfrac{f_{cam}}{W/2} \\ a_y &= -\dfrac{f_{cam}}{H/2} \end{aligned} \right.
$$

Putting everything together, the origin $\mathbf{o'}$ and direction $\mathbf{d'}$ of the ray in NDC space are:

$$
\left\{ \begin{aligned} \mathbf{o}' &= \begin{pmatrix} -\dfrac{f_{cam}}{W/2} \dfrac{o_x}{o_z} \\ \\ -\dfrac{f_{cam}}{H/2} \dfrac{o_y}{o_z} \\ \\ 1 + \dfrac{2n}{o_z} \end{pmatrix} \\ \\ \mathbf{d}' &= \begin{pmatrix} -\dfrac{f_{cam}}{W/2} \left( \dfrac{d_x}{d_z} - \dfrac{o_x}{o_z} \right) \\ \\ -\dfrac{f_{cam}}{H/2} \left( \dfrac{d_y}{d_z} - \dfrac{o_y}{o_z} \right) \\ \\ -2n \dfrac{1}{o_z} \end{pmatrix} \end{aligned} \right.
$$
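The far-plane limit used in this derivation is easy to confirm numerically. This minimal sketch (plain Python; the function name `depth_coefficients` and the sample values are assumptions) evaluates $a_z$ and $b_z$ for increasingly distant far planes.

```python
def depth_coefficients(n, f):
    """Return (a_z, b_z) = ((f + n)/(f - n), 2fn/(f - n)) for near n and far f."""
    return (f + n) / (f - n), 2.0 * f * n / (f - n)


n = 1.0
for f in (1e1, 1e3, 1e6):
    a_z, b_z = depth_coefficients(n, f)
    print(f"f = {f:>9.0f}: a_z = {a_z:.6f}, b_z = {b_z:.6f}")

# With a very distant far plane the coefficients approach 1 and 2n.
a_z_far, b_z_far = depth_coefficients(n, 1e12)
```

As `f` grows, the printed values drift toward $a_z = 1$ and $b_z = 2n$, which is why the final formulas contain $1 + 2n/o_z$ and $-2n/o_z$ with no dependence on $f$.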
2. Code implementation
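Before converting to NDC space, the ray origin $\mathbf{o}$ is first moved to the intersection of the ray with the near plane, so that subsequent sampling can ignore the segment between the camera's optical center and the near plane.

Setting $o_z + t_n d_z = -n$ and solving for $t_n$ gives:

$$
t_n = -\dfrac{n + o_z}{d_z}
$$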
The concrete implementation is as follows:

    import torch

    def ndc_rays(H, W, focal, near, rays_o, rays_d):
        """Convert rays (origins rays_o, directions rays_d, shape [..., 3]) to NDC space."""
        # Shift ray origins to the near plane z = -near
        t = -(near + rays_o[..., 2]) / rays_d[..., 2]
        rays_o = rays_o + t[..., None] * rays_d

        # Projection: o' = pi(o), with a_x = -focal / (W / 2) and a_y = -focal / (H / 2)
        o0 = -1. / (W / (2. * focal)) * rays_o[..., 0] / rays_o[..., 2]
        o1 = -1. / (H / (2. * focal)) * rays_o[..., 1] / rays_o[..., 2]
        o2 = 1. + 2. * near / rays_o[..., 2]

        # Direction d' from equation (4), with b_z = 2 * near
        d0 = -1. / (W / (2. * focal)) * (rays_d[..., 0] / rays_d[..., 2] - rays_o[..., 0] / rays_o[..., 2])
        d1 = -1. / (H / (2. * focal)) * (rays_d[..., 1] / rays_d[..., 2] - rays_o[..., 1] / rays_o[..., 2])
        d2 = -2. * near / rays_o[..., 2]

        rays_o = torch.stack([o0, o1, o2], -1)
        rays_d = torch.stack([d0, d1, d2], -1)

        return rays_o, rays_d
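To see the formulas at work without PyTorch, here is a dependency-free, single-ray sketch of the same computation. The name `ndc_ray_single` and the sample camera values are assumptions for illustration; it mirrors the steps of `ndc_rays` but is not part of the NeRF code base.

```python
def ndc_ray_single(H, W, focal, near, o, d):
    """Convert one ray (origin o, direction d, both 3-tuples) to NDC space."""
    # Shift the origin to the near plane z = -near.
    t_n = -(near + o[2]) / d[2]
    o = tuple(oi + t_n * di for oi, di in zip(o, d))

    ax = -focal / (W / 2.0)
    ay = -focal / (H / 2.0)
    o_ndc = (ax * o[0] / o[2], ay * o[1] / o[2], 1.0 + 2.0 * near / o[2])
    d_ndc = (
        ax * (d[0] / d[2] - o[0] / o[2]),
        ay * (d[1] / d[2] - o[1] / o[2]),
        -2.0 * near / o[2],
    )
    return o, o_ndc, d_ndc


# Hypothetical camera: 400x300 image, focal length 500 px, near plane at 1.
shifted_o, o_ndc, d_ndc = ndc_ray_single(
    H=300, W=400, focal=500.0, near=1.0, o=(0.0, 0.0, 0.0), d=(0.2, -0.1, -1.0)
)
z_start = o_ndc[2]           # NDC depth at t' = 0
z_end = o_ndc[2] + d_ndc[2]  # NDC depth at t' = 1
```

Because the shifted origin lies on the near plane ($o_z = -n$), the NDC ray starts at $z' = -1$ and reaches $z' = 1$ at $t' = 1$, so the whole semi-infinite ray fits into the unit interval of $t'$.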
Why convert the original view-frustum space to NDC space?
The role of NDC space is analyzed below from several angles:
1) According to the NeRF paper:
Once we convert to the NDC ray, this allows us to simply sample t′ linearly from 0 to 1 in order to get a linear sampling in disparity from n to ∞ in the original space.
In the original view-frustum space, $t$ ranges from 0 to $\infty$, whereas in NDC space $t'$ ranges from 0 to 1. When $t = 0$, $t' = 1 - \dfrac{o_z}{o_z + td_z} = 0$; as $t \to \infty$, $t' \to 1$.
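The mapping from $t$ to $t'$ can be tabulated directly. The sketch below (plain Python; the name `t_prime` and the sample values of `o_z` and `d_z` are assumptions) evaluates $t'$ for a few increasing values of $t$.

```python
def t_prime(t, o_z, d_z):
    """NDC ray parameter t' = 1 - o_z / (o_z + t * d_z)."""
    return 1.0 - o_z / (o_z + t * d_z)


# Hypothetical ray starting on the near plane z = -1 and pointing down the -z axis.
o_z, d_z = -1.0, -1.0
samples = [t_prime(t, o_z, d_z) for t in (0.0, 1.0, 10.0, 1e3, 1e6)]
```

The values in `samples` start at 0, increase monotonically, and approach 1 without reaching it for any finite $t$.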
$z = o_z + t d_z$ is the depth of the sample point in the original view-frustum space, so $t' = 1 - \dfrac{o_z}{z}$. Because the ray origin has already been moved to the intersection of the ray with the near plane before the conversion to NDC space (i.e. $o_z = -n$), we have $t' = 1 + \dfrac{n}{z}$.
The disparity $d$ (a scalar, not to be confused with the ray direction $\mathbf{d}$) and the depth $z$ are related by $d = \dfrac{1}{|z|} = \dfrac{1}{-z}$ (since $z < 0$), so $t' = 1 - nd$, i.e. $d = \dfrac{1 - t'}{n}$. Thus $t' = 0$ corresponds to disparity $d = \dfrac{1}{n}$, and $t' = 1$ corresponds to disparity $d = 0$ (a sample point at infinity has zero disparity).
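Sampling $t' \in [0, 1]$ uniformly in NDC space is equivalent to sampling the disparity $d \in [0, \dfrac{1}{n}]$ uniformly in the original view-frustum space. Sampling uniformly in disparity directly in the original space (that is, nonlinearly in depth: dense nearby, sparse far away) is more cumbersome, because the sample positions have to be computed non-uniformly as a function of depth. In NDC space, plain linear uniform sampling is enough to obtain sample points that are uniformly distributed in disparity in the original space. This is one of the advantages of converting the original view-frustum space to NDC space.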
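The equivalence can be made concrete with a few samples. This minimal sketch (plain Python; the name `ndc_samples_to_depth` and the choice of `near` and sample count are assumptions) takes evenly spaced $t'$ values, maps them back to depth via $z = -n / (1 - t')$, and compares the spacing in disparity with the spacing in depth.

```python
def ndc_samples_to_depth(num_samples, near):
    """Evenly spaced t' in [0, 1) mapped to (t', depth z, disparity 1/|z|)."""
    rows = []
    for i in range(num_samples):
        t_ndc = i / num_samples    # stop short of t' = 1, where z is infinite
        z = -near / (1.0 - t_ndc)  # inverts t' = 1 + n / z
        rows.append((t_ndc, z, 1.0 / abs(z)))
    return rows


rows = ndc_samples_to_depth(num_samples=8, near=1.0)
# Spacing between consecutive samples, in disparity and in |depth|.
disparity_steps = [a[2] - b[2] for a, b in zip(rows, rows[1:])]
depth_steps = [abs(b[1]) - abs(a[1]) for a, b in zip(rows, rows[1:])]
```

The disparity steps are all equal (up to rounding), while the depth steps grow with distance: the same uniform grid in $t'$ places samples densely near the camera and sparsely far away.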
Plotting the disparity $d$ against the depth $z$ gives the following curve:
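The plot shows that nearby points have large disparity while distant points have small disparity, so most of the points obtained by sampling uniformly in disparity are nearby points. This strategy allocates samples adaptively: dense sampling nearby captures detail, and sparse sampling far away saves computation, which improves sampling efficiency. Nearby regions are where object detail is rich and the image changes rapidly, so denser sampling is needed there to capture geometric edges, texture detail, and lighting changes (such as shadow boundaries) accurately. Distant objects usually appear smaller, with less detail and smoother variation, and do not need such dense sampling. As shown in the figure below: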