3D Gaussians
Primitives that inherit the properties of differentiable volumetric representations while being unstructured and explicit, allowing very fast rendering.
Overview
(1) Introduce anisotropic 3D Gaussians as a high-quality, unstructured radiance-field representation;
A 3D Gaussian is a differentiable volumetric representation that can also be rasterized very efficiently by projecting it to 2D and applying standard $\alpha$-blending, consistent with the NeRF image-formation model.
(2) Optimize the per-Gaussian properties (3D position, opacity $\alpha$, anisotropic covariance matrix $\Sigma$, spherical-harmonic coefficients) with adaptive density control over Gaussian creation, producing high-quality scene representations;
Large homogeneous regions can be covered by a few large anisotropic Gaussians (Gaussians with large covariance, i.e., large 3D ellipsoids).
Transparent Gaussians (opacity $\alpha$ below a threshold $\epsilon_{\alpha}$) are deleted.
Densification fills empty space (under-reconstructed regions) and subdivides Gaussians that cover large parts of the scene (over-reconstructed regions). Both cases show large view-space positional gradients, probably because they correspond to regions that are not yet well reconstructed and the optimizer tries to move the Gaussians to fix this. Gaussians whose view-space positional gradient exceeds a threshold $\tau_{pos}$ are densified. Under-reconstruction: a small Gaussian cannot cover the geometry, so a Gaussian of the same size is created and the copy is moved along the positional gradient. Over-reconstruction: the geometry is covered by one large Gaussian that must be broken up, so it is replaced by two Gaussians whose scales are divided by a factor $\phi = 1.6$ and whose positions are initialized by sampling the original 3D Gaussian as a PDF.
The optimization can get stuck on floaters close to the input cameras, so every 3000 iterations the opacity $\alpha$ is reset close to zero; subsequent optimization raises the opacity of useful Gaussians, and the low-opacity ones are deleted. Gaussians may also shrink or grow while overlapping others heavily, so Gaussians that are very large in world space or have a large view-space footprint are removed periodically. These strategies keep the total number of Gaussians under control.
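The clone/split/prune loop above can be sketched in a few lines of NumPy. This is an illustrative sketch under assumed array layouts (`densify_and_prune` and its default thresholds are hypothetical), not the reference CUDA implementation:

```python
import numpy as np

def densify_and_prune(pos, scale, opacity, view_grad,
                      tau_pos=0.0002, eps_alpha=0.005, phi=1.6,
                      scale_thresh=0.01):
    """One round of adaptive density control (illustrative sketch).

    pos       : (N, 3) Gaussian centers
    scale     : (N, 3) per-axis scales
    opacity   : (N,)   opacities
    view_grad : (N, 3) accumulated view-space positional gradients
    """
    big = np.linalg.norm(view_grad, axis=1) > tau_pos   # densification candidates
    small = scale.max(axis=1) <= scale_thresh           # under-reconstruction case
    large = ~small                                      # over-reconstruction case

    # Clone: copy small Gaussians and move the copy along the gradient.
    clone = big & small
    new_pos = [pos, pos[clone] + view_grad[clone]]
    new_scale = [scale, scale[clone]]
    new_op = [opacity, opacity[clone]]

    # Split: replace large Gaussians by two smaller ones; positions are
    # sampled using the original Gaussian as a PDF, scales divided by phi.
    split = big & large
    for _ in range(2):
        sampled = pos[split] + np.random.randn(int(split.sum()), 3) * scale[split]
        new_pos.append(sampled)
        new_scale.append(scale[split] / phi)
        new_op.append(opacity[split])

    pos = np.concatenate(new_pos)
    scale = np.concatenate(new_scale)
    opacity = np.concatenate(new_op)

    # Prune: drop transparent Gaussians and the originals replaced by splits.
    keep = opacity > eps_alpha
    keep[np.flatnonzero(split)] = False
    return pos[keep], scale[keep], opacity[keep]
```

In the actual method this control loop runs periodically during optimization, interleaved with the opacity reset described above.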
(3) Fast differentiable rendering on the GPU with tile-based parallel rasterization;
Parameters of each ellipsoid: center position $(x, y, z)$, covariance matrix $\Sigma = R S S^T R^T$, spherical-harmonic coefficients (a $16 \times 3$ matrix), and opacity $\alpha$.
Input: a set of images of a static scene + the sparse point cloud produced by SfM + camera poses.
Gaussian initialization: the initial covariance is estimated as an isotropic Gaussian (a sphere) whose axes equal the mean distance to the three nearest points (k-NN with k = 3).
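A minimal sketch of this initialization, using brute-force nearest-neighbor distances for clarity (a real implementation would use a KD-tree); `initial_scales` is a hypothetical helper name:

```python
import numpy as np

def initial_scales(points):
    """Isotropic scale per point: mean distance to its 3 nearest neighbors.

    points : (N, 3) SfM point cloud.
    """
    # Pairwise distance matrix (brute force, O(N^2) memory).
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)          # exclude the point itself
    nn3 = np.sort(d, axis=1)[:, :3]      # distances to the 3 nearest neighbors
    return nn3.mean(axis=1)              # one isotropic scale per Gaussian
```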
1. Why $\Sigma = R S S^T R^T$?
The covariance matrix $\Sigma$ is symmetric positive definite, so it has an eigendecomposition $\Sigma = Q \Lambda Q^T$, where $Q$ is an orthogonal matrix ($Q^T = Q^{-1}$) whose columns are the eigenvectors of $\Sigma$ and give the orientation of the Gaussian, and $\Lambda$ is a diagonal matrix whose entries are the eigenvalues of $\Sigma$ and give the variance along each axis.
The diagonal matrix $\Lambda$ factors further as $\Lambda = S S^T$ with $S = S^T = \sqrt{\Lambda}$.
So the covariance decomposes as $\Sigma = Q \Lambda Q^T = Q S S^T Q^T$.
Geometrically, the orthogonal matrix $Q$ is a rotation, so we may write $R = Q$, giving $\Sigma = R S S^T R^T$.
Writing the covariance as $\Sigma = R S S^T R^T$ preserves its physical meaning (symmetric positive semi-definite) during gradient-based optimization: the covariance is stored as a 3D scaling vector and a rotation quaternion.
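A short sketch of rebuilding $\Sigma$ from the optimized quaternion $q$ and scale vector $s$; by construction the result is symmetric positive semi-definite for any input, which is exactly why optimization works on $(q, s)$ rather than on $\Sigma$ directly (function names are illustrative):

```python
import numpy as np

def quat_to_rot(q):
    """Rotation matrix from a unit quaternion (w, x, y, z)."""
    w, x, y, z = q / np.linalg.norm(q)
    return np.array([
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z),     2 * (x * z + w * y)],
        [2 * (x * y + w * z),     1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y),     2 * (y * z + w * x),     1 - 2 * (x * x + y * y)],
    ])

def build_covariance(q, s):
    """Sigma = R S S^T R^T from quaternion q and scale vector s."""
    R = quat_to_rot(np.asarray(q, dtype=float))
    S = np.diag(s)
    return R @ S @ S.T @ R.T
```

For example, the identity quaternion with scales $(1, 2, 3)$ yields the axis-aligned covariance $\mathrm{diag}(1, 4, 9)$.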
2. Why is the covariance projected from 3D to the 2D image plane as $\Sigma' = J W \Sigma W^T J^T$?
$J$ is the Jacobian that approximates the projective transformation by a linear one. A linear map keeps a Gaussian Gaussian: for $x \sim \mathcal{N}(\mu, \Sigma)$ under $y = A x$, the transformed distribution is $y \sim \mathcal{N}(A\mu, A \Sigma A^T)$.
The projection involves two steps:
(1) a linear transform: the matrix $W$ maps world coordinates to camera coordinates;
(2) projection: the Jacobian $J$ projects 3D points onto the 2D plane.
$y = J W x \;\rightarrow\; y \sim \mathcal{N}(J W \mu,\; J W \Sigma W^T J^T) \;\rightarrow\; \Sigma' = J W \Sigma W^T J^T$
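Assuming a simple pinhole model with focal lengths $f_x, f_y$, this projection can be sketched as follows; the Jacobian is that of $(u, v) = (f_x x / z,\, f_y y / z)$ evaluated at the Gaussian center, and the names are illustrative:

```python
import numpy as np

def project_covariance(mean_cam, Sigma, W, fx=1.0, fy=1.0):
    """Sigma' = J W Sigma W^T J^T (local affine approximation).

    mean_cam : Gaussian center in camera coordinates (x, y, z)
    Sigma    : (3, 3) world-space covariance
    W        : (3, 3) rotation part of the world-to-camera transform
    Returns the 2x2 image-plane covariance (top-left block).
    """
    x, y, z = mean_cam
    # Jacobian of the perspective projection at the Gaussian center;
    # a zero third row pads it to 3x3 so the matrix products line up.
    J = np.array([
        [fx / z, 0.0,    -fx * x / z**2],
        [0.0,    fy / z, -fy * y / z**2],
        [0.0,    0.0,     0.0],
    ])
    Sigma_prime = J @ W @ Sigma @ W.T @ J.T
    return Sigma_prime[:2, :2]
```

For a unit-covariance Gaussian on the optical axis at depth 1 with $W = I$, the projected covariance is simply the 2×2 identity.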
Ablation studies
SfM point-cloud initialization:
(1) Random initialization mainly degrades background reconstruction.
(2) In regions poorly covered by the training views, random initialization leaves more floaters that the optimization cannot remove.
Densification strategy (cloning and splitting Gaussians):
(1) Splitting large Gaussians is important for good background reconstruction.
(2) Cloning small Gaussians instead of splitting them gives better and faster convergence, especially for thin structures in the scene.
Anisotropic covariance:
Anisotropy markedly improves how well the 3D Gaussians align with surfaces, which in turn enables higher rendering quality with the same number of points.
Anisotropic volumetric splats make it possible to model fine structures and have a major impact on visual quality.
Limitations
(1) Artifacts appear in regions that are poorly observed.
(2) Elongated artifacts or "splotchy" Gaussians can be produced.
(3) Memory consumption is markedly higher than NeRF-based solutions.
(4) Specular reflections and transparent objects (e.g., glass, whose appearance changes with viewpoint) are handled poorly.
Datasets
3D-GS datasets for optimization, reconstruction, manipulation, generation, perception, and virtual humans.
Gaussian optimization work
1. Gaussian compression for memory efficiency
**Techniques:** grid encoding, vector quantization (VQ), octrees
**Directions:** extending existing static-scene methods to dynamic scenes; making dynamic scene representations more compact remains open.
**Key problems:** strategies for controlling the number of Gaussians; compressed storage of Gaussian parameters
Total citations (7 papers): 610
Scaffold-GS
Exploits the underlying scene structure to help prune over-expanded Gaussians. Points are initialized from SfM, a sparse grid of Gaussian anchors is built, and Gaussian attributes are predicted dynamically from per-anchor features.
Lu T, Yu M, Xu L, et al. Scaffold-gs: Structured 3d gaussians for view-adaptive rendering[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2024: 20654-20664.
Google Scholar citations: 180
CompGS
Uses vector quantization (VQ) to compress parameters and reduce the memory footprint of the Gaussians. K-means clustering is applied separately to the color, spherical-harmonic, scaling, and rotation vectors for efficient quantization. Opacity and position are left unquantized: the former is a plain scalar, and quantizing the latter would cause Gaussians to overlap.
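The per-attribute clustering can be illustrated with a plain k-means vector quantizer (a sketch of the general technique, not CompGS's code); each Gaussian then stores only an index into the shared codebook:

```python
import numpy as np

def kmeans_codebook(vectors, k=8, iters=20, seed=0):
    """Plain k-means VQ: returns (codebook, per-vector codebook indices)."""
    rng = np.random.default_rng(seed)
    # Initialize centroids with k distinct input vectors.
    codebook = vectors[rng.choice(len(vectors), k, replace=False)]
    for _ in range(iters):
        # Assign each vector to its nearest centroid.
        d = np.linalg.norm(vectors[:, None] - codebook[None], axis=-1)
        idx = d.argmin(axis=1)
        # Move each centroid to the mean of its assigned vectors.
        for j in range(k):
            if np.any(idx == j):
                codebook[j] = vectors[idx == j].mean(axis=0)
    return codebook, idx
```

Storing an integer index per Gaussian plus one shared codebook per attribute is what yields the memory savings.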
Navaneet K L, Meibodi K P, Koohpayegani S A, et al. Compact3d: Compressing gaussian splat radiance field models with vector quantization[J]. arXiv preprint arXiv:2311.18159, 2023.
Google Scholar citations: 45
EAGLES
Also uses vector quantization (VQ). All attributes are quantized and encoded except the zeroth-order spherical-harmonic coefficients, the scaling vector, and the position vector.
Girish S, Gupta K, Shrivastava A. Eagles: Efficient accelerated 3d gaussians with lightweight encodings[C]//European Conference on Computer Vision. Cham: Springer Nature Switzerland, 2024: 54-71.
Google Scholar citations: 57
A Compact 3D Gaussian Representation
Applies a learnable volume mask to filter out unimportant Gaussians, then compresses the geometric attributes of the remaining Gaussians with residual VQ for memory efficiency.
Lee J C, Rho D, Sun X, et al. Compact 3d gaussian representation for radiance field[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2024: 21719-21728.
Google Scholar citations: 131
LightGaussian
Gaussians are first pruned by their global significance. The degree of the spherical-harmonic coefficients is then reduced via data distillation, and the coefficients of trivial Gaussians are quantized. In addition, position parameters are compressed with an octree-based algorithm, and the remaining attributes are stored in half precision.
Fan Z, Wang K, Wen K, et al. Lightgaussian: Unbounded 3d gaussian compression with 15x reduction and 200+ fps[J]. arXiv preprint arXiv:2311.17245, 2023.
Google Scholar citations: 112
Octree-GS
Builds on Scaffold-GS with an octree-based level-of-detail (LOD) representation. An octree anchors the initialized Gaussians, and each anchor is adaptively refined across adjacent LOD levels.
Ren K, Jiang L, Lu T, et al. Octree-gs: Towards consistent real-time rendering with lod-structured 3d gaussians[J]. arXiv preprint arXiv:2403.17898, 2024.
Google Scholar citations: 50
An Efficient 3D Gaussian Representation for Monocular/Multi-view Dynamic Scenes
Splits Gaussian parameters into time-invariant and time-varying sets, avoiding the cost of storing all parameters at every time step. Flow information is also used to reduce ambiguity between consecutive frames.
Katsumata K, Vo D M, Nakayama H. An efficient 3d gaussian representation for monocular/multi-view dynamic scenes[J]. arXiv preprint arXiv:2311.12897, 2023.
Google Scholar citations: 35
2. Improving rendered image quality
Image-quality problems:
(1) aliasing; (2) artifacts; (3) realism of reflections in the scene; (4) illumination decomposition (the original 3D-GS performs poorly on certain materials).
Challenges:
(1) Projecting 3D Gaussians onto the 2D image greatly speeds up rendering but complicates occlusion computation, leading to poor lighting estimation.
(2) 3D-GS is under-regularized and cannot capture precise geometric information or generate accurate normals.
(3) 3D-GS performs poorly in scenes with mirror-like objects and complex reflections.
(4) Over-reconstructed Gaussians can cause aliasing and artifacts that degrade the rendered image.
(5) Reconstructing scenes from inaccurate camera poses and motion-blurred images.
Directions:
(1) capturing view-dependent effects in scenes with specular objects and complex reflections
(2) filtering out and pruning over-reconstructed Gaussians without sacrificing expressiveness or rendering quality
(3) accurate normal estimation
Total citations (7 papers): 613
Multi-Scale 3D GS
The authors attribute the problem mainly to splatting large numbers of Gaussians packed into regions with complex 3D detail. The scene is represented at multiple scale levels: at each level, fine-grained Gaussians in each voxel whose size falls below a threshold are aggregated into larger Gaussians that are inserted into the next coarser level. Training uses the original images and their downsampled versions; at render time, Gaussians of the appropriate scale are selected.
Yan Z, Low W F, Chen Y, et al. Multi-scale 3d gaussian splatting for anti-aliased rendering[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2024: 20923-20931.
Google Scholar citations: 58
FreGS
Mitigates over-reconstruction. Regularizes using amplitude and phase discrepancies in the Fourier domain, and adopts a frequency-annealing strategy that drives densification coarse-to-fine, progressively learning low- to high-frequency components.
Zhang J, Zhan F, Xu M, et al. Fregs: 3d gaussian splatting with progressive frequency regularization[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2024: 21424-21433.
Google Scholar citations: 36
Mip-Splatting
Introduces a 3D smoothing filter and a 2D Mip filter to resolve ambiguities in 3D Gaussian optimization.
Yu Z, Chen A, Huang B, et al. Mip-splatting: Alias-free 3d gaussian splatting[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2024: 19447-19456.
Google Scholar citations: 296
Relightable 3D Gaussian
Represents the scene with a set of relightable 3D Gaussian points. Incident light is split into a local and a global component, modeled respectively by per-Gaussian spherical harmonics and shared global spherical harmonics multiplied by a visibility term.
Gao J, Gu C, Lin Y, et al. Relightable 3d gaussians: Realistic point cloud relighting with brdf decomposition and ray tracing[C]//European Conference on Computer Vision. Cham: Springer Nature Switzerland, 2024: 73-89.
Google Scholar citations: 95
GaussianShader
Enhances realism for scenes with specular and reflective surfaces. It explicitly models light-surface interaction and incorporates a simplified approximation of the rendering equation for high-quality rendering. To predict normals accurately on discrete 3D Gaussians, the shortest axis is taken as the approximate normal, and two additional trainable normal residuals are introduced for regularization, one for outward-facing and one for inward-facing axes.
Jiang Y, Tu J, Liu Y, et al. Gaussianshader: 3d gaussian splatting with shading functions for reflective surfaces[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2024: 5322-5332.
Google Scholar citations: 53
GS-IR
Brings 3D-GS into inverse rendering.
Liang Z, Zhang Q, Feng Y, et al. Gs-ir: 3d gaussian splatting for inverse rendering[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2024: 21644-21653.
Google Scholar citations: 68
SpecNeRF
Enhances NeRF by using 3D Gaussians as a directional encoding for specular reflections.
Ma L, Agrawal V, Turki H, et al. Specnerf: Gaussian directional encoding for specular reflections[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2024: 21188-21198.
Google Scholar citations: 7
3. Novel view synthesis from sparse images (few-shot problem)
Problem: sparse input views lead to insufficient Gaussian initialization, which can make the reconstruction collapse, raises the risk of overfitting, and yields over-smoothed results.
Approaches:
(1) An extra monocular depth-estimation model provides useful geometric priors for adjusting the 3D Gaussians so that unseen views are covered more effectively.
(2) Compensating for missing views with a pre-trained generative model alleviates the poor initialization, but misalignment between the generated views and the sparse inputs causes distortions in the reconstructed scene.
(3) Adding geometric constraints and depth regularization.
Total citations (4 papers): 264
Depth-Regularized Optimization
Chung J, Oh J, Lee K M. Depth-regularized optimization for 3d gaussian splatting in few-shot images[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2024: 811-820.
Google Scholar citations: 66
FSGS
Zhu Z, Fan Z, Jiang Y, et al. Fsgs: Real-time few-shot view synthesis using gaussian splatting[C]//European conference on computer vision. Cham: Springer Nature Switzerland, 2024: 145-163.
Google Scholar citations: 97
SparseGS
Xiong H, Muttukuru S, Upadhyay R, et al. Sparsegs: Real-time 360° sparse view synthesis using gaussian splatting[J]. arXiv preprint arXiv:2312.00206, 2023.
Google Scholar citations: 29
DNGaussian
Proposes a hard-and-soft depth regularization strategy that forces nearby Gaussians to form a complete surface. Scaling and rotation parameters are frozen to reduce their negative effect on color reconstruction. Depth is normalized both locally and globally, correcting small local depth errors while remaining sensitive to the global scale through a dedicated loss.
Li J, Zhang J, Bai X, et al. Dngaussian: Optimizing sparse-view 3d gaussian radiance fields with global-local depth normalization[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2024: 20775-20785.
Google Scholar citations: 72
4. Real-world dynamic scene reconstruction
Objects with large motion can cause unnatural distortions between consecutive frames.
Approaches:
(1) Combining neural networks with learned scene-specific dynamics improves deformation fidelity.
(2) Introducing physical constraints.
(3) Learning deformations.
Total citations (7 papers): 903
4D-GS (4D Gaussian Splatting for Real-Time Dynamic Scene Rendering)
Wu G, Yi T, Fang J, et al. 4d gaussian splatting for real-time dynamic scene rendering[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2024: 20310-20320.
Google Scholar citations: 447
3D Geometry-Aware Deformable GS
Lu Z, Guo X, Hui L, et al. 3d geometry-aware deformable gaussian splatting for dynamic view synthesis[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2024: 8900-8910.
Google Scholar citations: 32
MD-Splatting
Duisterhof B P, Mandi Z, Yao Y, et al. Md-splatting: Learning metric deformation from 4d gaussians in highly deformable scenes[J]. arXiv preprint arXiv:2312.00583, 2023.
Google Scholar citations: 33
DynMF
Decomposes the complex motion of a given scene into a small set of basis trajectories from which the motion of every point can be fully derived. The trajectories are predicted by an MLP, which itself forms a neural motion field.
Kratimenos A, Lei J, Daniilidis K. Dynmf: Neural motion factorization for real-time dynamic view synthesis with 3d gaussian splatting[C]//European Conference on Computer Vision. Cham: Springer Nature Switzerland, 2024: 252-269.
Google Scholar citations: 48
3DGStream
A two-stage framework. 3DGStream uses a multi-resolution voxel grid as the scene representation, with hash encoding and a shallow MLP serving as a Neural Transformation Cache (NTC). In the first stage, the NTC is trained to learn the translation and rotation of the 3D Gaussians for the next timestamp; then, under quality control, Gaussians are adaptively spawned to reconstruct subsequent frames.
Sun J, Jiao H, Li G, et al. 3dgstream: On-the-fly training of 3d gaussians for efficient streaming of photo-realistic free-viewpoint videos[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2024: 20675-20685.
Google Scholar citations: 46
4DGS
Models space and time jointly as a whole to address general dynamic scene representation and rendering.
Yang Z, Yang H, Pan Z, et al. Real-time photorealistic dynamic scene representation and rendering with 4d gaussian splatting[J]. arXiv preprint arXiv:2310.10642, 2023.
Google Scholar citations: 172
PhysGaussian
Xie T, Zong Z, Qiu Y, et al. Physgaussian: Physics-integrated 3d gaussians for generative dynamics[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2024: 4389-4398.
Google Scholar citations: 125
Surface reconstruction and mesh extraction in static scenes
3D mesh reconstruction and high-quality mesh rendering
SuGaR
Guédon A, Lepetit V. Sugar: Surface-aligned gaussian splatting for efficient 3d mesh reconstruction and high-quality mesh rendering[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2024: 5354-5363.
Google Scholar citations: 275
Surface reconstruction
NeuSG
Chen H, Li C, Lee G H. Neusg: Neural implicit surface reconstruction with 3d gaussian splatting guidance[J]. arXiv preprint arXiv:2312.00846, 2023.
Google Scholar citations: 80
Monocular and few-shot reconstruction tasks
pixelSplat
Charatan D, Li S L, Tagliasacchi A, et al. pixelsplat: 3d gaussian splats from image pairs for scalable generalizable 3d reconstruction[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2024: 19457-19467.
Google Scholar citations: 170
TriplaneGaussian
Zou Z X, Yu Z, Guo Y C, et al. Triplane meets gaussian splatting: Fast and generalizable single-view 3d reconstruction with transformers[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2024: 10324-10335.
Google Scholar citations: 136
Monocular 3D reconstruction
The key to monocular 3D reconstruction is a careful analysis of perspective relationships, texture, and motion patterns in the image. With monocular techniques, distances between objects can be estimated accurately and the overall shape of the scene recovered.
Splatter Image
Requires no camera poses.
Szymanowicz S, Rupprecht C, Vedaldi A. Splatter image: Ultra-fast single-view 3d reconstruction[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2024: 10208-10217.
Google Scholar citations: 146
Neural Parametric Gaussians (NPGs)
Das D, Wewer C, Yunus R, et al. Neural parametric gaussians for monocular non-rigid object reconstruction[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2024: 10715-10725.
Google Scholar citations: 36
Reconstructing specific dynamic objects, or small but realistic dynamic scenes
3D-PSHR
Real-time dynamic hand reconstruction.
Jiang Z, Rahmani H, Black S, et al. 3D points splatting for real-time dynamic hand reconstruction[J]. Pattern Recognition, 2025: 111426.
Google Scholar citations: 9
Gaussian-Flow
Fast dynamic indoor-scene reconstruction and real-time rendering.
Lin Y, Dai Z, Zhu S, et al. Gaussian-flow: 4d reconstruction with dynamic 3d gaussian particle[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2024: 21136-21145.
Google Scholar citations: 62
Deformable 3DGS
Yang Z, Gao X, Zhou W, et al. Deformable 3d gaussians for high-fidelity monocular dynamic scene reconstruction[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2024: 20331-20341.
Google Scholar citations: 273
Reconstructing large dynamic scenes
VastGaussian
Lin J, Li Z, Tang X, et al. Vastgaussian: Vast 3d gaussians for large scene reconstruction[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2024: 5166-5175.
Google Scholar citations: 87
DrivingGaussian
Models large-scale dynamic driving scenes.
Zhou X, Lin Z, Shan X, et al. Drivinggaussian: Composite gaussian splatting for surrounding dynamic autonomous driving scenes[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2024: 21634-21643.
Google Scholar citations: 147
Periodic Vibration Gaussian (PVG)
Chen Y, Gu C, Jiang J, et al. Periodic vibration gaussian: Dynamic urban scene reconstruction and real-time rendering[J]. arXiv preprint arXiv:2311.18561, 2023.
Google Scholar citations: 49
3D editing
Text-guided / image-guided 3D editing
GaussianEditor
Precise editing of 3D scenes represented by 3D Gaussians, driven by text instructions.
Wang J, Fang J, Zhang X, et al. Gaussianeditor: Editing 3d gaussians delicately with text instructions[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2024: 20902-20911.
Google Scholar citations: 81
TIP-Editor
Accepts text and image prompts together with a 3D bounding box that specifies the region to edit.
Zhuang J, Kang D, Cao Y P, et al. Tip-editor: An accurate 3d editor following both text-prompts and image-prompts[J]. ACM Transactions on Graphics (TOG), 2024, 43(4): 1-12.
Google Scholar citations: 25
Mesh-based Gaussian Splatting
Gao L, Yang J, Zhang B T, et al. Mesh-based gaussian splatting for real-time large-scale deformation[J]. arXiv preprint arXiv:2402.04796, 2024.
Google Scholar citations: 17
Non-rigid object manipulation
MANUS
Models articulated hands, combining MANUS-Hand with a 3D Gaussian representation of the object to model contact accurately.
Pokhariya C, Shah I N, Xing A, et al. MANUS: Markerless Grasp Capture using Articulated 3D Gaussians[J]. arXiv preprint arXiv:2312.02137, 2023.
Google Scholar citations: 8
GART
Lei J, Wang Y, Pavlakos G, et al. Gart: Gaussian articulated template models[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2024: 19876-19887.
Google Scholar citations: 69
Efficient 3D editing
Point’n Move
Huang J, Yu H, Zhang J, et al. Point’n Move: Interactive scene object manipulation on Gaussian splatting radiance fields[J]. IET Image Processing, 2024, 18(12): 3507-3517.
Google Scholar citations: 18
GaussianEditor
Chen Y, Chen Z, Zhang C, et al. Gaussianeditor: Swift and controllable 3d editing with gaussian splatting[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2024: 21476-21485.
Google Scholar citations: 140
4D editing
Control4D
Dynamic portrait editing with text instructions. GaussianPlanes is introduced as a novel 4D representation that structures Gaussian splatting via plane-based decomposition across 3D space and time.
Shao R, Sun J, Peng C, et al. Control4d: Dynamic portrait editing by learning 4d gan from 2d diffusion-based editor[J]. arXiv preprint arXiv:2305.20082, 2023, 2(6): 16.
Google Scholar citations: 44
SC-GS
Models scene motion with sparse control points and an MLP. SC-GS enables motion editing by manipulating the learned control points while maintaining high-fidelity appearance.
Huang Y H, Sun Y T, Yang Z, et al. Sc-gs: Sparse-controlled gaussian splatting for editable dynamic scenes[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2024: 4220-4230.
Google Scholar citations: 107
Controllable Gaussian Splatting (CoGS)
Yu H, Julin J, Milacski Z Á, et al. Cogs: Controllable gaussian splatting[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2024: 21624-21633.
Google Scholar citations: 33
3D generation
Object-level 3D generation
GaussianDreamer
Yi T, Fang J, Wu G, et al. Gaussiandreamer: Fast generation from text to 3d gaussian splatting with point cloud priors[J]. arXiv preprint arXiv:2310.08529, 2023.
Google Scholar citations: 139
AGG
Xu D, Yuan Y, Mardani M, et al. Agg: Amortized generative 3d gaussians for single image to 3d[J]. arXiv preprint arXiv:2401.04099, 2024.
Google Scholar citations: 29
DreamGaussian
Tang J, Ren J, Zhou H, et al. Dreamgaussian: Generative gaussian splatting for efficient 3d content creation[J]. arXiv preprint arXiv:2309.16653, 2023.
Google Scholar citations: 525
Scene-level 3D generation
CG3D
Vilesov A, Chari P, Kadambi A. Cg3d: Compositional generation for text-to-3d via gaussian splatting[J]. arXiv preprint arXiv:2311.17907, 2023.
Google Scholar citations: 32
LucidDreamer
Chung J, Lee S, Nam H, et al. Luciddreamer: Domain-free generation of 3d gaussian splatting scenes[J]. arXiv preprint arXiv:2311.13384, 2023.
Google Scholar citations: 91
Text2Immersion
Ouyang H, Heal K, Lombardi S, et al. Text2immersion: Generative immersive scene with 3d gaussians[J]. arXiv preprint arXiv:2312.09242, 2023.
Google Scholar citations: 29
4D generation
Align Your Gaussians (AYG)
Ling H, Kim S W, Torralba A, et al. Align your gaussians: Text-to-4d with dynamic 3d gaussians and composed diffusion models[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2024: 8576-8588.
Google Scholar citations: 94
4DGen
Yin Y, Xu D, Wang Z, et al. 4dgen: Grounded 4d content generation with spatial-temporal consistency[J]. arXiv preprint arXiv:2312.17225, 2023.
Google Scholar citations: 57
DreamGaussian4D
Ren J, Pan L, Tang J, et al. Dreamgaussian4d: Generative 4d gaussian splatting[J]. arXiv preprint arXiv:2312.17142, 2023.
Google Scholar citations: 95
Efficient4D
Pan Z, Yang Z, Zhu X, et al. Fast dynamic 3d object generation from a single-view video[J]. arXiv preprint arXiv:2401.08742, 2024.
Google Scholar citations: 20
3D Gaussian perception applications
Detection and localization
Detecting highly reflective or translucent objects remains a challenging task.
Language Embedded 3D Gaussians
Shi J C, Wang M, Duan H B, et al. Language embedded 3d gaussians for open-vocabulary scene understanding[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2024: 5333-5343.
Google Scholar citations: 48
Foundation Model Embedded Gaussian Splatting (FMGS)
Zuo X, Samangouei P, Zhou Y, et al. Fmgs: Foundation model embedded 3d gaussian splatting for holistic 3d scene understanding[J]. International Journal of Computer Vision, 2024: 1-17.
Google Scholar citations: 26
3D segmentation
Gaussian Grouping
Ye M, Danelljan M, Yu F, et al. Gaussian grouping: Segment and edit anything in 3d scenes[C]//European Conference on Computer Vision. Cham: Springer Nature Switzerland, 2024: 162-179.
Google Scholar citations: 114
Feature-3DGS
Zhou S, Chang H, Jiang S, et al. Feature 3dgs: Supercharging 3d gaussian splatting to enable distilled feature fields[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2024: 21676-21685.
Google Scholar citations: 94
SA-GS
Hu X, Wang Y, Fan L, et al. Semantic anything in 3d gaussians[J]. arXiv preprint arXiv:2401.17857, 2024.
Google Scholar citations: 23
SLAM
GS-SLAM
Yan C, Qu D, Xu D, et al. Gs-slam: Dense visual slam with 3d gaussian splatting[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2024: 19595-19604.
Google Scholar citations: 163
Gaussian Splatting SLAM
Matsuki H, Murai R, Kelly P H J, et al. Gaussian splatting slam[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2024: 18039-18048.
Google Scholar citations: 226
Photo-SLAM
Huang H, Li L, Cheng H, et al. Photo-SLAM: Real-time Simultaneous Localization and Photorealistic Mapping for Monocular Stereo and RGB-D Cameras[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2024: 21584-21593.
Google Scholar citations: 77
SplaTAM
Keetha N, Karhade J, Jatavallabhula K M, et al. SplaTAM: Splat Track & Map 3D Gaussians for Dense RGB-D SLAM[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2024: 21357-21366.
Google Scholar citations: 210
Gaussian-SLAM
Yugay V, Li Y, Gevers T, et al. Gaussian-slam: Photo-realistic dense slam with gaussian splatting[J]. arXiv preprint arXiv:2312.10070, 2023.
Google Scholar citations: 96
Virtual humans
In most methods, environment lighting is not parameterized, which makes relighting the avatar infeasible.
The intrinsic structure and connectivity between Gaussians in local regions are ignored.
For human-head modeling, methods that control motion with a 3DMM cannot express subtle facial expressions.
From multi-view videos
HuGS
Moreau A, Song J, Dhamo H, et al. Human gaussian splatting: Real-time rendering of animatable avatars[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2024: 788-798.
Google Scholar citations: 51
Animatable Gaussians
Li Z, Zheng Z, Wang L, et al. Animatable gaussians: Learning pose-dependent gaussian maps for high-fidelity human avatar modeling[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2024: 19711-19722.
Google Scholar citations: 104
ASH
Pang H, Zhu H, Kortylewski A, et al. Ash: Animatable gaussian splats for efficient and photoreal human rendering[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2024: 1165-1175.
Google Scholar citations: 48
D3GA
Zielonka W, Bagautdinov T, Saito S, et al. Drivable 3d gaussian avatars[J]. arXiv preprint arXiv:2311.08581, 2023.
Google Scholar citations: 79
HiFi4G
Jiang Y, Shen Z, Wang P, et al. Hifi4g: High-fidelity human performance rendering via compact gaussian splatting[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2024: 19734-19745.
Google Scholar citations: 37
GPS-Gaussian
Zheng S, Zhou B, Shao R, et al. Gps-gaussian: Generalizable pixel-wise 3d gaussian splatting for real-time human novel view synthesis[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2024: 19680-19690.
Google Scholar citations: 92
From monocular videos
HUGS (Human Gaussian Splats)
Kocabas M, Chang J H R, Gabriel J, et al. Hugs: Human gaussian splats[C]//Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2024: 505-515.
Google Scholar citations: 92
ParDy-Human
Jung H J, Brasch N, Song J, et al. Deformable 3d gaussian splatting for animatable human avatars[J]. arXiv preprint arXiv:2312.15059, 2023.
Google Scholar citations: 24
Human101
Li M, Tao J, Yang Z, et al. Human101: Training 100+ fps human gaussians in 100s from 1 view[J]. arXiv preprint arXiv:2312.15258, 2023.
Google Scholar citations: 23
3DGS-Avatar
Qian Z, Wang S, Mihajlovic M, et al. 3dgs-avatar: Animatable avatars via deformable 3d gaussian splatting[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2024: 5020-5030.
Google Scholar citations: 85
GaussianAvatar
Hu L, Zhang H, Zhang Y, et al. Gaussianavatar: Towards realistic human avatar modeling from a single video via animatable 3d gaussians[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2024: 634-644.
Google Scholar citations: 144
GauHuman
Hu S, Hu T, Liu Z. Gauhuman: Articulated gaussian splatting from monocular human videos[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2024: 20418-20431.
Google Scholar citations: 76
GaussianBody
Li M, Yao S, Xie Z, et al. Gaussianbody: Clothed human reconstruction via 3d gaussian splatting[J]. arXiv preprint arXiv:2401.09720, 2024.
Google Scholar citations: 29
SplattingAvatar
Shao Z, Wang Z, Li Z, et al. Splattingavatar: Realistic real-time human avatars with mesh-embedded gaussian splatting[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2024: 1606-1616.
Google Scholar citations: 54
GoMAvatar
Wen J, Zhao X, Ren Z, et al. Gomavatar: Efficient animatable human modeling from monocular video using gaussians-on-mesh[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2024: 2059-2069.
Google Scholar citations: 22
Human Heads
GaussianAvatars
Qian S, Kirschstein T, Schoneveld L, et al. Gaussianavatars: Photorealistic head avatars with rigged 3d gaussians[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2024: 20299-20309.
Google Scholar citations: 113
Gaussian Head Avatar
Xu Y, Chen B, Li Z, et al. Gaussian head avatar: Ultra high-fidelity head avatar via dynamic gaussians[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2024: 1931-1941.
Google Scholar citations: 83
HeadGas
Dhamo H, Nie Y, Moreau A, et al. Headgas: Real-time animatable head avatars via 3d gaussian splatting[C]//European Conference on Computer Vision. Cham: Springer Nature Switzerland, 2024: 459-476.
Google Scholar citations: 23
Point-based Morphable Shape Model (PMSM)
Zhao Z, Bao Z, Li Q, et al. Psavatar: A point-based morphable shape model for real-time head avatar creation with 3d gaussian splatting[J]. arXiv preprint arXiv:2401.12900, 2024.
Google Scholar citations: 13
MonoGaussianAvatar
Chen Y, Wang L, Li Q, et al. Monogaussianavatar: Monocular gaussian point-based head avatar[C]//ACM SIGGRAPH 2024 Conference Papers. 2024: 1-9.
Google Scholar citations: 40
GaussianHead
Wang J, Xie J C, Li X, et al. Gaussianhead: High-fidelity head avatars with learnable gaussian derivation[J]. arXiv preprint arXiv:2312.01632, 2023.
Google Scholar citations: 14