Unified Occlusion Culling: Portals, Visibility Umbra, and HZB

Reposted: November 15, 2010, 17:26:00

In any realistic 3D scene, there are things you can see and things you can't. It sounds simple, but applying this knowledge can greatly benefit the performance of an application tasked with rendering the 3D scene in real time. Generally, we want to lower the computational cost of objects that do not contribute to the scene - those that are outside our field of view or are obscured by other objects. The Z-buffer is the last stage in the rendering pipeline where an object can be identified as "invisible"; in this case, the object in question is a fragment, and it can only be identified as invisible when its depth value is "deeper" than something that has already been rendered at the same raster position. The Z-buffer test is invoked for each and every fragment of a higher-level object being rendered - the polygonal representation of a "thing" in the scene, such as a person, a wall, or a car - and thus imposes a rather high computational cost for visibility determination, even with recent optimizations made to hardware Z-buffer mechanisms.

Figure 1: A complex scene making use of occlusion culling optimizations. (Image from Yann L)

We would prefer to decide whether an object will be invisible well before incurring the cost of a per-fragment test on the object's entire polygonal representation. Mechanisms for doing so are referred to as "culling mechanisms" because they select only certain objects from a larger set (flock) of objects. Some well-known culling mechanisms include backface culling, which prevents the rasterization of any fragments on a polygon known to be "facing away" from the viewing position, and frustum culling, which prevents the consideration of polygons belonging to objects that are entirely outside the viewing frustum (field of view). Hierarchical spatial trees are also employed to further optimize the decision of whether objects are outside the viewing frustum. In sparse scenes, such as those found in space-flight simulators, those two mechanisms alone are usually sufficient to improve the rendering performance of an application.

Many scenes are not sparse, though, and with a more down-to-earth point of view in denser scenes, culling to the view frustum alone leaves a lot of needless overdraw. Perspective projection gives objects near the view position a large screen size, so that they collectively obscure large numbers of farther-away objects; we would like to figure out which far-away objects are completely hidden, and thus avoid bogging down our rendering pipeline with them. The mechanism for performing this operation is known as "occlusion culling", and it has been the focus of many research papers. In this article, I propose a system that combines several effective occlusion mechanisms for optimal performance in a variety of scene types, with a potential speedup of several hundred percent in some scenes. The specific occlusion mechanisms involved are portal systems, visibility umbra, and hierarchical Z-buffers.
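For reference, the two basic tests mentioned above, backface culling and frustum culling, can be sketched in a few lines of Python. This is only an illustrative sketch, not code from any particular engine: the function names, the counter-clockwise winding convention, and the representation of the frustum as a list of inward-facing planes with unit normals are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]

def sub(a: Vec3, b: Vec3) -> Vec3:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a: Vec3, b: Vec3) -> Vec3:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def is_backfacing(tri: Sequence[Vec3], eye: Vec3) -> bool:
    """True if the triangle faces away from the eye.

    Assumes counter-clockwise vertex order marks the front face.
    """
    v0, v1, v2 = tri
    normal = cross(sub(v1, v0), sub(v2, v0))
    return dot(normal, sub(eye, v0)) <= 0.0

@dataclass
class Plane:
    normal: Vec3  # assumed unit length, pointing into the frustum
    d: float      # plane equation: dot(normal, p) + d = 0

    def signed_distance(self, p: Vec3) -> float:
        return dot(self.normal, p) + self.d

def sphere_outside_frustum(center: Vec3, radius: float,
                           planes: Sequence[Plane]) -> bool:
    """True if the bounding sphere lies entirely behind any one plane."""
    return any(pl.signed_distance(center) < -radius for pl in planes)
```

A bounding sphere is used here because it makes the plane test a single comparison; the same structure applies to boxes or to the nodes of a hierarchical spatial tree, where one test can reject a whole group of objects.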

The Culling Systems

Portals: A portal system divides the empty space of a scene into "sectors" and makes connections between them using "portals", creating a graph of spaces (nodes) connected by portals (edges). The spaces are generally rooms and corridors, and the portals are windows and doorways, although in dense portal systems such as those generated from BSP, the portals are much more profuse. Rendering a portalized scene starts with the sector that contains the view position and tests whether the view frustum contains any portals in that sector. When a portal is found within the frustum, the frustum is clipped to the extents of that portal, and the process continues in a depth-first recursive manner into the sector connected through that portal; the recursion stops when all visible portals in a sector have been traversed. This allows the scene traversal to render only the objects that are visible through holes in the scene geometry, implicitly ignoring any geometry hidden behind solid chunks of the scene. We don't require a perfect clipping operation for our portal-constrained objects, so an object may be reached through more than one portal; a per-object flag can be used to identify those objects that have already been rendered in the current traversal. Portal occlusion is useful when you have tightly enclosed scenes with lots of potential overdraw and full of holes, such as building interiors.

Figure 2a: Portalized scene with clipped frusta shown.

Figure 2b: Graph representation of portalized scene. Lines indicate portal connections.
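The traversal described above can be sketched in a simplified 2D, top-down form. This is a minimal illustration under several assumptions that are not from the original article: the view frustum is reduced to an angular interval around the eye, portals are line segments that lie in front of the viewer (so angles never wrap around), objects are points, and the names `Portal`, `Sector`, and `collect_visible` are hypothetical.

```python
import math
from dataclasses import dataclass, field

@dataclass
class Portal:
    a: tuple      # one end of the portal opening, (x, y) in a top-down view
    b: tuple      # the other end of the opening
    target: str   # name of the sector on the far side

@dataclass
class Sector:
    objects: dict = field(default_factory=dict)  # object name -> (x, y)
    portals: list = field(default_factory=list)

def angle_from(eye, p):
    return math.atan2(p[1] - eye[1], p[0] - eye[0])

def collect_visible(sectors, name, eye, view, rendered=None, path=()):
    """Depth-first portal traversal; `view` is an angular interval (lo, hi)."""
    if rendered is None:
        rendered = []
    sector = sectors[name]
    lo, hi = view
    for obj, pos in sector.objects.items():
        # Per-object flag: skip anything already emitted through another portal.
        if obj not in rendered and lo <= angle_from(eye, pos) <= hi:
            rendered.append(obj)
    for portal in sector.portals:
        if portal.target in path:
            continue  # do not walk back into sectors on the current path
        p_lo, p_hi = sorted((angle_from(eye, portal.a), angle_from(eye, portal.b)))
        clipped = (max(lo, p_lo), min(hi, p_hi))  # clip the frustum to the portal
        if clipped[0] < clipped[1]:
            collect_visible(sectors, portal.target, eye, clipped,
                            rendered, path + (name,))
    return rendered

# Hypothetical scene: A -> B through a doorway; B leads on to C and D.
sectors = {
    "A": Sector({"table": (2, 0)}, [Portal((5, -1), (5, 1), "B")]),
    "B": Sector({"crate": (7, 0.5), "barrel": (7, 3)},
                [Portal((5, -1), (5, 1), "A"),
                 Portal((10, 3), (10, 5), "C"),
                 Portal((10, -1), (10, 1), "D")]),
    "C": Sector({"statue": (12, 4)}),
    "D": Sector({"lamp": (12, 0)}),
}
visible = collect_visible(sectors, "A", eye=(0, 0),
                          view=(-math.pi / 4, math.pi / 4))
print(visible)  # ['table', 'crate', 'lamp']
```

In this scene the doorway into C falls outside the frustum once it has been clipped to the A-B doorway, so sector C is never visited, and the barrel in B is rejected because it sits outside the clipped interval. A 3D engine would clip against the portal polygon rather than an angular interval, and would typically use portal facing rather than a path check to avoid walking backwards.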

Modern 3D engines have been using portals in their scene representations for quite some time. Doom and Build (Duke Nukem 3D, Shadow Warrior) used explicit 2D sectors and portals, with heights applied to the floors and ceilings to give a more 3D effect. Quake used CSG to construct levels, and in the process of compiling a static PVS for each sector, actually compiled a BSP and calculated the PVS using the portals generated from the BSP. Descent had a scene composed exclusively of cuboidal sectors connected together with portals situated at each face of the cuboids, giving one of the first true 3D portal engines. Jedi Knight was also an explicit sector and portal engine, with arbitrary 3D sectors and portal placement allowing much more realistic environments. Unreal also made use of portals, both manually placed and generated from BSP compilation. Most of these portal representations used very densely concentrated portals, generating them at every convex edge in a scene. This becomes prohibitive as the number of polygons (and thus edges) increases, so more recent engines have encouraged manually placed portals or have generated portals from a simplified set of scene geometry.

Visibility Umbra: This is a method of extruding a frustum of the "hidden space" behind the solid convex portion of an object, and is used for testing whether whole objects are inside or outside the extruded frustum. The frustum extrusion follows the same mechanism as that used for generating stencil shadow volumes, although with a few more rules: the silhouette edges of the object's occlusion mesh are determined, and planes are extruded away from the camera position; the occlusion mesh must be convex to give a frustum composed of a convex set of planes. This method is less common in modern games than portals, but some engine developers have cited use of this mechanism, with one prior publication on the topic available on gamasutra.com [Bacik]. This occlusion mechanism is useful when the scene has large and/or mostly convex objects that tend to obscure each other and other objects; Bacik makes use of it for having smoothly rolling hills occlude other hills and valleys in outdoor landscape scenes. The focus on large objects is due to the inability of the frusta of multiple objects to combine into a single testable volume, making it difficult for a large number of small objects to effectively occlude anything. A fused volume could be constructed using BSP methods, but I have not seen much prior research on that topic, and the cost of performing a BSP compilation each frame versus the savings of not rendering objects that are only hidden by combinations of occluders is hard to determine, as one is O(n^2) and the other is O(k), where n and k are independent variables. Also of note is the fact that this is a geometric approach, and thus can be quite precise, assuming the occlusion mesh being used is an accurate representation of the visible geometry.

Figure 3: An occlusion frustum built from an occluder, using the current viewer position. (Image from [Bacik])
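As a rough illustration of the extrusion, the sketch below builds an occlusion frustum from a single convex, planar occluder polygon and tests whether a bounding sphere lies entirely inside it. It is a simplified stand-in, not Bacik's implementation: a real occluder is a convex mesh whose silhouette edges must be found first, whereas here the polygon's own edges are taken as the silhouette. The function names and the bounding-sphere test are assumptions made for the example.

```python
import math

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def add(a, b):
    return (a[0] + b[0], a[1] + b[1], a[2] + b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def plane_through(p0, p1, p2, inside_point):
    """Plane (unit normal, d) through three points, oriented so that
    inside_point has a non-negative signed distance."""
    n = cross(sub(p1, p0), sub(p2, p0))
    length = math.sqrt(dot(n, n))
    n = (n[0] / length, n[1] / length, n[2] / length)
    d = -dot(n, p0)
    if dot(n, inside_point) + d < 0.0:
        n, d = (-n[0], -n[1], -n[2]), -d
    return n, d

def build_umbra(eye, occluder):
    """Occlusion frustum for a convex planar polygon (list of 3D vertices)."""
    count = len(occluder)
    centroid = tuple(sum(v[i] for v in occluder) / count for i in range(3))
    planes = []
    for i in range(count):
        a, b = occluder[i], occluder[(i + 1) % count]
        # Side plane: contains the eye and one silhouette edge.
        planes.append(plane_through(eye, a, b, centroid))
    # Cap plane: the occluder's own plane, facing away from the eye.
    beyond = add(centroid, sub(centroid, eye))
    planes.append(plane_through(occluder[0], occluder[1], occluder[2], beyond))
    return planes

def sphere_hidden(center, radius, planes):
    """True only if the sphere is entirely inside the occlusion frustum."""
    return all(dot(n, center) + d >= radius for n, d in planes)
```

For example, with the eye at the origin and a 2x2 quad at z = -5, a small sphere centred at (0, 0, -10) is reported hidden, while one at (3, 0, -10) or one in front of the quad at (0, 0, -3) is not. Because each occluder yields its own frustum, an object hidden only by the combination of two occluders is not detected, which is the fusion limitation noted above.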

HZB: A hierarchical Z-buffer (HZB) is a method of rasterizing certain portions of a scene into a depth buffer and then performing image-space occlusion queries on a hierarchical structure constructed from the depth buffer. The hierarchical structure contains conservative depth information, in a pyramidal structure similar to a mipmap chain, and allows very fast queries to be performed on many objects in a scene. The conservative depth information is the minimum and/or maximum depth values downsampled in a 4:1 manner, as with mipmaps. The simplest HZB stores only maximum depth information at each sample position, which allows rejection (definitely invisible) at low levels of the structure; you can also store minimum depth values to accept objects (definitely visible). Generally only objects determined or known to be "good occluders" are rasterized into the root-level depth buffer, and coarse representations of other objects are tested against the hierarchical structure. These coarse representations can be hierarchical spatial models, like those used in frustum culling, to further optimize the culling mechanism by considering a large group of objects with a single test. Due to the high cost of the hierarchy generation, all occluders must be selected and rasterized before visibility testing can be performed; this can be problematic if both near and far objects in the scene provide occlusion, and thus the selection of occluders is a matter that benefits from tweaking for specific scenes or scene types.

Figure 4: A simple HZB construction with a 4x4 base depth buffer.
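The construction and query can be sketched for a small square depth buffer like the 4x4 one in Figure 4. This is a minimal, maximum-depth-only sketch under stated assumptions: larger depth values are farther away, 1.0 means nothing was rasterized, the base buffer is square with a power-of-two size, and an object is described by an inclusive pixel rectangle plus its nearest depth. The function names and the sample depth values are hypothetical.

```python
def build_pyramid(base):
    """Return [level0, level1, ...], each level the 2x2 max of the one below."""
    levels = [base]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        half = len(prev) // 2
        levels.append([
            [max(prev[2 * y][2 * x], prev[2 * y][2 * x + 1],
                 prev[2 * y + 1][2 * x], prev[2 * y + 1][2 * x + 1])
             for x in range(half)]
            for y in range(half)
        ])
    return levels

def is_occluded(levels, rect, z_near, level=None, x=0, y=0):
    """True if everything inside rect = (x0, y0, x1, y1) is nearer than z_near.

    Starts at the coarsest level and descends only where the coarse
    maximum depth is not enough to reject the object.
    """
    if level is None:
        level = len(levels) - 1
    size = 1 << level  # base pixels covered by one texel along each axis
    x0, y0, x1, y1 = rect
    if (x * size > x1 or (x + 1) * size - 1 < x0 or
            y * size > y1 or (y + 1) * size - 1 < y0):
        return True  # texel does not overlap the rectangle; nothing to test
    if z_near > levels[level][y][x]:
        return True  # object is farther than the farthest occluder depth here
    if level == 0:
        return False  # a base pixel where the object may be visible
    return all(is_occluded(levels, rect, z_near, level - 1, 2 * x + dx, 2 * y + dy)
               for dy in (0, 1) for dx in (0, 1))

# Hypothetical 4x4 occluder depth buffer; the top-right quadrant is empty.
base = [
    [0.2, 0.2, 1.0, 1.0],
    [0.2, 0.2, 1.0, 1.0],
    [0.3, 0.3, 0.4, 0.4],
    [0.3, 0.3, 0.4, 0.4],
]
levels = build_pyramid(base)
```

With this buffer, an object covering the top-left quadrant at depth 0.5 is rejected after a single level-1 lookup, while the same object stretched across the whole top half is not, because it overlaps the empty quadrant. Storing a second pyramid of minimum depths would allow the complementary "definitely visible" test.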

Now that all three have been introduced, let's make some comparisons. The HZB and visibility umbra methods are similar in that they find spaces that are definitely invisible, as opposed to the portal method, which finds spaces that are definitely visible - I will refer to the portal method as "negative occlusion" and the others as "positive occlusion". Both positive occlusion methods require some metric for determining "good occluders" from the set of objects immediately visible in a scene - some objects might always be good occluders, some might be good occluders only when they occupy a certain amount of screen space, and some might just not be useful for occlusion. Examples of the three types, respectively: walls and doors, rocks, and shrubbery. The primary advantage that the HZB method has over visibility umbra is the inherent ability to fuse distinct occluders into a single useful occlusion structure, whereas the visibility umbrae of several objects are more difficult to fuse together. HZB is a fairly high-cost computation to perform, due to its per-fragment nature and per-frame hierarchical reconstruction, and thus should be used mainly in scenes where portals and visibility umbra tend to fail. Such HZB-friendly scenes include a dense forest where many relatively thin tree trunks combine to form large swaths of blocked visibility, or a city scene where skyscrapers form a forest of buildings in much the same manner. It is important to note that the HZB is implemented on the CPU, so that high-level scene visibility can run in parallel with the fine-grain Z-buffering done by video hardware, without needing to synchronize the two processors. The resolution of the root-level depth buffer in an HZB does not need to equal the size of the viewport used on the video hardware, and a generally useful resolution is around 128x128 to 256x256 pixels. In scenes where HZB is unnecessary, it will simply slow things down by tying up the CPU with rasterization.

Unification of the culling mechanisms

None of these methods is a silver bullet for solving your occlusion culling issues, but when combined they can be quite effective.
To Be Continued ...

References

[Yann L] Yann Lombard. Various responses in threads on gamedev.net.
[Bacik] Michal Bacik. Rendering the Great Outdoors: Fast Occlusion Culling for Outdoor Environments. Gamasutra, July 17, 2002.


