Real-Time Rendering——5.4.2 Screen-Based Antialiasing 基于屏幕的抗锯齿

Edges of triangles produce noticeable artifacts if not sampled and filtered well. Shadow boundaries, specular highlights, and other phenomena where the color is changing rapidly can cause similar problems. The algorithms discussed in this section help improve the rendering quality for these cases. They have the common thread that they are screen based, i.e., that they operate only on the output samples of the pipeline. There is no one best antialiasing technique, as each has different advantages in terms of quality, ability to capture sharp details or other phenomena, appearance during movement, memory cost, GPU requirements, and speed.

如果采样和过滤不好,三角形的边缘会产生明显的伪像。阴影边界、镜面高光和其他颜色快速变化的现象也会导致类似的问题。本节讨论的算法有助于提高这些情况下的渲染质量。它们具有基于屏幕的共同点,即它们只对流水线的输出样本进行操作。没有一种最好的抗锯齿技术,因为每种技术在质量、捕捉清晰细节或其他现象的能力、运动中的外观、内存成本、GPU要求和速度方面都有不同的优势。

In the black triangle example in Figure 5.14, one problem is the low sampling rate. A single sample is taken at the center of each pixel’s grid cell, so the most that is known about the cell is whether or not the center is covered by the triangle. By using more samples per screen grid cell and blending these in some fashion, a better pixel color can be computed. This is illustrated in Figure 5.24.

在图5.14的黑三角例子中,一个问题是低采样率。在每个像素的网格单元的中心提取单个样本,因此关于该单元的最多已知信息是该中心是否被三角形覆盖。通过在每个屏幕网格单元中使用更多的样本,并以某种方式混合这些样本,可以计算出更好的像素颜色。这如图5.24所示。

Figure 5.24. On the left, a red triangle is rendered with one sample at the center of the pixel. Since the triangle does not cover the sample, the pixel will be white, even though a substantial part of the pixel is covered by the red triangle. On the right, four samples are used per pixel, and as can be seen, two of these are covered by the red triangle, which results in a pink pixel color.

图5.24。在左侧,一个红色三角形被渲染,其中一个样本位于像素的中心。因为三角形没有覆盖样本,所以像素将是白色的,即使像素的大部分被红色三角形覆盖。在右边,每个像素使用四个样本,可以看到,其中两个样本被红色三角形覆盖,这导致了粉红色的像素颜色。

The general strategy of screen-based antialiasing schemes is to use a sampling pattern for the screen and then weight and sum the samples to produce a pixel color, p:

基于屏幕的抗锯齿方案的一般策略是对屏幕使用采样模式,然后对样本进行加权和求和,以产生像素颜色p:

$$p(x, y) = \sum_{i=1}^{n} w_i \, c(i, x, y)$$

where n is the number of samples taken for a pixel. The function c(i, x, y) is a sample color and w_i is a weight, in the range [0, 1], that the sample will contribute to the overall pixel color. The sample position is taken based on which sample it is in the series 1, ..., n, and the function optionally also uses the integer part of the pixel location (x, y). In other words, where the sample is taken on the screen grid is different for each sample, and optionally the sampling pattern can vary from pixel to pixel. Samples are normally point samples in real-time rendering systems (and most other rendering systems, for that matter). So, the function c can be thought of as two functions. First, a function f(i, n) retrieves the floating point (x_f, y_f) location on the screen where a sample is needed. This location on the screen is then sampled, i.e., the color at that precise point is retrieved. The sampling scheme is chosen and the rendering pipeline configured to compute the samples at particular subpixel locations, typically based on a per-frame (or per-application) setting.

其中n是对一个像素进行采样的数量。函数c(i,x,y)是样本颜色,wi是权重,在范围[0,1]内,样本将对整体像素颜色有贡献。根据样品在系列1, . . . , n,并且该函数还可选地使用像素位置(x,y)的整数部分。换句话说,对于每个样本,在屏幕网格上获取样本的位置是不同的,并且可选地,采样模式可以因像素而异。在实时渲染系统(以及大多数其他渲染系统)中,采样通常是点采样。所以,函数c可以看作是两个函数。首先,函数f (i,n)检索需要样本的屏幕上的浮点(xf,yf)位置。然后对屏幕上的该位置进行采样,即检索该精确点的颜色。通常基于每帧(或每应用)设置,选择采样方案并配置渲染流水线以计算特定子像素位置处的样本。

The other variable in antialiasing is w_i, the weight of each sample. These weights sum to one. Most methods used in real-time rendering systems give a uniform weight to their samples, i.e., w_i = 1/n. The default mode for graphics hardware, a single sample at the center of the pixel, is the simplest case of the antialiasing equation above. There is only one term, the weight of this term is one, and the sampling function f always returns the center of the pixel being sampled.

抗锯齿的另一个变量是wi,即每个样本的权重。这些重量加起来等于一。实时渲染系统中使用的大多数方法都为样本赋予统一的权重,即wi = 1/n。图形硬件的默认模式是在像素中心放置一个样本,这是上述抗锯齿等式的最简单情况。只有一项,该项的权重为1,采样函数f总是返回被采样像素的中心。
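
As a minimal sketch of the weighted sum described above, the following Python function combines per-sample colors into one pixel color. The function name `pixel_color` and the RGB-tuple representation are assumptions made for illustration; the samples are taken as already computed, so the sample-position function f is not modeled.

```python
def pixel_color(samples, weights=None):
    """Weighted sum of sample colors: p = sum_i w_i * c_i.

    samples: list of (r, g, b) tuples, one per sample.
    weights: list of floats summing to one; uniform 1/n if omitted.
    """
    n = len(samples)
    if weights is None:
        weights = [1.0 / n] * n  # uniform weighting, w_i = 1/n
    assert abs(sum(weights) - 1.0) < 1e-9, "weights must sum to one"
    return tuple(
        sum(w * c[ch] for w, c in zip(weights, samples))
        for ch in range(3)
    )


RED, WHITE = (1.0, 0.0, 0.0), (1.0, 1.0, 1.0)

# Right side of Figure 5.24: two of four samples are covered by the red triangle.
pink = pixel_color([RED, RED, WHITE, WHITE])

# Default hardware mode: a single sample with weight one.
single = pixel_color([WHITE])
```

With two red and two white samples the result is an even mix of the two colors, which is the pink pixel of Figure 5.24.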

Antialiasing algorithms that compute more than one full sample per pixel are called supersampling (or oversampling) methods. Conceptually simplest, full-scene antialiasing (FSAA), also known as “supersampling antialiasing” (SSAA), renders the scene at a higher resolution and then filters neighboring samples to create an image. For example, say an image of 1280 × 1024 pixels is desired. If you render an image of 2560 × 2048 offscreen and then average each 2 × 2 pixel area on the screen, the desired image is generated with four samples per pixel, filtered using a box filter. Note that this corresponds to 2 × 2 grid sampling in Figure 5.25. This method is costly, as all subsamples must be fully shaded and filled, with a z-buffer depth per sample. FSAA’s main advantage is simplicity. Other, lower-quality versions of this method sample at twice the rate on only one screen axis, and so are called 1 × 2 or 2 × 1 supersampling. Typically, powers-of-two resolution and a box filter are used for simplicity. NVIDIA’s dynamic super resolution feature is a more elaborate form of supersampling, where the scene is rendered at some higher resolution and a 13-sample Gaussian filter is used to generate the displayed image.

每像素计算一个以上完整样本的抗锯齿算法称为超级采样(或过采样)方法。概念上最简单的,全场景抗锯齿(FSAA),也称为“超级采样抗锯齿”(SSAA),以更高的分辨率渲染场景,然后过滤邻近的样本以创建图像。例如,假设需要1280 × 1024像素的图像。如果在屏幕外渲染2560×2048的图像,然后对屏幕上的每个2×2像素区域进行平均,则会生成所需的图像,每个像素有四个样本,并使用箱式过滤器进行过滤。注意,这相当于图5.25中的2 × 2网格采样。这种方法成本很高,因为所有子样本都必须完全遮蔽和填充,每个样本都有一个z缓冲深度。FSAA的主要优势是简单。此方法的其他低质量版本仅在一个屏幕轴上以两倍的速率采样,因此被称为1×2或2×1超级采样。为简单起见,通常使用2的幂分辨率和箱式滤波器。NVIDIA的动态超分辨率功能是一种更复杂的超级采样形式,其中场景以更高的分辨率渲染,13-样本高斯滤波器用于生成显示的图像。
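
The 2 × 2 box-filter step of FSAA can be sketched as follows. This is an illustrative example only: the image is assumed to be a grayscale list of rows, and `downsample_2x2` is a hypothetical helper name.

```python
def downsample_2x2(hi_res):
    """Average each 2x2 block of a grayscale image given as a list of rows."""
    h, w = len(hi_res), len(hi_res[0])
    assert h % 2 == 0 and w % 2 == 0, "dimensions must be even"
    return [
        [
            (hi_res[y][x] + hi_res[y][x + 1]
             + hi_res[y + 1][x] + hi_res[y + 1][x + 1]) / 4.0
            for x in range(0, w, 2)
        ]
        for y in range(0, h, 2)
    ]


# A 4x4 offscreen render (0 = background, 1 = triangle) for a 2x2 final image.
hi = [
    [0, 0, 1, 1],
    [0, 1, 1, 1],
    [1, 1, 1, 1],
    [1, 1, 1, 1],
]
lo = downsample_2x2(hi)
```

The top-left output pixel has one of its four subsamples covered, so it receives a coverage of one quarter; the other three pixels are fully covered.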

Figure 5.25. A comparison of some pixel sampling schemes, ranging from least to most samples per pixel. Quincunx shares the corner samples and weights its center sample to be worth half of the pixel’s final color. The 2 × 2 rotated grid captures more gray levels for the nearly horizontal edge than a straight 2×2 grid. Similarly, the 8 rooks pattern captures more gray levels for such lines than a 4 × 4 grid, despite using fewer samples. 

图5.25。一些像素采样方案的比较,范围从每个像素最少到最多采样。梅花共享角点样本,并对其中心样本进行加权,使其值为像素最终颜色的一半。对于接近水平的边缘,2 × 2旋转网格比直的2×2网格捕获更多的灰度级。类似地,8车模式比4 × 4网格捕捉到更多的灰度级,尽管使用了更少的样本。

A sampling method related to supersampling is based on the idea of the accumulation buffer. Instead of one large offscreen buffer, this method uses a buffer that has the same resolution as the desired image, but with more bits of color per channel. To obtain a 2 × 2 sampling of a scene, four images are generated, with the view moved half a pixel in the screen x- or y-direction as needed. Each image generated is based on a different sample position within the grid cell. The additional costs of having to re-render the scene a few times per frame and copy the result to the screen make this algorithm costly for real-time rendering systems. It is useful for generating higher-quality images when performance is not critical, since any number of samples, placed anywhere, can be used per pixel [1679]. The accumulation buffer used to be a separate piece of hardware. It was supported directly in the OpenGL API, but was deprecated in version 3.0. On modern GPUs the accumulation buffer concept can be implemented in a pixel shader by using a higher-precision color format for the output buffer.

与超级采样相关的采样方法是基于累积缓冲器的思想。这种方法不是使用一个大的屏幕外缓冲区,而是使用一个与所需图像具有相同分辨率的缓冲区,但每个通道有更多的颜色位。为了获得场景的2 × 2采样,生成四幅图像,视图根据需要在屏幕x或y方向上移动半个像素。生成的每个图像都基于网格单元内不同的样本位置。每帧必须重新渲染场景几次并将结果复制到屏幕上的额外成本使得该算法对于实时渲染系统来说成本很高。当性能不重要时,它有助于生成更高质量的图像,因为每个像素可以使用任何数量的样本,放置在任何地方。累积缓冲器曾经是一个独立的硬件。它在OpenGL API中受到直接支持,但在3.0版中被弃用。在现代GPU上,通过为输出缓冲区使用更高精度的颜色格式,可以在像素着色器中实现累积缓冲区概念。
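
The accumulation idea can be sketched in a few lines. Everything here is a toy assumption: the "scene" is a predicate `inside(x, y)`, `render` takes one point sample per pixel at the pixel center plus an offset, and `GRID_2X2` places the four sample positions half a pixel apart.

```python
def render(inside, width, height, dx, dy):
    """One point sample per pixel, taken at the pixel center plus (dx, dy)."""
    return [
        [1.0 if inside(x + 0.5 + dx, y + 0.5 + dy) else 0.0
         for x in range(width)]
        for y in range(height)
    ]


def accumulate(inside, width, height, offsets):
    """Re-render once per offset and sum the equally weighted images."""
    accum = [[0.0] * width for _ in range(height)]
    for dx, dy in offsets:
        image = render(inside, width, height, dx, dy)
        for y in range(height):
            for x in range(width):
                accum[y][x] += image[y][x] / len(offsets)
    return accum


# Four sample positions within each pixel, half a pixel apart on each axis.
GRID_2X2 = [(-0.25, -0.25), (0.25, -0.25), (-0.25, 0.25), (0.25, 0.25)]


def half_plane(x, y):
    """Toy scene: everything left of a vertical edge through pixel column 1."""
    return x < 1.5


result = accumulate(half_plane, 3, 1, GRID_2X2)
```

The middle pixel, which the edge cuts in half, ends up with half coverage, while its neighbors are fully covered and empty, respectively.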

Additional samples are needed when phenomena such as object edges, specular highlights, and sharp shadows cause abrupt color changes. Shadows can often be made softer and highlights smoother to avoid aliasing. Particular object types can be increased in size, such as electrical wires, so that they are guaranteed to cover at least one pixel at each location along their length. Aliasing of object edges still remains as a major sampling problem. It is possible to use analytical methods, where object edges are detected during rendering and their influence is factored in, but these are often more expensive and less robust than simply taking more samples. However, GPU features such as conservative rasterization and rasterizer order views have opened up new possibilities.

当对象边缘、镜面高光和尖锐阴影等现象导致颜色突然变化时,需要额外的样本。通常可以使阴影更柔和,高光更平滑,以避免锯齿。特定对象类型的尺寸可以增加,例如电线,以便保证它们在沿其长度的每个位置覆盖至少一个像素。物体边缘的锯齿仍然是一个主要的采样问题。可以使用分析方法,在渲染过程中检测对象边缘,并将它们的影响考虑在内,但这些方法通常比简单地获取更多样本更昂贵,也更不稳定。然而,保守光栅化和光栅化顺序视图等GPU功能开辟了新的可能性。

Techniques such as supersampling and accumulation buffering work by generating samples that are fully specified with individually computed shades and depths. The overall gains are relatively low and the cost is high, as each sample has to run through a pixel shader.

诸如超级采样和累积缓冲之类的技术通过生成用单独计算的阴影和深度完全指定的样本来工作。总体增益相对较低,成本较高,因为每个样本都必须经过像素着色器。

Multisampling antialiasing (MSAA) lessens the high computational costs by computing the surface’s shade once per pixel and sharing this result among the samples. Pixels may have, say, four (x, y) sample locations per fragment, each with their own color and z-depth, but the pixel shader is evaluated only once for each object fragment applied to the pixel. If all MSAA positional samples are covered by the fragment, the shading sample is evaluated at the center of the pixel. If instead the fragment covers fewer positional samples, the shading sample’s position can be shifted to better represent the positions covered. Doing so avoids shade sampling off the edge of a texture, for example. This position adjustment is called centroid sampling or centroid interpolation and is done automatically by the GPU, if enabled. Centroid sampling avoids off-triangle problems but can cause derivative computations to return incorrect values. See Figure 5.26.

多采样抗锯齿(MSAA)通过每个像素计算一次表面的着色并在样本之间共享该结果来减少高计算成本。比方说,像素的每个片段可能有四个(x,y)采样位置,每个采样位置都有自己的颜色和z深度,但是对于应用于像素的每个对象片段,像素着色器只计算一次。如果片段覆盖了所有MSAA位置样本,则在像素的中心评估着色样本。相反,如果片段覆盖更少的位置样本,则可以移动着色样本的位置,以更好地表示所覆盖的位置。例如,这样做可以避免纹理边缘的着色采样。这种位置调整称为质心采样或质心插值,由GPU自动完成(如果启用)。质心采样避免了非三角形问题,但会导致导数计算返回不正确的值。参见图5.26。

Figure 5.26. In the middle, a pixel with two objects overlapping it. The red object covers three samples, the blue just one. Pixel shader evaluation locations are shown in green. Since the red triangle covers the center of the pixel, this location is used for shader evaluation. The pixel shader for the blue object is evaluated at the sample’s location. For MSAA, a separate color and depth is stored at all four locations. On the right the 2f4x mode for EQAA is shown. The four samples now have four ID values, which index a table of the two colors and depths stored. 

图5.26。中间是两个对象重叠的像素。红色物体覆盖三个样本,蓝色物体只覆盖一个样本。像素着色器评估位置以绿色显示。由于红色三角形覆盖了像素的中心,因此该位置用于着色器评估。蓝色对象的像素着色器在样本位置进行计算。对于MSAA,不同的颜色和深度存储在四个位置。右侧显示了EQAA的2f4x模式。这四个样本现在有四个ID值,它们为存储的两种颜色和深度的表建立索引。
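
The key property of MSAA, one shader evaluation per fragment with per-sample color and depth storage, can be sketched as below. The class `MsaaPixel`, the boolean coverage list, and the depth values are illustrative assumptions, not a description of any GPU's actual data layout; centroid sampling is not modeled.

```python
class MsaaPixel:
    """Toy model of one pixel with per-sample color and depth storage."""

    def __init__(self, num_samples=4, background=(1.0, 1.0, 1.0)):
        self.colors = [background] * num_samples
        self.depths = [float("inf")] * num_samples

    def write_fragment(self, shade, depth, coverage):
        """shade: zero-argument callable standing in for the pixel shader.
        coverage: list of booleans, one per sample location."""
        if not any(coverage):
            return
        color = shade()  # evaluated once, shared by all covered samples
        for i, covered in enumerate(coverage):
            if covered and depth < self.depths[i]:
                self.colors[i] = color
                self.depths[i] = depth


RED, BLUE = (1.0, 0.0, 0.0), (0.0, 0.0, 1.0)
calls = []  # records every shader invocation


def make_shader(color):
    def shade():
        calls.append(color)
        return color
    return shade


# Middle of Figure 5.26: red covers three samples, blue covers one.
pixel = MsaaPixel()
pixel.write_fragment(make_shader(RED), 0.5, [True, True, True, False])
pixel.write_fragment(make_shader(BLUE), 0.7, [False, False, False, True])
```

Four sample colors are stored, but the stand-in shader runs only twice, once per fragment.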

MSAA is faster than a pure supersampling scheme because the fragment is shaded only once. It focuses effort on sampling the fragment’s pixel coverage at a higher rate and sharing the computed shade. It is possible to save more memory by further decoupling sampling and coverage, which in turn can make antialiasing faster still—the less memory touched, the quicker the render. NVIDIA introduced coverage sampling antialiasing (CSAA) in 2006, and AMD followed suit with enhanced quality antialiasing (EQAA). These techniques work by storing only the coverage for the fragment at a higher sampling rate. For example, EQAA’s “2f4x” mode stores two color and depth values, shared among four sample locations. The colors and depths are no longer stored for particular locations but rather saved in a table. Each of the four samples then needs just one bit to specify which of the two stored values is associated with its location. See Figure 5.26. The coverage samples specify the contribution of each fragment to the final pixel color. If the number of colors stored is exceeded, a stored color is evicted and its samples are marked as unknown. These samples do not contribute to the final color [382, 383]. For most scenes there are relatively few pixels containing three or more visible opaque fragments that are radically different in shade, so this scheme performs well in practice [1405]. However, for highest quality, the game Forza Horizon 2 went with 4× MSAA, though EQAA had a performance benefit.

MSAA比纯粹的超级采样方案更快,因为片段只被着色一次。它致力于以更高的速率对片段的像素覆盖进行采样,并共享计算出的着色。通过进一步分离采样和覆盖,可以节省更多的内存,这反过来又可以使抗锯齿更快-占用的内存越少,渲染越快。NVIDIA在2006年推出了覆盖采样抗锯齿(CSAA),AMD紧随其后推出了增强质量抗锯齿(EQAA)。这些技术通过以较高的采样率仅存储片段的覆盖来工作。例如,EQAA的“2f4x”模式存储两个颜色和深度值,在四个采样位置之间共享。颜色和深度不再存储在特定位置,而是保存在表格中。四个样本中的每一个都只需要一个比特来指定两个存储值中的哪一个与其位置相关联。参见图5.26。覆盖样本指定了每个片段对最终像素颜色的贡献。如果超过存储的颜色数量,存储的颜色将被清除,其样本将被标记为未知。这些样本对最终颜色没有贡献[382,383]。对于大多数场景,包含三个或三个以上明显不同的不透明片段的像素相对较少,因此该方案在实践中表现良好[1405]。然而,为了获得最高质量,游戏Forza Horizon 2采用了4倍MSAA,尽管EQAA有性能优势。
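
The table-plus-ID storage of a mode such as 2f4x can be sketched as follows. The function `resolve_eqaa` is hypothetical, and renormalizing over the known samples is an assumption made here for illustration; the text only states that unknown samples do not contribute to the final color.

```python
def resolve_eqaa(table, sample_ids):
    """table: stored fragment colors (two entries for a 2f4x-style mode).
    sample_ids: per-sample index into table, or None for 'unknown'."""
    known = [i for i in sample_ids if i is not None]
    if not known:
        return None  # nothing stored for this pixel
    # Assumption: average over the known samples only.
    return tuple(
        sum(table[i][ch] for i in known) / len(known)
        for ch in range(3)
    )


RED, BLUE = (1.0, 0.0, 0.0), (0.0, 0.0, 1.0)

# Right side of Figure 5.26: four one-bit IDs index a table of two colors.
color = resolve_eqaa([RED, BLUE], [0, 0, 0, 1])

# One sample was evicted and marked unknown, so it is skipped.
partial = resolve_eqaa([RED, BLUE], [0, 0, None, 1])
```

Only two colors are stored for the pixel, yet the four coverage IDs still give red three quarters of the weight in the first case.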

Once all geometry has been rendered to a multiple-sample buffer, a resolve operation is then performed. This procedure averages the sample colors together to determine the color for the pixel. It is worth noting that a problem can arise when using multisampling with high dynamic range color values. In such cases, to avoid artifacts you normally need to tone-map the values before the resolve. This can be expensive, so a simpler approximation to the tone map function or other methods can be used.

一旦所有几何图形都被渲染到多采样缓冲区,就执行解析操作。此过程将样本颜色平均在一起,以确定像素的颜色。值得注意的是,当使用具有高动态范围颜色值的多重采样时,可能会出现问题。在这种情况下,为了避免伪像,通常需要在解析之前对值进行色调映射。这可能是昂贵的,因此可以使用色调映射函数的更简单的近似或其他方法。
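
To see why high dynamic range values cause trouble in a plain average, the sketch below compares a linear resolve with one that tone-maps each sample first. The curve t(c) = c / (1 + c) is an assumption chosen for illustration (a simple Reinhard-style operator); the text does not specify which tone map or approximation to use.

```python
def tonemap(c):
    """Illustrative tone curve mapping [0, inf) into [0, 1)."""
    return c / (1.0 + c)


def inverse_tonemap(t):
    """Inverse of tonemap, valid for 0 <= t < 1."""
    return t / (1.0 - t)


def resolve_linear(samples):
    """Plain box-filter resolve on HDR values."""
    return sum(samples) / len(samples)


def resolve_tonemapped(samples):
    """Average in tone-mapped space, then return to HDR space."""
    mean = sum(tonemap(c) for c in samples) / len(samples)
    return inverse_tonemap(mean)


# An edge pixel: one very bright sample next to three dim ones.
edge = [100.0, 0.1, 0.1, 0.1]
linear = resolve_linear(edge)
mapped = resolve_tonemapped(edge)
```

In the linear resolve the single bright sample dominates, so the pixel stays far above display white and the edge still looks jagged after tone mapping. Averaging in tone-mapped space keeps the result within the displayable range, reflecting that three of the four samples are dim.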

By default, MSAA is resolved with a box filter. In 2007 ATI introduced custom filter antialiasing (CFAA), with the capabilities of using narrow and wide tent filters that extend slightly into other pixel cells. This mode has since been supplanted by EQAA support. On modern GPUs pixel or compute shaders can access the MSAA samples and use whatever reconstruction filter is desired, including one that samples from the surrounding pixels’ samples. A wider filter can reduce aliasing, though at the loss of sharp details. Pettineo found that the cubic smoothstep and B-spline filters with a filter width of 2 or 3 pixels gave the best results overall. There is also a performance cost, as even emulating the default box filter resolve will take longer with a custom shader, and a wider filter kernel means increased sample access costs.

默认情况下,使用箱式过滤器解析MSAA。2007年,ATI推出了自定义过滤器抗锯齿(CFAA),能够使用窄和宽帐篷过滤器,稍微延伸到其他像素单元。这种模式已经被EQAA支持所取代。在现代GPU上,像素或计算着色器可以访问MSAA样本,并使用任何所需的重建过滤器,包括从周围像素的样本中采样的过滤器。较宽的滤镜可以减少锯齿,但会损失清晰的细节。Pettineo发现,滤波器宽度为2或3个像素的三次smoothstep和B样条滤波器整体效果最好。还有一个性能成本,因为即使使用自定义着色器模拟默认的长方体过滤器解析也将花费更长的时间,并且更宽的过滤器内核意味着增加的采样访问成本。

NVIDIA’s built-in TXAA support similarly uses a better reconstruction filter over a wider area than a single pixel to give a better result. It and the newer MFAA (multiframe antialiasing) scheme both also use temporal antialiasing (TAA), a general class of techniques that use results from previous frames to improve the image. In part such techniques are made possible due to functionality that lets the programmer set the MSAA sampling pattern per frame. Such techniques can attack aliasing problems such as the spinning wagon wheel and can also improve edge rendering quality.

NVIDIA的内置TXAA支持类似地在比单个像素更宽的区域上使用更好的重建过滤器,以提供更好的结果。它和较新的MFAA(多帧抗锯齿)方案都使用时间抗锯齿(TAA),这是一种使用先前帧的结果来改善图像的通用技术。在某种程度上,这些技术之所以成为可能,是因为允许程序员设置每帧的MSAA采样模式的功能。这种技术可以解决走样问题,如旋转的车轮,还可以提高边缘渲染质量。

Imagine performing a sampling pattern “manually” by generating a series of images where each render uses a different location within the pixel for where the sample is taken. This offsetting is done by appending a tiny translation on to the projection matrix. The more images that are generated and averaged together, the better the result. This concept of using multiple offset images is used in temporal antialiasing algorithms. A single image is generated, possibly with MSAA or another method, and the previous images are blended in. Usually just two to four frames are used. Older images may be given exponentially less weight, though this can have the effect of the frame shimmering if the viewer and scene do not move, so often equal weighting of just the last and current frame is done. With each frame’s samples in a different subpixel location, the weighted sum of these samples gives a better coverage estimate of the edge than a single frame does. So, a system using the latest two frames averaged together can give a better result. No additional samples are needed for each frame, which is what makes this type of approach so appealing. It is even possible to use temporal sampling to allow generation of a lower-resolution image that is upscaled to the display’s resolution. In addition, illumination methods or other techniques that require many samples for a good result can instead use fewer samples each frame, since the results will be blended over several frames.

想象一下,通过生成一系列图像来“手动”执行采样模式,其中每个渲染使用像素内的不同位置作为采样位置。这种偏移是通过在投影矩阵上附加微小的平移来完成的。生成和平均的图像越多,结果越好。使用多个偏移图像的概念用于时间抗锯齿算法。可能使用MSAA或另一种方法生成单个图像,并融合先前的图像。通常只使用两到四帧。较旧的图像可能被赋予指数级的较低权重,尽管如果观看者和场景不移动,这可能具有帧闪烁的效果,因此通常仅对最后一帧和当前帧进行相等的加权。由于每个帧的样本在不同的子像素位置,这些样本的加权和给出了比单个帧更好的边缘覆盖估计。因此,使用一起平均的最新两帧的系统可以给出更好的结果。每一帧都不需要额外的样本,这就是这种方法如此吸引人的原因。甚至有可能使用时间采样来产生较低分辨率的图像,该图像被放大到显示器的分辨率。此外,照明方法或其他需要许多样本以获得良好结果的技术可以改为每帧使用较少的样本,因为结果将在几帧上混合。
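
The difference between the two weighting schemes can be illustrated with a single edge pixel in a static scene. The setup is a toy assumption: the subpixel offset alternates each frame so that the pixel's one sample is covered in even frames and uncovered in odd frames, and the function names are hypothetical.

```python
def two_frame_average(frames):
    """Equal weighting of the previous and current frame."""
    return [(prev + cur) / 2.0 for prev, cur in zip(frames, frames[1:])]


def exponential_history(frames, alpha):
    """Running blend in which older frames get exponentially less weight."""
    out, history = [], frames[0]
    for cur in frames[1:]:
        history = (1.0 - alpha) * history + alpha * cur
        out.append(history)
    return out


# Static scene: the alternating offset makes the sample hit, miss, hit, miss...
edge_pixel = [1.0, 0.0] * 8

steady = two_frame_average(edge_pixel)
wobble = exponential_history(edge_pixel, alpha=0.5)
```

With equal weighting every displayed value is the same, so the static edge is stable. With the exponential blend the displayed value keeps oscillating from frame to frame, which is the shimmer described above.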

While providing antialiasing for static scenes at no additional sampling cost, this type of algorithm has a few problems when used for temporal antialiasing. If the frames are not weighted equally, objects in a static scene can exhibit a shimmer. Rapidly moving objects or quick camera moves can cause ghosting, i.e., trails left behind the object due to the contributions of previous frames. One solution to ghosting is to perform such antialiasing on only slow-moving objects [1110]. Another important approach is to use reprojection (Section 12.2) to better correlate the previous and current frames’ objects. In such schemes, objects generate motion vectors that are stored in a separate “velocity buffer” (Section 12.5). These vectors are used to correlate the previous frame with the current one, i.e., the vector is subtracted from the current pixel location to find the previous frame’s color pixel for that object’s surface location. Samples unlikely to be part of the surface in the current frame are discarded. Because no extra samples, and so relatively little extra work, are needed for temporal antialiasing, there has been a strong interest in and wider adoption of this type of algorithm in recent years. Some of this attention has been because deferred shading techniques (Section 20.1) are not compatible with MSAA and other multisampling support. Approaches vary and, depending on the application’s content and goals, a range of techniques for avoiding artifacts and improving quality have been developed. Wihlidal’s presentation, for example, shows how EQAA, temporal antialiasing, and various filtering techniques applied to a checkerboard sampling pattern can combine to maintain quality while lowering the number of pixel shader invocations. Iglesias-Guitian et al. summarize previous work and present their scheme to use pixel history and prediction to minimize filtering artifacts. Patney et al. extend TAA work by Karis and Lottes on the Unreal Engine 4 implementation for use in virtual reality applications, adding variable-sized sampling along with compensation for eye movement (Section 21.3.2).

虽然在不增加采样成本的情况下为静态场景提供了抗锯齿,但这种类型的算法在用于时间抗锯齿时存在一些问题。如果帧的权重不相等,静态场景中的对象可能会出现微光。快速移动的对象或快速的相机移动会导致重影,即由于先前帧的影响而在对象后面留下的痕迹。重影的一个解决方案是仅对缓慢移动的物体执行这种抗锯齿[1110]。另一个重要的方法是使用重新投影(第12.2节)来更好地关联前一帧和当前帧的对象。在这种方案中,物体产生的运动矢量存储在一个单独的“速度缓冲器”中(第12.5节)。这些向量用于将前一帧与当前帧相关联,即,从当前像素位置减去向量,以找到该对象表面位置的前一帧的颜色像素。不太可能成为当前帧中表面一部分的样本将被丢弃。因为时间抗锯齿不需要额外的样本,所以相对来说额外的工作很少,所以近年来这种类型的算法引起了强烈的兴趣并被广泛采用。这种关注部分是因为延迟着色技术(20.1节)与MSAA和其他多采样支持不兼容。方法多种多样,根据应用程序的内容和目标,已经开发了一系列避免伪像和提高质量的技术。例如,Wihlidal的演示显示了EQAA、时间抗锯齿和应用于棋盘采样模式的各种过滤技术如何结合在一起,以保持质量,同时减少像素着色器调用的数量。Iglesias-Guitian等人总结了以前的工作,并提出了使用像素历史和预测来最小化滤波伪像的方案。Patney等人扩展了Karis和Lottes在虚幻引擎4实现上的TAA工作,用于虚拟现实应用,增加了可变大小的采样以及眼球运动补偿(第21.3.2节)。
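
The velocity-buffer lookup can be sketched as below. This is a simplified illustration: motion vectors are assumed to be whole pixels, the buffers are plain nested lists, and `reproject` is a hypothetical helper. A practical implementation would filter the history at fractional positions and apply further rejection tests.

```python
def reproject(history, velocity, x, y):
    """Fetch the previous frame's color for the surface now at pixel (x, y).

    velocity[y][x] is the (dx, dy) screen-space motion, in pixels, of the
    surface visible at (x, y) since the previous frame.
    """
    dx, dy = velocity[y][x]
    px, py = x - dx, y - dy  # subtract the motion vector
    h, w = len(history), len(history[0])
    if not (0 <= px < w and 0 <= py < h):
        return None  # surface was off screen last frame: discard history
    return history[py][px]


# Previous frame: a bright object at column 1 of a 4-pixel-wide row.
history = [[0.0, 9.0, 0.0, 0.0]]

# Current frame: the object moved two pixels right and is now at column 3.
velocity = [[(2, 0), (0, 0), (0, 0), (2, 0)]]

found = reproject(history, velocity, 3, 0)
missing = reproject(history, velocity, 0, 0)
```

The lookup for column 3 lands on the object's old position at column 1, while the lookup for column 0 falls outside the previous frame and is rejected.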

Sampling Patterns 采样模式

Effective sampling patterns are a key element in reducing aliasing, temporal and otherwise. Naiman shows that humans are most disturbed by aliasing on near-horizontal and near-vertical edges. Edges with near 45 degrees slope are next most disturbing. Rotated grid supersampling (RGSS) uses a rotated square pattern to give more vertical and horizontal resolution within the pixel. Figure 5.25 shows an example of this pattern.

有效的采样模式是减少时间和其他方面锯齿的关键要素。Naiman表明,人类最受近水平和近垂直边缘上的锯齿干扰。斜率接近45度的边缘是第二大干扰。旋转栅格超级采样(RGSS)使用旋转正方形图案在像素内提供更高的垂直和水平分辨率。图5.25显示了这种模式的一个例子。

The RGSS pattern is a form of Latin hypercube or N-rooks sampling, in which n samples are placed in an n×n grid, with one sample per row and column. With RGSS, the four samples are each in a separate row and column of the 4 × 4 subpixel grid. Such patterns are particularly good for capturing nearly horizontal and vertical edges compared to a regular 2 × 2 sampling pattern, where such edges are likely to cover an even number of samples, so giving fewer effective levels.

RGSS模式是拉丁超立方体或N-rooks采样的一种形式,其中N个样本放置在n×n网格中,每行和每列一个样本。对于RGSS,四个样本分别位于4 × 4子像素网格的独立行和列中。与常规2 × 2采样模式相比,这种模式特别适合捕捉接近水平和垂直的边沿,常规2×2采样模式的边沿可能覆盖偶数个样本,因此有效等级较少。
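
The following sketch checks the N-rooks property and counts how many coverage levels a horizontal edge can produce. The particular cell layout in `RGSS_CELLS` is an assumption (one valid rotated arrangement on a 4 × 4 subpixel grid), and both helper functions are hypothetical.

```python
def is_n_rooks(cells, n):
    """True if n samples occupy n distinct rows and n distinct columns."""
    rows = {r for r, _ in cells}
    cols = {c for _, c in cells}
    return len(cells) == n and len(rows) == n and len(cols) == n


def horizontal_edge_levels(sample_ys, steps=64):
    """Distinct covered-sample counts as a horizontal edge sweeps the pixel."""
    levels = set()
    for k in range(steps + 1):
        edge_y = k / steps
        levels.add(sum(1 for y in sample_ys if y < edge_y))
    return sorted(levels)


# (row, col) cells in a 4x4 subpixel grid; assumed RGSS-style layout.
RGSS_CELLS = [(0, 1), (1, 3), (2, 0), (3, 2)]
rgss_ys = [(r + 0.5) / 4 for r, _ in RGSS_CELLS]

# Regular 2x2 grid: two samples share each row.
grid_ys = [0.25, 0.25, 0.75, 0.75]

rgss_levels = horizontal_edge_levels(rgss_ys)
grid_levels = horizontal_edge_levels(grid_ys)
```

Because the regular grid's samples are covered two at a time, a horizontal edge can produce only three levels there, while the rotated pattern produces five.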

N-rooks is a start at creating a good sampling pattern, but it is not sufficient. For example, the samples could all be placed along the diagonal of a subpixel grid and so give a poor result for edges that are nearly parallel to this diagonal. See Figure 5.27. For better sampling we want to avoid putting two samples near each other. We also want a uniform distribution, spreading samples evenly over the area. To form such patterns, stratified sampling techniques such as Latin hypercube sampling are combined with other methods such as jittering, Halton sequences, and Poisson disk sampling.

N-rooks是创建一个好的采样模式的开始,但还不够。例如,样本可以都沿着子像素网格的对角线放置,因此对于几乎平行于该对角线的边缘给出了差的结果。见图5.27。为了更好地取样,我们要避免把两个样本放在一起。我们还希望分布均匀,将样本均匀地分布在整个区域。为了形成这样的模式,分层抽样技术(如拉丁超立方体抽样)与其他方法(如抖动、哈尔顿序列和泊松圆盘抽样)相结合。

Figure 5.27. N-rooks sampling. On the left is a legal N-rooks pattern, but it performs poorly in capturing triangle edges that are diagonal along its line, as all sample locations will be either inside or outside the triangle as this triangle shifts. On the right is a pattern that will capture this and other edges more effectively. 

图5.27。N-rooks抽样。左边是一个合法的N-rooks模式,但它在捕捉沿其线成对角线的三角形边时表现不佳,因为随着三角形的移动,所有样本位置要么在三角形内部,要么在三角形外部。右边是一个模式,可以更有效地捕捉这个和其他边缘。

In practice GPU manufacturers usually hard-wire such sampling patterns into their hardware for multisampling antialiasing. Figure 5.28 shows some MSAA patterns used in practice. For temporal antialiasing, the coverage pattern is whatever the programmer wants, as the sample locations can be varied frame to frame. For example, Karis [862] finds that a basic Halton sequence works better than any MSAA pattern provided by the GPU. A Halton sequence generates samples in space that appear random but have low discrepancy, that is, they are well distributed over the space and none are clustered.

在实践中,GPU制造商通常将这种采样模式硬连线到他们的硬件中,用于多采样抗锯齿。图5.28显示了实际中使用的一些MSAA模式。对于时间抗锯齿,覆盖模式是程序员想要的,因为采样位置可以逐帧变化。例如,Karis [862]发现基本的Halton序列比GPU提供的任何MSAA模式都更好。Halton序列在空间中生成看似随机但差异很小的样本,也就是说,它们在空间中分布良好,没有一个是聚集的。
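
A Halton sequence is built from the radical inverse of the sample index in a different prime base per dimension. The sketch below uses bases 2 and 3, a common choice that is assumed here rather than taken from the text; the function names are illustrative.

```python
def radical_inverse(index, base):
    """Mirror the base-`base` digits of index around the radix point."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        result += (index % base) * fraction
        index //= base
        fraction /= base
    return result


def halton_2d(count, bases=(2, 3)):
    """First `count` points of a 2D Halton sequence in the unit square."""
    return [
        (radical_inverse(i, bases[0]), radical_inverse(i, bases[1]))
        for i in range(1, count + 1)
    ]


# Per-frame subpixel offsets for, say, an eight-frame jitter cycle.
jitter = halton_2d(8)
```

Each point could serve as the subpixel offset for one frame of temporal antialiasing, for example after subtracting 0.5 to center the offsets on the pixel.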

Figure 5.28. MSAA sampling patterns for AMD and NVIDIA graphics accelerators. The green square is the location of the shading sample, and the red squares are the positional samples computed and saved. From left to right: 2×, 4×, 6× (AMD), and 8× (NVIDIA) sampling. (Generated by the D3D FSAA Viewer.) 

图5.28。AMD和NVIDIA图形加速器的MSAA采样模式。绿色方块是着色样本的位置,红色方块是计算并保存的位置样本。从左到右:2倍、4倍、6倍(AMD)、8倍(NVIDIA)采样。(由D3D·FSAA观察器生成。)

While a subpixel grid pattern results in a better approximation of how each triangle covers a grid cell, it is not ideal. A scene can be made of objects that are arbitrarily small on the screen, meaning that no sampling rate can ever perfectly capture them. If these tiny objects or features form a pattern, sampling at constant intervals can result in Moiré fringes and other interference patterns. The grid pattern used in supersampling is particularly likely to alias.

虽然子像素网格图案可以更好地近似每个三角形如何覆盖一个网格单元,但它并不理想。一个场景可以由屏幕上任意小的物体组成,这意味着任何采样率都无法完美地捕捉它们。如果这些微小的物体或特征形成一个图案,那么以恒定的间隔进行采样会产生莫尔条纹和其他干涉图案。超级采样中使用的网格模式特别容易混叠。

One solution is to use stochastic sampling, which gives a more randomized pattern. Patterns such as those in Figure 5.28 certainly qualify. Imagine a fine-toothed comb at a distance, with a few teeth covering each pixel. A regular pattern can give severe artifacts as the sampling pattern goes in and out of phase with the tooth frequency. Having a less ordered sampling pattern can break up these patterns. The randomization tends to replace repetitive aliasing effects with noise, to which the human visual system is much more forgiving [1413]. A pattern with less structure helps, but it can still exhibit aliasing when repeated pixel to pixel. One solution is to use a different sampling pattern at each pixel, or to change each sampling location over time. Interleaved sampling, where each pixel of a set has a different sampling pattern, has occasionally been supported in hardware over the past decades. For example, ATI’s SMOOTHVISION allowed up to 16 samples per pixel and up to 16 different user-defined sampling patterns that could be intermingled in a repeating pattern (e.g., in a 4 × 4 pixel tile). Molnar [1234], as well as Keller and Heidrich, found that using interleaved stochastic sampling minimizes the aliasing artifacts formed when using the same pattern for every pixel.

一个解决方案是使用随机抽样,它给出了一个更随机的模式。如图5.28所示的模式当然是合格的。想象一下,远处有一把细齿梳子,几个齿盖住了每个像素。当采样模式与齿频同相或异相时,常规模式会产生严重的伪像。有序度较低的采样模式会破坏这些模式。随机化倾向于用噪声代替重复的锯齿效应,人类视觉系统对此更加宽容[1413]。具有较少结构的图案有所帮助,但是当逐个像素地重复时,它仍然会表现出锯齿。一种解决方案是在每个像素使用不同采样模式,或者随时间改变每个采样位置。交错采样索引采样!在过去几十年中,硬件偶尔会支持隔行扫描,其中一组像素中的每个像素都有不同的采样模式。例如,ATI的SMOOTHVISION允许每像素多达16个样本和多达16种不同的用户定义的采样模式,这些模式可以混合在一个重复的模式中(例如,在一个4 × 4像素的拼贴中)。Molnar [1234]以及Keller和Heidrich发现,当对每个像素使用相同的模式时,使用交错随机采样可以最小化形成的锯齿伪影。

A few other GPU-supported algorithms are worth noting. One real-time antialiasing scheme that lets samples affect more than one pixel is NVIDIA’s older Quincunx method. “Quincunx” means an arrangement of five objects, four in a square and the fifth in the center, such as the pattern of five dots on a six-sided die. Quincunx multisampling antialiasing uses this pattern, putting the four outer samples at the corners of the pixel. See Figure 5.25. Each corner sample value is distributed to its four neighboring pixels. Instead of weighting each sample equally (as most other real-time schemes do), the center sample is given a weight of 1/2, and each corner sample has a weight of 1/8. Because of this sharing, an average of only two samples are needed per pixel, and the results are considerably better than two-sample FSAA methods. This pattern approximates a two-dimensional tent filter, which, as discussed in the previous section, is superior to the box filter.

其他一些GPU支持的算法值得注意。一种让样本影响多个像素的实时抗锯齿方案是NVIDIA的旧五点形方法。“五点形”是指五个物体的排列,四个在正方形中,第五个在中间,例如六面骰子上的五个点的图案。梅花形多采样抗锯齿使用此模式,将四个外部样本放在像素的角上。参见图5.25。每个拐角样本值被分配给它的四个相邻像素。不同于对每个样本进行平均加权(大多数其他实时方案都是这样),中心样本的权重为1/2,每个角落样本的权重为1/8。由于这种共享,每个像素平均只需要两个样本,结果比双样本FSAA方法好得多。这种模式近似于二维帐篷过滤器,如前一节所述,它优于箱式过滤器。
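
The Quincunx weighting can be sketched with two sample grids, one for pixel centers and one for the shared corners. The array layout and the name `quincunx_resolve` are assumptions for illustration only.

```python
def quincunx_resolve(centers, corners, x, y):
    """Resolve pixel (x, y) with weights 1/2 (center) and 1/8 (each corner).

    centers[y][x]: sample at the center of pixel (x, y).
    corners[j][i]: sample at grid corner (i, j); pixel (x, y) touches
    corners (x, y), (x + 1, y), (x, y + 1), and (x + 1, y + 1).
    """
    corner_sum = (corners[y][x] + corners[y][x + 1]
                  + corners[y + 1][x] + corners[y + 1][x + 1])
    return 0.5 * centers[y][x] + 0.125 * corner_sum


# A 2x1 pixel image: 2 center samples and a 3x2 grid of shared corner samples.
centers = [[1.0, 0.0]]
corners = [
    [1.0, 1.0, 0.0],
    [1.0, 0.0, 0.0],
]

left = quincunx_resolve(centers, corners, 0, 0)
right = quincunx_resolve(centers, corners, 1, 0)
```

The two middle corner samples contribute to both pixels. For a w × h image the scheme stores w·h centers and (w + 1)·(h + 1) corners, which approaches two samples per pixel as the image grows.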

Quincunx sampling can also be applied to temporal antialiasing by using a single sample per pixel. Each frame is offset half a pixel in each axis from the frame before, with the offset direction alternating between frames. The previous frame provides the pixel corner samples, and bilinear interpolation is used to rapidly compute the contribution per pixel. The result is averaged with the current frame. Equal weighting of each frame means there are no shimmer artifacts for a static view. The issue of aligning moving objects is still present, but the scheme itself is simple to code and gives a much better look while using only one sample per pixel per frame.

通过对每个像素使用单个样本,梅花形采样也可以应用于时间抗锯齿。每个帧在每个轴上从之前的帧偏移半个像素,偏移方向在帧之间交替。前一帧提供像素角样本,双线性插值用于快速计算每个像素的贡献。将结果与当前帧进行平均。每帧的权重相等意味着静态视图中没有闪烁伪像。对齐移动对象的问题仍然存在,但该方案本身编码简单,并且在每帧每像素仅使用一个样本时,外观更好。

When used in a single frame, Quincunx has a low cost of only two samples by sharing samples at the pixel boundaries. The RGSS pattern is better at capturing more gradations of nearly horizontal and vertical edges. First developed for mobile graphics, the FLIPQUAD pattern combines both of these desirable features. Its advantages are that the cost is only two samples per pixel, and the quality is similar to RGSS (which costs four samples per pixel). This sampling pattern is shown in Figure 5.29. Other inexpensive sampling patterns that exploit sample sharing are explored by Hasselgren et al.

当在单个帧中使用时,通过在像素边界共享样本,梅花形具有仅两个样本的低成本。RGSS模式更擅长捕捉接近水平和垂直边缘的更多层次。FLIPQUAD模式最初是为移动图形开发的,它结合了这两种理想的特性。其优点是成本仅为每像素两个样本,质量类似于RGSS(每像素四个样本)。这种采样模式如图5.29所示。Hasselgren等人探索了利用样本共享的其他廉价采样模式。

Figure 5.29. To the left, the RGSS sampling pattern is shown. This costs four samples per pixel. By moving these locations out to the pixel edges, sample sharing can occur across edges. However, for this to work out, every other pixel must have a reflected sample pattern, as shown on the right. The resulting sample pattern is called FLIPQUAD and costs two samples per pixel.

图5.29。左侧显示了RGSS采样模式。这需要每像素四个样本。通过将这些位置移出到像素边缘,可以跨边缘进行样本共享。然而,为了解决这个问题,每隔一个像素必须有一个反射样本模式,如右图所示。产生的样本模式称为FLIPQUAD,每像素两个样本。

Like Quincunx, the two-sample FLIPQUAD pattern can also be used with temporal antialiasing and spread over two frames. Drobot tackles the question of which two-sample pattern is best in his hybrid reconstruction antialiasing (HRAA) work. He explores different sampling patterns for temporal antialiasing, finding the FLIPQUAD pattern to be the best of the five tested. A checkerboard pattern has also seen use with temporal antialiasing. El Mansouri [415] discusses using two-sample MSAA to create a checkerboard render to reduce shader costs while addressing aliasing issues. Jimenez uses SMAA, temporal antialiasing, and a variety of other techniques to provide a solution where antialiasing quality can be changed in response to rendering engine load. Carpentier and Ishiyama sample on edges, rotating the sampling grid by 45°. They combine this temporal antialiasing scheme with FXAA (discussed later) to efficiently render on higher-resolution displays.

像五点形一样,双样本FLIPQUAD模式也可以与时间抗锯齿一起使用,并扩展到两帧。Drobot在他的混合重建抗锯齿(HRAA)工作中解决了两个样本模式哪个最好的问题。他探索了时间抗锯齿的不同采样模式,发现FLIPQUAD模式是五种测试模式中最好的。棋盘模式也可以用于时间抗锯齿。埃尔曼苏里[415]讨论了使用双采样MSAA创建棋盘渲染,以降低着色器成本,同时解决混叠问题。Jimenez使用SMAA、时间抗锯齿和各种其他技术来提供一种解决方案,其中抗锯齿质量可以根据渲染引擎负载而改变。Carpentier和Ishiyama在边缘上采样,将采样网格旋转45度。他们将这种时间抗锯齿方案与FXAA(稍后讨论)相结合,以在更高分辨率的显示器上有效地进行渲染。

Morphological Methods 形态学方法

Aliasing often results from edges, such as those formed by geometry, sharp shadows, or bright highlights. The knowledge that aliasing has a structure associated with it can be exploited to give a better antialiased result. In 2009 Reshetov presented an algorithm along these lines, calling it morphological antialiasing (MLAA). “Morphological” means “relating to structure or shape.” Earlier work had been done in this area, as far back as 1983 by Bloomenthal. Reshetov’s paper reinvigorated research into alternatives to multisampling approaches, emphasizing searching for and reconstructing edges.

锯齿通常是由边缘造成的,例如由几何图形、尖锐阴影或明亮高光形成的边缘。可以利用锯齿具有与之相关的结构这一知识来给出更好的抗锯齿结果。在2009年,Reshetov提出了一个沿着这些路线的算法,称之为形态学反走样(MLAA)。“形态学”的意思是“与结构或形状有关”早在1983年,Bloomenthal就在这方面做了工作。Reshetov的论文重振了对多采样方法替代方案的研究,强调搜索和重建边缘。

This form of antialiasing is performed as a post-process. That is, rendering is done in the usual fashion, then the results are fed to a process that generates the antialiased result. A wide range of techniques have been developed since 2009. Those that rely on additional buffers such as depths and normals can provide better results, such as subpixel reconstruction antialiasing (SRAA), but are then applicable for antialiasing only geometric edges. Analytical approaches, such as geometry buffer antialiasing (GBAA) and distance-to-edge antialiasing (DEAA), have the renderer compute additional information about where triangle edges are located, e.g., how far the edge is from the center of the pixel.

这种形式的抗锯齿是作为后处理来执行的。也就是说,渲染以通常的方式完成,然后将结果提供给生成抗锯齿结果的过程。自2009年以来,已经开发了广泛的技术。那些依赖于额外的缓冲区(如深度和法线)的缓冲区可以提供更好的结果,如子像素重建抗锯齿(SRAA),但只适用于几何边缘的抗锯齿。分析方法,如几何缓冲抗锯齿(GBAA)和边距离抗锯齿(DEAA),让渲染器计算有关三角形边位置的附加信息,例如边距离像素中心有多远。

The most general schemes need only the color buffer, meaning they can also improve edges from shadows, highlights, or various previously applied post-processing techniques, such as silhouette edge rendering (Section 15.2.3). For example, directionally localized antialiasing (DLAA) is based on the observation that an edge which is nearly vertical should be blurred horizontally, and likewise nearly horizontal edges should be blurred vertically with their neighbors.

最通用的方案只需要颜色缓冲,这意味着它们还可以改善阴影、高光或各种先前应用的后处理技术的边缘,如轮廓边缘渲染(第15.2.3节)。例如,方向定位抗锯齿(DLAA)是基于这样的观察,即接近垂直的边缘应该在水平方向上模糊,类似地,接近水平的边缘应该在垂直方向上与其邻居模糊。
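The DLAA observation can be illustrated with a small sketch. This is not the published DLAA filter, only a simplified stand-in that assumes a grayscale image stored as a list of rows, a 3-tap box blur, and a hypothetical `threshold` parameter. It blurs horizontally where the horizontal change dominates (a near-vertical edge) and vertically otherwise.

```python
def directional_blur(img, threshold=0.1):
    """Blur each edge pixel across the edge direction (simplified, DLAA-like).

    img: list of rows of floats in [0, 1]. threshold is a hypothetical
    contrast cutoff below which a pixel is left untouched.
    """
    h, w = len(img), len(img[0])

    def px(y, x):
        # Clamp coordinates so border pixels reuse their nearest neighbor.
        return img[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]

    out = [row[:] for row in img]
    for y in range(h):
        for x in range(w):
            left, right = px(y, x - 1), px(y, x + 1)
            up, down = px(y - 1, x), px(y + 1, x)
            horiz_change = abs(left - right)  # large across a near-vertical edge
            vert_change = abs(up - down)      # large across a near-horizontal edge
            if max(horiz_change, vert_change) < threshold:
                continue  # flat region: nothing to antialias
            if horiz_change >= vert_change:
                # Near-vertical edge: blur horizontally.
                out[y][x] = (left + img[y][x] + right) / 3.0
            else:
                # Near-horizontal edge: blur vertically.
                out[y][x] = (up + img[y][x] + down) / 3.0
    return out


# A vertical step edge: the two columns next to the edge get softened.
step = [[0.0, 0.0, 1.0, 1.0] for _ in range(3)]
print(directional_blur(step)[1])
```

A production filter would use wider, weighted kernels and work on color; the sketch only shows why the blur direction is chosen perpendicular to the edge.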

More elaborate forms of edge detection attempt to find pixels likely to contain an edge at any angle and determine its coverage. The neighborhoods around potential edges are examined, with the goal of reconstructing as much as possible where the original edge was located. The edge's effect on the pixel can then be used to blend in neighboring pixels' colors. See Figure 5.30 for a conceptual view of the process.

更精细的边缘检测形式试图找到可能包含任何角度的边缘的像素,并确定其覆盖范围。检查潜在边缘周围的邻域,目标是尽可能地重建原始边缘所在的位置。边缘对像素的影响可用于混合相邻像素的颜色。请参见图5.30,了解该流程的概念性视图。

Figure 5.30. Morphological antialiasing. On the left is the aliased image. The goal is to determine the likely orientation of the edge that formed it. In the middle, the algorithm notes the likelihood of an edge by examining neighbors. Given the samples, two possible edge locations are shown. On the right, a best-guess edge is used to blend neighboring colors into the center pixel in proportion to the estimated coverage. This process is repeated for every pixel in the image. 

图5.30。形态学抗锯齿。左边是锯齿图像。目标是确定形成它的边缘的可能方向。在中间,该算法通过检查邻居来记录边缘的可能性。给定样本,显示了两个可能的边缘位置。在右侧,最佳猜测边缘用于将相邻颜色混合到中心像素中,与估计的覆盖范围成比例。对图像中的每个像素重复这个过程。
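The final blending step in the figure can be sketched as follows. The assumptions here are ours, not the text's: the reconstructed edge is a straight line crossing a unit-wide pixel, entering at height `h_left` and leaving at height `h_right`, so the covered area is a trapezoid. Both function names are hypothetical.

```python
def trapezoid_coverage(h_left, h_right):
    """Area below a straight edge crossing a unit pixel.

    h_left, h_right: edge height in [0, 1] at the pixel's left and right
    sides. The area of the resulting trapezoid is the mean of the two.
    """
    return 0.5 * (h_left + h_right)


def blend(center, neighbor, coverage):
    """Mix the neighbor's color into the center pixel by estimated coverage."""
    return tuple((1.0 - coverage) * c + coverage * n
                 for c, n in zip(center, neighbor))


white = (1.0, 1.0, 1.0)
red = (1.0, 0.0, 0.0)
coverage = trapezoid_coverage(0.0, 0.5)  # edge rises from 0 to half the pixel
print(blend(white, red, coverage))       # (1.0, 0.75, 0.75)
```

Finding the edge endpoints is the hard part of a real morphological method; this sketch assumes they are already known and shows only how coverage turns into a blend weight.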

Iourcha et al. improve edge-finding by examining the MSAA samples in pixels to compute a better result. Note that edge prediction and blending can give a higher-precision result than sample-based algorithms. For example, a technique that uses four samples per pixel can give only five levels of blending for an object's edge: no samples covered, one covered, two, three, and four. The estimated edge location can have more locations and so provide better results.

Iourcha等人通过检查像素中的MSAA样本来计算更好的结果,从而改进了边缘发现。注意,边缘预测和混合可以给出比基于样本的算法更高精度的结果。例如,每像素使用四个样本的技术只能为对象的边缘提供五个级别的混合:无样本覆盖、一个样本覆盖、两个、三个和四个。估计的边缘位置可以有更多的位置,从而提供更好的结果。
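The five-level limit is easy to check numerically. The sketch below assumes a vertical edge sweeping across one pixel, with everything to its left covered, and four samples at hypothetical x positions. The sampled coverage only ever takes five values, while an analytically estimated edge position gives a continuous one.

```python
# Hypothetical sample x positions inside a unit pixel (assumption for this sketch).
SAMPLE_X = (0.125, 0.375, 0.625, 0.875)


def sampled_coverage(edge_x):
    """Fraction of samples lying to the left of a vertical edge at edge_x."""
    covered = sum(1 for x in SAMPLE_X if x < edge_x)
    return covered / len(SAMPLE_X)


def analytic_coverage(edge_x):
    """Exact covered area for the same edge, clamped to the pixel."""
    return min(max(edge_x, 0.0), 1.0)


# Sweep the edge across the pixel and collect every blend level produced.
levels = sorted({sampled_coverage(i / 100) for i in range(101)})
print(levels)                                         # [0.0, 0.25, 0.5, 0.75, 1.0]
print(sampled_coverage(0.3), analytic_coverage(0.3))  # 0.25 0.3
```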

There are several ways image-based algorithms can go astray. First, the edge may not be detected if the color difference between two objects is lower than the algorithm's threshold. Pixels where there are three or more distinct surfaces overlapping are difficult to interpret. Surfaces with high-contrast or high-frequency elements, where the color is changing rapidly from pixel to pixel, can cause algorithms to miss edges. In particular, text quality usually suffers when morphological antialiasing is applied to it. Object corners can be a challenge, with some algorithms giving them a rounded appearance. Curved lines can also be adversely affected by the assumption that edges are straight. A single pixel change can cause a large shift in how the edge is reconstructed, which can create noticeable artifacts frame to frame. One approach to ameliorate this problem is to use MSAA coverage masks to improve edge determination.

基于图像的算法有几种可能误入歧途。首先,如果两个对象之间的色差低于算法的阈值,则可能检测不到边缘。有三个或更多不同表面重叠的像素很难解释。具有高对比度或高频元素的表面,其颜色从一个像素到另一个像素快速变化,可能导致算法错过边缘。特别是,当对文本应用形态抗锯齿时,文本质量通常会受到影响。对象角可能是一个挑战,一些算法给他们一个圆形的外观。假设边是直的,也会对曲线产生不利影响。单个像素的变化会导致边缘重建方式的巨大变化,从而在帧与帧之间产生明显的伪像。改善这个问题的一种方法是使用MSAA覆盖掩模来改善边缘确定。
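The first failure mode can be shown with a toy detector. We assume here that edges are found by thresholding a luma difference between adjacent pixels, which is one common choice and not the only one. The Rec. 709 weights, the `threshold` value, and the function names are illustrative.

```python
LUMA_WEIGHTS = (0.2126, 0.7152, 0.0722)  # Rec. 709 luma weights


def luma(rgb):
    """Weighted sum of the RGB channels."""
    return sum(w * c for w, c in zip(LUMA_WEIGHTS, rgb))


def is_edge(rgb_a, rgb_b, threshold=0.1):
    """Report an edge only if the luma difference exceeds the threshold."""
    return abs(luma(rgb_a) - luma(rgb_b)) > threshold


red = (1.0, 0.0, 0.0)
dark_green = (0.0, 0.3, 0.0)
black, white = (0.0, 0.0, 0.0), (1.0, 1.0, 1.0)

print(is_edge(black, white))     # True: strong contrast
print(is_edge(red, dark_green))  # False: clearly different hues, similar luma
```

The red and green surfaces are visibly distinct, yet the boundary between them would be left aliased by this detector because their luma values nearly match.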

Morphological antialiasing schemes use only the information that is provided. For example, an object thinner than a pixel in width, such as an electrical wire or rope, will have gaps on the screen wherever it does not happen to cover the center location of a pixel. Taking more samples can improve the quality in such situations; image-based antialiasing alone cannot. In addition, execution time can be variable depending on what content is viewed. For example, a view of a field of grass can take three times as long to antialias as a view of the sky.

形态学抗锯齿方案仅使用所提供的信息。例如,宽度比一个像素更细的物体,如电线或绳子,在屏幕上没有覆盖像素中心位置的地方会有间隙。在这种情况下,采集更多的样本可以提高质量;仅仅基于图像的抗锯齿是不行的。此外,根据查看的内容,执行时间可能会有所不同。例如,草地视图的抗锯齿时间可能是天空视图的三倍。
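The thin-wire case can be reproduced in one dimension. This sketch assumes a single row of pixels, a wire occupying the interval from `wire_start` to `wire_start + wire_width` in pixel units, and hypothetical sample offsets. With one center sample the wire can vanish entirely, leaving a post-process nothing to work with, while four samples still register partial coverage.

```python
ONE_SAMPLE = (0.5,)                          # pixel center only
FOUR_SAMPLES = (0.125, 0.375, 0.625, 0.875)  # hypothetical 4-sample offsets


def row_coverage(wire_start, wire_width, n_pixels, offsets):
    """Per-pixel fraction of samples that land inside the wire's extent."""
    wire_end = wire_start + wire_width
    coverage = []
    for px in range(n_pixels):
        hits = sum(1 for o in offsets if wire_start <= px + o <= wire_end)
        coverage.append(hits / len(offsets))
    return coverage


# A wire 0.4 pixels wide, sitting between the centers of pixels 2 and 3.
print(row_coverage(2.6, 0.4, 5, ONE_SAMPLE))    # all zeros: the wire disappears
print(row_coverage(2.6, 0.4, 5, FOUR_SAMPLES))  # pixel 2 is half covered
```

In this setup the four samples are spaced 0.25 pixels apart, less than the wire width, so at least one always hits. A thinner wire would need still more samples.
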
All this said, image-based methods can provide antialiasing support for modest memory and processing costs, so they are used in many applications. The color-only versions are also decoupled from the rendering pipeline, making them easy to modify or disable, and can even be exposed as GPU driver options. The two most popular algorithms are fast approximate antialiasing (FXAA) and subpixel morphological antialiasing (SMAA), in part because both provide solid (and free) source code implementations for a variety of machines. Both algorithms use color-only input, with SMAA having the advantage of being able to access MSAA samples. Each has its own variety of settings available, trading off between speed and quality. Costs are generally in the range of 1 to 2 milliseconds per frame, mainly because that is what video games are willing to spend. Finally, both algorithms can also take advantage of temporal antialiasing. Jimenez presents an improved SMAA implementation, faster than FXAA, and describes a temporal antialiasing scheme. To conclude, we refer the reader to the wide-ranging review by Reshetov and Jimenez of morphological techniques and their use in video games.

综上所述,基于图像的方法可以以适中的内存和处理成本提供抗锯齿支持,因此被用于许多应用中。彩色版本也与渲染管道分离,使它们易于修改或禁用,甚至可以作为GPU驱动程序选项公开。两种最流行的算法是快速近似抗锯齿(FXAA)和亚像素形态抗锯齿(SMAA),部分原因是这两种算法都为各种机器提供了可靠(和免费)的源代码实现。这两种算法都使用彩色输入,SMAA的优势是能够访问MSAA样本。每一种都有自己的各种设置,在速度和质量之间进行权衡。成本通常在每帧1到2毫秒的范围内,主要是因为这是视频游戏愿意花费的。最后,这两种算法还可以利用时间抗锯齿。希门尼斯提出了一个改进的SMAA实现,比FXAA更快,并描述了一个时间抗锯齿方案。总之,我们向读者推荐Reshetov和Jimenez对形态学技术及其在电子游戏中的应用的广泛综述。
