OpenGL SuperBible, 5th Edition, Chapter 10: Fragment Operations: The End of the Pipeline

After reading through the first nine chapters, you should be well-versed in using vertex and fragment shaders to generate output based on your geometry. But what happens when your fragment shader is finished? Where do all the fragments go? It just so happens there are a few more steps these fragments must make before they can retire to a final resting place in a buffer or window.

This chapter walks through the last steps in the OpenGL pipeline, the per-fragment operations. We start with the scissor test, which is the first stop along the way, and follow a virtual fragment through multisample operations, stencil testing, depth buffer testing, blending, dithering, and logic ops. Figure 10.1 shows the path a fragment follows when all states are enabled.

[Figure 10.1: the path a fragment follows through the per-fragment operations]

Scissoring—Cutting Your Geometry Down To Size

The first step in sending those fragments to their final resting place is to decide whether they lie in a region that has been cut out of the renderable area. Scissoring is performed on window coordinates. That means all incoming fragments have a window coordinate between (0, 0) and (width, height), where width and height are the window dimensions. Applications can define a scissor rectangle that cuts off portions of geometry. The rectangle is described by a minimum and maximum x value as well as a minimum and maximum y value. To set a scissor region, call glScissor:

void glScissor(GLint left, GLint bottom, GLsizei width, GLsizei height);

Scissoring must also be enabled by calling glEnable(GL_SCISSOR_TEST) for the scissor test to work. If the window coordinates of the fragment fall within the region defined by the scissor, the fragment will continue on down the pipeline. Otherwise it will be discarded. Another way to express this operation is through two inequalities using the values you passed into glScissor. If left <= xw < (left + width) and bottom <= yw < (bottom + height), then the test passes. You learned about the scissor test back in Chapter 3, “Basic Rendering.” Take a look back there for a refresher if scissor operations are a bit hazy.
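The two inequalities above can be sketched on the CPU as a small C function. This is only an illustration of the pass/fail rule, not how a driver implements it; the ScissorRect type and the scissor_test name are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical helper type: mirrors the four values passed to glScissor. */
typedef struct {
    int left, bottom;   /* lower-left corner in window coordinates */
    int width, height;  /* size of the scissor rectangle */
} ScissorRect;

/* Returns true when the fragment at window position (xw, yw) passes:
 * left <= xw < left + width  and  bottom <= yw < bottom + height. */
bool scissor_test(ScissorRect r, int xw, int yw)
{
    return xw >= r.left   && xw < r.left + r.width &&
           yw >= r.bottom && yw < r.bottom + r.height;
}
```

Note that the upper bounds are exclusive, so a rectangle of width 200 starting at x = 100 keeps x = 299 and discards x = 300.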

Multisampling
After scissoring is performed, the next stop in the pixel pipeline is multisampling. You had your first taste of multisampling back in Chapter 9, “Advanced Buffers: Beyond the Basics.” Now let’s look into how you can control the specifics of multisampling.

Remember that the multisampling stage generates multiple subsamples for any given pixel. This can be particularly helpful when a pixel happens to fall near the edge of a line or polygon. The number of samples a buffer has is determined at allocation time. For window surfaces, you have to specify the sample count when you choose a pixel format or config. For framebuffers, you can select your sample count when creating the storage of the textures and renderbuffers bound to the framebuffer. Note that the sample count for all attachments of a framebuffer should be the same. Back in Chapter 9 you also learned how to get each subpixel sample location within the pixel by calling glGetMultisamplefv. Now let’s look at how you can control those subpixels. There are two stages you can control that affect how multisampling is handled: modifying coverage values and masking off samples.

Sample Coverage
Coverage refers to how much area a subpixel “covers.” You can convert the alpha value of a fragment directly to a coverage value to determine how many samples of the framebuffer will be updated by the fragment. To do this, call glEnable(GL_SAMPLE_ALPHA_TO_COVERAGE). The coverage value for a fragment is then used to determine how many subsamples will be written. For instance, a fragment with an alpha of 0.4 would generate a coverage value of 40%. For an 8-sample MSAA buffer, three of that pixel’s samples would be written to.
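The arithmetic behind the "0.4 alpha, 8 samples, three samples written" example can be sketched as follows. The exact mapping from alpha to a sample count is up to the implementation; rounding to the nearest whole sample is an assumption made here for illustration, and alpha_to_sample_count is a hypothetical name.

```c
/* Assumed mapping (not mandated by OpenGL): round alpha * sampleCount
 * to the nearest whole number of samples. */
int alpha_to_sample_count(float alpha, int sampleCount)
{
    /* Clamp alpha to the valid [0, 1] range first. */
    if (alpha < 0.0f) alpha = 0.0f;
    if (alpha > 1.0f) alpha = 1.0f;
    return (int)(alpha * (float)sampleCount + 0.5f);
}
```

With an alpha of 0.4 and 8 samples, 0.4 * 8 = 3.2, which rounds to three samples.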

Because the alpha value was already used to decide how many subsamples should be written, it would not make sense to then blend those subsamples with the same alpha value. After all, using alpha-to-coverage is a way of doing blending. To help prevent these subpixels from also being blended when blending is enabled, you can force the alpha values for those samples to 1 by calling glEnable(GL_SAMPLE_ALPHA_TO_ONE).

Using alpha-to-coverage has several advantages over simple blending. When rendering to a multisampled buffer, the alpha blend would normally be applied equally to the entire pixel. With alpha-to-coverage, alpha-masked edges are antialiased, producing a much more natural and smooth result. This is particularly useful when drawing bushes, trees, or dense foliage where parts of the brush are alpha transparent.

OpenGL also allows you to set the sample coverage manually by calling glSampleCoverage. Manually applying a coverage value for a pixel occurs after the mask for alpha-to-coverage is applied. For this step to take effect, sample coverage must be enabled by calling glEnable(GL_SAMPLE_COVERAGE).

void glSampleCoverage(GLclampf value, GLboolean invert);

The coverage value passed into the value parameter can be between 0 and 1. The invert parameter signals to OpenGL whether the resulting mask should be inverted. For instance, if you were drawing two overlapping trees, one with a coverage of 60% and the other with 40%, you would want to invert one of the coverage values to make sure the same mask was not used for both draw calls.

glSampleCoverage(0.5, GL_FALSE);
// Draw first geometry set
. . .
glSampleCoverage(0.5, GL_TRUE);
// Draw second geometry set
. . .
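To see why inverting one of the two masks matters, here is a CPU sketch of how a coverage value could turn into a per-sample bitmask. The real sample pattern is chosen by the implementation; filling the lowest bits first is an assumption for illustration, and coverage_mask is a hypothetical name.

```c
#include <stdbool.h>
#include <stdint.h>

/* Builds an illustrative coverage mask for a pixel with sampleCount samples
 * (1..32). Assumption: the covered samples are the lowest-numbered ones. */
uint32_t coverage_mask(float value, bool invert, int sampleCount)
{
    uint32_t all = (sampleCount >= 32) ? 0xFFFFFFFFu
                                       : ((1u << sampleCount) - 1u);
    if (value < 0.0f) value = 0.0f;
    if (value > 1.0f) value = 1.0f;

    /* Number of samples to cover, rounded to nearest. */
    int covered = (int)(value * (float)sampleCount + 0.5f);
    uint32_t mask = (covered >= 32) ? 0xFFFFFFFFu
                                    : ((1u << covered) - 1u);

    /* Inverting flips every sample bit, giving the complementary set. */
    return invert ? (~mask & all) : mask;
}
```

For 8 samples and a value of 0.5, the non-inverted mask is 0x0F and the inverted mask is 0xF0, so the two draw calls in the snippet above would land on disjoint samples.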

Sample Mask
The last configurable option in the multisample stage is the sample mask. This step allows you to mask off specific samples using the glSampleMaski function. Unlike the earlier stages, you can specify exactly which samples you want to turn off. Remember that alpha-to-coverage and sample coverage affect which samples are enabled before we get to this stage. That means setting a sample mask bit to one in this stage does not guarantee that the sample will be enabled.

glSampleMaski(GLuint maskNumber, GLbitfield mask);

The mask parameter is essentially a 32-bit bitwise mask of the pixel samples with bit 0 mapping to sample 0, bit 1 mapping to sample 1, and so on. You can use the maskNumber to address bits beyond the first 32 bits with each incremental mask value representing another 32 bits. You can query GL_MAX_SAMPLE_MASK_WORDS to find out how many masks are supported. As of this writing, implementations only support one word, which makes sense considering no implementations support more than 32 samples per pixel.
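The bit addressing described above, and the fact that a set mask bit cannot re-enable a sample that earlier stages already dropped, can be sketched like this. The function names are hypothetical; only the word/bit arithmetic comes from the text.

```c
#include <stdbool.h>
#include <stdint.h>

/* Which 32-bit mask word (the maskNumber argument) holds a given sample. */
unsigned sample_mask_word(unsigned sample)
{
    return sample / 32u;
}

/* The single-bit value for that sample within its word. */
uint32_t sample_mask_bit(unsigned sample)
{
    return 1u << (sample % 32u);
}

/* A sample is written only if the earlier coverage stages left it enabled
 * AND its sample mask bit is set (both arguments cover the same word). */
bool sample_enabled(uint32_t coverage, uint32_t sampleMask, unsigned sample)
{
    return ((coverage & sampleMask) & sample_mask_bit(sample)) != 0u;
}
```

For example, sample 2 maps to word 0 and bit value 0x04, the value used for the second glass object in the oit sample.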

There is another way to modify the sample mask. You can write to the built-in output gl_SampleMask[] array in a fragment shader to set the mask inside your shaders.

Putting It All Together
The sample program for this chapter, called oit, draws several semitransparent objects shaped like a tinted glass wind chime. When several semitransparent surfaces are drawn in OpenGL, simply blending them together produces the wrong result. Think about what happens if you draw an object with alpha of 0.5 and then try to draw another object behind it, also with an alpha of 0.5. If you leave depth testing enabled, the back object is simply discarded as a result of failing the depth test. If depth testing is disabled, the back object just draws over the front one and looks as if it is in front. We dig into blending in more detail later in this chapter.

To overcome this blending shortcoming, we need to use order independent transparency, or OIT. Most algorithms for correctly rendering transparent geometry involve sorting the objects being rendered by depth and then rendering the farthest objects first. This can be very complex and time-consuming. Even worse, in many situations there is no correct sorting.

To deal with this, we store each rendering pass in a separate sample of a multisampled framebuffer using sample masks. After the scene is rendered, the resolve operation combines all samples for each pixel in the correct order. Let’s get started.

The first step is to draw all of the geometry to a multisampled framebuffer. Part of the geometry is drawn in Listing 10.1. All nontransparent objects are masked to sample 0. Each semitransparent object is rendered to a unique sample using the sample mask.

LISTING 10.1 Setting Up Sample Mask State

glSampleMaski(0, 0x01);
glEnable(GL_SAMPLE_MASK);
. . .
glBindTexture(GL_TEXTURE_2D, textures[1]);
shaderManager.UseStockShader(GLT_SHADER_TEXTURE_REPLACE,
transformPipeline.GetModelViewProjectionMatrix(), 0);
bckgrndCylBatch.Draw();
. . .
modelViewMatrix.Translate(0.0f, 0.8f, 0.0f);
modelViewMatrix.PushMatrix();
modelViewMatrix.Translate(-0.3f, 0.f, 0.0f);
modelViewMatrix.Scale(0.40, 0.8, 0.40);
modelViewMatrix.Rotate(50.0, 0.0, 10.0, 0.0);
glSampleMaski(0, 0x02);
shaderManager.UseStockShader(GLT_SHADER_FLAT,
transformPipeline.GetModelViewProjectionMatrix(), vLtYellow);
glass1Batch.Draw();
modelViewMatrix.PopMatrix();
modelViewMatrix.PushMatrix();
modelViewMatrix.Translate(0.4f, 0.0f, 0.0f);
modelViewMatrix.Scale(0.5, 0.8, 1.0);
modelViewMatrix.Rotate(-20.0, 0.0, 1.0, 0.0);
glSampleMaski(0, 0x04);
shaderManager.UseStockShader(GLT_SHADER_FLAT,
transformPipeline.GetModelViewProjectionMatrix(), vLtGreen);
glass2Batch.Draw();
modelViewMatrix.PopMatrix();
. . .

Once all surfaces are drawn to unique sample locations, they have to be combined. But using an ordinary multisample resolve just will not do! Instead we use the custom resolve shader shown in Listing 10.2. The color and depth values for each sample are first fetched into an array and then analyzed to determine the color of the fragment.

LISTING 10.2 Resolving Multiple Layers by Depth

#version 150
// oitResolve.fs
//
in vec2 vTexCoord;
uniform sampler2DMS origImage;
uniform sampler2DMS origDepth;
out vec4 oColor;

void main(void)
{
    const int sampleCount = 8;
    vec4 vColor[sampleCount];
    float vDepth[sampleCount];
    int vSurfOrder[sampleCount];
    int i = 0;

    // Calculate un-normalized texture coordinates
    vec2 tmp = floor(textureSize(origDepth) * vTexCoord);

    // First, get sample data and init the surface order
    for (i = 0; i < sampleCount; i++)
    {
        vSurfOrder[i] = i;
        vColor[i] = texelFetch(origImage, ivec2(tmp), i);
        vDepth[i] = texelFetch(origDepth, ivec2(tmp), i).r;
    }

    // Sort sample indices by depth: smallest (closest) first,
    // largest (farthest) last. This is a bubble sort; it needs at
    // most sampleCount passes and exits early if a pass makes no swaps.
    for (int j = 0; j < sampleCount; j++)
    {
        bool bFinished = true;
        for (i = 0; i < (sampleCount-1); i++)
        {
            float temp1 = vDepth[vSurfOrder[i]];
            float temp2 = vDepth[vSurfOrder[i+1]];
            if (temp2 < temp1)
            {
                // swap values
                int tempIndex = vSurfOrder[i];
                vSurfOrder[i] = vSurfOrder[i+1];
                vSurfOrder[i+1] = tempIndex;
                bFinished = false;
            }
        }
        if (bFinished)
            j = sampleCount; // Done. Early out!
    }

    // Now, sum all colors in order from back (farthest) to front
    // (closest). Apply alpha.
    bool bFoundFirstColor = false;
    vec4 summedColor = vec4(0.0, 0.0, 0.0, 0.0);
    for (i = (sampleCount-1); i >= 0; i--)
    {
        int surfIndex = vSurfOrder[i];
        if (vColor[surfIndex].a > 0.001)
        {
            if (bFoundFirstColor == false)
            {
                // apply 100% of the first color
                summedColor = vColor[surfIndex];
                bFoundFirstColor = true;
            }
            else
            {
                // apply color with alpha, same as using
                // glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA);
                summedColor.rgb =
                    (summedColor.rgb * (1.0 - vColor[surfIndex].a)) +
                    (vColor[surfIndex].rgb * vColor[surfIndex].a);
            }
        }
    }
    oColor = summedColor;
    oColor.a = 1.0f;
}

For transparency to work correctly, the color of each piece of geometry must be applied from back to front. To do this, we need to figure out what geometry is overlapping other geometry. That means the depth values have to be parsed and sorted. We store the result of the sort, the sample indices ordered by depth, in the vSurfOrder array for use in the next step. This array holds indexes that point into the sample arrays. Index 0 points to the closest sample, index 1 to the next closest, and so on. A sample holds data only where an object was actually drawn over that pixel, so for locations where only one or two layers of geometry were drawn, all other samples contain 0 for color and alpha. Figure 10.2 shows the result of the sort. On the far left are all of the closest samples pointed to by vSurfOrder[0], second are the next closest samples as indicated by vSurfOrder[1], and so on. Notice that sample 0 contains mostly background because nothing is overlapping the background; therefore, the background is the closest, and only one sample in vSurfOrder is relevant. For this app, there are at most four overlapping pieces of geometry in any given region.

[Figure 10.2: the samples after sorting, closest samples on the far left]


Stencil Operations
The next step in the fragment pipeline is the stencil test. Think about the stencil test as cutting out a shape in cardboard and then using the cutout to spray-paint the shape on a mural. The spray paint only hits the wall in places where the cardboard is cut out. If you have a pixel format that supports a stencil buffer, you can similarly mask your draws to the framebuffer. You can enable stenciling by calling glEnable(GL_STENCIL_TEST). Most stencil buffers contain 8 bits, but some configurations may support fewer bits.

Your draw commands can have a direct effect on the stencil buffer, and the value of the stencil buffer can have a direct effect on the pixels you draw. To control interactions with the stencil buffer, OpenGL provides two commands: glStencilFuncSeparate and glStencilOpSeparate. OpenGL lets you set both of these separately for front- and back-facing geometry.

void glStencilFuncSeparate(GLenum face, GLenum func, GLint ref, GLuint mask);
void glStencilOpSeparate(GLenum face, GLenum sfail, GLenum dpfail, GLenum dppass);

First let’s look at glStencilFuncSeparate, which controls the conditions under which the stencil test passes or fails. You can pass GL_FRONT, GL_BACK, or GL_FRONT_AND_BACK for face, signifying which geometry will be affected. The value of func can be any of the values in Table 10.1. These describe under what conditions geometry will pass the stencil test. The ref value is the reference used to compute the pass/fail result, and the mask lets you control which bits of the reference and the buffer are compared.
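As a rough CPU model of the comparison, both the reference value and the stored stencil value are ANDed with the mask, and the comparison reads "(ref & mask) func (stencil & mask)". The enum and function names below are hypothetical stand-ins for the GL_NEVER through GL_NOTEQUAL constants, and an 8-bit stencil buffer is assumed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the OpenGL comparison enums. */
typedef enum {
    SF_NEVER, SF_ALWAYS, SF_LESS, SF_LEQUAL,
    SF_EQUAL, SF_GEQUAL, SF_GREATER, SF_NOTEQUAL
} StencilFunc;

/* Returns true if the stencil test passes for an 8-bit stencil value. */
bool stencil_test(StencilFunc func, uint8_t ref, uint8_t mask, uint8_t stored)
{
    uint8_t r = ref & mask;     /* masked reference value */
    uint8_t s = stored & mask;  /* masked value from the stencil buffer */
    switch (func) {
    case SF_NEVER:    return false;
    case SF_ALWAYS:   return true;
    case SF_LESS:     return r <  s;
    case SF_LEQUAL:   return r <= s;
    case SF_EQUAL:    return r == s;
    case SF_GEQUAL:   return r >= s;
    case SF_GREATER:  return r >  s;
    case SF_NOTEQUAL: return r != s;
    }
    return false;
}
```

The reference value is always on the left-hand side, which is easy to get backwards: a "less" test with ref 1 passes only where the stored value is greater than 1.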

[Table 10.1: stencil functions]

The next step is to tell OpenGL what to do when the stencil test passes or fails by using glStencilOpSeparate. This function takes four parameters, with the first specifying which faces will be affected. The next three parameters control what happens after the stencil test is performed and can be any of the values in Table 10.2. The second parameter, sfail, is the action taken if the stencil test fails. The dpfail parameter specifies the action taken if the stencil test passes but the depth buffer test fails, and the final parameter, dppass, specifies what happens if the depth buffer test passes as well.
[Table 10.2: stencil operations]
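The selection among sfail, dpfail, and dppass, and the effect of each action on an 8-bit stencil value, can be sketched in C as follows. The enum values are hypothetical stand-ins for the standard GL_KEEP, GL_ZERO, GL_REPLACE, GL_INCR, GL_DECR, GL_INVERT, GL_INCR_WRAP, and GL_DECR_WRAP operations; the function names are also made up for this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the OpenGL stencil operation enums. */
typedef enum {
    SO_KEEP, SO_ZERO, SO_REPLACE, SO_INCR,
    SO_DECR, SO_INVERT, SO_INCR_WRAP, SO_DECR_WRAP
} StencilOp;

/* Picks which of the three configured actions applies to a fragment. */
StencilOp select_stencil_op(bool stencilPassed, bool depthPassed,
                            StencilOp sfail, StencilOp dpfail, StencilOp dppass)
{
    if (!stencilPassed) return sfail;
    return depthPassed ? dppass : dpfail;
}

/* Applies one action to an 8-bit stencil value. */
uint8_t apply_stencil_op(StencilOp op, uint8_t stored, uint8_t ref)
{
    switch (op) {
    case SO_KEEP:      return stored;
    case SO_ZERO:      return 0;
    case SO_REPLACE:   return ref;
    case SO_INCR:      return stored == 0xFF ? 0xFF : (uint8_t)(stored + 1); /* clamps */
    case SO_DECR:      return stored == 0x00 ? 0x00 : (uint8_t)(stored - 1); /* clamps */
    case SO_INVERT:    return (uint8_t)~stored;
    case SO_INCR_WRAP: return (uint8_t)(stored + 1);  /* 255 wraps to 0 */
    case SO_DECR_WRAP: return (uint8_t)(stored - 1);  /* 0 wraps to 255 */
    }
    return stored;
}
```

With sfail = keep, dpfail = zero, and dppass = replace, a fragment that passes both tests writes the reference value into the stencil buffer.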

So how does this actually work out? Let’s look at a simple example of typical usage shown in Listing 10.3. The first step is to clear the stencil buffer to 0 by setting the stencil clear value through glClearStencil and then calling glClear with the stencil buffer bit. Next a window border is drawn that may contain details such as a player’s score and statistics. Set up the stencil test to always pass with the reference value being 1 by calling glStencilFuncSeparate. Then tell OpenGL to replace the value in the stencil buffer only when the depth test passes by calling glStencilOpSeparate, followed by rendering the border geometry. This turns the border area pixels to 1 while the rest of the framebuffer remains at 0.

Next, set up the stencil state so that the stencil test will only pass if the stencil buffer value is 0 and then render the rest of the scene. This causes all pixels that would overwrite the border we just drew to fail the stencil test and not be drawn to the framebuffer. Listing 10.3 shows an example of how stencil can be used.

LISTING 10.3 Example Stencil Buffer Usage, Stencil Border Decorations

// Clear stencil buffer to 0
glClearStencil(0);
glClear(GL_STENCIL_BUFFER_BIT);
// Setup Stencil state for border rendering
glStencilFuncSeparate(GL_FRONT, GL_ALWAYS, 1, 0xff);
glStencilOpSeparate(GL_FRONT, GL_KEEP, GL_ZERO, GL_REPLACE);
// Render border decorations
. . .
// Now, border decoration pixels have a stencil value of 1
// All other pixels have a stencil value of 0.
// Setup Stencil state for regular rendering,
// fail if pixel would overwrite border. The comparison is
// (ref & mask) func (stencil & mask), so GL_EQUAL with ref 0
// passes only where the stencil value is still 0.
glStencilFuncSeparate(GL_FRONT_AND_BACK, GL_EQUAL, 0, 0xff);
glStencilOpSeparate(GL_FRONT, GL_KEEP, GL_KEEP, GL_KEEP);
// Render the rest of the scene, will not render over stenciled
// border content
. . .

There are also two other stencil functions: glStencilFunc and glStencilOp. These behave
the same way as glStencilFuncSeparate and glStencilOpSeparate with the face set to
GL_FRONT_AND_BACK.

Depth Testing
After stencil operations are complete, the hardware tests the depth value of a fragment when depth testing is enabled. If depth writes are enabled and the fragment has passed the depth test, the depth buffer is updated with the new depth value of the fragment. If the depth test fails, the fragment is killed and does not pass to the other stages of fragment operations. We have used depth buffers and depth testing throughout the entire book. Their operation should be as familiar as waking up in the morning! As a refresher you can take a peek back at Chapter 3.

Depth Clamp
There is one more useful piece of functionality related to depth testing called depth clamping. Depth clamping is disabled by default but can be enabled by calling glEnable(GL_DEPTH_CLAMP). If depth clamping is enabled, the incoming pixel’s depth will be clamped to the near and far clip planes before the depth test is performed.

Depth clamping can be useful in preventing geometry from being clipped to the clip volume. One applicable case is shadow volume rendering. When rendering shadow volumes you want to preserve as much of the geometry along the z-axis as possible. To do this you can enable depth clamping, which prevents data that is farther than the far clip plane or nearer than the near clip plane from being cut off.
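The clamp itself is simple. The sketch below assumes the depth is already in window space and that nearVal and farVal are the depth range values (0 and 1 by default, as set through glDepthRange); clamp_depth is a hypothetical name.

```c
/* Clamps a window-space depth value into the depth range.
 * min/max are taken first so a reversed range (near > far) also works. */
float clamp_depth(float z, float nearVal, float farVal)
{
    float lo = nearVal < farVal ? nearVal : farVal;
    float hi = nearVal < farVal ? farVal : nearVal;
    if (z < lo) return lo;
    if (z > hi) return hi;
    return z;
}
```

A fragment that would have landed at depth 1.5 (beyond the far plane) is kept and tested at depth 1.0 instead of being clipped away.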

Blending Everything Together
Once a fragment passes depth testing, it is handed off to the blending stage. Blending allows you to combine the incoming source color with the color already in the color buffer or with other constants using one of the many supported blend equations. Blending can only be done on fixed- and floating-point formats. You cannot blend with integer formats such as GL_RGB16I or GL_RGB32I. Also, if the buffer you are drawing to is fixed-point, the incoming source colors will be clamped to the range 0.0 to 1.0 before any blending operations occur. Blending is controlled on a per-drawbuffer basis and is enabled by calling glEnablei(GL_BLEND, bufferIndex). Just as with glDrawBuffers, the buffer index selects the draw buffer: 0 for GL_DRAW_BUFFER0, 1 for GL_DRAW_BUFFER1, and so on. If the default FBO is bound, blending is performed on all enabled buffers.

Blend Equation
Blending is highly customizable. The first aspect to consider is how you want to combine the pixel value (source) with the framebuffer color (destination). You can choose separate operations for the RGB values and the alpha values if you use glBlendEquationSeparate, or use the same equation for both RGB and alpha if you use glBlendEquation. The blend equations available are listed in Table 10.3. Blending is performed as if the source and destination colors were floating-point.

glBlendEquation(GLenum mode);
glBlendEquationSeparate(GLenum modeRGB, GLenum modeAlpha);
[Table 10.3: blend equations]

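For a single color channel, the standard blend equations (GL_FUNC_ADD, GL_FUNC_SUBTRACT, GL_FUNC_REVERSE_SUBTRACT, GL_MIN, GL_MAX) can be modeled as below. The enum and function names are hypothetical, and the final clamping that applies to fixed-point buffers is left out to keep the sketch short.

```c
/* Hypothetical stand-ins for the OpenGL blend equation enums. */
typedef enum {
    BE_ADD, BE_SUBTRACT, BE_REVERSE_SUBTRACT, BE_MIN, BE_MAX
} BlendEquation;

/* Combines one source channel and one destination channel. */
float blend_channel(BlendEquation eq,
                    float src, float srcFactor,
                    float dst, float dstFactor)
{
    switch (eq) {
    case BE_ADD:              return src * srcFactor + dst * dstFactor;
    case BE_SUBTRACT:         return src * srcFactor - dst * dstFactor;
    case BE_REVERSE_SUBTRACT: return dst * dstFactor - src * srcFactor;
    case BE_MIN:              return src < dst ? src : dst; /* factors unused */
    case BE_MAX:              return src > dst ? src : dst; /* factors unused */
    }
    return src;
}
```

The min and max equations ignore the blend factors entirely, which is why they appear in the switch without srcFactor or dstFactor.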
Blend Function
Now that you have chosen an equation to combine the source and destination colors, you have to set the factors used in the blend equation. This can be done by calling glBlendFunc or glBlendFuncSeparate with the factors you intend to use. Just like glBlendEquation, you can either set separate functions for RGB and alpha or use one command to set them both to the same value.

glBlendFuncSeparate(GLenum srcRGB, GLenum dstRGB, GLenum srcAlpha, GLenum dstAlpha);
glBlendFunc(GLenum src, GLenum dst);

The possible values for these calls can be found in Table 10.4. Note that functions that require addition or subtraction perform these operations as vectors. Some also require a constant value that can be set by calling glBlendColor:

glBlendColor(GLclampf red, GLclampf green, GLclampf blue, GLclampf alpha);
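As one concrete combination, the common pairing of GL_SRC_ALPHA for the source factor and GL_ONE_MINUS_SRC_ALPHA for the destination factor, used with the additive blend equation, works out as in this CPU sketch. The Color type and blend_src_alpha name are hypothetical.

```c
/* Hypothetical RGBA color with floating-point channels. */
typedef struct { float r, g, b, a; } Color;

/* Models glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA) with the
 * additive blend equation: result = src * srcA + dst * (1 - srcA). */
Color blend_src_alpha(Color src, Color dst)
{
    float sf = src.a;          /* source factor */
    float df = 1.0f - src.a;   /* destination factor */
    Color out = {
        src.r * sf + dst.r * df,
        src.g * sf + dst.g * df,
        src.b * sf + dst.b * df,
        src.a * sf + dst.a * df
    };
    return out;
}
```

Drawing half-transparent red (1, 0, 0, 0.5) over opaque blue (0, 0, 1, 1) yields (0.5, 0, 0.5, 0.75). Note that the same factors are applied to alpha here; glBlendFuncSeparate exists for cases where that is not wanted.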

You may have noticed that some of the factors in Table 10.4 use source 0 colors, and others use source 1 colors. Your shaders can export more than one final color for a given color attachment by setting up the outputs using glBindFragDataLocationIndexed. The way to make use of two outputs is by blending the colors together using appropriate blend factors. You can find out how many dual output buffers are supported by querying the value of GL_MAX_DUAL_SOURCE_DRAW_BUFFERS.

Dithering
Great! Your pixel has almost gotten to the end of the pipeline. After blending, pixel data is still represented as a set of floating-point numbers. But unless your framebuffer is a floating-point buffer, the pixel data has to be converted before it can be stored. For instance, most window-renderable formats only support 8 bits of color per channel. That means the GPU has to convert the final color output before it can be stored.

This conversion can happen in one of two ways, depending on whether dithering is disabled or enabled. First, the result can be simply mapped to the largest positive representable color. For instance, if the R value of a particular pixel is 0.3222 and the window format is GL_RGB8, the GPU could map this to either value 82 of 256 or 83 of 256. If dithering is disabled, the GPU automatically chooses 83. You can force this behavior by calling glDisable(GL_DITHER).

The second option is to dither the result. Dithering is enabled by default, but you can also enable it by calling glEnable(GL_DITHER). What is dithering? It is a way for the hardware to blend the transition from one representable color to the next step. Instead of an abrupt switch from one color level to another, a GPU can soften the border of the transition by mixing the two colors together in areas where neither neighboring color can truly represent the color at that location. Take a look at Figure 10.5.
[Figure 10.5: dithering]

There are several formulas to compute how dithering is done. But basically, if the underlying color is between 82 and 83 for an 8-bit color buffer, the percentage of each used is proportional to how close the color is to 82 or 83. It is worth noting that the dithering algorithm is up to each vendor. Some implementations may choose to simply step right to the next shade when certain color buffer formats are used.
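Since the algorithm is vendor-specific, the following is only one possible scheme: a 2x2 ordered dither, where each pixel position has its own threshold and the fractional part of the color decides whether that pixel gets the lower or the higher level. The threshold table, the scaling by 255, and the function name are all assumptions made for this sketch.

```c
/* Per-position thresholds for a 2x2 ordered dither (an assumed pattern). */
static const float kDitherThreshold[2][2] = {
    { 0.125f, 0.625f },
    { 0.875f, 0.375f }
};

/* Converts a [0, 1] color value to 8 bits at window position (x, y). */
unsigned char quantize_dithered(float value, int x, int y)
{
    if (value < 0.0f) value = 0.0f;
    if (value > 1.0f) value = 1.0f;

    float scaled = value * 255.0f;
    int base = (int)scaled;             /* lower representable level */
    float frac = scaled - (float)base;  /* how far toward the next level */
    if (base >= 255) return 255;

    /* Pixels whose threshold is below the fraction take the higher level. */
    return (unsigned char)(base + (frac > kDitherThreshold[y & 1][x & 1] ? 1 : 0));
}
```

For a value of 0.3222, which scales to about 82.16, one pixel of each 2x2 block comes out as 83 and the other three as 82, so the block's average lands between the two levels instead of snapping to one of them.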

Dithering can be very handy. It can eliminate banding issues when your objects are gradually smooth-shaded. The best part is you don’t even have to worry about it. Dithering is enabled by default and works to make your rendering more pleasing and natural.

Logic Ops
Once the pixel color is in the same format and bit depth as the framebuffer, there are two more steps that can affect the final result. The first allows you to apply a logical operation to the pixel color before it is passed on. When enabled, the effects of blending are ignored. Logic operations do not affect floating-point buffers. You can enable logic ops by calling

glEnable(GL_COLOR_LOGIC_OP);

Logic operations use the values of the incoming pixel and the existing framebuffer contents to compute a final value, somewhat like blending. You can pick the operation that computes the final value. The possible options are listed in Table 10.5. Pass your logic op of choice into glLogicOp:

glLogicOp(GLenum op);

[Table 10.5: logic operations]

Logic ops are applied separately to each color channel. Operations that combine source and destination are performed bitwise on the color values. Logic ops are not commonly used in today’s graphics applications but still remain part of OpenGL because the functionality is still supported on common GPUs.
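A per-channel, bitwise model of a few of the operations might look like this in C. Only a subset is shown, and the enum names are hypothetical stand-ins for constants such as GL_CLEAR, GL_SET, GL_COPY, GL_AND, GL_OR, GL_XOR, and GL_INVERT; an 8-bit channel is assumed.

```c
#include <stdint.h>

/* Hypothetical stand-ins for a subset of the OpenGL logic op enums. */
typedef enum {
    LO_CLEAR, LO_SET, LO_COPY, LO_AND, LO_OR, LO_XOR, LO_INVERT
} LogicOp;

/* Applies one logic op to a single 8-bit channel.
 * src is the incoming value, dst is what the framebuffer already holds. */
uint8_t logic_op(LogicOp op, uint8_t src, uint8_t dst)
{
    switch (op) {
    case LO_CLEAR:  return 0x00;
    case LO_SET:    return 0xFF;
    case LO_COPY:   return src;
    case LO_AND:    return src & dst;
    case LO_OR:     return src | dst;
    case LO_XOR:    return src ^ dst;
    case LO_INVERT: return (uint8_t)~dst;  /* ignores the source */
    }
    return src;
}
```

XOR has the handy property that applying the same source twice restores the original destination, which is why it was historically used for things like rubber-band selection rectangles.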

Masking Output
One of the last modifications that can be made to a fragment before it is written is masking. By now you recognize that three different types of data can be written by a fragment shader: color, depth, and stencil data. Likewise, there are separate operations you can use to mask the result of each.

Color
To mask color writes or prevent color writes from happening, you can use glColorMask and glColorMaski. You do not have to mask all color channels at once; you can choose to mask the red and green channels while permitting writes to the blue channel, for instance. You can pass in GL_TRUE for a channel to allow writes for that channel to occur, or GL_FALSE to mask these writes off. The first function, glColorMask, allows you to mask all buffers currently enabled for rendering, while the second function, glColorMaski, allows you to set the mask for a specific color buffer.

glColorMask(writeR, writeG, writeB, writeA);
glColorMaski(colorBufIndex, writeR, writeG, writeB, writeA);

Depth
Writes to the depth buffer can be masked in a similar way. glDepthMask also takes a Boolean value that turns writes on if GL_TRUE and off if GL_FALSE.

glDepthMask(GL_FALSE);

Stencil
Stencil buffers can be masked too. You guessed it; the function you use for stencil buffers is called glStencilMask. But unlike the other functions, you have more granular control over what gets masked off. Instead of just setting a Boolean value, the stencil mask functions take a bitfield. The least significant portion of this bitfield maps to the same number of bits in the stencil buffer. If a mask bit is set to 1, the corresponding bit in the stencil buffer can be updated. But if the mask bit is 0, the corresponding stencil bit will not be written to.

GLuint mask = 0x0007;
glStencilMask(mask);
glStencilMaskSeparate(GL_BACK, mask);

In the preceding example, the first call to glStencilMask enables the lower three bits of the stencil buffer for writing while leaving the rest disabled. The second call, glStencilMaskSeparate, allows you to set separate masks for primitives that are front-facing and back-facing.
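The effect of the write mask on a stored stencil value can be modeled with a couple of bitwise operations. This is a CPU sketch for an 8-bit stencil buffer; masked_stencil_write is a hypothetical name.

```c
#include <stdint.h>

/* Bits set in writeMask take the new value; all other bits keep
 * whatever the stencil buffer already held. */
uint8_t masked_stencil_write(uint8_t stored, uint8_t newValue, uint8_t writeMask)
{
    return (uint8_t)((stored & (uint8_t)~writeMask) | (newValue & writeMask));
}
```

With the mask 0x07 from the example, writing 0xFF over a stored value of 0xF0 produces 0xF7: only the lower three bits change.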

Usage
Write masks can be useful for many operations. For instance, if you want to fill a shadow volume with depth information, you can mask off all color writes because only the depth information is important. Or if you want to draw a decal directly to screen space, you can disable depth writes to prevent the depth data from being polluted. The key point about masks is you can set them and immediately call your normal rendering paths, which may set up necessary buffer state and output all color, depth, and stencil data you would normally use without needing any knowledge of the mask state. You don’t have to alter your shaders to not write some value, detach some set of buffers, or change the enabled draw buffers. The rest of your rendering paths can be completely oblivious and still generate the right results.
