OpenGL Programming Guide, Chapter 10: Geometry Shaders

In this chapter, we introduce an entirely new shader stage: the geometry shader. The geometry shader sits logically right before primitive assembly and fragment shading. It receives complete primitives as its input, each presented as a collection of vertices, and these inputs are represented as arrays. Typically, the inputs are provided by the vertex shader. However, when tessellation is active, the input to the geometry shader is provided by the tessellation evaluation shader. Because each invocation of the geometry shader processes an entire primitive, it is possible to implement techniques that require access to all of the vertices of that primitive.

In addition to this enhanced, multivertex access, the geometry shader can output a variable amount of data. Outputting nothing amounts to culling geometry, and outputting more vertices than were in the original primitive results in geometry amplification. The geometry shader is also capable of producing a different primitive type at its output than it accepts on its input, allowing it to change the type of geometry as it passes through the pipeline. There are four special primitive types provided for use as inputs to geometry shaders. Finally, geometry shaders can be used with transform feedback to split an input stream of vertex data into several substreams. These are very powerful features that enable a large array of techniques and algorithms to be implemented on the GPU.

This chapter has the following major sections:

  1. “Creating a Geometry Shader” describes the fundamental mechanics of using geometry shaders.
  2. “Geometry Shader Inputs and Outputs” defines the input and output data structures used with geometry shaders.
  3. “Producing Primitives” illustrates how primitives can be generated within a geometry shader.
  4. “Advanced Transform Feedback” extends the transform feedback mechanism to support more advanced techniques.
  5. “Geometry Shader Instancing” describes optimization techniques available when using geometry shaders for geometric instancing.
  6. “Multiple Viewports and Layered Rendering” explains rendering to multiple viewports in a single rendering pass.

Creating a Geometry Shader
Geometry shaders are created in exactly the same manner as any other type of shader: by using the glCreateShader() function. To create a geometry shader, pass GL_GEOMETRY_SHADER as the shader type parameter to glCreateShader(). The shader source is passed as normal using the glShaderSource() function, and then the shader is compiled using glCompileShader(). Multiple geometry shaders may be attached to a single program object, and when that program is linked, the attached geometry shaders will be linked into an executable that can run on the GPU. When a program object containing a geometry shader is active, the geometry shader will run on each primitive produced by OpenGL. These primitives may be points, lines, triangles, or one of the special adjacency primitives, which will be discussed shortly.

The geometry shader is an optional stage in OpenGL; your program object does not need to contain one. It sits right before rasterization and fragment shading. The output of the geometry shader can be captured using transform feedback, and it is often used in this mode to process vertices for use in subsequent rendering, or even for nongraphics tasks. If no fragment shader is present, rasterization can even be turned off by calling glEnable() with the parameter GL_RASTERIZER_DISCARD. This makes transform feedback the end of the pipeline, and it can be used in this mode when only the captured vertex data is of interest and the rendering of primitives is not required.

One of the unique aspects of the geometry shader is that it is capable of changing the type and number of primitives that are passing through the OpenGL pipeline. The methods and applications of doing these things will be explained shortly. However, before a geometry shader may be linked, the input primitive type, the output primitive type, and the maximum number of vertices that it might produce must be specified. These parameters are given in the form of layout qualifiers in the geometry shader source code.

Example 10.1 shows a very basic example of a geometry shader that simply passes primitives through unmodified (a pass-through geometry shader).

Example 10.1 A Simple Pass-Through Geometry Shader

// This is a very simple pass-through geometry shader
#version 330 core
// Specify the input and output primitive types, along with the maximum number of vertices that this shader
// might produce. Here, the input type is triangles and the output type is triangle strips.
layout (triangles) in;
layout (triangle_strip, max_vertices = 3) out;
// Geometry shaders have a main function just like any other type of shader

void main()
{
	int n;
	// Loop over the input vertices
	for (n = 0; n < gl_in.length(); n++)
	{
		// Copy the input position to the output
		gl_Position = gl_in[n].gl_Position;
		// Emit the vertex
		EmitVertex();
	}
	// End the primitive. This is not strictly necessary and is only here for illustrative purposes.
	EndPrimitive();
}

This shader simply copies its input into its output. You don’t need to worry about how this works right now, but you might notice several features of this example that are unique to geometry shaders. First, at the top of the shader is a pair of layout qualifiers containing the declaration of the input and output primitive types and the maximum number of vertices that may be produced. These are shown in Example 10.2.
Example 10.2 Geometry Shader Layout Qualifiers

layout (triangles) in;
layout (triangle_strip, max_vertices = 3) out;

The first line specifies that the input primitive type is triangles. This means that the geometry shader will be run once for each triangle rendered. Drawing commands used by the program must use a primitive mode that is compatible with the primitive type expected by the geometry shader (if present). If a drawing command specifies strips or fans (GL_TRIANGLE_STRIP or GL_TRIANGLE_FAN, in the case of triangles), the geometry shader will run once for each triangle in the strip or fan. The second line of the declaration specifies that the output of the geometry shader is triangle strips and that the maximum number of vertices that will be produced is three. The primitive types accepted as inputs to the geometry shader and the corresponding primitive types that are allowed to be used in drawing commands are listed in Table 10.1. Notice that even though we are only producing a single triangle in this example, we still specify that the output primitive type is triangle strips.
Geometry shaders are designed to produce only points, line strips, or triangle strips, but not individual lines or triangles, nor loops or fans. This is because strips are a superset of individual primitive types—think of an independent triangle or line as a strip of only one primitive. By terminating the strip after only a single triangle, independent triangles may be drawn.

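[Table 10.1: primitive types accepted as geometry shader inputs and the corresponding primitive modes allowed in drawing commands]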

The special GLSL function EmitVertex() produces a new vertex at the output of the geometry shader. Each time it is called, a vertex is appended to the end of the current strip if the output primitive type is line_strip or triangle_strip. If the output primitive type is points, then each call to EmitVertex() produces a new, independent point. A second special geometry shader function, EndPrimitive(), breaks the current strip and signals OpenGL that a new strip should be started the next time EmitVertex() is called. As discussed, single primitives such as lines or triangles are not directly supported, but independent lines or triangles can be generated by calling EndPrimitive() after every two vertices when producing line strips, or after every three vertices when producing triangle strips. As there is no such thing as a point strip, each point is treated as an individual primitive, and so calling EndPrimitive() when the output primitive mode is points has no effect (although it is still legal). When the geometry shader exits, the current primitive is ended implicitly, and so it is not strictly necessary to call EndPrimitive() explicitly at the end of the geometry shader. When EndPrimitive() is called (or at the end of the shader), any incomplete primitives will simply be discarded. That is, if the shader produces a triangle strip with only two vertices or a line strip with only one vertex, the extra vertices making up the partial strip will be thrown away.

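Summary: EmitVertex() produces a new vertex; EndPrimitive() ends the current primitive. For example, if the output is a triangle strip, calling EndPrimitive() after three vertices have been emitted marks the end of the current primitive. Calling it is not mandatory: when the geometry shader finishes, the current primitive is ended automatically.
Also note that if the output is declared as a triangle strip and EndPrimitive() is called when only two vertices have been emitted, the current primitive ends and those two vertices are invalid and are discarded. Likewise, if the output is declared as a line strip and EndPrimitive() is called when only one vertex has been emitted, the current primitive ends and that one vertex is discarded.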
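
The strip-breaking behavior described above can be sketched with a shader that emits two independent triangles for every input triangle. This is only an illustration: the copy_offset uniform and its default value are hypothetical names introduced here, not part of the original example.

```glsl
#version 330 core

layout (triangles) in;
// Two independent triangles of three vertices each
layout (triangle_strip, max_vertices = 6) out;

// Hypothetical uniform: clip-space offset applied to the second copy
uniform vec4 copy_offset = vec4(0.5, 0.0, 0.0, 0.0);

void main()
{
	// First triangle: an unmodified copy of the input
	for (int n = 0; n < gl_in.length(); n++)
	{
		gl_Position = gl_in[n].gl_Position;
		EmitVertex();
	}
	// Break the strip so that the next vertex starts a new, unconnected triangle
	EndPrimitive();

	// Second triangle: the same vertices shifted by copy_offset
	for (int n = 0; n < gl_in.length(); n++)
	{
		gl_Position = gl_in[n].gl_Position + copy_offset;
		EmitVertex();
	}
	// Optional: the primitive is ended implicitly when the shader exits
	EndPrimitive();
}
```

Without the first EndPrimitive() call, the six vertices would form a single strip of four connected triangles rather than two separate ones.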

Geometry Shader Inputs and Outputs
The inputs and outputs of the geometry shader are specified using layout qualifiers and the in and out keywords in GLSL. In addition to user-defined inputs and outputs, there are several built-in inputs and outputs that are specific to geometry shaders. These are described in some detail in the following subsections. The in and out keywords are also used in conjunction with layout qualifiers to specify how the geometry shader fits into the pipeline, how it behaves, and how it interacts with adjacent shader stages.

Geometry Shader Inputs
The input to the geometry shader is fed by the output of the vertex shader or, if tessellation is active, the output of the tessellation evaluation shader. As the geometry shader runs once per input primitive, outputs from the previous stage (vertex shader or tessellation evaluation shader) become arrays in the geometry shader. This includes all user-defined inputs and the special built-in input variable, gl_in, which is an array containing the built-in outputs that are available in the previous stage. The gl_in input is implicitly declared as an interface block. The definition of gl_in is shown in Example 10.3.

Example 10.3 Implicit Declaration of gl_in[]

in gl_PerVertex 
{
	vec4 gl_Position;
	float gl_PointSize;
	float gl_ClipDistance[];
} gl_in[];

As noted, gl_in is implicitly declared as an array. The length of the array is determined by the input primitive type. Whatever is written to gl_Position, gl_PointSize, or gl_ClipDistance in the vertex shader (or tessellation evaluation shader) becomes visible to the geometry shader in the corresponding member of each element of the gl_in array. Like any array, the number of elements in the gl_in array can be found using the .length() method. Returning to our example geometry shader, we see a loop:

// Loop over the input vertices
for (n = 0; n < gl_in.length(); n++)
{
	...
}

The loop runs over the elements of the gl_in array, whose length is dependent on the input primitive type declared at the top of the shader.
In this particular shader, the input primitive type is triangles, meaning that each invocation of the geometry shader processes a single triangle, and so gl_in.length() will return three. This is very convenient, as it allows us to change the input primitive type of the geometry shader without changing any source code except the input primitive type layout qualifier.
For example, if we change the input primitive type to lines, the geometry shader will now run once per line, and gl_in.length() will return two. The rest of the code in the shader need not change.
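
As a sketch of that change, the pass-through shader can be retargeted to lines by editing only the layout qualifiers. The choice of line_strip with max_vertices = 2 as the output declaration is an assumption made here for illustration; the original text only discusses changing the input type.

```glsl
#version 330 core

// Only the layout qualifiers differ from the triangle version
layout (lines) in;
layout (line_strip, max_vertices = 2) out;

void main()
{
	// gl_in.length() is 2 because the input primitive type is lines
	for (int n = 0; n < gl_in.length(); n++)
	{
		// Copy the input position to the output and emit the vertex
		gl_Position = gl_in[n].gl_Position;
		EmitVertex();
	}
	EndPrimitive();
}
```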

The size of the input arrays is determined by the type of primitives that the geometry shader accepts. The accepted primitive types are points, lines, triangles, lines_adjacency, and triangles_adjacency.

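The number of vertices in each primitive of these types is shown in Table 10.2 below.

Table 10.2 Input Primitive Types and Input Array Sizes

Input primitive type    Input array size
points                  1
lines                   2
triangles               3
lines_adjacency         4
triangles_adjacency     6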

The first three represent points, lines, and triangles, respectively. Points are represented by single vertices, and so although the inputs to the geometry shader are still arrays, the length of those arrays is one.
Note: the length of one applies only to points. For lines and triangles, the input arrays have two and three elements, respectively.
Lines and triangles are generated both by independent primitives (the GL_LINES and GL_TRIANGLES primitive types) and from the individual members of strips and fans (GL_TRIANGLE_STRIP, for example). Even if the drawing command specified GL_TRIANGLE_STRIP, GL_TRIANGLE_FAN, GL_LINE_STRIP, or GL_LINE_LOOP, the geometry shader still receives individual primitives as appropriate.

The last two input primitive types represent adjacency primitives, which are special primitives that are accepted by the geometry shader. They have special meaning and interpretation when no geometry shader is present (which will be described shortly), but in most cases where a geometry shader is present, they can be considered to be simple collections of four or six vertices, and it is up to the geometry shader to convert them into some other primitive type. You cannot specify an adjacency primitive type as the output mode of the geometry shader. Just as the built-in variable gl_in is an array with a length determined by the input primitive type, so are user-defined inputs. Consider the following vertex shader output declarations:

out vec4 position;
out vec3 normal;
out vec4 color;
out vec2 tex_coord;

In the geometry shader, these must be declared as arrays as follows:

in vec4 position[];
in vec3 normal[];
in vec4 color[];
in vec2 tex_coord[];

Note that the size of the arrays does not have to be given explicitly. If the array declarations are left unsized, then the size is implied by the input primitive type declared earlier in the shader. If the size is given explicitly, then it is cross-checked at compile time against the input primitive type, giving an additional layer of error checking. If an input array is declared with an explicit size and that size does not match what is expected given the input primitive type, the GLSL compiler will generate an error.
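
The two declaration styles can be sketched side by side as follows. The output name gs_color is hypothetical and introduced only so that the shader does something with its input.

```glsl
#version 330 core

layout (triangles) in;
layout (triangle_strip, max_vertices = 3) out;

// Explicit size: 3 matches the triangles input type, so this compiles
in vec4 color[3];
// Unsized: the size is implied by the input primitive type
in vec3 normal[];

// Changing the first declaration to "in vec4 color[2];" would be
// rejected by the compiler, because triangles supply three vertices.

// Hypothetical output passed on to the fragment shader
out vec4 gs_color;

void main()
{
	for (int n = 0; n < gl_in.length(); n++)
	{
		gl_Position = gl_in[n].gl_Position;
		gs_color = color[n];
		EmitVertex();
	}
	EndPrimitive();
}
```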

GLSL versions earlier than 4.3 did not contain support for two-dimensional arrays. So, what happened to vertex shader outputs that are declared as arrays? To pass an array from a vertex shader to a geometry shader, we took advantage of an interface block. Using an interface block helps group all the data for a single vertex, rather than managing collections of arrays, so you may want to use interface blocks regardless of arrays or version numbers. The interface block can contain arrays, but it is the interface block itself that becomes an array when passed into a geometry shader. This technique is already used in the definition of the gl_in[] built-in variable—the gl_ClipDistance[] array is a member of the block.

Consider the example above. Let’s assume that we wish to pass more than one texture coordinate from the vertex shader to the fragment shader. We will do that by making tex_coord an array. We can re-declare the variables listed in the example in an interface block and see how that affects their declaration in the geometry shader.

First, in the vertex shader:

out VS_GS_INTERFACE
{
	out vec4 position;
	out vec3 normal;
	out vec4 color;
	out vec2 tex_coord[4];
} vs_out;

Now, in the geometry shader:

in VS_GS_INTERFACE
{
	vec4 position;
	vec3 normal;
	vec4 color;
	vec2 tex_coord[4];
} gs_in[];

Now we have declared the output of the vertex shader as vs_out using an interface block, which is matched to gs_in[] in the geometry shader. Remember that interface block matching is performed by block name (VS_GS_INTERFACE in this example) rather than instance name. This allows the variables representing the block instance to have a different name in each shader stage. gs_in[] is an array, and the four texture coordinates are available in the geometry shader as gs_in[n].tex_coord[m]. Anything that can be passed from a vertex shader to a fragment shader can be passed in this manner, including arrays, structures, matrices, and other compound types.

In addition to the built-in members of gl_in[] and to user-defined inputs, there are a few other special inputs to the geometry shader. These are gl_PrimitiveIDIn and gl_InvocationID. The first, gl_PrimitiveIDIn, is the equivalent of gl_PrimitiveID that is available to the fragment shader. The In suffix distinguishes it from gl_PrimitiveID, which is actually an output in the geometry shader and must be assigned by the geometry shader if it is to be made available in the subsequent fragment shader. The second input, gl_InvocationID, is used during geometry shader instancing, which will be explained shortly. Both gl_PrimitiveIDIn and gl_InvocationID are intrinsically declared as integers.
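
A minimal sketch of forwarding the primitive ID, assuming a pass-through shader like the one in Example 10.1, might look like this:

```glsl
#version 330 core

layout (triangles) in;
layout (triangle_strip, max_vertices = 3) out;

void main()
{
	for (int n = 0; n < gl_in.length(); n++)
	{
		gl_Position = gl_in[n].gl_Position;
		// gl_PrimitiveID is an output here; assign it before each
		// EmitVertex() so that the fragment shader can read it
		gl_PrimitiveID = gl_PrimitiveIDIn;
		EmitVertex();
	}
	EndPrimitive();
}
```

The assignment is repeated for every vertex because output values are not guaranteed to persist across a call to EmitVertex().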

Special Geometry Shader Primitives
Special attention should be paid to the adjacency primitive types available to geometry shaders (lines_adjacency and triangles_adjacency).

These primitives have four and six vertices, respectively, and allow adjacency information (information about adjacent primitives or edges) to be passed into the geometry shader. Lines with adjacency information are generated by using the GL_LINES_ADJACENCY or GL_LINE_STRIP_ADJACENCY primitive mode in a draw command such as glDrawArrays(). Likewise, triangles with adjacency information are produced by using the GL_TRIANGLES_ADJACENCY or GL_TRIANGLE_STRIP_ADJACENCY primitive types. These primitive types can be used without a geometry shader present and will be interpreted as lines or triangles with additional vertices being discarded.

Lines with Adjacency
At the input of the geometry shader, each lines_adjacency primitive is represented as a four-vertex primitive (i.e., the geometry shader inputs such as gl_in and user-defined inputs are four-element arrays).

Geometry Shader Outputs

The output of the geometry shader is fed into the primitive setup engine, the rasterizer, and eventually the fragment shader. In general, the output of the geometry shader is equivalent to the output of the vertex shader if no geometry shader is present. Many of the same outputs exist in the geometry shader as exist in the vertex shader. The same gl_PerVertex interface block specification is used for per-vertex outputs in the geometry shader. The definition of this block is given in Example 10.4.
Example 10.4 Implicit Declaration of Geometry Shader Outputs

out gl_PerVertex
{
	vec4 gl_Position;
	float gl_PointSize;
	float gl_ClipDistance[];
};

Fur Rendering Using a Geometry Shader

The following is a worked example of using amplification in a geometry shader to produce a fur-rendering effect. This is an implementation of the fur shell method. There are several methods for rendering fur and hair, but this one neatly demonstrates how moderate amplification in a geometry shader can be used to implement the effect. The basic principle is that hair or fur on a surface is modeled as a volume that is rendered using slices, and the geometry shader is used to generate those slices. The more slices that are rendered, the more detailed and continuous the hair effect will be. This number can be varied to hit a particular performance or quality target. The input to the geometry shader is the triangles forming the underlying mesh, and the effect parameters are the number of layers (shells) and the depth of the fur. The geometry shader produces the fur shells by displacing the incoming vertices along their normals, essentially producing multiple copies of the incoming geometry. As the shells are rendered, the fragment shader uses a fur texture to selectively blend and ultimately discard pixels that are not part of a hair. The geometry shader is shown in Example 10.7.

Example 10.7 Fur Rendering Geometry Shader

// Fur rendering geometry shader
#version 330 core
// Triangles in, triangles out, large max_vertices as we’re amplifying
layout (triangles) in;
layout (triangle_strip, max_vertices = 120) out;
uniform mat4 model_matrix;
uniform mat4 projection_matrix;
// The number of layers in the fur volume and the depth of the volume
uniform int fur_layers = 30;
uniform float fur_depth = 5.0;
// Input from the vertex shader
in VS_GS_VERTEX
{
	vec3 normal;
	vec2 tex_coord;
} vertex_in[];
// Output to the fragment shader
out GS_FS_VERTEX
{
	vec3 normal;
	vec2 tex_coord;
	flat float fur_strength;
} vertex_out;
void main()
{
	int i, layer;
	// The displacement between each layer
	float disp_delta = 1.0 / float(fur_layers);
	float d = 0.0;
	// For each layer...
	for (layer = 0; layer < fur_layers; layer++)
	{
		// For each incoming vertex (should be three of them)
		for (i = 0; i < gl_in.length(); i++) 
		{
			// Get the vertex normal
			vec3 n = vertex_in[i].normal;
			// Copy it to the output for use in the fragment shader
			vertex_out.normal = n;
			// Copy the texture coordinate too - we’ll need that to fetch from the fur texture
			vertex_out.tex_coord = vertex_in[i].tex_coord;
			// Fur "strength" reduces linearly along the length of the hairs
			vertex_out.fur_strength = 1.0 - d;
			// This is the core - displace each vertex along its normal to generate shells
			vec4 position = gl_in[i].gl_Position + vec4(n * d * fur_depth, 0.0);
			// Transform into place and emit a vertex
			gl_Position = projection_matrix * (model_matrix * position);
			EmitVertex();
		}
		// Move outwards by our calculated delta
		d += disp_delta;
		// End the "strip" ready for the next layer
		EndPrimitive();
	}
}

The geometry shader in Example 10.7 begins by specifying that it takes triangles as input and will produce a triangle strip as output with a maximum of 120 vertices. This is quite a large number, but we will not use them all unless the number of fur layers is increased significantly. A maximum of 120 output vertices allows for 40 fur layers at three vertices per layer; the default of 30 layers emits 90 vertices. The shader will displace vertices along their normal vectors (which are assumed to point outwards) and amplify the incoming geometry to produce the shells that will be used to render the fur. The step between successive shells is calculated into disp_delta. Then, for each layer (the number of layers is in the fur_layers uniform), the vertex position is displaced by scaling the normal and adding it to the original position. A displaced version of the triangle is thus generated by performing the operation on each vertex. A call to EndPrimitive() after each layer causes the geometry shader to create unconnected triangles as its output.

Next, we pass into the fragment shader, which is given in Example 10.8 below.

Example 10.8 Fur Rendering Fragment Shader

#version 330 core
// One output
layout (location = 0) out vec4 color;
// The fur texture
uniform sampler2D fur_texture;
// Color of the fur. Silvery gray by default...
uniform vec4 fur_color = vec4(0.8, 0.8, 0.9, 1.0);
// Input from the geometry shader
in GS_FS_VERTEX
{
	vec3 normal;
	vec2 tex_coord;
	flat float fur_strength;
} fragment_in;
void main()
{
	// Fetch from the fur texture. We’ll only use the alpha channel here, but we could easily have a color fur texture.
	vec4 rgba = texture(fur_texture, fragment_in.tex_coord);
	float t = rgba.a;
	// Multiply by fur strength calculated in the GS for the current shell
	t *= fragment_in.fur_strength;
	// Scale fur color alpha by fur strength.
	color = fur_color * vec4(1.0, 1.0, 1.0, t);
}

[Figure 10.6: the texture used to represent hairs in the fur example]

The fur fragment shader uses a texture to represent the layout of hairs in the fur. The texture used in the fur example is shown in Figure 10.6. The brightness of each texel maps to the length of the hair at that point. Zero essentially means no hair, and white represents hairs whose length is equal to the full depth of the fur volume.

The texture in Figure 10.6 is generated using a simple random placement of hairs. A more sophisticated algorithm could be developed to allow hair density and distribution to be controlled programmatically. The current depth of the shell being rendered is passed from the geometry shader into the fragment shader. The fragment shader uses this, along with the contents of the fur texture, to determine how far along the hair the fragment being rendered is. This information is used to calculate the fragment’s color and opacity, which are used to generate the fragment shader output.

A first pass of the underlying geometry is rendered without the fur shaders active. This represents the skin of the object and prevents holes or gaps appearing when the hair is sparse. Next, the fur rendering shader is activated and another pass of the original geometry is rendered. Depth testing is used to quickly reject fur fragments that are behind the solid geometry. However, while the fur is being rendered, depth writes are turned off. This prevents the very fine tips of the hairs from occluding thicker hairs that may be behind them. Figure 10.7 shows the result of the algorithm.

[Figure 10.7: result of the fur-rendering algorithm]

Advanced Transform Feedback
We have already covered the concept of transform feedback and seen how it works when only a vertex shader is present. In summary, the output of the vertex shader is captured and recorded into one or more buffer objects. Those buffer objects can subsequently be used for rendering (e.g., as vertex buffers) or read back by the CPU using functions like glMapBuffer() or glGetBufferSubData(). We have also seen how to disable rasterization such that only the vertex shader is active. However, the vertex shader is a relatively simple one-in, one-out shader stage and cannot create or destroy vertices. Also, it only has a single set of outputs.

You have just read about the ability of a geometry shader to produce a variable number of output vertices. When a geometry shader is present, transform feedback captures the output of the geometry shader. In addition to the stream of vertices that is usually sent to primitive assembly and rasterization, the geometry shader is capable of producing other, ancillary streams of vertex information that can be captured using transform feedback.
By combining the geometry shader’s ability to produce a variable number of vertices at its output with its ability to send those vertices to any one of several output streams, some sophisticated sorting, bucketing, and processing algorithms can be implemented using the geometry shader and transform feedback.

In this subsection, we will introduce the concept of multiple vertex streams as outputs from the geometry shader. We also introduce methods to determine how many vertices were produced by the geometry shader, both when using a single output stream and when using multiple output streams. Finally, we discuss methods to use data generated by a geometry shader and stored in a transform feedback buffer in subsequent draw commands without requiring a round-trip to the CPU.

Multiple Output Streams
Multiple streams of vertices can be declared as outputs in the geometry shader. Output streams are declared using the stream layout qualifier. This layout qualifier may be applied globally, to an interface block, or to a single output declaration. Each stream is numbered, starting from zero, and an implementation-defined maximum number of streams can be declared. That maximum can be found by calling glGetIntegerv() with the parameter GL_MAX_VERTEX_STREAMS, and all OpenGL implementations are required to support at least four geometry shader output streams. When the stream number is given at global scope, all subsequently declared geometry shader outputs become members of that stream until another output stream layout qualifier is specified. The default output stream for all outputs is zero. That is, unless otherwise specified, all outputs belong to stream zero. The global stream layout qualifiers shown in Example 10.9 demonstrate how to assign geometry shader outputs to different streams.

// Redundant as the default stream is 0
layout (stream=0) out;
// foo and bar become members of stream 0
out vec4 foo;
out vec4 bar;
// Switch the output stream to stream 1
layout (stream=1) out;
// proton and electron are members of stream 1
out vec4 proton;
flat out float electron;
// Output stream declarations have no effect on input
// declarations. elephant is just a regular input (geometry
// shader inputs must be declared as arrays)
in vec2 elephant[];
// It’s possible to go back to a previously
// defined stream
layout (stream=0) out;
// baz joins its cousins foo and bar in stream 0
out vec4 baz;
// And then jump right to stream 3, skipping stream 2
// altogether
layout (stream=3) out;
// iron and copper are members of stream 3
flat out int iron;
out vec2 copper;

The declarations in Example 10.9 set up three output streams from a geometry shader, numbered zero, one, and three. Stream zero contains foo, bar, and baz; stream one contains proton and electron; and stream three contains iron and copper. Note that stream two is not used at all and there are no outputs in it. An equivalent stream mapping can be constructed using output interface blocks and is shown in Example 10.10.

Example 10.10 Example 10.9 Rewritten to Use Interface Blocks

// Again, redundant as the default output stream is 0
layout (stream=0) out stream0
{
	vec4 foo;
	vec4 bar;
	vec4 baz;
};
// All of stream 1 output
layout (stream=1) out stream1
{
	vec4 proton;
	flat float electron;
};
// Skip stream 2, go directly to stream 3
layout (stream=3) out stream3
{
	flat int iron;
	vec2 copper;
};

As can be seen in Example 10.10, grouping members of a single stream in an interface block can make the declarations appear more organized and so easier to read. Now that we have defined which outputs belong to which streams, we need to direct output vertices to one or more of those streams. As with a regular, single-stream geometry shader, vertices are emitted and primitives are ended programmatically using special built-in GLSL functions. When multiple output streams are active, the function to emit vertices on a specific stream is EmitStreamVertex(int stream) and the function to end a primitive on a specific stream is EndStreamPrimitive(int stream). Calling EmitVertex is equivalent to calling EmitStreamVertex with stream set to zero. Likewise, calling EndPrimitive is equivalent to calling EndStreamPrimitive with stream set to zero.

When EmitStreamVertex is called, the current values for any variables associated with the specified stream are recorded and used to form a new vertex on that stream. Just as the values of all output variables become undefined when EmitVertex is called, so too do they become undefined when EmitStreamVertex is called. In fact, the current values of all output variables on all streams become undefined. This is an important consideration, as code that assumes that the values of output variables remain consistent across a call to EmitStreamVertex (or EmitVertex) may work on some OpenGL implementations and not others, and most shader compilers will not warn about this, especially on implementations where it will work!

To illustrate, consider the code shown in Example 10.11.
Example 10.11 Incorrect Emission of Vertices into Multiple Streams

// Set up outputs for stream 0
foo = vec4(1.0, 2.0, 3.0, 4.0);
bar = vec4(5.0);
baz = vec4(4.0, 3.0, 2.0, 1.0);
// Set up outputs for stream 1
proton = atom;
electron = 2.0;
// Set up outputs for stream 3
iron = 4;
copper = shiny;
// Now emit all the vertices
EmitStreamVertex(0);
EmitStreamVertex(1);
EmitStreamVertex(3);

This example will produce undefined results because it assumes that the values of the output variables associated with streams 1 and 3 remain valid across the calls to EmitStreamVertex. This is incorrect, and on some OpenGL implementations, the values of proton, electron, iron, and copper will become undefined after the first call to EmitStreamVertex. Such a shader should be written as shown in Example 10.12.
Example 10.12 Corrected Emission of Vertices into Multiple Streams

// Set up and emit outputs for stream 0
foo = vec4(1.0, 2.0, 3.0, 4.0);
bar = vec4(5.0);
baz = vec4(4.0, 3.0, 2.0, 1.0);
EmitStreamVertex(0);
// Set up and emit outputs for stream 1
proton = atom;
electron = 2.0;
EmitStreamVertex(1);
// Note, there’s nothing in stream 2
// Set up and emit outputs for stream 3
iron = 4;
copper = shiny;
EmitStreamVertex(3);

Now that we have a shader that outputs vertices on multiple output streams, we need to inform OpenGL how those streams are mapped into transform feedback buffers. This mapping is specified with the glTransformFeedbackVaryings() function, just as when only a single output stream is present. Under normal circumstances, all output variables captured by transform feedback are recorded either into a single buffer (by specifying GL_INTERLEAVED_ATTRIBS as the bufferMode parameter to glTransformFeedbackVaryings()) or into a separate buffer for each variable (by specifying GL_SEPARATE_ATTRIBS). When multiple streams are active, it is required that variables associated with a single stream are not written into the same buffer binding point as those associated with any other stream. It may be desirable, however, to have some or all of the varyings associated with a single stream written, interleaved, into a single buffer. To provide this functionality, the reserved variable name gl_NextBuffer is used to signal that the following output variables are to be recorded into the buffer object bound to the next transform feedback binding point. Recall from Chapter 3 that gl_NextBuffer is not a real variable and cannot be used in the shader; it is provided solely as a marker to delimit groups of variables that will be written into the same buffer. For Examples 10.9 and 10.10, we will record the variables for the first stream (foo, bar, and baz) into the buffer object bound to the first transform feedback buffer binding point, the variables for the second stream (proton and electron) into the buffer bound to the second binding point, and finally the variables associated with stream 3 (iron and copper) into the buffer bound to the third buffer binding point. Example 10.13 shows how to express this layout.

Example 10.13 Assigning Transform Feedback Outputs to Buffers

static const char * const vars[] =
{
"foo", "bar", "baz", // Variables from stream 0
"gl_NextBuffer", // Move to binding point 1
"proton", "electron", // Variables from stream 1
"gl_NextBuffer", // Move to binding point 2
// Note, there are no variables
// in stream 2
"iron", "copper" // Variables from stream 3
};
// Pass the array declared above (vars) together with its element count
glTransformFeedbackVaryings(prog, sizeof(vars) / sizeof(vars[0]), vars, GL_INTERLEAVED_ATTRIBS);
glLinkProgram(prog);

Notice the call to glLinkProgram() after the call to glTransformFeedbackVaryings() in Example 10.13. As previously mentioned, the mapping specified by glTransformFeedbackVaryings() does not take effect until the next time the program object is linked. Therefore, it is necessary to call glLinkProgram() after glTransformFeedbackVaryings() and before the program object is used.
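To make the role of gl_NextBuffer concrete, here is a minimal sketch in plain C (CPU-side only, no OpenGL calls) that walks a varyings list and reports which transform feedback binding point each name would be captured into. The function name xfb_binding_for_varyings is hypothetical and not part of OpenGL; it only mirrors the rule described above, assuming GL_INTERLEAVED_ATTRIBS is used.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not an OpenGL function.
 * For each entry of a varyings list intended for GL_INTERLEAVED_ATTRIBS,
 * store the index of the transform feedback buffer binding point it is
 * captured into. The "gl_NextBuffer" markers themselves get -1. */
void xfb_binding_for_varyings(const char *const *names, size_t count,
                              int *binding_out)
{
    int binding = 0;
    size_t i;

    assert(names != NULL && binding_out != NULL);

    for (i = 0; i < count; ++i) {
        if (strcmp(names[i], "gl_NextBuffer") == 0) {
            binding_out[i] = -1;  /* marker only, nothing is captured */
            ++binding;            /* later names go to the next binding */
        } else {
            binding_out[i] = binding;
        }
    }
}
```

Applied to the list from Example 10.13, foo, bar, and baz map to binding 0, proton and electron to binding 1, and iron and copper to binding 2.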

If rasterization is enabled and there is a fragment shader present, the output variables belonging to stream 0 (foo, bar, and baz) will be used to form primitives for rasterization and will be passed into the fragment shader. Output variables belonging to other streams (proton, electron, iron, and copper) will not be visible in the fragment shader and if transform feedback is not active, they will be discarded. Also note that when multiple output streams are used in a geometry shader, they must all have points as the primitive type. This means that if rasterization is used in conjunction with multiple geometry shader output streams, an application is limited to rendering points with that shader.

Primitive Queries
Transform feedback was introduced in "Transform Feedback" on page 239 as a method to record the output of a vertex shader into a buffer that could be used in subsequent rendering. Because the vertex shader is a simple, one-in, one-out pipeline stage, it is known up front how many vertices the vertex shader will generate. Assuming that the transform feedback buffer is large enough to hold all of the output data, the number of vertices stored in the transform feedback buffer is simply the number of vertices processed by the vertex shader. Such a simple relationship is not present for the geometry shader. Because the geometry shader can emit a variable number of vertices per invocation, the number of vertices recorded into transform feedback buffers when a geometry shader is present may not be easy to infer. In addition to this, should the space available in the transform feedback buffers be exhausted, the geometry shader will produce more vertices than are actually recorded. Those vertices will still be used to generate primitives for rasterization (if they are emitted on stream 0), but they will not be written into the transform feedback buffers.

To provide this information to the application, two types of queries are available to count both the number of primitives the geometry shader generates and the number of primitives actually written into the transform feedback buffers. These are the GL_PRIMITIVES_GENERATED and GL_TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN queries. The GL_PRIMITIVES_GENERATED query counts the primitives output by the geometry shader, even if space in the transform feedback buffers was exhausted and they were not recorded. The GL_TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN query counts the primitives actually written into a transform feedback buffer. (With point output, as required when multiple streams are used, each primitive is a single vertex, so these are also vertex counts.) Note that the GL_PRIMITIVES_GENERATED query is valid at any time, even when transform feedback is not active (hence the lack of TRANSFORM_FEEDBACK in the name of the query), whereas GL_TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN only counts when transform feedback is active.

This makes sense. In a way, a GL_TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN query does continue to count when transform feedback is not active, but as no primitives are written, it will not increment, and so the result is the same.

Because a geometry shader can output to multiple transform feedback streams, primitive queries are indexed. That is, there are multiple binding points for each type of query—one for each supported output stream. To begin and end a primitive query for a particular primitive stream, call:

void glBeginQueryIndexed(GLenum target, GLuint index, GLuint id);

Begins a query using the query object id on the indexed query target point specified by target and index.

and

void glEndQueryIndexed(GLenum target, GLuint index);

Ends the active query on the indexed query target point specified by target and index.

Here, target is set to either GL_PRIMITIVES_GENERATED or GL_TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN, index is the index of the primitive query binding point on which to execute the query, and id is the name of a query object that was previously created using the glGenQueries() function. Once the primitive query has been ended, the availability of the result can be checked by calling glGetQueryObjectuiv() with the pname parameter set to GL_QUERY_RESULT_AVAILABLE, and the actual value of the query can be retrieved by calling glGetQueryObjectuiv() with pname set to GL_QUERY_RESULT. Don't forget that if the result of the query object is retrieved by calling glGetQueryObjectuiv() with pname set to GL_QUERY_RESULT and the result is not yet available, the GPU will likely stall, significantly reducing performance.

It is possible to run both a GL_PRIMITIVES_GENERATED and a GL_TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN query simultaneously on the same stream. If the result of the GL_PRIMITIVES_GENERATED query is greater than the result of the GL_TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN query, it may indicate that the transform feedback buffer was not large enough to record all of the results.
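The comparison between the two query results can be sketched as a small piece of plain C. This is only an illustration under stated assumptions: the two counts are taken to be the GL_QUERY_RESULT values already read back with glGetQueryObjectuiv() for the same stream, both queries are assumed to bracket the same draw calls with transform feedback active throughout, and the names xfb_query_check and check_xfb_queries are hypothetical.

```c
#include <assert.h>

/* Hypothetical summary of the two primitive queries for one stream. */
typedef struct {
    unsigned int dropped;  /* primitives generated but not recorded */
    int overflowed;        /* nonzero if some primitives were not recorded */
} xfb_query_check;

/* generated: result of the GL_PRIMITIVES_GENERATED query
 * written:   result of the GL_TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN query
 * Assumes both queries covered the same draws, so written <= generated. */
xfb_query_check check_xfb_queries(unsigned int generated, unsigned int written)
{
    xfb_query_check result;

    assert(written <= generated);

    result.dropped = generated - written;
    result.overflowed = (result.dropped != 0);
    return result;
}
```

A nonzero dropped count suggests the buffer bound for that stream should be made larger before the captured data is relied on.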

Using Transform Feedback Results
Now that the number of vertices recorded into a transform feedback buffer is known, it is possible to pass that vertex count into a function like glDrawArrays() to use it as the source of vertex data in subsequent rendering. However, retrieving this count requires the CPU to read information generated by the GPU, which is generally detrimental to performance. In this case, the CPU will wait for the GPU to finish rendering anything that might contribute to the primitive count, and then the GPU will wait for the CPU to send a new rendering command using that count. Ideally, the count would never make the round trip from the GPU to the CPU and back again. To achieve this, the OpenGL commands glDrawTransformFeedback() and glDrawTransformFeedbackStream() are supplied. The prototypes of these functions are as follows:

void glDrawTransformFeedback(GLenum mode, GLuint id);
void glDrawTransformFeedbackStream(GLenum mode, GLuint id, GLuint stream);

Draw primitives as if glDrawArrays() had been called with mode set as specified, first set to zero, and count set to the number of vertices captured by transform feedback stream stream on the transform feedback object id.
Calling glDrawTransformFeedback() is equivalent to calling glDrawTransformFeedbackStream() with stream set to zero.

When glDrawTransformFeedbackStream() is called, it is equivalent to calling glDrawArrays() with the same mode parameter, with first set to zero, and with the count parameter taken from a virtual GL_TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN query running on stream stream of the transform feedback object id. Note that there is no need to execute a real GL_TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN query, and the primitive count is never actually transferred from the GPU to the CPU. Also, there is no requirement that the buffers used to record the results of the transform feedback operation be bound for use in the new draw. The vertex count used in such a draw is whatever was recorded the last time glEndTransformFeedback() was called while the transform feedback object id was bound. It is possible for transform feedback to still be active for id; in that case, the previously recorded vertex count will be used.

By using the glDrawTransformFeedbackStream() function, it is possible to circulate the result of rendering through the pipeline. By repeatedly calling glDrawTransformFeedbackStream(), vertices will be transformed by the OpenGL vertex and geometry shaders. Combined with double buffering of vertex data, it is possible to implement recursive algorithms that change the number of vertices in flight on each iteration of the loop.

Double buffering is required because undefined results will be produced if the same buffer objects are bound both for transform feedback and as the source of data.
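The bookkeeping behind such double buffering can be sketched in plain C. This is a minimal illustration, not the book's code: the pingpong type and its functions are hypothetical, the two stored values stand in for buffer object names assumed to have been created elsewhere, and the OpenGL binding calls that would use them are left out.

```c
#include <assert.h>

/* Hypothetical ping-pong state for two buffer objects. */
typedef struct {
    unsigned int buffers[2];  /* two buffer object names, created elsewhere */
    int source;               /* index of the buffer read as vertex input */
} pingpong;

/* Buffer to read vertex data from on this iteration. */
unsigned int pingpong_source(const pingpong *p)
{
    assert(p->source == 0 || p->source == 1);
    return p->buffers[p->source];
}

/* Buffer to capture transform feedback into on this iteration.
 * It is always the other buffer, so the same buffer is never used
 * as both the source and the destination. */
unsigned int pingpong_target(const pingpong *p)
{
    assert(p->source == 0 || p->source == 1);
    return p->buffers[1 - p->source];
}

/* After a pass, last pass's output becomes the next pass's input. */
void pingpong_swap(pingpong *p)
{
    assert(p->source == 0 || p->source == 1);
    p->source = 1 - p->source;
}
```

On each iteration an application would bind the source buffer as vertex input, bind the target buffer for transform feedback, draw, and then call the swap function.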

Drawing transform feedback may be combined with instancing to allow you to draw many instances of the data produced by transform feedback. To support this, the functions glDrawTransformFeedbackInstanced() and glDrawTransformFeedbackStreamInstanced() are provided. Their prototypes are as follows:

void glDrawTransformFeedbackInstanced(GLenum mode, GLuint id, GLsizei instancecount);
void glDrawTransformFeedbackStreamInstanced(GLenum mode, GLuint id, GLuint stream, GLsizei instancecount);

Draw primitives as if glDrawArraysInstanced() had been called with first set to zero, count set to the number of vertices captured by transform feedback stream stream on the transform feedback object id, and with mode and instancecount passed as specified. Calling glDrawTransformFeedbackInstanced() is equivalent to calling glDrawTransformFeedbackStreamInstanced() with stream set to zero.

Combining Multiple Streams and DrawTransformFeedback
As a worked example of the techniques just described, we'll go over an application that demonstrates how to use a geometry shader to sort incoming geometry, and then render subsets of it in subsequent passes. In this example, we'll use the geometry shader to sort "left-facing" and "right-facing" polygons, that is, polygons whose face normal points to the left or right. The left-facing polygons will be sent to stream zero, while the right-facing polygons will be sent to stream one. Both streams will be recorded into transform feedback buffers. The contents of those buffers will then be drawn using glDrawTransformFeedbackStream() while a different program object is active. This causes left-facing primitives to be rendered with a completely different state from right-facing primitives, even though they are physically part of the same mesh.

First, we will use a vertex shader to transform incoming vertices into view space. This shader is shown in Example 10.14 below.

Example 10.14 Simple Vertex Shader for Geometry Sorting

#version 330 core
uniform mat4 model_matrix;
layout (location = 0) in vec4 position;
layout (location = 1) in vec3 normal;
out vec3 vs_normal;

void main()
{
	vs_normal = (model_matrix * vec4(normal, 0.0)).xyz;
	gl_Position = model_matrix * position;
}

Vertices enter the geometry shader shown in Example 10.15 in view space. This shader takes the incoming stream of primitives, calculates a per-face normal, and then uses the sign of the X component of the normal to determine whether the triangle is left-facing or right-facing. The face normal for the triangle is calculated by taking the cross product of two of its edges. Left-facing triangles are emitted to stream zero and right-facing triangles are emitted to stream one, where outputs belonging to each stream will be recorded into separate transform feedback buffers.

Example 10.15 Geometry Shader for Geometry Sorting

#version 400 core
// Multiple output streams (layout(stream = N), EmitStreamVertex) require GLSL 4.00 or later.
// Triangles input, points output (although we will write three points for each incoming triangle).
layout (triangles) in;
layout (points, max_vertices = 3) out;
uniform mat4 projection_matrix;
in vec3 vs_normal[];


// stream 0 -- left-facing polygons
layout(stream = 0) out vec4 lf_position;
layout(stream = 0) out vec3 lf_normal;

// stream 1 -- right-facing polygons
layout(stream = 1) out vec4 rf_position;
layout(stream = 1) out vec3 rf_normal;

void main()
{
	// take the three vertices and find the (unnormalized) face normal
	vec4 A = gl_in[0].gl_Position;
	vec4 B = gl_in[1].gl_Position;
	vec4 C = gl_in[2].gl_Position;
	vec3 AB = (B-A).xyz;
	vec3 AC = (C-A).xyz;
	vec3 face_normal = cross(AB, AC);
	int i;
	// if the normal's X coordinate is negative, it faces to the left of the viewer and is "left-facing", so stuff it in stream 0
	if(face_normal.x < 0.0)
	{
		// for each input vertex ...
		for(i =0; i<gl_in.length(); i++)
		{
			// transform to clip space
			lf_position = projection_matrix * (gl_in[i].gl_Position - vec4(30.0, 0.0, 0.0, 0.0));
			// copy the incoming normal to the output stream
			lf_normal = vs_normal[i];
			EmitStreamVertex(0);
		}
		// calling EndStreamPrimitive is not strictly necessary as these are points
		EndStreamPrimitive(0);
	}
	// otherwise, it is "right-facing" and we should write it to stream 1. 
	else
	{
		// exactly as above but writing to rf_position and rf_normal for stream 1
		for (i = 0; i < gl_in.length(); i++)
		{
			rf_position = projection_matrix * (gl_in[i].gl_Position - vec4(30.0, 0.0, 0.0, 0.0));
			rf_normal = vs_normal[i];
			EmitStreamVertex(1);
		}
		EndStreamPrimitive(1);
	}
}
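
The face-normal test used by the shader can be checked on the CPU with a short sketch in plain C. The vec3f type and the function names here are hypothetical stand-ins for the GLSL built-ins; the logic follows Example 10.15 (cross product of edges AB and AC, then the sign of the X component), including the shader's choice of sending a triangle whose normal has X equal to zero to stream 1.

```c
/* Hypothetical minimal 3-component vector, standing in for GLSL vec3. */
typedef struct { float x, y, z; } vec3f;

/* Component-wise difference a - b. */
vec3f vec3f_sub(vec3f a, vec3f b)
{
    vec3f r;
    r.x = a.x - b.x;
    r.y = a.y - b.y;
    r.z = a.z - b.z;
    return r;
}

/* Cross product a x b, same definition as GLSL cross(). */
vec3f vec3f_cross(vec3f a, vec3f b)
{
    vec3f r;
    r.x = a.y * b.z - a.z * b.y;
    r.y = a.z * b.x - a.x * b.z;
    r.z = a.x * b.y - a.y * b.x;
    return r;
}

/* Returns the stream index chosen for triangle (a, b, c):
 * 0 if the unnormalized face normal points left (negative X),
 * 1 otherwise, matching the if/else in the geometry shader. */
int stream_for_triangle(vec3f a, vec3f b, vec3f c)
{
    vec3f normal = vec3f_cross(vec3f_sub(b, a), vec3f_sub(c, a));
    return (normal.x < 0.0f) ? 0 : 1;
}
```

For example, the triangle (0,0,0), (0,1,0), (0,0,1) has face normal (1,0,0) and goes to stream 1; reversing its winding flips the normal and sends it to stream 0.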

When rendering the sorting pass, we will not be rasterizing any polygons, and so our first-pass program has no fragment shader. To disable rasterization we will call glEnable(GL_RASTERIZER_DISCARD). If an attempt is made to render with a program object that does not contain a fragment shader while rasterization is not disabled, the results of fragment processing are undefined. Before linking the sorting program, we need to specify where the transform feedback varyings will be written. To do this, we use the code shown in Example 10.16 below.
Example 10.16 Configuring Transform Feedback for Geometry Sorting

static const char * varyings[] =
{
// These two varyings belong to stream 0 (left-facing)
"lf_position", "lf_normal",
// Move to the next binding point (can't write
// varyings from different streams to the same buffer
// binding point.)
"gl_NextBuffer",
// These two varyings belong to stream 1 (right-facing)
"rf_position", "rf_normal"
};
glTransformFeedbackVaryings(sort_prog, 5, varyings, GL_INTERLEAVED_ATTRIBS);

Notice that the outputs of the geometry shader for stream zero and stream one are identical. The same data is written to the selected stream regardless of whether the polygon is left- or right-facing. In the first pass, all of the vertex data recorded into the transform feedback buffers has already been transformed into clip space, and so we can reuse that work in the second and third passes that will be used to render it. All we need to supply is a pass-through vertex shader to read the pre-transformed vertices and feed the fragment shader. There is no geometry shader in the second and third passes.

Example 10.17 Pass-Through Vertex Shader Used for Geometry Shader Sorting

#version 330 core
layout (location = 0) in vec4 position;
layout (location = 1) in vec3 normal;
out vec3 vs_normal;
void main()
{
	vs_normal = normal;
	gl_Position = position;
}

We will use the same fragment shader in the second and third passes, but in a more complex application of this technique, a different shader could be used for each pass.

Now, to drive this system we need several objects to manage data and logic at the OpenGL API level. First, we need two program objects for the programs that will be used in the three passes (one containing the vertex and geometry shaders for sorting the left-facing and right-facing primitives, and one containing the pass-through vertex and fragment shaders for the two rendering passes). We need buffer objects for storing the input geometry and the intermediate data produced by the geometry shader. We need a pair of vertex array objects (VAOs) to present the vertex inputs to the two rendering passes. Finally, we need a transform feedback object to manage transform feedback data and primitive counts. The code to set all this up is given in Example 10.18 below.

Example 10.18 OpenGL Setup Code for Geometry Shader Sorting

// create a pair of vertex array objects and buffer objects to store the intermediate data. 
glGenVertexArrays(2, vao);
glGenBuffers(2, vbo);

// create a transform feedback object upon which transform feedback operations (including the following buffer bindings) will operate, and then bind it. 
glGenTransformFeedbacks(1, &xfb);
glBindTransformFeedback(GL_TRANSFORM_FEEDBACK, xfb);
// for each of the two streams...
for(i = 0; i<2;++i)
{
	// bind the buffer object to create it.
	glBindBuffer(GL_TRANSFORM_FEEDBACK_BUFFER, vbo[i]);
	// call glBufferData to allocate space. 2^20 floats should be enough for this example. note GL_DYNAMIC_COPY, this means that the data will change often (DYNAMIC) and will be both written by and used by the GPU (COPY).
	glBufferData(GL_TRANSFORM_FEEDBACK_BUFFER, 1024 * 1024 * sizeof(GLfloat), NULL, GL_DYNAMIC_COPY);
	// now bind it to the transform feedback buffer binding point corresponding to the stream.
	glBindBufferBase(GL_TRANSFORM_FEEDBACK_BUFFER, i, vbo[i]);
	
	// now set up the VAOs. first, bind to create. 
	glBindVertexArray(vao[i]);
	//now bind the VBO to the ARRAY_BUFFER binding.
	glBindBuffer(GL_ARRAY_BUFFER, vbo[i]);
	// set up the vertex attributes for position and normal ...
	glVertexAttribPointer(0, 4, GL_FLOAT, GL_FALSE, sizeof(vec4) + sizeof(vec3), NULL);
	glVertexAttribPointer(1, 3, GL_FLOAT, GL_FALSE, sizeof(vec4) + sizeof(vec3), (GLvoid *)(sizeof(vec4)));
	// and remember to enable them!
	glEnableVertexAttribArray(0);
	glEnableVertexAttribArray(1);
}
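
The stride and offset passed to glVertexAttribPointer() above follow from how the interleaved varyings are laid out in the buffer: a vec4 position followed by a vec3 normal, seven tightly packed floats per vertex. The sketch below, in plain C, spells that out and estimates how many captured vertices fit in a buffer. The captured_vertex type and captured_vertex_capacity function are hypothetical, and the numbers assume a 4-byte float, as on typical platforms.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mirror of one captured vertex record:
 * a vec4 position followed by a vec3 normal, tightly packed. */
typedef struct {
    float position[4];
    float normal[3];
} captured_vertex;

/* Number of whole captured vertices that fit in a buffer of the given
 * size in bytes. Vertices beyond this count would not be recorded. */
size_t captured_vertex_capacity(size_t buffer_bytes)
{
    /* Guard against unexpected padding: the record must be 7 floats. */
    assert(sizeof(captured_vertex) == 7 * sizeof(float));
    return buffer_bytes / sizeof(captured_vertex);
}
```

With 4-byte floats the stride is 28 bytes and the normal starts at byte offset 16, so the buffer of 1024 * 1024 floats allocated in Example 10.18 holds 149796 vertices per stream under these assumptions.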

Once we have created and set up all of our data management objects, we need to write our rendering loop. The general flow is shown in Figure 10.8 below. The first pass is responsible for sorting the geometry into left- and right-facing polygons and performs no rasterization. The second and third passes are essentially identical in this example, although a completely different shading algorithm could be used in each. These passes actually render the sorted geometry as if it were supplied by the application.

[Figure 10.8: general flow of the geometry sorting passes]

For the first pass, we bind the VAO representing the original input geometry and the program object containing the sorting geometry shader. We bind the transform feedback object and the intermediate buffers to the transform feedback buffer bindings, start transform feedback, and draw the original geometry. The geometry shader sorts the incoming triangles into left-facing and right-facing groups and sends them to the appropriate stream. After the first pass, we turn off transform feedback.

glEnable(GL_RASTERIZER_DISCARD); // disable rasterization
glBindTransformFeedback(GL_TRANSFORM_FEEDBACK, xfb); // bind the transform feedback object
glBeginTransformFeedback(GL_POINTS); // start capturing points
object.Render(); // draw the original geometry
glEndTransformFeedback(); // stop capturing
glBindTransformFeedback(GL_TRANSFORM_FEEDBACK, 0); // unbind the transform feedback object
glDisable(GL_RASTERIZER_DISCARD); // re-enable rasterization

For the second pass, bind the VAO representing the intermediate data written to stream zero, bind the second-pass program object, and use glDrawTransformFeedbackStream() to draw the intermediate left-facing geometry using the primitives-written count from stream zero on the first pass. Likewise, in the third pass we draw the right-facing geometry by using glDrawTransformFeedbackStream() with stream one.

Example 10.19 Rendering Loop for Geometry Shader Sorting

glUseProgram(render_prog);
glUniform4fv(0, 1, colors[0]);
glBindVertexArray(vao[0]);
glDrawTransformFeedbackStream(GL_TRIANGLES, xfb, 0);
glUniform4fv(0, 1, colors[1]);
glBindVertexArray(vao[1]);
glDrawTransformFeedbackStream(GL_TRIANGLES, xfb, 1);

The output of the program shown in Example 10.19 is shown in Figure 10.9. While this is not the most exciting program ever written, it does demonstrate the techniques involved in configuring and using transform feedback with multiple streams and the glDrawTransformFeedback() function.
[Figure 10.9: output of the program shown in Example 10.19]
