OGL (Tutorial 30): Basic Tessellation

article link: https://blog.csdn.net/wodownload2/article/details/83108689

tessellation is an exciting new feature in OpenGL 4.x.
the core problem that tessellation deals with is the static nature of 3D models in terms of their detail and polygon count.

the thing is that when we look at a complex model such as a human face up close we prefer to use a highly detailed model that will bring out the tiny details (e.g. skin bumps, etc).

a highly detailed model automatically translates to more triangles and more compute power required for processing.

when we render the same model at a greater distance we prefer to use a lower detailed model and allow more compute resources to the objects that are closer to the camera.
this is simply a matter of balancing GPU resources and diverting more resources to the area near the camera where small details are more noticeable.

one possible way to solve this problem using the existing features of OpenGL is to generate the same model at multiple levels of detail (LOD).

for example, highly detailed, average and low.
we can then select the version to use based on the distance from the camera.
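as a rough illustration of the selection step described above, here is a minimal C++ sketch. the Lod enum, the SelectLod name and both distance thresholds are hypothetical and are not taken from the tutorial sources.

```cpp
// Hypothetical levels of detail, ordered from most to least detailed.
enum class Lod { High, Medium, Low };

// Pick which pre-built version of the model to draw.
// The thresholds (in world units) are arbitrary example values.
Lod SelectLod(float distanceToCamera)
{
    const float kHighDetailRange = 10.0f;
    const float kMediumDetailRange = 50.0f;

    if (distanceToCamera <= kHighDetailRange) {
        return Lod::High;
    }
    if (distanceToCamera <= kMediumDetailRange) {
        return Lod::Medium;
    }
    return Lod::Low;
}
```

every level returned here needs its own mesh prepared in advance.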
this, however, will require more artist resources and often will not be flexible enough.
what we need is a way to start with a low polygon model and subdivide each triangle on the fly into smaller triangles.

this, in a nutshell, is Tessellation.
being able to do all this dynamically on the GPU and also select the level of detail per triangle is part of what the tessellation pipeline in OpenGL 4.x provides.

tessellation was defined and integrated into the OpenGL spec after several years of research in both academia and industry.

its design was heavily influenced by the mathematical background of geometric surfaces and curves, Bezier patches and subdivision.
we will approach tessellation in two steps.
in this tutorial we will focus on the new mechanics of the pipeline in order to get tessellation up and running without too much mathematical hassle.
the technique itself will be simple but it will expose all the relevant components.
in the next tutorial we will study Bezier patches and see how to apply them to a tessellation technique.

let us take a look at how tessellation has been implemented in the graphics pipeline.
the core components that are responsible for tessellation are two new shader stages and in between them a fixed function stage that can be configured to some degree but does not run a shader.

the first shader stage is called tessellation control shader (TCS), the fixed function stage is called the primitive generator (PG), and the second shader stage is called tessellation evaluation shader (TES).
here is a diagram showing the location of the new stages in the pipeline:

[image: the new tessellation stages in the pipeline]

the TCS works on a group of vertices called control points (CP).
the CPs do not have a well defined polygonal form such as a triangle, square, pentagon or whatever.
instead, they define a geometric surface.
this surface is usually defined by some polynomial formula and the idea is that moving a CP changes the shape of the entire surface.
you are probably familiar with graphics software that allows you to define surfaces or curves using a set of CPs and shape them by moving the CPs.
the group of CPs is usually called a Patch.
the yellow surface in the following picture is defined by a patch with 16 CPs:

[image: a surface defined by a patch with 16 CPs]
the TCS takes an input patch and emits an output patch.
the developer has the option in the shader to do some transformation on the CPs or even add/delete CPs.

in addition to the output patch the control shader calculates a set of numbers called Tessellation Levels (TL).
the TLs determine the Tessellation level of detail - how many triangles to generate for the patch.
since all this happens in a shader the developer has the freedom to use any algorithm in order to calculate the TLs.

for example, we can decide that the TLs will be 3 if the rasterized triangle is going to cover fewer than 100 pixels, 7 for 100 to 500 pixels and 12.5 for everything above that (we will later see how the value of the TL translates into coarser or finer tessellation). another algorithm can be based on the distance from the camera.
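the pixel-coverage rule from the example above can be written as a small function. this is only a sketch in C++ (in practice the logic would live in the TCS); the function name is made up and the handling of the exact boundaries is an assumption.

```cpp
// Map the estimated screen coverage of a patch (in pixels) to a
// tessellation level, using the example values 3, 7 and 12.5.
// Assumption: exactly 100 and exactly 500 pixels both map to 7.
float TessLevelFromPixelCoverage(float coveredPixels)
{
    if (coveredPixels < 100.0f) {
        return 3.0f;
    }
    if (coveredPixels <= 500.0f) {
        return 7.0f;
    }
    return 12.5f;
}
```

note that the returned level does not have to be an integer, which is why 12.5 is a legal value.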

the nice thing about all of this is that each patch can get different TLs according to its own characteristics.

after the TCS finishes comes the fixed function PG whose job is to do the actual subdivision.
this is probably the most confusing point for newcomers.
the thing is that the PG does not really subdivide the output patch of the TCS.
in fact, it does not even have access to it.
instead, it takes the TLs and subdivides what is called a Domain.
the domain can either be a normalized (in the range of 0.0-1.0) square of 2D coordinates or an equilateral triangle defined by 3D barycentric coordinates:
[image: the square domain and the triangle domain]
http://mathworld.wolfram.com/BarycentricCoordinates.html

barycentric coordinates are a way of defining a location inside a triangle as a combination of the weights of the three vertices.

the vertices of the triangle are designated as U, V and W, and as the location gets closer to one vertex its weight increases while the weights of the other vertices decrease.

if the location is exactly on a vertex the weight of that vertex is 1 while the other two are zero.

for example, the barycentric coordinate of U is (1,0,0), V is (0,1,0) and W is (0,0,1).
the center of the triangle is at the barycentric coordinate (1/3,1/3,1/3).

the interesting property of barycentric coordinates is that if we sum up the individual components of the barycentric coordinate of each and every point inside the triangle we always get 1.
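to make the weights concrete, the following C++ sketch computes the barycentric coordinates of a 2D point from sub-triangle areas. the types and function names are illustrative assumptions, and degenerate (zero area) triangles are not handled.

```cpp
struct Vec2 { float x, y; };
struct Barycentric { float u, v, w; };

// Twice the signed area of triangle (a, b, c).
static float SignedArea2(Vec2 a, Vec2 b, Vec2 c)
{
    return (b.x - a.x) * (c.y - a.y) - (c.x - a.x) * (b.y - a.y);
}

// Weights of point p relative to triangle (U, V, W). Each weight is the
// area of the sub-triangle opposite that vertex divided by the whole area,
// so the three weights always add up to 1.
Barycentric ComputeBarycentric(Vec2 p, Vec2 U, Vec2 V, Vec2 W)
{
    const float total = SignedArea2(U, V, W); // must not be zero
    Barycentric b;
    b.u = SignedArea2(p, V, W) / total;
    b.v = SignedArea2(U, p, W) / total;
    b.w = SignedArea2(U, V, p) / total;
    return b;
}
```

placing p on U yields (1,0,0), and placing it at the center of the triangle yields three weights of about 1/3 each.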

for simplicity let us focus on the triangle domain from now on.

the PG takes the TLs and based on their values generates a set of points inside the triangle.
each point is defined by its own barycentric coordinate.
the developer can configure the output topology to be either points or triangles.
if points are chosen then the PG simply sends them down the pipeline to be rasterized as points.

if triangles are chosen the PG connects all the points together so that the entire face of the triangle is tessellated with smaller triangles:

[image: the triangle domain tessellated into smaller triangles]

in general, the TLs tell the PG the number of segments on the outer edge of the triangle and the number of rings towards the center.
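the sketch below generates an evenly spaced set of barycentric points for a given number of segments per edge. note that this is a deliberately simplified stand-in and not the algorithm the PG actually implements: the real PG works ring by ring and accepts separate outer and inner TLs. all names are hypothetical.

```cpp
#include <vector>

struct DomainPoint { float u, v, w; };

// Evenly spaced barycentric grid with `segments` subdivisions per edge.
// Simplification: one level for the whole triangle, no concentric rings.
std::vector<DomainPoint> SubdivideTriangleDomain(int segments)
{
    std::vector<DomainPoint> points;
    if (segments < 1) {
        return points;
    }
    const float n = static_cast<float>(segments);
    for (int i = 0; i <= segments; ++i) {
        for (int j = 0; j <= segments - i; ++j) {
            const int k = segments - i - j; // keeps u + v + w == 1
            points.push_back({i / n, j / n, k / n});
        }
    }
    return points;
}

// Number of small triangles such a grid is made of.
int SmallTriangleCount(int segments)
{
    return segments * segments;
}
```

with 3 segments per edge this produces 10 domain points and 9 small triangles. every point carries only its barycentric coordinate.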

so how do the small triangles in the above picture relate to the patch that we saw earlier?
well, it depends on what you want to do with Tessellation.
one very simple option (and the one that we will use in this tutorial) is to skip the whole notion of curved geometric surfaces with their polynomial representation and simply map the triangles of your model to patches.
in that case the 3 triangle vertices become our 3 CPs and the original triangle is both the input and output patch of the TCS.
we use the PG to tessellate the triangle domain and generate small ‘generic’ triangles represented by barycentric coordinates and use a linear combination of these coordinates (i.e. multiply them by the attributes of the original triangle) in order to tessellate the triangles of the original model.

in the next tutorial we will see an actual use of the patch as a representative of a geometric surface.
at any rate, remember that the PG ignores both the input and output patch of the TCS. all it cares about are the per patch TLs.

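the main point of this passage is:
the PG stage only cares about the TLs, i.e. the number of subdivision segments; it does not care about the input and output patches of the TCS.
the PG subdivides the domain according to the TLs and produces a set of points that carry only barycentric coordinates. the remaining attributes of each point are computed in the TES, which uses the barycentric coordinates to produce the actual usable vertex, including its position, normal and so on.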

so after the PG has finished subdividing the triangle domain we still need someone to take the results of this subdivision and do something with it.
after all, the PG does not even have access to the patch.
its only output are barycentric coordinates and their connectivity.
enter the TES. this shader stage has access both to the output patch of the TCS and the barycentric coordinates that the PG generated.
the PG executes the TES on every barycentric coordinate and the job of the TES is to generate a vertex for that point.
since the TES has access to the patch it can take stuff from it such as position, normal, etc and use them to generate the vertex.
after the PG executes the TES on the three barycentric coordinates of a ‘small’ triangle it takes the three vertices the TES generated and sends them down as a complete triangle for rasterization.

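the main point of this passage is that the PG has no access to the patch. it can only generate points according to the TLs, and these points contain only barycentric coordinates. the TES then uses the barycentric coordinates to compute the actual usable vertices, which are finally passed to the next pipeline stage for rasterization.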

the TES is similar to the VS in the sense that it always has a single input (the barycentric coordinate) and a single output (the vertex).
the TES can not generate more than one vertex per invocation nor can it decide to drop the vertex.
the main purpose of the TES that the architects of tessellation in OpenGL envisioned is to evaluate the surface equation at the given domain location.
in simpler terms this means plugging the barycentric coordinate into the polynomial that represents the surface and calculating the result.
the result is the position of the new vertex which can then be transformed and projected as usual.
as you can see, when dealing with geometric surfaces the higher we choose our TLs, the more domain locations we get and by evaluating them in the TES we get more vertices that better represent the true mathematical surface.
in this tutorial the evaluation of the surface equation will simply be a linear combination.

after the TES has processed the domain locations the PG takes the new vertices and sends them as triangles to the next stages of the pipeline.
after the TES comes either the GS or the rasterizer and from here on everything runs as usual.

let us summarize the entire pipeline:

1. the VS is executed on every vertex in a patch. the patch comprises several CPs from the vertex buffer (up to a limit defined by the driver and GPU).
2. the TCS takes the vertices that have been processed by the VS and generates an output patch. in addition, it generates TLs.
3. based on the configured domain, the TLs it got from the TCS and the configured output topology, the PG generates domain locations and their connectivity.
4. the TES is executed on all generated domain locations.
5. the primitives that were generated in step 3 continue down the pipe. the output from the TES is their data.
6. processing continues either at the GS or at the rasterizer.

Source walkthru
(tutorial30.cpp:80)

GLint MaxPatchVertices = 0;
glGetIntegerv(GL_MAX_PATCH_VERTICES, &MaxPatchVertices);
printf("Max supported patch vertices %d\n", MaxPatchVertices);	
glPatchParameteri(GL_PATCH_VERTICES, 3);

when Tessellation is enabled (i.e. when we have either a TCS or TES) the pipeline needs to know how many vertices comprise each input patch.
remember that a patch does not necessarily have a defined geometric form.
it is simply a list of control points.
the call to glPatchParameteri() in the code excerpt above tells the pipeline that the size of the input patch is going to be 3.
that number can be up to what the driver defines as GL_MAX_PATCH_VERTICES.
this value can be different from one GPU/driver to another so we fetch it using glGetIntegerv() and print it.

#version 410 core

layout(location = 0) in vec3 Position_VS_in;
layout(location = 1) in vec2 TexCoord_VS_in;
layout(location = 2) in vec3 Normal_VS_in;

uniform mat4 gWorld;
out vec3 WorldPos_CS_in;
out vec2 TexCoord_CS_in;
out vec3 Normal_CS_in;

void main()
{
    WorldPos_CS_in = (gWorld * vec4(Position_VS_in, 1.0)).xyz;
    TexCoord_CS_in = TexCoord_VS_in;
    Normal_CS_in = (gWorld * vec4(Normal_VS_in, 0.0)).xyz;
}

this is our VS and the only difference between it and the previous ones is that we are no longer transforming the local space coordinates to clip space (by multiplying by the world-view-projection matrix).
the reason is that there is simply no point in that.
we expect to generate a lot of new vertices that will need that transformation anyway.
therefore, this action is postponed until we get to the TES.
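the VS above uses w = 1.0 for the position and w = 0.0 for the normal. the following C++ sketch, with a hand-rolled matrix type that is only an assumption for illustration, shows the effect: a translation in the world matrix moves positions but leaves directions untouched.

```cpp
struct Vec3 { float x, y, z; };

// Row-major 4x4 matrix, m[row][column].
struct Mat4 { float m[4][4]; };

// Equivalent of (M * vec4(v, w)).xyz in GLSL.
Vec3 TransformXYZ(const Mat4& M, Vec3 v, float w)
{
    Vec3 r;
    r.x = M.m[0][0] * v.x + M.m[0][1] * v.y + M.m[0][2] * v.z + M.m[0][3] * w;
    r.y = M.m[1][0] * v.x + M.m[1][1] * v.y + M.m[1][2] * v.z + M.m[1][3] * w;
    r.z = M.m[2][0] * v.x + M.m[2][1] * v.y + M.m[2][2] * v.z + M.m[2][3] * w;
    return r;
}

// World matrix that only translates by (tx, ty, tz).
Mat4 MakeTranslation(float tx, float ty, float tz)
{
    Mat4 M = {{
        {1.0f, 0.0f, 0.0f, tx},
        {0.0f, 1.0f, 0.0f, ty},
        {0.0f, 0.0f, 1.0f, tz},
        {0.0f, 0.0f, 0.0f, 1.0f}
    }};
    return M;
}
```

with a translation of 5 units along x, the position (1,2,3) becomes (6,2,3) while the normal (0,1,0) stays (0,1,0).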

(lighting.cs)
#version 410 core
// define the number of CPs in the output patch
layout (vertices = 3) out;
uniform vec3 gEyeWorldPos;
// attributes of the input CPs
in vec3 WorldPos_CS_in[];
in vec2 TexCoord_CS_in[];
in vec3 Normal_CS_in[];

// attributes of the output CPs
out vec3 WorldPos_ES_in[];
out vec2 TexCoord_ES_in[];
out vec3 Normal_ES_in[];

this is the start of the TCS. it is executed once per CP in the output patch and we start by defining the number of CPs in the output patch. next we define a uniform variable that we will need in order to calculate the TLs.
after that we have a few input and output CP attributes.
in this tutorial, we have the same structure for both the input and output patch but it does not always have to be this way.
each input and output CP has a world position, texture coordinate and normal.
since we can have more than one CP in the input and output patches each attribute is defined using the array modifier [].
this allows us to freely index into any CP.

lighting.cs:33

void main()
{
    // set the control points of the output patch
    TexCoord_ES_in[gl_InvocationID] = TexCoord_CS_in[gl_InvocationID];
    Normal_ES_in[gl_InvocationID] = Normal_CS_in[gl_InvocationID];
    WorldPos_ES_in[gl_InvocationID] = WorldPos_CS_in[gl_InvocationID];
    // main() continues in the next excerpt

we start the main function of the TCS by copying the input CP into the output CP.
this function is executed once per output CP and the builtin variable gl_InvocationID contains the index of the current invocation.
the order of execution is undefined because the GPU probably distributes the CPs across several of its cores and runs them in parallel.
we use gl_InvocationID as an index into both the input and output patch.

(lighting.cs:40)

    // Calculate the distance from the camera to the three control points.
    // The input CPs are read here: reading outputs written by other
    // invocations would require a barrier() first.
    float EyeToVertexDistance0 = distance(gEyeWorldPos, WorldPos_CS_in[0]);
    float EyeToVertexDistance1 = distance(gEyeWorldPos, WorldPos_CS_in[1]);
    float EyeToVertexDistance2 = distance(gEyeWorldPos, WorldPos_CS_in[2]);

    // Calculate the tessellation levels
    gl_TessLevelOuter[0] = GetTessLevel(EyeToVertexDistance1, EyeToVertexDistance2);
    gl_TessLevelOuter[1] = GetTessLevel(EyeToVertexDistance2, EyeToVertexDistance0);
    gl_TessLevelOuter[2] = GetTessLevel(EyeToVertexDistance0, EyeToVertexDistance1);
    gl_TessLevelInner[0] = gl_TessLevelOuter[2];
}

after generating the output patch we calculate the TLs.
the TLs can be set differently for each output patch.
OpenGL provides two builtin arrays of floats for the TLs:
gl_TessLevelOuter (size 4) and
gl_TessLevelInner (size 2).
in the case of a triangle domain we use only the first 3 members of gl_TessLevelOuter and the first
member of gl_TessLevelInner (in addition to the triangle domain there are also the quad and isoline domains, which
use these arrays differently).
gl_TessLevelOuter[] roughly determines the number of segments on each edge and gl_TessLevelInner[0] roughly determines how many rings the triangle will contain.
if we designate the triangle vertices as U,V and W then the corresponding edge for each vertex is the one which is opposite to it:
[image: triangle vertices U, V and W with their opposite edges]

the algorithm we use to calculate the TLs is very simple and is based on the distance in world space between the camera and the vertices.
it is implemented in the function GetTessLevel (see below).
we calculate the distance between the camera and each vertex and call GetTessLevel() three times to update each member in gl_TessLevelOuter[].
each entry is mapped to an edge according to the picture above (TL of edge 0 goes to gl_TessLevelOuter[0], etc)
and the TL for that edge is calculated based on the distance from the camera to the two vertices that create it.
the inner TL is selected the same as the TL of edge W.

you can use any algorithm that you want to calculate the TLs.
for example, one algorithm estimates the size of the final triangle on the screen in pixels and sets the TLs such that no tessellated triangle becomes smaller than a given number of pixels.

(lighting.cs:18)

float GetTessLevel(float Distance0, float Distance1)
{
    float AvgDistance = (Distance0 + Distance1) / 2.0;

    if (AvgDistance <= 2.0) {
        return 10.0;
    }
    else if (AvgDistance <= 5.0) {
        return 7.0;
    }
    else {
        return 3.0;
    }
}

this function calculates the TL for an edge based on the distance from the camera to the two vertices of the edge.
we take the average distance and set the TL to 10 or 7 or 3.
as the distance grows we prefer a smaller TL so as not to waste GPU cycles.
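for reference, the per-patch calculation can be mirrored on the CPU. the C++ sketch below reuses the thresholds of GetTessLevel(); the Vec3 and TessLevels types and the ComputeTessLevels name are assumptions made for this example, not part of the tutorial sources.

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };

float Distance(Vec3 a, Vec3 b)
{
    const float dx = a.x - b.x;
    const float dy = a.y - b.y;
    const float dz = a.z - b.z;
    return std::sqrt(dx * dx + dy * dy + dz * dz);
}

// Same thresholds as the shader function of the same name.
float GetTessLevel(float distance0, float distance1)
{
    const float avgDistance = (distance0 + distance1) / 2.0f;
    if (avgDistance <= 2.0f) {
        return 10.0f;
    }
    if (avgDistance <= 5.0f) {
        return 7.0f;
    }
    return 3.0f;
}

struct TessLevels { float outer[3]; float inner; };

// Each outer level belongs to the edge opposite the vertex with the same
// index, so it is computed from the two other vertices.
TessLevels ComputeTessLevels(Vec3 eye, const Vec3 cp[3])
{
    const float d0 = Distance(eye, cp[0]);
    const float d1 = Distance(eye, cp[1]);
    const float d2 = Distance(eye, cp[2]);

    TessLevels tl;
    tl.outer[0] = GetTessLevel(d1, d2);
    tl.outer[1] = GetTessLevel(d2, d0);
    tl.outer[2] = GetTessLevel(d0, d1);
    tl.inner = tl.outer[2];
    return tl;
}
```

for a camera at the origin and CPs at distances 1, 3 and 9, the outer levels come out as 3, 7 and 10 and the inner level as 10.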

(lighting.es)

#version 410 core

layout(triangles, equal_spacing, ccw) in;

this is the start of the TES. the ‘layout’ keyword defines three configuration items:

triangles: this is the domain the PG will work on. the other two options are quads and isolines.
equal_spacing: the triangle edges will be subdivided into segments of equal length (according to the TLs).
you can also use fractional_even_spacing or fractional_odd_spacing, which provide a smoother transition between the lengths of the segments whenever the TL crosses an even or odd integer.
for example, if you use fractional_odd_spacing and the TL is 5.1, the level is rounded up to the next odd integer, 7, so the edge gets 7 segments: 2 very short ones and 5 longer ones.
as the TL grows towards 7 all the segments become closer in length.
once the TL passes 7 two new very short segments are created.
fractional_even_spacing works the same way with even integer TLs.
ccw: the PG will emit triangles in counter-clockwise order (you can also use cw for clockwise order).
you may be wondering why we are doing that while our front facing triangles are in clockwise order.
the reason is that the model I supplied with this tutorial (quad2.obj) was generated by Blender in counter-clockwise order.
I could also have specified the Assimp flag ‘aiProcess_FlipWindingOrder’ when loading the model and used ‘cw’ here.
I simply did not want to change ‘mesh.cpp’ at this point. the bottom line is that whatever you do, make sure you are consistent.

note that you can also specify each configuration item with its own layout keyword.
the scheme above simply saves some space.
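the rounding behind the three spacing modes can be sketched as follows. this C++ helper follows the rule given in the OpenGL 4.x specification (clamp the TL, then round up to the next integer, even integer or odd integer). the names are hypothetical and maxLevel stands in for the implementation limit GL_MAX_TESS_GEN_LEVEL, assumed here to be 64.

```cpp
#include <algorithm>
#include <cmath>

enum class Spacing { Equal, FractionalEven, FractionalOdd };

// Number of segments an edge is split into for a given tessellation level.
// In the fractional modes, all but 2 of these segments have equal length
// and the remaining 2 are the (usually shorter) transition segments.
int SegmentCount(float level, Spacing spacing, float maxLevel = 64.0f)
{
    if (spacing == Spacing::Equal) {
        const float clamped = std::min(std::max(level, 1.0f), maxLevel);
        return static_cast<int>(std::ceil(clamped));
    }
    if (spacing == Spacing::FractionalEven) {
        const float clamped = std::min(std::max(level, 2.0f), maxLevel);
        const int n = static_cast<int>(std::ceil(clamped));
        return (n % 2 == 0) ? n : n + 1; // round up to an even count
    }
    // Spacing::FractionalOdd
    const float clamped = std::min(std::max(level, 1.0f), maxLevel - 1.0f);
    const int n = static_cast<int>(std::ceil(clamped));
    return (n % 2 == 1) ? n : n + 1; // round up to an odd count
}
```

a TL of 5.1 gives 6 segments with equal_spacing, 6 with fractional_even_spacing and 7 with fractional_odd_spacing. this matches the 2 short plus 5 longer segments mentioned above.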

(lighting.es:5)

uniform mat4 gVP;
uniform sampler2D gDisplacementMap;
uniform float gDispFactor;

in vec3 WorldPos_ES_in[];
in vec2 TexCoord_ES_in[];
in vec3 Normal_ES_in[];

out vec3 WorldPos_FS_in;
out vec2 TexCoord_FS_in;
out vec3 Normal_FS_in;

the TES can have uniform variables just like any other shader.
the displacement map is basically a height map which means that every texel represents the height at this location.
we will use it to generate bumps on the surface of our mesh. in addition, the TES can also access the entire TCS output patch. finally, we declare the attributes of our output vertex.
note that the array modifier is not present here because the TES always outputs a single vertex.

(lighting.es:27)

void main()
{
   	// Interpolate the attributes of the output vertex using the barycentric coordinates
   	TexCoord_FS_in = interpolate2D(TexCoord_ES_in[0], TexCoord_ES_in[1], TexCoord_ES_in[2]);
   	Normal_FS_in = interpolate3D(Normal_ES_in[0], Normal_ES_in[1], Normal_ES_in[2]);
   	Normal_FS_in = normalize(Normal_FS_in);
   	WorldPos_FS_in = interpolate3D(WorldPos_ES_in[0], WorldPos_ES_in[1], WorldPos_ES_in[2]);

this is the main function of the TES.
let us recap what we have when we get here.
the mesh vertices were processed by the VS and the world space position and normal were calculated.
the TCS got each triangle as a patch with 3 CPs and simply passed it through to the TES.
the PG subdivided an equilateral triangle into smaller triangles and executed the TES for every generated vertex.
in each TES invocation we can access the barycentric coordinates (a.k.a Tessellation Coordinates) of the vertex in the 3D-vector gl_TessCoord. since the barycentric coordinates within a triangle represent a weight combination of the 3 vertices we can use it to interpolate all the attributes of the new vertex.
the functions interpolate2D() and interpolate3D() (see below) do just that.
they take an attribute from the CPs of the patch and interpolate it using gl_TessCoord.

(lighting.es:35)

   	// Displace the vertex along the normal
   	float Displacement = texture(gDisplacementMap, TexCoord_FS_in.xy).x;
   	WorldPos_FS_in += Normal_FS_in * Displacement * gDispFactor;
   	gl_Position = gVP * vec4(WorldPos_FS_in, 1.0);
}

having each triangle of the original mesh subdivided into many smaller triangles does not really contribute much to the general appearance of the mesh because the smaller triangles all lie on the plane of the original triangle.

we would like to offset (or displace) each vertex in a way that will match the contents of our color texture. for example, if the texture contains the image of bricks or rocks we would like our vertices to move along the edges of the bricks or rocks.
to do that we need to complement the color texture with a displacement map.
there are various tools and editors that generate a displacement map and we are not going into the specifics here.
you can find more information on the web.
to use the displacement map we simply need to sample from it using the current texture coordinate and this will give us the height of this vertex.
we then displace the vertex in world space by multiplying the vertex normal by the height and by a displacement factor uniform variable that can be controlled by the application.
so every vertex is displaced along its normal based on its height.
finally, we multiply the new world space position by the view-projection matrix and set it into ‘gl_Position’.
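the displacement step itself is a single multiply-add per component. the C++ sketch below mirrors it on the CPU; the Vec3 type and the function name are assumptions for illustration, and the height is passed in directly instead of being sampled from a texture.

```cpp
struct Vec3 { float x, y, z; };

// worldPos + normal * height * dispFactor, as in the TES.
// `height` plays the role of the value sampled from the displacement map
// and `dispFactor` the role of the gDispFactor uniform.
Vec3 DisplaceAlongNormal(Vec3 worldPos, Vec3 unitNormal, float height, float dispFactor)
{
    const float scale = height * dispFactor;
    return {
        worldPos.x + unitNormal.x * scale,
        worldPos.y + unitNormal.y * scale,
        worldPos.z + unitNormal.z * scale
    };
}
```

a vertex at (1,2,3) with normal (0,1,0), height 0.5 and factor 2 ends up at (1,3,3). setting the factor to 0 turns the displacement off.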

(lighting.es:17)

vec2 interpolate2D(vec2 v0, vec2 v1, vec2 v2)
{
   	return vec2(gl_TessCoord.x) * v0 + vec2(gl_TessCoord.y) * v1 + vec2(gl_TessCoord.z) * v2;
}

vec3 interpolate3D(vec3 v0, vec3 v1, vec3 v2)
{
   	return vec3(gl_TessCoord.x) * v0 + vec3(gl_TessCoord.y) * v1 + vec3(gl_TessCoord.z) * v2;
}

these two functions interpolate between a trio of 2D-vectors and 3D-vectors using ‘gl_TessCoord’ as a weight.
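the same weighted sum can be checked on the CPU. in this C++ sketch the tessCoord parameter plays the role of gl_TessCoord; the Vec3 type and the function name are assumptions made for the example.

```cpp
struct Vec3 { float x, y, z; };

// CPU counterpart of interpolate3D(): a weighted sum of the three CP
// attributes, with the barycentric coordinate supplying the weights.
Vec3 Interpolate3D(Vec3 tessCoord, Vec3 v0, Vec3 v1, Vec3 v2)
{
    return {
        tessCoord.x * v0.x + tessCoord.y * v1.x + tessCoord.z * v2.x,
        tessCoord.x * v0.y + tessCoord.y * v1.y + tessCoord.z * v2.y,
        tessCoord.x * v0.z + tessCoord.y * v1.z + tessCoord.z * v2.z
    };
}
```

a coordinate of (1,0,0) returns the first CP unchanged, while (0.5,0.5,0) returns the midpoint of the edge between the first two CPs.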

(lighting_technique.cpp:277)

bool LightingTechnique::Init()
{
    ...
    if (!AddShader(GL_TESS_CONTROL_SHADER, pTessCS)) {
        return false;
    }

    if (!AddShader(GL_TESS_EVALUATION_SHADER, pTessES)) {
        return false;
    }
    ...

we have two new shader stages so we must compile them.

(mesh.cpp:226)

glDrawElements(GL_PATCHES, m_Entries[i].NumIndices, GL_UNSIGNED_INT, 0);

finally, we have to use GL_PATCHES as the primitive type instead of GL_TRIANGLES.

the demo:
the demo in this tutorial shows how to tessellate a quad terrain and displace vertices along the rocks in the color texture.
you can use ‘+’ and ‘-’ on the keyboard to update the displacement factor and by that control the displacement level.
you can also switch to wireframe mode using ‘z’ and see the actual triangles generated by the tessellation process.
it is interesting to move closer and further away from the terrain in wireframe mode and see how the tessellation level changes based on the distance. this is why we need the TCS.