OpenGL Perspective Projection Matrix


Note: below, n and f denote the distances to the near and far clipping planes respectively, so they are always positive.

As the OpenGL reference puts it:


void glFrustum( GLdouble   left, 
  GLdouble   right, 
  GLdouble   bottom, 
  GLdouble   top, 
  GLdouble   nearVal, 
  GLdouble   farVal);


Parameters

left, right
Specify the coordinates for the left and right vertical clipping planes.
bottom, top
Specify the coordinates for the bottom and top horizontal clipping planes.
nearVal, farVal
Specify the distances to the near and far depth clipping planes. Both distances must be positive.


http://www.scratchapixel.com/lessons/3d-advanced-lessons/perspective-and-orthographic-projection-matrix/opengl-perspective-projection-matrix/


OpenGL Perspective Projection Matrix


The OpenGL Perspective Projection Matrix

In all OpenGL books and references, the perspective projection matrix used in OpenGL is defined as:

$$
\begin{bmatrix}
\dfrac{2n}{r-l} & 0 & \dfrac{r+l}{r-l} & 0 \\
0 & \dfrac{2n}{t-b} & \dfrac{t+b}{t-b} & 0 \\
0 & 0 & -\dfrac{f+n}{f-n} & -\dfrac{2fn}{f-n} \\
0 & 0 & -1 & 0
\end{bmatrix}
$$


What similarities does this matrix have with the matrix we studied in the previous chapter? First, it is important to remember that matrices in OpenGL are defined using a column-major notation (as opposed to a row-major notation). In the lesson on Geometry we explained that to go from one notation to the other we can simply transpose the matrix. If we transpose the above matrix we get:

$$
\begin{bmatrix}
\dfrac{2n}{r-l} & 0 & 0 & 0 \\
0 & \dfrac{2n}{t-b} & 0 & 0 \\
\dfrac{r+l}{r-l} & \dfrac{t+b}{t-b} & -\dfrac{f+n}{f-n} & -1 \\
0 & 0 & -\dfrac{2fn}{f-n} & 0
\end{bmatrix}
$$


Pay attention to the element in the third row and fourth column (-1). When we multiply a homogeneous point by this matrix, the point's z coordinate is multiplied by this element, and its value ends up in the projected point's w coordinate:

$$
\begin{bmatrix}x' & y' & z' & w'\end{bmatrix} =
\begin{bmatrix}x & y & z & w=1\end{bmatrix}
\begin{bmatrix}
\dfrac{2n}{r-l} & 0 & 0 & 0 \\
0 & \dfrac{2n}{t-b} & 0 & 0 \\
\dfrac{r+l}{r-l} & \dfrac{t+b}{t-b} & -\dfrac{f+n}{f-n} & -1 \\
0 & 0 & -\dfrac{2fn}{f-n} & 0
\end{bmatrix}
$$


$$P'_w = 0 \cdot P_x + 0 \cdot P_y + -1 \cdot P_z + 0 \cdot 1 = -P_z$$

    (Note: in the perspective divide below, the division is by -Pz.)

Principle

In summary, we already know that this matrix is set up properly for the z-divide. Now let's see how the points are projected in OpenGL. The principle is of course the same as in the previous chapter. We draw a line from the camera's origin to the point P we want to project, and the intersection of this line with the image plane indicates the position of the projected point Ps. The setup is exactly the same as in figure 1 from the previous chapter; however, note that in OpenGL the image plane is located on the near clipping plane (rather than being exactly one unit away from the camera's origin).

Figure 1: xx.

The trick we have used in chapter 1 with similar triangles can be used here again. The triangles ΔABC and ΔDEF are similar. Therefore we can write:

$$\frac{AB}{DE} = \frac{BC}{EF}$$


If we replace AB with n (the distance to the near clipping plane), DE with -Pz (the z coordinate of P, negated) and EF with Py (the y coordinate of P), we can rewrite this equation as (equation 1):

$$\frac{n}{-P_z} = \frac{BC}{P_y} \rightarrow BC = P_{sy} = \frac{n \cdot P_y}{-P_z}$$

Note: viewed in OpenGL's coordinate system, BC/EF = Psy/Py is always positive, so AB/DE must also be positive; since n is positive and Pz is negative, AB/DE = n/-Pz.

As you can see, the only difference with the equation from chapter 1 is the term n in the numerator, but the principle of the division by Pz is the same (note that because we use a right-hand coordinate system, Pz is negative. To keep the y coordinate of the projected point positive when Py is positive, we need to negate Pz). If we follow the same reasoning, we find the x coordinate of the projected P using the following equation (equation 2):

$$P_{sx} = \frac{n \cdot P_x}{-P_z}$$
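Equations 1 and 2 can be sketched in a few lines of C++ (a minimal sketch; the structs and the function name below are ours, not part of the lesson's code):

```cpp
#include <cassert>
#include <cmath>

// A camera-space point (Pz < 0 in front of the camera in a right-handed
// system) and its projection on the image plane.
struct Vec3 { float x, y, z; };
struct Vec2 { float x, y; };

// Projects a camera-space point onto the image plane located at the near
// clipping plane: Psx = n*Px / -Pz, Psy = n*Py / -Pz (equations 1 and 2).
Vec2 projectOnNearPlane(const Vec3 &P, float n)
{
    return { n * P.x / -P.z, n * P.y / -P.z };
}
```

For example, with n = 1 the point (1, 2, -5) projects to (0.2, 0.4): the further the point is along -z, the closer its projection gets to the image center.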


Derivation

Figure 2: the frustum or viewing volume of a camera is defined by the camera's field of view, the near and far clipping planes and the image aspect ratio. In OpenGL, points are projected onto the front face of the frustum (the near clipping plane).

Now that we have values for Psx and Psy, we still need to explain how they relate to the OpenGL perspective matrix. The goal of a projection matrix is to remap the values projected onto the image plane to a unit cube (a cube whose minimum and maximum extents are (-1,-1,-1) and (1,1,1) respectively). Once the point P is projected on the image plane, Ps is visible if its x and y coordinates are contained within the range [left, right] for x and [bottom, top] for y. This is illustrated in figure 2. We will later see how these left, right, bottom and top coordinates are computed. Let's just say for now that they define the limits or boundaries on the image plane of the visible points (all the points contained in the viewing frustum and projected on the image plane). Therefore, if Psx is visible, we can write:

$$l \le P_{sx} \le r$$


where l and r are the left and right coordinates respectively. Our goal now is to remap the middle term ( Psx ) such that the final value lies in the range [-1,1] (the dimension of the unit cube along the x-axis). Let's start by subtracting l from all the terms and rewrite the above equation as:

$$0 \le P_{sx} - l \le r - l$$


We can normalize the term on the right by dividing all the terms of this formula by r-l:

$$0 \le \frac{P_{sx} - l}{r - l} \le 1$$


Then we will multiply all the terms by 2:

$$0 \le \frac{2(P_{sx} - l)}{r - l} \le 2$$


Finally we subtract 1 from all the terms, which expresses the central term within a range of values defined between -1 and 1:

$$-1 \le \frac{2(P_{sx} - l)}{r - l} - 1 \le 1$$


We have the result we wanted; however, we can keep rearranging the terms and write:

$$-1 \le \frac{2(P_{sx} - l)}{r - l} - \frac{r - l}{r - l} \le 1$$


If we develop we get:

$$-1 \le \frac{2P_{sx} - 2l - r + l}{r - l} \le 1$$


therefore:

$$-1 \le \frac{2P_{sx} - l - r}{r - l} \le 1 \rightarrow -1 \le \frac{2P_{sx}}{r - l} - \frac{r + l}{r - l} \le 1$$


These two terms are quite similar to the first and third coefficients of the first row of the OpenGL perspective projection matrix. We are getting closer. If we replace Psx in the previous equation with equation 2, we get:

$$-1 \le \frac{2n P_x}{-P_z (r - l)} - \frac{r + l}{r - l} \le 1$$


We can very easily encode this equation in matrix form. If we replace the first and third coefficients of the matrix's first row with the first and second terms of this formula, here is what we get:

$$
\begin{bmatrix}
\dfrac{2n}{r-l} & 0 & \dfrac{r+l}{r-l} & 0 \\
\cdots & \cdots & \cdots & \cdots \\
\cdots & \cdots & \cdots & \cdots \\
0 & 0 & -1 & 0
\end{bmatrix}
$$


Remember that OpenGL uses the column-major convention, therefore we have to write the multiplication sign to the right of the matrix and the point coordinates in column form:

$$
\begin{bmatrix}
\dfrac{2n}{r-l} & 0 & \dfrac{r+l}{r-l} & 0 \\
\cdots & \cdots & \cdots & \cdots \\
\cdots & \cdots & \cdots & \cdots \\
0 & 0 & -1 & 0
\end{bmatrix}
\begin{bmatrix}x \\ y \\ z \\ w\end{bmatrix}
$$


Computing Psx using this matrix gives:

$$P_{sx} = \frac{2n}{r-l} P_x + 0 \cdot P_y + \frac{r+l}{r-l} P_z + 0 \cdot P_w$$


And since Psx is divided at the end of the process by -Pz, when we convert Ps from homogeneous to Cartesian coordinates, we get:

$$P_{sx} = \frac{\frac{2n}{r-l} P_x}{-P_z} + \frac{\frac{r+l}{r-l} P_z}{-P_z} \rightarrow \frac{2n P_x}{-P_z (r - l)} - \frac{r + l}{r - l}$$


This is the first coordinate of the projected point Ps computed with the OpenGL perspective matrix. The derivation for Psy is quite similar, so we will skip it. However, if you follow the steps we used for Psx, doing it yourself shouldn't be a problem: just replace l and r with b and t, and you end up with the following formula:

$$-1 \le \frac{2n P_y}{-P_z (t - b)} - \frac{t + b}{t - b} \le 1$$


We can get this result with a point-matrix multiplication if we replace the second and third coefficients of the matrix's second row with the first and second terms of this equation:

$$
\begin{bmatrix}
\dfrac{2n}{r-l} & 0 & \dfrac{r+l}{r-l} & 0 \\
0 & \dfrac{2n}{t-b} & \dfrac{t+b}{t-b} & 0 \\
\cdots & \cdots & \cdots & \cdots \\
0 & 0 & -1 & 0
\end{bmatrix}
$$


Computing Psy using this matrix gives:

$$P_{sy} = 0 \cdot P_x + \frac{2n}{t-b} P_y + \frac{t+b}{t-b} P_z + 0 \cdot P_w$$


and after the division by -Pz:

$$P_{sy} = \frac{\frac{2n}{t-b} P_y}{-P_z} + \frac{\frac{t+b}{t-b} P_z}{-P_z} \rightarrow \frac{2n P_y}{-P_z (t - b)} - \frac{t + b}{t - b}$$
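Before completing the matrix, the x/y remapping derived above can be sanity-checked with a small helper (a sketch; the function name is ours): it maps a projected coordinate from [l, r] (or [b, t]) to [-1, 1] using 2x/(r-l) - (r+l)/(r-l).

```cpp
#include <cassert>
#include <cmath>

// Maps a projected coordinate x from the screen window [l, r] to the
// [-1, 1] range of the unit cube: 2x/(r - l) - (r + l)/(r - l).
float remapToNDC(float x, float l, float r)
{
    return 2 * x / (r - l) - (r + l) / (r - l);
}
```

The left boundary maps to -1, the right boundary to 1, and the midpoint of the window to 0, exactly as the derivation requires.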


Our matrix works again. All that is left to complete it is to find a way of remapping the z coordinate of the projected point to the range [-1, 1]. We know that the x and y coordinates of P don't contribute to the calculation of the projected point's z coordinate. Therefore the first and second coefficients of the matrix's third row, which would be multiplied by P's x and y coordinates, are necessarily zero. We are left with two unknown coefficients, A and B, in the matrix:

$$
\begin{bmatrix}
\dfrac{2n}{r-l} & 0 & \dfrac{r+l}{r-l} & 0 \\
0 & \dfrac{2n}{t-b} & \dfrac{t+b}{t-b} & 0 \\
0 & 0 & A & B \\
0 & 0 & -1 & 0
\end{bmatrix}
$$


If we write the equation to compute Psz using this matrix, we get (remember that Psz too is divided by Psw when the point is converted from homogeneous to Cartesian coordinates):

$$P_{sz} = 0 \cdot P_x + 0 \cdot P_y + A \cdot P_z + B \cdot P_w, \quad P_{sw} = -P_z$$
$$P_{sz} \leftarrow \frac{A P_z + B}{P_{sw}} = \frac{A P_z + B}{-P_z}$$


Finding values for A and B is what we need to solve. Fortunately we know that when Pz lies on the near clipping plane, Psz needs to be remapped to -1, and when Pz lies on the far clipping plane, Psz needs to be remapped to 1. Therefore we replace Pz in the equation by -n and -f to get two new equations (note that the z coordinates of all the points projected on the image plane are negative, while n and f are positive, hence -n and -f):

$$\frac{A \cdot (-n) + B}{n} = -1 \ \ (\text{when } P_z = -n) \qquad \frac{A \cdot (-f) + B}{f} = 1 \ \ (\text{when } P_z = -f)$$
$$\Rightarrow \quad -nA + B = -n \quad (1) \qquad -fA + B = f \quad (2)$$


Let's solve for B in equation 1:

$$B = -n + An$$


And substitute B in equation 2 with this expression:

$$-fA - n + An = f$$


And solve for A:

$$-fA + An = f + n \rightarrow -(f - n)A = f + n \rightarrow A = -\frac{f + n}{f - n}$$


Now that we have a solution for A, it is easy to find B. We just replace A in equation 1 to find B:

$$B = -n + An = -n - \frac{f + n}{f - n} n = -\left(1 + \frac{f + n}{f - n}\right) n = -\frac{(f - n + f + n)\,n}{f - n} = -\frac{2fn}{f - n}$$


We can replace the solution we found for A and B in our matrix and we finally get:

$$
\begin{bmatrix}
\dfrac{2n}{r-l} & 0 & \dfrac{r+l}{r-l} & 0 \\
0 & \dfrac{2n}{t-b} & \dfrac{t+b}{t-b} & 0 \\
0 & 0 & -\dfrac{f+n}{f-n} & -\dfrac{2fn}{f-n} \\
0 & 0 & -1 & 0
\end{bmatrix}
$$


which is the OpenGL perspective projection matrix.
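As a quick check of the third-row coefficients we just derived (a sketch, with our own function name): Psz = (A·Pz + B)/(-Pz) with A = -(f+n)/(f-n) and B = -2fn/(f-n) should map Pz = -n to -1 and Pz = -f to 1.

```cpp
#include <cassert>
#include <cmath>

// Remaps a camera-space depth Pz (negative in front of the camera) using
// the derived third-row coefficients: Psz = (A*Pz + B) / -Pz with
// A = -(f + n)/(f - n) and B = -2fn/(f - n).
float remapDepth(float Pz, float n, float f)
{
    float A = -(f + n) / (f - n);
    float B = -2 * f * n / (f - n);
    return (A * Pz + B) / -Pz;
}
```

With n = 1 and f = 5, Pz = -1 maps to -1 and Pz = -5 maps to 1, while the midpoint Pz = -3 maps to about 0.67 rather than 0: this is the non-linear depth remapping discussed next.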

Figure 3: the remapping of the projected point's z coordinate is non linear. This graph shows the result of Psz for near = 1 and far = 5.

In some documents on the OpenGL matrix, the formulas used for A and B are sometimes presented as being purposefully designed to remap z in a non-linear fashion. If we plot the result of Psz for values of Pz going from z = near to z = far, we get a curve whose shape is similar to that shown in the adjacent figure. This remapping has the property of representing points nearer to the camera with more numerical precision than points further away. As we explained in the first chapter (in the notes), this property can be a problem when the lack of numerical precision causes adjacent samples to end up with the same depth value after they have been projected to the screen, even though their z coordinates in world space are actually different, a problem known as z-fighting. Other remapping functions could have been chosen if the formulas we derived for A and B had not been suitable, but the non-linear mapping of the depth samples, the z-fighting issue aside, is actually not a bad thing: it ensures that the visibility of objects nearer to the camera (to which we visually pay more attention) is resolved with a higher level of fidelity than what we would get with a linear remapping. Some references, however, present the equation Az + B (written in all sorts of forms) without explaining its origin, and suggest that the non-linear remapping is a choice rather than the result of the logical derivation we have given above.

The field of view and Image Aspect Ratio

Figure 3: side view of the camera. The triangle ABC's apex defines the camera's vertical field of view (FOV). The image plane location is defined by the near clipping plane distance. From these two values (the FOV and the near clipping plane) we can compute the top coordinate using simple trigonometry.

You may have noticed that so far we haven't made any reference to the camera's field of view and image aspect ratio. However, we said in the previous chapter and in the lesson on cameras (in the basic section) that changing the FOV changes the extent of the scene we see through the camera. Thus the field of view and the image aspect ratio should definitely be related to the projection process somehow. We deliberately ignored this detail until now to stay focused on the OpenGL perspective projection matrix, which doesn't directly rely on the camera's field of view. But it does indirectly. The construction of the matrix depends on six parameters: the left, right, bottom and top coordinates as well as the near and far clipping planes. The near and far clipping plane values are given by the user, but what about the left, right, bottom and top coordinates? What are they, where do they come from and how do we calculate them? If you look at figures 2 and 4, you can see that these coordinates correspond to the lower-left and upper-right corners of the frustum's front face, the face on which the image of the 3D scene is actually projected. How do we compute these coordinates? Figure 3 represents a view of the camera from the side. What we want is to compute a value for the top coordinate, which is equal to the opposite side of the right-angle triangle ΔABC. The angle subtended by AB and AC is half the field of view, and the adjacent side of the triangle is the near clipping plane distance. Using trigonometry we can write:

$$\tan\left(\frac{FOV}{2}\right) = \frac{\text{opposite}}{\text{adjacent}} = \frac{BC}{AB} = \frac{top}{near}$$


therefore:

$$top = \tan\left(\frac{FOV}{2}\right) \times near$$


And since the bottom half of the camera is symmetrical to the upper half, we can write:

$$bottom = -top$$


Figure 4: the image can be square (left) or rectangular (right). Note that the bottom-left coordinates and the top-right coordinates are symmetric about the x- and y-axis.

If you look at figure 4 though, two cases should be taken into consideration. The image can either be square or rectangular. In the case where the image is square, it is straightforward to see that the left and bottom coordinates are the same, that the right and top coordinates are also the same, and finally that if you mirror the bottom-left coordinates about the x- and y-axis, you get the top-right coordinates. Therefore, if we can compute the top coordinate, we can easily set the other three:

$$top = \tan\left(\frac{FOV}{2}\right) \times near, \quad right = top, \quad left = bottom = -top$$


If the image is not square, as with the frustum on the right-hand side of figure 4, computing the coordinates is slightly more complicated. The bottom and top coordinates are still the same, but the left and right coordinates are scaled by a ratio defined as the image width over the image height, what we usually call the image aspect ratio. The final and general formulas for computing the left, right, bottom and top coordinates are:

$$\text{aspect ratio} = \frac{width}{height}$$
$$top = \tan\left(\frac{FOV}{2}\right) \times near, \quad bottom = -top$$
$$right = top \times \text{aspect ratio}, \quad left = bottom \times \text{aspect ratio} = -top \times \text{aspect ratio}$$

The camera's field of view and image aspect ratio are used to calculate the left, right, bottom and top coordinates, which are themselves used in the construction of the perspective projection matrix. This is how they indirectly contribute to defining how much of the scene we see through the camera.
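These formulas can be sketched numerically (struct and function names below are ours): with a 90-degree vertical FOV and near = 1, tan(FOV/2) = tan(45°) = 1, so top = 1; with a 4:3 aspect ratio, right = 4/3.

```cpp
#include <cassert>
#include <cmath>

// The four screen-window coordinates on the near clipping plane.
struct ScreenCoords { float l, r, b, t; };

// Computes left, right, bottom and top from the vertical FOV (degrees),
// the image aspect ratio (width / height) and the near plane distance.
ScreenCoords screenCoordinates(float fovDeg, float aspect, float n)
{
    const float pi = 3.14159265358979f;
    float t = std::tan(fovDeg * 0.5f * pi / 180.0f) * n; // top
    float r = t * aspect;                                // right
    return { -r, r, -t, t };
}
```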

Source Code: Implementing glFrustum and gluPerspective

Setting up the perspective projection matrix in OpenGL is done through a call to glFrustum. The function takes six arguments:

glFrustum(float left, float right, float bottom, float top, float near, float far);

To emulate glFrustum, you need to write a function taking the same six arguments, which are then used to set the coefficients of the OpenGL perspective matrix presented at the beginning of this chapter (because Scratchapixel uses the row-major convention to represent matrices, we will be using the transpose of this matrix). The code for this function is straightforward:

void OpenGlFrustum(
    float l, float r, float b, float t, float n, float f,
    Matrix44<float> &mat)
{
    mat[0][0] = 2 * n / (r - l);
    mat[0][1] = 0;
    mat[0][2] = 0;
    mat[0][3] = 0;
    mat[1][0] = 0;
    mat[1][1] = 2 * n / (t - b);
    mat[1][2] = 0;
    mat[1][3] = 0;
    mat[2][0] = (r + l) / (r - l);
    mat[2][1] = (t + b) / (t - b);
    mat[2][2] = -(f + n) / (f - n);
    mat[2][3] = -1;
    mat[3][0] = 0;
    mat[3][1] = 0;
    mat[3][2] = -2 * f * n / (f - n);
    mat[3][3] = 0;
}

However, you are still left with computing the first four arguments yourself (the left, right, bottom and top coordinates). The GLU library offers a method called gluPerspective which computes these arguments and calls the glFrustum function for you. The function gluPerspective takes four arguments: the camera's field of view, the image aspect ratio (which you need to compute by dividing the image width by the image height), and the near and far clipping planes. The first three parameters are used to compute the left, right, top and bottom coordinates, and these coordinates plus the near and far clipping planes are then passed to the glFrustum function.

void OpenGlPerspective(
    float angle, float imageAspectRatio, float n, float f,
    Matrix44<float> &mat)
{
    float scale = tan(degtorad(angle * 0.5)) * n;
    float r = imageAspectRatio * scale, l = -r;
    float t = scale, b = -t;
    OpenGlFrustum(l, r, b, t, n, f, mat);
}

As you can see the code is pretty simple, and to be sure it works we have been using it in the simple self-contained program we wrote in the first chapter to test the first version of our perspective projection matrix. We mentioned in the first chapter that even if matrices are built differently (and look different), they should still always give the same result (a point in 3D space should always be projected to the same pixel location on the image). If we compare the result of projecting the teapot's vertices using the first matrix with the result of projecting the same vertices with the same camera settings (same field of view, image aspect ratio, near and far clipping planes) and the OpenGL perspective projection matrix, we get the same two images (below).

You can find this code here and test it yourself.

The Point (or Vertex) Transformation Pipeline

Vertex is a better term when it comes to describing how points (vertices) are transformed in OpenGL. Because the pipeline is very different from the process followed in a ray tracer, readers are often confused between what they read on the OpenGL vertex transformation pipeline and what we have described in the first chapter and the lesson on cameras (which is almost the reverse of what happens to a vertex in OpenGL).

OpenGL (like other rendering APIs) has two modes for modifying the state of the camera: GL_PROJECTION and GL_MODELVIEW. When the GL_PROJECTION mode is active, we modify the projection matrix itself, that is, how points from the 3D scene are projected onto the image plane. As we have explained so far, this matrix only relies on the left, right, bottom and top coordinates (which are computed from the camera's field of view and near clipping plane), and on the near and far clipping planes (which are parameters of the camera). These parameters define the shape of the camera's frustum, and all the vertices or points from the scene contained within this frustum are visible. In OpenGL, these parameters are passed to the API through a call to glFrustum:

glFrustum(float left, float right, float bottom, float top, float near, float far);

We also need a matrix to transform the camera from its default position to its final position in the scene. This matrix is called the camera-to-world matrix and has nothing to do with the projection matrix. It is a standard 4x4 matrix, similar to the matrices we use to transform objects in the scene; rather than transforming objects, however, it transforms the camera. In OpenGL, this transformation can be set after the GL_MODELVIEW mode is made active. A typical OpenGL program would set the perspective projection matrix and the model-view matrix using the following sequence of calls:

1 glMatrixMode(GL_PROJECTION);
2 glLoadIdentity();
3 glFrustum(l, r, b, t, n, f);
4 glMatrixMode(GL_MODELVIEW);
5 glLoadIdentity();
6 glTranslatef(0, 0, 10);
7 ...

First we make the GL_PROJECTION mode active (line 1). Next, to set up the projection matrix, we make a call to glFrustum, passing as arguments the left, right, bottom and top coordinates as well as the near and far clipping planes. Once the projection matrix is set, we switch to the GL_MODELVIEW mode (line 4) to change the camera position (and transform 3D objects). It is important to understand that changing the camera to view an object from a different angle using a camera-to-world transform (let's call this matrix M) is the same as keeping the camera at its default position and applying the inverse of the camera-to-world transform (M⁻¹) to the object. This is what OpenGL does. Actually, the GL_MODELVIEW matrix can be seen as the combination of the "VIEW" transformation matrix (the inverse camera-to-world matrix) with the "MODEL" matrix (the transformation applied to the object, i.e. the object-to-world matrix). In OpenGL, there is no concept of a camera-to-world transform separate from the object-to-world transform: the two are combined in the GL_MODELVIEW matrix. The camera is treated as if it were located at the origin pointing down the negative z-axis, and objects are transformed by the GL_MODELVIEW matrix (as we just mentioned, moving the camera to frame an object is the same as keeping the camera at its default position and applying the inverse camera-to-world transform to the object).

$$\text{GL\_MODELVIEW} = M_{\text{object-to-world}} \cdot M^{-1}_{\text{camera-to-world}}$$


Let's now follow the path of a point or vertex processed through the vertex transformation pipeline. First, the point Pw expressed in world space is transformed by the GL_MODELVIEW matrix to camera (or eye) space. This point, Pc, is now defined with respect to the camera's coordinate system (figure xx). Next, Pc is projected onto the image plane using the GL_PROJECTION matrix. We end up with a point expressed in homogeneous coordinates, in which the w coordinate contains the point Pc's z coordinate.

"A pipeline is a sequence of stages operating in parallel and in a fixed order. Each stage receives its input from the prior stage and sends its output to the subsequent stage" (the Cg Tutorial by NVDIA).

OpenGL does not proceed with the z-divide right away, and this is one of the most important things to know about the OpenGL vertex transformation pipeline. The coordinates of the point we get from multiplying Pc by the projection matrix are called the clipping coordinates. Why? Because at this stage of the pipeline, OpenGL can already test this point (which we will call Pclip) to check whether it lies inside or outside the camera's frustum (points are clipped). How? The w coordinate of the homogeneous point Pclip contains the z coordinate of the point being projected. We already know that to convert this clipped point from homogeneous coordinates back to Cartesian coordinates we need to divide all its coordinates by w (what we called the perspective or z-divide). We also know that after this z-divide, all the points visible to the camera have their coordinates contained within the range [-1:1] (the projection process remaps a visible point's coordinates to the unit cube; if any of the projected point's coordinates is not contained within the range [-1:1], the point is not visible and can be culled or clipped). However, note that if all visible points satisfy:

$$-1 \le \frac{P_{clip_x}}{P_{clip_w}} \le 1 \qquad -1 \le \frac{P_{clip_y}}{P_{clip_w}} \le 1 \qquad -1 \le \frac{P_{clip_z}}{P_{clip_w}} \le 1$$


then we can also write that:

$$-P_{clip_w} \le P_{clip_x} \le P_{clip_w} \qquad -P_{clip_w} \le P_{clip_y} \le P_{clip_w} \qquad -P_{clip_w} \le P_{clip_z} \le P_{clip_w}$$


Rather than doing the z-divide first and comparing the resulting coordinates against the boundaries of the unit cube to check whether a point is visible, OpenGL compares the x, y and z coordinates of the clipped point against ±Pclipw first, and only proceeds with the division by w if the point passes the clipping test. This "early" clipping test is an optimisation that avoids an unnecessary division by w when the point is invisible.
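The early clip test described above can be sketched as follows (a minimal sketch; the struct and function names are ours):

```cpp
#include <cassert>

// A point in homogeneous clip coordinates.
struct Vec4 { float x, y, z, w; };

// A point is kept only if each of x, y, z lies in [-w, w], so the
// division by w can be skipped entirely for culled points.
bool insideFrustum(const Vec4 &Pclip)
{
    return -Pclip.w <= Pclip.x && Pclip.x <= Pclip.w &&
           -Pclip.w <= Pclip.y && Pclip.y <= Pclip.w &&
           -Pclip.w <= Pclip.z && Pclip.z <= Pclip.w;
}
```

Note that real clipping also splits primitives that straddle the frustum boundary; this sketch only shows the per-point inside/outside test.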

Points which have successfully passed the clipping test are then converted to Cartesian coordinates (through the perspective divide: the x, y and z coordinates are divided by w). At this stage the point's coordinates are all in the range [-1:1]; they are normalized, which is the reason we call this space the Normalized Device Coordinate, or NDC, space. Be aware that NDC space may have a different meaning for other rendering APIs, but in an OpenGL context it applies to points whose coordinates have been normalized by the z-divide operation. Coordinates of points in NDC space are always in the range [-1:1]. Finally, the points are remapped to screen or window coordinates. The range of values along the x-axis (the width of the image plane) is remapped to [0:width] (where width is the dimension of the image in pixels). Along the y-axis (the height of the image), points' coordinates are remapped from [-1:1] to [0:height] (where height is the image dimension in pixels). The points' z coordinate is also remapped, generally from 0 to 1 (this range can be controlled by a call to glDepthRange). In OpenGL, the dimensions of the screen in pixels can be specified with a call to glViewport.
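The final viewport remapping can be sketched as follows (a sketch assuming the default glDepthRange of [0:1]; the struct and function names are ours):

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

// Remaps NDC coordinates in [-1, 1] to window coordinates: x to
// [0, width], y to [0, height], z to [0, 1].
Vec3 ndcToWindow(const Vec3 &ndc, int width, int height)
{
    return { (ndc.x + 1) * 0.5f * width,
             (ndc.y + 1) * 0.5f * height,
             (ndc.z + 1) * 0.5f };
}
```

For example, with a 640x480 viewport the NDC origin (0, 0, 0) lands at pixel (320, 240) with depth 0.5.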

Be aware here again that screen space can mean something different in other rendering APIs. In the lesson on cameras from the basic section, NDC and screen space actually have a different meaning. If you come from an OpenGL background, you may be very confused by how these terms are used, particularly within the context of ray tracing, where NDC space generally denotes points within the range [0:1] and the term screen space is used in place of NDC space.

The following diagram summarizes the successive transformations applied to a point throughout the OpenGL vertex transformation pipeline:

As we mentioned a few times already, in ray tracing, points don't need to be projected to the screen. Instead, rays emitted from the image plane are tested for intersections with the objects from the scene. The transformation we have just explained is useful for renderers using a z-buffer visibility algorithm (which is the case of OpenGL).

Readers interested in this lesson might also find complementary information in Camera, Advanced Techniques.




Perspective projection: glFrustum from MSDN


glFrustum function

Applies to: desktop apps only

The glFrustum function multiplies the current matrix by a perspective matrix.

Syntax

void WINAPI glFrustum(
  GLdouble left,
  GLdouble right,
  GLdouble bottom,
  GLdouble top,
  GLdouble zNear,
  GLdouble zFar
);

Parameters

left

The coordinate for the left-vertical clipping plane.

right

The coordinate for the right-vertical clipping plane.

bottom

The coordinate for the bottom-horizontal clipping plane.

top

The coordinate for the top-horizontal clipping plane.

zNear

The distance to the near-depth clipping plane. Must be positive.

zFar

The distance to the far-depth clipping plane. Must be positive.

Return value

This function does not return a value.

Error codes

The following error codes can be retrieved by the glGetError function.

Name | Meaning

GL_INVALID_VALUE

zNear or zFar was not positive.

GL_INVALID_OPERATION

The function was called between a call to glBegin and the corresponding call to glEnd.

Remarks

The glFrustum function describes a perspective matrix that produces a perspective projection. The (left, bottom, zNear) and (right, top, zNear) parameters specify the points on the near clipping plane that are mapped to the lower-left and upper-right corners of the window, respectively, assuming that the eye is located at (0,0,0). The zFar parameter specifies the location of the far clipping plane. Both zNear and zFar must be positive. The corresponding matrix is shown in the following image.

(Images: the glFrustum perspective matrix F and its coefficients.)

The glFrustum function multiplies the current matrix by this matrix, with the result replacing the current matrix. That is, if M is the current matrix and F is the frustum perspective matrix, then glFrustum replaces M with M • F.



