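How the algorithm works:
3D visibility culling usually consists of four stages: backface culling, view frustum culling, portal culling, and occlusion culling. Almost every 3D game needs the first two; the third mostly appears in indoor games built from many connected rooms, such as Quake.
Occlusion culling means that if an object in front completely covers an object behind it, no line of sight can reach the one behind, so that object does not need to be rendered. The most commonly used algorithm is HOC (Hardware Occlusion Culling). As the name suggests, it relies on the hardware to some extent, namely the familiar Z-buffer. Don't expect the hardware to solve everything for us, though; the algorithm itself still has to be implemented in code.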
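The core of HOC is to "pre-render" a 3D object first. Pre-rendering is a "fake" render: it goes through the motions of rendering without actually drawing the object. The depth of each pixel of the object is compared with the corresponding value in the Z-buffer, which is the depth test. If every pixel's depth is greater than the value in the Z-buffer (that is, no pixel passes the depth test), the object is completely occluded and is culled.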
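To make the depth test concrete, here is a minimal software sketch of what the hardware counts during a pre-render. It is not Direct3D code: the `Fragment` type and both function names are made up for illustration, and the comparison assumes a "less or equal" depth function.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One rasterized sample of the object being tested (hypothetical type).
struct Fragment {
    std::size_t x;
    std::size_t y;
    float depth;   // 0.0 = near plane, 1.0 = far plane
};

// Counts the fragments that would pass a "less or equal" depth test
// against an existing depth buffer. Nothing is written anywhere,
// which is exactly what makes this a "fake" render.
std::size_t countVisibleFragments(const std::vector<float>& depthBuffer,
                                  std::size_t width,
                                  const std::vector<Fragment>& fragments)
{
    std::size_t visible = 0;
    for (const Fragment& f : fragments) {
        const float stored = depthBuffer[f.y * width + f.x];
        if (f.depth <= stored) {
            ++visible;
        }
    }
    return visible;
}

// An object is occluded when none of its fragments passes the test.
bool isOccluded(const std::vector<float>& depthBuffer,
                std::size_t width,
                const std::vector<Fragment>& fragments)
{
    return countVisibleFragments(depthBuffer, width, fragments) == 0;
}
```

With a 2x2 depth buffer whose top row already holds a near surface, an object whose fragments all land on the top row at a greater depth is reported as occluded, while an object that also touches the empty bottom row is not.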
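You may ask: isn't the pre-render itself expensive? An object that gets culled has still gone through one pre-render, and an object that is not culled goes through one pre-render plus one real render. On the face of it, occlusion culling makes things slower. The key point is that you should not pre-render the real 3D object but its bounding volume, usually an AABB or OBB. Suppose a 3D model has 10000 vertices and its bounding box has 8. Pre-render the bounding box first; if the box is culled, the model must be culled as well (if a box larger than the model is completely hidden, how could the model itself not be?). In that case only the 8 vertices of the box were actually drawn. If the box is not culled, the model is usually assumed to be visible too (this can produce false positives: the box may be partly visible while the model inside it is fully hidden). Then 8 + 10000 = 10008 vertices have to be drawn, hardly more than 10000.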
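The arithmetic above can be turned into a rough cost estimate. The sketch below is only a back-of-the-envelope model: the function names are hypothetical, `occludedFraction` is an assumed probability that the bounding box turns out to be hidden, and only vertex counts are considered (the cost of waiting for the query result is ignored).

```cpp
#include <cassert>

// Expected number of vertices processed for one object when its bounding
// box is always pre-rendered and the mesh is drawn only if the box is visible.
double expectedVerticesPerObject(double meshVertices,
                                 double boxVertices,
                                 double occludedFraction)
{
    return boxVertices + (1.0 - occludedFraction) * meshVertices;
}

// The test is worthwhile (in vertex terms only) when the expected cost
// is lower than simply drawing the mesh every time.
bool occlusionTestPaysOff(double meshVertices,
                          double boxVertices,
                          double occludedFraction)
{
    return expectedVerticesPerObject(meshVertices, boxVertices, occludedFraction)
           < meshVertices;
}
```

For the 10000-vertex model with an 8-vertex box, an object that is never hidden costs 10008 vertices, one that is always hidden costs 8, and one that is hidden half the time costs 5008 on average.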
The next question is how to find out how many pixels passed the depth test. DX9 has an interface specifically for this, IDirect3DQuery9, which is created as follows:
LPDIRECT3DQUERY9 g_pd3dQuery = NULL;
g_pd3dDevice->CreateQuery(D3DQUERYTYPE_OCCLUSION, &g_pd3dQuery);
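The first argument, D3DQUERYTYPE_OCCLUSION, says that the query will be used for occlusion culling. The created interface is returned through g_pd3dQuery.
The overall framework of the occlusion culling algorithm looks like this: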
g_pd3dQuery->Issue(D3DISSUE_BEGIN);
// Pre-render the bounding box
g_pd3dQuery->Issue(D3DISSUE_END);
DWORD pixelsVisible = 0;
// Spin until the query result is available
while (g_pd3dQuery->GetData((void *) &pixelsVisible,
    sizeof(DWORD), D3DGETDATA_FLUSH) == S_FALSE);
if (pixelsVisible > 0)
{
    // Render the real 3D object
}
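The line g_pd3dQuery->Issue(D3DISSUE_BEGIN); tells D3D to start counting the pixels that pass the depth test, that is, pixels that are not hidden behind what is already in the Z-buffer, and g_pd3dQuery->Issue(D3DISSUE_END); tells D3D to stop counting. Calling GetData on g_pd3dQuery then retrieves the count, pixelsVisible. GetData returns S_FALSE while the result is still pending, so the while loop blocks the CPU until the GPU has finished. If the value is greater than 0, at least one pixel is visible, so the object is not completely occluded and the real 3D object should be drawn.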
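The polling loop is easier to reason about with a stand-in for the query object. The sketch below uses a hypothetical `MockOcclusionQuery` class (not a Direct3D type) whose `tryGetData` mimics the "not ready yet / ready" contract of GetData. It also shows one possible variation, which is an assumption and not what the demo does: giving up after a bounded number of polls and treating the object as visible, so a slow query can never cause a visible object to be culled.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for an asynchronous GPU query.
class MockOcclusionQuery {
public:
    MockOcclusionQuery(int pollsUntilReady, std::uint32_t result)
        : pollsLeft_(pollsUntilReady), result_(result) {}

    // Returns false while the result is pending (S_FALSE in Direct3D 9)
    // and true once the pixel count has been written to pixelsOut (S_OK).
    bool tryGetData(std::uint32_t& pixelsOut) {
        if (pollsLeft_ > 0) {
            --pollsLeft_;
            return false;
        }
        pixelsOut = result_;
        return true;
    }

private:
    int pollsLeft_;
    std::uint32_t result_;
};

// Polls at most maxPolls times. If the result never arrives, the object
// is conservatively reported as visible.
bool isVisible(MockOcclusionQuery& query, int maxPolls) {
    std::uint32_t pixels = 0;
    for (int i = 0; i < maxPolls; ++i) {
        if (query.tryGetData(pixels)) {
            return pixels > 0;
        }
    }
    return true;
}
```

The unbounded loop in the framework above corresponds to an infinite `maxPolls`; the bounded version trades a possibly unnecessary draw for a guaranteed upper limit on the wait.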
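The next difficulty is how to do the so-called pre-render, that is, how to go through the motions of rendering without actually drawing the object. There are several ways. The most obvious is blending: set the source blend factor to 0 and the destination blend factor to 1. Nothing visible is then drawn into the back buffer, yet a render pass still takes place, exactly as wanted. The code can now be made more concrete: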
// Pre-render state: no lighting, blending on (src = 0, dest = 1), no depth writes
g_pd3dDevice->SetRenderState(D3DRS_LIGHTING, false);
g_pd3dDevice->SetRenderState(D3DRS_ALPHABLENDENABLE, true);
g_pd3dDevice->SetRenderState(D3DRS_ZWRITEENABLE, false);
g_pd3dDevice->SetTransform(D3DTS_WORLD, &g_pSpheres[i]->m_MatTranslate);
g_pd3dQuery->Issue(D3DISSUE_BEGIN);
// Pre-render the bounding box
g_pd3dQuery->Issue(D3DISSUE_END);
DWORD pixelsVisible = 0;
// Spin until the query result is available
while (g_pd3dQuery->GetData((void *) &pixelsVisible,
    sizeof(DWORD), D3DGETDATA_FLUSH) == S_FALSE);
if (pixelsVisible > 0)
{
    g_iNumVisibleSpheres++;
    // Restore normal state before the real draw
    g_pd3dDevice->SetRenderState(D3DRS_LIGHTING, true);
    g_pd3dDevice->SetRenderState(D3DRS_ALPHABLENDENABLE, false);
    g_pd3dDevice->SetRenderState(D3DRS_ZWRITEENABLE, true);
    // Render the real 3D object
}
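Note that pre-rendering requires more than enabling blending and setting the blend factors correctly: ZWRITEENABLE must also be turned off, otherwise the depth of the bounding box would be written into the Z-buffer. Lighting is also temporarily disabled to avoid unnecessary computation.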
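Why both settings matter can be seen in a one-pixel model. The sketch below is a simplified simulation, not Direct3D code: `Pixel`, `RenderState` and `drawFragment` are hypothetical names, colour is reduced to a single float, and the depth test is assumed to be "less or equal".

```cpp
#include <cassert>

struct Pixel {
    float color;
    float depth;
};

struct RenderState {
    float srcBlend;      // 0.0 models D3DBLEND_ZERO, 1.0 models D3DBLEND_ONE
    float destBlend;     // same convention for the destination factor
    bool  zWriteEnable;  // models D3DRS_ZWRITEENABLE
};

// Draws one fragment into one pixel.
// Returns true if the fragment passed the depth test.
bool drawFragment(Pixel& pixel, float srcColor, float srcDepth,
                  const RenderState& state)
{
    if (srcDepth > pixel.depth) {
        return false;                       // rejected by the depth test
    }
    // Blend equation: result = src * srcFactor + dest * destFactor
    pixel.color = srcColor * state.srcBlend + pixel.color * state.destBlend;
    if (state.zWriteEnable) {
        pixel.depth = srcDepth;
    }
    return true;
}
```

With factors 0 and 1 and depth writes off, the box fragment passes the test but leaves both colour and depth untouched, so the sphere surface slightly behind the box face can still be drawn afterwards. If depth writes are left on during the pre-render, the nearer box face ends up in the depth buffer and the sphere's own fragment is then rejected.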
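Sample program:
Below is an extremely crude little demo I wrote. It draws five spheres placed along the Z axis from near to far. When a sphere in front completely covers a sphere behind it, the one behind is culled. You can verify this by moving the camera with the keyboard. Only main.cpp is shown here; the camera code is not included, so please download it yourself.
The main code is as follows: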
#include <d3d9.h>
#include <d3dx9.h>
#include <tchar.h>
#include "Camera.h"
#include "D3DUtil.h"
#pragma comment(lib,"d3d9.lib")
#pragma comment(lib,"d3dx9.lib")
#pragma comment(lib,"dxguid.lib")
#pragma comment(lib, "winmm.lib")
#define SCREEN_WIDTH 800
#define SCREEN_HEIGHT 600
#define CAMERA_MOVE_PITCH 5.0f
#define CAMERA_ROTATE_PITCH 1.0f
LRESULT CALLBACK WindowProcedure (HWND, UINT, WPARAM, LPARAM);
HRESULT Game_Init(HWND hwnd);
HRESULT Objects_Init();
void Game_Update(float);
void Game_Render();
void Game_Exit();
float Get_FPS();
void Draw_FPS_Text();
void Matrix_Set();
LPDIRECT3DDEVICE9 g_pd3dDevice = NULL;
LPD3DXFONT g_pFont = NULL;
Camera* g_pCamera = NULL;
LPDIRECT3DQUERY9 g_pd3dQuery = NULL;
int g_iNumVisibleSpheres = 0;
struct CSphere
{
CSphere(float radius, const D3DXVECTOR3& pos, const D3DMATERIAL9& mtrl);
~CSphere();
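LPD3DXMESH m_pMesh; //mesh of the sphere
LPD3DXMESH m_pBoundingBox; //mesh of the bounding box
D3DXVECTOR3 m_Pos; //position of the sphere
D3DXMATRIX m_MatTranslate; //world transform of the sphere
D3DMATERIAL9 m_Mtrl; //material of the sphere
float m_fRadius; //radius of the sphere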
};
CSphere::CSphere(float radius, const D3DXVECTOR3& pos, const D3DMATERIAL9& mtrl)
{
m_fRadius = radius;
m_Pos = pos;
m_Mtrl = mtrl;
D3DXMatrixTranslation(&m_MatTranslate, pos.x, pos.y, pos.z);
D3DXCreateSphere(g_pd3dDevice, m_fRadius, 100, 100, &m_pMesh, NULL);
D3DXCreateBox(g_pd3dDevice, m_fRadius * 2, m_fRadius * 2, m_fRadius * 2, &m_pBoundingBox, NULL);
}
CSphere::~CSphere()
{
SAFE_RELEASE(m_pMesh);
SAFE_RELEASE(m_pBoundingBox);
}
CSphere* g_pSpheres[5];
int WINAPI WinMain (HINSTANCE hInstance,
HINSTANCE hPrevInstance,
LPSTR lpszArgument,
int iCmdShow)
{
HWND hwnd;
MSG msg;
WNDCLASS wndclass;
static TCHAR szClassName[ ] = TEXT("WindowsApp");
wndclass.hInstance = hInstance;
wndclass.lpszClassName = szClassName;
wndclass.lpfnWndProc = WindowProcedure;
wndclass.style = CS_HREDRAW | CS_VREDRAW | CS_DBLCLKS;
wndclass.cbWndExtra = 0;
wndclass.cbClsExtra = 0;
wndclass.hIcon = LoadIcon(NULL, IDI_APPLICATION);
wndclass.hCursor = LoadCursor(NULL, IDC_ARROW);
wndclass.hbrBackground = (HBRUSH)GetStockObject(WHITE_BRUSH);
wndclass.lpszMenuName = NULL;
if (!RegisterClass (&wndclass))
return 0;
RECT rc = {0, 0, SCREEN_WIDTH, SCREEN_HEIGHT};
AdjustWindowRect(&rc, WS_OVERLAPPEDWINDOW, false);
hwnd = CreateWindow(
szClassName,
TEXT("MyApp"),
WS_OVERLAPPEDWINDOW,
CW_USEDEFAULT,
CW_USEDEFAULT,
rc.right - rc.left,
rc.bottom - rc.top,
NULL,
NULL,
hInstance,
NULL
);
ShowWindow (hwnd, iCmdShow);
UpdateWindow(hwnd);
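//Bail out if the device or any resource could not be created
if(FAILED(Game_Init(hwnd)))
{
Game_Exit();
return 0;
}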
float lastTime = timeGetTime() * 0.001f;
float currentTime = timeGetTime() * 0.001f;
float delta;
while(true)
{
if(PeekMessage(&msg, NULL, 0, 0, PM_REMOVE))
{
if(msg.message == WM_QUIT)
break;
TranslateMessage(&msg);
DispatchMessage(&msg);
}
else {
currentTime = timeGetTime() * 0.001f;
delta = currentTime - lastTime;
Game_Update(delta);
Game_Render();
lastTime = currentTime;
}
Sleep(1);
}
Game_Exit();
return msg.wParam;
}
LRESULT CALLBACK WindowProcedure (HWND hwnd, UINT message, WPARAM wParam, LPARAM lParam)
{
switch (message)
{
case WM_DESTROY:
PostQuitMessage (0);
break;
default:
return DefWindowProc (hwnd, message, wParam, lParam);
}
return 0;
}
HRESULT Game_Init(HWND hwnd)
{
LPDIRECT3D9 pD3D = NULL;
if( NULL == ( pD3D = Direct3DCreate9( D3D_SDK_VERSION ) ) )
return E_FAIL;
D3DCAPS9 caps; int vp = 0;
if( FAILED( pD3D->GetDeviceCaps( D3DADAPTER_DEFAULT, D3DDEVTYPE_HAL, &caps ) ) )
{
return E_FAIL;
}
if( caps.DevCaps & D3DDEVCAPS_HWTRANSFORMANDLIGHT )
vp = D3DCREATE_HARDWARE_VERTEXPROCESSING;
else
vp = D3DCREATE_SOFTWARE_VERTEXPROCESSING;
D3DPRESENT_PARAMETERS d3dpp;
ZeroMemory(&d3dpp, sizeof(d3dpp));
d3dpp.BackBufferWidth = SCREEN_WIDTH;
d3dpp.BackBufferHeight = SCREEN_HEIGHT;
d3dpp.BackBufferFormat = D3DFMT_A8R8G8B8;
d3dpp.BackBufferCount = 1;
d3dpp.MultiSampleType = D3DMULTISAMPLE_NONE;
d3dpp.MultiSampleQuality = 0;
d3dpp.SwapEffect = D3DSWAPEFFECT_DISCARD;
d3dpp.hDeviceWindow = hwnd;
d3dpp.Windowed = true;
d3dpp.EnableAutoDepthStencil = true;
d3dpp.AutoDepthStencilFormat = D3DFMT_D24S8;
d3dpp.Flags = 0;
d3dpp.FullScreen_RefreshRateInHz = 0;
d3dpp.PresentationInterval = D3DPRESENT_INTERVAL_IMMEDIATE;
if(FAILED(pD3D->CreateDevice(D3DADAPTER_DEFAULT, D3DDEVTYPE_HAL,
hwnd, vp, &d3dpp, &g_pd3dDevice)))
return E_FAIL;
if(Objects_Init() != S_OK) return E_FAIL;
SAFE_RELEASE(pD3D)
return S_OK;
}
HRESULT Objects_Init()
{
if(FAILED(D3DXCreateFont(g_pd3dDevice, 30, 0, 0, 1, FALSE, DEFAULT_CHARSET,
OUT_DEFAULT_PRECIS, DEFAULT_QUALITY, 0, _T("宋体"), &g_pFont)))
return E_FAIL;
//CreateQuery fails if the device does not support occlusion queries
if(FAILED(g_pd3dDevice->CreateQuery(D3DQUERYTYPE_OCCLUSION, &g_pd3dQuery)))
return E_FAIL;
g_pCamera = new Camera;
D3DXVECTOR3 cameraPos(0, 0.0f, -5.0f);
g_pCamera->SetPosition(&cameraPos);
D3DLIGHT9 light;
ZeroMemory(&light, sizeof(light));
light.Type = D3DLIGHT_DIRECTIONAL;
light.Direction = D3DXVECTOR3(1.0f, -1.0f, 1.0f);
light.Ambient = D3DXCOLOR(0.3f, 0.3f, 0.3f, 1.0f);
light.Diffuse = D3DXCOLOR(1.0f, 1.0f, 1.0f, 1.0f);
light.Specular = D3DXCOLOR(1.0f, 1.0f, 1.0f, 1.0f);
g_pd3dDevice->SetLight(0, &light);
g_pd3dDevice->LightEnable(0, true);
D3DMATERIAL9 mtrl;
mtrl.Ambient = D3DXCOLOR(1.0f, 1.0f, 1.0f, 1.0f);
mtrl.Emissive = D3DXCOLOR(0, 0, 0, 1.0f);
mtrl.Specular = D3DXCOLOR(1.0f, 1.0f, 1.0f, 1.0f);
mtrl.Power = 16;
D3DXCOLOR diffuse[5];
diffuse[0] = D3DXCOLOR(1.0f, 0, 0, 1.0f);
diffuse[1] = D3DXCOLOR(1.0f, 1.0f, 0, 1.0f);
diffuse[2] = D3DXCOLOR(0.0f, 1.0f, 0, 1.0f);
diffuse[3] = D3DXCOLOR(0.0f, 0, 1.0f, 1.0f);
diffuse[4] = D3DXCOLOR(0.0f, 1.0, 1.0f, 1.0f);
g_pd3dDevice->SetRenderState(D3DRS_LIGHTING, true);
int i;
for(i = 0; i < 5; i++)
{
//Give each sphere a different color
mtrl.Diffuse = diffuse[i];
mtrl.Ambient = diffuse[i];
g_pSpheres[i] = new CSphere(1.0f, D3DXVECTOR3(0, 0, i * 3.0f), mtrl);
}
//Blend factors for the pre-render: source 0, destination 1
g_pd3dDevice->SetRenderState(D3DRS_SRCBLEND, D3DBLEND_ZERO);
g_pd3dDevice->SetRenderState(D3DRS_DESTBLEND, D3DBLEND_ONE);
return S_OK;
}
void Game_Update(float delta)
{
if(GetAsyncKeyState('W') & 0x8000)
{
g_pCamera->Walk(CAMERA_MOVE_PITCH * delta);
}
if(GetAsyncKeyState('S') & 0x8000)
{
g_pCamera->Walk(-CAMERA_MOVE_PITCH * delta);
}
if(GetAsyncKeyState('A') & 0x8000)
{
g_pCamera->Strafe(-CAMERA_MOVE_PITCH * delta);
}
if(GetAsyncKeyState('D') & 0x8000)
{
g_pCamera->Strafe(CAMERA_MOVE_PITCH * delta);
}
if(GetAsyncKeyState('R') & 0x8000)
{
g_pCamera->Fly(CAMERA_MOVE_PITCH * delta);
}
if(GetAsyncKeyState('F') & 0x8000)
{
g_pCamera->Fly(-CAMERA_MOVE_PITCH * delta);
}
if(GetAsyncKeyState(VK_UP) & 0x8000)
{
g_pCamera->Pitch(-CAMERA_ROTATE_PITCH * delta);
}
if(GetAsyncKeyState(VK_DOWN) & 0x8000)
{
g_pCamera->Pitch(CAMERA_ROTATE_PITCH * delta);
}
if(GetAsyncKeyState(VK_LEFT) & 0x8000)
{
g_pCamera->Yaw(-CAMERA_ROTATE_PITCH * delta);
}
if(GetAsyncKeyState(VK_RIGHT) & 0x8000)
{
g_pCamera->Yaw(CAMERA_ROTATE_PITCH * delta);
}
if(GetAsyncKeyState('N') & 0x8000)
{
g_pCamera->Roll(CAMERA_ROTATE_PITCH * delta);
}
if(GetAsyncKeyState('M') & 0x8000)
{
g_pCamera->Roll(-CAMERA_ROTATE_PITCH * delta);
}
}
void Game_Render()
{
Matrix_Set();
g_pd3dDevice->Clear(0, NULL, D3DCLEAR_TARGET | D3DCLEAR_ZBUFFER,
D3DCOLOR_XRGB(0, 0, 0), 1.0f, 0);
g_pd3dDevice->BeginScene();
g_iNumVisibleSpheres = 0;
int i;
for (i = 0; i < 5; i++)
{
//Pre-render state: lighting off, blending on, Z writes off
g_pd3dDevice->SetRenderState(D3DRS_LIGHTING, false);
g_pd3dDevice->SetRenderState(D3DRS_ALPHABLENDENABLE, true);
g_pd3dDevice->SetRenderState(D3DRS_ZWRITEENABLE, false);
g_pd3dDevice->SetTransform(D3DTS_WORLD, &g_pSpheres[i]->m_MatTranslate);
//Start counting the pixels that pass the depth test
g_pd3dQuery->Issue(D3DISSUE_BEGIN);
g_pSpheres[i]->m_pBoundingBox->DrawSubset(0);
g_pd3dQuery->Issue(D3DISSUE_END);
//Fetch the number of visible pixels (blocks until the result is ready)
DWORD pixelsVisible = 0;
while (g_pd3dQuery->GetData((void *) &pixelsVisible,
sizeof(DWORD), D3DGETDATA_FLUSH) == S_FALSE);
//If at least one pixel is visible, do the real draw
if(pixelsVisible > 0)
{
g_iNumVisibleSpheres++;
g_pd3dDevice->SetRenderState(D3DRS_LIGHTING, true);
g_pd3dDevice->SetRenderState(D3DRS_ALPHABLENDENABLE, false);
g_pd3dDevice->SetRenderState(D3DRS_ZWRITEENABLE, true);
g_pd3dDevice->SetMaterial(&g_pSpheres[i]->m_Mtrl);
g_pSpheres[i]->m_pMesh->DrawSubset(0);
}
}
Draw_FPS_Text();
g_pd3dDevice->EndScene();
g_pd3dDevice->Present(NULL, NULL, NULL, NULL);
}
void Draw_FPS_Text()
{
RECT rect;
rect.top = 0;
rect.bottom = SCREEN_HEIGHT;
rect.left = 0;
rect.right = SCREEN_WIDTH;
float fps = Get_FPS();
TCHAR fps_str[50];
_stprintf(fps_str, TEXT("Sphere:%d, FPS:%f"), g_iNumVisibleSpheres, fps);
g_pFont->DrawText(0, fps_str, -1, &rect, DT_TOP | DT_RIGHT, D3DCOLOR_XRGB(0, 255, 0));
}
void Game_Exit()
{
for (int i = 0; i < 5; i++)
{
delete g_pSpheres[i];
}
SAFE_DELETE(g_pCamera);
SAFE_RELEASE(g_pd3dQuery)
SAFE_RELEASE(g_pFont)
SAFE_RELEASE(g_pd3dDevice)
}
float Get_FPS()
{
static float currentTime = timeGetTime() * 0.001f;
static float lastTime = 0;
static float fps = 0.0f;
static int frameCount = 0;
frameCount++;
currentTime = timeGetTime() * 0.001f;
if(currentTime - lastTime > 1.0f)
{
fps = frameCount / (currentTime - lastTime);
frameCount = 0;
lastTime = currentTime;
}
return fps;
}
void Matrix_Set()
{
D3DXMATRIX matView;
g_pCamera->GetViewMatrix(&matView);
g_pd3dDevice->SetTransform(D3DTS_VIEW, &matView);
D3DXMATRIX matProj;
D3DXMatrixPerspectiveFovLH(&matProj, D3DX_PI / 4.0f, (float)SCREEN_WIDTH / (float)SCREEN_HEIGHT, 1.0f, 1000.0f);
g_pd3dDevice->SetTransform(D3DTS_PROJECTION, &matProj);
}
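Run the program and you will see "Sphere:1" at the top right of the window, which means only one sphere is currently being drawn. The other four are hidden behind the front sphere and have been culled.
Use the keyboard to control the camera and adjust the viewing direction and eye position. Let's move the view to the position shown in the figure below. The counter now reads "Sphere:5", meaning none of the five spheres is culled.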
Some notes
A few additional points about occlusion culling:
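【1】Unlike backface culling and view frustum culling, occlusion culling is not something every game needs.
【2】Even if your program needs occlusion culling, not every object in the scene does. Only objects that are very likely to be hidden by objects in front of them are worth testing. Running the occlusion test on every object in the scene often makes performance worse rather than better.
【3】Before running occlusion tests on the n objects in a scene, be sure to sort them: the closer an object is to the eye, the earlier it should be tested and drawn. The demo above does not do this sorting, which is an oversight; please keep it in mind.
【4】The pre-render here uses blending, and blending itself costs performance. Another pre-render approach is to create a texture and set it as the RenderTarget; the bounding box is then drawn into that texture, which avoids drawing to the screen (that is, the back buffer). There are other ways to pre-render as well, left for you to discover and try.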
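The sorting step from point 【3】 could look roughly like the sketch below. It is a standalone illustration rather than part of the demo: `Vec3`, `SceneObject` and `sortFrontToBack` are hypothetical names, and distance is measured from the eye to each object's centre, which is only an approximation for large objects.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct Vec3 {
    float x, y, z;
};

// Hypothetical scene entry: an identifier plus the centre of the object.
struct SceneObject {
    int  id;
    Vec3 position;
};

// Squared distance is enough for ordering and avoids a square root.
float squaredDistance(const Vec3& a, const Vec3& b)
{
    const float dx = a.x - b.x;
    const float dy = a.y - b.y;
    const float dz = a.z - b.z;
    return dx * dx + dy * dy + dz * dz;
}

// Orders objects so that the one nearest to the eye comes first.
// Near objects are then drawn first and fill the Z-buffer that the
// occlusion tests of the farther objects are checked against.
void sortFrontToBack(std::vector<SceneObject>& objects, const Vec3& eye)
{
    std::sort(objects.begin(), objects.end(),
              [&eye](const SceneObject& a, const SceneObject& b) {
                  return squaredDistance(a.position, eye)
                       < squaredDistance(b.position, eye);
              });
}
```

In the demo, such a sort would run once per frame on the sphere list, using the current camera position as `eye`, before the loop that issues the queries.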
Finally, the source code download for this section: