Molehill / ND2D – speeding up the engine

from: http://www.nulldesign.de/2011/04/07/molehill-nd2d-speeding-up-the-engine/

 

 

One of the biggest challenges in modern computer graphics is still the high cost of rendering thousands of different objects, no matter how simple they are. While developing ND2D, I’ve been experimenting with different techniques to get good performance.

 

To optimize the rendering you have to know its weaknesses. As a simple rule: every state change on the graphics context (Context3D), and especially every drawTriangles() call, costs a lot of processing power. You’ll notice pretty quickly that if you try to render 2000 sprites (a sprite is just two textured triangles, so 4000 tris in total) with one draw call per sprite, the overhead is so high that the output looks more like a slideshow than a smooth animation. The solution is simple: make as few state changes and draw calls as possible. The implementation is a bit more work…

 

So how do you save draw calls? The answer is geometry batching. Instead of drawing one sprite per draw call, you draw multiple sprites in a single call. To get it to work, you have to dig a bit deeper into shader programming and the graphics hardware:

Single sprite per draw call:


A sprite consists of two triangles, a triangle consists of three vertices, and each vertex has the following attributes: x,y,z, u,v. This is the format of our vertex buffer. The shader input parameters (constants) are the MVP (model-view-projection) matrix, a color (to tint the sprite and to enable transparency) and of course the texture image (image4). This way you’re able to draw one sprite per call: easy and straightforward, but slow.
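The vertex format described above can be sketched in a few lines of Python, for illustration only (ND2D itself is ActionScript). The sketch assumes four shared vertices plus an index buffer instead of six separate vertices, which is a common way to store a quad; the helper name `make_sprite_quad` and the corner ordering are hypothetical and not taken from ND2D.

```python
# Vertex layout from the text: x, y, z, u, v (5 floats per vertex).
FLOATS_PER_VERTEX = 5


def make_sprite_quad(width, height):
    """Return (vertices, indices) for one sprite centered on the origin.

    Assumption: the two triangles share two corners, so four vertices
    and six indices are enough.
    """
    hw, hh = width / 2.0, height / 2.0
    vertices = [
        # x,   y,   z,   u,   v
        -hw, -hh, 0.0, 0.0, 0.0,  # corner 0, uv (0, 0)
        hw, -hh, 0.0, 1.0, 0.0,   # corner 1, uv (1, 0)
        hw, hh, 0.0, 1.0, 1.0,    # corner 2, uv (1, 1)
        -hw, hh, 0.0, 0.0, 1.0,   # corner 3, uv (0, 1)
    ]
    # Two triangles sharing the diagonal between corner 0 and corner 2.
    indices = [0, 1, 2, 0, 2, 3]
    return vertices, indices


vertices, indices = make_sprite_quad(64, 64)
```

With one draw call per sprite, a buffer like this is drawn once for every sprite, each time with a different matrix and color constant.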

Improvement, batching calls:


You can only batch calls if all the sprites you want to draw share the same texture (setting a texture is also pretty expensive). The main idea is to pass multiple MVP matrices and multiple colors to the shader instead of just one. Within the shader, a different MVP matrix is used depending on which sprite is being drawn. But how many values can you pass to the shader? Today’s graphics hardware has at least 128 constant registers available on the GPU, so to stay compatible with all the different graphics cards out there, the Molehill API limits you to 128. In the following picture you can see the different inputs that are available to the vertex shader. We won’t bother with temp registers and input vectors for now, because it’s unlikely that we’ll run out of them while drawing sprites. Just keep in mind that the vertex shader has limited storage space. In our case we’re limited to 128 constants.

 



 (Image taken from the DX8 SDK documentation)
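Coming back to the same-texture requirement: before anything can be batched, the sprites have to be sorted into groups that share a texture. A minimal Python sketch of that step follows; the sprite dictionaries and the `texture` key are made up for the example. Note that regrouping changes the draw order, which can matter when sprites overlap.

```python
from collections import defaultdict


def group_by_texture(sprites):
    """Group sprites by texture, keeping the order in which textures first appear."""
    groups = defaultdict(list)
    for sprite in sprites:
        groups[sprite["texture"]].append(sprite)
    return dict(groups)


# Hypothetical scene: three sprites using two different textures.
sprites = [
    {"name": "hero", "texture": "characters.png"},
    {"name": "tree", "texture": "scenery.png"},
    {"name": "enemy", "texture": "characters.png"},
]
groups = group_by_texture(sprites)
```

Each group then needs only one texture change, and its sprites are candidates for a shared draw call.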

 

A single register can hold a float4. So, let’s do some simple math. The matrix uses 4 registers (4 x float4) and the color just one, so each sprite needs 5 registers: 128 / 5 = 25.6, which rounds down to 25. We should be able to batch 25 sprites into a single draw call. But how does the shader know which matrix to use? To provide this information to the shader, we simply add a batch identifier to the vertex buffer: x,y,z, u,v, batchID. The vertex shader could then look like this:

 

...
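// One clip-space (MVP) matrix per batched sprite: 25 x 4 registers.
parameter float4x4 clipSpaceMatrix[25];
 
void evaluateVertex()
{
    // batchID comes from the vertex buffer and selects this sprite's matrix.
    vertexClipPosition = vertexPosition * clipSpaceMatrix[batchID];
}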
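To make the register math and the batchID lookup concrete, here is a small CPU-side simulation in Python. It is only a sketch: the register layout (all matrices first, then all colors) and the function names `pack_constants` and `evaluate_vertex` are assumptions for illustration, not the way ND2D or Molehill actually lay out constants.

```python
MAX_VERTEX_CONSTANTS = 128  # vertex constant registers, as stated in the text
REGISTERS_PER_MATRIX = 4    # a 4x4 matrix is four float4 rows
REGISTERS_PER_COLOR = 1     # one float4 (r, g, b, a)
REGISTERS_PER_SPRITE = REGISTERS_PER_MATRIX + REGISTERS_PER_COLOR

# 128 // 5 = 25 sprites per batch (3 registers stay unused).
MAX_BATCH_SIZE = MAX_VERTEX_CONSTANTS // REGISTERS_PER_SPRITE


def pack_constants(matrices, colors):
    """Flatten per-sprite matrices and colors into a list of float4 registers.

    Assumed layout: all matrix rows first, then one color per sprite.
    """
    if len(matrices) != len(colors) or len(matrices) > MAX_BATCH_SIZE:
        raise ValueError("need one color per matrix and at most 25 sprites")
    registers = []
    for matrix in matrices:
        registers.extend(matrix)  # each matrix is four rows of four floats
    registers.extend(colors)
    return registers


def evaluate_vertex(vertex, registers):
    """Mimic the shader: pick the matrix by batchID, then transform the position."""
    x, y, z, _u, _v, batch_id = vertex
    first = int(batch_id) * REGISTERS_PER_MATRIX
    rows = registers[first:first + REGISTERS_PER_MATRIX]
    position = (x, y, z, 1.0)
    # Row vector times matrix, like vertexPosition * clipSpaceMatrix[batchID].
    return tuple(
        sum(position[i] * rows[i][col] for i in range(4)) for col in range(4)
    )


identity = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
move_right = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0], [10.0, 0.0, 0.0, 1.0]]
white = [1.0, 1.0, 1.0, 1.0]

registers = pack_constants([identity, move_right], [white, white])

# The same position is transformed differently depending on its batchID.
sprite0 = evaluate_vertex((1.0, 2.0, 0.0, 0.0, 0.0, 0), registers)
sprite1 = evaluate_vertex((1.0, 2.0, 0.0, 0.0, 0.0, 1), registers)
```

Here `sprite0` keeps its position while `sprite1` is shifted by 10 units on the x axis, even though both vertices would travel through the same draw call.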

 

Yay! We just batched our draw calls and the engine will run a lot faster for sprites with the same texture.
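As a rough illustration of the saving, this Python sketch splits a list of same-texture sprites into batches of 25 and counts the resulting draw calls. The names `chunk` and `draw_calls` are hypothetical helpers, and the numbers are the ones used earlier in this article (2000 sprites, 25 sprites per batch).

```python
import math


def chunk(sprites, batch_size):
    """Yield consecutive slices of at most batch_size sprites."""
    for start in range(0, len(sprites), batch_size):
        yield sprites[start:start + batch_size]


def draw_calls(sprite_count, batch_size):
    """Number of draw calls needed when batch_size sprites share one call."""
    return math.ceil(sprite_count / batch_size)


sprites = list(range(2000))  # stand-ins for 2000 sprites with the same texture
batches = list(chunk(sprites, 25))

unbatched_calls = draw_calls(len(sprites), 1)   # one call per sprite
batched_calls = draw_calls(len(sprites), 25)    # 25 sprites per call
```

For 2000 sprites this drops from 2000 draw calls to 80, assuming every sprite uses the same texture.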

But there is more… Right now, we can only batch sprites that share the same texture. Wouldn’t it be great if we could batch everything? There is a technique called a texture atlas. The idea is pretty simple as well: instead of using different textures, you “bake” every texture used in your game into a single big texture like this: Pocket God Texture Atlas. All you have to do then is adjust the UV coordinates of your sprites so that they point at the original texture’s region within the big one. Generating a texture atlas at runtime and adjusting the UV coords is in fact a bit more work…
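The UV adjustment itself is a small piece of arithmetic. A minimal Python sketch, assuming each original texture occupies an axis-aligned pixel rectangle inside the atlas; the function name `atlas_uv` and the example region are made up for illustration.

```python
def atlas_uv(u, v, region, atlas_width, atlas_height):
    """Map a sprite-local UV (0..1) to a UV inside the atlas.

    region is (x, y, width, height) in atlas pixels (assumed layout).
    """
    x, y, width, height = region
    return ((x + u * width) / atlas_width, (y + v * height) / atlas_height)


# Hypothetical 128x64 sub-image placed at (256, 512) in a 1024x1024 atlas.
region = (256, 512, 128, 64)
top_left = atlas_uv(0.0, 0.0, region, 1024, 1024)
bottom_right = atlas_uv(1.0, 1.0, region, 1024, 1024)
```

The four corner UVs of a sprite are remapped this way once, when the sprite is assigned its region in the atlas.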

Have fun exploring the GPU ;)

 
