GPU Profiling


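When it comes to optimization, the first thing to do is profiling: without profiling, nobody knows where the bottleneck is, and proper profiling gives you a complete picture of the game's overall performance. CPU profiling is very mature and well served by tools. Even without those tools, a developer can easily write a profiling system. The simplest approach is to call QueryPerformanceCounter at the start of a function, record the time, call QueryPerformanceCounter again at the end of the function, and subtract the two values to get the function's execution time (in ticks, which QueryPerformanceFrequency converts to seconds). Performance analysis tools such as VTune can identify the functions that take the most execution time, and developers can then optimize the most expensive functions one by one. The most expensive function is the bottleneck, and until the bottleneck is optimized away, other optimizations contribute little.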
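As a minimal sketch of the start/stop timing described above: the code below uses `std::chrono::steady_clock` in place of QueryPerformanceCounter so that it compiles on any platform. The names `CpuTimer`, `simulate_work_ms`, and `time_simulated_work_ms` are hypothetical and exist only for this illustration.

```cpp
#include <chrono>

using Clock = std::chrono::steady_clock;

// Hypothetical helper: records a start time on construction and reports
// the elapsed wall-clock time in milliseconds on request.
// steady_clock is swapped in for QueryPerformanceCounter (an assumption
// made for portability); the start/stop/subtract pattern is the same.
class CpuTimer {
public:
    CpuTimer() : start_(Clock::now()) {}

    double elapsed_ms() const {
        return std::chrono::duration<double, std::milli>(Clock::now() - start_).count();
    }

private:
    Clock::time_point start_;
};

// Hypothetical workload: spins until at least `ms` milliseconds have passed.
void simulate_work_ms(double ms) {
    CpuTimer spin;
    while (spin.elapsed_ms() < ms) {
        // busy-wait
    }
}

// Times one call of the workload: start the timer, run the function,
// read the elapsed time.
double time_simulated_work_ms(double ms) {
    CpuTimer timer;
    simulate_work_ms(ms);
    return timer.elapsed_ms();
}
```

Calling `time_simulated_work_ms(2.0)` returns a value of at least 2 ms. In a real engine the timed call would be an actual function such as a game-logic update, and the result would be accumulated per function and per frame.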

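On the GPU side, the well-known tools NVPerfHUD, GPA, and NSight can all help developers analyze GPU performance costs. However, PerfHUD often stops working for no apparent reason, has to be attached to the application, and does not work on AMD cards. GPA can capture on most graphics cards, but you have to open the captured data before you can see the performance costs, and it is often inaccurate and slow. NSight is not much better. For a game developer, nothing beats a GPU profiler embedded in the game engine: no third-party tool is needed, and as soon as the game is running you can quickly see the cost of each part of the frame. A figure in a GPU Pro 3 article on Crysis mentions their embedded GPU profiler, and a web search turns up developers who have implemented similar functionality with queries under DX11. Queries have existed since DX9 and are supported by essentially all hardware vendors, so a GPU profiler can also be built quite easily with queries under DX9. The article reposted below describes in detail how to implement a GPU profiler under DX11; for DX9, refer to the DXSDK documentation on queries.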


http://www.reedbeta.com/blog/2011/10/12/gpu-profiling-101/

As you can see, even though Idyll is at a very, very early stage (it has no textures, only ambient and directional lighting), it still has a fairly complete performance measurement system. I chose to implement this early on in development because it’s my belief that although it can be too early to optimize, it’s never too early to profile.

Even at the very beginning of development, I want to know that the performance numbers I’m seeing are reasonable. I don’t need to worry about the details—I’m not going to worry that 0.47 ms is too long to spend drawing a 2700-triangle, untextured city—but I do want to know the numbers are at about the right order of magnitude. To put it another way, if I were spending 4.7 ms drawing a 2700-triangle, untextured city, then I’d be wondering what was going on! More than likely, it would be because I was doing something dumb that was forcing the driver or GPU into a slow mode of execution. This kind of bug can be hard to spot because the rendered frame is still visually correct, so you can’t see it (unless it causes you to drop from 60 to 30 Hz or something like that). But if you’re measuring your performance and you at least sanity-check your numbers from time to time, you’ll notice something’s wrong.

The other reason to implement GPU profiling early is that when it comes time to do your performance optimizations for real, you’ll have more trust in your profiling system. You’ll have ironed out most of the bugs, and you’ll have a feel for how much noise there is in the data and what sorts of artifacts there may be.

So now that you know why you should do GPU profiling ;), how do you actually do it? In this article, I’ll show you how to set it up in Direct3D 11. Similar commands should be available in any modern graphics API, although the details may vary.

First, let’s talk about how not to do GPU profiling. Do not measure performance in terms of framerate! This is something many beginners do, but it’s misleading at best, since framerate isn’t a linear measure of performance. People will make statements like “turning on XYZ dropped my framerate by 10 fps”, but this is meaningless, since a drop from 100 to 90 fps is a very different thing than a drop from 30 to 20 fps. You could clarify by reporting the fps you started from, but why bother? Just report performance results using units of real time. Milliseconds are fine, although you can go one step further and express everything in fractions of your frame budget. For instance, I’m currently targeting 60 Hz, so I have 16.67 ms to render a frame. Instead of saying that my objects rendered in 0.47 ms, I could make Idyll report that they rendered in 2.8% of a frame.

Another caveat that should be obvious: you can’t measure GPU performance by timing on the CPU. For instance, calling QueryPerformanceCounter before and after a draw call won’t measure how long the GPU took to draw your objects, only how long the graphics driver took to queue up that call in the various data structures it has under the hood. That might be useful information to have in general, but it’s not GPU profiling.
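To make the caveat concrete, here is a toy model, not a description of how any real driver works. `ToyCommandQueue` is a hypothetical stand-in for the driver's command buffer: `submit` only records the work, and `flush` runs it later. A real GPU executes asynchronously on separate hardware, which this single-threaded sketch does not attempt to reproduce. It only shows why a CPU timer wrapped around the submission sees the queueing cost rather than the execution cost.

```cpp
#include <chrono>
#include <functional>
#include <utility>
#include <vector>

using Clock = std::chrono::steady_clock;

static double ms_since(Clock::time_point start) {
    return std::chrono::duration<double, std::milli>(Clock::now() - start).count();
}

// Hypothetical workload: spins for at least `ms` milliseconds.
static void busy_wait_ms(double ms) {
    const auto start = Clock::now();
    while (ms_since(start) < ms) {
        // busy-wait
    }
}

// Toy stand-in for a driver command buffer (an assumption for this
// illustration): submit() queues a command, flush() executes all of them.
class ToyCommandQueue {
public:
    void submit(std::function<void()> cmd) { pending_.push_back(std::move(cmd)); }

    void flush() {
        for (auto& cmd : pending_) cmd();
        pending_.clear();
    }

private:
    std::vector<std::function<void()>> pending_;
};

struct Timings {
    double submit_ms;   // what a CPU timer around the "draw call" sees
    double execute_ms;  // how long the queued work actually takes
};

Timings measure_toy_draw() {
    ToyCommandQueue queue;

    const auto t0 = Clock::now();
    queue.submit([] { busy_wait_ms(5.0); });  // returns as soon as the command is queued
    const double submit_ms = ms_since(t0);

    const auto t1 = Clock::now();
    queue.flush();                            // the work runs here, not in submit()
    const double execute_ms = ms_since(t1);

    return {submit_ms, execute_ms};
}
```

In this model `submit_ms` is typically a tiny fraction of `execute_ms`, even though both refer to the same "draw". Timing a real draw call on the CPU has the same blind spot.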

Placing Queries

The tools we’ll use to get profiling data out of the GPU are ID3D11Query objects. As
