Measuring GPU command execution time with glBeginQuery/glEndQuery/glQueryCounter

荆楚闲人

Published 2023-05-02 09:50:45

Original link: https://zhuanlan.zhihu.com/p/165497334

Timer Queries

One further query type that you can use to judge how long rendering is taking is the timer query. Timer queries are made by passing the GL_TIME_ELAPSED query type as the target parameter of glBeginQuery() and glEndQuery(). When you call glGetQueryObjectuiv() to get the result from the query object, the value is the number of nanoseconds that elapsed between when OpenGL executes your calls to glBeginQuery() and glEndQuery(). This is actually the amount of time it took OpenGL to process all the commands between the glBeginQuery() and glEndQuery() commands. You can use this, for example, to identify the most expensive part of your scene. Consider the code shown in Listing 12.8.

// Declare our variables
GLuint queries[3]; // Three query objects that we'll use
GLuint world_time; // Time taken to draw the world
GLuint objects_time; // Time taken to draw objects in the world
GLuint HUD_time; // Time to draw the HUD and other UI elements
// Create three query objects
glGenQueries(3, queries);
// Start the first query
glBeginQuery(GL_TIME_ELAPSED, queries[0]);
// Render the world
RenderWorld();
// Stop the first query and start the second...
// Note, we're not reading the value from the query yet
glEndQuery(GL_TIME_ELAPSED);
glBeginQuery(GL_TIME_ELAPSED, queries[1]);
// Render the objects in the world
RenderObjects();
// Stop the second query and start the third
glEndQuery(GL_TIME_ELAPSED);
glBeginQuery(GL_TIME_ELAPSED, queries[2]);
// Render the HUD
RenderHUD();
// Stop the last query
glEndQuery(GL_TIME_ELAPSED);
// Now we can retrieve the results from the three queries
glGetQueryObjectuiv(queries[0], GL_QUERY_RESULT, &world_time);
glGetQueryObjectuiv(queries[1], GL_QUERY_RESULT, &objects_time);
glGetQueryObjectuiv(queries[2], GL_QUERY_RESULT, &HUD_time);
// Done. world_time, objects_time, and HUD_time contain the values we want.
// Clean up after ourselves.
glDeleteQueries(3, queries);

Listing 12.8: Timing operations using timer queries
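
Once the three results have been read back, they are easier to interpret as milliseconds and as shares of the measured GPU time. The following is a minimal sketch in plain C, assuming the three values come from Listing 12.8. The struct SectionTimes and the helpers ns_to_ms() and share_of_total() are hypothetical names introduced for illustration; they are not part of OpenGL.

```c
#include <stdint.h>

/* Hypothetical container for the three results from Listing 12.8,
   each in nanoseconds. */
typedef struct {
    uint32_t world_ns;
    uint32_t objects_ns;
    uint32_t hud_ns;
} SectionTimes;

/* Convert a nanosecond count to milliseconds. */
static double ns_to_ms(uint64_t ns)
{
    return (double)ns / 1.0e6;
}

/* Fraction (0.0 to 1.0) of the total measured time spent in one section.
   Returns 0.0 when nothing was measured, to avoid dividing by zero. */
static double share_of_total(const SectionTimes *t, uint32_t section_ns)
{
    /* Sum in 64 bits so that three large 32-bit values cannot overflow. */
    uint64_t total = (uint64_t)t->world_ns + t->objects_ns + t->hud_ns;
    return total == 0 ? 0.0 : (double)section_ns / (double)total;
}
```

For example, with world, objects, and HUD times of 6 ms, 3 ms, and 1 ms, share_of_total() reports that the world takes about 60% of the measured time, which suggests where optimization effort is best spent.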

After this code is executed, world_time, objects_time, and HUD_time will contain the number of nanoseconds it took to render the world, all the objects in the world, and the heads-up display (HUD), respectively. You can use this to determine what fraction of the graphics hardware’s time is taken up rendering each of the elements of your scene. This is useful for profiling your code during development—you can figure out what the most expensive parts of your application are and determine where to spend your optimization effort. You can also use this technique during runtime to alter the behavior of your application and get the best possible performance out of the graphics subsystem. For example, you could increase or reduce the number of objects in the scene depending on the relative value of objects_time. You could also dynamically switch between more or less complex shaders for elements of the scene based on the power of the graphics hardware. If you just want to know how much time passes, according to OpenGL, between two actions that your program takes, you can use glQueryCounter(), whose prototype is

void glQueryCounter(GLuint id, GLenum target);

You need to set id to the name of a query object that you've created earlier and target to GL_TIMESTAMP. This function puts the query straight into the OpenGL pipeline, and when that query reaches the end of the pipeline, OpenGL records its view of the current time into the query object. Time 0 is not really defined; it just indicates some unspecified time in the past. To use this information effectively, your application needs to take deltas between multiple timestamps. To implement the previous example using glQueryCounter(), we could write code as shown in Listing 12.9.

// Declare our variables
GLuint queries[4]; // Now we need four query objects
// Timestamps are absolute nanosecond counts that can exceed 32 bits,
// so they are read back as 64-bit values here
GLuint64 start_time; // The time before any rendering starts
GLuint64 world_time; // Time taken to draw the world
GLuint64 objects_time; // Time taken to draw objects in the world
GLuint64 HUD_time; // Time to draw the HUD and other UI elements
// Create four query objects
glGenQueries(4, queries);
// Get the start time (id is the query object, target is GL_TIMESTAMP)
glQueryCounter(queries[0], GL_TIMESTAMP);
// Render the world
RenderWorld();
// Get the time after RenderWorld is done
glQueryCounter(queries[1], GL_TIMESTAMP);
// Render the objects in the world
RenderObjects();
// Get the time after RenderObjects is done
glQueryCounter(queries[2], GL_TIMESTAMP);
// Render the HUD
RenderHUD();
// Get the time after everything is done
glQueryCounter(queries[3], GL_TIMESTAMP);
// Get the results from the four queries, and subtract them to find deltas
glGetQueryObjectui64v(queries[0], GL_QUERY_RESULT, &start_time);
glGetQueryObjectui64v(queries[1], GL_QUERY_RESULT, &world_time);
glGetQueryObjectui64v(queries[2], GL_QUERY_RESULT, &objects_time);
glGetQueryObjectui64v(queries[3], GL_QUERY_RESULT, &HUD_time);
// Subtract from last to first so each value is still a raw timestamp when used
HUD_time -= objects_time;
objects_time -= world_time;
world_time -= start_time;
// Done. world_time, objects_time, and HUD_time contain the values we want.
// Clean up after ourselves.
glDeleteQueries(4, queries);

Listing 12.9: Timing operations using glQueryCounter()

As you can see, the code in this example is not dramatically different from that in Listing 12.8. You need to create four query objects instead of three, and you need to subtract out the results at the end to find the time deltas. However, you don’t need to call glBeginQuery() and glEndQuery() in pairs, which means that there are fewer calls to OpenGL in total. The results of the two samples aren’t quite equivalent, however. When you issue a GL_TIMESTAMP query, the time is written when the query reaches the end of the OpenGL pipeline. However, when you issue a GL_TIME_ELAPSED query, internally OpenGL will take a timestamp when glBeginQuery() reaches the start of the pipeline and again when glEndQuery() reaches the end of the pipeline, and then subtract the two. Clearly, the results won’t be quite the same. Nevertheless, so long as you are consistent in which method you use, your results should still be meaningful.
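
The subtraction step at the end of Listing 12.9 can also be written as a small helper that leaves the raw timestamps untouched. This is a minimal sketch in plain C, assuming the timestamps have already been read back as 64-bit nanosecond values in the order they were issued. The function name timestamp_deltas() is hypothetical and not part of OpenGL.

```c
#include <stdint.h>

/* Given n timestamps in nanoseconds, in the order they were issued,
   write the n - 1 intervals between consecutive timestamps to deltas_ns.
   deltas_ns must have room for at least n - 1 entries. */
static void timestamp_deltas(const uint64_t *stamps_ns, int n, uint64_t *deltas_ns)
{
    for (int i = 1; i < n; ++i) {
        deltas_ns[i - 1] = stamps_ns[i] - stamps_ns[i - 1];
    }
}
```

With the four timestamps from Listing 12.9 as input, the three outputs correspond to the world, objects, and HUD intervals. Keeping the original timestamps makes it possible to compute other intervals later, such as the total time from the first timestamp to the last.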

One important thing to note about the results of timer queries is that, because they are measured in nanoseconds, their values can get very large in a small amount of time. A single unsigned 32-bit value can hold only a little over 4 seconds' worth of nanoseconds. If you expect to time operations that take longer than this (ideally over the course of many frames!), you might want to consider retrieving the full 64-bit results that query objects keep internally. To do this, call

void glGetQueryObjectui64v(GLuint id, GLenum pname, GLuint64 *params);
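
The 4-second limit follows directly from the range of a 32-bit unsigned integer: 2^32 nanoseconds is roughly 4.29 seconds. The following is a minimal sketch of that arithmetic in plain C; the helper names max_seconds_u32() and fits_in_u32() are hypothetical and introduced only for illustration.

```c
#include <stdint.h>

/* Longest interval, in seconds, that an unsigned 32-bit
   nanosecond counter can hold (about 4.29 seconds). */
static double max_seconds_u32(void)
{
    return (double)UINT32_MAX / 1.0e9;
}

/* Returns 1 if a nanosecond count fits in 32 bits, 0 otherwise. */
static int fits_in_u32(uint64_t ns)
{
    return ns <= UINT32_MAX;
}
```

A 4-second interval (4,000,000,000 ns) still fits, while a 5-second interval does not, so a 64-bit variable is the safer choice whenever the measured span might be long.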

Just as with glGetQueryObjectuiv(), id is the name of the query object whose value you want to retrieve and pname can be GL_QUERY_RESULT or GL_QUERY_RESULT_AVAILABLE to retrieve the result of the query or just an indication of whether it's available, respectively. Finally, although not technically a query, you can get an instantaneous, synchronous timestamp from OpenGL by calling

GLint64 t;
glGetInteger64v(GL_TIMESTAMP, &t);

After this code has executed, t will contain the current time as OpenGL sees it. If you take this timestamp and then immediately launch a timestamp query, you can retrieve the result of the timestamp query and subtract t from it; the result will be the amount of time that it took the query to reach the end of the pipeline. This is known as the latency of the pipeline and is approximately equal to the amount of time that will pass between your application issuing a command and OpenGL fully executing it.
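
The latency calculation described above is a single subtraction, but the two values have different signedness: glGetInteger64v() writes a signed 64-bit integer, while a timestamp query read with glGetQueryObjectui64v() yields an unsigned one. The following is a minimal sketch in plain C, assuming both values have already been retrieved. The function name pipeline_latency_ns() is hypothetical and not part of OpenGL.

```c
#include <stdint.h>

/* cpu_side_ns: value written by glGetInteger64v(GL_TIMESTAMP, ...),
                a signed 64-bit nanosecond count.
   query_ns:    result of a GL_TIMESTAMP query issued immediately
                afterwards, an unsigned 64-bit nanosecond count.
   Returns the approximate pipeline latency in nanoseconds. */
static int64_t pipeline_latency_ns(int64_t cpu_side_ns, uint64_t query_ns)
{
    return (int64_t)query_ns - cpu_side_ns;
}
```

Returning a signed value keeps an unexpected ordering of the two timestamps visible as a negative number instead of wrapping around to a huge unsigned one.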

Reposted from: OpenGL-Timer Queries - Zhihu (zhihu.com)