Optimizing HLSL Shaders (Minimum Precision) and Debugging DirectX Graphics (Part 2)

///

Graphics Frame Analysis

Use Graphics Frame Analysis in Visual Studio Graphics Analyzer to analyze and optimize the rendering performance of your Direct3D game or app.

 Important

Graphics Analyzer supports Frame Analysis for apps that use Direct3D 11 on supported platforms including Windows 10. Frame Analysis is not currently supported for apps that use Direct3D 12.

Frame analysis

Frame analysis uses the same information that's captured in a graphics log file for diagnostic purposes, but uses it to summarize rendering performance instead. Performance information is not recorded to the log during capture; instead the performance information is generated later, during frame analysis, by timing events and collecting statistics as the frame is played back. This approach has several advantages over recording performance information during capture:

  • Frame analysis can average results from multiple playbacks of the same frame to ensure that the performance summary is statistically sound.

  • Frame analysis can generate performance information for hardware configurations and devices other than the one where the information was captured.

  • Frame analysis can generate new performance summaries from previously captured information—for example, when GPU drivers are optimized or expose additional debugging features.

    In addition to these advantages, frame analysis can also make changes to how the frame is rendered during playback so that it can present information about how those changes might impact the rendering performance of an app. You can use this information to decide among potential optimization strategies without having to implement them all and then capture and compare all of the results yourself.

    Although frame analysis is primarily intended to help you achieve faster rendering performance, it can equally help you achieve better visual quality for a given performance target or reduce GPU power consumption.

    To see a demonstration of what Frame Analysis can do for your app, you can watch the Visual Studio Graphics Frame Analysis video on Channel 9.

Using Frame Analysis

Before you can use Frame Analysis, you have to capture graphics information from your app as it runs, just as you would when you use any of the other Graphics Analyzer tools. Then, in the graphics log document (.vsglog) window, choose the Frame Analysis tab.

After the analysis is complete, the results are displayed. The top part of the frame analysis tab displays the timeline and summary table. The bottom part displays the details tables. If errors or warnings were generated during playback, they are summarized above the timeline; from there, you can follow the links to learn more about the errors and warnings.

Interpreting results

By interpreting the results of each variant, you can infer useful information about your app's rendering performance and behavior. For more information about rendering variants, see Variants later in this article.

Some results directly indicate how the variant affects rendering performance:

  • If the Bilinear Texture Filtering variant showed performance gains, then using bilinear texture filtering in your app will show similar performance gains.

  • If the 1x1 Viewport variant showed performance gains, then reducing the size of the render targets in your app will improve its rendering performance.

  • If the BC Texture Compression variant showed performance gains, then using BC texture compression in your app will show similar performance gains.

  • If the 2xMSAA variant has almost the same performance as the 0xMSAA variant, you can enable 2xMSAA in your app to improve its rendering quality without cost in performance.

    Other results might suggest deeper, more subtle implications for your app's performance:

  • If the 1x1 Viewport variant shows very large performance gains, your app is probably consuming more fillrate than is available. If this variant shows no performance gains, the app is probably processing too many vertices.

  • If the 16bpp Render Target Format variant shows significant performance gains, your app is probably consuming too much memory bandwidth.

  • If the Half/Quarter Texture Dimensions variant shows significant performance gains, your textures probably occupy too much memory, consume too much bandwidth, or use the texture cache inefficiently. If this variant shows no change in performance, you can probably use larger, more-detailed textures without paying a performance cost.

    When hardware counters are available, you can use them to gather very detailed information about why your app's rendering performance might be suffering. All feature-level 9.2 and higher devices support depth occlusion queries (pixels occluded counter) and timestamps. Other hardware counters may be available, depending on whether the GPU manufacturer has implemented hardware counters and exposed them in its driver. You can use these counters to confirm the precise cause of the results shown in the summary table—for example, you can determine whether overdraw is a factor by examining the percentage of pixels that were occluded by the depth test.

Timeline and Summary Table

By default, the Timeline and Summary Table are displayed and the other sections are collapsed.

Timeline

The timeline shows an overview of draw-call timings relative to one another. Because larger bars correspond to longer draw times, you can use it to quickly locate the most expensive draw calls in the frame. When the captured frame contains a very large number of draw calls, multiple draw calls are combined into one bar whose length is the sum of those draw calls.

You can rest the pointer on a bar to see which draw-call event the bar corresponds to. Selecting the bar causes the event list to synchronize to that event.

Table

The table of numbers below the timeline shows the relative performance of each rendering variant for each draw call with respect to your app's default rendering. Each column displays a different rendering variant and each row represents a different draw call that's identified in the left-most column; from here you can follow a link to the event in the Graphics Event List window.

The second left-most column in the Summary Table displays your app's baseline rendering time—that is, the length of time it takes for your app's default rendering to complete the draw call. The remaining columns show the relative performance of each rendering variant as a percentage of the Baseline so that it's easier to see whether performance is improved. Percentages larger than 100 percent took longer than the Baseline—that is, performance went down—and percentages smaller than 100 percent took less time—performance went up.

The values of both the absolute timing of the Baseline and the relative timing of the rendering variants are actually mean averages of multiple runs—5 by default. This averaging helps ensure that timing data is reliable and consistent. You can rest the pointer on each cell in the table to examine the minimum, maximum, mean, and median timing values that were observed when results for that draw call and rendering variant were generated. The Baseline timing is also displayed.

"Hot" draw calls

To bring attention to draw calls that consume a greater proportion of overall rendering time, or that might be unusually slow for avoidable reasons, Frame Analysis shades a row red when that draw call's Baseline timing is more than one standard deviation longer than the mean Baseline timing of all draw calls in the frame; these are the "hot" draw calls.

Statistical significance

To bring attention to rendering variations that have the highest relevance, Frame Analysis determines the statistical significance of each rendering variant and displays the significant ones as boldface. It displays the ones that improve performance as green and the ones that reduce performance as red. It displays results that are not statistically significant as normal type.

To determine statistical significance, Frame Analysis uses the Student's t-test.

Details table

Below the Summary table is the Details table, which is collapsed by default. The content of the Details table depends on the hardware platform of the playback machine. For information about supported hardware platforms, see Hardware support.

Platforms that do not support hardware counters

Most platforms don't fully support hardware GPU counters—these include all GPUs currently offered by Intel, AMD, and nVidia. When there are no hardware counters to collect, only one Details table is displayed and it contains the mean absolute timing of all variants.

Platforms that support hardware counters

For platforms that support hardware GPU counters—for example, the nVidia T40 SOC and all Qualcomm SOCs—several Details tables are displayed, one for each variant. Every available hardware counter is collected for each rendering variant and displayed in its own Details table.

The hardware counter information provides a very detailed view of specific hardware-platform behavior for each draw call, which can help you identify the cause of performance bottlenecks very precisely.

 Note

Different hardware platforms support different counters; there is no standard. The counters and what they represent are determined solely by each GPU manufacturer.

Marker regions and events

Frame Analysis supports user-defined event markers and event groups. They are displayed in the Summary table and in the Detail tables.

You can use either the ID3DUserDefinedAnnotation APIs or the legacy D3DPERF_ family of APIs to create markers and groups. When you use the D3DPERF_ API family, you can assign each marker and group a color that Frame Analysis displays as a colored band in the rows that contain the event marker or the event group begin/end markers and their contents. This feature can help you quickly identify important rendering events or groups of events.

Warnings and errors

Frame Analysis occasionally completes with warnings or errors, which are summarized above the Timeline, and detailed at the bottom of the Frame Analysis tab.

Usually, warnings and errors are only for informational purposes and don't require any intervention.

Warnings typically indicate that hardware support is lacking but can be worked around, hardware counters cannot be collected, or certain performance data may not be reliable—for example, when a workaround adversely affects it.

Errors typically indicate that the frame analysis implementation has bugs, a driver has bugs, hardware support is lacking and can't be worked around, or the app tries something that's not supported by playback.

Retries

If the GPU undergoes a power-state transition during frame analysis, the affected analysis pass must be retried because the GPU clockspeed changed and thereby invalidated relative timing results.

Frame Analysis limits the number of retries to 10. If your platform has aggressive power management or clock-gating, it might cause Frame Analysis to fail and report an error because it has exceeded the retry limit. You might be able to mitigate this problem by resetting your platform's power management and clock speed throttling to be less aggressive, if the platform enables it.

Hardware support

Timestamps and occlusion queries

Timestamps are supported on all platforms that support Frame Analysis. Depth occlusion queries—required for the Pixels Occluded counter—are supported on platforms that support feature level 9.2 or higher.

 Note

Although timestamps are supported on all platforms that support Frame Analysis, the accuracy and consistency of timestamps varies from platform to platform.

GPU counters

Support for GPU hardware counters is hardware-dependent.

Because no computer GPU currently offered by Intel, AMD, or nVidia supports GPU hardware counters reliably, Frame Analysis doesn't collect counters from them. However, Frame Analysis does collect hardware counters from the following GPUs, which support them reliably:

  • Qualcomm SOCs (any that supports Windows Phone)

  • nVidia T40 (Tegra4).

    No other platform that supports Frame Analysis collects GPU hardware counters.

 Note

Because GPU hardware counters are hardware resources, it can take multiple passes to collect the complete set of hardware counters for each rendering variant. As a result, the order in which GPU counters are collected is unspecified.

Windows phone

Timestamps, occlusion queries, and GPU hardware counters are only supported on Windows Phone handsets that originally shipped with Windows Phone 8.1. Frame Analysis requires these in order to play back the graphics log file. Windows Phone handsets that were originally shipped with Windows Phone 8 do not support Frame Analysis, even for handsets that have been updated to Windows Phone 8.1.

Unsupported scenarios

Certain ways of using frame analysis are unsupported or are just a bad idea.

WARP

Frame analysis is intended to be used to profile and improve rendering performance on real hardware. Running frame analysis on WARP devices isn't prevented—the Windows Phone emulator runs on WARP—but it's not usually a worthwhile pursuit because WARP running on a high-end CPU is slower than even the least-capable modern GPUs, and because WARP performance can vary greatly depending on the particular CPU it's running on.

Playback of high-feature-level captures on down-level devices

In Graphics Analyzer, when you play back a graphics log file that uses a higher feature level than the playback machine supports, playback automatically falls back to WARP. Frame Analysis, by contrast, explicitly does not fall back to WARP and instead generates an error: WARP is useful for examining the correctness of your Direct3D app, but not for examining its performance.

 Note

Although it's important to keep the feature-level issues in mind, you can capture and play back graphics log files on different hardware configurations and devices. For example, you can capture graphics information on a Windows Phone and play it back on a desktop computer, and the reverse is also supported. In both cases, the graphics log can be played back as long as the log file doesn't contain APIs or use feature levels that aren't supported on the playback machine.

Direct3D 10 and lower

Frame Analysis is only supported for the Direct3D 11 API. If your app calls Direct3D 10 APIs, Frame Analysis won't recognize or profile those calls, even though they're recognized and used by other Graphics Analyzer tools. If your app uses both the Direct3D 11 and the Direct3D 10 APIs, only the Direct3D 11 calls are profiled.

 Note

This applies only to the Direct3D API calls that you're using, not feature levels. As long as you're using the Direct3D 11, Direct3D 11.1, or Direct3D 11.2 API, you can use whatever feature level you like and Frame Analysis will just work.

Variants

Each change that Frame Analysis makes to the way a frame is rendered during playback is known as a variant. The variants that Frame Analysis examines correspond to common, relatively easy changes that you could make to improve the rendering performance or visual quality of your app—for example, reducing the size of textures, using texture compression, or enabling different kinds of anti-aliasing. Variants override the usual rendering context and parameters of your app. Here's a summary:

  • 1x1 Viewport Size: Reduces the viewport dimensions on all render targets to 1x1 pixels. For more information, see 1x1 Viewport Size Variant.

  • 0x MSAA: Disables multi-sample anti-aliasing (MSAA) on all render targets. For more information, see 0x/2x/4x MSAA Variants.

  • 2x MSAA: Enables 2x multi-sample anti-aliasing (MSAA) on all render targets. For more information, see 0x/2x/4x MSAA Variants.

  • 4x MSAA: Enables 4x multi-sample anti-aliasing (MSAA) on all render targets. For more information, see 0x/2x/4x MSAA Variants.

  • Point Texture Filtering: Sets the filtering mode to D3D11_FILTER_MIN_MAG_MIP_POINT (point texture filtering) for all appropriate texture samplers. For more information, see Point, Bilinear, Trilinear, and Anisotropic Texture Filtering Variants.

  • Bilinear Texture Filtering: Sets the filtering mode to D3D11_FILTER_MIN_MAG_LINEAR_MIP_POINT (bilinear texture filtering) for all appropriate texture samplers. For more information, see Point, Bilinear, Trilinear, and Anisotropic Texture Filtering Variants.

  • Trilinear Texture Filtering: Sets the filtering mode to D3D11_FILTER_MIN_MAG_MIP_LINEAR (trilinear texture filtering) for all appropriate texture samplers. For more information, see Point, Bilinear, Trilinear, and Anisotropic Texture Filtering Variants.

  • Anisotropic Texture Filtering: Sets the filtering mode to D3D11_FILTER_ANISOTROPIC and MaxAnisotropy to 16 (16x anisotropic texture filtering) for all appropriate texture samplers. For more information, see Point, Bilinear, Trilinear, and Anisotropic Texture Filtering Variants.

  • 16bpp Render Target Format: Sets the pixel format to DXGI_FORMAT_B5G6R5_UNORM (16bpp, 565 format) for all render targets and back buffers. For more information, see 16bpp Render Target Format Variant.

  • Mip-map Generation: Enables mip-maps on all textures that are not render targets. For more information, see Mip-map Generation Variant.

  • Half Texture Dimensions: Reduces the dimensions of all textures that are not render targets to half of their original size in each dimension. For example, a 256x128 texture is reduced to 128x64 texels. For more information, see Half/Quarter Texture Dimensions Variant.

  • Quarter Texture Dimensions: Reduces the dimensions of all textures that are not render targets to a quarter of their original size in each dimension. For example, a 256x128 texture is reduced to 64x32 texels. For more information, see Half/Quarter Texture Dimensions Variant.

  • BC Texture Compression: Enables block compression on all textures that have a B8G8R8X8, B8G8R8A8, or R8G8B8A8 pixel format variant. B8G8R8X8 formats are compressed by using BC1; B8G8R8A8 and R8G8B8A8 formats are compressed by using BC3. For more information, see BC Texture Compression Variant.

The result for most variants is prescriptive: "Reducing texture size by half is 25 percent faster" or "Enabling 2x MSAA is only 2 percent slower". Other variants might require more interpretation—for example, if the variant that changes the viewport dimensions to 1x1 shows a large performance gain, it might indicate that rendering is bottlenecked by a low fill rate; alternatively, if there's no significant change in performance, it might indicate that rendering is bottlenecked by vertex processing.

//

1x1 Viewport Size Variant

Reduces the viewport dimensions on all render targets to 1x1 pixels.

Interpretation

A smaller viewport reduces the number of pixels that must be shaded, but doesn't reduce the number of vertices that must be processed. Setting the viewport dimensions to 1x1 pixels effectively eliminates pixel-shading from your app.

If this variant shows a large performance gain, it might indicate that your app consumes too much fillrate. This can indicate that the resolution you have chosen is too high for the target platform or that your app spends significant time shading pixels that are later overwritten (overdraw). This result suggests that decreasing the size of your framebuffer or reducing the amount of overdraw will improve your app's performance.

Remarks

The viewport dimensions are reset to 1x1 pixels after every call to ID3D11DeviceContext::OMSetRenderTargets or ID3D11DeviceContext::RSSetViewports.

Example

This variant can be reproduced by using code like this:

D3D11_VIEWPORT viewport;  
viewport.TopLeftX = 0;  
viewport.TopLeftY = 0;  
viewport.Width = 1;  
viewport.Height = 1;  
viewport.MinDepth = 0.0f;  
viewport.MaxDepth = 1.0f;  
d3d_context->RSSetViewports(1, &viewport);

0x/2x/4x MSAA Variants

Overrides multi-sample anti-aliasing (MSAA) settings on all render targets and swap chains.

Interpretation

Multi-sample anti-aliasing increases visual quality by taking samples at multiple locations in each pixel; greater levels of MSAA take more samples, and without MSAA, only one sample is taken from the pixel's center. Enabling MSAA in your app usually has a modest but noticeable cost in rendering performance, but under certain workloads or on certain GPUs, it can be had with almost no impact.

If your app already has MSAA enabled, then the lesser MSAA variants indicate the relative performance cost that the existing, higher-level MSAA incurs. In particular, the 0x MSAA variant indicates the relative performance of your app without MSAA.

If your app doesn't already have MSAA enabled, then the 2x MSAA and 4x MSAA variants indicate the relative performance cost of enabling them in your app. When the cost is acceptably low, consider enabling MSAA to enhance the image quality of your app.

 Note

Your hardware might not fully support MSAA for all formats. If any of these variants encounter a hardware limitation that can't be worked around, its column in the performance summary table is blank and an error message is produced.

Remarks

These variants override the sample count and sample-quality arguments on calls to ID3D11Device::CreateTexture2D that create render targets. Specifically, these parameters are overridden when:

  • The D3D11_TEXTURE2D_DESC object passed in pDesc describes a render target; that is:

    • The BindFlags member has either the D3D11_BIND_RENDER_TARGET flag or the D3D11_BIND_DEPTH_STENCIL flag set.

    • The Usage member is set to D3D11_USAGE_DEFAULT.

    • The CPUAccessFlags member is set to 0.

    • The MipLevels member is set to 1.

  • The device supports the requested sample count (1 for the 0x MSAA variant, or 2 or 4) and sample quality (0) for the requested render target format (the D3D11_TEXTURE2D_DESC::Format member), as determined by ID3D11Device::CheckMultisampleQualityLevels.

    If the D3D11_TEXTURE2D_DESC::BindFlags member has the D3D11_BIND_SHADER_RESOURCE or D3D11_BIND_UNORDERED_ACCESS flags set, then two versions of the texture are created; the first has these flags cleared for use as the render target, and the other is a non-MSAA texture that has these flags left intact to act as a resolve buffer for the first version. This is necessary because using an MSAA texture as a shader resource or for unordered access is unlikely to be valid—for example, a shader acting on it would generate incorrect results because it would expect a non-MSAA texture. If the variant has created the secondary non-MSAA texture, then whenever the MSAA render target is unset from the device context, its contents are resolved into the non-MSAA texture. Likewise, whenever the MSAA render target should be bound as a shader resource, or is used in an unordered access view, the resolved non-MSAA texture is bound instead.

    These variants also override MSAA settings on all swap chains created by using IDXGIFactory::CreateSwapChain, IDXGIFactory2::CreateSwapChainForHwnd, IDXGIFactory2::CreateSwapChainForCoreWindow, IDXGIFactory2::CreateSwapChainForComposition, and D3D11CreateDeviceAndSwapChain.

    The net effect of these changes is that all rendering is done to an MSAA render target, but if your application uses one of these render targets or swap-chain buffers as a shader resource view or unordered access view, then data is sampled from the resolved, non-MSAA copy of the render target.

Restrictions and limitations

In Direct3D11, MSAA textures are more restricted than non-MSAA textures. For example, you can't call ID3D11DeviceContext::UpdateSubresource on an MSAA texture, and calling ID3D11DeviceContext::CopySubresourceRegion fails if the sample count and sample quality of the source resource and destination resource don't match, which can occur when this variant overrides the MSAA settings of one resource but not the other.

When playback detects these kinds of conflicts, it makes a best effort to replicate the intended behavior, but it might not be possible to match its results exactly. Although it's uncommon for this to affect the performance of these variants in a way that misrepresents their impact, it is possible—for example, when flow control in a pixel shader is determined by the precise content of a texture—because the replicated texture might not have identical contents.

Example

These variants can be reproduced for render targets created by using ID3D11Device::CreateTexture2D by using code like this:

D3D11_TEXTURE2D_DESC target_description;  
  
// ... other texture description setup ...  
  
target_description.BindFlags = D3D11_BIND_RENDER_TARGET;  
target_description.SampleDesc.Count = 4; // 4x MSAA; use 2 for 2x, or 1 to disable MSAA  
target_description.SampleDesc.Quality = 0;  
d3d_device->CreateTexture2D(&target_description, nullptr, &render_target);

Example

Or for swap chains created by using IDXGIFactory::CreateSwapChain or D3D11CreateDeviceAndSwapChain by using code like this:

DXGI_SWAP_CHAIN_DESC chain_description;  
chain_description.SampleDesc.Count = 4; // 4x MSAA; use 2 for 2x, or 1 to disable MSAA  
chain_description.SampleDesc.Quality = 0;
  
// Call IDXGISwapChain::CreateSwapChain or D3D11CreateDeviceAndSwapChain, etc.

Point, Bilinear, Trilinear, and Anisotropic Texture Filtering Variants

Overrides the filtering mode on appropriate texture samplers.

Interpretation

Different methods of texture sampling have different performance costs and image quality. In order of increasing cost—and increasing visual quality—the filter modes are:

  1. Point filtering (least expensive, worst visual quality)

  2. Bilinear filtering

  3. Trilinear filtering

  4. Anisotropic filtering (most expensive, best visual quality)

    If the performance cost of each variant is significant or increases with more-intensive filtering modes, you can weigh its cost against its increased image quality. Based on your assessment, you might accept additional performance costs to increase visual quality, or you might accept decreased visual quality to achieve a higher frame-rate or to reclaim performance that you can use in other ways.

    If you find that performance cost is negligible or steady regardless of the filtering mode—for example, when the GPU that you're targeting has an abundance of shader throughput and memory bandwidth—consider using anisotropic filtering to achieve the best image quality in your app.

Remarks

These variants override the sampler states on calls to ID3D11DeviceContext::PSSetSamplers in which the application-provided sampler's filter mode is one of these:

  • D3D11_FILTER_MIN_MAG_MIP_POINT

  • D3D11_FILTER_MIN_MAG_POINT_MIP_LINEAR

  • D3D11_FILTER_MIN_POINT_MAG_LINEAR_MIP_POINT

  • D3D11_FILTER_MIN_POINT_MAG_MIP_LINEAR

  • D3D11_FILTER_MIN_LINEAR_MAG_MIP_POINT

  • D3D11_FILTER_MIN_LINEAR_MAG_POINT_MIP_LINEAR

  • D3D11_FILTER_MIN_MAG_LINEAR_MIP_POINT

  • D3D11_FILTER_MIN_MAG_MIP_LINEAR

  • D3D11_FILTER_ANISOTROPIC

    In the Point Texture Filtering variant, the application-provided filter mode is replaced with D3D11_FILTER_MIN_MAG_MIP_POINT; in the Bilinear Texture Filtering variant, it's replaced with D3D11_FILTER_MIN_MAG_LINEAR_MIP_POINT; and in the Trilinear Texture Filtering variant, it's replaced with D3D11_FILTER_MIN_MAG_MIP_LINEAR.

    In the Anisotropic Texture Filtering variant, the application-provided filter mode is replaced with D3D11_FILTER_ANISOTROPIC, and the Max Anisotropy is set to 16.

Restrictions and limitations

In Direct3D, feature level 9.1 specifies a maximum anisotropy of 2x. Because the Anisotropic Texture Filtering variant attempts to use 16x anisotropy exclusively, playback fails when frame analysis is run on a feature-level 9.1 device. Contemporary devices that are affected by this limitation include the ARM-based Surface RT and Surface 2 Windows tablets. Older GPUs that might still be found in some computers can also be affected, but they're widely considered to be obsolete and are increasingly uncommon.

Example

The Point Texture Filtering variant can be reproduced by using code like this:

D3D11_SAMPLER_DESC sampler_description;  
  
// ... other sampler description setup ...  
  
sampler_description.Filter = D3D11_FILTER_MIN_MAG_MIP_POINT;  
  
d3d_device->CreateSamplerState(&sampler_description, &sampler);  
d3d_context->PSSetSamplers(0, 1, &sampler);

Example

The Bilinear Texture Filtering variant can be reproduced by using code like this:

D3D11_SAMPLER_DESC sampler_description;  
  
// ... other sampler description setup ...  
  
sampler_description.Filter = D3D11_FILTER_MIN_MAG_LINEAR_MIP_POINT;  
  
d3d_device->CreateSamplerState(&sampler_description, &sampler);  
d3d_context->PSSetSamplers(0, 1, &sampler);

Example

The Trilinear Texture Filtering variant can be reproduced by using code like this:

D3D11_SAMPLER_DESC sampler_description;  
  
// ... other sampler description setup ...  
  
sampler_description.Filter = D3D11_FILTER_MIN_MAG_MIP_LINEAR;  
  
d3d_device->CreateSamplerState(&sampler_description, &sampler);  
d3d_context->PSSetSamplers(0, 1, &sampler);

Example

The Anisotropic Texture Filtering variant can be reproduced by using code like this:

D3D11_SAMPLER_DESC sampler_description;  
  
// ... other sampler description setup ...  
  
sampler_description.Filter = D3D11_FILTER_ANISOTROPIC;  
sampler_description.MaxAnisotropy = 16;  
  
d3d_device->CreateSamplerState(&sampler_description, &sampler);  
d3d_context->PSSetSamplers(0, 1, &sampler);

//

16bpp Render Target Format Variant

Sets the pixel format to DXGI_FORMAT_B5G6R5_UNORM for all render targets and back buffers.

Interpretation

A render target or back buffer typically uses a 32bpp (32 bits per pixel) format such as B8G8R8A8_UNORM. 32bpp formats can consume a lot of memory bandwidth. Because the B5G6R5_UNORM format is a 16bpp format that's half the size of 32bpp formats, using it can relieve pressure on memory bandwidth, but at the cost of reduced color fidelity.

If this variant shows a large performance gain, it likely indicates that your app consumes too much memory bandwidth. Performance gains can be especially pronounced when the profiled frame suffers from a significant amount of overdraw or contains a lot of alpha-blending.

If the kinds of scenes that are rendered by your app don't require high-fidelity color reproduction, don't require the render target to have an alpha channel, and don't often contain smooth gradients—which are susceptible to banding artifacts under reduced color fidelity—consider using a 16bpp render target format to reduce memory bandwidth usage.

If the scenes that are rendered in your app require high-fidelity color reproduction or an alpha channel, or smooth gradients are common, consider other strategies to reduce memory bandwidth usage—for example, reducing the amount of overdraw or alpha-blending, reducing the dimensions of the framebuffer, or modifying texture resources to consume less memory bandwidth by enabling compression or reducing their dimensions. As usual, you have to consider the image quality trade-offs that come with any of these optimizations.

If your app would benefit from switching to a 16bpp back buffer but it's a part of your swap chain, you have to take additional steps because DXGI_FORMAT_B5G6R5_UNORM is not a supported back buffer format for swap chains created by using D3D11CreateDeviceAndSwapChain or IDXGIFactory::CreateSwapChain. Instead, you have to create a B5G6R5_UNORM format render target by using CreateTexture2D and render to that instead. Then, before you call Present on your swap chain, copy the render target onto the swap-chain backbuffer by drawing a full-screen quad with the render target as your source texture. Although this is an extra step that will consume some memory bandwidth, most rendering operations will consume less bandwidth because they affect the 16bpp render target; if this saves more bandwidth than is consumed by copying the render target to the swap-chain backbuffer, then rendering performance is improved.

GPU architectures that use tiled rendering techniques can see significant performance benefits by using a 16bpp framebuffer format because a larger portion of the framebuffer can fit in each tile's local framebuffer cache. Tiled rendering architectures are sometimes found in GPUs in mobile handsets and tablet computers; they rarely appear outside of this niche.

Remarks

The render target format is reset to DXGI_FORMAT_B5G6R5_UNORM on every call to ID3D11Device::CreateTexture2D that creates a render target. Specifically, the format is overridden when the D3D11_TEXTURE2D_DESC object passed in pDesc describes a render target; that is:

  • The BindFlags member has the D3D11_BIND_RENDER_TARGET flag set.

  • The BindFlags member has the D3D11_BIND_DEPTH_STENCIL flag cleared.

  • The Usage member is set to D3D11_USAGE_DEFAULT.

Restrictions and limitations

Because the B5G6R5 format doesn't have an alpha channel, alpha content is not preserved by this variant. If your app's rendering requires an alpha channel in your render target, you can't just switch to the B5G6R5 format.

Example

The 16bpp Render Target Format variant can be reproduced for render targets created by using CreateTexture2D by using code like this:

D3D11_TEXTURE2D_DESC target_description = {}; // zero-initialize all members

// ... set Width, Height, MipLevels, ArraySize, SampleDesc, and Usage as usual ...

target_description.BindFlags = D3D11_BIND_RENDER_TARGET;
target_description.Format = DXGI_FORMAT_B5G6R5_UNORM;
d3d_device->CreateTexture2D(&target_description, nullptr, &render_target);

///

Mip-map Generation Variant

Enables mip-maps on textures that are not render targets.

Interpretation

Mip-maps are primarily used to eliminate aliasing artifacts in textures under minification by pre-calculating smaller versions of the texture. Although these additional textures consume GPU memory—about 33 percent more than the original texture—they're also more efficient because more of their surface area fits in the GPU texture cache, and their contents achieve higher cache utilization.

For 3-D scenes, we recommend mip-maps when memory is available to store the additional textures because they increase both rendering performance and image quality.

If this variant shows a significant performance gain, it indicates that you are using textures without enabling mip-maps and thereby not getting the most from the texture cache.

Remarks

Mip-map generation is forced on every call to ID3D11Device::CreateTexture2D that creates a source texture. Specifically, mip-map generation is forced when the D3D11_TEXTURE2D_DESC object passed in pDesc describes an unchanging shader resource; that is:

  • The BindFlags member has only the D3D11_BIND_SHADER_RESOURCE flag set.

  • The Usage member is set to either D3D11_USAGE_DEFAULT or D3D11_USAGE_IMMUTABLE.

  • The CPUAccessFlags member is set to 0 (no CPU access).

  • The SampleDesc member has its Count member set to 1 (no Multi-Sample Anti-Aliasing (MSAA)).

  • The MipLevels member is set to 1 (no existing mip-map).

    When initial data is supplied by the application, the texture format must support automatic mip-map generation—as determined by D3D11_FORMAT_SUPPORT_MIP_AUTOGEN—unless the format is BC1, BC2, or BC3; otherwise, the texture is not modified and no mip-maps are generated when initial data is supplied.

    If mip-maps have been automatically generated for a texture, calls to ID3D11Device::CreateShaderResourceView are modified during playback to use the mip-chain during texture sampling.

Example

The Mip-map Generation variant can be reproduced by using code like this:

D3D11_TEXTURE2D_DESC texture_description;

// ...

texture_description.MipLevels = 0; // 0 requests a full mip chain

// num_mips is the number of levels in the full chain: floor(log2(n)) + 1
std::vector<D3D11_SUBRESOURCE_DATA> initial_data(num_mips);

for (auto&& mip_level : initial_data)
{
    // fill mip_level with the application-supplied initial data
}

d3d_device->CreateTexture2D(&texture_description, initial_data.data(), &texture);

To create a texture that has a full mip-chain, set D3D11_TEXTURE2D_DESC::MipLevels to 0. The number of mip levels in a full mip-chain is floor(log2(n) + 1), where n is the largest dimension of the texture.

Remember that when you provide initial data to CreateTexture2D, you must provide a D3D11_SUBRESOURCE_DATA object for each mip level.

Note

If you want to provide your own mip level contents instead of generating them automatically, you must create your textures by using an image editor that supports mip-mapped textures and then load the file and pass the mip levels to CreateTexture2D.

///

Half/Quarter Texture Dimensions Variant

Reduces the texture dimensions on textures that are not render targets.

Interpretation

Smaller textures occupy less memory and therefore consume less memory bandwidth and reduce pressure on the GPU's texture cache. However, their lesser detail can cause reduced image quality, especially when they're viewed closely in a 3-D scene or viewed under magnification.

If this variant shows a large performance gain, it can indicate that your app consumes too much memory bandwidth, uses the texture cache inefficiently, or both. It can also indicate that your textures occupy more GPU memory than is available, which causes textures to be paged to system memory.

If your app consumes too much memory bandwidth or uses the texture cache inefficiently, consider reducing the size of your textures, but only after you consider enabling mip-maps for appropriate textures. Like smaller textures, mip-mapped textures consume less memory bandwidth—although they occupy more GPU memory—and increase cache utilization, but they don't reduce texture detail. We recommend mip-maps whenever the increased memory usage doesn't cause textures to be paged to system memory.

If your textures occupy more GPU memory than is available, consider reducing the size of the textures, but only after you consider compressing appropriate textures. Like smaller textures, compressed textures occupy less memory and reduce the need to page to system memory, but their color fidelity is reduced. Compression isn't appropriate for all textures, depending on their content—for example, those that have significant color variation in a small area—but for many textures, compression can retain better overall image quality than reducing their size.

Remarks

Texture dimensions are reduced on every call to ID3D11Device::CreateTexture2D that creates a source texture. Specifically, texture dimensions are reduced when the D3D11_TEXTURE2D_DESC object passed in pDesc describes a texture that's used in rendering; that is:

  • The BindFlags member has only the D3D11_BIND_SHADER_RESOURCE flag set.

  • The MiscFlags member does not have the D3D11_RESOURCE_MISC_TILE_POOL flag or the D3D11_RESOURCE_MISC_TILED flag set (tiled resources are not resized).

  • The texture format is supported as a render target—as determined by D3D11_FORMAT_SUPPORT_RENDER_TARGET—which is required for reducing the texture size. BC1, BC2, and BC3 formats are also supported, even though they're not supported as render targets.

    If initial data is supplied by the application, this variant scales the texture data to the appropriate size before it creates the texture. If initial data is supplied in a block-compressed format such as BC1, BC2, or BC3, it is decoded, scaled, and re-encoded before it's used to create the smaller texture. (The nature of block-based compression means that the extra decode-scale-encode process almost always causes lower image quality than when a block-compressed texture is generated from a scaled version of the texture that had not previously been encoded.)

    If mip-maps are enabled for the texture, the variant reduces the number of mip levels accordingly—one fewer when scaling to half-size or two fewer when scaling to quarter-size.

Example

This variant resizes textures at runtime before the call to CreateTexture2D. We recommend against this approach for production code because the full-size textures consume more disk space and because the additional step can increase loading times in your app—especially for compressed textures, which require significant computational resources to encode. Instead, we recommend that you resize your textures offline by using an image editor or image processor that's part of your build pipeline. These approaches reduce disk-space requirements and eliminate runtime overhead in your app, and afford more processing time so that you can retain the best image quality while shrinking or compressing your textures.

BC Texture Compression Variant

Enables block compression on textures that have a pixel format that's a variation of B8G8R8X8, B8G8R8A8, or R8G8B8A8.

Interpretation

Block-based compression formats like BC1, BC2, and BC3 occupy significantly less memory than uncompressed image formats and therefore consume significantly less memory bandwidth. Compared to an uncompressed format that uses 32 bits per pixel, BC1 (formerly known as DXT1) achieves 8:1 compression and BC3 (formerly known as DXT5) achieves 4:1. The difference between BC1 and BC3 is that BC1 doesn't support an alpha channel, while BC3 supports a block-compressed alpha channel. Despite the high compression ratios, there's only a minor reduction in image quality for typical textures. However, block compression of certain kinds of textures—for example, those that have significant color variation in a small area—can have unacceptable results.

If your textures are suitable for block-based compression and don't need perfect color fidelity, consider using a block-compressed format to reduce memory usage and consume less bandwidth.

Remarks

Textures are compressed by using a block-based compression format on every call to ID3D11Device::CreateTexture2D that creates a source texture. Specifically, textures are compressed when:

  • The D3D11_TEXTURE2D_DESC object passed in pDesc describes an unchanging shader resource; that is:

    • The BindFlags member has only the D3D11_BIND_SHADER_RESOURCE flag set.

    • The Usage member is set to either D3D11_USAGE_DEFAULT or D3D11_USAGE_IMMUTABLE.

    • The CPUAccessFlags member is set to 0 (no CPU access).

    • The SampleDesc member has its Count member set to 1 (no Multi-Sample Anti-Aliasing (MSAA)).

  • Initial data is provided to the call to CreateTexture2D.

    Here are the supported source formats and their block-compressed formats.

Original format (from)              Compressed format (to)
DXGI_FORMAT_B8G8R8X8_UNORM          BC1 (formerly DXT1)
DXGI_FORMAT_B8G8R8X8_UNORM_SRGB     BC1
DXGI_FORMAT_B8G8R8X8_TYPELESS       BC1
DXGI_FORMAT_B8G8R8A8_UNORM          BC3 (formerly DXT5)
DXGI_FORMAT_B8G8R8A8_UNORM_SRGB     BC3
DXGI_FORMAT_B8G8R8A8_TYPELESS       BC3
DXGI_FORMAT_R8G8B8A8_UNORM          BC3
DXGI_FORMAT_R8G8B8A8_UNORM_SRGB     BC3
DXGI_FORMAT_R8G8B8A8_TYPELESS       BC3

If your texture has a format that's not listed, the texture is not modified.

Restrictions and limitations

Sometimes textures that are created with a variation of the B8G8R8A8 or R8G8B8A8 image formats don't actually use the alpha channel, but there's no way for the variant to know whether it's used or not. To maintain correctness in case the alpha channel is used, the variant always encodes these formats into the less-efficient BC3 format. You can help Graphics Frame Analysis better understand your app's potential rendering performance with this variant by using a variation of the B8G8R8X8 image format when you are not using the alpha channel so that the variant can use the more-efficient BC1 format.

Example

This variant block-compresses textures at run time, before the call to CreateTexture2D. We recommend against this approach for production code because the uncompressed textures consume more disk space, and because the additional step can significantly increase loading times in your app, since block-based compression requires significant computational resources to encode. Instead, we recommend that you compress your textures offline by using an image editor or image processor that's part of your build pipeline. These approaches reduce disk-space requirements, eliminate run-time overhead in your app, and afford more processing time so that you can retain the best image quality.

///

Graphics Event List

Use the Graphics Event List in Visual Studio Graphics Analyzer to explore the Direct3D events that were recorded while rendering a frame of your game or app.

This is the Event List:

Using the event list

When you select an event in the event list, it's reflected in the information that's displayed by other Graphics Analysis tools; by using the event list in concert with these other tools you can examine a rendering problem in detail to determine its cause. To learn more about how you can solve rendering problems by using the event list together with other Graphics Analysis tools, see Examples.

Using the features of the event list effectively is important for getting around complex frames that might contain thousands of events. To use the event list effectively, choose the view that works best for you, use search to filter the event list, follow links to learn more about the Direct3D objects that are associated with an event, and use the arrow buttons to move between draw calls quickly.

Color-coded events in Direct3D 12

Direct3D 12 exposes multiple queues that correspond to different hardware functionality. To help identify the queue that's associated with a particular graphics event in Direct3D 12, events are color-coded in the Event List according to their queue when you're working with a capture of a Direct3D 12 app.

Direct3D 12 queue    Color
Render queue         Green
Compute queue        Yellow
Copy queue           Orange

Direct3D 11 doesn't expose multiple queues, so events aren't color-coded in the Event List when you're working with a capture of a Direct3D 11 app.

Event list views

The event list supports two different views that organize graphics events in different ways to support your workflow and preferences. The first is the Draw Calls view, which organizes events and their associated state hierarchically. The second is the Timeline view, which organizes events chronologically, in a flat list.

The Draw Calls view
Displays captured events and their state in a hierarchy. The top level of the hierarchy is made up of events such as draw calls, clears, present, and those dealing with views. In the event list, you can expand draw calls to display the device state that was current at the time of the draw call; and you can further expand each kind of state to display the events that set their values. At this level, you can also see whether a particular state was set in a previous frame, or if it has been set more than once since the last draw call.

The Timeline view
Displays each captured event in chronological order. This way of organizing the event list is the same as in previous versions of Visual Studio.

To change the event list view mode

  • In the Graphics Event List window, above the list of events, locate the View dropdown and choose either the Timeline view or the Draw calls view.

Filtering events

You can use the Search box—located in the upper-right corner of the Graphics Event List window—to filter the events list to include only events whose names contain specific keywords. You can specify single keywords like Vertex—as shown in the previous illustration—or multiple keywords by using a semicolon-delimited list like Draw;Primitive—which matches events that have either Draw or Primitive in their names. Searches are sensitive to whitespace—for example, VSSet and VS Set are different searches—so make sure to form searches carefully.

Moving between draw calls

Because examining Draw calls is especially important, you can use the Go to the next draw call and Go to the previous draw call buttons—located in the upper-left corner of the Graphics Event List window—to find and move between draw calls quickly.

To understand certain graphics events, you might need additional information about the current state of Direct3D or about Direct3D objects that are referenced by the event. Many events provide links to this information that you can follow for more detail.

Kinds of events and event markers

The events that are displayed in the event list are organized into four categories: general events, draw events, user-defined event groups, and user-defined event markers. Except for general events, each event is displayed together with an icon that indicates the category that it belongs to.

Icon          Event description
(no icon)     General event: any event that is not a user-defined event, user-defined event group, or draw event.
(icon)        Draw event: marks a draw event that occurred during the captured frame.
(icon)        User-defined event group: groups related events, as defined by the app.
(icon)        User-defined event marker: marks a specific location, as defined by the app.

Marking user-defined events in your app

User-defined events are specific to your app. You can use them to correlate significant events that occur in your app with events in the Graphics Event List. For example, you can create user-defined event groups to organize related events—such as those that render your user interface—into groups or hierarchies so that you can browse the event list more easily, or you can create markers when certain kinds of objects are drawn so that you can easily find their graphics events in the event list.

To create groups and markers in your app, you use the same APIs that Direct3D provides for use by other Direct3D debugging tools. These APIs sometimes change between versions of Direct3D, but the basic functionality is the same.

User-defined events in Direct3D 12

To create groups and markers in Direct3D 12, use the APIs described in this section. The table below summarizes the APIs that you can use depending on whether you are marking events in a command queue or command list.

API description                          ID3D12CommandQueue    ID3D12GraphicsCommandList
Check user-defined event availability    PIXGetStatus          PIXGetStatus
Begin an event group                     PIXBeginEvent         PIXBeginEvent
End an event group                       PIXEndEvent           PIXEndEvent
Create an event marker                   PIXSetMarker          PIXSetMarker

User-defined events in Direct3D 11 and earlier

To create groups and markers in Direct3D 11 or earlier, use the APIs described in this section. The table below summarizes the APIs that you can use for different versions of Direct3D 11 and earlier versions of Direct3D.

API description           ID3D11DeviceContext2 (Direct3D 11.2)    ID3DUserDefinedAnnotation (Direct3D 11.1)    D3DPerf_ API family (Direct3D 11.0 and earlier)
Begin an event group      BeginEventInt                           BeginEvent                                   D3DPerf_BeginEvent
End an event group        EndEventInt                             EndEvent                                     D3DPerf_EndEvent
Create an event marker    SetMarkerInt                            SetMarker                                    D3DPerf_SetMarker

You can use any of these APIs that your version of Direct3D supports—for example, if you are targeting the Direct3D 11.1 API, you can use either SetMarker or D3DPerf_SetMarker to create an event marker, but not SetMarkerInt because it's only available in Direct3D 11.2—and you can even mix APIs that support different versions of Direct3D together in the same app.

///

Graphics State

The State window in Visual Studio Graphics Diagnostics helps you understand the graphics state that is active at the time of the current event, such as a draw call.

Understanding the State window

The state window collects together the state that affects rendering and presents it hierarchically, in one place. Depending on the version of Direct3D your app uses, the information presented in the state window might have differences.

State views

You can view the state table in several different ways:

View                        Description
API input state view        Presents the state in a layout similar to the Direct3D objects that make up the state.
Logical input state view    Presents the state in a logical view that does not mirror the layout of the Direct3D objects that make up the state.
Pinned state view           Instead of a hierarchy, presents pinned state items in a flat list with fully-qualified names. This view makes it possible to view many state items from different bundles of state in a small number of lines.

To change the state view

  • In the State window, in the upper left-hand corner just below the title bar, choose the button that corresponds to the state view style you want to use.

    • Show API input state view

    • Show Logical state view

    • Show Pinned state view

Important

You must pin state in the Show API input state or Show Logical state views for it to be displayed in the Show Pinned state view.

State table format

The State window presents several columns of information.

Column    Description
Name      The name of the state item. If this item represents a bundle of state, the item can be expanded to display it. In the API input state view and Logical state view, names are indented to show the hierarchical relationship between states. In the Pinned state view, fully-qualified names are displayed in a flat list.
Value     The value of the state item.
Type      The type of the state item.

Changed state

Graphics state typically changes incrementally between subsequent draw calls, and many kinds of rendering problems are caused when state is changed incorrectly. To help you find which state has changed since the previous draw call, state that's been changed is marked with an asterisk and displayed in red—this applies not just to the state itself, but to its parent state item as well, so that you can easily spot changed state at the highest level and then drill down to the details.

Pinning state

Because many apps render similar objects sequentially, changing a known set of state, it's sometimes useful to pin the changing states in place so that you can watch how they change as you move from draw call to draw call.

This can also be useful if you've isolated the source of a problem to a change in a particular state.

To pin state in place

  1. In the State window, locate the state that you're interested in. You might have to expand higher-level state to locate the details you're interested in.

  2. Place the cursor over the state that you're interested in. A Pin icon appears to the left of the state item.

  3. Choose the Pin icon to pin the state item in place.

///

Graphics Pipeline Stages

The Graphics Pipeline Stages window helps you understand how an individual draw call is transformed by each stage of the Direct3D graphics pipeline.

This is the Pipeline Stages window:

Understanding the Graphics Pipeline Stages window

The Pipeline Stages window visualizes the result of each stage of the graphics pipeline separately, for each draw call. Normally, the results of stages in the middle of the pipeline are hidden, making it difficult to tell where a rendering problem started. By visualizing each stage separately, the Pipeline Stages window makes it easy to see where the problem starts—for example, you can easily see when the vertex shader stage unexpectedly causes an object to be drawn off-screen.

After you've identified the stage in which the problem occurs, you can use the other Graphics Analyzer tools to examine how the data was interpreted or transformed. Rendering problems that appear in the pipeline stages are often related to incorrect vertex format descriptors, buggy shader programs, or misconfigured state.

Sometimes additional context is needed to determine why a draw call interacts in a particular way with the graphics pipeline. To make this additional context easier to find, the Graphics Pipeline Stages window links to one or more objects that provide additional context related to what's happening in the graphics pipeline.

  • In Direct3D 12 this object is usually a command list.

  • In Direct3D 11 this object is usually a graphics device context.

    These links are part of the current graphics event signature that's located in the upper left-hand corner of the Graphics Pipeline Stages window. Follow any of these links to examine additional details about the object.

Viewing and debugging shader code

You can examine and debug code for vertex, hull, domain, geometry and pixel shaders by using the controls at the bottom of their respective stages in the Pipeline Stages window.

To view a shader's source code

  • In the Graphics Pipeline Stages window, locate the shader stage that corresponds to the shader you want to examine. Then, below the preview image, follow the shader stage title link—for example, follow the link Vertex Shader obj:30 to view the vertex shader source code.

Tip

    The object number, obj:30, identifies this shader throughout the Graphics Analyzer interface, such as in the object table and the Pixel History window.

To debug a shader

  • In the Graphics Pipeline Stages window, locate the shader stage that corresponds to the shader you want to debug. Then, below the preview image, choose Start Debugging. This entry point into the HLSL debugger defaults to the first invocation of the shader for the corresponding stage—that is, the first pixel, vertex, or primitive that's processed by the shader during this draw call. Invocations of this shader for a specific pixel or vertex can be accessed through the Graphics Pixel History.

The pipeline stages

The Pipeline Stages window visualizes only the stages of the pipeline that were active during the draw call. Each stage of the graphics pipeline transforms input from the previous stage and passes the result to the next stage. The very first stage—the Input Assembler—takes index and vertex data from your app as its input; the very last stage—the Output Merger—combines newly rendered pixels together with the current contents of the framebuffer or render target as its output to produce the final image you see on your screen.

Note

Compute shaders are not supported in the Graphics Pipeline Stages window.

Input Assembler
The Input Assembler reads index and vertex data specified by your app and assembles it for the graphics hardware.

In the Pipeline Stages window, the Input Assembler output is visualized as a wireframe model. To take a closer look at the result, select Input Assembler in the Graphics Pipeline Stages window to view the assembled vertices in full 3D using the Model Editor.

Note

If the POSITION semantic is not present in the input assembler output, then nothing is displayed in the Input Assembler stage.

Vertex Shader
The vertex shader stage processes vertices, typically performing operations such as transformation, skinning, and lighting. Vertex shaders produce the same number of vertices that they take as input.

In the Pipeline Stages window, the Vertex Shader output is visualized as a wireframe raster image. To take a closer look at the result, select Vertex Shader in the Graphics Pipeline Stages windows to view the processed vertices in the Image Editor.

Note

If the POSITION or SV_POSITION semantics are not present in the vertex shader output, then nothing is displayed in the Vertex Shader stage.

Hull Shader (Direct3D 11 and Direct3D 12 only)
The hull shader stage processes control points that define a low-order surface such as a line, triangle, or quad. As output it produces a higher-order geometry patch and patch constants that are passed to the fixed-function tessellation stage.

The hull shader stage is not visualized in the Pipeline Stages window.

Tessellator Stage (Direct3D 11 and Direct3D 12 only)
The tessellator stage is a fixed function (non-programmable) hardware unit that preprocesses the domain represented by the output of the hull shader. As output, it creates a sampling pattern of the domain and a set of smaller primitives—points, lines, triangles—that connect these samples.

The tessellator stage is not visualized in the Pipeline Stages window.

Domain Shader (Direct3D 11 and Direct3D 12 only)
The domain shader stage processes higher-order geometry patches from the hull shader, together with tessellation factors from the tessellator stage. The tessellation factors can include tessellator input factors as well as output factors. As output, it calculates the vertex position of a point on the output patch according to the tessellation factors.

The domain shader stage is not visualized in the Pipeline Stages window.

Geometry Shader
The geometry shader stage processes entire primitives—points, lines, or triangles—along with optional vertex data for edge-adjacent primitives. Unlike vertex shaders, geometry shaders can produce more or fewer primitives than they take as input.

In the Pipeline Stages window, geometry shader output is visualized as a wireframe raster image. To take a closer look at the result, select Geometry Shader in the Graphics Pipeline Stages window to view the processed primitives in the Image Editor.

Stream Output Stage
The stream output stage can intercept transformed primitives prior to rasterization and write them to memory; from there the data can be recirculated as input to earlier stages of the graphics pipeline or be read back by the CPU.

The stream output stage is not visualized in the Pipeline Stages window.

Rasterizer Stage
The rasterizer stage is a fixed function (non-programmable) hardware unit that converts vector primitives—points, lines, triangles—into a raster image by performing scan-line conversion. During rasterization, vertices are transformed into homogeneous clip space and clipped. As output, the stage maps primitives to pixel shader invocations and interpolates per-vertex attributes across each primitive, making them ready for the pixel shader.

The rasterizer stage is not visualized in the Pipeline Stages window.

Pixel Shader
The pixel shader stage processes rasterized primitives together with interpolated vertex data to generate per-pixel values such as color and depth.

In the Pipeline Stages window, pixel shader output is visualized as a full-color raster image. To take a closer look at the result, select Pixel Shader in the Graphics Pipeline Stages window to view the processed primitives in the Image Editor.

Output Merger
The output merger stage combines the effect of newly-rendered pixels together with the existing contents of their corresponding buffers—color, depth, and stencil—to produce new values in these buffers.

In the Pipeline Stages window, output merger output is visualized as a full-color raster image. To take a closer look at the results, select Output Merger in the Graphics Pipeline Stages window to view the merged framebuffer.

Vertex shader preview

When you select the vertex shader stage in the Graphics Pipeline Stages window, the Input Buffers panel is displayed. Here, you'll find details about the list of vertices supplied to the vertex shader after they have been assembled by the input assembler stage.

To view the result of the vertex shader stage, choose the Vertex Shader stage thumbnail to view a full-size, rasterized wireframe of the mesh after it's been transformed by the vertex shader.

Graphics Event Call Stack

The Graphics Event Call Stack in Visual Studio Graphics Analyzer helps you map the relationship between problematic graphics events and your app's source code.

This is the Event Call Stack window:

Understanding the graphics event call stack

You can use the Event Call Stack to understand the flow of execution that led to a particular Direct3D event. It resembles the Visual Studio call stack window, except that instead of displaying the current call stack of the active thread in a running app, it displays the call stack as it existed when the selected Direct3D event occurred. From the Event Call Stack, you can jump to the call site of the selected Direct3D event to inspect the surrounding code.

By using the Event Call Stack to identify the code path from which a problem event originates, you can use your knowledge of the codebase to deduce potential sources of the problem, or you can add breakpoints in your app's source code so that you can use traditional debugging techniques to examine how the state of the app or event parameters are causing the event to misbehave. This examination can help you find problems in the source code that are only manifested as rendering problems.

Graphics Event Call Stack information

The event call stack doesn't support pre-frame events or user-defined events. The graphics event call stack is displayed in a table format.

Name: A symbol that uniquely identifies the function that contains the call site. The debug symbol for the function is displayed when it's available; otherwise, the function offset is displayed.
File: The file name of the source code file or library file that contains the call site.
Location: The line number of the call site.

Links to graphics objects

To understand the selected graphics event, you might need information about the Direct3D objects that are associated with it. The Graphics Event Call Stack window provides links to this information.

Graphics Pixel History

The Graphics Pixel History window in Visual Studio Graphics Analyzer helps you understand how a specific pixel is affected by the Direct3D events that occur during a frame of your game or app.

This is the Pixel History window:

Understanding the Pixel History window

By using Pixel History, you can analyze how a specific pixel of the render target is affected by Direct3D events during a frame. You can pinpoint a rendering problem to a specific Direct3D event, even when subsequent events—or subsequent primitives in the same event—continue to change the pixel's final color value. For example, a pixel might be rendered incorrectly and then obscured by another, semi-transparent pixel so that their colors are blended together in the framebuffer. This kind of problem would be difficult to diagnose if you only had the final contents of the render target to guide you.
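The blended result in the framebuffer depends on the blend state the app configured. As a sketch of the common "source over" case described above (assuming SrcBlend = SRC_ALPHA and DestBlend = INV_SRC_ALPHA, which a given app may not use):

```python
def blend_over(src_rgb, src_alpha, dst_rgb):
    # "Source over" blending, as configured by a D3D blend state with
    # SrcBlend = SRC_ALPHA and DestBlend = INV_SRC_ALPHA (an assumption;
    # apps can configure other blend factors).
    return tuple(s * src_alpha + d * (1.0 - src_alpha)
                 for s, d in zip(src_rgb, dst_rgb))

# A red pixel obscured by a 50%-opaque white pixel: the framebuffer ends
# up pink, hiding whether the red value underneath was correct.
result = blend_over((1.0, 1.0, 1.0), 0.5, (1.0, 0.0, 0.0))
# result == (1.0, 0.5, 0.5)
```

Pixel History lets you see the red value before this blend happened, which the final render target alone cannot show.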

The Pixel History window displays the complete history of a pixel over the course of the selected frame. The Final Frame Buffer at the top of the window displays the color that's written to the framebuffer at the end of the frame, together with additional information about the pixel such as the frame that it comes from and its screen coordinates. This area also contains the Render Alpha check box. When this check box is selected, the Final Frame Buffer color and intermediate color values are displayed with transparency over a checkerboard pattern. If the check box is cleared, the alpha channel of the color values is ignored.

The bottom part of the window displays the events that had a chance to affect the color of the pixel, together with the Initial and Final pseudo-events that represent the initial and final color values of the pixel in the framebuffer. The initial color value is determined by the first event that changed the color of the pixel (typically a Clear event). A pixel always has these two pseudo-events in its history, even when no other events affected it. When other events had a chance to affect the pixel, they are displayed between the Initial and Final events. The events can be expanded to show their details. For simple events such as those that clear a render target, the effect of the event is just a color value. More complex events such as draw calls generate one or more primitives that might contribute to the color of the pixel.

Primitives that were drawn by the event are identified by their primitive type and index, along with the total primitive count for the object. For example, an identifier such as Triangle (1456) of (6214) means that the primitive corresponds to the 1456th triangle in an object that's made up of 6214 triangles. To the left of each primitive identifier is an icon that summarizes the effect that the primitive had on the pixel. Primitives that affect the pixel color are represented by a rounded rectangle that's filled with the result color. Primitives that are excluded from having an effect on the pixel color are represented by icons that indicate the reason that the pixel was excluded. These icons are described in the section Primitive exclusion later in this article.
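The mapping from a primitive identifier to the index-buffer entries it consumed can be sketched as follows (a hypothetical helper that assumes a triangle-list topology and 0-based numbering; strips and other topologies map differently):

```python
def triangle_list_indices(primitive, start_index_location=0):
    # For a triangle-list topology, triangle N (0-based) consumes
    # index-buffer entries 3N, 3N+1, and 3N+2, offset by the draw
    # call's StartIndexLocation argument.
    base = start_index_location + 3 * primitive
    return (base, base + 1, base + 2)

# Locating the index-buffer entries behind a primitive such as
# "Triangle (1456)", taken as 0-based here:
triangle_list_indices(1456)
```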

You can expand each primitive to examine how the pixel shader output was merged with the existing pixel color to produce the result color. From here you can also examine or debug the pixel shader code that's associated with the primitive, and you can further expand the vertex shader node to examine the vertex shader input.

Primitive exclusion

If a primitive is excluded from affecting the pixel color, the exclusion could occur for a variety of reasons. Each reason is represented by its own icon in the Pixel History window:

  • The pixel was excluded because it failed the depth test.

  • The pixel was excluded because it failed the scissor test.

  • The pixel was excluded because it failed the stencil test.

Draw Call Exclusion

If all of the primitives in a draw call are excluded from affecting the render target because they fail a test, then the draw call cannot be expanded and an icon that corresponds to the reason for exclusion is displayed next to it. The reasons for draw-call exclusion resemble the reasons for primitive exclusion, and their icons are similar.

Viewing and debugging shader code

You can examine and debug code for vertex, hull, domain, geometry and pixel shaders by using the controls below the primitive that's associated with the shader.

To view a shader's source code

  1. In the Graphics Pixel History window, locate the draw call that corresponds to the shader you want to examine and expand it.

  2. Under the draw call you just expanded, select a primitive that demonstrates the problem you're interested in and expand it.

  3. Under the primitive you're interested in, follow the shader title link—for example, follow the link Vertex Shader obj:30 to view the vertex shader source code.

     Tip

    The object number, obj:30, identifies this shader throughout the Graphics Analyzer interface, such as in the object table and the Pipeline Stages window.

To debug a shader

  1. In the Graphics Pixel History window, locate the draw call that corresponds to the shader you want to examine and expand it.

  2. Then, under the draw call you just expanded, select a primitive that demonstrates the problem you're interested in and expand it.

  3. Under the primitive you're interested in, choose Start Debugging. This entry point into the HLSL debugger defaults to the first invocation of the shader for the corresponding primitive—that is, the first pixel or vertex that's processed by the shader. There's only one pixel associated with the primitive, but there are multiple vertex shader invocations for lines and triangles.

    To debug the vertex shader invocation for a specific vertex, expand the Vertex Shader title link, locate the vertex you're interested in, and then choose Start Debugging next to it.

Links to graphics objects

To understand the graphics events in the pixel history, you might need information about the device state at the time of the event or about the Direct3D objects that are referenced by the event. For each event in the pixel history, the Graphics Pixel History provides links to the then-current device state and to related objects.

Graphics Object Table

The Graphics Object Table in Visual Studio Graphics Analysis helps you understand the Direct3D objects that support a frame of your game or app.

This is the Object Table:

Understanding the Graphics Object Table

By using the Object Table, you can analyze the Direct3D objects that support the rendering of a particular frame. You can pinpoint a rendering problem to a specific object by examining its properties and data (by using other Graphics Diagnostics tools earlier in your diagnosis, you can narrow the list of objects that might not be what you expect). When you've found the offending object, you can use a visualization that's specific to its type to examine it—for example, you can use the Image Editor to view textures, or the Buffer Visualizer to view buffer contents.

The Object Table supports copy and paste so that you can use another tool—for example, Microsoft Excel—to examine its contents.

Graphics Object Table format

The Object Table displays the Direct3D objects and resources that support the frame that's associated with the selected event—for example, state objects, buffers, shaders, textures, and other resources. Objects that were created in a previous frame but are not used during the captured frame are omitted from the object table. Objects that have been destroyed by previous events during the captured frame are omitted in subsequent events. Objects that are not set on the D3D10Device or D3D11DeviceContext are displayed as gray text. Objects are displayed in a table format.

Identifier: The object ID.
Name: Application-specific information that was set on the object by using the Direct3D function SetPrivateData—typically to provide additional identifying information about an object.
Type: The object type.
Active: Displays "*" for an object that was set on the D3D10Device or D3D11DeviceContext during the captured frame. This is the same distinction that's shown by gray text, but as a column entry that you can use to help sort the object table.
Size: The size of the object in bytes.
Format: The format of the object. For example, the format of a texture object, or the shader model of a shader object.
Width: The width of a texture object. Does not apply to other object types.
Height: The height of a texture object. Does not apply to other object types.
Depth: The depth of a 3-D texture object. If a texture is not 3-D, the value is 0. Does not apply to other object types.
Mips: The number of MIP levels that a texture object has. Does not apply to other object types.
ArraySize: The number of textures in a texture array. The range is from 1 to an upper bound defined by the current feature level. For a cube map, this value is 6 times the number of cube maps in the array.
Samples: The number of multisamples per pixel.

Graphics object viewers

To view details about an object, open it by choosing its name in the Object Table. Details about the object are displayed in different formats, depending on the type of the object. For example, textures are displayed using the texture viewer and device state such as D3D11 Device Context is displayed as a formatted list. Different versions of Direct3D make use of different objects, and there are often specific visualizers for the most important objects of each version.

Here's the texture viewer showing the contents of the Output Merger pipeline stage.

D3D12 Command List

In Direct3D 12 a command list is an object that records commands into a command allocator so that they can be submitted to the GPU as a single request. Command lists usually perform a series of state-setting, draw, clear and copy commands. They're particularly important because they're the preferred method of rendering in Direct3D 12, and can be re-used between frames to help increase performance. Command List details are displayed in a new document window, with information related to each pipeline stage presented on its own tab.

D3D12 Pipeline State Object (PSO)

In Direct3D 12 a pipeline state object represents a significant portion of the GPU state, including all currently set shaders and certain fixed-function state objects. Once created, a pipeline state object is immutable—an application can only change the configuration of the pipeline by binding a different pipeline state object. PSO details are displayed in a new document window, with details of the pipeline configuration laid out hierarchically.

D3D12 Root Signature

In Direct3D 12, the root signature defines all the resources that are bound to a graphics or compute pipeline, and it links command lists to the resources that the shaders require. Typically there's one root signature for graphics and one for compute in an app. Root signature details are displayed in a new document window, with details of the root signature laid out hierarchically.

D3D12 Resources

In Direct3D 12, resources are catch-all objects that provide data to the rendering pipeline; this is in contrast to Direct3D 11, which defined many specific objects for different kinds and dimensions of resources. A Direct3D 12 resource can contain texture data, vertex data, shader data, and more—it can even represent render targets such as the depth buffer. Details of a Direct3D 12 resource are displayed in a new document window; Graphics Analysis uses the appropriate viewer for the contents of the resource object if it's able to determine its type. For example, a resource object that contains texture data is displayed using the texture viewer, just like a D3D11 Texture2D object is.

Device context object

In Direct3D 11 and Direct3D 10, the device context object (D3D11 Device Context or D3D10 Device) is particularly important because it holds the most important state information and links to the other state objects that are currently set. Device context details are displayed in a new document window, and each category of information is presented on its own tab. The device context display changes when a new event is selected, to reflect the device state at that event.

Buffer object

Buffer object details (D3D11 Buffer or D3D10 Buffer) are displayed in a new document window that presents the buffer contents in a table and provides an interface for changing how the buffer contents are displayed. The buffer data table supports copy and paste so that you can use another tool—for example, Microsoft Excel—to examine its contents. The content of the buffer is interpreted according to the value of the format combo box, which is located above the buffer data table. In the box, you can enter a composite data format that's made up of the data types that are listed in the following table. For example, "float int" displays a list of structures that contain a 32-bit floating-point value followed by a 32-bit signed integer value. Composite data formats that you have specified are added to the combo box for later use.
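The composite-format interpretation can be sketched with Python's struct module (a hypothetical re-implementation for illustration only; it assumes little-endian buffer data and covers just a subset of the type names):

```python
import struct

# Map a subset of the viewer's type names to struct format codes
# (hypothetical helper; assumes little-endian buffer data).
TYPE_CODES = {"float": "f", "half": "e", "double": "d",
              "int": "i", "uint": "I", "int64": "q", "uint64": "Q",
              "byte": "b", "ubyte": "B"}

def view_buffer(raw, composite="float int"):
    # Build one struct pattern per composite element, then decode the
    # buffer as a list of back-to-back structures.
    pattern = "<" + "".join(TYPE_CODES[t] for t in composite.split())
    stride = struct.calcsize(pattern)
    return [struct.unpack_from(pattern, raw, offset)
            for offset in range(0, len(raw) - stride + 1, stride)]

# Two {float, int} structures packed back-to-back:
data = view_buffer(struct.pack("<fifi", 1.5, 2, 3.5, 4), "float int")
# data == [(1.5, 2), (3.5, 4)]
```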

You can also toggle the Show Offsets checkbox to hide or display the offset of each element in the buffer.

float: A 32-bit floating-point value.
float2: A vector that contains two 32-bit floating-point values.
float3: A vector that contains three 32-bit floating-point values.
float4: A vector that contains four 32-bit floating-point values.
byte: An 8-bit signed integer value.
2byte: A 16-bit signed integer value.
4byte: A 32-bit signed integer value. Same as int.
8byte: A 64-bit signed integer value. Same as int64.
xbyte: An 8-bit hexadecimal value.
x2byte: A 16-bit hexadecimal value.
x4byte: A 32-bit hexadecimal value. Same as xint.
x8byte: A 64-bit hexadecimal value. Same as xint64.
ubyte: An 8-bit unsigned integer value.
u2byte: A 16-bit unsigned integer value.
u4byte: A 32-bit unsigned integer value. Same as uint.
u8byte: A 64-bit unsigned integer value. Same as uint64.
half: A 16-bit floating-point value.
half2: A vector that contains two 16-bit floating-point values.
half3: A vector that contains three 16-bit floating-point values.
half4: A vector that contains four 16-bit floating-point values.
double: A 64-bit floating-point value.
int: A 32-bit signed integer value. Same as 4byte.
int64: A 64-bit signed integer value. Same as 8byte.
xint: A 32-bit hexadecimal value. Same as x4byte.
xint64: A 64-bit hexadecimal value. Same as x8byte.
uint: A 32-bit unsigned integer value. Same as u4byte.
uint64: A 64-bit unsigned integer value. Same as u8byte.
bool: A Boolean (true or false) value. Each Boolean value is represented by a 32-bit value.

/

HLSL Shader Debugger

The HLSL debugger in Visual Studio Graphics Analyzer helps you understand how your HLSL shader code operates under real conditions of your app.

This is the HLSL debugger:

Understanding the HLSL debugger

The HLSL debugger can help you understand problems that arise in your shader code. Debugging HLSL code in Visual Studio resembles debugging code that's written in other languages—for example, C++, C#, or Visual Basic. You can inspect the contents of variables, set break points, step through code, and walk up the call-stack, just like you can when you debug other languages.

However, because GPUs achieve high performance by running shader code on hundreds of threads simultaneously, the HLSL debugger is designed to work together with the other Graphics Analyzer tools to present all of this information in a way that helps you make sense of it. Graphics Analyzer recreates captured frames by using information that was recorded in a graphics log; the HLSL debugger does not monitor GPU execution in real time as it runs shader code. Because a graphics log contains enough information to recreate any part of the output, and because Graphics Analysis provides tools that can help you pinpoint the exact pixel and event where an error occurs, the HLSL debugger only has to simulate the exact shader thread that you are interested in. This means that the work of the shader can be simulated on the CPU, where its inner workings are in full view. This is what gives the HLSL debugger a CPU-like debugging experience.

However, the HLSL debugger is currently limited in the following ways:

  • The HLSL debugger doesn't support edit-and-continue, but you can make changes to your shaders and then regenerate the frame to see the results.

  • It's not possible to debug an app and its shader code at the same time. However, you can alternate between them.

  • You can add variables and registers to the Watch window, but expressions are not supported.

    Nevertheless, the HLSL debugger provides a better, more CPU-like debugging experience than would be possible otherwise.

HLSL Shader Edit & Apply

The HLSL shader debugger doesn't support Edit & Continue in the same way that the CPU debugger does because the GPU execution model doesn't allow shader state to be undone. Instead, the HLSL debugger supports Edit & Apply, which allows you to edit HLSL source files and then choose Apply to regenerate the frame to see the effect of your changes. Your modified shader code is stored in a separate file to preserve the integrity of your project's original HLSL source file, but when you're satisfied with your changes you can choose Copy to… to copy the changes into your project. Using this feature, you can quickly iterate on shader code that contains errors and eliminate costly rebuild and capture steps from your HLSL debugging workflow.

HLSL Disassembly

The HLSL shader debugger provides a listing of HLSL shader assembly to the right of the HLSL source code listing.

Debugging HLSL code

You can access the HLSL debugger from the Pipeline Stages or Pixel History windows.

To start the HLSL debugger from the Graphics Pipeline Stages window

  1. In the Graphics Pipeline Stages window, locate the pipeline stage that's associated with the shader that you want to debug.

  2. Below the title of the pipeline stage, choose Start Debugging, which appears as a small green arrow.

     Note

    This entry point into the HLSL debugger debugs only the first shader thread for the corresponding stage—that is, the first vertex or pixel that is processed. You can use Pixel History to access other threads of these shader stages.

To start the HLSL debugger from the Graphics Pixel History

  1. In the Graphics Pixel History window, expand the draw call that's associated with the shader that you want to debug. Each draw call can correspond to multiple primitives.

  2. In the draw call details, expand a primitive whose resulting color contribution suggests a bug in its shader code. If multiple primitives suggest a bug, choose the first primitive that suggests it so that you can avoid an accumulation of errors that can make diagnosis of the problem more difficult.

  3. In the primitive details, choose whether to debug the Vertex Shader or the Pixel Shader. Debug the vertex shader when you suspect that the pixel shader is correct but is generating an incorrect color contribution because the vertex shader is passing incorrect constants to it. Otherwise, debug the pixel shader.

    To the right of the chosen shader, choose Start Debugging, which appears as a small green arrow.

     Note

    This entry point into the HLSL debugger debugs either the pixel shader thread that corresponds to the draw call, primitive, and pixel that you have chosen, or the vertex shader threads whose results are interpolated for that primitive at the chosen pixel. In the case of vertex shaders, you can further refine the entry point to a specific vertex by expanding the vertex shader details.

    For examples about how to use the HLSL Debugger to debug shader errors, see Examples or the walkthroughs linked to in the See Also section.

//

Command-Line Capture Tool

DXCap.exe is a command-line tool for graphics diagnostics capture and playback. It supports Direct3D 10 through Direct3D 12 across all feature levels.

Syntax

DXCap.exe [-file filename] [-frame frames | -period periods | -manual] -c app [args...]
DXCap.exe -p [filename] [-debug | -warp | -hw] [-config] [-rawmode]
DXCap.exe -p [filename] -screenshot [-frame frames]
DXCap.exe -p [filename] -toXML [xml_filename]
DXCap.exe -v [-file filename] [-examine events] [-haltonfail | -exitonfail] [-showprogress]
DXCap.exe -e [search_string]
DXCap.exe -info

Parameters

-file filename
Under capture mode (-c), filename specifies the name of the graphics log file that graphics information is recorded to. If filename is not specified, graphics information is recorded to a file named <appname>-<date>-<time>.vsglog by default.

Under validation (-v) mode, filename specifies the name of the graphics log file to be validated. If filename is not specified, the graphics log that was last validated is used again.

-frame frames
Under capture mode, frames specifies the frames that you want to capture. The first frame is 1. You can specify multiple frames by using commas and ranges. For example, if frames is 2,5,7-9,15, then frames 2, 5, 7, 8, 9, and 15 are captured.
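The frame-list syntax can be sketched as a small parser (a hypothetical helper, not part of DXCap.exe):

```python
def parse_frames(spec):
    """Expand a DXCap-style frame list such as "2, 5, 7-9, 15"
    into explicit frame numbers."""
    frames = []
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            # A range such as "7-9" is inclusive on both ends.
            lo, hi = (int(n) for n in part.split("-"))
            frames.extend(range(lo, hi + 1))
        else:
            frames.append(int(part))
    return frames

parse_frames("2, 5, 7-9, 15")   # [2, 5, 7, 8, 9, 15]
```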

-period periods
Under capture mode, periods specifies the ranges of time, in seconds, during which you want to capture frames. You can specify multiple periods by using commas and ranges. For example, if periods is 2.1-5,7.0-9.3, then frames that are rendered between 2.1 and 5 seconds, and between 7 and 9.3 seconds, are captured.

-manual
Under capture mode, -manual specifies that frames will be captured manually by pressing the Print Screen key. Frames can be captured once the app starts; to stop capturing frames, return to the command-line interface and press Enter.

-c app [args...]
Capture mode. Under capture mode, app specifies the name of the app that you want to capture graphics information from; args... specifies additional command-line parameters to that app.

-p [filename]
Playback mode (-p). Under playback mode, filename specifies the name of the graphics log file to be played back. If filename is not specified, the graphics log that was last played back is used again.

-debug
Under playback mode, -debug specifies that playback should be performed with the Direct3D debug layer enabled.

-warp
Under playback mode, -warp specifies that playback should be performed using the WARP software renderer.

-hw
Under playback mode, -hw specifies that playback should be performed using GPU hardware.

-config
Under playback mode, -config displays information about the machine that was used to capture the graphics log file, if this information was recorded to the log.

-rawmode
Under playback mode, -rawmode specifies that playback should be performed without modification to the recorded events. Under normal operation, playback mode might make minor changes to playback to simplify debugging and speed up playback. For example, it may simulate swap chain output rather than executing swap chain commands. Usually this is not a problem, but you might need playback to occur in a way that's more faithful to the recorded events; for example, you can use this option to restore full-screen rendering behavior to an app that was captured while running in full-screen mode.

-toXML [xml_filename]
Under playback mode, xml_filename specifies the name of the file where an XML representation of playback is written to. If xml_filename is not specified, the XML representation is written to a file named the same as the file being played back, but given an .xml extension.

-v
Validation mode. Under validation mode, captured frames are played back on both hardware and WARP, and their results are compared using an image comparison function. You can use this feature to quickly identify driver issues that affect your rendering.

-examine events
Under validation mode, events specifies the set of graphics events whose immediate results are compared. For example, -examine present,draw,copy,clear limits the comparison to only the events belonging to those categories.

 Tip

We recommend starting with -examine present,draw,copy,clear because this will reveal most issues but take significantly less time than a more extensive set of events. If necessary, you can specify a larger or different set of events to validate those events and reveal other kinds of issues.

-haltonfail
Under validation mode, -haltonfail halts validation when differences are detected between the hardware and WARP renderer. Validation resumes after a key is pressed.

-exitonfail
Under validation mode, -exitonfail exits validation immediately when differences are detected between the hardware and WARP renderer. When the program exits in this way, it returns 0 to the environment; otherwise it returns 1.

-showprogress
Under validation mode, -showprogress displays progress information about the validation session. WARP progress is displayed on the left; hardware progress is displayed on the right.

-e search_string
Enumerates the Windows Store apps that are installed. You can use this information to perform command-line captures with Windows Store apps.

-info
Displays information about the machine and capture DLLs.

Remarks

DXCap.exe operates in three modes:

Capture mode (-c)
Capture graphics information from a running app and record it to a graphics log file. The capture capabilities and file format are identical to those of Visual Studio.

Playback mode (-p)
Play back previously captured graphics events from an existing graphics log file. By default, playback occurs in a window, even when the graphics log file was captured from a full-screen app. Playback occurs in full screen only when the graphics log file was captured from a full-screen app and -rawmode is specified.

Validation mode (-v)
Validates rendering behavior by playing back captured frames on both hardware and WARP, then comparing their results by using an image comparison function. You can use this feature to quickly identify driver issues that affect your rendering.

In addition to these modes, dxcap.exe performs two other functions that do not perform capture or playback of graphics information.

Enumeration function (-e)
Displays details about the Windows Store apps that are installed on the machine. These details include the package name and appid that identify the executable file in a Windows Store app. To capture graphics information from a Windows Store app using DXCap.exe, use the package name and appid instead of the executable filename that's used when you capture a desktop app.

Info function (-info)
Displays details about the machine and capture DLLs.

Examples

Capture graphics information from a desktop app

Use -c to specify the app from which you want to capture graphics information.

DXCap.exe -c BasicHLSL11.exe

By default, graphics information is recorded to a file named <appname>-<date>-<time>.vsglog. Use -file to specify a different file to record to.

DXCap.exe -file regression_test_12.vsglog -c BasicHLSL11.exe

Specify additional command-line parameters to the app that you're capturing from by including them after the app's filename.

DXCap.exe -c "C:\Program Files\Internet Explorer\iexplore.exe" "www.fishgl.com"

The command in the example above captures graphics information from the desktop version of Internet Explorer while viewing the webpage located at www.fishgl.com, which uses the WebGL API to render 3-D content.

 Note

Because command-line arguments that appear after the app are passed to it, you must specify the arguments intended for DXCap.exe before the -c option.

Capture graphics information from a Windows Store app.

You can capture graphics information from a Windows Store app.


DXCap.exe -c Microsoft.BingMaps_2.1.2914.1734_x64__8wekyb3d8bbwe,AppexMaps

Using DXCap.exe to capture from a Windows Store app is similar to using it to capture from a Windows desktop app, but instead of identifying a desktop app by its filename, you identify a Windows Store app by its package name and the name or ID of the executable inside that package that you want to capture from. To find out how to identify the Windows Store apps that are installed on your machine, use the -e option with DXCap.exe to enumerate them:

DXCap.exe -e

You can provide an optional search string to help find the app that you're looking for. When the search string is provided, DXCap.exe enumerates the Windows Store apps whose package name, app name, or app ID matches the search string. The search is case-insensitive.

DXCap.exe -e map

The command above enumerates Windows Store apps that match "map"; here is the output:

Package "Microsoft.BingMaps":
InstallDirectory : C:\Program Files\WindowsApps\Microsoft.BingMaps_2.1.2914.1734_x64__8wekyb3d8bbwe
FullName : Microsoft.BingMaps_2.1.2914.1734_x64__8wekyb3d8bbwe
UserSID : S-1-5-21-2127521184-1604012920-1887927527-5603533
Name : Microsoft.BingMaps
Publisher : CN=Microsoft Corporation, O=Microsoft Corporation, L=Redmond, S=Washington, C=US
Version : 2.1.2914.1734
Launchable Applications:
Id : AppexMaps
Exe : C:\Program Files\WindowsApps\Microsoft.BingMaps_2.1.2914.1734_x64__8wekyb3d8bbwe\Map.exe
IsWWA: No
AppSpec (to launch): DXCap.exe -c Microsoft.BingMaps_2.1.2914.1734_x64__8wekyb3d8bbwe,AppexMaps

The last line of output for each enumerated app displays the command you can use to capture graphics information from it.

Capture specific frames or frames between specific times.

Use -frame to specify the frames that you want to capture using commas and ranges:

DXCap.exe -frame 2,5,7-9,15 -c SimpleBezier11.exe

Or, use -period to specify a set of time ranges during which to capture frames. Time ranges are specified in seconds, and multiple ranges can be specified:

DXCap.exe -period 2.1-5,7.0-9.3 -c SimpleBezier11.exe

Capture frames interactively.

Use -manual to capture frames interactively. Press the Enter key to start capture, and press the Enter key again to stop.

DXCap.exe -manual -c SimpleBezier11.exe

Play back a graphics log file

Use -p to play back a previously captured graphics log file.

DXCap.exe -p regression_test_12.vsglog

Leave out the filename to play back the graphics log that was captured most recently.

DXCap.exe -p

Play back in raw mode

Use -rawmode to play back captured commands exactly as they occurred. Under normal playback, certain commands are emulated; for example, a graphics log file captured from a full-screen app will play back in a window. With raw mode enabled, the same file will attempt to play back in full screen.

DXCap.exe -p regression_test_12.vsglog -rawmode

Play back using WARP or a hardware device

You might want to force playback of a graphics log file captured on a hardware device to use WARP, or force playback of a log captured on WARP to use a hardware device. Use -warp to play back using WARP.

DXCap.exe -p regression_test_12.vsglog -warp

Use -hw to play back using hardware.

DXCap.exe -p regression_test_12.vsglog -hw

Validate a graphics log file against WARP

Under validation mode, the graphics log file is played back on both hardware and WARP, and their results are compared. This can help you identify rendering errors that are caused by the driver. Use –v to validate correct behavior of graphics hardware against WARP.

DXCap.exe -v regression_test_12.vsglog

To reduce the number of comparisons, you can specify a subset of commands for validation to compare; all other commands are ignored. Use -examine to specify the commands whose results you want to compare.

DXCap.exe -v regression_test_12.vsglog -examine present,draw,copy,clear
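Because DXCap.exe is a plain command-line tool, validation runs like the one above are easy to batch from a script. The sketch below shells out to DXCap.exe for every .vsglog file in a folder; the folder layout and the assumption that DXCap.exe returns a nonzero exit code on validation failure are mine, so adjust for your setup.

```python
# Sketch: batch-validate a folder of .vsglog files against WARP by
# invoking DXCap.exe. Assumes DXCap.exe is on PATH and (hypothetically)
# returns a nonzero exit code when validation fails.
import subprocess
from pathlib import Path

def validate_cmd(log_path, examine=None):
    """Build the DXCap.exe validation command line for one log file."""
    cmd = ["DXCap.exe", "-v", str(log_path)]
    if examine:  # restrict comparison to these command types
        cmd += ["-examine", ",".join(examine)]
    return cmd

def validate_all(folder):
    """Run validation for every graphics log in the folder."""
    for log in sorted(Path(folder).glob("*.vsglog")):
        result = subprocess.run(validate_cmd(log, ["present", "draw"]))
        print(log.name, "FAILED" if result.returncode else "ok")
```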

Convert a Graphics Log file to PNGs

To view or analyze frames from a graphics log file, DXCap.exe can save captured frames as .png (Portable Network Graphics) image files. Use -screenshot under playback mode to output captured frames as .png files.

DXCap.exe -p BasicHLSL11.vsglog -screenshot

Use -frame with -screenshot to specify the frames that you want to output.

DXCap.exe -p BasicHLSL11.vsglog -screenshot -frame 5,7-9

Convert a Graphics Log file to XML

To process and analyze graphics logs using familiar tools like FindStr or XSLT, DXCap.exe can convert a graphics log file to XML. Use -toXML under playback mode to convert the log to XML instead of playing it back.

DXCap.exe -p regression_test_12.vsglog -toXML

By default, the XML output is written to a file that has the same name as the graphics log but with a .xml extension. In the example above, the XML file is named regression_test_12.xml. To give the XML file a different name, specify it after -toXML.

DXCap.exe -p regression_test_12.vsglog -toXML temp.xml

The resulting file will contain XML that looks similar to this:

<Moment value="67"/>  
<Method name="CreateDXGIFactory1" >  
    <Return type="HRESULT" value="S_OK" />  
    <Parameter name="riid" type="IID" value="770AAE78-F26F-4DBA-A829-253C83D1B387" />  
    <Parameter name="ppFactory" type="void" handle="1" isOutput="true" />  
</Method>  
  
<Moment value="167"/>  
<Method name="D3D11CreateDevice" >  
    <Return type="HRESULT" value="S_OK" />  
    <Parameter name="pAdapter" type="IDXGIAdapter" handle="34" />  
    <Parameter name="DriverType" type="D3D_DRIVER_TYPE" value="D3D_DRIVER_TYPE_UNKNOWN" />  
    <Parameter name="Software" type="HMODULE" value="pointer" />  
    <Parameter name="Flags" type="UINT" value="0" />  
    <Parameter name="pFeatureLevels" type="D3D_FEATURE_LEVEL" arrSize="1" >  
        <Element value="D3D_FEATURE_LEVEL_11_0" />  
    </Parameter>  
    <Parameter name="FeatureLevels" type="UINT" value="1" />  
    <Parameter name="SDKVersion" type="UINT" value="7" />  
    <Parameter name="ppDevice" type="ID3D11Device" handle="35" isOutput="true" />  
    <Parameter name="pFeatureLevel" type="D3D_FEATURE_LEVEL" value="D3D_FEATURE_LEVEL_11_0" isOutput="true" />  
    <Parameter name="ppImmediateContext" type="ID3D11DeviceContext" value="nullptr" isOutput="true" />  
</Method>
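Besides FindStr or XSLT, the XML output is also easy to mine with a scripting language. The fragment above has sibling top-level elements, so the sketch below wraps the text in a synthetic root before parsing with Python's xml.etree; if your converted file already has a single root element, drop the wrapper. The fragment here is abbreviated for illustration.

```python
# Sketch: extract the API calls recorded in DXCap's -toXML output using
# Python's xml.etree. The <Moment>/<Method> elements appear as siblings,
# so the text is wrapped in a synthetic <Log> root before parsing.
import xml.etree.ElementTree as ET

# Abbreviated stand-in for the contents of regression_test_12.xml.
fragment = """
<Moment value="67"/>
<Method name="CreateDXGIFactory1">
    <Return type="HRESULT" value="S_OK" />
</Method>
<Moment value="167"/>
<Method name="D3D11CreateDevice">
    <Return type="HRESULT" value="S_OK" />
</Method>
"""

root = ET.fromstring("<Log>" + fragment + "</Log>")
calls = [m.get("name") for m in root.iter("Method")]
print(calls)  # ['CreateDXGIFactory1', 'D3D11CreateDevice']
```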

/

Graphics Diagnostics Examples - Visual Studio 2015 | Microsoft Docs

//
