Summary: This article provides in-depth answers to frequently asked development questions regarding Microsoft DirectX, version 8.0, including sections on Direct3D, DirectSound, and DirectPlay. (10 printed pages)
Why do I get so many error messages when I try to compile the samples?
You probably don't have your include path set correctly. Many compilers, including Microsoft Visual C++, include an earlier version of the SDK, so if your include path searches the standard compiler include directories first, you'll get incorrect versions of the header files. To remedy this issue, make sure the include and library paths are set to search the DirectX include and library directories first. See also the dxreadme.txt file in the SDK. If you install the DirectX SDK and you are using Visual C++, the installer can optionally set up the include paths for you.
I get linker errors about multiple or missing symbols for globally unique identifiers (GUIDs), what do I do?
The various GUIDs you use should be defined once and only once. The definition for the GUID will be inserted if you #define the INITGUID symbol before including the DirectX header files. Therefore, you should make sure that this only occurs for one compilation unit. An alternative to this method is to link with the dxguid.lib library, which contains definitions for all of the DirectX GUIDs. If you use this method (which is recommended), then you should never #define the INITGUID symbol.
Can I cast a pointer to a DirectX interface to a lower version number?
No. DirectX interfaces are COM interfaces. This means that there is no requirement for higher numbered interfaces to be derived from corresponding lower numbered ones. Therefore, the only safe way to obtain a different interface to a DirectX object is to use the QueryInterface method of the interface. This method is part of the standard IUnknown interface, from which all COM interfaces must derive.
Can I mix the use of DirectX 8 components and DirectX 7 or earlier components within the same application?
You can freely mix components of different versions; for example, you could use DirectPlay 8 with DirectDraw 7 in the same application. However, you generally cannot mix different versions of the same component within the same application; for example, you cannot mix DirectDraw 7 with Direct3D 8 (these are effectively the same component, because DirectDraw has been subsumed into Direct3D in DirectX 8).
What do the return values from the Release or AddRef methods mean?
The return value will be the current reference count of the object. However, the COM specification states that you should not rely on this and the value is generally only available for debugging purposes. The values you observe may be unexpected since various other system objects may be holding references to the DirectX objects you create. For this reason, you should not write code that repeatedly calls Release until the reference count is zero, as the object may then be freed even though another component may still be referencing it.
Does it matter in which order I release DirectX interfaces?
It shouldn't matter because COM interfaces are reference counted. However, there are some known bugs with the release order of interfaces in some versions of DirectX. For safety, you are advised to release interfaces in reverse creation order when possible.
What is a smart pointer, and should I use them?
A smart pointer is a C++ template class designed to encapsulate pointer functionality. In particular, there are standard smart pointer classes designed to encapsulate COM interface pointers. These pointers automatically perform QueryInterface instead of a cast, and handle AddRef and Release for you. Whether you should use them is largely a matter of taste. If your code contains lots of copying of interface pointers, with multiple AddRefs and Releases, then smart pointers can probably make your code neater and less error prone. Otherwise, you can do without them. Visual C++ includes a standard Microsoft COM smart pointer, available through the "comdef.h" header file (look up _com_ptr_t in the help).
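As a rough illustration of the AddRef and Release bookkeeping such a class takes over, here is a minimal portable sketch. RefCounted and ComPtrSketch are hypothetical names invented for this example; they are not _com_ptr_t or any DirectX interface, and the sketch leaves out the QueryInterface handling that a real COM smart pointer provides.

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-in for a COM object; real interfaces derive from IUnknown.
struct RefCounted {
    unsigned long refs = 1;  // a newly created object holds one reference
    unsigned long AddRef() { return ++refs; }
    unsigned long Release() {
        assert(refs > 0);
        return --refs;       // a real COM object would destroy itself at zero
    }
};

// Minimal smart pointer: copying calls AddRef, destruction calls Release.
template <typename T>
class ComPtrSketch {
public:
    ComPtrSketch() = default;
    // Takes over an existing reference, so no AddRef here.
    explicit ComPtrSketch(T* raw) : p_(raw) {}
    ComPtrSketch(const ComPtrSketch& other) : p_(other.p_) {
        if (p_) p_->AddRef();
    }
    // Copy-and-swap: the by-value parameter releases the old pointer on exit.
    ComPtrSketch& operator=(ComPtrSketch other) {
        std::swap(p_, other.p_);
        return *this;
    }
    ~ComPtrSketch() {
        if (p_) p_->Release();
    }
    T* operator->() const { return p_; }
    T* get() const { return p_; }

private:
    T* p_ = nullptr;
};
```

With a class like this, every copy of the pointer is matched by exactly one Release when the copy goes out of scope, which is the bookkeeping that is easy to get wrong by hand.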
I have trouble debugging my DirectX application, any tips?
The most common problem with debugging DirectX applications is attempting to debug while a DirectDraw surface is locked. This situation can cause a "Win16 Lock" on Microsoft Windows 9x systems, which prevents the debugger window from painting. Specifying the D3DLOCK_NOSYSLOCK flag when locking the surface can usually eliminate this. Windows 2000 does not suffer from this problem. When developing an application, it is useful to be running with the debugging version of the DirectX runtime (selected when you install the SDK), which performs some parameter validation and outputs useful messages to the debugger output.
What's the correct way to check return codes?
Use the SUCCEEDED and FAILED macros. DirectX methods can return multiple success and failure codes, so a simple "==D3D_OK" or similar test will not always suffice.
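To see why an equality test is not enough, consider the following portable sketch. The names HResultSketch, succeeded, failed, kOk, kFalse, and kFail are stand-ins invented here so the example compiles without the Windows headers; they mirror the sign-based logic of the real macros, and real code should use HRESULT, SUCCEEDED, and FAILED directly.

```cpp
#include <cstdint>

// Stand-in for HRESULT: a signed 32-bit value whose sign bit marks failure.
using HResultSketch = std::int32_t;

constexpr HResultSketch kOk = 0;     // numeric value of S_OK and D3D_OK
constexpr HResultSketch kFalse = 1;  // numeric value of S_FALSE, a success code
constexpr HResultSketch kFail =
    static_cast<HResultSketch>(0x80004005u);  // numeric value of E_FAIL

// Same logic as the macros: success codes are non-negative, failures negative.
constexpr bool succeeded(HResultSketch hr) { return hr >= 0; }
constexpr bool failed(HResultSketch hr) { return hr < 0; }

// A success code other than kOk passes succeeded() but fails an equality test.
static_assert(succeeded(kFalse) && kFalse != kOk,
              "success is broader than equality with kOk");
```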
What happened to DirectDraw?
Much of the functionality of DirectDraw has now been subsumed into the new Direct3D8 interfaces. Developers working on purely 2D applications may wish to continue using the old DirectX 7 interfaces. Developers working on 3D applications with some 2D elements are encouraged to use Direct3D alternatives (point sprites and billboard textures, for example) as this will result in improved performance and flexibility.
How do I disable ALT+TAB and other task switching?
You don't. Really.
Is there a recommended book explaining COM?
Inside COM by Dale Rogerson, published by Microsoft Press, is an excellent introduction to COM. For a more detailed look at COM, the book Essential COM by Don Box, published by Longman, is also highly recommended.
What books are there about general Windows programming?
Lots. However, the ones that are highly recommended are:
- Programming Windows by Charles Petzold (Microsoft Press)
- Advanced Windows by Jeffrey Richter (Microsoft Press)
Where can I find information about 3D graphics techniques?
The standard book on the subject is Computer Graphics: Principles and Practice by Foley, Van Dam et al., and is a valuable resource for anyone wanting to understand the mathematical foundations of geometry, rasterization, and lighting techniques. The FAQ for the comp.graphics.algorithms Usenet group also contains useful material.
Does Direct3D emulate functionality not provided by hardware?
It depends. Direct3D has a fully featured software vertex processing pipeline (including support for custom vertex shaders). However, no emulation is provided for pixel level operations; applications must check the appropriate caps bits and use the ValidateDevice API to determine support.
Is there a software rasterizer included with Direct3D?
No. Direct3D now supports plug-in software rasterizers. However, there is currently no software rasterizer supplied by default.
Does the Direct3D geometry code utilize 3DNow! and/or Pentium III SIMD instructions?
Yes. The Direct3D geometry pipeline has several different code paths, depending on the processor type, and will utilize the special floating-point operations provided by the 3DNow! or Pentium III SIMD instructions where these are available. This includes processing of custom vertex shaders.
How do I prevent transparent pixels being written to the z-buffer?
You can filter out pixels with an alpha value above or below a given threshold. You control this behavior by using the renderstates ALPHATESTENABLE, ALPHAREF, and ALPHAFUNC.
What is a stencil buffer?
A stencil buffer is an additional buffer of per-pixel information, much like a z-buffer. In fact, it resides in some of the bits of a z-buffer. Common stencil/z-buffer formats are 15-bit z and 1-bit stencil, or 24-bit z and 8-bit stencil. It is possible to perform simple arithmetic operations on the contents of the stencil buffer on a per-pixel basis as polygons are rendered. For example, the stencil buffer can be incremented or decremented, or the pixel can be rejected if the stencil value fails a simple comparison test. This is useful for effects that involve marking out a region of the frame buffer, and then rendering only in the marked (or unmarked) region. Good examples are volumetric effects like shadow volumes.
How do I use a stencil buffer to render shadow volumes?
The key to this, and other volumetric stencil buffer effects, is the interaction between the stencil buffer and the z-buffer. A scene with a shadow volume is rendered in three stages. First, the scene without the shadow is rendered as usual, using the z-buffer. Next, the shadow is marked out in the stencil buffer as follows. The front faces of the shadow volume are drawn using invisible polygons, with z-testing enabled but z-writes disabled, and the stencil buffer incremented at every pixel passing the z-test. The back faces of the shadow volume are rendered similarly, but decrementing the stencil value instead.
Now, consider a single pixel. Assuming the camera is not in the shadow volume, there are four possibilities for the corresponding point in the scene. If the ray from the camera to the point does not intersect the shadow volume, then no shadow polygons will have been drawn there and the stencil buffer is still zero. Otherwise, if the point lies in front of the shadow volume, the shadow polygons will be z-buffered out and the stencil again remains unchanged. If the point lies behind the shadow volume, then the same number of front shadow faces as back faces will have been rendered and the stencil will be zero, having been incremented as many times as decremented.
The final possibility is that the point lies inside the shadow volume. In this case the back face of the shadow volume will be z-buffered out, but not the front face, so the stencil buffer will be a non-zero value. The result is that portions of the frame buffer lying in shadow have a non-zero stencil value. Finally, to actually render the shadow, the whole scene is washed over with an alpha-blended polygon set to only affect pixels with a non-zero stencil value. An example of this technique can be seen in the "Shadow Volume" sample that comes with the DirectX SDK.
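The counting argument above can be checked with a small CPU-side simulation of a single pixel. This is only a sketch of the bookkeeping, not Direct3D code: ShadowFace and stencilAfterShadowPasses are hypothetical names, depth is measured along the pixel's ray, the camera is assumed to be outside the volume, and a plain int stands in for the stencil value.

```cpp
#include <cassert>
#include <vector>

// One shadow-volume polygon crossed by the ray through a pixel.
struct ShadowFace {
    float depth;       // distance from the camera along the ray
    bool frontFacing;  // true for a front face, false for a back face
};

// Stencil value left at one pixel after the front-face and back-face passes.
inline int stencilAfterShadowPasses(float sceneDepth,
                                    const std::vector<ShadowFace>& faces) {
    int stencil = 0;
    for (const ShadowFace& face : faces) {
        if (face.depth >= sceneDepth) {
            continue;  // fails the z-test: hidden behind the scene geometry
        }
        stencil += face.frontFacing ? 1 : -1;  // increment front, decrement back
    }
    return stencil;
}

// Usage: a volume whose front face is at depth 2 and back face at depth 4.
inline void shadowStencilExample() {
    const std::vector<ShadowFace> volume = {{2.0f, true}, {4.0f, false}};
    assert(stencilAfterShadowPasses(1.0f, volume) == 0);  // in front of the volume
    assert(stencilAfterShadowPasses(3.0f, volume) == 1);  // inside: in shadow
    assert(stencilAfterShadowPasses(5.0f, volume) == 0);  // behind the volume
}
```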
What are the texel alignment rules? How do I get a one-to-one mapping?
This is explained fully in the DirectX 8 documentation (under the article titled Directly Mapping Texels to Pixels). However, the executive summary is that you should bias your screen coordinates by -0.5 of a pixel in order to align properly with texels. Most cards now conform properly to the texel alignment rules, but there are some older cards or drivers that do not. To handle these cases, the best advice is to contact the hardware vendor in question and request updated drivers or their suggested workaround.
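As a small worked example of that bias, the sketch below computes the screen-space corners of a rectangle meant to show a texture one-to-one. ScreenRect and alignedQuad are hypothetical helpers, and the example assumes the corners are used as already-transformed screen coordinates with texture coordinates running from 0 to 1 across the rectangle.

```cpp
// Corners of a screen-space rectangle, in pixels.
struct ScreenRect {
    float left, top, right, bottom;
};

// Rectangle for drawing a width x height texture one-to-one with its top-left
// texel on pixel (x, y): every edge is shifted by half a pixel.
constexpr ScreenRect alignedQuad(int x, int y, int width, int height) {
    return ScreenRect{x - 0.5f, y - 0.5f, x + width - 0.5f, y + height - 0.5f};
}

// A 256 x 256 texture at the origin spans -0.5 to 255.5 on both axes.
static_assert(alignedQuad(0, 0, 256, 256).left == -0.5f, "left edge biased");
static_assert(alignedQuad(0, 0, 256, 256).right == 255.5f, "right edge biased");
```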
What is the purpose of the D3DCREATE_PUREDEVICE flag?
If the D3DCREATE_PUREDEVICE flag is specified during device creation, Direct3D will create a "pure" device. This disables the use of the Get* family of methods, and forces vertex processing to be hardware only. This allows the Direct3D runtime to make certain optimizations that afford the best possible path to the driver without having to track so much internal state. In other words, you can see a performance advantage using PUREDEVICE, at the cost of some convenience.
How do I use color keying?
DirectX 8 does not support color keying. You should use alpha blending/testing instead, which is in general a more flexible technique that does not suffer from some of the problems associated with color keying.
How do I enumerate the display devices in a multi-monitor system?
In common with other enumeration functionality, this has moved from being callback based to a simple iteration by the application using methods of the IDirect3D8 interface. Call GetAdapterCount to determine the number of display adapters in the system. Call GetAdapterMonitor to determine which physical monitor an adapter is connected to (this method returns an HMONITOR, which you can then use in the Win32 API GetMonitorInfo to determine information about the physical monitor). Determining the characteristics of a particular display adapter or creating a Direct3D device on that adapter is as simple as passing the appropriate adapter number in place of D3DADAPTER_DEFAULT when calling GetDeviceCaps, CreateDevice, or other methods.
What happened to the old vertex types like D3DVERTEX?
The "pre-canned" vertex types are no longer explicitly supported. The multiple vertex stream system allows for more flexible assembly of vertex data. If you want to use one of the "classic" vertex formats, you can build an appropriate FVF code.
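For instance, the layout of the old D3DVERTEX type (position, normal, one set of texture coordinates) might be rebuilt along the following lines. To keep the sketch compilable without the SDK, the flag values are repeated under the hypothetical names kFvfXyz, kFvfNormal, and kFvfTex1; real code should use D3DFVF_XYZ, D3DFVF_NORMAL, and D3DFVF_TEX1 from the DirectX headers. ClassicVertex is likewise an illustrative name.

```cpp
#include <cstdint>

// Numeric values of D3DFVF_XYZ, D3DFVF_NORMAL and D3DFVF_TEX1, repeated here
// under hypothetical names so the sketch builds without the DirectX headers.
constexpr std::uint32_t kFvfXyz = 0x002;
constexpr std::uint32_t kFvfNormal = 0x010;
constexpr std::uint32_t kFvfTex1 = 0x100;

// Same members, in the same order, as the classic untransformed, unlit vertex.
struct ClassicVertex {
    float x, y, z;     // position
    float nx, ny, nz;  // normal
    float tu, tv;      // texture coordinates
};

// FVF code describing ClassicVertex; the struct order must match the FVF order.
constexpr std::uint32_t kClassicVertexFvf = kFvfXyz | kFvfNormal | kFvfTex1;

static_assert(sizeof(ClassicVertex) == 8 * sizeof(float),
              "vertex must be tightly packed");
```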
Vertex streams confuse me, how do they work?
Direct3D assembles each vertex that is fed into the processing portion of the pipeline from one or more vertex streams. Having only one vertex stream corresponds to the old pre-DirectX 8 model, in which vertices come from a single source. With DirectX 8, different vertex components can come from different sources; for example, one vertex buffer could hold positions and normals, while a second held color values and texture coordinates.
What is a vertex shader?
A vertex shader is a procedure for processing a single vertex. It is defined using a simple assembly-like language that is assembled by the D3DX utility library into a token stream that Direct3D accepts. The vertex shader takes as input a single vertex, and a set of constant values, and outputs a vertex position (in clip-space) and optionally a set of colors and texture coordinates which are used in rasterization. Notice that when you have a custom vertex shader, the vertex components no longer have any semantics applied to them by Direct3D and vertices are simply arbitrary data that is interpreted by the vertex shader you create.
Does a vertex shader perform perspective division or clipping?
No. The vertex shader outputs a homogeneous coordinate in clip-space for the transformed vertex position. Perspective division and clipping are performed automatically post-shader.
Can I generate geometry with a vertex shader?
A vertex shader cannot create or destroy vertices; it operates on a single vertex at a time, taking one unprocessed vertex as input and outputting a single processed vertex. It can therefore be used to manipulate existing geometry (applying deformations, or performing skinning operations) but cannot actually generate new geometry per se.
Can I apply a custom vertex shader to the results of the fixed-function geometry pipeline (or vice-versa)?
No. You have to choose one or the other. If you are using a custom vertex shader, then you are responsible for performing the entire vertex transformation.
Can I use a custom vertex shader if my hardware does not support it?
Yes. The Direct3D software vertex-processing engine fully supports custom vertex shaders with a surprisingly high level of performance.
How do I determine if the hardware supports my custom vertex shader?
Devices capable of supporting vertex shaders in hardware are required to fill out the D3DCAPS8::VertexShaderVersion field to indicate the version level of vertex shader they support. Any device claiming to support a particular level of vertex shader must support all legal vertex shaders that meet the specification for that level or below.
How many constant registers are available for vertex shaders?
Devices supporting DX8 vertex shaders are required to support a minimum of 96 constant registers. Devices may support more than this minimum number, and can report this through the D3DCAPS8::MaxVertexShaderConst field.
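A check combining the version field and the constant register count might look like the sketch below. CapsSketch is a hypothetical two-field stand-in for D3DCAPS8, and vsVersion reproduces the bit layout of the D3DVS_VERSION macro as a function so the example builds without the SDK; real code would fill a D3DCAPS8 through GetDeviceCaps and use the macro itself.

```cpp
#include <cstdint>

// Same bit layout as the D3DVS_VERSION macro in the DirectX 8 headers.
constexpr std::uint32_t vsVersion(std::uint32_t major, std::uint32_t minor) {
    return 0xFFFE0000u | (major << 8) | minor;
}

// Hypothetical subset of D3DCAPS8 with only the two fields discussed here.
struct CapsSketch {
    std::uint32_t VertexShaderVersion;
    std::uint32_t MaxVertexShaderConst;
};

// True if the device reports at least the shader version and the number of
// constant registers that the shader needs.
constexpr bool canRunInHardware(const CapsSketch& caps,
                                std::uint32_t neededVersion,
                                std::uint32_t neededConstants) {
    return caps.VertexShaderVersion >= neededVersion &&
           caps.MaxVertexShaderConst >= neededConstants;
}
```

If the check fails, the application can fall back to software vertex processing rather than giving up on the shader.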
Can I share position data between vertices with different texture coordinates?
The usual example of this situation is a cube in which you want to use a different texture for each face. Unfortunately the answer is no, it's not currently possible to index the vertex components independently. Even with multiple vertex streams, all streams are indexed together.
When I submit an indexed list of primitives, does Direct3D process all of the vertices in the buffer, or just the ones I indexed?
When using the software geometry pipeline, Direct3D first transforms all of the vertices in the range you submitted, rather than transforming them "on demand" as they are indexed. For densely packed data (that is, where most of the vertices are used) this is more efficient, particularly when SIMD instructions are available. If your data is sparsely packed (that is, many vertices are not used) then you may want to consider rearranging your data to avoid too many redundant transformations. When using the hardware geometry acceleration, vertices are typically transformed on demand as they are required.
What is an index buffer?
An index buffer is exactly analogous to a vertex buffer, but instead it contains indices for use in DrawIndexedPrimitive calls. It is highly recommended that you use index buffers rather than raw application-allocated memory when possible, for the same reasons as vertex buffers.
I notice that 32-bit indices are now a supported type; can I use them on all devices?
No. You must check the D3DCAPS8::MaxVertexIndex field to determine the maximum index value that is supported by the device. This value must be greater than 2^16-1 (0xffff) in order for index buffers of type D3DFMT_INDEX32 to be supported. In addition, note that some devices may support 32-bit indices but support a maximum index value less than 2^32-1 (0xffffffff); in this case the application must respect the limit reported by the device.
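The decision can be written down as a small helper. This is a sketch under stated assumptions: IndexFormat and chooseIndexFormat are hypothetical names, maxVertexIndex is the value read from D3DCAPS8::MaxVertexIndex, and highestIndex is the largest index the mesh actually uses.

```cpp
#include <cstdint>

// Stand-ins for D3DFMT_INDEX16 and D3DFMT_INDEX32, plus a "cannot draw" result.
enum class IndexFormat { Index16, Index32, Unsupported };

// Picks an index format for a mesh, respecting the limit the device reports.
constexpr IndexFormat chooseIndexFormat(std::uint32_t maxVertexIndex,
                                        std::uint32_t highestIndex) {
    return highestIndex > maxVertexIndex ? IndexFormat::Unsupported
           : highestIndex <= 0xffffu     ? IndexFormat::Index16
                                         : IndexFormat::Index32;
}
```

A mesh that comes back as Unsupported has to be split into smaller pieces before it can be drawn on that device.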
What restrictions are there on using multiple vertex streams with the fixed-function pipeline?
The fixed-function pipeline requires that each vertex stream be a strict FVF subset, ordered as per a full FVF declaration. Also, note that you must respect any restriction on the number of streams as reported by the D3DCAPS8::MaxStreams field (many current devices and/or drivers only support a single stream).
How can I improve the performance of my Direct3D application?
The following are key areas to look at when optimizing performance:
- Batch size. Direct3D is optimized for large batches of primitives. The more polygons that can be sent in a single call, the better. A good rule of thumb is to aim to average over 100 polygons per call. Below that level you're probably not getting optimal performance, above that and you're into diminishing returns and potential conflicts with concurrency considerations (see below).
- State changes. Changing render state can be an expensive operation, particularly when changing texture. For this reason, it is important to minimize as much as possible the number of state changes made per frame. Also, try to minimize changes of vertex or index buffer.
Note: Changing vertex buffers is no longer as expensive in DirectX 8 as it was with previous versions, but it is still good practice to avoid vertex buffer changes where possible.
- Concurrency. If you can arrange to perform rendering concurrently with other processing, then you will be taking full advantage of system performance. This goal can conflict with the goal of reducing renderstate changes. You need to strike a balance between batching to reduce state changes and pushing data out to the driver early to help achieve concurrency. Using multiple vertex buffers in round-robin fashion can help with concurrency.
- Texture uploads. Uploading textures to the device consumes bandwidth and competes with vertex data for that bandwidth. Therefore, it is important not to overcommit texture memory, which would force your caching scheme to upload excessive quantities of textures each frame.
- Vertex and index buffers. You should always use vertex and index buffers, rather than plain blocks of application-allocated memory. At a minimum, the locking semantics for vertex and index buffers can avoid a redundant copy operation. With some drivers, the vertex or index buffer may be placed in more optimal memory (perhaps in video or AGP memory) for access by the hardware.
- State macro blocks. These were introduced in DX 7.0, and provide a mechanism for recording a series of state changes (including lighting, material and matrix changes) into a macro, which can then be replayed by a single call. This has two advantages:
- You reduce the call overhead by making one call instead of many.
- An aware driver can pre-parse and pre-compile the state changes, making it much faster to submit to the graphics hardware.
State changes can still be expensive, but using state macros can help reduce at least some of the cost.
- Use only a single Direct3D device. If you need to render to multiple targets, use SetRenderTarget. If you are creating a windowed application with multiple 3D windows, use the CreateAdditionalSwapChain API. The runtime is optimized for a single device and there is a considerable speed penalty for using multiple devices.
Which primitive types (strips, fans, lists, and so on) should I use?
Many meshes encountered in real data feature vertices that are shared by multiple polygons. To maximize performance it is desirable to reduce the duplication in vertices transformed and sent across the bus to the rendering device. It is clear that using simple triangle lists achieves no vertex sharing, making it the least optimal method. The choice is then between using strips and fans, which imply a specific connectivity relationship between polygons, and using indexed lists. Where the data naturally falls into strips and fans, these are the most appropriate choice, since they minimize the data sent to the driver. However, decomposing meshes into strips and fans often results in a large number of separate pieces, implying a large number of DrawPrimitive calls. For this reason, the most efficient method is usually to use a single DrawIndexedPrimitive call with a triangle list. An additional advantage of using an indexed list is that a benefit can be gained even when consecutive triangles only share a single vertex. In summary, if your data naturally falls into large strips or fans, use strips or fans; otherwise use indexed lists.
What's a good usage pattern for vertex buffers if I'm generating dynamic data?
1. Create a vertex buffer using the D3DUSAGE_DYNAMIC and D3DUSAGE_WRITEONLY usage flags, and the D3DPOOL_DEFAULT pool flag. (Also specify D3DUSAGE_SOFTWAREPROCESSING if you are using software vertex processing.)
2. I = 0.
3. Set state (textures, renderstates, and so on).
4. Check if there is space in the buffer, that is, I + M <= N? (Where N is the capacity of the buffer in vertices and M is the number of new vertices.)
5. If yes, then Lock the VB with D3DLOCK_NOOVERWRITE. This tells Direct3D and the driver that you will be adding vertices and won't be modifying the ones that you previously batched. Therefore, if a DMA operation was in progress, it isn't interrupted. If no, go to step 10.
6. Fill in the M vertices at I.
7. Call Draw[Indexed]Primitive. For non-indexed primitives use I as the StartVertex parameter. For indexed primitives, ensure the indices point to the correct portion of the vertex buffer (it may be easiest to use the BaseVertexIndex parameter of the SetIndices call to achieve this).
8. I += M.
9. Go to step 3.
10. Ok, so we are out of space, so let us start with a new VB. We don't want to use the same one because there might be a DMA operation in progress. We communicate this to Direct3D and the driver by locking the same VB with the D3DLOCK_DISCARD flag. What this means is "you can give me a new pointer because I am done with the old one and don't really care about the old contents any more."
11. I = 0.
12. Go to step 4 (or 6).
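The bookkeeping in the steps above can be sketched without any Direct3D calls. In the following, LockMode, Reservation, and DynamicBufferCursor are hypothetical names: the class only tracks the write position (I) against the buffer capacity (N) and reports which lock flag the next batch of M vertices should use. Real code would pass D3DLOCK_NOOVERWRITE or D3DLOCK_DISCARD to the vertex buffer's Lock call accordingly.

```cpp
#include <cassert>
#include <cstddef>

// Stand-ins for the D3DLOCK_NOOVERWRITE and D3DLOCK_DISCARD lock flags.
enum class LockMode { NoOverwrite, Discard };

struct Reservation {
    std::size_t firstVertex;  // where to write; also StartVertex for the draw
    LockMode mode;            // which lock flag to use for this batch
};

class DynamicBufferCursor {
public:
    explicit DynamicBufferCursor(std::size_t capacity) : capacity_(capacity) {}

    // Reserves room for `count` vertices (M) and advances the cursor (I).
    Reservation reserve(std::size_t count) {
        assert(count <= capacity_ && "batch is larger than the whole buffer");
        LockMode mode = LockMode::NoOverwrite;
        if (next_ + count > capacity_) {
            // Out of space: ask for fresh memory and start again at zero.
            next_ = 0;
            mode = LockMode::Discard;
        }
        const Reservation result{next_, mode};
        next_ += count;
        return result;
    }

private:
    std::size_t capacity_;  // N, in vertices
    std::size_t next_ = 0;  // I, the next free vertex
};
```

For each batch, the application would call reserve, lock the buffer with the reported flag, write its vertices starting at firstVertex, unlock, and then draw using firstVertex as the start of the range.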
What file formats are supported by the D3DX image file loader functions?
The D3DX image file loader functions support BMP, TGA, PNG, JPG, DIB, PPM, and DDS files.
The text rendering functions in D3DX don't seem to work, what am I doing wrong?
A common mistake when using the ID3DXFont::DrawText functions is to specify a zero alpha component for the color parameter, resulting in completely transparent (that is, invisible) text. For fully opaque text, ensure that the alpha component of the color parameter is fully saturated (255).
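The mistake is easy to make when the color is written as a six-digit hexadecimal constant, because the missing top byte is the alpha channel. The sketch below shows the difference; argb and alphaOf are hypothetical helpers that pack and unpack a color with the same byte layout as the D3DCOLOR_ARGB macro, which real code should use instead.

```cpp
#include <cstdint>

// Packs a color with alpha in the top byte, then red, green, and blue.
constexpr std::uint32_t argb(std::uint32_t a, std::uint32_t r,
                             std::uint32_t g, std::uint32_t b) {
    return ((a & 0xffu) << 24) | ((r & 0xffu) << 16) | ((g & 0xffu) << 8) |
           (b & 0xffu);
}

// Extracts the alpha component from a packed color.
constexpr std::uint32_t alphaOf(std::uint32_t color) { return color >> 24; }

constexpr std::uint32_t kInvisibleWhite = 0x00ffffffu;  // alpha 0: nothing drawn
constexpr std::uint32_t kOpaqueWhite = argb(255, 255, 255, 255);

static_assert(alphaOf(kInvisibleWhite) == 0, "six-digit constant has no alpha");
static_assert(alphaOf(kOpaqueWhite) == 255, "fully saturated alpha");
```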
Should I use ID3DXFont or the SDK framework CD3DFont class for font rendering?
The ID3DXFont class is capable of handling kerning and complex international fonts because it uses GDI to draw the string. It can therefore be a little slow since it needs to invoke GDI each time.
CD3DFont is designed for speed and uses textured primitives to draw the characters. It can only handle simple fonts and does not support the full array of formatting options available to ID3DXFont, but is useful for simple fast displays such as framerate counters and so forth.
For production code, you may well want to implement your own font rendering using textured primitives and/or a GDI based scheme with caching to avoid re-drawing.
Why do I get a burst of static when my application starts up? I notice this problem with other applications too.
You probably installed the debug DirectX runtime. The debug version of the runtime fills buffers with static in order to help developers catch bugs with uninitialized buffers. You cannot guarantee the contents of a DirectSound buffer after creation; in particular, you cannot assume that a buffer will be zeroed out.
How do I ensure my game will work properly with Network Address Translators (NATs) and Internet Connection Sharing (ICS) setups?
NATs and ICS are complex topics that are covered in greater detail in a separate article on MSDN. However, the following tips are good general guidelines:
- Use a client-server, not peer-to-peer network topology, using the IDirectPlay8Client and IDirectPlay8Server interfaces.
- Host servers on the clear Internet, not behind a NAT.
- Enumerate the game port directly, without using DPNSVR.
- Do not embed IP addresses or port numbers in your messages.
For issues regarding peer-to-peer games, hosting servers behind NATs, and specific advice for ICS on various different Windows operating systems, refer to the more detailed documentation.
What is DPNSVR for?
DPNSVR is a forwarding service for enumeration requests that eliminates problems caused by conflicts between port usages for multiple DirectPlay applications. Using DPNSVR allows DirectPlay to select the port to use automatically, while allowing clients to enumerate your game. By default, DirectPlay will use DPNSVR as this generally provides the most flexibility for applications; however, you can disable it by specifying the DPNSESSION_NODPNSVR flag when you create your session. The use of DPNSVR can cause some issues with NATs on the client side. Specifically, if the client enumerates the host using the DPNSVR port and the host responds using its own port, the NAT may refuse to forward the packet to the client because it did not come from the same port the request was sent to.
Why does IDirectPlay8LobbyClient::Initialize return DPNERR_NOTALLOWED?
DirectPlay does not allow more than one lobby client or application per process and attempting to create multiple clients will cause this error to be returned.