DirectX 12 Study Notes (7): Drawing in Direct3D Part II

Latest recommended article published 2024-02-23 18:07:41

Calette

Category: DirectX12

Article link: https://blog.csdn.net/Calette/article/details/103973144

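This article takes a close look at the frame resources optimization in DirectX 12 and how it avoids the idle time caused by CPU/GPU synchronization, and it introduces render items and pass constants. It shows how to draw more complex scenes in Direct3D by generating geometry for ellipsoids, spheres, and cylinders. It also discusses the variety of root signatures and the trade-offs involved in practice, as well as how to manage resources efficiently in shaders.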

Summary generated by CSDN using AI.

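This chapter introduces several drawing patterns that are used throughout the rest of the book. It begins with a drawing optimization called "frame resources". With frame resources we modify the render loop so that we no longer have to flush the command queue every frame, which improves CPU and GPU utilization. Next we introduce the concept of a render item and explain how to divide constant data based on update frequency. We also examine root signatures in more detail and learn about the other root parameter types: root descriptors and root constants. Finally, we show how to draw some more complicated objects; by the end of the chapter you will be able to draw a surface that resembles hills and valleys, cylinders, spheres, and an animated wave simulation.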

7.1 FRAME RESOURCES

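Recall from Section 4.2 that the CPU and GPU work in parallel. The CPU builds and submits command lists (in addition to other CPU work), and the GPU processes the commands in the command queue. The goal is to keep both the CPU and the GPU busy so that the hardware resources available on the system are fully used. So far in our demos, we have synchronized the CPU and GPU once per frame. This is necessary for two reasons: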

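1. A command allocator cannot be reset until the GPU has finished executing its commands. Suppose we did not synchronize, so that the CPU could move on to frame n + 1 before the GPU has finished processing the current frame n. If the CPU resets the command allocator in frame n + 1 while the GPU is still processing commands from frame n, we clear commands that the GPU is still working on.
2. A constant buffer cannot be updated by the CPU until the GPU has finished executing the draw commands that reference it. This corresponds to the situation described in Section 4.2.2.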

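Thus we have been calling D3DApp::FlushCommandQueue at the end of every frame to ensure the GPU has finished executing all of the frame's commands. This solution works but is inefficient for the following reasons: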

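1. At the beginning of a frame, the GPU has no commands to process because the flush emptied the command queue; it must wait until the CPU builds and submits some commands.
2. At the end of a frame, the CPU must wait for the GPU to finish processing the commands.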

So in every frame, the CPU and the GPU are each idle at some point.

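One solution to this problem is to create a circular array of the resources that the CPU needs to modify each frame. We call such resources frame resources, and we usually use a circular array of three frame resource elements. The idea is that for frame n, the CPU cycles through the frame resource array to get the next available frame resource (one that is not in use by the GPU). The CPU then performs any resource updates and builds and submits the command lists for frame n while the GPU works on previous frames. The CPU then moves on to frame n + 1 and repeats the process. With three elements in the frame resource array, the CPU can get up to two frames ahead of the GPU, which keeps the GPU busy.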
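To make the efficiency argument concrete, here is a small timing model. It is only a sketch under simplifying assumptions that are not in the original text: every frame costs the CPU a fixed cpuCost and the GPU a fixed gpuCost (arbitrary time units), and the function name SimulateTotalTime is made up for illustration. A ring size of 1 behaves like flushing the command queue every frame.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Returns the time at which the GPU finishes the last of frameCount frames.
// ringSize is the number of frame resources in the circular array.
long SimulateTotalTime(int frameCount, int ringSize, long cpuCost, long gpuCost)
{
    assert(frameCount > 0 && ringSize > 0);

    std::vector<long> gpuDone(frameCount, 0); // completion time of each frame
    long cpuTime = 0;                         // current time on the CPU timeline

    for (int i = 0; i < frameCount; ++i)
    {
        // The CPU may reuse a frame resource only after the GPU has finished
        // the frame that last used it, which is frame i - ringSize.
        if (i >= ringSize)
            cpuTime = std::max(cpuTime, gpuDone[i - ringSize]);

        cpuTime += cpuCost; // build and submit the commands for frame i

        // The GPU starts frame i once it has been submitted and the
        // previous frame has been completed.
        const long gpuStart = (i > 0) ? std::max(cpuTime, gpuDone[i - 1]) : cpuTime;
        gpuDone[i] = gpuStart + gpuCost;
    }
    return gpuDone[frameCount - 1];
}
```

In this model, 10 frames with cpuCost = 2 and gpuCost = 3 take 50 time units with a single frame resource and 32 with three, because the CPU work overlaps with the GPU work instead of alternating with it.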

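Below is an example of the frame resource class we use in this chapter's "Shapes" demo. Because the CPU only needs to modify constant buffers in this demo, the frame resource class contains only constant buffers: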

// Stores the resources needed for the CPU to build the command lists
// for a frame. The contents here will vary from app to app based on
// the needed resources.
struct FrameResource
{
public:

    FrameResource(ID3D12Device* device, UINT passCount, UINT objectCount);
    FrameResource(const FrameResource& rhs) = delete;
    FrameResource& operator=(const FrameResource& rhs) = delete;
    ~FrameResource();

    // We cannot reset the allocator until the GPU is done processing the
    // commands. So each frame needs their own allocator.
    Microsoft::WRL::ComPtr<ID3D12CommandAllocator> CmdListAlloc;

    // We cannot update a cbuffer until the GPU is done processing the
    // commands that reference it. So each frame needs their own cbuffers.
    std::unique_ptr<UploadBuffer<PassConstants>> PassCB = nullptr;
    std::unique_ptr<UploadBuffer<ObjectConstants>> ObjectCB = nullptr;

    // Fence value to mark commands up to this fence point. This lets us
    // check if these frame resources are still in use by the GPU.
    UINT64 Fence = 0;
};

FrameResource::FrameResource(ID3D12Device* device, UINT passCount, UINT objectCount)
{
    ThrowIfFailed(device->CreateCommandAllocator(
        D3D12_COMMAND_LIST_TYPE_DIRECT,
        IID_PPV_ARGS(CmdListAlloc.GetAddressOf())));

    PassCB = std::make_unique<UploadBuffer<PassConstants>>(device, passCount, true);
    ObjectCB = std::make_unique<UploadBuffer<ObjectConstants>>(device, objectCount, true);
}
FrameResource::~FrameResource() { }

Our application class then instantiates a vector of three frame resources and keeps member variables to track the current frame resource:

std::vector<std::unique_ptr<FrameResource>> mFrameResources;
FrameResource* mCurrFrameResource = nullptr;
int mCurrFrameResourceIndex = 0;

void ShapesApp::BuildFrameResources()
{
    for(int i = 0; i < gNumFrameResources; ++i)
    {
        mFrameResources.push_back(std::make_unique<FrameResource> (md3dDevice.Get(), 1, (UINT)mAllRitems.size()));
    }
}

For CPU frame n, the algorithm works as follows:

void ShapesApp::Update(const GameTimer& gt)
{
    // Cycle through the circular frame resource array.
    mCurrFrameResourceIndex = (mCurrFrameResourceIndex + 1) % gNumFrameResources;
    mCurrFrameResource = mFrameResources[mCurrFrameResourceIndex].get();

    // Has the GPU finished processing the commands of the current frame
    // resource? If not, wait until the GPU has completed commands up to
    // this fence point. The fence object (not the command queue) reports
    // the completed value and signals the event.
    if(mCurrFrameResource->Fence != 0 && mFence->GetCompletedValue() < mCurrFrameResource->Fence)
    {
        HANDLE eventHandle = CreateEventEx(nullptr, nullptr, 0, EVENT_ALL_ACCESS);
        ThrowIfFailed(mFence->SetEventOnCompletion(
            mCurrFrameResource->Fence, eventHandle));
        WaitForSingleObject(eventHandle, INFINITE);
        CloseHandle(eventHandle);
    }

    // […] Update resources in mCurrFrameResource (like cbuffers).
}

void ShapesApp::Draw(const GameTimer& gt)
{
    // […] Build and submit command lists for this frame.
    // Advance the fence value to mark commands up to this fence point.
    mCurrFrameResource->Fence = ++mCurrentFence;

    // Add an instruction to the command queue to set a new fence point.
    // Because we are on the GPU timeline, the new fence point won’t be
    // set until the GPU finishes processing all the commands prior to
    // this Signal().
    mCommandQueue->Signal(mFence.Get(), mCurrentFence);

    // Note that GPU could still be working on commands from previous
    // frames, but that is okay, because we are not touching any frame
    // resources associated with those frames.
}
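
The fence bookkeeping in Update and Draw can be tried out without a Direct3D device. The following is a minimal sketch, not the demo's code: FrameSlot, FrameRing, and kNumFrameSlots are hypothetical names, and a plain integer passed in by the caller stands in for the value an ID3D12Fence would report as completed.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a frame resource: only the fence value matters here.
struct FrameSlot
{
    std::uint64_t fence = 0; // 0 means "never submitted"
};

constexpr std::size_t kNumFrameSlots = 3; // assumed ring size, as in the text

class FrameRing
{
public:
    // Mirrors the start of Update(): advance to the next slot and report
    // whether the GPU is still using it. completedFence is the last fence
    // value the GPU has reached.
    bool AdvanceNeedsWait(std::uint64_t completedFence)
    {
        index_ = (index_ + 1) % kNumFrameSlots;
        const FrameSlot& slot = slots_[index_];
        return slot.fence != 0 && completedFence < slot.fence;
    }

    // Mirrors the end of Draw(): remember the fence value that marks the
    // end of the commands recorded with the current slot.
    void MarkSubmitted(std::uint64_t fenceValue)
    {
        assert(fenceValue != 0); // 0 is reserved for "never submitted"
        slots_[index_].fence = fenceValue;
    }

    std::size_t Index() const { return index_; }

private:
    std::array<FrameSlot, kNumFrameSlots> slots_{};
    std::size_t index_ = 0;
};
```

With three slots and a GPU that has not completed anything yet, the first three frames go through without waiting; the fourth frame lands on the slot used by the first frame and has to wait until the fence value 1 has been reached.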

7.2 RENDER ITEMS

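Drawing an object requires setting multiple parameters, such as binding vertex and index buffers, binding object constants, setting a primitive type, and specifying the DrawIndexedInstanced parameters. As we begin to draw more objects in our scenes, it is helpful to create a lightweight structure that stores the data needed to draw an object. This data varies from app to app as we add new features that require different drawing data. We call the set of data needed to submit a full draw call to the rendering pipeline a render item. For this demo, our RenderItem structure looks like this: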

// Lightweight structure stores parameters to draw a shape. This will
// vary from app-to-app.
struct RenderItem
{
    RenderItem() = default;

    // World matrix of the shape that describes the object’s local space
    // relative to the world space, which defines the position,
    // orientation, and scale of the object in the world.
    XMFLOAT4X4 World = MathHelper::Identity4x4();

    // Dirty flag indicating the object data has changed and we need
    // to update the constant buffer. Because we have an object
    // cbuffer for each FrameResource, we have to apply the
    // update to each FrameResource. Thus, when we modify object data we should set
    // NumFramesDirty = gNumFrameResources so that each frame resource gets the update.
    int NumFramesDirty = gNumFrameResources;

    // Index into GPU constant buffer corresponding to the ObjectCB
    // for this render item.
    UINT ObjCBIndex = -1;

    // Geometry associated with this render-item. Note that multiple
    // render-items can share the same geometry.
    MeshGeometry* Geo = nullptr;

    // Primitive topology.
    D3D12_PRIMITIVE_TOPOLOGY PrimitiveType = D3D_PRIMITIVE_TOPOLOGY_TRIANGLELIST;

    // DrawIndexedInstanced parameters.
    UINT IndexCount = 0;
    UINT StartIndexLocation = 0;
    int BaseVertexLocation = 0;
};
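
The NumFramesDirty counter described in the comments above can be illustrated in isolation. This is a simplified sketch under stated assumptions: a single float replaces the world matrix, and Item, FrameCopy, SetWorld, and UpdateObjectConstant are hypothetical names rather than the demo's API.

```cpp
#include <array>
#include <cassert>

constexpr int kNumFrameResources = 3; // assumed, mirrors gNumFrameResources

// Simplified render item: one float stands in for the world matrix.
struct Item
{
    float world = 0.0f;
    int numFramesDirty = kNumFrameResources;
};

// Simplified frame resource: its own copy of the object's constant data.
struct FrameCopy
{
    float objectConstant = -1.0f;
};

using FrameCopies = std::array<FrameCopy, kNumFrameResources>;

// Changing the object marks it dirty for every frame resource again.
void SetWorld(Item& item, float world)
{
    item.world = world;
    item.numFramesDirty = kNumFrameResources;
}

// Called once per frame with the current frame resource. Copies the data
// only while the item is still dirty and returns true when a copy was made.
bool UpdateObjectConstant(Item& item, FrameCopy& frame)
{
    assert(item.numFramesDirty <= kNumFrameResources);
    if (item.numFramesDirty <= 0)
        return false;

    frame.objectConstant = item.world;
    --item.numFramesDirty;
    return true;
}
```

After a change, the next three frames each copy the new value into their own frame resource; from the fourth frame on nothing is copied until the object changes again.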

Our application maintains lists of render items based on how they need to be drawn; that is, render items that need different PSOs are kept in different lists.

// List of all the render items.
std::vector<std::unique_ptr<RenderItem>> mAllRitems;

// Render items divided by PSO.
std::vector<RenderItem*> mOpaqueRitems;
std::vector<RenderItem*> mTransparentRitems;
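
As a rough illustration of this ownership split, the sketch below keeps one owning list and two non-owning lists. SketchItem, SketchScene, and the transparent flag are assumptions made for the example; the RenderItem structure above has no such flag, and how items are assigned to a PSO list depends on the application.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

struct SketchItem
{
    bool transparent = false; // hypothetical flag, not part of RenderItem
    unsigned indexCount = 0;
};

struct SketchScene
{
    std::vector<std::unique_ptr<SketchItem>> all; // owns every item
    std::vector<SketchItem*> opaque;              // non-owning, one list per PSO
    std::vector<SketchItem*> transparent;

    void Add(bool isTransparent, unsigned indexCount)
    {
        auto item = std::make_unique<SketchItem>();
        item->transparent = isTransparent;
        item->indexCount = indexCount;

        // The raw pointer stays valid when 'all' grows, because the item
        // itself lives on the heap and only the unique_ptr is moved.
        (isTransparent ? transparent : opaque).push_back(item.get());
        all.push_back(std::move(item));
    }
};

// Stand-in for a draw loop over one list: sums the indices that would be drawn.
unsigned TotalIndexCount(const std::vector<SketchItem*>& items)
{
    unsigned total = 0;
    for (const SketchItem* item : items)
    {
        assert(item != nullptr);
        total += item->indexCount;
    }
    return total;
}
```

Keeping the per-PSO lists as raw pointers means the pipeline state only has to be switched once per list, while the owning vector controls the lifetime of all items.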

7.3 PASS CONSTANTS

As seen in Section 7.1, we introduced a new constant buffer in our FrameResource class:

std::unique_ptr<UploadBuffer<PassConstants>> PassCB = nullptr;

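In the demos going forward, this buffer stores constant data that is fixed over a given rendering pass, such as the eye position, the view and projection matrices, and information about the screen (render target) dimensions; it also includes game timing information, which is useful data to have access to in shader programs. Note that our demos will not necessarily use all of this constant data, but it is convenient to have it available, and there is little cost to providing the extra data. For example, while we do not need the render target size now, we will need that information when we implement post-processing effects.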

cbuffer cbPass : register(b1)
{
    float4x4 gView;
    float4x4 gInvView;
    float4x4 gProj;
    float4x4 gInvProj;
    float4x4 gViewProj;
    float4x4 gInvViewProj;
    float3 gEyePosW;
    float cbPerObjectPad1;
    float2 gRenderTargetSize;
    float2 gInvRenderTargetSize;
    float gNearZ;
    float gFarZ;
    float gTotalTime;
    float gDeltaTime;
};
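
On the CPU side, the data uploaded to cbPass has to match this layout byte for byte. The sketch below is an assumption-laden illustration rather than the demo's PassConstants definition: plain float arrays stand in for the DirectXMath types so that it compiles anywhere, and PassConstantsSketch and AlignConstantBufferSize are hypothetical names. It also shows the rounding to a multiple of 256 bytes that Direct3D 12 requires for constant buffer sizes.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical CPU-side mirror of cbPass. Member order follows the HLSL
// declaration above.
struct PassConstantsSketch
{
    float View[16];
    float InvView[16];
    float Proj[16];
    float InvProj[16];
    float ViewProj[16];
    float InvViewProj[16];
    float EyePosW[3];
    float Pad1;                   // explicit padding so the C++ layout matches
                                  // HLSL's packing into 16-byte registers
    float RenderTargetSize[2];
    float InvRenderTargetSize[2];
    float NearZ;
    float FarZ;
    float TotalTime;
    float DeltaTime;
};

// 6 matrices * 64 bytes + 3 registers * 16 bytes = 432 bytes.
static_assert(sizeof(PassConstantsSketch) == 432, "unexpected layout");
static_assert(offsetof(PassConstantsSketch, RenderTargetSize) % 16 == 0,
              "RenderTargetSize should start a new 16-byte register");

// Rounds a byte size up to the next multiple of 256.
constexpr std::uint32_t AlignConstantBufferSize(std::uint32_t byteSize)
{
    return (byteSize + 255u) & ~255u;
}
```

For this struct, the 432 bytes of data are rounded up to 512 bytes per constant buffer element.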

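We have also modified our per-object constant buffer so that it stores only the constants associated with an object. So far, the only constant data we associate with an object for drawing is its world matrix: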

cbuffer cbPerObject : register(b0)
{
    float4x4 gWorld;
};

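We can group constants by update frequency. Per-pass constants only need to be updated once per rendering pass, while object constants only need to change when the object's world matrix changes. If the scene contains a static object, such as a tree, we only need to set its …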
