Introduction to 3D Game Programming with DirectX 12 Study Notes --- Chapter 7: Drawing in Direct3D (Part II)

贾宝蛋@

Article link: https://blog.csdn.net/weixin_42441849/article/details/83272233

Code repository:

https://github.com/jiabaodan/Direct12BookReadingNotes

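Learning Objectives

Understand this chapter's change to the rendering process that removes the need to flush the command queue every frame, which improves performance;
Learn about the two other kinds of root signature parameter: root descriptors and root constants;
Become familiar with generating and drawing common geometric shapes procedurally: boxes, cylinders and spheres;
Learn how to animate vertices on the CPU and upload the new vertex data to GPU memory through a dynamic vertex buffer.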

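1 Frame Resources

In the earlier code we called D3DApp::FlushCommandQueue at the end of every frame to synchronize the CPU and the GPU. This works, but it is inefficient for two reasons:

At the start of each frame the GPU has no commands to execute, so it sits idle until the CPU submits some;
At the end of each frame the CPU has to wait for the GPU to finish executing the commands.

One solution is to keep a circular array of the resources the CPU updates every frame. We call these frame resources, and the array usually has 3 elements. After submitting the commands for one frame, the CPU moves on to the next available frame resource (one the GPU is not currently using) and keeps updating data. With 3 elements the CPU can run up to 2 frames ahead of the GPU, which keeps the GPU busy. The listing below is the frame resource class used in the Shapes demo. Because the CPU only needs to update constant buffers in that demo, the class holds only constant buffers, apart from its own command allocator and fence value: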

// Stores the resources needed for the CPU to build the command lists
// for a frame. The contents here will vary from app to app based on
// the needed resources.
struct FrameResource
{ 
public:
	FrameResource(ID3D12Device* device, UINT passCount, UINT objectCount);
	FrameResource(const FrameResource& rhs) = delete;
	FrameResource& operator=(const FrameResource& rhs) = delete;
	~FrameResource();
	
	// We cannot reset the allocator until the GPU is done processing the
	// commands. So each frame needs their own allocator.
	Microsoft::WRL::ComPtr<ID3D12CommandAllocator> CmdListAlloc;
	
	// We cannot update a cbuffer until the GPU is done processing the
	// commands that reference it. So each frame needs their own cbuffers.
	std::unique_ptr<UploadBuffer<PassConstants>> PassCB = nullptr;
	std::unique_ptr<UploadBuffer<ObjectConstants>> ObjectCB = nullptr;
	
	// Fence value to mark commands up to this fence point. This lets us
	// check if these frame resources are still in use by the GPU.
	UINT64 Fence = 0;
};

FrameResource::FrameResource(ID3D12Device* device, UINT passCount, UINT objectCount)
{
	ThrowIfFailed(device->CreateCommandAllocator(
		D3D12_COMMAND_LIST_TYPE_DIRECT,
		IID_PPV_ARGS(CmdListAlloc.GetAddressOf())));
		
	PassCB = std::make_unique<UploadBuffer<PassConstants>> (device, passCount, true);
	ObjectCB = std::make_unique<UploadBuffer<ObjectConstants>> (device, objectCount, true);
} 

FrameResource::~FrameResource()
{
}

In our application we use a vector to instantiate 3 frame resources, and keep member variables to track the current one:

const int gNumFrameResources = 3;
std::vector<std::unique_ptr<FrameResource>> mFrameResources;
FrameResource* mCurrFrameResource = nullptr;
int mCurrFrameResourceIndex = 0;

void ShapesApp::BuildFrameResources()
{
	for(int i = 0; i < gNumFrameResources; ++i)
	{
		mFrameResources.push_back(std::make_unique<FrameResource> (
			md3dDevice.Get(), 1,
			(UINT)mAllRitems.size()));
	}
}

Now, for CPU frame N, the algorithm is:

void ShapesApp::Update(const GameTimer& gt)
{
	// Cycle through the circular frame resource array.
	mCurrFrameResourceIndex = (mCurrFrameResourceIndex + 1) % gNumFrameResources;
	mCurrFrameResource = mFrameResources[mCurrFrameResourceIndex].get();
	
	// Has the GPU finished processing the commands of the current frame
	// resource? If not, wait until the GPU has completed commands up to
	// this fence point.
	if(mCurrFrameResource->Fence != 0 
		&& mFence->GetCompletedValue() < mCurrFrameResource->Fence)
	{
		// Unnamed auto-reset event (no name, no flags).
		HANDLE eventHandle = CreateEventEx(nullptr, nullptr, 0, EVENT_ALL_ACCESS);
		
		ThrowIfFailed(mFence->SetEventOnCompletion(
			mCurrFrameResource->Fence, eventHandle));
			
		WaitForSingleObject(eventHandle, INFINITE);
		CloseHandle(eventHandle);
	}
	
	// […] Update resources in mCurrFrameResource (like cbuffers).
}

void ShapesApp::Draw(const GameTimer& gt)
{
	// […] Build and submit command lists for this frame.
	// Advance the fence value to mark commands up to this fence point.
	mCurrFrameResource->Fence = ++mCurrentFence;
	
	// Add an instruction to the command queue to set a new fence point.
	// Because we are on the GPU timeline, the new fence point won’t be
	// set until the GPU finishes processing all the commands prior to
	// this Signal().
	mCommandQueue->Signal(mFence.Get(), mCurrentFence);
	
	// Note that GPU could still be working on commands from previous
	// frames, but that is okay, because we are not touching any frame
	// resources associated with those frames.
}

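This scheme does not eliminate waiting entirely. If one processor works through frames much faster than the other, it will eventually have to wait for the other to catch up.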
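To make the interaction between the circular array and the fence values more concrete, here is a minimal CPU-only sketch. It does not touch Direct3D at all: FrameSlot, MustWait, CountWaits and kNumFrameResources are hypothetical names introduced only for this illustration, and the GPU is modeled as nothing more than "the last fence value it has completed", assumed to trail the CPU by a fixed number of submissions.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a frame resource: only the fence value matters here.
struct FrameSlot
{
	std::uint64_t Fence = 0; // 0 means "never submitted"
};

constexpr int kNumFrameResources = 3;

// Returns true if the CPU would have to block before reusing this slot,
// given the last fence value the simulated GPU has completed.
bool MustWait(const FrameSlot& slot, std::uint64_t gpuCompletedFence)
{
	return slot.Fence != 0 && gpuCompletedFence < slot.Fence;
}

// Simulates `frames` CPU frames while the GPU trails the CPU by `gpuLag`
// submissions. Returns the number of frames in which the CPU would wait.
// The model only counts those frames; it does not let the GPU catch up.
int CountWaits(int frames, std::uint64_t gpuLag)
{
	assert(frames >= 0);

	std::array<FrameSlot, kNumFrameResources> slots{};
	int index = 0;
	std::uint64_t currentFence = 0;
	int waits = 0;

	for (int f = 0; f < frames; ++f)
	{
		// Cycle through the circular array, as in Update().
		index = (index + 1) % kNumFrameResources;

		// The GPU has completed everything except the last gpuLag submissions.
		const std::uint64_t completed =
			currentFence > gpuLag ? currentFence - gpuLag : 0;

		if (MustWait(slots[index], completed))
			++waits;

		// Mark the commands of this frame with a new fence point, as in Draw().
		slots[index].Fence = ++currentFence;
	}
	return waits;
}
```

Under these assumptions, a GPU that trails by at most 2 submissions never makes the CPU wait, while one that trails by 3 forces a wait on every frame once all three slots have been used. This matches the remark above that three frame resources only help as long as neither processor gets too far ahead.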

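2 Render Items

Drawing an object requires setting many parameters: binding the vertex and index buffers, binding the object's constant buffer, setting the primitive topology, and specifying the DrawIndexedInstanced parameters. Once we draw more than a few objects, it helps to have a lightweight structure that stores all of this data. We call the full set of data needed to submit a single draw call a render item. In the current demo the RenderItem structure looks like this: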

// Lightweight structure stores parameters to draw a shape. This will
// vary from app-to-app.
struct RenderItem
{
	RenderItem() = default;
	
	// World matrix of the shape that describes the object’s local space
	// relative to the world space, which defines the position,
	// orientation, and scale of the object in the world.
	XMFLOAT4X4 World = MathHelper::Identity4x4();
	
	// Dirty flag indicating the object data has changed and we need
	// to update the constant buffer. Because we have an object
	// cbuffer for each FrameResource, we have to apply the
	// update to each FrameResource. Thus, when we modify object data we
	// should set NumFramesDirty = gNumFrameResources so that each frame
	// resource gets the update.
	int NumFramesDirty = gNumFrameResources;
	
	// Index into GPU constant buffer corresponding to the ObjectCB
	// for this render item.
	UINT ObjCBIndex = -1;
	
	// Geometry associated with this render-item. Note that multiple
	// render-items can share the same geometry.
	MeshGeometry* Geo = nullptr;
	
	// Primitive topology.
	D3D12_PRIMITIVE_TOPOLOGY PrimitiveType = D3D_PRIMITIVE_TOPOLOGY_TRIANGLELIST;
	
	// DrawIndexedInstanced parameters.
	UINT IndexCount = 0;
	UINT StartIndexLocation = 0;
	int BaseVertexLocation = 0;
};

Our application keeps lists of render items grouped by how they need to be drawn; render items that need different PSOs go into different lists:

// List of all the render items.
std::vector<std::unique_ptr<RenderItem>> mAllRitems;

// Render items divided by PSO.
std::vector<RenderItem*> mOpaqueRitems;
std::vector<RenderItem*> mTransparentRitems;

3 Pass Constants

The FrameResource class shown earlier introduced a new constant buffer:

std::unique_ptr<UploadBuffer<PassConstants>> PassCB = nullptr;

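It stores constants that are the same for every object in a rendering pass, such as the eye position, the view and projection matrices, information about the render target dimensions, and timing data. The demo does not use all of these yet, but having them available is convenient and costs very little extra space. For example, the render target size is useful when implementing post-processing effects: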

cbuffer cbPass : register(b1)
{
	float4x4 gView;
	float4x4 gInvView;
	float4x4 gProj;
	float4x4 gInvProj;
	float4x4 gViewProj;
	float4x4 gInvViewProj;
	float3 gEyePosW;
	float cbPerObjectPad1;
	float2 gRenderTargetSize;
	float2 gInvRenderTargetSize;
	float gNearZ;
	float gFarZ;
	float gTotalTime;
	float gDeltaTime;
};
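The cbPerObjectPad1 member in cbPass is there for layout reasons. HLSL packs constant buffer variables into 16-byte registers and does not let a variable straddle a register boundary, whereas a C++ struct of floats is packed contiguously. The sketch below illustrates that under simplifying assumptions: PassTail is a hypothetical plain-float mirror of only the tail of cbPass (the six matrices are left out, since each is 64 bytes and preserves 16-byte alignment), and HlslPlace is a hypothetical helper that handles only scalars and vectors of at most 16 bytes.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical plain-float mirror of the members that follow the matrices.
struct PassTail
{
	float EyePosW[3];             // float3 gEyePosW
	float Pad1;                   // float  cbPerObjectPad1
	float RenderTargetSize[2];    // float2 gRenderTargetSize
	float InvRenderTargetSize[2]; // float2 gInvRenderTargetSize
	float NearZ;
	float FarZ;
	float TotalTime;
	float DeltaTime;
};

// Where a variable of `size` bytes ends up if the next free byte is `offset`:
// it stays put unless it would cross a 16-byte register boundary, in which
// case it moves to the start of the next register.
std::size_t HlslPlace(std::size_t offset, std::size_t size)
{
	assert(size > 0 && size <= 16);
	return (offset % 16 + size > 16) ? (offset + 15) / 16 * 16 : offset;
}

// With the pad, the C++ mirror puts gRenderTargetSize at byte 16, which is
// also where the HLSL rule would put it.
static_assert(offsetof(PassTail, RenderTargetSize) == 16, "pad keeps layouts in sync");
static_assert(sizeof(PassTail) == 48, "three full 16-byte registers");
```

Without the pad, a float2 following the float3 would start at byte 12 in the C++ struct, but HlslPlace(12, 8) moves it to byte 16 on the shader side, so the two layouts would disagree from that member onward.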

We also need to modify the per-object constant buffer. For now it only needs the world matrix:

cbuffer cbPerObject : register(b0)
{
	float4x4 gWorld;
};

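The benefit of this split is that constant buffers are grouped by update frequency. Per-pass constants only need to be updated once per rendering pass; object constants only need to be updated when an object's world matrix changes; for a static object that means setting them once at initialization. In the demo, the following methods update the constant buffers; each is called once per frame from Update: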

void ShapesApp::UpdateObjectCBs(const GameTimer& gt)
{
	auto currObjectCB = mCurrFrameResource->ObjectCB.get();
	
	for(auto& e : mAllRitems)
	{
		// Only update the cbuffer data if the constants have changed.
		// This needs to be tracked per frame resource.
		if(e->NumFramesDirty > 0)
		{
			XMMATRIX world = XMLoadFloat4x4(&e->World);
			ObjectConstants objConstants;
			XMStoreFloat4x4(&objConstants.World, XMMatrixTranspose(world));
			currObjectCB->CopyData(e->ObjCBIndex, objConstants);
			
			// Next FrameResource need to be updated too.
			e->NumFramesDirty--;
		}
	}
}

void ShapesApp::UpdateMainPassCB(const GameTimer& gt)
{
	XMMATRIX view = XMLoadFloat4x4(&mView);
	XMMATRIX proj = XMLoadFloat4x4(&mProj);
	XMMATRIX viewProj = XMMatrixMultiply(view, proj);
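	// XMMatrixInverse takes a pointer to the determinant, so store each
	// determinant in a named variable rather than taking the address of a temporary.
	XMVECTOR detView = XMMatrixDeterminant(view);
	XMVECTOR detProj = XMMatrixDeterminant(proj);
	XMVECTOR detViewProj = XMMatrixDeterminant(viewProj);
	XMMATRIX invView = XMMatrixInverse(&detView, view);
	XMMATRIX invProj = XMMatrixInverse(&detProj, proj);
	XMMATRIX invViewProj = XMMatrixInverse(&detViewProj, viewProj);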
	XMStoreFloat4x4(&mMainPassCB.View, XMMatrixTranspose(view));
	XMStoreFloat4x4(&mMainPassCB.InvView, XMMatrixTranspose(invView));
	XMStoreFloat4x4(&mMainPassCB.Proj, XMMatrixTranspose(proj));
	XMStoreFloat4x4(&mMainPassCB.InvProj, XMMatrixTranspose(invProj));
	XMStoreFloat4x4(&mMainPassCB.ViewProj, XMMatrixTranspose(viewProj));
	XMStoreFloat4x4(&mMainPassCB.InvViewProj, XMMatrixTranspose(invViewProj));
	mMainPassCB.EyePosW = mEyePos;
	mMainPassCB.RenderTargetSize = XMFLOAT2((float)mClientWidth, (float)mClientHeight);
	mMainPassCB.InvRenderTargetSize = XMFLOAT2(1.0f / mClientWidth, 1.0f / mClientHeight);
	mMainPassCB.NearZ = 1.0f;
	mMainPassCB.FarZ = 1000.0f;
	mMainPassCB.TotalTime = gt.TotalTime();
	mMainPassCB.DeltaTime = gt.DeltaTime();
	auto currPassCB = mCurrFrameResource->PassCB.get();
	currPassCB->CopyData(0, mMainPassCB);
}
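The NumFramesDirty counter in UpdateObjectCBs is easy to misread, so here is a small standalone sketch of the same bookkeeping. It is not the demo's code: Item, FrameBuffers and kNumFrameResources are hypothetical, a single float stands in for the world matrix, and a vector of floats stands in for each frame resource's object constant buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

constexpr int kNumFrameResources = 3; // plays the role of gNumFrameResources

struct Item
{
	float World = 0.0f;                      // stand-in for the world matrix
	int NumFramesDirty = kNumFrameResources; // every per-frame copy starts out stale
};

// One "constant buffer" per frame resource; each holds one value per item.
using FrameBuffers = std::vector<std::vector<float>>;

// Copies dirty items into the buffer of the given frame resource and
// returns how many items were written.
int UpdateObjectCBs(std::vector<Item>& items, FrameBuffers& buffers, int frameIndex)
{
	assert(frameIndex >= 0 && frameIndex < static_cast<int>(buffers.size()));

	int writes = 0;
	for (std::size_t i = 0; i < items.size(); ++i)
	{
		if (items[i].NumFramesDirty > 0)
		{
			buffers[frameIndex][i] = items[i].World;
			--items[i].NumFramesDirty; // the next frame resource still needs it
			++writes;
		}
	}
	return writes;
}
```

In this model an item is written once into each of the three buffers and then left alone. Changing its World value has no effect on any buffer until NumFramesDirty is reset to kNumFrameResources, which is why the comment in RenderItem insists on resetting the counter whenever object data changes.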

We update the vertex shader accordingly to support the new constant buffers:

VertexOut VS(VertexIn vin)
{
	VertexOut vout;
	
	// Transform to homogeneous clip space.
	float4 posW = mul(float4(vin.PosL, 1.0f), gWorld);
	vout.PosH = mul(posW, gViewProj);
	
	// Just pass vertex color into the pixel shader.
	vout.Color = vin.Color;
	
	return vout;
}

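The extra vector-matrix multiplication per vertex is negligible on modern GPUs.
Because the resources the shader expects have changed, the root signature must be updated to take two descriptor tables: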

CD3DX12_DESCRIPTOR_RANGE cbvTable0;
cbvTable0.Init(D3D12_DESCRIPTOR_RANGE_TYPE_CBV, 1, 0);

CD3DX12_DESCRIPTOR_RANGE cbvTable1;
cbvTable1.Init(D3D12_DESCRIPTOR_RANGE_TYPE_CBV, 1, 1);

// Root parameter can be a table, root descriptor or root constants.
CD3DX12_ROOT_PARAMETER slotRootParameter[2];

// Create root CBVs.
slotRootParameter[0].InitAsDescriptorTable(1, &cbvTable0);
slotRootParameter[1].InitAsDescriptorTable(1, &cbvTable1);

// A root signature is an array of root parameters.
CD3DX12_ROOT_SIGNATURE_DESC rootSigDesc(2,
	slotRootParameter, 0, nullptr,
	D3D12_ROOT_SIGNATURE_FLAG_ALLOW_INPUT_ASSEMBLER_INPUT_LAYOUT);

Do not use too many constant buffers in a shader; for performance, [Thibieroz13] recommends keeping the number below five.

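4 Shape Geometry

This section shows how to create ellipsoids, spheres, cylinders and cones. These shapes are useful for drawing sky domes, debugging, visualizing collision detection, and deferred rendering.
We put the procedural geometry generation code in the GeometryGenerator class (GeometryGenerator.h/.cpp). The class generates its data in system memory, so we still have to copy that data into vertex and index buffers. MeshData is a simple structure nested inside GeometryGenerator that stores a vertex list and an index list: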

class GeometryGenerator
{ 
public:
	using uint16 = std::uint16_t;
	using uint32 = std::uint32_t;
	
	struct Vertex
	{
		Vertex(){}
		Vertex(
			const DirectX::XMFLOAT3& p,
			const DirectX::XMFLOAT3& n,
			const DirectX::XMFLOAT3& t,
			const DirectX::XMFLOAT2& uv) :
			Position(p),
			Normal(n),
			TangentU(t),
			TexC(uv){}
			
		Vertex(
			float px, float py, float pz,
			float nx, float ny, float nz,
			float tx, float ty, float tz,
			float u, float v) :
			Position(px,py,pz),
			Normal(nx,ny,nz),
			TangentU(tx, ty, tz),
			TexC(u,v){}
			
		DirectX::XMFLOAT3 Position;
		DirectX::XMFLOAT3 Normal;
		DirectX::XMFLOAT3 TangentU;
		DirectX::XMFLOAT2 TexC;
	};
	
	struct MeshData
	{
		std::vector<Vertex> Vertices;
		std::vector<uint32> Indices32;
		std::vector<uint16>& GetIndices16()
		{
			if(mIndices16.empty())
			{
				mIndices16.resize(Indices32.size());
				for(size_t i = 0; i < Indices32.size(); ++i)
					mIndices16[i] = static_cast<uint16> (Indices32[i]);
			}
			return mIndices16;
		}
		
	private:
		std::vector<uint16> mIndices16;
	};
	…
};
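GetIndices16 above narrows each index with a plain static_cast, which silently truncates any index above 65535. As an illustration only, the following standalone sketch shows a checked variant; ToIndices16 is a hypothetical free function and is not part of GeometryGenerator.

```cpp
#include <cstdint>
#include <limits>
#include <stdexcept>
#include <vector>

// Hypothetical helper: narrows 32-bit indices to 16-bit, refusing values
// that do not fit instead of silently truncating them.
std::vector<std::uint16_t> ToIndices16(const std::vector<std::uint32_t>& indices32)
{
	std::vector<std::uint16_t> out;
	out.reserve(indices32.size());

	for (std::uint32_t i : indices32)
	{
		if (i > std::numeric_limits<std::uint16_t>::max())
			throw std::out_of_range("index does not fit in 16 bits");
		out.push_back(static_cast<std::uint16_t>(i));
	}
	return out;
}
```

Whether a check like this is worth the cost depends on the meshes involved; for small procedural shapes the unchecked cast is harmless, and 16-bit indices halve the memory used by the index buffer.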

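4.1 Generating a Cylinder Mesh

We define a cylinder mesh by its bottom and top radii, its height, and its slice and stack counts. We break the cylinder into three parts: the side geometry, the bottom cap and the top cap:
[Figure: a cylinder divided into its side, bottom cap and top cap]

4.1.1 Cylinder Side Geometry

The cylinder we generate is centered at the origin and parallel to the y-axis, and all of its vertices lie on rings. There are stackCount + 1 rings, and each ring has sliceCount unique vertices. The radius changes by (topRadius - bottomRadius)/stackCount from one ring to the next, going from the bottom ring up. So the basic idea is to iterate over the rings and generate the vertices on each one: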

GeometryGenerator::MeshData
GeometryGenerator::CreateCylinder(
	float bottomRadius, float topRadius,
	float height, uint32 sliceCount, uint32
	stackCount)
{
	MeshData meshData;
	
	//
	// Build Stacks.
	//
	float stackHeight = height / stackCount;
	
	// Amount t
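
As a standalone illustration of the ring-by-ring idea described in this section, here is a minimal sketch that generates only the vertex positions of the side rings. It is not the GeometryGenerator implementation: Point3 and BuildCylinderRings are hypothetical names, each ring gets exactly sliceCount vertices as the text describes, and normals, tangents, texture coordinates, indices and the two caps are all left out.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

struct Point3 { float x, y, z; };

// Builds the positions of the side rings of a cylinder centered at the
// origin and aligned with the y-axis. Ring 0 is the bottom ring.
std::vector<Point3> BuildCylinderRings(
	float bottomRadius, float topRadius, float height,
	std::uint32_t sliceCount, std::uint32_t stackCount)
{
	assert(sliceCount >= 3 && stackCount >= 1);

	const float stackHeight = height / stackCount;
	// Change in radius from one ring to the next, moving up the cylinder.
	const float radiusStep = (topRadius - bottomRadius) / stackCount;
	// Angle between neighboring vertices on a ring.
	const float dTheta = 2.0f * 3.14159265f / sliceCount;

	std::vector<Point3> positions;
	positions.reserve((stackCount + 1) * sliceCount);

	// stackCount + 1 rings, from bottom to top.
	for (std::uint32_t i = 0; i <= stackCount; ++i)
	{
		const float y = -0.5f * height + i * stackHeight;
		const float r = bottomRadius + i * radiusStep;

		// sliceCount vertices per ring.
		for (std::uint32_t j = 0; j < sliceCount; ++j)
		{
			positions.push_back({ r * std::cos(j * dTheta), y, r * std::sin(j * dTheta) });
		}
	}
	return positions;
}
```

For example, a call with 4 slices and 2 stacks produces 3 rings of 4 vertices each, 12 positions in total, with the bottom ring at y = -height/2 and the top ring at y = +height/2.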
