Introduction to 3D Game Programming with DirectX 12 学习笔记之 --- 第七章:在Direct3D中绘制(二)

代码工程地址:

https://github.com/jiabaodan/Direct12BookReadingNotes



学习目标

  1. 理解本章中针对命令队列的更新(不再需要每帧都flush命令队列),提高性能;
  2. 理解其他两种类型的根信号参数类型:根描述和根常量;
  3. 熟悉如何通过程序方法来绘制通用的几何形状:盒子,圆柱体和球体;
  4. 学习如何在CPU做顶点动画,并且通过动态顶点缓冲将顶点数据上传到GPU内存。


1 帧资源

在之前的代码中,我们在每帧结束的时候调用D3DApp::FlushCommandQueue方法来同步CPU和GPU,这个方法可以使用,但是很低效:

  1. 在每帧开始的时候,GPU没有任何命令可以执行,所以它一直在等待,直到CPU提交命令;
  2. 每帧的结尾,CPU需要等待GPU执行完命令。

这个问题的其中一个解决方案是针对CPU更新的资源创建一个环形数组,我们叫它帧资源(frame resources),通常情况下数组中使用3个元素。该方案中,CPU提交资源后,将会获取下一个可使用的资源(GPU没有在执行的)继续数据的更新,使用3个元素可以确保CPU提前2个元素更新,这样就可以保证GPU一直的高效运算。下面的例子是使用在Shape示例中的,因为CPU只需要更新常量缓冲,所以帧数据只包含常量缓冲:

// Stores the resources needed for the CPU to build the command lists
// for a frame. The contents here will vary from app to app based on
// the needed resources.
struct FrameResource
{ 
public:
	FrameResource(ID3D12Device* device, UINT passCount, UINT objectCount);
	FrameResource(const FrameResource& rhs) = delete;
	FrameResource& operator=(const FrameResource& rhs) = delete;
	˜FrameResource();
	
	// We cannot reset the allocator until the GPU is done processing the
	// commands. So each frame needs their own allocator.
	Microsoft::WRL::ComPtr<ID3D12CommandAllocator> CmdListAlloc;
	
	// We cannot update a cbuffer until the GPU is done processing the
	// commands that reference it. So each frame needs their own cbuffers.
	std::unique_ptr<UploadBuffer<PassConstants>> PassCB = nullptr;
	std::unique_ptr<UploadBuffer<ObjectConstants>> ObjectCB = nullptr;
	
	// Fence value to mark commands up to this fence point. This lets us
	// check if these frame resources are still in use by the GPU.
	UINT64 Fence = 0;
};

FrameResource::FrameResource(ID3D12Device* device, UINT passCount, UINT objectCount)
{
	ThrowIfFailed(device->CreateCommandAllocator(
		D3D12_COMMAND_LIST_TYPE_DIRECT,
		IID_PPV_ARGS(CmdListAlloc.GetAddressOf())));
		
	PassCB = std::make_unique<UploadBuffer<PassConstants>> (device, passCount, true);
	ObjectCB = std::make_unique<UploadBuffer<ObjectConstants>> (device, objectCount, true);
} 

FrameResource::˜ FrameResource() 
{
}

在我们的应用中使用Vector来实例化3个资源,并且跟踪当前的资源:

static const int NumFrameResources = 3;
std::vector<std::unique_ptr<FrameResource>> mFrameResources;
FrameResource* mCurrFrameResource = nullptr;
int mCurrFrameResourceIndex = 0;
void ShapesApp::BuildFrameResources()
{
	for(int i = 0; i < gNumFrameResources; ++i)
	{
		mFrameResources.push_back(std::make_unique<FrameResource> (
			md3dDevice.Get(), 1,
			(UINT)mAllRitems.size()));
	}
}

现在对于CPU第N帧,执行算法是:

void ShapesApp::Update(const GameTimer& gt)
{
	// Cycle through the circular frame resource array.
	mCurrFrameResourceIndex = (mCurrFrameResourceIndex + 1) % NumFrameResources;
	mCurrFrameResource = mFrameResources[mCurrFrameResourceIndex];
	
	// Has the GPU finished processing the commands of the current frame
	// resource. If not, wait until the GPU has completed commands up to
	// this fence point.
	if(mCurrFrameResource->Fence != 0 
		&& mCommandQueue->GetLastCompletedFence() < mCurrFrameResource->Fence)
	{
		HANDLE eventHandle = CreateEventEx(nullptr, false, false, EVENT_ALL_ACCESS);
		
		ThrowIfFailed(mCommandQueue->SetEventOnFenceCompletion(
			mCurrFrameResource->Fence, eventHandle));
			
		WaitForSingleObject(eventHandle, INFINITE);
		CloseHandle(eventHandle);
	}
	
	// […] Update resources in mCurrFrameResource (like cbuffers).
}

void ShapesApp::Draw(const GameTimer& gt)
{
	// […] Build and submit command lists for this frame.
	// Advance the fence value to mark commands up to this fence point.
	mCurrFrameResource->Fence = ++mCurrentFence;
	
	// Add an instruction to the command queue to set a new fence point.
	// Because we are on the GPU timeline, the new fence point won’t be
	// set until the GPU finishes processing all the commands prior to
	// this Signal().
	mCommandQueue->Signal(mFence.Get(), mCurrentFence);
	
	// Note that GPU could still be working on commands from previous
	// frames, but that is okay, because we are not touching any frame
	// resources associated with those frames.
}

这个方案并没有完美解决等待,如果其中一个处理器处理太快,它还是要等待另一个处理器。



2 渲染物体(RENDER ITEMS)

绘制一个物体需要设置大量参数,比如创建顶点和索引缓存,绑定常量缓冲,设置拓扑结构,指定DrawIndexedInstanced参数。如果我们要绘制多个物体,设计和创建一个轻量级结构用来保存上述所有数据就很有用。我们对这一组单个绘制调用需要的所有数据称之为一个渲染物体(render item),当前Demo中,我们RenderItem结构如下:

// Lightweight structure stores parameters to draw a shape. This will
// vary from app-to-app.
struct RenderItem
{
	RenderItem() = default;
	
	// World matrix of the shape that describes the object’s local space
	// relative to the world space, which defines the position,
	// orientation, and scale of the object in the world.
	XMFLOAT4X4 World = MathHelper::Identity4x4();
	
	// Dirty flag indicating the object data has changed and we need
	// to update the constant buffer. Because we have an object
	// cbuffer for each FrameResource, we have to apply the
	// update to each FrameResource. Thus, when we modify obect data we
	// should set
	// NumFramesDirty = gNumFrameResources so that each frame resource
	// gets the update.
	int NumFramesDirty = gNumFrameResources;
	
	// Index into GPU constant buffer corresponding to the ObjectCB
	// for this render item.
	UINT ObjCBIndex = -1;
	
	// Geometry associated with this render-item. Note that multiple
	// render-items can share the same geometry.
	MeshGeometry* Geo = nullptr;
	
	// Primitive topology.
	D3D12_PRIMITIVE_TOPOLOGY PrimitiveType = D3D_PRIMITIVE_TOPOLOGY_TRIANGLELIST;
	
	// DrawIndexedInstanced parameters.
	UINT IndexCount = 0;
	UINT StartIndexLocation = 0;
	int BaseVertexLocation = 0;
};

我们的应用将包含一个渲染物体列表来表示他们如何渲染;需要不同PSO的物体会放置到不同的列表中:

// List of all the render items.
std::vector<std::unique_ptr<RenderItem>> mAllRitems;

// Render items divided by PSO.
std::vector<RenderItem*> mOpaqueRitems;
std::vector<RenderItem*> mTransparentRitems;


3 PASS CONSTANTS

之前的章节中我们介绍了一个新的常量缓冲:

std::unique_ptr<UploadBuffer<PassConstants>> PassCB = nullptr;

它主要包含一些各个物体通用的常量,比如眼睛位置,透视投影矩阵,屏幕分辨率数据,还包括时间数据等。目前我们的Demo不需要所有这些数据,但是都实现他们会很方便,并且只会消耗很少的额外数据空间。比如我们如果要做一些后期特效,渲染目标尺寸数据就很有用:

cbuffer cbPass : register(b1)
{
	float4x4 gView;
	float4x4 gInvView;
	float4x4 gProj;
	float4x4 gInvProj;
	float4x4 gViewProj;
	float4x4 gInvViewProj;
	float3 gEyePosW;
	float cbPerObjectPad1;
	float2 gRenderTargetSize;
	float2 gInvRenderTargetSize;
	float gNearZ;
	float gFarZ;
	float gTotalTime;
	float gDeltaTime;
};

我们也需要修改之和每个物体关联的常量缓冲。目前我们只需要世界变换矩阵:

cbuffer cbPerObject : register(b0)
{
	float4x4 gWorld;
};

这样做的好处是可以将常量缓冲分组进行更新,每一个pass更新的常量缓冲需要每一个渲染Pass的时候更新;物体常量只需要当物体世界矩阵变换的时候更新;静态物体只需要在初始化的时候更新一下。在我们Demo中,实现了下面的方法来更新常量缓冲,它们每帧在Update中调用一次:

void ShapesApp::UpdateObjectCBs(const GameTimer& gt)
{
	auto currObjectCB = mCurrFrameResource->ObjectCB.get();
	
	for(auto& e : mAllRitems)
	{
		// Only update the cbuffer data if the constants have changed.
		// This needs to be tracked per frame resource.
		if(e->NumFramesDirty > 0)
		{
			XMMATRIX world = XMLoadFloat4x4(&e->World);
			ObjectConstants objConstants;
			XMStoreFloat4x4(&objConstants.World, XMMatrixTranspose(world));
			currObjectCB->CopyData(e->ObjCBIndex, objConstants);
			
			// Next FrameResource need to be updated too.
			e->NumFramesDirty--;
		}
	}
}

void ShapesApp::UpdateMainPassCB(const GameTimer& gt)
{
	XMMATRIX view = XMLoadFloat4x4(&mView);
	XMMATRIX proj = XMLoadFloat4x4(&mProj);
	XMMATRIX viewProj = XMMatrixMultiply(view, proj);
	XMMATRIX invView = XMMatrixInverse(&XMMatrixDeterminant(view), view);
	XMMATRIX invProj = XMMatrixInverse(&XMMatrixDeterminant(proj), proj);
	XMMATRIX invViewProj = 	XMMatrixInverse(&XMMatrixDeterminant(viewProj), viewProj);
	XMStoreFloat4x4(&mMainPassCB.View, XMMatrixTranspose(view));
	XMStoreFloat4x4(&mMainPassCB.InvView, XMMatrixTranspose(invView));
	XMStoreFloat4x4(&mMainPassCB.Proj, XMMatrixTranspose(proj));
	XMStoreFloat4x4(&mMainPassCB.InvProj, XMMatrixTranspose(invProj));
	XMStoreFloat4x4(&mMainPassCB.ViewProj, XMMatrixTranspose(viewProj));
	XMStoreFloat4x4(&mMainPassCB.InvViewProj, XMMatrixTranspose(invViewProj));
	mMainPassCB.EyePosW = mEyePos;
	mMainPassCB.RenderTargetSize = XMFLOAT2((float)mClientWidth, (float)mClientHeight);
	mMainPassCB.InvRenderTargetSize = XMFLOAT2(1.0f / mClientWidth, 1.0f / mClientHeight);
	mMainPassCB.NearZ = 1.0f;
	mMainPassCB.FarZ = 1000.0f;
	mMainPassCB.TotalTime = gt.TotalTime();
	mMainPassCB.DeltaTime = gt.DeltaTime();
	auto currPassCB = mCurrFrameResource->PassCB.get();
	currPassCB->CopyData(0, mMainPassCB);
}

我们更新顶点着色器相应的支持这个缓冲变换:

VertexOut VS(VertexIn vin)
{
	VertexOut vout;
	
	// Transform to homogeneous clip space.
	float4 posW = mul(float4(vin.PosL, 1.0f), gWorld);
	vout.PosH = mul(posW, gViewProj);
	
	// Just pass vertex color into the pixel shader.
	vout.Color = vin.Color;
	
	return vout;
}

这里额外的逐顶点矩阵相乘,在现在强大的GPU上是微不足道的。
着色器需要的资源发生变化,所以需要更新根签名相应的包含两个描述表:

CD3DX12_DESCRIPTOR_RANGE cbvTable0;
cbvTable0.Init(D3D12_DESCRIPTOR_RANGE_TYPE_CBV, 1, 0);

CD3DX12_DESCRIPTOR_RANGE cbvTable1;
cbvTable1.Init(D3D12_DESCRIPTOR_RANGE_TYPE_CBV, 1, 1);

// Root parameter can be a table, root descriptor or root constants.
CD3DX12_ROOT_PARAMETER slotRootParameter[2];

// Create root CBVs.
slotRootParameter[0].InitAsDescriptorTable(1, &cbvTable0);
slotRootParameter[1].InitAsDescriptorTable(1, &cbvTable1);

// A root signature is an array of root parameters.
CD3DX12_ROOT_SIGNATURE_DESC rootSigDesc(2,
	slotRootParameter, 0, nullptr,
	D3D12_ROOT_SIGNATURE_FLAG_ALLOW_INPUT_ASSEMBLER_INPUT__LAYOUT);

不要在着色器中使用太多的常量缓冲,为了性能[Thibieroz13]建议保持在5个以下。



4 形状几何

这节将会展示如何创建椭球体,球体,圆柱体和圆锥体。这些形状对于绘制天空示例,Debugging,可视化碰撞检测和延时渲染非常有用。
我们将在程序中创建几何体的代码放在GeometryGenerator(GeometryGenerator.h/.cpp)类中,该类创建的数据保存在内存中,所以我们还需要将它们赋值到顶点/索引缓冲中。MeshData结构是一个内嵌在GeometryGenerator中用来保存顶点和索引列表的简单结构:

class GeometryGenerator
{ 
public:
	using uint16 = std::uint16_t;
	using uint32 = std::uint32_t;
	
	struct Vertex
	{
		Vertex(){}
		Vertex(
			const DirectX::XMFLOAT3& p,
			const DirectX::XMFLOAT3& n,
			const DirectX::XMFLOAT3& t,
			const DirectX::XMFLOAT2& uv) :
			Position(p),
			Normal(n),
			TangentU(t),
			TexC(uv){}
			
		Vertex(
			float px, float py, float pz,
			float nx, float ny, float nz,
			float tx, float ty, float tz,
			float u, float v) :
			Position(px,py,pz),
			Normal(nx,ny,nz),
			TangentU(tx, ty, tz),
			TexC(u,v){}
			
		DirectX::XMFLOAT3 Position;
		DirectX::XMFLOAT3 Normal;
		DirectX::XMFLOAT3 TangentU;
		DirectX::XMFLOAT2 TexC;
	};
	
	struct MeshData
	{
		std::vector<Vertex> Vertices;
		std::vector<uint32> Indices32;
		std::vector<uint16>& GetIndices16()
		{
			if(mIndices16.empty())
			{
				mIndices16.resize(Indices32.size());
				for(size_t i = 0; i < Indices32.size(); ++i)
					mIndices16[i] = static_cast<uint16> (Indices32[i]);
			}
			return mIndices16;
		}
		
	private:
		std::vector<uint16> mIndices16;
	};
	…
};

4.1 创建圆柱体网格

我们通过定义底面和顶面半径,高度,切片(slice)和堆叠(stack)个数来定义一个圆柱体网格,如下图,我们将圆柱体划分成侧面,底面和顶面:
在这里插入图片描述


4.1.1 圆柱体侧面几何

我们创建的圆柱体中心的原点,平行于Y轴,所有顶点依赖于环(rings)。每个圆柱体有stackCount + 1环,每一环有sliceCount个独立的顶点。每一环半径的变化为(topRadius – bottomRadius)/stackCount;所以基本的创建圆柱体的思路就是遍历每一环创建顶点:

GeometryGenerator::MeshData
GeometryGenerator::CreateCylinder(
	float bottomRadius, float topRadius,
	float height, uint32 sliceCount, uint32
	stackCount)
{
	MeshData meshData;
	
	//
	// Build Stacks.
	//
	float stackHeight = height / stackCount;
	
	// Amount to increment radius as we move up each stack level from
	// bottom to top.
	float radiusStep = (topRadius - bottomRadius) / stackCount;
	uint32 ringCount = stackCount+1;
	
	// Compute vertices for each stack ring starting at the bottom and
	// moving up.
	for(uint32 i = 0; i < ringCount; +&
  • 1
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 1
    评论
评论 1
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值