UE4 Forward+ Pipeline Analysis - Zhihu
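I. Pipeline Flow
UE's mobile Forward+ rendering is enabled in the MobileDeferred pipeline. During the InitViews stage the renderer initializes FForwardLightingViewResources, a class that holds the parameters and buffers forward rendering needs.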
class FForwardLightingViewResources
{
public:
FForwardLightData ForwardLightData;
const FLightSceneProxy *SelectedForwardDirectionalLightProxy = nullptr;
TUniformBufferRef<FForwardLightData> ForwardLightDataUniformBuffer;
FDynamicReadBuffer ForwardLocalLightBuffer; // Local light data
FRWBuffer NumCulledLightsGrid; // Number of lights in each cluster
FRWBuffer CulledLightDataGrid; // Light indices of each cluster (linked list or direct buffer)
// ...
};
FForwardLightData is a shader parameter struct that carries the local light and directional light information.
#define FORWARD_GLOBAL_LIGHT_DATA_UNIFORM_BUFFER_MEMBER_TABLE \
SHADER_PARAMETER(uint32, NumLocalLights) \
SHADER_PARAMETER(uint32, NumReflectionCaptures) \
SHADER_PARAMETER(uint32, HasDirectionalLight) \
SHADER_PARAMETER(uint32, NumGridCells) \
SHADER_PARAMETER(FIntVector, CulledGridSize) \
SHADER_PARAMETER(uint32, MaxCulledLightsPerCell) \
/* ... */ \
SHADER_PARAMETER_SRV(StrongTypedBuffer<float4>, ForwardLocalLightBuffer) \
SHADER_PARAMETER_SRV(StrongTypedBuffer<uint>, NumCulledLightsGrid) \
SHADER_PARAMETER_SRV(StrongTypedBuffer<uint>, CulledLightDataGrid)
BEGIN_GLOBAL_SHADER_PARAMETER_STRUCT_WITH_CONSTRUCTOR(FForwardLightData, )
FORWARD_GLOBAL_LIGHT_DATA_UNIFORM_BUFFER_MEMBER_TABLE
END_GLOBAL_SHADER_PARAMETER_STRUCT()
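Later in the pipeline the renderer gathers and sorts the current lights, then uses compute shaders to build the light index list of each cluster. In the frame this happens after InitViews.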
FSortedLightSetSceneInfo SortedLightSet;
if (bDeferredShading)
{
GatherAndSortLights(SortedLightSet);
int32 NumReflectionCaptures = Views[0].NumBoxReflectionCaptures + Views[0].NumSphereReflectionCaptures;
bool bCullLightsToGrid = (NumReflectionCaptures > 0 || GMobileUseClusteredDeferredShading != 0);
FRDGBuilder GraphBuilder(RHICmdList);
ComputeLightGrid(GraphBuilder, bCullLightsToGrid, SortedLightSet);
GraphBuilder.Execute();
}
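II. Light Sorting
This step (Source\Runtime\Renderer\Private\LightGridInjection.cpp) gathers the lights in the scene and stores them in FSortedLightSetSceneInfo, which records how many lights of each category there are together with their SceneInfo.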
struct FSortedLightSetSceneInfo
{
int SimpleLightsEnd;
int TiledSupportedEnd;
int ClusteredSupportedEnd;
/** First light with shadow map or */
int AttenuationLightStart;
FSimpleLightArray SimpleLights;
TArray<FSortedLightSceneInfo, SceneRenderingAllocator> SortedLights;
};
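If the scene contains simple lights (for example, lights emitted by a particle system's LightRenderer), they are gathered first and stored in the sorted light set's SimpleLights array (an FSimpleLightArray).
The code then iterates over Scene->Lights and checks whether the current view renders each light. For every light that passes, it fills in the SortKey of an FSortedLightSceneInfo: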
// Check for shadows and light functions.
SortedLightInfo->SortKey.Fields.LightType = LightSceneInfoCompact.LightType;
SortedLightInfo->SortKey.Fields.bTextureProfile = ViewFamily.EngineShowFlags.TexturedLightProfiles && LightSceneInfo->Proxy->GetIESTextureResource();
SortedLightInfo->SortKey.Fields.bShadowed = bDynamicShadows && CheckForProjectedShadows(LightSceneInfo);
SortedLightInfo->SortKey.Fields.bLightFunction = ViewFamily.EngineShowFlags.LightFunctions && CheckForLightFunction(LightSceneInfo);
SortedLightInfo->SortKey.Fields.bUsesLightingChannels = Views[ViewIndex].bUsesLightingChannels && LightSceneInfo->Proxy->GetLightingChannelMask() != GetDefaultLightingChannelMask();
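A directional light or a rect (area) light is not supported by tiled or clustered deferred shading. FSortedLightSceneInfo mainly stores the SortKey shown above. It is a union that overlays Fields with a Packed value, and Packed is what the lights are sorted by.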
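To make the role of Packed concrete, here is a minimal sketch of sorting by a packed key. It is not the engine's code: `SortKeyFields`, `PackSortKey`, `SortLights`, `FirstShadowedIndex`, and the bit positions are hypothetical, and the key is built with explicit shifts instead of a union so that the bit order does not depend on the compiler.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-in for the engine's sort key (not the real layout).
struct SortKeyFields {
    uint32_t LightType;          // only the low 2 bits are used
    bool bTextureProfile;
    bool bLightFunction;
    bool bShadowed;
    bool bUsesLightingChannels;
};

// Pack every field into one integer. Fields in higher bits dominate the ordering,
// so one integer comparison sorts by all of them at once.
inline uint32_t PackSortKey(const SortKeyFields& F) {
    return (F.LightType & 0x3u)
         | (uint32_t(F.bTextureProfile) << 2)
         | (uint32_t(F.bLightFunction) << 3)
         | (uint32_t(F.bShadowed) << 4)
         | (uint32_t(F.bUsesLightingChannels) << 5);
}

struct SortedLight {
    int LightId;      // hypothetical handle standing in for the light's scene info
    uint32_t Packed;  // result of PackSortKey
};

// Sort ascending by the packed key; cheap lights end up at the front.
inline void SortLights(std::vector<SortedLight>& Lights) {
    std::stable_sort(Lights.begin(), Lights.end(),
        [](const SortedLight& A, const SortedLight& B) { return A.Packed < B.Packed; });
}

// Index of the first light whose bShadowed bit is set, or Lights.size() if none.
inline std::size_t FirstShadowedIndex(const std::vector<SortedLight>& Lights) {
    for (std::size_t i = 0; i < Lights.size(); ++i) {
        if (Lights[i].Packed & (1u << 4)) return i;
    }
    return Lights.size();
}
```

Because the flags are sorted as one integer, lights with the same capabilities end up contiguous, which is what lets a structure such as FSortedLightSetSceneInfo describe each group with a single index.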
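Finally, the sorted contents of the SortedLights array are written to OutSortedLights.
III. LightGrid
1. Data Upload
This step (Source\Runtime\Renderer\Private\LightGridInjection.cpp) is the part of Forward+ that builds the per-cluster light lists. It starts with the per-view buffers, that is, the ForwardLightingResources initialized in InitViews. At that point only the struct is initialized and its buffers are still empty, which would make ValidateUniformBuffer fail, so the engine creates a global resource to avoid this kind of error. The buffers involved are: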
virtual void InitRHI()
{
if (GMaxRHIFeatureLevel >= ERHIFeatureLevel::SM5)
{
ForwardLightingResources.ForwardLocalLightBuffer.Initialize(sizeof(FVector4), sizeof(FForwardLocalLightData) / sizeof(FVector4), PF_A32B32G32R32F, BUF_Dynamic);
ForwardLightingResources.NumCulledLightsGrid.Initialize(sizeof(uint32), 1, PF_R32_UINT);
const bool bSupportFormatConversion = RHISupportsBufferLoadTypeConversion(GMaxRHIShaderPlatform);
if (bSupportFormatConversion)
{
ForwardLightingResources.CulledLightDataGrid.Initialize(sizeof(uint16), 1, PF_R16_UINT);
}
else
{
ForwardLightingResources.CulledLightDataGrid.Initialize(sizeof(uint32), 1, PF_R32_UINT);
}
ForwardLightingResources.ForwardLightData.ForwardLocalLightBuffer = ForwardLightingResources.ForwardLocalLightBuffer.SRV;
ForwardLightingResources.ForwardLightData.NumCulledLightsGrid = ForwardLightingResources.NumCulledLightsGrid.SRV;
ForwardLightingResources.ForwardLightData.CulledLightDataGrid = ForwardLightingResources.CulledLightDataGrid.SRV;
ForwardLightingResources.ForwardLightDataUniformBuffer = TUniformBufferRef<FForwardLightData>::CreateUniformBufferImmediate(ForwardLightingResources.ForwardLightData, UniformBuffer_MultiFrame);
}
}
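ForwardLocalLightBuffer stores the local light data (position, color, radius, and so on), with simple lights first and the other lights after them. All of the data is first staged in the ForwardLocalLightData array and uploaded to ForwardLocalLightBuffer once gathering is complete: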
UpdateDynamicVector4BufferData(ForwardLocalLightData, View.ForwardLightingResources->ForwardLocalLightBuffer);
template <typename T>
void UpdateDynamicVector4BufferData(const TArray<T, SceneRenderingAllocator> &DataArray, FDynamicReadBuffer &Buffer)
{
const uint32 NumBytesRequired = DataArray.Num() * DataArray.GetTypeSize();
if (Buffer.NumBytes < NumBytesRequired)
{
Buffer.Release();
Buffer.Initialize(sizeof(FVector4), NumBytesRequired / sizeof(FVector4), PF_A32B32G32R32F, BUF_Volatile);
}
Buffer.Lock();
FPlatformMemory::Memcpy(Buffer.MappedBuffer, DataArray.GetData(), DataArray.Num() * DataArray.GetTypeSize());
Buffer.Unlock();
}
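The engine uses FDynamicReadBuffer here by default. The buffer is created with the BUF_Dynamic flag (and re-created with BUF_Volatile when the upload helper above has to grow it), so it is expected to be updated every frame. If the data does not need a per-frame update, FReadBuffer should be used instead.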
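The upload helper only re-creates the buffer when it is too small and otherwise reuses the existing allocation. A CPU-only sketch of that grow-only policy, with `FakeDynamicBuffer` and `UploadGrowOnly` as hypothetical stand-ins rather than engine types, might look like this:

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical CPU-side stand-in for a GPU dynamic buffer.
struct FakeDynamicBuffer {
    std::vector<unsigned char> Storage;  // plays the role of the mapped memory
    int NumReallocations = 0;            // how often the buffer had to be re-created
};

// Copy Data into Buffer, re-creating the storage only when it is too small.
// T must be trivially copyable, as with the float4-style light data.
template <typename T>
void UploadGrowOnly(const std::vector<T>& Data, FakeDynamicBuffer& Buffer) {
    const std::size_t NumBytesRequired = Data.size() * sizeof(T);
    if (Buffer.Storage.size() < NumBytesRequired) {
        // Corresponds to Release() followed by Initialize() with the new size.
        Buffer.Storage.assign(NumBytesRequired, 0);
        ++Buffer.NumReallocations;
    }
    if (NumBytesRequired > 0) {
        // Corresponds to Lock(), Memcpy, Unlock().
        std::memcpy(Buffer.Storage.data(), Data.data(), NumBytesRequired);
    }
}
```

With this policy the allocation cost is only paid in frames where the light count exceeds every earlier frame; all other frames pay only for the copy.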
Next, ForwardLightData is filled in from the LightGrid parameters, chiefly the NumCulledLightsGrid and CulledLightDataGrid buffers. The final culling results are also written to these two buffers.
2. Shader
Once the steps above are done, the shader parameters have to be set. Two shaders are involved:
class FLightGridInjectionCS : public FGlobalShader
{
DECLARE_GLOBAL_SHADER(FLightGridInjectionCS);
SHADER_USE_PARAMETER_STRUCT(FLightGridInjectionCS, FGlobalShader)
public:
class FUseLinkedListDim : SHADER_PERMUTATION_BOOL("USE_LINKED_CULL_LIST");
using FPermutationDomain = TShaderPermutationDomain<FUseLinkedListDim>;
BEGIN_SHADER_PARAMETER_STRUCT(FParameters, )
SHADER_PARAMETER_STRUCT_REF(FReflectionCaptureShaderData, ReflectionCapture)
SHADER_PARAMETER_STRUCT_REF(FForwardLightData, Forward)
SHADER_PARAMETER_STRUCT_REF(FViewUniformShaderParameters, View)
SHADER_PARAMETER_UAV(RWBuffer<uint>, RWNumCulledLightsGrid)
SHADER_PARAMETER_UAV(RWBuffer<uint>, RWCulledLightDataGrid)
SHADER_PARAMETER_RDG_BUFFER_UAV(RWBuffer<uint>, RWNextCulledLightLink)
SHADER_PARAMETER_RDG_BUFFER_UAV(RWBuffer<uint>, RWStartOffsetGrid)
SHADER_PARAMETER_RDG_BUFFER_UAV(RWBuffer<uint>, RWCulledLightLinks)
SHADER_PARAMETER_SRV(StrongTypedBuffer<float4>, LightViewSpacePositionAndRadius)
SHADER_PARAMETER_SRV(StrongTypedBuffer<float4>, LightViewSpaceDirAndPreprocAngle)
END_SHADER_PARAMETER_STRUCT()
static bool ShouldCompilePermutation(const FGlobalShaderPermutationParameters &Parameters)
{
return IsFeatureLevelSupported(Parameters.Platform, ERHIFeatureLevel::SM5) || IsMobileDeferredShadingEnabled(Parameters.Platform);
}
static void ModifyCompilationEnvironment(const FGlobalShaderPermutationParameters &Parameters, FShaderCompilerEnvironment &OutEnvironment)
{
FGlobalShader::ModifyCompilationEnvironment(Parameters, OutEnvironment);
OutEnvironment.SetDefine(TEXT("THREADGROUP_SIZE"), LightGridInjectionGroupSize);
FForwardLightingParameters::ModifyCompilationEnvironment(Parameters.Platform, OutEnvironment);
OutEnvironment.SetDefine(TEXT("LIGHT_LINK_STRIDE"), LightLinkStride);
OutEnvironment.SetDefine(TEXT("ENABLE_LIGHT_CULLING_VIEW_SPACE_BUILD_DATA"), ENABLE_LIGHT_CULLING_VIEW_SPACE_BUILD_DATA);
}
};
IMPLEMENT_GLOBAL_SHADER(FLightGridInjectionCS, "/Engine/Private/LightGridInjection.usf", "LightGridInjectionCS", SF_Compute);
class FLightGridCompactCS : public FGlobalShader
{
DECLARE_GLOBAL_SHADER(FLightGridCompactCS)
SHADER_USE_PARAMETER_STRUCT(FLightGridCompactCS, FGlobalShader)
public:
BEGIN_SHADER_PARAMETER_STRUCT(FParameters, )
SHADER_PARAMETER_STRUCT_REF(FForwardLightData, Forward)
SHADER_PARAMETER_STRUCT_REF(FViewUniformShaderParameters, View)
SHADER_PARAMETER_UAV(RWBuffer<uint>, RWNumCulledLightsGrid)
SHADER_PARAMETER_UAV(RWBuffer<uint>, RWCulledLightDataGrid)
SHADER_PARAMETER_RDG_BUFFER_UAV(RWBuffer<uint>, RWNextCulledLightData)
SHADER_PARAMETER_RDG_BUFFER_SRV(Buffer<uint>, StartOffsetGrid)
SHADER_PARAMETER_RDG_BUFFER_SRV(Buffer<uint>, CulledLightLinks)
END_SHADER_PARAMETER_STRUCT()
static bool ShouldCompilePermutation(const FGlobalShaderPermutationParameters &Parameters)
{
return IsFeatureLevelSupported(Parameters.Platform, ERHIFeatureLevel::SM5) || IsMobileDeferredShadingEnabled(Parameters.Platform);
}
static void ModifyCompilationEnvironment(const FGlobalShaderPermutationParameters &Parameters, FShaderCompilerEnvironment &OutEnvironment)
{
FGlobalShader::ModifyCompilationEnvironment(Parameters, OutEnvironment);
OutEnvironment.SetDefine(TEXT("THREADGROUP_SIZE"), LightGridInjectionGroupSize);
FForwardLightingParameters::ModifyCompilationEnvironment(Parameters.Platform, OutEnvironment);
OutEnvironment.SetDefine(TEXT("LIGHT_LINK_STRIDE"), LightLinkStride);
OutEnvironment.SetDefine(TEXT("MAX_CAPTURES"), GMaxNumReflectionCaptures);
OutEnvironment.SetDefine(TEXT("ENABLE_LIGHT_CULLING_VIEW_SPACE_BUILD_DATA"), ENABLE_LIGHT_CULLING_VIEW_SPACE_BUILD_DATA);
}
};
IMPLEMENT_GLOBAL_SHADER(FLightGridCompactCS, "/Engine/Private/LightGridInjection.usf", "LightGridCompactCS", SF_Compute);
FLightGridInjectionCS injects the light data into each cluster. FLightGridCompactCS is optional and only runs when the light data is stored as linked lists.
FLightGridInjectionCS
The parameters of FLightGridInjectionCS are set up as follows:
FRDGBufferRef CulledLightLinksBuffer = GraphBuilder.CreateBuffer(FRDGBufferDesc::CreateBufferDesc(sizeof(uint32), CulledLightLinksElements), TEXT("CulledLightLinks"));
FRDGBufferRef StartOffsetGridBuffer = GraphBuilder.CreateBuffer(FRDGBufferDesc::CreateBufferDesc(sizeof(uint32), NumCells), TEXT("StartOffsetGrid"));
FRDGBufferRef NextCulledLightLinkBuffer = GraphBuilder.CreateBuffer(FRDGBufferDesc::CreateBufferDesc(sizeof(uint32), 1), TEXT("NextCulledLightLink"));
FRDGBufferRef NextCulledLightDataBuffer = GraphBuilder.CreateBuffer(FRDGBufferDesc::CreateBufferDesc(sizeof(uint32), 1), TEXT("NextCulledLightData"));
FLightGridInjectionCS::FParameters *PassParameters = GraphBuilder.AllocParameters<FLightGridInjectionCS::FParameters>();
PassParameters->View = View.ViewUniformBuffer;
PassParameters->ReflectionCapture = View.ReflectionCaptureUniformBuffer;
PassParameters->Forward = View.ForwardLightingResources->ForwardLightDataUniformBuffer;
PassParameters->RWNumCulledLightsGrid = View.ForwardLightingResources->NumCulledLightsGrid.UAV;
PassParameters->RWCulledLightDataGrid = View.ForwardLightingResources->CulledLightDataGrid.UAV;
PassParameters->RWNextCulledLightLink = GraphBuilder.CreateUAV(NextCulledLightLinkBuffer, PF_R32_UINT);
PassParameters->RWStartOffsetGrid = GraphBuilder.CreateUAV(StartOffsetGridBuffer, PF_R32_UINT);
PassParameters->RWCulledLightLinks = GraphBuilder.CreateUAV(CulledLightLinksBuffer, PF_R32_UINT);
#if ENABLE_LIGHT_CULLING_VIEW_SPACE_BUILD_DATA
PassParameters->LightViewSpacePositionAndRadius = ForwardLightingCullingResources.ViewSpacePosAndRadiusData.SRV;
PassParameters->LightViewSpaceDirAndPreprocAngle = ForwardLightingCullingResources.ViewSpaceDirAndPreprocAngleData.SRV;
#endif // ENABLE_LIGHT_CULLING_VIEW_SPACE_BUILD_DATA
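The buffers used here are:
- RWNumCulledLightsGrid: the number of lights in each cluster.
- RWNextCulledLightLink: a single-element counter that hands out the next free link slot (see the shader code below).
- RWStartOffsetGrid: the starting light position of each cluster, that is, the head of its list.
- RWCulledLightLinks: the per-cluster linked-list data, stored in pairs. The first element of a pair holds the light index and the second points to the previously inserted link, so each list is in reverse order.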
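The light injection pass works as follows:
- Compute the index of the current grid cell from the DispatchThreadID (GridCoordinate).
- Compute the cell's AABB from GridCoordinate; it is used to test lights against the cell.
- Iterate over all lights and test, in view space, whether each light intersects the cell, that is, whether the light's range reaches it.
- If it does, and linked lists are in use, the light is appended with the following code: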
for (uint LocalLightIndex = 0; LocalLightIndex < ForwardLightData.NumLocalLights; LocalLightIndex++)
{
/* ... */
if (BoxDistanceSq < LightRadius * LightRadius)
{
uint NextLink;
InterlockedAdd(RWNextCulledLightLink[0], 1U, NextLink);
if (NextLink < NumAvailableLinks)
{
uint PreviousLink;
InterlockedExchange(RWStartOffsetGrid[GridIndex], NextLink, PreviousLink);
RWCulledLightLinks[NextLink * LIGHT_LINK_STRIDE + 0] = LocalLightIndex;
RWCulledLightLinks[NextLink * LIGHT_LINK_STRIDE + 1] = PreviousLink;
}
}
}
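RWNextCulledLightLink is a single-element buffer. Whenever a light intersects the current cell, it is atomically incremented and its previous value is returned in NextLink. NextLink is then swapped into the cell's entry in RWStartOffsetGrid, and the value that was stored there before, PreviousLink, is written into the list as the pointer to the previous link.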
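The same insertion can be replayed on the CPU to see how the lists end up reversed. The sketch below is only an illustration: `LinkedLightGrid`, `AddLight`, and `WalkCell` are hypothetical names, the stride of 2 follows the "pairs" description above, and the `0xFFFFFFFF` end-of-list marker is an assumption about how the head buffer is cleared.

```cpp
#include <atomic>
#include <cstdint>
#include <vector>

constexpr uint32_t kLightLinkStride = 2;        // mirrors LIGHT_LINK_STRIDE (assumed to be 2)
constexpr uint32_t kInvalidLink = 0xFFFFFFFFu;  // assumed end-of-list marker

struct LinkedLightGrid {
    std::atomic<uint32_t> NextCulledLightLink{0};        // single-element allocator
    std::vector<std::atomic<uint32_t>> StartOffsetGrid;  // per-cell list head
    std::vector<uint32_t> CulledLightLinks;              // pairs: (light index, previous link)
    uint32_t NumAvailableLinks;

    LinkedLightGrid(uint32_t NumCells, uint32_t MaxLinks)
        : StartOffsetGrid(NumCells),
          CulledLightLinks(MaxLinks * kLightLinkStride, 0),
          NumAvailableLinks(MaxLinks) {
        for (auto& Head : StartOffsetGrid) Head.store(kInvalidLink);
    }

    // Mirrors the body of the intersection branch in the shader.
    void AddLight(uint32_t GridIndex, uint32_t LocalLightIndex) {
        // InterlockedAdd: reserve a link slot and get the old counter value.
        const uint32_t NextLink = NextCulledLightLink.fetch_add(1u);
        if (NextLink < NumAvailableLinks) {
            // InterlockedExchange: the new link becomes the head, the old head is returned.
            const uint32_t PreviousLink = StartOffsetGrid[GridIndex].exchange(NextLink);
            CulledLightLinks[NextLink * kLightLinkStride + 0] = LocalLightIndex;
            CulledLightLinks[NextLink * kLightLinkStride + 1] = PreviousLink;
        }
    }

    // Follow a cell's list from its head; lights come out newest first.
    std::vector<uint32_t> WalkCell(uint32_t GridIndex) const {
        std::vector<uint32_t> Lights;
        uint32_t Link = StartOffsetGrid[GridIndex].load();
        while (Link != kInvalidLink) {
            Lights.push_back(CulledLightLinks[Link * kLightLinkStride + 0]);
            Link = CulledLightLinks[Link * kLightLinkStride + 1];
        }
        return Lights;
    }
};
```

Adding lights 3 and then 7 to the same cell and walking it yields 7 before 3, which is the reverse order the compact pass later undoes. Lights that arrive after the link pool is exhausted are silently dropped, as in the shader's `NextLink < NumAvailableLinks` check.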
FLightGridCompactCS
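The main job of this shader is to put the reversed light lists from the previous pass back into forward order. It also fills RWNumCulledLightsGrid, which is used to index the lights: every cluster owns two consecutive entries, one storing the light count and the other storing where the cluster's lights start in RWCulledLightDataGrid.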
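A serial CPU sketch of such a compaction is shown below. It is an assumption-laden illustration rather than the shader itself: `CompactedGrid` and `CompactLightGrid` are hypothetical names, the `0xFFFFFFFF` end marker is assumed, and a plain running counter stands in for the atomic counter a GPU version would need.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr uint32_t kLinkStride = 2;          // (light index, previous link) pairs
constexpr uint32_t kEndOfList = 0xFFFFFFFFu; // assumed end-of-list marker
constexpr uint32_t kGridStride = 2;          // (count, start offset) per cell

struct CompactedGrid {
    std::vector<uint32_t> NumCulledLightsGrid;  // per cell: [count, start]
    std::vector<uint32_t> CulledLightDataGrid;  // tightly packed light indices
};

inline CompactedGrid CompactLightGrid(const std::vector<uint32_t>& StartOffsetGrid,
                                      const std::vector<uint32_t>& CulledLightLinks) {
    CompactedGrid Out;
    Out.NumCulledLightsGrid.assign(StartOffsetGrid.size() * kGridStride, 0);
    uint32_t NextCulledLightData = 0;  // running allocation cursor

    for (std::size_t Cell = 0; Cell < StartOffsetGrid.size(); ++Cell) {
        // Pass 1: count the links of this cell.
        uint32_t Count = 0;
        for (uint32_t Link = StartOffsetGrid[Cell]; Link != kEndOfList;
             Link = CulledLightLinks[Link * kLinkStride + 1]) {
            ++Count;
        }

        // Reserve a contiguous range for the cell.
        const uint32_t Start = NextCulledLightData;
        NextCulledLightData += Count;
        Out.CulledLightDataGrid.resize(NextCulledLightData);
        Out.NumCulledLightsGrid[Cell * kGridStride + 0] = Count;
        Out.NumCulledLightsGrid[Cell * kGridStride + 1] = Start;

        // Pass 2: walk again and write back to front, restoring insertion order.
        uint32_t Visited = 0;
        for (uint32_t Link = StartOffsetGrid[Cell]; Link != kEndOfList;
             Link = CulledLightLinks[Link * kLinkStride + 1]) {
            Out.CulledLightDataGrid[Start + Count - 1 - Visited] =
                CulledLightLinks[Link * kLinkStride + 0];
            ++Visited;
        }
    }
    return Out;
}
```

After compaction a shading pass no longer has to chase links: one read gives the count and start offset, and the light indices follow contiguously.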
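IV. Shading
The main shading work happens in MobileDeferredShading.usf. When clustering is enabled, MobileDirectLightPS computes the GridIndex from the current pixel's view-space position, fetches the corresponding CulledLightGridData, and uses it for shading.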
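The lookup can be sketched on the CPU as follows. Everything here is an assumption made for illustration: `GridParams`, `ComputeGridIndex`, and `LightsForCell` are hypothetical names, the logarithmic depth slicing and the pixel shift are one common way to lay out a clustered grid and are not taken from the article, and the two buffers are read with the (count, start) layout described for the compact pass.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical grid description; field names are illustrative only.
struct GridParams {
    uint32_t SizeX, SizeY, SizeZ;        // grid dimensions in cells
    uint32_t PixelSizeShift;             // a cell covers (1 << shift) pixels per axis
    float ZParamsX, ZParamsY, ZParamsZ;  // slice = log2(depth * X + Y) * Z (assumed)
};

// Map a pixel and its view-space depth to a flat cell index.
inline uint32_t ComputeGridIndex(const GridParams& G, uint32_t PixelX, uint32_t PixelY,
                                 float SceneDepth) {
    const float Slice = std::log2(SceneDepth * G.ZParamsX + G.ZParamsY) * G.ZParamsZ;
    uint32_t Z = static_cast<uint32_t>(std::max(0.0f, Slice));
    Z = std::min(Z, G.SizeZ - 1);  // clamp to the last depth slice
    const uint32_t X = PixelX >> G.PixelSizeShift;
    const uint32_t Y = PixelY >> G.PixelSizeShift;
    return (Z * G.SizeY + Y) * G.SizeX + X;
}

// Read the light indices of one cell from the compacted buffers.
inline std::vector<uint32_t> LightsForCell(const std::vector<uint32_t>& NumCulledLightsGrid,
                                           const std::vector<uint32_t>& CulledLightDataGrid,
                                           uint32_t GridIndex) {
    const uint32_t Count = NumCulledLightsGrid[GridIndex * 2 + 0];
    const uint32_t Start = NumCulledLightsGrid[GridIndex * 2 + 1];
    std::vector<uint32_t> Lights;
    for (uint32_t i = 0; i < Count; ++i) {
        Lights.push_back(CulledLightDataGrid[Start + i]);
    }
    return Lights;
}
```

A pixel shader following this pattern would loop over the returned indices, fetch each light's data from the local light buffer, and accumulate its contribution.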