Dissecting the Unreal Rendering System (06) - UE5 Special, Part 2 (Lumen and Others, Second Half)

6.5.7 Lumen Indirect Lighting

6.5.7.1 RenderDiffuseIndirectAndAmbientOcclusion

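This stage uses the data Lumen generated in earlier passes to compute the final indirect lighting, which is what produces the global illumination effect. It consists of several stages: SSGI denoising, screen probe gathering, reflections, and indirect lighting compositing. The corresponding source code, RenderDiffuseIndirectAndAmbientOcclusion, is as follows: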

// Engine\Source\Runtime\Renderer\Private\IndirectLightRendering.cpp

void FDeferredShadingSceneRenderer::RenderDiffuseIndirectAndAmbientOcclusion(
    FRDGBuilder& GraphBuilder,
    FSceneTextures& SceneTextures,
    FRDGTextureRef LightingChannelsTexture,
    bool bIsVisualizePass)
{
    using namespace HybridIndirectLighting;

    if (ViewFamily.EngineShowFlags.VisualizeLumenIndirectDiffuse != bIsVisualizePass)
    {
        return;
    }

    RDG_EVENT_SCOPE(GraphBuilder, "DiffuseIndirectAndAO");

    FSceneTextureParameters SceneTextureParameters = GetSceneTextureParameters(GraphBuilder, SceneTextures.UniformBuffer);
    FRDGTextureRef SceneColorTexture = SceneTextures.Color.Target;

    const FRDGSystemTextures& SystemTextures = FRDGSystemTextures::Get(GraphBuilder);

    // Each view must be processed separately.
    for (FViewInfo& View : Views)
    {
        RDG_GPU_MASK_SCOPE(GraphBuilder, View.GPUMask);

        const FPerViewPipelineState& ViewPipelineState = GetViewPipelineState(View);

        int32 DenoiseMode = CVarDiffuseIndirectDenoiser.GetValueOnRenderThread();

        // Set up the common diffuse indirect parameters.
        FCommonParameters CommonDiffuseParameters;
        SetupCommonDiffuseIndirectParameters(GraphBuilder, SceneTextureParameters, View, /* out */ CommonDiffuseParameters);

        // Update the legacy ray tracing config for the denoiser.
        IScreenSpaceDenoiser::FAmbientOcclusionRayTracingConfig RayTracingConfig;
        {
            RayTracingConfig.RayCountPerPixel = CommonDiffuseParameters.RayCountPerPixel;
            RayTracingConfig.ResolutionFraction = 1.0f / float(CommonDiffuseParameters.DownscaleFactor);
        }

        // Scene color from the previous frame.
        ScreenSpaceRayTracing::FPrevSceneColorMip PrevSceneColorMip;
        if ((ViewPipelineState.DiffuseIndirectMethod == EDiffuseIndirectMethod::Lumen || ViewPipelineState.DiffuseIndirectMethod == EDiffuseIndirectMethod::SSGI) && View.PrevViewInfo.ScreenSpaceRayTracingInput.IsValid())
        {
            PrevSceneColorMip = ScreenSpaceRayTracing::ReducePrevSceneColorMip(GraphBuilder, SceneTextureParameters, View);
        }

        // Denoiser input and output parameters.
        FSSDSignalTextures DenoiserOutputs;
        IScreenSpaceDenoiser::FDiffuseIndirectInputs DenoiserInputs;
        IScreenSpaceDenoiser::FDiffuseIndirectHarmonic DenoiserSphericalHarmonicInputs;
        FLumenReflectionCompositeParameters LumenReflectionCompositeParameters;
        bool bLumenUseDenoiserComposite = ViewPipelineState.bUseLumenProbeHierarchy;

        // Fill the denoiser inputs or outputs depending on the indirect lighting method.
        
        // Lumen probe hierarchy.
        if (ViewPipelineState.bUseLumenProbeHierarchy)
        {
            check(ViewPipelineState.DiffuseIndirectDenoiser == IScreenSpaceDenoiser::EMode::Disabled);
            DenoiserOutputs = RenderLumenProbeHierarchy(
                GraphBuilder,
                SceneTextures,
                CommonDiffuseParameters, PrevSceneColorMip,
                View, &View.PrevViewInfo);
        }
        // Screen space global illumination (SSGI).
        else if (ViewPipelineState.DiffuseIndirectMethod == EDiffuseIndirectMethod::SSGI)
        {
            RDG_EVENT_SCOPE(GraphBuilder, "SSGI %dx%d", CommonDiffuseParameters.TracingViewportSize.X, CommonDiffuseParameters.TracingViewportSize.Y);
            DenoiserInputs = ScreenSpaceRayTracing::CastStandaloneDiffuseIndirectRays(
                GraphBuilder, CommonDiffuseParameters, PrevSceneColorMip, View);
        }
        // Ray traced global illumination (RTGI).
        else if (ViewPipelineState.DiffuseIndirectMethod == EDiffuseIndirectMethod::RTGI)
        {
            // TODO: Refactor under the HybridIndirectLighting standard API.
            // TODO: hybrid SSGI / RTGI
            RenderRayTracingGlobalIllumination(GraphBuilder, SceneTextureParameters, View, /* out */ &RayTracingConfig, /* out */ &DenoiserInputs);
        }
        // Lumen global illumination.
        else if (ViewPipelineState.DiffuseIndirectMethod == EDiffuseIndirectMethod::Lumen)
        {
            check(ViewPipelineState.DiffuseIndirectDenoiser == IScreenSpaceDenoiser::EMode::Disabled);

            FLumenMeshSDFGridParameters MeshSDFGridParameters;

            DenoiserOutputs = RenderLumenScreenProbeGather(
                GraphBuilder, 
                SceneTextures,
                PrevSceneColorMip, 
                LightingChannelsTexture,
                View,
                &View.PrevViewInfo,
                bLumenUseDenoiserComposite,
                MeshSDFGridParameters);

            if (ViewPipelineState.ReflectionsMethod == EReflectionsMethod::Lumen)
            {
                DenoiserOutputs.Textures[2] = RenderLumenReflections(
                    GraphBuilder,
                    View,
                    SceneTextures, 
                    MeshSDFGridParameters,
                    LumenReflectionCompositeParameters);
            }

            if (!DenoiserOutputs.Textures[2])
            {
                DenoiserOutputs.Textures[2] = DenoiserOutputs.Textures[1];
            }
        }

        FRDGTextureRef AmbientOcclusionMask = DenoiserInputs.AmbientOcclusionMask;

        // Handle denoising.
        if (ViewPipelineState.DiffuseIndirectMethod == EDiffuseIndirectMethod::Lumen)
        {
            // Lumen's GI output is already denoised, so nothing needs to be done here.
        }
        else if (ViewPipelineState.DiffuseIndirectDenoiser == IScreenSpaceDenoiser::EMode::Disabled)
        {
            DenoiserOutputs.Textures[0] = DenoiserInputs.Color;
            DenoiserOutputs.Textures[1] = SystemTextures.White;
        }
        else
        {
            const IScreenSpaceDenoiser* DefaultDenoiser = IScreenSpaceDenoiser::GetDefaultDenoiser();
            const IScreenSpaceDenoiser* DenoiserToUse = 
                ViewPipelineState.DiffuseIndirectDenoiser == IScreenSpaceDenoiser::EMode::DefaultDenoiser
                ? DefaultDenoiser : GScreenSpaceDenoiser;

            RDG_EVENT_SCOPE(GraphBuilder, "%s%s(DiffuseIndirect) %dx%d",
                DenoiserToUse != DefaultDenoiser ? TEXT("ThirdParty ") : TEXT(""),
                DenoiserToUse->GetDebugName(),
                View.ViewRect.Width(), View.ViewRect.Height());

            if (ViewPipelineState.DiffuseIndirectMethod == EDiffuseIndirectMethod::RTGI)
            {
                // Denoise the RTGI result.
                DenoiserOutputs = DenoiserToUse->DenoiseDiffuseIndirect(
                    GraphBuilder,
                    View,
                    &View.PrevViewInfo,
                    SceneTextureParameters,
                    DenoiserInputs,
                    RayTracingConfig);

                AmbientOcclusionMask = DenoiserOutputs.Textures[1];
            }
            else if (ViewPipelineState.DiffuseIndirectMethod == EDiffuseIndirectMethod::SSGI)
            {
                // Denoise the SSGI result.
                DenoiserOutputs = DenoiserToUse->DenoiseScreenSpaceDiffuseIndirect(
                    GraphBuilder,
                    View,
                    &View.PrevViewInfo,
                    SceneTextureParameters,
                    DenoiserInputs,
                    RayTracingConfig);

                AmbientOcclusionMask = DenoiserOutputs.Textures[1];
            }
        }

        // Render ambient occlusion (AO).
        bool bWritableAmbientOcclusionMask = true;
        if (ViewPipelineState.AmbientOcclusionMethod == EAmbientOcclusionMethod::Disabled)
        {
            ensure(!HasBeenProduced(SceneTextures.ScreenSpaceAO));
            AmbientOcclusionMask = nullptr;
            bWritableAmbientOcclusionMask = false;
        }
        else if (ViewPipelineState.AmbientOcclusionMethod == EAmbientOcclusionMethod::RTAO)
        {
            RenderRayTracingAmbientOcclusion(
                GraphBuilder,
                View,
                SceneTextureParameters,
                &AmbientOcclusionMask);
        }
        else if (ViewPipelineState.AmbientOcclusionMethod == EAmbientOcclusionMethod::SSGI)
        {
            check(AmbientOcclusionMask);
        }
        else if (ViewPipelineState.AmbientOcclusionMethod == EAmbientOcclusionMethod::SSAO)
        {
            // Fetch result of SSAO that was done earlier.
            if (HasBeenProduced(SceneTextures.ScreenSpaceAO))
            {
                AmbientOcclusionMask = SceneTextures.ScreenSpaceAO;
            }
            else
            {
                AmbientOcclusionMask = GetScreenSpaceAOFallback(SystemTextures);
                bWritableAmbientOcclusionMask = false;
            }
        }
        else
        {
            unimplemented();
            bWritableAmbientOcclusionMask = false;
        }

        // Extract the dynamic AO for application of AO beyond RenderDiffuseIndirectAndAmbientOcclusion()
        if (AmbientOcclusionMask && ViewPipelineState.AmbientOcclusionMethod != EAmbientOcclusionMethod::SSAO)
        {
            ensureMsgf(Views.Num() == 1, TEXT("Need to add support for one AO texture per view in FSceneTextures"));
            SceneTextures.ScreenSpaceAO = AmbientOcclusionMask;
        }

        if (HairStrands::HasViewHairStrandsData(View) && (ViewPipelineState.AmbientOcclusionMethod == EAmbientOcclusionMethod::SSGI || ViewPipelineState.AmbientOcclusionMethod == EAmbientOcclusionMethod::SSAO) && bWritableAmbientOcclusionMask)
        {
            RenderHairStrandsAmbientOcclusion(
                GraphBuilder,
                View,
                AmbientOcclusionMask);
        }

        // Apply diffuse indirect lighting and ambient occlusion to the scene color.
        if ((DenoiserOutputs.Textures[0] || AmbientOcclusionMask) && (!bIsVisualizePass || ViewPipelineState.DiffuseIndirectDenoiser != IScreenSpaceDenoiser::EMode::Disabled || ViewPipelineState.DiffuseIndirectMethod == EDiffuseIndirectMethod::Lumen)
            && !IsMetalPlatform(ShaderPlatform))
        {
            // The pixel shader used here is FDiffuseIndirectCompositePS.
            FDiffuseIndirectCompositePS::FParameters* PassParameters = GraphBuilder.AllocParameters<FDiffuseIndirectCompositePS::FParameters>();
            
            PassParameters->AmbientOcclusionStaticFraction = FMath::Clamp(View.FinalPostProcessSettings.AmbientOcclusionStaticFraction, 0.0f, 1.0f);

            PassParameters->ApplyAOToDynamicDiffuseIndirect = 0.0f;

            if (ViewPipelineState.DiffuseIndirectMethod == EDiffuseIndirectMethod::Lumen)
            {
                PassParameters->ApplyAOToDynamicDiffuseIndirect = 1.0f;
            }

            const FIntPoint BufferExtent = SceneTextureParameters.SceneDepthTexture->Desc.Extent;

            {
                // Placeholder texture for textures pulled in from SSDCommon.ush
                FRDGTextureDesc Desc = FRDGTextureDesc::Create2D(
                    FIntPoint(1),
                    PF_R32_UINT,
                    FClearValueBinding::Black,
                    TexCreate_ShaderResource);
                FRDGTextureRef CompressedMetadataPlaceholder = GraphBuilder.CreateTexture(Desc, TEXT("CompressedMetadataPlaceholder"));

                PassParameters->CompressedMetadata[0] = CompressedMetadataPlaceholder;
                PassParameters->CompressedMetadata[1] = CompressedMetadataPlaceholder;
            }

            PassParameters->BufferUVToOutputPixelPosition = BufferExtent;
            PassParameters->EyeAdaptation = GetEyeAdaptationTexture(GraphBuilder, View);
            PassParameters->LumenReflectionCompositeParameters = LumenReflectionCompositeParameters;

            PassParameters->bVisualizeDiffuseIndirect = bIsVisualizePass;

            PassParameters->DiffuseIndirect = DenoiserOutputs;
            PassParameters->DiffuseIndirectSampler = TStaticSamplerState<SF_Point>::GetRHI();

            PassParameters->PreIntegratedGF = GSystemTextures.PreintegratedGF->GetRenderTargetItem().ShaderResourceTexture;
            PassParameters->PreIntegratedGFSampler = TStaticSamplerState<SF_Bilinear, AM_Clamp, AM_Clamp, AM_Clamp>::GetRHI();

            PassParameters->AmbientOcclusionTexture = AmbientOcclusionMask;
            PassParameters->AmbientOcclusionSampler = TStaticSamplerState<SF_Point>::GetRHI();
            
            if (!PassParameters->AmbientOcclusionTexture || bIsVisualizePass)
            {
                PassParameters->AmbientOcclusionTexture = SystemTextures.White;
            }

            // Set up the denoiser's common shader parameters.
            Denoiser::SetupCommonShaderParameters(
                View, SceneTextureParameters,
                View.ViewRect,
                1.0f / CommonDiffuseParameters.DownscaleFactor,
                /* out */ &PassParameters->DenoiserCommonParameters);
            PassParameters->SceneTextures = SceneTextureParameters;
            PassParameters->ViewUniformBuffer = View.ViewUniformBuffer;

            PassParameters->RenderTargets[0] = FRenderTargetBinding(
                SceneColorTexture, ERenderTargetLoadAction::ELoad);

            {
                FRDGTextureDesc Desc = FRDGTextureDesc::Create2D(
                    SceneColorTexture->Desc.Extent,
                    PF_FloatRGBA,
                    FClearValueBinding::None,
                    TexCreate_ShaderResource | TexCreate_UAV);

                PassParameters->PassDebugOutput = GraphBuilder.CreateUAV(
                    GraphBuilder.CreateTexture(Desc, TEXT("DebugDiffuseIndirectComposite")));
            }

            const TCHAR* DiffuseIndirectSampling = TEXT("Disabled");
            FDiffuseIndirectCompositePS::FPermutationDomain PermutationVector;
            bool bUpscale = false;

            if (DenoiserOutputs.Textures[0])
            {
                if (bLumenUseDenoiserComposite)
                {
                    PermutationVector.Set<FDiffuseIndirectCompositePS::FApplyDiffuseIndirectDim>(2);
                    DiffuseIndirectSampling = TEXT("ProbeHierarchy");
                }
                else if (ViewPipelineState.DiffuseIndirectMethod == EDiffuseIndirectMethod::RTGI)
                {
                    PermutationVector.Set<FDiffuseIndirectCompositePS::FApplyDiffuseIndirectDim>(3);
                    DiffuseIndirectSampling = TEXT("RTGI");
                }
                else if (ViewPipelineState.DiffuseIndirectMethod == EDiffuseIndirectMethod::Lumen)
                {
                    PermutationVector.Set<FDiffuseIndirectCompositePS::FApplyDiffuseIndirectDim>(4);
                    DiffuseIndirectSampling = TEXT("ScreenProbeGather");
                }
                else
                {
                    PermutationVector.Set<FDiffuseIndirectCompositePS::FApplyDiffuseIndirectDim>(1);
                    DiffuseIndirectSampling = TEXT("SSGI");
                    bUpscale = DenoiserOutputs.Textures[0]->Desc.Extent != SceneColorTexture->Desc.Extent;
                }

                PermutationVector.Set<FDiffuseIndirectCompositePS::FUpscaleDiffuseIndirectDim>(bUpscale);
            }

            TShaderMapRef<FDiffuseIndirectCompositePS> PixelShader(View.ShaderMap, PermutationVector);
            // Clear shader resource bindings that this permutation does not use.
            ClearUnusedGraphResources(PixelShader, PassParameters);

            FRHIBlendState* BlendState = TStaticBlendState<CW_RGBA, BO_Add, BF_One, BF_Source1Color, BO_Add, BF_One, BF_Source1Alpha>::GetRHI();

            if (bIsVisualizePass)
            {
                BlendState = TStaticBlendState<>::GetRHI();
            }

            // The indirect lighting composite pass.
            FPixelShaderUtils::AddFullscreenPass(
                GraphBuilder,
                View.ShaderMap,
                RDG_EVENT_NAME(
                    "DiffuseIndirectComposite(DiffuseIndirect=%s%s%s%s) %dx%d",
                    DiffuseIndirectSampling,
                    PermutationVector.Get<FDiffuseIndirectCompositePS::FUpscaleDiffuseIndirectDim>() ? TEXT(" UpscaleDiffuseIndirect") : TEXT(""),
                    AmbientOcclusionMask ? TEXT(" ApplyAOToSceneColor") : TEXT(""),
                    PassParameters->ApplyAOToDynamicDiffuseIndirect > 0.0f ? TEXT(" ApplyAOToDynamicDiffuseIndirect") : TEXT(""),
                    View.ViewRect.Width(), View.ViewRect.Height()),
                PixelShader,
                PassParameters,
                View.ViewRect,
                BlendState);
        } // if (DenoiserOutputs.Color || bApplySSAO)

        // Apply the ambient cubemaps.
        if (IsAmbientCubemapPassRequired(View) && !bIsVisualizePass && !ViewPipelineState.bUseLumenProbeHierarchy)
        {
            FAmbientCubemapCompositePS::FParameters* PassParameters = GraphBuilder.AllocParameters<FAmbientCubemapCompositePS::FParameters>();
            
            PassParameters->PreIntegratedGF = GSystemTextures.PreintegratedGF->GetRenderTargetItem().ShaderResourceTexture;
            PassParameters->PreIntegratedGFSampler = TStaticSamplerState<SF_Bilinear, AM_Clamp, AM_Clamp, AM_Clamp>::GetRHI();
            
            PassParameters->AmbientOcclusionTexture = AmbientOcclusionMask;
            PassParameters->AmbientOcclusionSampler = TStaticSamplerState<SF_Point>::GetRHI();
            
            if (!PassParameters->AmbientOcclusionTexture)
            {
                PassParameters->AmbientOcclusionTexture = SystemTextures.White;
            }

            PassParameters->SceneTextures = SceneTextureParameters;
            PassParameters->ViewUniformBuffer = View.ViewUniformBuffer;

            PassParameters->RenderTargets[0] = FRenderTargetBinding(
                SceneColorTexture, ERenderTargetLoadAction::ELoad);
        
            TShaderMapRef<FAmbientCubemapCompositePS> PixelShader(View.ShaderMap);
            GraphBuilder.AddPass(
                RDG_EVENT_NAME("AmbientCubemapComposite %dx%d", View.ViewRect.Width(), View.ViewRect.Height()),
                PassParameters,
                ERDGPassFlags::Raster,
                [PassParameters, &View, PixelShader](FRHICommandList& RHICmdList)
            {
                TShaderMapRef<FPostProcessVS> VertexShader(View.ShaderMap);
                
                RHICmdList.SetViewport(View.ViewRect.Min.X, View.ViewRect.Min.Y, 0.0f, View.ViewRect.Max.X, View.ViewRect.Max.Y, 0.0);

                FGraphicsPipelineStateInitializer GraphicsPSOInit;
                RHICmdList.ApplyCachedRenderTargets(GraphicsPSOInit);

                // set the state
                GraphicsPSOInit.BlendState = TStaticBlendState<CW_RGB, BO_Add, BF_One, BF_One, BO_Add, BF_One, BF_One>::GetRHI();
                GraphicsPSOInit.RasterizerState = TStaticRasterizerState<>::GetRHI();
                GraphicsPSOInit.DepthStencilState = TStaticDepthStencilState<false, CF_Always>::GetRHI();

                GraphicsPSOInit.BoundShaderState.VertexDeclarationRHI = GFilterVertexDeclaration.VertexDeclarationRHI;
                GraphicsPSOInit.BoundShaderState.VertexShaderRHI = VertexShader.GetVertexShader();
                GraphicsPSOInit.BoundShaderState.PixelShaderRHI = PixelShader.GetPixelShader();
                GraphicsPSOInit.PrimitiveType = PT_TriangleList;

                SetGraphicsPipelineState(RHICmdList, GraphicsPSOInit);

                uint32 Count = View.FinalPostProcessSettings.ContributingCubemaps.Num();
                for (const FFinalPostProcessSettings::FCubemapEntry& CubemapEntry : View.FinalPostProcessSettings.ContributingCubemaps)
                {
                    FAmbientCubemapCompositePS::FParameters ShaderParameters = *PassParameters;
                    SetupAmbientCubemapParameters(CubemapEntry, &ShaderParameters.AmbientCubemap);
                    SetShaderParameters(RHICmdList, PixelShader, PixelShader.GetPixelShader(), ShaderParameters);
                    
                    DrawPostProcessPass(
                        RHICmdList,
                        0, 0,
                        View.ViewRect.Width(), View.ViewRect.Height(),
                        View.ViewRect.Min.X, View.ViewRect.Min.Y,
                        View.ViewRect.Width(), View.ViewRect.Height(),
                        View.ViewRect.Size(),
                        GetSceneTextureExtent(),
                        VertexShader,
                        View.StereoPass, 
                        false, // TODO.
                        EDRF_UseTriangleOptimization);
                }
            });
        } // if (IsAmbientCubemapPassRequired(View))
    } // for (FViewInfo& View : Views)
}
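
To make the final composite step easier to follow, here is a minimal standalone sketch of two details from the listing above: how the FApplyDiffuseIndirectDim permutation value is chosen, and what the blend state `TStaticBlendState<CW_RGBA, BO_Add, BF_One, BF_Source1Color, BO_Add, BF_One, BF_Source1Alpha>` computes. The names `EMethod`, `FColor4`, `SelectApplyDiffuseIndirectDim` and `DualSourceBlend` are hypothetical and exist only for this illustration; they are not engine types or functions. The sketch also assumes that a permutation dimension left unset keeps the value 0.

```cpp
// Hypothetical stand-in for EDiffuseIndirectMethod (illustration only).
enum class EMethod { Disabled, SSGI, RTGI, Lumen };

// Mirrors the branch order used in the listing when picking the
// FApplyDiffuseIndirectDim value:
//   0 = no diffuse indirect texture (assumed default of an unset dimension)
//   1 = SSGI, 2 = probe hierarchy, 3 = RTGI, 4 = screen probe gather
int SelectApplyDiffuseIndirectDim(bool bHasDiffuseIndirect, bool bUseDenoiserComposite, EMethod Method)
{
    if (!bHasDiffuseIndirect)
    {
        return 0;
    }
    if (bUseDenoiserComposite)
    {
        return 2;
    }
    if (Method == EMethod::RTGI)
    {
        return 3;
    }
    if (Method == EMethod::Lumen)
    {
        return 4;
    }
    return 1; // SSGI is the fallthrough case in the listing.
}

// Hypothetical RGBA color type (illustration only).
struct FColor4
{
    float R, G, B, A;
};

// Dual-source blending with source factor One and destination factor
// Source1Color (Source1Alpha for the alpha channel), both with BO_Add:
//   Result = Src0 * 1 + Dst * Src1
// Src0 is the pixel shader's first output, Src1 its second output, and
// Dst the scene color already in the render target.
FColor4 DualSourceBlend(const FColor4& Src0, const FColor4& Src1, const FColor4& Dst)
{
    return FColor4{
        Src0.R + Dst.R * Src1.R,
        Src0.G + Dst.G * Src1.G,
        Src0.B + Dst.B * Src1.B,
        Src0.A + Dst.A * Src1.A};
}
```

Read this way, the pass can add light through the first output and scale the existing scene color through the second output in a single draw, which is presumably how ambient occlusion gets applied to lighting that is already in scene color. In the visualize pass the blend state is replaced by the default `TStaticBlendState<>`, so the shader output simply overwrites the target.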

6.5.7.2 RenderLumenScreenProbeGather

RenderLumenScreenProbeGather renders Lumen's screen space probe gather. Its code is as follows:

// Engine\Source\Runtime\Renderer\Private\Lumen\LumenScreenProbeGather.cpp

FSSDSignalTextures FDeferredShadingSceneRenderer::RenderLumenScreenProbeGather(
    FRDGBuilder& GraphBuilder,
    const FSceneTextures& SceneTextures,
    const ScreenSpaceRayTracing::FPrevSceneColorMip& PrevSceneColorMip,
    FRDGTextureRef LightingChannelsTexture,
    const FViewInfo& View,
    FPreviousViewInfo* PreviousViewInfos,
    bool& bLumenUseDenoiserComposite,
    FLumenMeshSDFGridParameters& MeshSDFGridParameters)
{
    LLM_SCOPE_BYTAG(Lumen);

    // Render the Lumen irradiance field gather instead.
    if (GLumenIrradianceFieldGather != 0)
    {
        bLumenUseDenoiserComposite = false;
        return RenderLumenIrradianceFieldGather(GraphBuilder, SceneTextures, View);
    }

    RDG_EVENT_SCOPE(GraphBuilder, "LumenScreenProbeGather");
    RDG_GPU_STAT_SCOPE(GraphBuilder, LumenScreenProbeGather);

    check(ShouldRenderLumenDiffuseGI(Scene, View, true));
    const FRDGSystemTextures& SystemTextures = FRDGSystemTextures::Get(GraphBuilder);

    if (!LightingChannelsTexture)
    {
        LightingChannelsTexture = SystemTextures.Black;
    }

    // If LumenScreenProbeGather is disabled, return cleared (black) denoiser inputs.
    if (!GLumenScreenProbeGather)
    {
        FSSDSignalTextures ScreenSpaceDenoiserInputs;
        ScreenSpaceDenoiserInputs.Textures[0] = SystemTextures.Black;
        FRDGTextureDesc RoughSpecularIndirectDesc = FRDGTextureDesc::Create2D(SceneTextures.Config.Extent, PF_FloatRGB, FClearValueBinding::Black, TexCreate_ShaderResource | TexCreate_UAV);
        ScreenSpaceDenoiserInputs.Textures[1] = GraphBuilder.CreateTexture(RoughSpecularIndirectDesc, TEXT("Lumen.ScreenProbeGather.RoughSpecularIndirect"));
        AddClearUAVPass(GraphBuilder, GraphBuilder.CreateUAV(FRDGTextureUAVDesc(ScreenSpaceDenoiserInputs.Textures[1])), FLinearColor::Black);
        bLumenUseDenoiserComposite = false;
        return ScreenSpaceDenoiserInputs;
    }

    // Pull from the uniform buffer to get fallback textures.
    const FSceneTextureParameters SceneTextureParameters = GetSceneTextureParameters(GraphBuilder, SceneTextures.UniformBuffer);

    // Set up the screen probe parameters.
    FScreenProbeParameters ScreenProbeParameters;
    ScreenProbeParameters.ScreenProbeTracingOctahedronResolution = LumenScreenProbeGather::GetTracingOctahedronResolution(View);
    ensureMsgf(ScreenProbeParameters.ScreenProbeTracingOctahedronResolution < (1 << 6) - 1, TEXT("Tracing resolution %u was larger than supported by PackRayInfo()"), ScreenProbeParameters.ScreenProbeTracingOctahedronResolution);
    ScreenProbeParameters.ScreenProbeGatherOctahedronResolution = LumenScreenProbeGather::GetGatherOctahedronResolution(ScreenProbeParameters.ScreenProbeTracingOctahedronResolution);
    ScreenProbeParameters.ScreenProbeGatherOctahedronResolutionWithBorder = ScreenProbeParameters.ScreenProbeGatherOctahedronResolution + 2 * (1 << (GLumenScreenProbeGatherNumMips - 1));
    ScreenProbeParameters.ScreenProbeDownsampleFactor = LumenScreenProbeGather::GetScreenDownsampleFactor(View);

    ScreenProbeParameters.ScreenProbeViewSize = FIntPoint::DivideAndRoundUp(View.ViewRect.Size(), (int32)ScreenProbeParameters.ScreenProbeDownsampleFactor);
    ScreenProbeParameters.ScreenProbeAtlasViewSize = ScreenProbeParameters.ScreenProbeViewSize;
    ScreenProbeParameters.ScreenProbeAtlasViewSize.Y += FMath::TruncToInt(ScreenProbeParameters.ScreenProbeViewSize.Y * GLumenScreenProbeGatherAdaptiveProbeAllocationFraction);

    ScreenProbeParameters.ScreenProbeAtlasBufferSize = FIntPoint::DivideAndRoundUp(SceneTextures.Config.Extent, (int32)ScreenProbeParameters.ScreenProbeDownsampleFactor);
    ScreenProbeParameters.ScreenProbeAtlasBufferSize.Y += FMath::TruncToInt(ScreenProbeParameters.ScreenProbeAtlasBufferSize.Y * GLumenScreenProbeGatherAdaptiveProbeAllocationFraction);

    ScreenProbeParameters.ScreenProbeGatherMaxMip = GLumenScreenProbeGatherNumMips - 1;
    ScreenProbeParameters.RelativeSpeedDifferenceToConsiderLightingMoving = GLumenScreenProbeRelativeSpeedDifferenceToConsiderLightingMoving;
    ScreenProbeParameters.ScreenTraceNoFallbackThicknessScale = Lumen::UseHardwareRayTracedScreenProbeGather() ? 1.0f : GLumenScreenProbeScreenTracesThicknessScaleWhenNoFallback;
    ScreenProbeParameters.NumUniformScreenProbes = ScreenProbeParameters.ScreenProbeViewSize.X * ScreenProbeParameters.ScreenProbeViewSize.Y;
    ScreenProbeParameters.MaxNumAdaptiveProbes = FMath::TruncToInt(ScreenProbeParameters.NumUniformScreenProbes * GLumenScreenProbeGatherAdaptiveProbeAllocationFraction);
    extern int32 GLumenScreenProbeGatherVisualizeTraces;
    ScreenProbeParameters.FixedJitterIndex = GLumenScreenProbeGatherVisualizeTraces == 0 ? GLumenScreenProbeFixedJitterIndex : 6;

    FRDGTextureDesc DownsampledDepthDesc(FRDGTextureDesc::Create2D(ScreenProbeParameters.ScreenProbeAtlasBufferSize, PF_R32_UINT, FClearValueBinding::Black, TexCreate_ShaderResource | TexCreate_UAV));
    ScreenProbeParameters.ScreenProbeSceneDepth = GraphBuilder.CreateTexture(DownsampledDepthDesc, TEXT("Lumen.ScreenProbeGather.ScreenProbeSceneDepth"));

    FRDGTextureDesc DownsampledSpeedDesc(FRDGTextureDesc::Create2D(ScreenProbeParameters.ScreenProbeAtlasBufferSize, PF_R16F, FClearValueBinding::Black, TexCreate_ShaderResource | TexCreate_UAV));
    ScreenProbeParameters.ScreenProbeWorldSpeed = GraphBuilder.CreateTexture(DownsampledSpeedDesc, TEXT("Lumen.ScreenProbeGather.ScreenProbeWorldSpeed"));

    FBlueNoise BlueNoise;
    InitializeBlueNoise(BlueNoise);
    ScreenProbeParameters.BlueNoise = CreateUniformBufferImmediate(BlueNoise, EUniformBufferUsage::UniformBuffer_SingleDraw);

    ScreenProbeParameters.OctahedralSolidAngleParameters.OctahedralSolidAngleTextureResolutionSq = GLumenOctahedralSolidAngleTextureSize * GLumenOctahedralSolidAngleTextureSize;
    ScreenProbeParameters.OctahedralSolidAngleParameters.OctahedralSolidAngleTexture = InitializeOctahedralSolidAngleTexture(GraphBuilder, View.ShaderMap, GLumenOctahedralSolidAngleTextureSize, View.ViewState->Lumen.ScreenProbeGatherState.OctahedralSolidAngleTextureRT);

    // Downsample scene depth for the uniformly placed screen probes.
    {
        FScreenProbeDownsampleDepthUniformCS::FParameters* PassParameters = GraphBuilder.AllocParameters<FScreenProbeDownsampleDepthUniformCS::FParameters>();
        PassParameters->RWScreenProbeSceneDepth = GraphBuilder.CreateUAV(FRDGTextureUAVDesc(ScreenProbeParameters.ScreenProbeSceneDepth));
        PassParameters->RWScreenProbeWorldSpeed = GraphBuilder.CreateUAV(FRDGTextureUAVDesc(ScreenProbeParameters.ScreenProbeWorldSpeed));
        PassParameters->View = View.ViewUniformBuffer;
        PassParameters->SceneTexturesStruct = SceneTextures.UniformBuffer;
        PassParameters->SceneTextures = SceneTextureParameters;
        PassParameters->ScreenProbeParameters = ScreenProbeParameters;

        auto ComputeShader = View.ShaderMap->GetShader<FScreenProbeDownsampleDepthUniformCS>(0);

        FComputeShaderUtils::AddPass(
            GraphBuilder,
            RDG_EVENT_NAME("UniformPlacement DownsampleFactor=%u", ScreenProbeParameters.ScreenProbeDownsampleFactor),
            ComputeShader,
            PassParameters,
            FComputeShaderUtils::GetGroupCount(ScreenProbeParameters.ScreenProbeViewSize, FScreenProbeDownsampleDepthUniformCS::GetGroupSize()));
    }

    FRDGBufferRef NumAdaptiveScreenProbes = GraphBuilder.CreateBuffer(FRDGBufferDesc::CreateBufferDesc(sizeof(uint32), 1), TEXT("Lumen.ScreenProbeGather.NumAdaptiveScreenProbes"));
    FRDGBufferRef AdaptiveScreenProbeData = GraphBuilder.CreateBuffer(FRDGBufferDesc::CreateBufferDesc(sizeof(uint32), FMath::Max<uint32>(ScreenProbeParameters.MaxNumAdaptiveProbes, 1)), TEXT("Lumen.ScreenProbeGather.daptiveScreenProbeData"));

    ScreenProbeParameters.NumAdaptiveScreenProbes = GraphBuilder.CreateSRV(FRDGBufferSRVDesc(NumAdaptiveScreenProbes, PF_R32_UINT));
    ScreenProbeParameters.AdaptiveScreenProbeData = GraphBuilder.CreateSRV(FRDGBufferSRVDesc(AdaptiveScreenProbeData, PF_R32_UINT));

    const FIntPoint ScreenProbeViewportBufferSize = FIntPoint::DivideAndRoundUp(SceneTextures.Config.Extent, (int32)ScreenProbeParameters.ScreenProbeDownsampleFactor);
    FRDGTextureDesc ScreenTileAdaptiveProbeHeaderDesc(FRDGTextureDesc::Create2D(ScreenProbeViewportBufferSize, PF_R32_UINT, FClearValueBinding::Black, TexCreate_ShaderResource | TexCreate_UAV));
    FIntPoint ScreenTileAdaptiveProbeIndicesBufferSize = FIntPoint(ScreenProbeViewportBufferSize.X * ScreenProbeParameters.ScreenProbeDownsampleFactor, ScreenProbeViewportBufferSize.Y * ScreenProbeParameters.ScreenProbeDownsampleFactor);
    FRDGTextureDesc ScreenTileAdaptiveProbeIndicesDesc(FRDGTextureDesc::Create2D(ScreenTileAdaptiveProbeIndicesBufferSize, PF_R16_UINT, FClearValueBinding::Black, TexCreate_ShaderResource | TexCreate_UAV));
    ScreenProbeParameters.ScreenTileAdaptiveProbeHeader = GraphBuilder.CreateTexture(ScreenTileAdaptiveProbeHeaderDesc, TEXT("Lumen.ScreenProbeGather.ScreenTileAdaptiveProbeHeader"));
    ScreenProbeParameters.ScreenTileAdaptiveProbeIndices = GraphBuilder.CreateTexture(ScreenTileAdaptiveProbeIndicesDesc, TEXT("Lumen.ScreenProbeGather.ScreenTileAdaptiveProbeIndices"));

    FComputeShaderUtils::ClearUAV(GraphBuilder, View.ShaderMap, GraphBuilder.CreateUAV(FRDGBufferUAVDesc(NumAdaptiveScreenProbes, PF_R32_UINT)), 0);
    uint32 ClearValues[4] = {0, 0, 0, 0};
    AddClearUAVPass(GraphBuilder, GraphBuilder.CreateUAV(FRDGTextureUAVDesc(ScreenProbeParameters.ScreenTileAdaptiveProbeHeader)), ClearValues);

    const uint32 AdaptiveProbeMinDownsampleFactor = FMath::Clamp(GLumenScreenProbeGatherAdaptiveProbeMinDownsampleFactor, 1, 64);

    if (ScreenProbeParameters.MaxNumAdaptiveProbes > 0 && AdaptiveProbeMinDownsampleFactor < ScreenProbeParameters.ScreenProbeDownsampleFactor)
    { 
        // Adaptively place additional screen probes, halving the downsample factor on each iteration.
        uint32 PlacementDownsampleFactor = ScreenProbeParameters.ScreenProbeDownsampleFactor;
        do
        {
            PlacementDownsampleFactor /= 2;
            FScreenProbeAdaptivePlacementCS::FParameters* PassParameters = GraphBuilder.AllocParameters<FScreenProbeAdaptivePlacementCS::FParameters>();
            PassParameters->RWScreenProbeSceneDepth = GraphBuilder.CreateUAV(FRDGTextureUAVDesc(ScreenProbeParameters.ScreenProbeSceneDepth));
            PassParameters->RWScreenProbeWorldSpeed = GraphBuilder.CreateUAV(FRDGTextureUAVDesc(ScreenProbeParameters.ScreenProbeWorldSpeed));
            PassParameters->RWNumAdaptiveScreenProbes = GraphBuilder.CreateUAV(FRDGBufferUAVDesc(NumAdaptiveScreenProbes, PF_R32_UINT));
            PassParameters->RWAdaptiveScreenProbeData = GraphBuilder.CreateUAV(FRDGBufferUAVDesc(AdaptiveScreenProbeData, PF_R32_UINT));
            PassParameters->RWScreenTileAdaptiveProbeHeader = GraphBuilder.CreateUAV(FRDGTextureUAVDesc(ScreenProbeParameters.ScreenTileAdaptiveProbeHeader));
            PassParameters->RWScreenTileAdaptiveProbeIndices = GraphBuilder.CreateUAV(FRDGTextureUAVDesc(ScreenProbeParameters.ScreenTileAdaptiveProbeIndices));
            PassParameters->View = View.ViewUniformBuffer;
            PassParameters->SceneTexturesStruct = SceneTextures.UniformBuffer;
            PassParameters->SceneTextures = SceneTextureParameters;
            PassParameters->ScreenProbeParameters = ScreenProbeParameters;
            PassParameters->PlacementDownsampleFactor = PlacementDownsampleFactor;

            auto ComputeShader = View.ShaderMap->GetShader<FScreenProbeAdaptivePlacementCS>(0);

            FComputeShaderUtils::AddPass(
                GraphBuilder,
                RDG_EVENT_NAME("AdaptivePlacement DownsampleFactor=%u", PlacementDownsampleFactor),
                ComputeShader,
                PassParameters,
                FComputeShaderUtils::GetGroupCount(FIntPoint::DivideAndRoundDown(View.ViewRect.Size(), (int32)PlacementDownsampleFactor), FScreenProbeAdaptivePlacementCS::GetGroupSize()));
        }
        while (PlacementDownsampleFactor > AdaptiveProbeMinDownsampleFactor);
    }
    else
    {
        FComputeShaderUtils::ClearUAV(GraphBuilder, View.ShaderMap, GraphBuilder.CreateUAV(FRDGBufferUAVDesc(AdaptiveScreenProbeData, PF_R32_UINT)), 0);
        AddClearUAVPass(GraphBuilder, GraphBuilder.CreateUAV(FRDGTextureUAVDesc(ScreenProbeParameters.ScreenTileAdaptiveProbeIndices)), ClearValues);
    }

    FRDGBufferRef ScreenProbeIndirectArgs = GraphBuilder.CreateBuffer(FRDGBufferDesc::CreateIndirectDesc<FRHIDispatchIndirectParameters>((uint32)EScreenProbeIndirectArgs::Max), TEXT("Lumen.ScreenProbeGather.ScreenProbeIndirectArgs"));

    // Set up the indirect dispatch arguments for the adaptive probes.
    {
        FSetupAdaptiveProbeIndirectArgsCS::FParameters* PassParameters = GraphBuilder.AllocParameters<FSetupAdaptiveProbeIndirectArgsCS::FParameters>();
        PassParameters->RWScreenProbeIndirectArgs = GraphBuilder.CreateUAV(FRDGBufferUAVDesc(ScreenProbeIndirectArgs, PF_R32_UINT));
        PassParameters->ScreenProbeParameters = ScreenProbeParameters;

        auto ComputeShader = View.ShaderMap->GetShader<FSetupAdaptiveProbeIndirectArgsCS>(0);

        FComputeShaderUtils::AddPass(
            GraphBuilder,
            RDG_EVENT_NAME("SetupAdaptiveProbeIndirectArgs"),
            ComputeShader,
            PassParameters,
            FIntVector(1, 1, 1));
    }

    ScreenProbeParameters.ProbeIndirectArgs = ScreenProbeIndirectArgs;

    FLumenCardTracingInputs TracingInputs(GraphBuilder, Scene, View);

    FRDGTextureRef BRDFProbabilityDensityFunction = nullptr;
    FRDGBufferSRVRef BRDFProbabilityDensityFunctionSH = nullptr;
    GenerateBRDF_PDF(GraphBuilder, View, SceneTextures, BRDFProbabilityDensityFunction, BRDFProbabilityDensityFunctionSH, ScreenProbeParameters);

    const LumenRadianceCache::FRadianceCacheInputs RadianceCacheInputs = LumenScreenProbeGatherRadianceCache::SetupRadianceCacheInputs();
    LumenRadianceCache::FRadianceCacheInterpolationParameters RadianceCacheParameters;

    // Radiance cache.
    if (LumenScreenProbeGather::UseRadianceCache(View))
    {
        FScreenGatherMarkUsedProbesData MarkUsedProbesData;
        MarkUsedProbesData.Parameters.View = View.ViewUniformBuffer;
        MarkUsedProbesData.Parameters.SceneTexturesStruct = SceneTextures.UniformBuffer;
        MarkUsedProbesData.Parameters.ScreenProbeParameters = ScreenProbeParameters;
        MarkUsedProbesData.Parameters.VisualizeLumenScene = View.Family->EngineShowFlags.VisualizeLumenScene != 0 ? 1 : 0;
        MarkUsedProbesData.Parameters.RadianceCacheParameters = RadianceCacheParameters;

        // Render the radiance cache.
        RenderRadianceCache(
            GraphBuilder, 
            TracingInputs, 
            RadianceCacheInputs, 
            Scene,
            View, 
            &ScreenProbeParameters, 
            BRDFProbabilityDensityFunctionSH, 
            FMarkUsedRadianceCacheProbes::CreateStatic(&ScreenGatherMarkUsedProbes), 
            &MarkUsedProbesData, 
            View.ViewState->RadianceCacheState, 
            RadianceCacheParameters);
    }

    if (LumenScreenProbeGather::UseImportanceSampling(View))
    {
        // Generate the importance-sampled rays.
        GenerateImportanceSamplingRays(
            GraphBuilder,
            View,
            SceneTextures,
            RadianceCacheParameters,
            BRDFProbabilityDensityFunction,
            BRDFProbabilityDensityFunctionSH,
            ScreenProbeParameters);
    }

    const FIntPoint ScreenProbeTraceBufferSize = ScreenProbeParameters.ScreenProbeAtlasBufferSize * ScreenProbeParameters.ScreenProbeTracingOctahedronResolution;
    FRDGTextureDesc TraceRadianceDesc(FRDGTextureDesc::Create2D(ScreenProbeTraceBufferSize, PF_FloatRGB, FClearValueBinding::Black, TexCreate_ShaderResource | TexCreate_UAV));
    ScreenProbeParameters.TraceRadiance = GraphBuilder.CreateTexture(TraceRadianceDesc, TEXT("Lumen.ScreenProbeGather.TraceRadiance"));
    ScreenProbeParameters.RWTraceRadiance = GraphBuilder.CreateUAV(FRDGTextureUAVDesc(ScreenProbeParameters.TraceRadiance));

    FRDGTextureDesc TraceHitDesc(FRDGTextureDesc::Create2D(ScreenProbeTraceBufferSize, PF_R32_UINT, FClearValueBinding::Black, TexCreate_ShaderResource | TexCreate_UAV));
    ScreenProbeParameters.TraceHit = GraphBuilder.CreateTexture(TraceHitDesc, TEXT("Lumen.ScreenProbeGather.TraceHit"));
    ScreenProbeParameters.RWTraceHit = GraphBuilder.CreateUAV(FRDGTextureUAVDesc(ScreenProbeParameters.TraceHit));

    // Trace the screen-space probes.
    TraceScreenProbes(
        GraphBuilder, 
        Scene,
        View, 
        GLumenGatherCvars.TraceMeshSDFs != 0 && Lumen::UseMeshSDFTracing(),
        SceneTextures.UniformBuffer,
        PrevSceneColorMip,
        LightingChannelsTexture,
        TracingInputs,
        RadianceCacheParameters,
        ScreenProbeParameters,
        MeshSDFGridParameters);
    
    FScreenProbeGatherParameters GatherParameters;
    // Filter the screen-space probes.
    FilterScreenProbes(GraphBuilder, View, ScreenProbeParameters, GatherParameters);

    FScreenSpaceBentNormalParameters ScreenSpaceBentNormalParameters;
    ScreenSpaceBentNormalParameters.UseScreenBentNormal = 0;
    ScreenSpaceBentNormalParameters.ScreenBentNormal = SystemTextures.Black;
    ScreenSpaceBentNormalParameters.ScreenDiffuseLighting = SystemTextures.Black;

    // Compute the screen-space bent normal.
    if (LumenScreenProbeGather::UseScreenSpaceBentNormal())
    {
        ScreenSpaceBentNormalParameters = ComputeScreenSpaceBentNormal(GraphBuilder, Scene, View, SceneTextures, LightingChannelsTexture, ScreenProbeParameters);
    }

    FRDGTextureDesc DiffuseIndirectDesc = FRDGTextureDesc::Create2D(SceneTextures.Config.Extent, PF_FloatRGBA, FClearValueBinding::Black, TexCreate_ShaderResource | TexCreate_UAV);
    FRDGTextureRef DiffuseIndirect = GraphBuilder.CreateTexture(DiffuseIndirectDesc, TEXT("Lumen.ScreenProbeGather.DiffuseIndirect"));

    FRDGTextureDesc RoughSpecularIndirectDesc = FRDGTextureDesc::Create2D(SceneTextures.Config.Extent, PF_FloatRGB, FClearValueBinding::Black, TexCreate_ShaderResource | TexCreate_UAV);
    FRDGTextureRef RoughSpecularIndirect = GraphBuilder.CreateTexture(RoughSpecularIndirectDesc, TEXT("Lumen.ScreenProbeGather.RoughSpecularIndirect"));

    {
        FScreenProbeIndirectCS::FParameters* PassParameters = GraphBuilder.AllocParameters<FScreenProbeIndirectCS::FParameters>();
        PassParameters->RWDiffuseIndirect = GraphBuilder.CreateUAV(FRDGTextureUAVDesc(DiffuseIndirect));
        PassParameters->RWRoughSpecularIndirect = GraphBuilder.CreateUAV(FRDGTextureUAVDesc(RoughSpecularIndirect));
        PassParameters->GatherParameters = GatherParameters;
        PassParameters->ScreenProbeParameters = ScreenProbeParameters;
        PassParameters->View = View.ViewUniformBuffer;
        PassParameters->SceneTexturesStruct = SceneTextures.UniformBuffer;
        PassParameters->FullResolutionJitterWidth = GLumenScreenProbeFullResolutionJitterWidth;
        extern float GLumenReflectionMaxRoughnessToTrace;
        extern float GLumenReflectionRoughnessFadeLength;
        PassParameters->MaxRoughnessToTrace = GLumenReflectionMaxRoughnessToTrace;
        PassParameters->RoughnessFadeLength = GLumenReflectionRoughnessFadeLength;
        PassParameters->ScreenSpaceBentNormalParameters = ScreenSpaceBentNormalParameters;

        FScreenProbeIndirectCS::FPermutationDomain PermutationVector;
        PermutationVector.Set< FScreenProbeIndirectCS::FDiffuseIntegralMethod >(LumenScreenProbeGather::GetDiffuseIntegralMethod());
        auto ComputeShader = View.ShaderMap->GetShader<FScreenProbeIndirectCS>(PermutationVector);

        // Compute indirect lighting from the screen-space probes.
        FComputeShaderUtils::AddPass(
            GraphBuilder,
            RDG_EVENT_NAME("ComputeIndirect %ux%u", View.ViewRect.Width(), View.ViewRect.Height()),
            ComputeShader,
            PassParameters,
            FComputeShaderUtils::GetGroupCount(View.ViewRect.Size(), FScreenProbeIndirectCS::GetGroupSize()));
    }

    FSSDSignalTextures DenoiserOutputs;
    DenoiserOutputs.Textures[0] = DiffuseIndirect;
    DenoiserOutputs.Textures[1] = RoughSpecularIndirect;
    bLumenUseDenoiserComposite = false;

    // Temporal filtering of the screen-space probes.
    if (GLumenScreenProbeTemporalFilter)
    {
        if (GLumenScreenProbeUseHistoryNeighborhoodClamp)
        {
            FRDGTextureRef CompressedDepthTexture;
            FRDGTextureRef CompressedShadingModelTexture;
            {
                FRDGTextureDesc Desc = FRDGTextureDesc::Create2D(
                    SceneTextures.Depth.Resolve->Desc.Extent,
                    PF_R16F,
                    FClearValueBinding::None,                    
                    /* InTargetableFlags = */ TexCreate_ShaderResource | TexCreate_UAV);

                CompressedDepthTexture = GraphBuilder.CreateTexture(Desc, TEXT("Lumen.ScreenProbeGather.CompressedDepth"));

                Desc.Format = PF_R8_UINT;
                CompressedShadingModelTexture = GraphBuilder.CreateTexture(Desc, TEXT("Lumen.ScreenProbeGather.CompressedShadingModelID"));
            }

            {
                FGenerateCompressedGBuffer::FParameters* PassParameters = GraphBuilder.AllocParameters<FGenerateCompressedGBuffer::FParameters>();
                PassParameters->RWCompressedDepthBufferOutput = GraphBuilder.CreateUAV(FRDGTextureUAVDesc(CompressedDepthTexture));
                PassParameters->RWCompressedShadingModelOutput = GraphBuilder.CreateUAV(FRDGTextureUAVDesc(CompressedShadingModelTexture));
                PassParameters->View = View.ViewUniformBuffer;
                PassParameters->SceneTextures = SceneTextureParameters;

                auto ComputeShader = View.ShaderMap->GetShader<FGenerateCompressedGBuffer>(0);

                FComputeShaderUtils::AddPass(
                    GraphBuilder,
                    RDG_EVENT_NAME("GenerateCompressedGBuffer"),
                    ComputeShader,
                    PassParameters,
                    FComputeShaderUtils::GetGroupCount(View.ViewRect.Size(), FGenerateCompressedGBuffer::GetGroupSize()));
            }

            FSSDSignalTextures ScreenSpaceDenoiserInputs;
            ScreenSpaceDenoiserInputs.Textures[0] = DiffuseIndirect;
            ScreenSpaceDenoiserInputs.Textures[1] = RoughSpecularIndirect;

            DenoiserOutputs = IScreenSpaceDenoiser::DenoiseIndirectProbeHierarchy(
                GraphBuilder,
                View, 
                PreviousViewInfos,
                SceneTextureParameters,
                ScreenSpaceDenoiserInputs,
                CompressedDepthTexture,
                CompressedShadingModelTexture);

            bLumenUseDenoiserComposite = true;
        }
        else
        {
            UpdateHistoryScreenProbeGather(
                GraphBuilder,
                View,
                SceneTextures,
                DiffuseIndirect,
                RoughSpecularIndirect);

            DenoiserOutputs.Textures[0] = DiffuseIndirect;
            DenoiserOutputs.Textures[1] = RoughSpecularIndirect;
        }
    }

    return DenoiserOutputs;
}
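
The do/while loop at the top of the listing above halves the placement factor before every dispatch, so each adaptive placement pass works on a grid twice as fine as the previous one. The following minimal C++ sketch reproduces only that loop arithmetic. The helper name `AdaptivePlacementFactors` and the example values 16 and 4 are assumptions for illustration, not engine defaults confirmed by the source.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper (not engine code): lists the PlacementDownsampleFactor
// that each adaptive placement dispatch would receive.
std::vector<uint32_t> AdaptivePlacementFactors(
    uint32_t ScreenProbeDownsampleFactor,
    uint32_t AdaptiveProbeMinDownsampleFactor,
    uint32_t MaxNumAdaptiveProbes)
{
    std::vector<uint32_t> Factors;

    // Same guard as the listing: adaptive placement runs only when adaptive probes
    // are allowed and the minimum factor is finer than the uniform factor.
    if (MaxNumAdaptiveProbes > 0 && AdaptiveProbeMinDownsampleFactor < ScreenProbeDownsampleFactor)
    {
        uint32_t Factor = ScreenProbeDownsampleFactor;
        do
        {
            Factor /= 2;               // halve before each pass
            Factors.push_back(Factor); // one compute dispatch per factor
        }
        while (Factor > AdaptiveProbeMinDownsampleFactor);
    }

    return Factors;
}
```

With an assumed uniform factor of 16 and a minimum of 4, the sketch yields the factors 8 and 4, that is, two adaptive passes. When the guard fails the list is empty, which corresponds to the else branch that only clears the adaptive probe buffers.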

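Combining the source code with RenderDoc frame-capture data shows that the screen-space probe gather stage is extremely complex. The main steps of the regular flow are: uniform and adaptive probe placement, computing the BRDF, rendering the radiance cache, computing the lighting PDF, generating the sample rays, tracing the screen-space probes, compacting the trace results, tracing the voxels, compositing the trace results, filtering the gathered radiance, computing the bent normal, computing the indirect lighting, and updating the history data.

Because these steps cover so much ground, only some of the important ones are analyzed below, guided by the frame capture.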

  • RadianceCache

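The radiance cache (RadianceCache) is itself a complex sequence of passes: clearing, marking, updating, and allocating the probes, setting up the dispatch arguments, tracing the probes, and filtering the probe radiance: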

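The most important pass of the RadianceCache is tracing from the probes (TraceFromProbes). Its inputs include the global distance field, the VoxelLighting textures, and others.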

The output is a 4096x4096 radiance probe atlas plus depth:

Probe atlas output by TraceFromProbes (zoomed-in detail).
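
The atlas is a grid of square probe tiles, and TraceFromProbesCS locates a probe's tile from its linear index via the `ProbeAtlasBaseCoord` expression. The sketch below mirrors that expression in plain C++. `FUint2` is a hypothetical stand-in for HLSL's `uint2`, and the 32-texel probe resolution used in the note is an assumed value (the source states only the 4096x4096 atlas size).

```cpp
#include <cstdint>

// Hypothetical stand-in for HLSL's uint2.
struct FUint2
{
    uint32_t X;
    uint32_t Y;
};

// Mirrors: RadianceProbeResolution * uint2(ProbeIndex % ResX, ProbeIndex / ResX)
FUint2 ProbeAtlasBaseCoord(
    uint32_t ProbeIndex,
    uint32_t RadianceProbeResolution,
    uint32_t ProbeAtlasResolutionInProbesX)
{
    FUint2 Coord;
    Coord.X = RadianceProbeResolution * (ProbeIndex % ProbeAtlasResolutionInProbesX); // column of the tile
    Coord.Y = RadianceProbeResolution * (ProbeIndex / ProbeAtlasResolutionInProbesX); // row of the tile
    return Coord;
}
```

Under the assumed 32-texel resolution, a 4096-wide atlas holds 128 probes per row, so probe 130 lands in column 2 of row 1, at texel (64, 32).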

The compute shader used by TraceFromProbes is as follows:

// Engine\Shaders\Private\Lumen\LumenRadianceCache.usf

groupshared float3 SharedTraceRadiance[THREADGROUP_SIZE][THREADGROUP_SIZE];
groupshared float SharedTraceHitDistance[THREADGROUP_SIZE][THREADGROUP_SIZE];

[numthreads(THREADGROUP_SIZE, THREADGROUP_SIZE, 1)]
void TraceFromProbesCS(
    uint3 GroupId : SV_GroupID,
    uint2 GroupThreadId : SV_GroupThreadID)
{
    uint TraceTileIndex = GroupId.y * TRACE_TILE_GROUP_STRIDE + GroupId.x;

    if (TraceTileIndex < ProbeTraceTileAllocator[0])
    {
        uint2 TraceTileCoord;
        uint TraceTileLevel;
        uint ProbeTraceIndex;
        // Unpack the trace tile info.
        UnpackTraceTileInfo(ProbeTraceTileData[TraceTileIndex], TraceTileCoord, TraceTileLevel, ProbeTraceIndex);

        uint TraceResolution = (RadianceProbeResolution / 2) << TraceTileLevel;
        // Probe texel coordinate.
        uint2 ProbeTexelCoord = TraceTileCoord * THREADGROUP_SIZE + GroupThreadId.xy;


        float3 ProbeWorldCenter;
        uint ClipmapIndex;
        uint ProbeIndex;
        // Fetch the probe's trace data.
        GetProbeTraceData(ProbeTraceIndex, ProbeWorldCenter, ClipmapIndex, ProbeIndex);

        if (all(ProbeTexelCoord < TraceResolution))
        {
            float2 ProbeTexelCenter = float2(0.5, 0.5);
            float2 ProbeUV = (ProbeTexelCoord + ProbeTexelCenter) / float(TraceResolution);
            float3 WorldConeDirection = OctahedralMapToDirection(ProbeUV);

            float FinalMinTraceDistance = max(MinTraceDistance, GetRadianceProbeTMin(ClipmapIndex));
            float FinalMaxTraceDistance = MaxTraceDistance;
            float EffectiveStepFactor = StepFactor;

            // Distribute the sphere's solid angle evenly over all cones instead of following the octahedral distortion.
            float ConeHalfAngle = acosFast(1.0f - 1.0f / (float)(TraceResolution * TraceResolution));

            // Set up the cone trace input.
            FConeTraceInput TraceInput;
            TraceInput.Setup(
                ProbeWorldCenter, WorldConeDirection,
                ConeHalfAngle, MinSampleRadius,
                FinalMinTraceDistance, FinalMaxTraceDistance,
                EffectiveStepFactor);
            TraceInput.VoxelStepFactor = VoxelStepFactor;

            bool bContinueCardTracing = false;

            TraceInput.VoxelTraceStartDistance = CalculateVoxelTraceStartDistance(FinalMinTraceDistance, FinalMaxTraceDistance, MaxMeshSDFTraceDistance, bContinueCardTracing);

            // Cone-trace for this probe texel.
            FConeTraceResult TraceResult = TraceForProbeTexel(TraceInput);

            // Store the traced lighting.
            SharedTraceRadiance[GroupThreadId.y][GroupThreadId.x] = TraceResult.Lighting;

            // Store the traced hit distance.
            #if RADIANCE_CACHE_STORE_DEPTHS
                SharedTraceHitDistance[GroupThreadId.y][GroupThreadId.x] = TraceResult.OpaqueHitDistance;
            #endif
        }

        GroupMemoryBarrierWithGroupSync();

        uint2 ProbeAtlasBaseCoord = RadianceProbeResolution * uint2(ProbeIndex % ProbeAtlasResolutionInProbes.x, ProbeIndex / ProbeAtlasResolutionInProbes.x);

        // Write the lighting and the hit distance to the probe atlases.
        if (TraceResolution < RadianceProbeResolution)
        {
            uint UpsampleFactor = RadianceProbeResolution / TraceResolution;
            ProbeAtlasBaseCoord += (THREADGROUP_SIZE * TraceTileCoord + GroupThreadId.xy) * UpsampleFactor;

            float3 Lighting = SharedTraceRadiance[GroupThreadId.y][GroupThreadId.x];

            for (uint Y = 0; Y < UpsampleFactor; Y++)
            {
                for (uint X = 0; X < UpsampleFactor; X++)
                {
                    RWRadianceProbeAtlasTexture[ProbeAtlasBaseCoord + uint2(X, Y)] = Lighting;
                }
            }

            #if RADIANCE_CACHE_STORE_DEPTHS
                float HitDistance = min(SharedTraceHitDistance[GroupThreadId.y][GroupThreadId.x], MaxHalfFloat);

                for (uint Y = 0; Y < UpsampleFactor; Y++)
                {
                    for (uint X = 0; X < UpsampleFactor; X++)
                    {
                        RWDepthProbeAtlasTexture[ProbeAtlasBaseCoord + uint2(X, Y)] = HitDistance;
                    }
                }
            #endif
        }
        else
        {
            uint DownsampleFactor = TraceResolution / RadianceProbeResolution;
            uint WriteTileSize = THREADGROUP_SIZE / DownsampleFactor;

            if (all(GroupThreadId.xy < WriteTileSize))
            {
                float3 Lighting = 0;

                for (uint Y = 0; Y < DownsampleFactor; Y++)
                {
                    for (uint X = 0; X < DownsampleFactor; X++)
                    {
                        Lighting += SharedTraceRadiance[GroupThreadId.y * DownsampleFactor + Y][GroupThreadId.x * DownsampleFactor + X];
                    }
                }

                ProbeAtlasBaseCoord += WriteTileSize * TraceTileCoord + GroupThreadId.xy;
                RWRadianceProbeAtlasTexture[ProbeAtlasBaseCoord] = Lighting / (float)(DownsampleFactor * DownsampleFactor);

                #if RADIANCE_CACHE_STORE_DEPTHS
                    float HitDistance = MaxHalfFloat;

                    for (uint Y = 0; Y < DownsampleFactor; Y++)
                    {
                        for (uint X = 0; X < DownsampleFactor; X++)
                        {
                            HitDistance = min(HitDistance, SharedTraceHitDistance[GroupThreadId.y * DownsampleFactor + Y][GroupThreadId.x * DownsampleFactor + X]);
                        }
                    }

                    RWDepthProbeAtlasTexture[ProbeAtlasBaseCoord] = HitDistance;
                #endif
            }
        }
    }
}
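
Two pieces of arithmetic in the shader above are easy to check in isolation: the per-tile trace resolution and the cone half angle derived from it. The sketch below is a plain C++ transcription under stated assumptions: `std::acos` stands in for the shader's `acosFast`, the function names are hypothetical, and the probe resolution of 32 used in the note is an assumed value.

```cpp
#include <cmath>
#include <cstdint>

// TraceResolution = (RadianceProbeResolution / 2) << TraceTileLevel, as in the shader.
uint32_t TraceResolutionForLevel(uint32_t RadianceProbeResolution, uint32_t TraceTileLevel)
{
    return (RadianceProbeResolution / 2) << TraceTileLevel;
}

// Same formula as the shader: acos(1 - 1 / N^2), where N is the trace resolution.
float ConeHalfAngle(uint32_t TraceResolution)
{
    const float NumCones = static_cast<float>(TraceResolution) * static_cast<float>(TraceResolution);
    return std::acos(1.0f - 1.0f / NumCones);
}

// Solid angle of a cone with the given half angle: 2 * pi * (1 - cos(theta)).
float ConeSolidAngle(float HalfAngle)
{
    const float Pi = 3.14159265358979f;
    return 2.0f * Pi * (1.0f - std::cos(HalfAngle));
}
```

For an assumed probe resolution of 32, levels 0, 1, and 2 trace at 16, 32, and 64 texels per side, which matches the two branches of the shader: a trace resolution below the probe resolution is replicated into the atlas (upsampled), while an equal or higher one is box-averaged in shared memory (downsampled). Plugging the half angle into the cone solid angle formula gives 2π/N² steradians per cone, so every cone of a probe has the same footprint regardless of where its texel sits in the octahedral map.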

Next, step into TraceForProbeTexel to analyze the call stack that traces a probe texel:

FConeTraceResult TraceForProbeTexel(FConeTraceInput TraceInput)
{
    // Initialize the trace result.
    FConeTraceResult TraceResult;
    TraceResult = (FConeTraceResult)0;
    TraceResult.Lighting = 0.0;
    TraceResult.Transparency = 1.0;
    TraceResult.OpaqueHitDistance = TraceInput.MaxTraceDistance;

    // Cone-trace the Lumen scene voxels (analyzed below).
    ConeTraceLumenSceneVoxels(TraceInput, TraceResult);

    // Trace the distant scene.
#if TRACE_DISTANT_SCENE
    if (TraceResult.Transparency > .01f)
    {
        FConeTraceResult DistantTraceResult;
        // Cone-trace the Lumen distant scene (analyzed below).
        ConeTraceLumenDistantScene(TraceInput, DistantTraceResult);
        TraceResult.Lighting += DistantTraceResult.Lighting * TraceResult.Transparency;
        TraceResult.Transparency *= DistantTraceResult.Transparency;
    }
#endif

    // Sky light.
#if ENABLE_DYNAMIC_SKY_LIGHT
    if (ReflectionStruct.SkyLightParameters.y > 0)
    {
        float SkyAverageBrightness = 1.0f;
        float Roughness = TanConeAngleToRoughness(tan(TraceInput.ConeAngle));

        TraceResult.Lighting = TraceResult.Lighting + GetSkyLightReflection(TraceInput.ConeDirection, Roughness, SkyAverageBrightness) * TraceResult.Transparency;
    }
#endif

    return TraceResult;
}

// Cone-trace the Lumen scene voxels.
void ConeTraceLumenSceneVoxels(
    FConeTraceInput TraceInput,
    inout FConeTraceResult OutResult)
{
#if SCENE_TRACE_VOXELS
    if (TraceInput.VoxelTraceStartDistance < TraceInput.MaxTraceDistance)
    {
        FConeTraceInput VoxelTraceInput = TraceInput;
        VoxelTraceInput.MinTraceDistance = TraceInput.VoxelTraceStartDistance;
        FConeTraceResult VoxelTraceResult;
        // Cone-trace the voxels (already analyzed earlier).
        ConeTraceVoxels(VoxelTraceInput, VoxelTraceResult);

        // Composite using the accumulated transparency.
        #if !VISIBILITY_ONLY_TRACE
            OutResult.Lighting += VoxelTraceResult.Lighting * OutResult.Transparency;
        #endif
        OutResult.Transparency *= VoxelTraceResult.Transparency;
        OutResult.NumSteps += VoxelTraceResult.NumSteps;
        OutResult.OpaqueHitDistance = min(OutResult.OpaqueHitDistance, VoxelTraceResult.OpaqueHitDistance);
    }
#endif
}

// Cone-trace the Lumen distant scene.
void ConeTraceLumenDistantScene(
    FConeTraceInput TraceInput,
    inout FConeTraceResult OutResult)
{
    float3 debug = 0;
    TraceInput.MaxTraceDistance = LumenCardScene.DistantSceneMaxTraceDistance;
    TraceInput.bBlackOutSteepIntersections = true;

    FCardTraceBlendState CardTraceBlendState;
    CardTraceBlendState.Initialize(TraceInput.MaxTraceDistance);

    if (LumenCardScene.NumDistantCards > 0)
    {
        // Derive the minimum trace distance from the last voxel clipmap.
        if (NumClipmapLevels > 0)
        {
            float3 VoxelLightingCenter = ClipmapWorldCenter[NumClipmapLevels - 1].xyz;
            float3 VoxelLightingExtent = ClipmapWorldSamplingExtent[NumClipmapLevels - 1].xyz;

            float3 RayEnd = TraceInput.ConeOrigin + TraceInput.ConeDirection * TraceInput.MaxTraceDistance;
            float2 IntersectionTimes = LineBoxIntersect(TraceInput.ConeOrigin, RayEnd, VoxelLightingCenter - VoxelLightingExtent, VoxelLightingCenter + VoxelLightingExtent);

            // If we are starting inside the voxel clipmaps, move the start of the trace past the voxel clipmaps
            if (IntersectionTimes.x < IntersectionTimes.y && IntersectionTimes.x < .001f)
            {
                TraceInput.MinTraceDistance = IntersectionTimes.y * TraceInput.MaxTraceDistance;
            }
        }

        float TraceEndDistance = TraceInput.MinTraceDistance;

        {
            uint ListIndex = 0;
            uint CardIndex = LumenCardScene.DistantCardIndices[ListIndex];

            // Cone-trace a single Lumen card (analyzed below).
            ConeTraceSingleLumenCard(
                TraceInput,
                CardIndex,
                debug,
                TraceEndDistance,
                CardTraceBlendState);
        }
    }

    OutResult = (FConeTraceResult)0;

    // Store the results.
    #if !VISIBILITY_ONLY_TRACE
        OutResult.Lighting = CardTraceBlendState.GetFinalLighting();
    #endif
    OutResult.Transparency = CardTraceBlendState.GetTransparency();
    OutResult.NumSteps = CardTraceBlendState.NumSteps;
    OutResult.NumOverlaps = CardTraceBlendState.NumOverlaps;
    OutResult.OpaqueHitDistance = CardTraceBlendState.OpaqueHitDistance;
    OutResult.Debug = debug;
}

// Cone-trace a single Lumen card.
void ConeTraceSingleLumenCard(
    FConeTraceInput TraceInput,
    uint CardIndex,
    inout float3 Debug,
    inout float OutTraceEndDistance,
    inout FCardTraceBlendState CardTraceBlendState)
{
    // Fetch the card data.
    FLumenCardData LumenCardData = GetLumenCardData(CardIndex);

    // Transform the cone into the card's local space.
    float3 LocalConeOrigin = mul(TraceInput.ConeOrigin - LumenCardData.Origin, LumenCardData.WorldToLocalRotation);
    float3 LocalConeDirection = mul(TraceInput.ConeDirection, LumenCardData.WorldToLocalRotation);
    float3 LocalTraceEnd = LocalConeOrigin + LocalConeDirection * TraceInput.MaxTraceDistance;

    // Intersection range.
    float2 IntersectionRange = LineBoxIntersect(LocalConeOrigin, LocalTraceEnd, -LumenCardData.LocalExtent, LumenCardData.LocalExtent);
    IntersectionRange.x = max(IntersectionRange.x, TraceInput.MinTraceDistance / TraceInput.MaxTraceDistance);
    OutTraceEndDistance = IntersectionRange.y * TraceInput.MaxTraceDistance;

    if (IntersectionRange.y > IntersectionRange.x
        && LumenCardData.bVisible)
    {
        {
            // Blend state for this card trace.
            FCardTraceBlendState ConeStepBlendState;
            ConeStepBlendState.Initialize(TraceInput.MaxTraceDistance);

            float StepTime = IntersectionRange.x * TraceInput.MaxTraceDistance;
            float3 SamplePosition = LocalConeOrigin + StepTime * LocalConeDirection;
            float TraceEndDistance = IntersectionRange.y * TraceInput.MaxTraceDistance;

            float IntersectionLength = (IntersectionRange.y - IntersectionRange.x) * TraceInput.MaxTraceDistance;
            float MinStepSize = IntersectionLength / (float)LumenCardScene.MaxConeSteps;

            float PreviousStepTime = StepTime;
            float3 PreviousSamplePosition = SamplePosition;
            // Magic value to prevent linear intersection approximation on first step
            float PreviousHeightfieldZ = -2;

            bool bClampedToEnd = false;
            bool bFoundSurface = false;
            bool bRayAboveSurface = false;
            float IntersectionStepTime = 0;
            float2 IntersectionSamplePositionXY = SamplePosition.xy;
            float IntersectionSlope = 0;

            uint NumStepsPerLoop = 4; // Take 4 samples per loop iteration.
            for (uint StepIndex = 0; StepIndex < LumenCardScene.MaxConeSteps && StepTime < TraceEndDistance; StepIndex += NumStepsPerLoop)
            {
                float SampleRadius = max(TraceInput.ConeStartRadius + TraceInput.TanConeAngle * StepTime, TraceInput.MinSampleRadius);
                float StepSize = max(SampleRadius * TraceInput.StepFactor, MinStepSize);
                float TraceClampDistance = TraceEndDistance - StepSize * .0001f;

                float DepthMip;
                float2 DepthValidRegionScale;
                CalculateMip(SampleRadius, LumenCardData, LumenCardData.LocalExtent, LumenCardData.MaxMip, DepthMip, DepthValidRegionScale);

                // 4 sample positions.
                float3 SamplePosition1 = LocalConeOrigin + min(StepTime + 0 * StepSize, TraceClampDistance) * LocalConeDirection;
                float3 SamplePosition2 = LocalConeOrigin + min(StepTime + 1 * StepSize, TraceClampDistance) * LocalConeDirection;
                float3 SamplePosition3 = LocalConeOrigin + min(StepTime + 2 * StepSize, TraceClampDistance) * LocalConeDirection;
                float3 SamplePosition4 = LocalConeOrigin + min(StepTime + 3 * StepSize, TraceClampDistance) * LocalConeDirection;

                // 4 depth atlas UVs.
                float2 DepthAtlasUV1 = CalculateAtlasUV(SamplePosition1.xy, DepthValidRegionScale, LumenCardData);
                float2 DepthAtlasUV2 = CalculateAtlasUV(SamplePosition2.xy, DepthValidRegionScale, LumenCardData);
                float2 DepthAtlasUV3 = CalculateAtlasUV(SamplePosition3.xy, DepthValidRegionScale, LumenCardData);
                float2 DepthAtlasUV4 = CalculateAtlasUV(SamplePosition4.xy, DepthValidRegionScale, LumenCardData);

                // 4 depth values.
                float Depth1 = Texture2DSampleLevel(DepthAtlas, TRACING_ATLAS_SAMPLER, DepthAtlasUV1, DepthMip).x;
                float Depth2 = Texture2DSampleLevel(DepthAtlas, TRACING_ATLAS_SAMPLER, DepthAtlasUV2, DepthMip).x;
                float Depth3 = Texture2DSampleLevel(DepthAtlas, TRACING_ATLAS_SAMPLER, DepthAtlasUV3, DepthMip).x;
                float Depth4 = Texture2DSampleLevel(DepthAtlas, TRACING_ATLAS_SAMPLER, DepthAtlasUV4, DepthMip).x;

                // 4 heightfield Z values.
                float HeightfieldZ1 = LumenCardData.LocalExtent.z - Depth1 * 2 * LumenCardData.LocalExtent.z;
                float HeightfieldZ2 = LumenCardData.LocalExtent.z - Depth2 * 2 * LumenCardData.LocalExtent.z;
                float HeightfieldZ3 = LumenCardData.LocalExtent.z - Depth3 * 2 * LumenCardData.LocalExtent.z;
                float HeightfieldZ4 = LumenCardData.LocalExtent.z - Depth4 * 2 * LumenCardData.LocalExtent.z;

                ConeStepBlendState.RegisterStep(NumStepsPerLoop);

                // Whether each sample lies below the heightfield.
                bool4 HeightfieldHit = bool4(
                    SamplePosition1.z < HeightfieldZ1,
                    SamplePosition2.z < HeightfieldZ2,
                    SamplePosition3.z < HeightfieldZ3,
                    SamplePosition4.z < HeightfieldZ4);

                bool bRayBelowHeightfield = any(HeightfieldHit);
                bool bRayWasAboveSurface = bRayAboveSurface;

                if (!bRayBelowHeightfield)
                {
                    bRayAboveSurface = true;
                }

                // A trace that starts below the heightfield must first rise above it before it can register a hit.
                if (bRayBelowHeightfield && bRayWasAboveSurface)
                {
                    float HeightfieldZ;
                    if (HeightfieldHit.x)
                    {
                        SamplePosition = SamplePosition1;
                        HeightfieldZ = HeightfieldZ1;
                        StepTime = StepTime + 0 * StepSize;
                    }
                    else if (HeightfieldHit.y)
                    {
                        PreviousSamplePosition = SamplePosition1;
                        PreviousHeightfieldZ = HeightfieldZ1;
                        PreviousStepTime = StepTime + 0 * StepSize;

                        SamplePosition = SamplePosition2;
                        HeightfieldZ = HeightfieldZ2;
                        StepTime = StepTime + 1 * StepSize;
                    }
                    else if (HeightfieldHit.z)
                    {
                        PreviousSamplePosition = SamplePosition2;
                        PreviousHeightfieldZ = HeightfieldZ2;
                        PreviousStepTime = StepTime + 1 * StepSize;

                        SamplePosition = SamplePosition3;
                        HeightfieldZ = HeightfieldZ3;
                        StepTime = StepTime + 2 * StepSize;
                    }
                    else
                    {
                        PreviousSamplePosition = SamplePosition3;
                        PreviousHeightfieldZ = HeightfieldZ3;
                        PreviousStepTime = StepTime + 2 * StepSize;

                        SamplePosition = SamplePosition4;
                        HeightfieldZ = HeightfieldZ4;
                        StepTime = StepTime + 3 * StepSize;
                    }

                    StepTime = min(StepTime, TraceClampDistance);

                    if (PreviousHeightfieldZ != -2)
                    {
                        // Solve for the step time at which the ray crosses the linearly interpolated heightfield.
                        IntersectionStepTime = PreviousStepTime + ((PreviousSamplePosition.z - PreviousHeightfieldZ) * (StepTime - PreviousStepTime)) / (HeightfieldZ - PreviousHeightfieldZ + PreviousSamplePosition.z - SamplePosition.z);

                        float2 LocalPositionSlopeXY = (SamplePosition.xy - PreviousSamplePosition.xy) / (StepTime - PreviousStepTime);
                        IntersectionSamplePositionXY = LocalPositionSlopeXY * (IntersectionStepTime - PreviousStepTime) + PreviousSamplePosition.xy;

                        IntersectionSlope = abs(PreviousHeightfieldZ - HeightfieldZ) / max(length(PreviousSamplePosition.xy - SamplePosition.xy), .0001f);

                        PreviousHeightfieldZ = -2;
                        // Surface found.
                        bFoundSurface = true;
                    }
                    break;
                }

                PreviousStepTime = StepTime + 3 * StepSize;
                PreviousSamplePosition = SamplePosition4;
                PreviousHeightfieldZ = HeightfieldZ4;
                StepTime += 4 * StepSize;

                if (StepTime >= TraceEndDistance && !bClampedToEnd)
                {
                    bClampedToEnd = true;
                    // Stop the last step just before the intersection end, since the linear approximation needs to step past the surface to detect a hit, without terminating the loop
                    StepTime = TraceClampDistance;
                }
            }

            // If a surface point was found.
            if (bFoundSurface)
            {
                float IntersectionSampleRadius = TraceInput.ConeStartRadius + TraceInput.TanConeAngle * IntersectionStepTime;

                float MaxMip;
                float2 ValidRegionScale;
                CalculateMip(IntersectionSampleRadius, LumenCardData, LumenCardData.LocalExtent, LumenCardData.MaxMip, MaxMip, ValidRegionScale);

                float2 IntersectionAtlasUV = CalculateAtlasUV(IntersectionSamplePositionXY, ValidRegionScale, LumenCardData);

                float DistanceToSurface = 0;
                float ConeIntersectSurface = saturate(DistanceToSurface / IntersectionSampleRadius);
                float ConeVisibility = ConeIntersectSurface;

                float MaxDistanceFade = 1;

                ConeStepBlendState.RegisterOpaqueHit(IntersectionStepTime);
                OutTraceEndDistance = IntersectionStepTime;

                float Opacity = Texture2DSampleLevel(OpacityAtlas, TRACING_ATLAS_SAMPLER, IntersectionAtlasUV, MaxMip).x;
                float ConeOcclusion = (1.0f - ConeVisibility) * Opacity * MaxDistanceFade;

                #if VISIBILITY_ONLY_TRACE
                    float3 StepLighting = 0;
                #else
                    float3 StepLighting = Texture2DSampleLevel(FinalLightingAtlas, TRACING_ATLAS_SAMPLER, IntersectionAtlasUV, MaxMip).rgb;
                #endif
            
                if (TraceInput.bBlackOutSteepIntersections)
                {
                    // Assume steep parts of the heightfield are covered by other faces, and fade them out.
                    float SlopeFade = 1 - saturate((IntersectionSlope - 5) / 1.0f);
                    StepLighting = lerp(0, StepLighting, SlopeFade);
                    ConeOcclusion = lerp(0, ConeOcclusion, SlopeFade);
                }

                ConeStepBlendState.AddLighting(StepLighting, ConeOcclusion, IntersectionStepTime);
            }

            CardTraceBlendState.AddCardTrace(ConeStepBlendState);
        }
    }
}
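
The two scalar formulas at the end of the listing above (the cone footprint radius at the intersection and the fade applied to steep intersections) can be checked in isolation. The following is a minimal C++ sketch, not engine code: `Saturate`, `ConeSampleRadius` and `SlopeFade` are hypothetical helper names, and only the arithmetic is taken from the shader.

```cpp
#include <algorithm>
#include <cassert>

// Clamp to [0, 1], mirroring HLSL saturate().
inline float Saturate(float X)
{
    return std::min(std::max(X, 0.0f), 1.0f);
}

// Cone footprint radius at distance StepTime along the ray:
// ConeStartRadius + TanConeAngle * StepTime, as in the shader.
inline float ConeSampleRadius(float ConeStartRadius, float TanConeAngle, float StepTime)
{
    assert(StepTime >= 0.0f);
    return ConeStartRadius + TanConeAngle * StepTime;
}

// Fade used when bBlackOutSteepIntersections is set:
// 1 for slopes up to 5, falling linearly to 0 at a slope of 6.
inline float SlopeFade(float IntersectionSlope)
{
    return 1.0f - Saturate((IntersectionSlope - 5.0f) / 1.0f);
}
```

Both the lighting and the cone occlusion are multiplied by this fade, so an intersection with a slope of 6 or more contributes nothing to the card trace.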

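As shown above, the RadianceCache stage goes through an intricate rendering process. TraceFromProbes alone first cone-traces the voxel lighting, then the distant-scene cards, and finally accounts for the sky light.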

  • TraceScreenProbes

TraceScreenProbes covers screen-space tracing of the probes, mesh distance field tracing, voxel lighting tracing, and so on. The code is as follows:

// Engine\Source\Runtime\Renderer\Private\Lumen\LumenScreenProbeTracing.cpp

void TraceScreenProbes(
    FRDGBuilder& GraphBuilder, 
    const FScene* Scene,
    const FViewInfo& View, 
    bool bTraceMeshSDFs,
    TRDGUniformBufferRef<FSceneTextureUniformParameters> SceneTexturesUniformBuffer,
    const ScreenSpaceRayTracing::FPrevSceneColorMip& PrevSceneColor,
    FRDGTextureRef LightingChannelsTexture,
    const FLumenCardTracingInputs& TracingInputs,
    const LumenRadianceCache::FRadianceCacheInterpolationParameters& RadianceCacheParameters,
    FScreenProbeParameters& ScreenProbeParameters,
    FLumenMeshSDFGridParameters& MeshSDFGridParameters)
{
    const FSceneTextureParameters SceneTextures = GetSceneTextureParameters(GraphBuilder, SceneTexturesUniformBuffer);

    // Clear the probe traces.
    {
        FClearTracesCS::FParameters* PassParameters = GraphBuilder.AllocParameters<FClearTracesCS::FParameters>();
        PassParameters->ScreenProbeParameters = ScreenProbeParameters;

        auto ComputeShader = View.ShaderMap->GetShader<FClearTracesCS>(0);

        FComputeShaderUtils::AddPass(
            GraphBuilder,
            RDG_EVENT_NAME("ClearTraces %ux%u", ScreenProbeParameters.ScreenProbeTracingOctahedronResolution, ScreenProbeParameters.ScreenProbeTracingOctahedronResolution),
            ComputeShader,
            PassParameters,
            ScreenProbeParameters.ProbeIndirectArgs,
            (uint32)EScreenProbeIndirectArgs::ThreadPerTrace * sizeof(FRHIDispatchIndirectParameters));
    }

    FLumenIndirectTracingParameters IndirectTracingParameters;
    SetupLumenDiffuseTracingParameters(IndirectTracingParameters);

    const bool bTraceScreen = View.PrevViewInfo.ScreenSpaceRayTracingInput.IsValid() 
        && GLumenScreenProbeGatherScreenTraces != 0
        && !View.Family->EngineShowFlags.VisualizeLumenIndirectDiffuse;

    // Trace the probes in screen space.
    if (bTraceScreen)
    {
        FScreenProbeTraceScreenTexturesCS::FParameters* PassParameters = GraphBuilder.AllocParameters<FScreenProbeTraceScreenTexturesCS::FParameters>();

        ScreenSpaceRayTracing::SetupCommonScreenSpaceRayParameters(GraphBuilder, SceneTextures, PrevSceneColor, View, /* out */ &PassParameters->ScreenSpaceRayParameters);

        PassParameters->ScreenSpaceRayParameters.CommonDiffuseParameters.SceneTextures = SceneTextures;

        {
            const FVector2D HZBUvFactor(
                float(View.ViewRect.Width()) / float(2 * View.HZBMipmap0Size.X),
                float(View.ViewRect.Height()) / float(2 * View.HZBMipmap0Size.Y));

            const FVector4 ScreenPositionScaleBias = View.GetScreenPositionScaleBias(SceneTextures.SceneDepthTexture->Desc.Extent, View.ViewRect);
            const FVector2D HZBUVToScreenUVScale = FVector2D(1.0f / HZBUvFactor.X, 1.0f / HZBUvFactor.Y) * FVector2D(2.0f, -2.0f) * FVector2D(ScreenPositionScaleBias.X, ScreenPositionScaleBias.Y);
            const FVector2D HZBUVToScreenUVBias = FVector2D(-1.0f, 1.0f) * FVector2D(ScreenPositionScaleBias.X, ScreenPositionScaleBias.Y) + FVector2D(ScreenPositionScaleBias.W, ScreenPositionScaleBias.Z);
            PassParameters->HZBUVToScreenUVScaleBias = FVector4(HZBUVToScreenUVScale, HZBUVToScreenUVBias);
        }

        checkf(View.ClosestHZB, TEXT("Lumen screen tracing: ClosestHZB was not setup, should have been setup by FDeferredShadingSceneRenderer::RenderHzb"));
        PassParameters->ClosestHZBTexture = View.ClosestHZB;
        PassParameters->SceneDepthTexture = SceneTextures.SceneDepthTexture;
        PassParameters->LightingChannelsTexture = LightingChannelsTexture;
        PassParameters->HZBBaseTexelSize = FVector2D(1.0f / View.ClosestHZB->Desc.Extent.X, 1.0f / View.ClosestHZB->Desc.Extent.Y);
        PassParameters->MaxHierarchicalScreenTraceIterations = GLumenScreenProbeGatherHierarchicalScreenTracesMaxIterations;
        PassParameters->UncertainTraceRelativeDepthThreshold = GLumenScreenProbeGatherUncertainTraceRelativeDepthThreshold;
        PassParameters->NumThicknessStepsToDetermineCertainty = GLumenScreenProbeGatherNumThicknessStepsToDetermineCertainty;

        PassParameters->ScreenProbeParameters = ScreenProbeParameters;
        PassParameters->IndirectTracingParameters = IndirectTracingParameters;
        PassParameters->RadianceCacheParameters = RadianceCacheParameters;

        FScreenProbeTraceScreenTexturesCS::FPermutationDomain PermutationVector;
        PermutationVector.Set< FScreenProbeTraceScreenTexturesCS::FRadianceCache >(LumenScreenProbeGather::UseRadianceCache(View));
        PermutationVector.Set< FScreenProbeTraceScreenTexturesCS::FHierarchicalScreenTracing >(GLumenScreenProbeGatherHierarchicalScreenTraces != 0);
        PermutationVector.Set< FScreenProbeTraceScreenTexturesCS::FStructuredImportanceSampling >(LumenScreenProbeGather::UseImportanceSampling(View));
        auto ComputeShader = View.ShaderMap->GetShader<FScreenProbeTraceScreenTexturesCS>(PermutationVector);

        FComputeShaderUtils::AddPass(
            GraphBuilder,
            RDG_EVENT_NAME("TraceScreen"),
            ComputeShader,
            PassParameters,
            ScreenProbeParameters.ProbeIndirectArgs,
            (uint32)EScreenProbeIndirectArgs::ThreadPerTrace * sizeof(FRHIDispatchIndirectParameters));
    }

    // Trace mesh distance fields (mesh SDFs).
    if (bTraceMeshSDFs)
    {
        // Hardware ray tracing path
        if (Lumen::UseHardwareRayTracedScreenProbeGather())
        {
            FCompactedTraceParameters CompactedTraceParameters = CompactTraces(
                GraphBuilder,
                View,
                ScreenProbeParameters,
                WORLD_MAX,
                IndirectTracingParameters.MaxTraceDistance);

            RenderHardwareRayTracingScreenProbe(GraphBuilder,
                Scene,
                SceneTextures,
                ScreenProbeParameters,
                View,
                TracingInputs,
                IndirectTracingParameters,
                RadianceCacheParameters,
                CompactedTraceParameters);
        }
        // Software tracing path
        else
        {
            CullForCardTracing(
                GraphBuilder,
                Scene, View,
                TracingInputs,
                IndirectTracingParameters,
                /* out */ MeshSDFGridParameters);

            if (MeshSDFGridParameters.TracingParameters.DistanceFieldObjectBuffers.NumSceneObjects > 0)
            {
                FCompactedTraceParameters CompactedTraceParameters = CompactTraces(
                    GraphBuilder,
                    View,
                    ScreenProbeParameters,
                    IndirectTracingParameters.CardTraceEndDistanceFromCamera,
                    IndirectTracingParameters.MaxMeshSDFTraceDistance);

                {
                    FScreenProbeTraceMeshSDFsCS::FParameters* PassParameters = GraphBuilder.AllocParameters<FScreenProbeTraceMeshSDFsCS::FParameters>();
                    GetLumenCardTracingParameters(View, TracingInputs, PassParameters->TracingParameters);
                    PassParameters->MeshSDFGridParameters = MeshSDFGridParameters;
                    PassParameters->ScreenProbeParameters = ScreenProbeParameters;
                    PassParameters->IndirectTracingParameters = IndirectTracingParameters;
                    PassParameters->SceneTexturesStruct = SceneTexturesUniformBuffer;
                    PassParameters->CompactedTraceParameters = CompactedTraceParameters;

                    FScreenProbeTraceMeshSDFsCS::FPermutationDomain PermutationVector;
                    PermutationVector.Set< FScreenProbeTraceMeshSDFsCS::FStructuredImportanceSampling >(LumenScreenProbeGather::UseImportanceSampling(View));
                    auto ComputeShader = View.ShaderMap->GetShader<FScreenProbeTraceMeshSDFsCS>(PermutationVector);

                    FComputeShaderUtils::AddPass(
                        GraphBuilder,
                        RDG_EVENT_NAME("TraceMeshSDFs"),
                        ComputeShader,
                        PassParameters,
                        CompactedTraceParameters.IndirectArgs,
                        0);
                }
            }
        }
    }

    // Compact the remaining traces.
    FCompactedTraceParameters CompactedTraceParameters = CompactTraces(
        GraphBuilder,
        View,
        ScreenProbeParameters,
        WORLD_MAX,
        // Make sure the shader runs on all misses to apply radiance cache + skylight
        IndirectTracingParameters.MaxTraceDistance + 1);

    // Trace voxel lighting.
    {
        FScreenProbeTraceVoxelsCS::FParameters* PassParameters = GraphBuilder.AllocParameters<FScreenProbeTraceVoxelsCS::FParameters>();
        PassParameters->RadianceCacheParameters = RadianceCacheParameters;
        GetLumenCardTracingParameters(View, TracingInputs, PassParameters->TracingParameters);
        PassParameters->ScreenProbeParameters = ScreenProbeParameters;
        PassParameters->IndirectTracingParameters = IndirectTracingParameters;
        PassParameters->SceneTexturesStruct = SceneTexturesUniformBuffer;
        PassParameters->CompactedTraceParameters = CompactedTraceParameters;

        const bool bRadianceCache = LumenScreenProbeGather::UseRadianceCache(View);

        FScreenProbeTraceVoxelsCS::FPermutationDomain PermutationVector;
        PermutationVector.Set< FScreenProbeTraceVoxelsCS::FDynamicSkyLight >(Lumen::ShouldHandleSkyLight(Scene, *View.Family));
        PermutationVector.Set< FScreenProbeTraceVoxelsCS::FTraceDistantScene >(Scene->LumenSceneData->DistantCardIndices.Num() > 0);
        PermutationVector.Set< FScreenProbeTraceVoxelsCS::FRadianceCache >(bRadianceCache);
        PermutationVector.Set< FScreenProbeTraceVoxelsCS::FStructuredImportanceSampling >(LumenScreenProbeGather::UseImportanceSampling(View));
        auto ComputeShader = View.ShaderMap->GetShader<FScreenProbeTraceVoxelsCS>(PermutationVector);

        FComputeShaderUtils::AddPass(
            GraphBuilder,
            RDG_EVENT_NAME("TraceVoxels"),
            ComputeShader,
            PassParameters,
            CompactedTraceParameters.IndirectArgs,
            0);
    }

    if (GLumenScreenProbeGatherVisualizeTraces)
    {
        SetupVisualizeTraces(GraphBuilder, Scene, View, ScreenProbeParameters);
    }
}
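
The HZBUVToScreenUVScaleBias setup in the TraceScreen pass above is easier to follow with concrete numbers. The sketch below redoes that arithmetic on the CPU. `Float2`, `Float4` and both function names are hypothetical stand-ins for the engine types. The value (0.5, -0.5, 0.5, 0.5) used in the check is an assumption for a view that covers the whole buffer, and the HZB factor is purely illustrative.

```cpp
#include <cassert>
#include <cmath>

struct Float2 { float X, Y; };
struct Float4 { float X, Y, Z, W; };

// Composes the scale and bias the same way the C++ pass setup does.
// Returned as (Scale.X, Scale.Y, Bias.X, Bias.Y).
Float4 MakeHZBUVToScreenUVScaleBias(Float2 HZBUvFactor, Float4 ScreenPositionScaleBias)
{
    assert(HZBUvFactor.X > 0.0f && HZBUvFactor.Y > 0.0f);
    const Float4 S = ScreenPositionScaleBias;
    // HZB UV -> NDC is (UV / Factor) * (2, -2) + (-1, 1); NDC -> screen UV is NDC * S.xy + S.wz.
    const Float2 Scale = { (1.0f / HZBUvFactor.X) * 2.0f * S.X, (1.0f / HZBUvFactor.Y) * -2.0f * S.Y };
    const Float2 Bias = { -1.0f * S.X + S.W, 1.0f * S.Y + S.Z };
    return { Scale.X, Scale.Y, Bias.X, Bias.Y };
}

// Applies the packed scale/bias, as the shader does at the end of TraceScreen.
Float2 HZBUVToScreenUV(Float2 HZBUV, Float4 ScaleBias)
{
    return { HZBUV.X * ScaleBias.X + ScaleBias.Z, HZBUV.Y * ScaleBias.Y + ScaleBias.W };
}
```

Under the full-buffer assumption the composed mapping reduces to ScreenUV = HZBUV / HZBUvFactor, so the HZB UV equal to HZBUvFactor lands on screen UV (1, 1).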

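Let us first analyze TraceScreen with the help of frame-capture data. Its inputs are textures such as BlueNoise, Velocity, depth, probe speed, ray info, HZB, and SSRReducedSceneColor. Its outputs are the TraceRadiance texture in R11G11B10 format and the TraceHit texture in R32 format: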

Left: TraceRadiance. Right: TraceHit.

The compute shader it uses is as follows:

// Engine\Shaders\Private\Lumen\LumenScreenProbeTracing.usf

[numthreads(PROBE_THREADGROUP_SIZE_2D, PROBE_THREADGROUP_SIZE_2D, 1)]
void ScreenProbeTraceScreenTexturesCS(
    uint3 GroupId : SV_GroupID,
    uint3 DispatchThreadId : SV_DispatchThreadID,
    uint3 GroupThreadId : SV_GroupThreadID)
{
#define DEINTERLEAVED_SCREEN_TRACING 1
    // Compute the texture coordinates.
#if DEINTERLEAVED_SCREEN_TRACING
    uint2 AtlasSizeInProbes = uint2(ScreenProbeAtlasViewSize.x, (GetNumScreenProbes() + ScreenProbeAtlasViewSize.x - 1) / ScreenProbeAtlasViewSize.x);
    uint2 ScreenProbeAtlasCoord = DispatchThreadId.xy % AtlasSizeInProbes;
    uint2 TraceTexelCoord = DispatchThreadId.xy / AtlasSizeInProbes;
#else
    uint2 ScreenProbeAtlasCoord = DispatchThreadId.xy / ScreenProbeTracingOctahedronResolution;
    uint2 TraceTexelCoord = DispatchThreadId.xy - ScreenProbeAtlasCoord * ScreenProbeTracingOctahedronResolution;
#endif

    uint ScreenProbeIndex = ScreenProbeAtlasCoord.y * ScreenProbeAtlasViewSize.x + ScreenProbeAtlasCoord.x;

    uint2 ScreenProbeScreenPosition = GetScreenProbeScreenPosition(ScreenProbeIndex);
    uint2 ScreenTileCoord = GetScreenTileCoord(ScreenProbeScreenPosition);

    if (ScreenProbeIndex < GetNumScreenProbes() && all(TraceTexelCoord < ScreenProbeTracingOctahedronResolution))
    {
        float2 ScreenUV = GetScreenUVFromScreenProbePosition(ScreenProbeScreenPosition);
        float SceneDepth = GetScreenProbeDepth(ScreenProbeAtlasCoord);

        if (SceneDepth > 0.0f)
        {
            float3 WorldPosition = GetWorldPositionFromScreenUV(ScreenUV, SceneDepth);

            float2 ProbeUV;
            float ConeHalfAngle;
            // Get the UV used for probe tracing.
            GetProbeTracingUV(ScreenProbeAtlasCoord, TraceTexelCoord, GetProbeTexelCenter(ScreenTileCoord), 1, ProbeUV, ConeHalfAngle);

            float3 WorldConeDirection = OctahedralMapToDirection(ProbeUV);

            float DepthThresholdScale = HasDistanceFieldRepresentation(ScreenUV) ? 1.0f : ScreenTraceNoFallbackThicknessScale;

            {
                float TraceDistance = MaxTraceDistance;
                bool bCoveredByRadianceCache = false;
                #if RADIANCE_CACHE
                    float ProbeOcclusionDistance = GetRadianceProbeOcclusionDistanceWithInterpolation(WorldPosition, WorldConeDirection, bCoveredByRadianceCache);
                    TraceDistance = min(TraceDistance, ProbeOcclusionDistance);
                #endif


#if HIERARCHICAL_SCREEN_TRACING // Hierarchical screen tracing

                bool bHit;
                bool bUncertain;
                float3 HitUVz;

                // Screen trace
                TraceScreen(
                    WorldPosition + View.PreViewTranslation,
                    WorldConeDirection,
                    TraceDistance,
                    HZBUvFactorAndInvFactor,
                    MaxHierarchicalScreenTraceIterations, 
                    UncertainTraceRelativeDepthThreshold * DepthThresholdScale,
                    NumThicknessStepsToDetermineCertainty,
                    bHit,
                    bUncertain,
                    HitUVz);
                
                float Level = 1;
                bool bWriteDepthOnMiss = true;
#else // Non-hierarchical screen tracing
    
                uint NumSteps = 16;
                float StartMipLevel = 1.0f;
                float MaxScreenTraceFraction = .2f;

                // Fixed-step-count screen traces only give decent quality when the trace distance is limited.
                float MaxWorldTraceDistance = SceneDepth * MaxScreenTraceFraction * 2.0 * GetTanHalfFieldOfView().x;
                TraceDistance = min(TraceDistance, MaxWorldTraceDistance);

                uint2 NoiseCoord = ScreenProbeAtlasCoord * ScreenProbeTracingOctahedronResolution + TraceTexelCoord;
                float StepOffset = InterleavedGradientNoise(NoiseCoord + 0.5f, 0);

                float RayRoughness = .2f;
                StepOffset = StepOffset - .9f;

                FSSRTCastingSettings CastSettings = CreateDefaultCastSettings();
                CastSettings.bStopWhenUncertain = true;

                bool bHit = false;
                float Level;
                float3 HitUVz;
                bool bRayWasClipped;

                // Initialize a screen-space ray from the world-space ray.
                FSSRTRay Ray = InitScreenSpaceRayFromWorldSpace(
                    WorldPosition + View.PreViewTranslation, WorldConeDirection,
                    /* WorldTMax = */ TraceDistance,
                    /* SceneDepth = */ SceneDepth,
                    /* SlopeCompareToleranceScale */ 2.0f * DepthThresholdScale,
                    /* bExtendRayToScreenBorder = */ false,
                    /* out */ bRayWasClipped);

                bool bUncertain;
                float3 DebugOutput;

                // Cast the screen-space ray.
                CastScreenSpaceRay(
                    FurthestHZBTexture, FurthestHZBTextureSampler,
                    StartMipLevel,
                    CastSettings,
                    Ray, RayRoughness, NumSteps, StepOffset,
                    HZBUvFactorAndInvFactor, false,
                    /* out */ DebugOutput,
                    /* out */ HitUVz,
                    /* out */ Level,
                    /* out */ bHit,
                    /* out */ bUncertain);

                // CastScreenSpaceRay skips Mesh SDF tracing in a lot of places where it shouldn't, in particular missing thin occluders due to low NumSteps.  
                bool bWriteDepthOnMiss = !bUncertain;

#endif
                bHit = bHit && !bUncertain;

                uint2 TraceCoord = GetTraceBufferCoord(ScreenProbeAtlasCoord, TraceTexelCoord);
                bool bFastMoving = false;

                // Handle the hit.
                if (bHit)
                {
                    float2 ReducedColorUV = HitUVz.xy * ColorBufferScaleBias.xy + ColorBufferScaleBias.zw;
                    ReducedColorUV = min(ReducedColorUV, ReducedColorUVMax);

                    float3 Lighting = ColorTexture.SampleLevel(ColorTextureSampler, ReducedColorUV, Level).rgb;
                    
                    #if DEBUG_VISUALIZE_TRACE_TYPES
                        RWTraceRadiance[TraceCoord] = float3(.5f, 0, 0) * View.PreExposure;
                    #else
                        RWTraceRadiance[TraceCoord] = Lighting;
                    #endif

                    float3 HitWorldVelocity;
                    {
                        float2 HitScreenUV = HitUVz.xy;
                        float2 HitScreenPosition = (HitScreenUV.xy - View.ScreenPositionScaleBias.wz) / View.ScreenPositionScaleBias.xy;

                        float HitDeviceZ = HitUVz.z;
                        float HitSceneDepth = ConvertFromDeviceZ(HitUVz.z);
                        float3 HitHistoryScreenPosition = GetHistoryScreenPosition(HitScreenPosition, HitScreenUV, HitDeviceZ);

                        float3 HitTranslatedWorldPosition = mul(float4(HitScreenPosition * HitSceneDepth, HitSceneDepth, 1), View.ScreenToTranslatedWorld).xyz;
                        HitWorldVelocity = HitTranslatedWorldPosition - GetPrevTranslatedWorldPosition(HitHistoryScreenPosition);
                    }

                    float ProbeWorldSpeed = ScreenProbeWorldSpeed.Load(int3(ScreenProbeAtlasCoord, 0)).x;
                    float HitWorldSpeed = length(HitWorldVelocity);

                    bFastMoving = abs(ProbeWorldSpeed - HitWorldSpeed) / max(SceneDepth, 100.0f) > RelativeSpeedDifferenceToConsiderLightingMoving;
                }

                // Store the hit distance on a hit, or on a miss when writing depth is requested.
                if (bHit || bWriteDepthOnMiss)
                {
                    float HitDistance = min(sqrt(ComputeRayHitSqrDistance(WorldPosition + View.PreViewTranslation, HitUVz)), MaxTraceDistance);
                    RWTraceHit[TraceCoord] = EncodeProbeRayDistance(HitDistance, bHit, bFastMoving);
                }
            }
        }
    }
}
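
The two branches of DEINTERLEAVED_SCREEN_TRACING differ only in how a dispatch thread ID is split into a probe coordinate and a trace texel coordinate. The sketch below mirrors both index computations on the CPU so the layouts can be compared. `UInt2`, `TraceThreadCoords` and the function names are hypothetical; only the modulo and division logic comes from the shader.

```cpp
#include <cassert>
#include <cstdint>

struct UInt2 { uint32_t X, Y; };
struct TraceThreadCoords { UInt2 ProbeAtlasCoord; UInt2 TraceTexelCoord; };

// Atlas height in probes, rounding up so a partially filled last row is covered.
uint32_t AtlasHeightInProbes(uint32_t NumProbes, uint32_t AtlasWidthInProbes)
{
    assert(AtlasWidthInProbes > 0);
    return (NumProbes + AtlasWidthInProbes - 1) / AtlasWidthInProbes;
}

// Deinterleaved layout: adjacent threads process the same trace texel
// (the same direction) of adjacent probes.
TraceThreadCoords DecodeDeinterleaved(UInt2 ThreadId, UInt2 AtlasSizeInProbes)
{
    assert(AtlasSizeInProbes.X > 0 && AtlasSizeInProbes.Y > 0);
    return { { ThreadId.X % AtlasSizeInProbes.X, ThreadId.Y % AtlasSizeInProbes.Y },
             { ThreadId.X / AtlasSizeInProbes.X, ThreadId.Y / AtlasSizeInProbes.Y } };
}

// Interleaved layout: all trace texels of one probe are contiguous.
TraceThreadCoords DecodeInterleaved(UInt2 ThreadId, uint32_t OctahedronResolution)
{
    assert(OctahedronResolution > 0);
    const UInt2 Probe = { ThreadId.X / OctahedronResolution, ThreadId.Y / OctahedronResolution };
    return { Probe,
             { ThreadId.X - Probe.X * OctahedronResolution, ThreadId.Y - Probe.Y * OctahedronResolution } };
}
```

In the deinterleaved layout the threads of a group trace similar directions from neighbouring probes, which presumably makes their rays more coherent. The source does not state the motivation, so treat that reading as an assumption.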

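The shader above takes one of two screen tracing paths depending on HIERARCHICAL_SCREEN_TRACING. The frame capture shows that HIERARCHICAL_SCREEN_TRACING is 1, so execution goes into TraceScreen rather than CastScreenSpaceRay. TraceScreen is analyzed next: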

// Engine\Shaders\Private\Lumen\LumenScreenTracing.ush

// Traces screen space by traversing the HZB. Accurate, but relatively slow.
void TraceScreen(
    float3 RayTranslatedWorldOrigin, 
    float3 RayWorldDirection,
    float MaxWorldTraceDistance,
    float4 HZBUvFactorAndInvFactor,
    float MaxIterations,
    float UncertainTraceRelativeDepthThreshold,
    float NumThicknessStepsToDetermineCertainty,
    inout bool bHit,
    inout bool bUncertain,
    inout float3 OutScreenUV)
{
    // Compute the screen UV of the ray start.
    float3 RayStartScreenUV;
    {
        float4 RayStartClip = mul(float4(RayTranslatedWorldOrigin, 1.0f), View.TranslatedWorldToClip);
        float3 RayStartScreenPosition = RayStartClip.xyz / max(RayStartClip.w, 1.0f);
        RayStartScreenUV = float3((RayStartScreenPosition.xy * float2(0.5f, -0.5f) + 0.5f) * HZBUvFactorAndInvFactor.xy, RayStartScreenPosition.z);
    }
    
    // Compute the screen UV of the ray end.
    float3 RayEndScreenUV;
    {
        float3 ViewRayDirection = mul(float4(RayWorldDirection, 0.0), View.TranslatedWorldToView).xyz;
        float SceneDepth = mul(float4(RayTranslatedWorldOrigin, 1.0f), View.TranslatedWorldToView).z;
        // Clamp the ray so it ends before the Z == 0 plane, which keeps the end point valid in NDC space.
        float RayEndWorldDistance = ViewRayDirection.z < 0.0 ? min(-0.99f * SceneDepth / ViewRayDirection.z, MaxWorldTraceDistance) : MaxWorldTraceDistance;

        float3 RayWorldEnd = RayTranslatedWorldOrigin + RayWorldDirection * RayEndWorldDistance;
        float4 RayEndClip = mul(float4(RayWorldEnd, 1.0f), View.TranslatedWorldToClip);
        float3 RayEndScreenPosition = RayEndClip.xyz / RayEndClip.w;
        RayEndScreenUV = float3((RayEndScreenPosition.xy * float2(0.5f, -0.5f) + 0.5f) * HZBUvFactorAndInvFactor.xy, RayEndScreenPosition.z);

        float2 ScreenEdgeIntersections = LineBoxIntersect(RayStartScreenUV, RayEndScreenUV, float3(0, 0, 0), float3(HZBUvFactorAndInvFactor.xy, 1));

        // Recompute the end point as the place where the ray leaves the screen.
        RayEndScreenUV = RayStartScreenUV + (RayEndScreenUV - RayStartScreenUV) * ScreenEdgeIntersections.y;
    }

    float BaseMipLevel = HZB_TRACE_INCLUDE_FULL_RES_DEPTH ? -1 : 0;
    float MipLevel = BaseMipLevel;

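    // Step out of the current tile without hit testing to avoid self-intersection. This is necessary because HZB mip 0 stores the closest depth of 2x2 pixels, and the HZB is stored as 16-bit float.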
    bool bStepOutOfCurrentTile = true;
    if (bStepOutOfCurrentTile)
    {
        float2 HZBTileSize = exp2(MipLevel) * HZBBaseTexelSize;
        float2 BiasedUV = RayStartScreenUV.xy;
        float3 HZBTileMin = float3(floor(BiasedUV.xy / HZBTileSize) * HZBTileSize, 0.0f);
        float3 HZBTileMax = float3(HZBTileMin.xy + HZBTileSize, 1);
        float2 TileIntersections = LineBoxIntersect(RayStartScreenUV, RayEndScreenUV, HZBTileMin, HZBTileMax);

        {
            float3 RayTileHit = RayStartScreenUV + (RayEndScreenUV - RayStartScreenUV) * TileIntersections.y;
            RayStartScreenUV = RayTileHit;
        }
    }

    bHit = false;
    bUncertain = false;

    float RayLength2D = length(RayEndScreenUV.xy - RayStartScreenUV.xy);
    float2 RayDirectionScreenUV = (RayEndScreenUV.xy - RayStartScreenUV.xy) / max(RayLength2D, .0001f);
    float3 RayScreenUV = RayStartScreenUV;
    float NumIterations = 0;
    
    // Stackless HZB traversal.
    while (MipLevel >= BaseMipLevel && NumIterations < MaxIterations)
    {
        float2 HZBTileSize = exp2(MipLevel) * HZBBaseTexelSize;
        // RayScreenUV is on a tile boundary due to bStepOutOfCurrentTile
        // Offset the UV along the ray direction so it always quantizes to the next tile
        float2 BiasedUV = RayScreenUV.xy + .01f * RayDirectionScreenUV.xy * HZBTileSize;
        float3 HZBTileMin = float3(floor(BiasedUV / HZBTileSize) * HZBTileSize, 0.0f);
        float3 HZBTileMax = float3(HZBTileMin.xy + HZBTileSize, 1);
        float2 TileIntersections = LineBoxIntersect(RayStartScreenUV, RayEndScreenUV, HZBTileMin, HZBTileMax);
        float3 RayTileHit = RayStartScreenUV + (RayEndScreenUV - RayStartScreenUV) * TileIntersections.y;

        float TileZ;
        float AvoidSelfIntersectionZScale = 1.0f;

#if HZB_TRACE_INCLUDE_FULL_RES_DEPTH
        if (MipLevel < 0)
        {
            TileZ = SceneDepthTexture.SampleLevel(GlobalPointClampedSampler, BiasedUV * HZBUVToScreenUVScaleBias.xy + HZBUVToScreenUVScaleBias.zw, 0).x;
        }
        else
#endif
        {
            TileZ = ClosestHZBTexture.SampleLevel(GlobalPointClampedSampler, BiasedUV, MipLevel).x;
            // Heuristic to avoid false self-intersection, because HZB mip 0 stores the closest depth of 2x2 pixels and the HZB is stored as 16-bit float.
            AvoidSelfIntersectionZScale = lerp(.99f, 1.0f, saturate(TileIntersections.y * 10.0f));
        }

        if (RayTileHit.z > TileZ * AvoidSelfIntersectionZScale)
        {
            RayScreenUV = RayTileHit;
            MipLevel++;

            if (TileIntersections.y == 1.0f)
            {
                // The ray reached its end without intersecting any HZB tile, so terminate the loop.
                MipLevel = BaseMipLevel - 1;
            }
        }
        else
        {
            if (abs(MipLevel - BaseMipLevel) < .1f)
            {
                // Snap the hit UV to the texel center for the SceneColor lookup.
                RayScreenUV = float3(.5f * (HZBTileMin.xy + HZBTileMax.xy), RayTileHit.z);
                bHit = true;
                float IntersectionDepth = ConvertFromDeviceZ(TileZ);
                float RayTileEnterZ = RayStartScreenUV.z + (RayEndScreenUV.z - RayStartScreenUV.z) * TileIntersections.x;
                bUncertain = (ConvertFromDeviceZ(RayTileEnterZ) - IntersectionDepth) / max(IntersectionDepth, .00001f) > UncertainTraceRelativeDepthThreshold;
            }

            MipLevel--;
        }

        NumIterations++;
    }

    // Take a few linear steps along the ray over a fixed thickness, to reject hits behind very thin surfaces (grass, hair, foliage).
    if (bHit && !bUncertain && NumThicknessStepsToDetermineCertainty > 0)
    {
        float ThicknessSearchMipLevel = 0.0f;
        float MipNumTexels = exp2(ThicknessSearchMipLevel);
        float2 HZBTileSize = MipNumTexels * HZBBaseTexelSize;
        float NumSteps = NumThicknessStepsToDetermineCertainty / MipNumTexels;
        float ThicknessSearchEndTime = min(length(RayDirectionScreenUV * HZBTileSize * NumSteps) / length(RayEndScreenUV.xy - RayScreenUV.xy), 1.0f);

        for (float I = 0; I < NumSteps; I++)
        {
            float3 SampleUV = RayScreenUV + (I / NumSteps) * ThicknessSearchEndTime * (RayEndScreenUV - RayScreenUV);

            if (all(SampleUV.xy > 0 && SampleUV.xy < HZBUvFactorAndInvFactor.xy))
            {
                float SampleTileZ = ClosestHZBTexture.SampleLevel(GlobalPointClampedSampler, SampleUV.xy, ThicknessSearchMipLevel).x;

                if (SampleUV.z > SampleTileZ)
                {
                    bUncertain = true;
                }
            }
        }
    }

    OutScreenUV.xy = RayScreenUV.xy * HZBUVToScreenUVScaleBias.xy + HZBUVToScreenUVScaleBias.zw;
    OutScreenUV.z = RayScreenUV.z;
}
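
The stackless traversal in TraceScreen is easier to see in one dimension. The sketch below is a simplified CPU analogue, not the engine algorithm. It assumes a 1D depth row with a power-of-two length, linear depth where smaller means closer (the shader compares device Z with the opposite sense), and a ray that moves in +X with non-decreasing depth. `HZBPyramid`, `BuildClosestHZB` and `TraceHZB1D` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <utility>
#include <vector>

using HZBPyramid = std::vector<std::vector<float>>;

// Builds a "closest depth" pyramid: mip 0 is the input row, and each higher
// mip stores the minimum (closest) depth of its two children.
HZBPyramid BuildClosestHZB(const std::vector<float>& Depth)
{
    assert(!Depth.empty() && (Depth.size() & (Depth.size() - 1)) == 0);
    HZBPyramid Mips{Depth};
    while (Mips.back().size() > 1)
    {
        const std::vector<float>& Prev = Mips.back();
        std::vector<float> Next(Prev.size() / 2);
        for (size_t i = 0; i < Next.size(); ++i)
        {
            Next[i] = std::min(Prev[2 * i], Prev[2 * i + 1]);
        }
        Mips.push_back(std::move(Next));
    }
    return Mips;
}

// Returns the index of the first mip-0 texel the ray hits, or -1 on a miss.
// The ray depth at position X is StartZ + DzDx * (X - StartX), with X in texels.
int TraceHZB1D(const HZBPyramid& Mips, float StartX, float StartZ, float DzDx, int MaxIterations)
{
    assert(DzDx >= 0.0f && StartX >= 0.0f);
    const int NumTexels = static_cast<int>(Mips[0].size());
    const int MaxMip = static_cast<int>(Mips.size()) - 1;
    int Mip = 0;
    float X = StartX;

    for (int Iteration = 0; Iteration < MaxIterations && Mip >= 0; ++Iteration)
    {
        const int TileSize = 1 << Mip;
        // Bias slightly along the ray so a position on a tile boundary quantizes to the next tile.
        const int Tile = static_cast<int>(std::floor((X + 0.01f * TileSize) / TileSize));
        if (Tile * TileSize >= NumTexels)
        {
            return -1; // The ray left the row without hitting anything.
        }

        const float TileExitX = static_cast<float>((Tile + 1) * TileSize);
        const float ExitZ = StartZ + DzDx * (TileExitX - StartX);

        if (ExitZ < Mips[Mip][Tile])
        {
            // Even at the tile exit the ray is in front of the closest surface:
            // skip the whole tile and try a coarser mip next.
            X = TileExitX;
            Mip = std::min(Mip + 1, MaxMip);
        }
        else if (Mip == 0)
        {
            return Tile; // Confirmed at the finest level.
        }
        else
        {
            --Mip; // Possible hit inside this tile: refine.
        }
    }
    return -1;
}
```

The structure matches the shader loop: a tile that cannot contain a hit advances the ray and raises the mip level, a potential hit lowers it, and only a failed test at the base mip counts as a hit. The uncertainty test and the thickness steps of the real shader are left out.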

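For HZB-based screen-space ray tracing, I recommend Lingqi Yan's (闫令琪) graphics course 《GAMES202-高质量实时渲染》 (GAMES202: High-Quality Real-Time Rendering), Lecture 9, Real-Time Global Illumination (Screen Space). The video walks through the HZB traversal and tracing process in detail, with animation. The figure below is just one illustration captured from that video: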

  • TraceVoxels

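The voxel tracing pass takes as inputs the global distance field, normals, depth, sky light, blue noise, VoxelLighting, RadianceProbeIndirectTexture, FinalRadianceAtlas, ray info, and so on. Its outputs are TraceHit in R32 format and TraceRadiance in R11G11B10 format: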

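The TraceVoxels output texture TraceHit stores the depth of the intersection points. Note that the display range in the top-right corner has been adjusted.

The TraceVoxels output texture TraceRadiance stores the radiance at the intersection points.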

Next, the compute shader it uses:

// Engine\Shaders\Private\Lumen\LumenScreenProbeTracing.usf

[numthreads(PROBE_THREADGROUP_SIZE_1D, 1, 1)]
void ScreenProbeTraceVoxelsCS(
    uint3 GroupId : SV_GroupID,
    uint3 DispatchThreadId : SV_DispatchThreadID,
    uint3 GroupThreadId : SV_GroupThreadID)
{
    if (DispatchThreadId.x < CompactedTraceTexelAllocator[0])
    {
        uint ScreenProbeIndex;
        uint2 TraceTexelCoord;
        float TraceHitDistance;
        // Decode the trace texel that needs to be traced.
        DecodeTraceTexel(CompactedTraceTexelData[DispatchThreadId.x], ScreenProbeIndex, TraceTexelCoord, TraceHitDistance);

        // Compute the probe's coordinate in the atlas.
        uint2 ScreenProbeAtlasCoord = uint2(ScreenProbeIndex % ScreenProbeAtlasViewSize.x, ScreenProbeIndex / ScreenProbeAtlasViewSize.x);
        // Trace voxel lighting for this probe texel.
        TraceVoxels(ScreenProbeAtlasCoord, TraceTexelCoord, ScreenProbeIndex, TraceHitDistance);
    }
}

void TraceVoxels(
    uint2 ScreenProbeAtlasCoord,
    uint2 TraceTexelCoord,
    uint ScreenProbeIndex,
    float TraceHitDistance)
{
    // Compute the trace coordinates.
    uint2 ScreenProbeScreenPosition = GetScreenProbeScreenPosition(ScreenProbeIndex);
    uint2 ScreenTileCoord = GetScreenTileCoord(ScreenProbeScreenPosition);

    uint2 TraceCoord = GetTraceBufferCoord(ScreenProbeAtlasCoord, TraceTexelCoord);
    
    {
        // Fetch the screen-space data.
        float2 ScreenUV = GetScreenUVFromScreenProbePosition(ScreenProbeScreenPosition);
        float SceneDepth = GetScreenProbeDepth(ScreenProbeAtlasCoord);
        float3 SceneNormal = DecodeNormal(SceneTexturesStruct.GBufferATexture.Load(int3(ScreenUV * View.BufferSizeAndInvSize.xy, 0)).xyz);

        bool bHit = false;

        {
            // Compute the world position.
            float3 WorldPosition = GetWorldPositionFromScreenUV(ScreenUV, SceneDepth);

            float2 ProbeUV;
            float ConeHalfAngle;
            // Get the UV used for probe tracing.
            GetProbeTracingUV(ScreenProbeAtlasCoord, TraceTexelCoord, GetProbeTexelCenter(ScreenTileCoord), 1, ProbeUV, ConeHalfAngle);

            // Convert the octahedral map UV back to a direction.
            float3 WorldConeDirection = OctahedralMapToDirection(ProbeUV);

            // Sample position.
            float3 SamplePosition = WorldPosition + SurfaceBias * WorldConeDirection;
            SamplePosition += SurfaceBias * SceneNormal;

            float TraceDistance = MaxTraceDistance;
            bool bCoveredByRadianceCache = false;
#if RADIANCE_CACHE
            float ProbeOcclusionDistance = GetRadianceProbeOcclusionDistanceWithInterpolation(WorldPosition, WorldConeDirection, bCoveredByRadianceCache);
            TraceDistance = min(TraceDistance, ProbeOcclusionDistance);
#endif

            // Build the cone trace input.
            FConeTraceInput TraceInput;
            TraceInput.Setup(SamplePosition, WorldConeDirection, ConeHalfAngle, MinSampleRadius, MinTraceDistance, TraceDistance, StepFactor);
            TraceInput.VoxelStepFactor = VoxelStepFactor;
            TraceInput.VoxelTraceStartDistance = max(MinTraceDistance, TraceHitDistance);

            // Build the cone trace output.
            FConeTraceResult TraceResult = (FConeTraceResult)0;
            TraceResult.Lighting = 0;
            TraceResult.Transparency = 1;
            TraceResult.OpaqueHitDistance = TraceInput.MaxTraceDistance;

            // Cone trace the lighting voxels of the Lumen scene.
            ConeTraceLumenSceneVoxels(TraceInput, TraceResult);

            if (TraceResult.Transparency <= .5f)
            {
                // Self-intersection from grazing-angle traces causes noise that the spatial filter cannot remove.
                #define USE_VOXEL_TRACE_HIT_DISTANCE 0
                #if USE_VOXEL_TRACE_HIT_DISTANCE
                    TraceHitDistance = TraceResult.OpaqueHitDistance;
                #else
                    TraceHitDistance = TraceDistance;
                #endif
                bHit = true;
            }

#if RADIANCE_CACHE
            if (bCoveredByRadianceCache)
            {
                if (TraceResult.Transparency > .5f)
                {
                    // The depth of radiance cache hits is not stored.
                    TraceHitDistance = MaxTraceDistance;
                }

                SampleRadianceCacheAndApply(WorldPosition, WorldConeDirection, ConeHalfAngle, float3(0, 0, 0), TraceResult.Lighting, TraceResult.Transparency);
            }
            else
#endif
            {
#if TRACE_DISTANT_SCENE
                // Trace the distant scene.
                if (TraceResult.Transparency > .01f)
                {
                    FConeTraceResult DistantTraceResult;
                    ConeTraceLumenDistantScene(TraceInput, DistantTraceResult);
                    TraceResult.Lighting += DistantTraceResult.Lighting * TraceResult.Transparency;
                    TraceResult.Transparency *= DistantTraceResult.Transparency;
                }
#endif
                // Evaluate the sky light.
                EvaluateSkyRadianceForCone(WorldConeDirection, tan(ConeHalfAngle), TraceResult);

                if (TraceHitDistance >= GetProbeMaxHitDistance())
                {
                    TraceHitDistance = MaxTraceDistance;
                }
            }
            
            #if USE_PREEXPOSURE
                TraceResult.Lighting *= View.PreExposure;
            #endif

            #if DEBUG_VISUALIZE_TRACE_TYPES
                RWTraceRadiance[TraceCoord] = float3(0, 0, .5f) * View.PreExposure;
            #else
                RWTraceRadiance[TraceCoord] = TraceResult.Lighting;
            #endif
        }

        // Store the trace result: hit distance, hit flag, and moving flag are encoded into one 32-bit unsigned integer.
        RWTraceHit[TraceCoord] = EncodeProbeRayDistance(TraceHitDistance, bHit, false);
    }
}
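
The TRACE_DISTANT_SCENE branch above merges the distant-scene result behind the voxel result by front-to-back compositing. The sketch below isolates that step under stated assumptions: `Float3`, `ConeTraceResult` (reduced to two fields), `ShouldTraceDistantScene` and `CompositeBehind` are hypothetical names, while the arithmetic and the 0.01 threshold come from the shader.

```cpp
#include <cassert>

struct Float3 { float R, G, B; };

// Reduced stand-in for the shader's FConeTraceResult.
struct ConeTraceResult
{
    Float3 Lighting;
    float Transparency; // 1 = nothing hit yet, 0 = fully opaque.
};

// The shader only traces the distant scene while the cone is still partly transparent.
inline bool ShouldTraceDistantScene(float Transparency)
{
    return Transparency > 0.01f;
}

// Front-to-back compositing: the far result is attenuated by whatever
// transparency the near result left over, then the transparencies multiply.
inline void CompositeBehind(ConeTraceResult& Near, const ConeTraceResult& Far)
{
    assert(Near.Transparency >= 0.0f && Near.Transparency <= 1.0f);
    Near.Lighting.R += Far.Lighting.R * Near.Transparency;
    Near.Lighting.G += Far.Lighting.G * Near.Transparency;
    Near.Lighting.B += Far.Lighting.B * Near.Transparency;
    Near.Transparency *= Far.Transparency;
}
```

Because the far lighting is scaled by the remaining transparency, a voxel trace that is already nearly opaque is barely affected by the distant scene, which is why the shader can skip that trace below the 0.01 threshold.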

  • CompositeTraces

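CompositeTraces takes the TraceHit, RayInfo, and TraceRadiance produced by the previous steps and generates the ScreenProbeRadiance, ScreenProbeHitDistance, and ScreenProbeTraceMoving textures. Its compute shader lives in LumenScreenProbeFiltering.usf, with ScreenProbeCompositeTracesWithScatterCS as the entry point. The code is omitted here.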

  • FilterRadianceWithGather

After CompositeTraces, several FilterRadianceWithGather passes filter the probe radiance:

Left: ScreenProbeRadiance before filtering; right: ScreenProbeRadiance after several filtering passes.

  • ComputeIndirect

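This stage uses the screen-space probe data generated earlier (depth, normal, base color, FilteredScreenProbeRadiance, BentNormal) to compute the final indirect lighting color of the scene (figure below):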

6.5.7.3 RenderLumenReflections

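RenderLumenReflections renders reflections for the smoother, low-roughness surfaces in the Lumen scene. Its flow resembles RenderLumenScreenProbeGather but is simpler and has fewer steps:

The C++ rendering code involved is as follows: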

// Engine\Source\Runtime\Renderer\Private\Lumen\LumenReflections.cpp

FRDGTextureRef FDeferredShadingSceneRenderer::RenderLumenReflections(
    FRDGBuilder& GraphBuilder, 
    const FViewInfo& View,
    const FSceneTextures& SceneTextures,
    const FLumenMeshSDFGridParameters& MeshSDFGridParameters,
    FLumenReflectionCompositeParameters& OutCompositeParameters)
{
    // Maximum roughness traced for reflections; rougher surfaces are skipped.
    OutCompositeParameters.MaxRoughnessToTrace = GLumenReflectionMaxRoughnessToTrace;
    OutCompositeParameters.InvRoughnessFadeLength = 1.0f / GLumenReflectionRoughnessFadeLength;

    (......)

    {
        (......)

        auto ComputeShader = View.ShaderMap->GetShader<FReflectionGenerateRaysCS>(0);

        // Pass that generates the rays.
        FComputeShaderUtils::AddPass(
            GraphBuilder,
            RDG_EVENT_NAME("GenerateRaysCS"),
            ComputeShader,
            PassParameters,
            ReflectionTileParameters.TracingIndirectArgs,
            0);
    }

    FLumenCardTracingInputs TracingInputs(GraphBuilder, Scene, View);

    (......)

    // Trace reflections.
    TraceReflections(
        GraphBuilder, 
        Scene,
        View, 
        GLumenReflectionTraceMeshSDFs != 0 && Lumen::UseMeshSDFTracing(),
        SceneTextures,
        TracingInputs,
        ReflectionTracingParameters,
        ReflectionTileParameters,
        MeshSDFGridParameters);
    
    (......)

    {
        FReflectionResolveCS::FParameters* PassParameters = GraphBuilder.AllocParameters<FReflectionResolveCS::FParameters>();
        
        (......)
        
        auto ComputeShader = View.ShaderMap->GetShader<FReflectionResolveCS>(PermutationVector);

        // Resolve reflections.
        FComputeShaderUtils::AddPass(
            GraphBuilder,
            RDG_EVENT_NAME("ReflectionResolve"),
            ComputeShader,
            PassParameters,
            ReflectionTileParameters.ResolveIndirectArgs,
            0);
    }

    (......)

    // Update the history data.
    UpdateHistoryReflections(
        GraphBuilder,
        View,
        SceneTextures,
        ReflectionTileParameters,
        ResolvedSpecularIndirect,
        SpecularIndirect);

    return SpecularIndirect;
}

void TraceReflections(
    FRDGBuilder& GraphBuilder,
    const FScene* Scene,
    const FViewInfo& View,
    bool bTraceMeshSDFs,
    const FSceneTextures& SceneTextures,
    const FLumenCardTracingInputs& TracingInputs,
    const FLumenReflectionTracingParameters& ReflectionTracingParameters,
    const FLumenReflectionTileParameters& ReflectionTileParameters,
    const FLumenMeshSDFGridParameters& InMeshSDFGridParameters)
{
    {
        (......)

        auto ComputeShader = View.ShaderMap->GetShader<FReflectionClearTracesCS>(0);

        // Clear the trace output textures.
        FComputeShaderUtils::AddPass(
            GraphBuilder,
            RDG_EVENT_NAME("ClearTraces"),
            ComputeShader,
            PassParameters,
            ReflectionTileParameters.TracingIndirectArgs,
            0);
    }

    FLumenIndirectTracingParameters IndirectTracingParameters;
    SetupIndirectTracingParametersForReflections(IndirectTracingParameters);

    const FSceneTextureParameters& SceneTextureParameters = GetSceneTextureParameters(GraphBuilder, SceneTextures);

    const bool bScreenTraces = GLumenReflectionScreenTraces != 0;

    if (bScreenTraces)
    {
        FReflectionTraceScreenTexturesCS::FParameters* PassParameters = GraphBuilder.AllocParameters<FReflectionTraceScreenTexturesCS::FParameters>();

        (......)

        FReflectionTraceScreenTexturesCS::FPermutationDomain PermutationVector;
        auto ComputeShader = View.ShaderMap->GetShader<FReflectionTraceScreenTexturesCS>(PermutationVector);

        // Screen traces.
        FComputeShaderUtils::AddPass(
            GraphBuilder,
            RDG_EVENT_NAME("TraceScreen"),
            ComputeShader,
            PassParameters,
            ReflectionTileParameters.TracingIndirectArgs,
            0);
    }
    
    // Mesh distance field traces.
    if (bTraceMeshSDFs)
    {
        if (Lumen::UseHardwareRayTracedReflections()) // Hardware ray traced reflections.
        {
            FCompactedReflectionTraceParameters CompactedTraceParameters = CompactTraces(
                GraphBuilder,
                View,
                ReflectionTracingParameters,
                ReflectionTileParameters,
                WORLD_MAX,
                IndirectTracingParameters.MaxTraceDistance);

            RenderLumenHardwareRayTracingReflections(
                GraphBuilder,
                SceneTextureParameters,
                View,
                ReflectionTracingParameters,
                ReflectionTileParameters,
                TracingInputs,
                CompactedTraceParameters,
                IndirectTracingParameters.MaxTraceDistance);
        }
        else
        {
            FLumenMeshSDFGridParameters MeshSDFGridParameters = InMeshSDFGridParameters;
            if (!MeshSDFGridParameters.NumGridCulledMeshSDFObjects)
            {
                CullForCardTracing(
                    GraphBuilder,
                    Scene, View,
                    TracingInputs,
                    IndirectTracingParameters,
                    /* out */ MeshSDFGridParameters);
            }

            if (MeshSDFGridParameters.TracingParameters.DistanceFieldObjectBuffers.NumSceneObjects > 0)
            {
                // Compact the traces.
                FCompactedReflectionTraceParameters CompactedTraceParameters = CompactTraces(
                    GraphBuilder,
                    View,
                    ReflectionTracingParameters,
                    ReflectionTileParameters,
                    IndirectTracingParameters.CardTraceEndDistanceFromCamera,
                    IndirectTracingParameters.MaxMeshSDFTraceDistance);

                {
                    (......)
                    
                    auto ComputeShader = View.ShaderMap->GetShader<FReflectionTraceMeshSDFsCS>(PermutationVector);

                    // Trace the mesh distance fields.
                    FComputeShaderUtils::AddPass(
                        GraphBuilder,
                        RDG_EVENT_NAME("TraceMeshSDFs"),
                        ComputeShader,
                        PassParameters,
                        CompactedTraceParameters.IndirectArgs,
                        0);
                }
            }
        }
    }

    FCompactedReflectionTraceParameters CompactedTraceParameters = CompactTraces(...);

    {
        (......)
        
        auto ComputeShader = View.ShaderMap->GetShader<FReflectionTraceVoxelsCS>(PermutationVector);

        // Trace the voxel lighting.
        FComputeShaderUtils::AddPass(
            GraphBuilder,
            RDG_EVENT_NAME("TraceVoxels"),
            ComputeShader,
            PassParameters,
            CompactedTraceParameters.IndirectArgs,
            0);
    }
}

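The main differences between Lumen's reflection indirect lighting and its diffuse indirect lighting are the number of rays traced and how they are traced. Lumen reflections are limited by the maximum roughness to trace, GLumenReflectionMaxRoughnessToTrace (default 0.4, adjustable with the console variable r.Lumen.Reflections.MaxRoughnessToTrace), and the resulting TraceHit and TraceRadiance differ as well.

Because reflections and diffuse lighting rely on highly similar techniques, this article does not dig further into the details.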
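RenderLumenReflections stores MaxRoughnessToTrace together with InvRoughnessFadeLength, which suggests that the traced reflection is faded out as roughness approaches the limit rather than cut off abruptly. The sketch below shows one plausible form of such a fade. The formula, the function name ReflectionFadeSketch and the fade length of 0.1 used in the usage note are assumptions for illustration, not values confirmed by the source.

```cpp
#include <algorithm>
#include <cassert>

// Returns 1 for smooth surfaces, 0 at or above MaxRoughnessToTrace,
// and a linear ramp over RoughnessFadeLength just below the limit.
inline float ReflectionFadeSketch(float Roughness, float MaxRoughnessToTrace, float RoughnessFadeLength)
{
    assert(RoughnessFadeLength > 0.0f);
    const float InvRoughnessFadeLength = 1.0f / RoughnessFadeLength;
    const float Fade = (MaxRoughnessToTrace - Roughness) * InvRoughnessFadeLength;
    return std::min(std::max(Fade, 0.0f), 1.0f); // saturate()
}
```

With the default limit of 0.4 and an assumed fade length of 0.1, a surface of roughness 0.2 keeps its full traced reflection, one of roughness 0.35 keeps roughly half, and anything at 0.4 or rougher relies entirely on the rough specular term from the screen probe gather.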

6.5.7.4 DiffuseIndirectComposite

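This stage combines the probe data produced by RenderLumenScreenProbeGather (DiffuseIndirect, RoughSpecularIndirect) and the reflection data produced by RenderLumenReflections (SpecularIndirect) with the scene GBuffer and related data to produce the final scene color:

Scene color after compositing the GI diffuse and specular terms. (Magnified 1.5x, with the color range adjusted.)

The compositing process itself can be found in the pixel shader it uses: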

// Engine\Shaders\Private\DiffuseIndirectComposite.usf

void MainPS(
    float4 SvPosition : SV_POSITION
    , out float4 OutAddColor : SV_Target0
    , out float4 OutMultiplyColor : SV_Target1
)
{
    float2 SceneBufferUV = SvPositionToBufferUV(SvPosition);
    float2 ScreenPosition = SvPositionToScreenPosition(SvPosition).xy;

    // Sample the scene GBuffer.
    FGBufferData GBuffer = GetGBufferDataFromSceneTextures(SceneBufferUV);

    // Sample the AO generated dynamically each frame.
    float DynamicAmbientOcclusion = AmbientOcclusionTexture.SampleLevel(AmbientOcclusionSampler, SceneBufferUV, 0).r;

    // Compute the final AO to apply.
    float AOMask = (GBuffer.ShadingModelID != SHADINGMODELID_UNLIT);
    float FinalAmbientOcclusion = lerp(1.0f, GBuffer.GBufferAO * DynamicAmbientOcclusion, AOMask * AmbientOcclusionStaticFraction);

    float3 TranslatedWorldPosition = mul(float4(ScreenPosition * GBuffer.Depth, GBuffer.Depth, 1), View.ScreenToTranslatedWorld).xyz;

    float3 N = GBuffer.WorldNormal;
    float3 V = normalize(View.TranslatedWorldCameraOrigin - TranslatedWorldPosition);
    float NoV = saturate(dot(N, V));

    // Apply the indirect diffuse lighting.
#if DIM_APPLY_DIFFUSE_INDIRECT
    {
        float3 DiffuseIndirectLighting = 0;
        float3 RoughSpecularIndirectLighting = 0;
        float3 SpecularIndirectLighting = 0;

        #if DIM_APPLY_DIFFUSE_INDIRECT == 4
            DiffuseIndirectLighting = DiffuseIndirect_Textures_0.SampleLevel(GlobalPointClampedSampler, SceneBufferUV, 0).rgb;
            RoughSpecularIndirectLighting = DiffuseIndirect_Textures_1.SampleLevel(GlobalPointClampedSampler, SceneBufferUV, 0).rgb;
            SpecularIndirectLighting = DiffuseIndirect_Textures_2.SampleLevel(GlobalPointClampedSampler, SceneBufferUV, 0).rgb;
        #else
        {
            // Sample the denoiser output.
            FSSDKernelConfig KernelConfig = CreateKernelConfig();
                
            #if DEBUG_OUTPUT
            {
                KernelConfig.DebugPixelPosition = uint2(SvPosition.xy);
                KernelConfig.DebugEventCounter = 0;
            }
            #endif

            // Compile time.
            KernelConfig.bSampleKernelCenter = true;
            KernelConfig.BufferLayout = CONFIG_SIGNAL_INPUT_LAYOUT;
            KernelConfig.bUnroll = true;

            #if DIM_UPSCALE_DIFFUSE_INDIRECT
            {
                KernelConfig.SampleSet = SAMPLE_SET_2X2_BILINEAR;
                KernelConfig.BilateralDistanceComputation = SIGNAL_WORLD_FREQUENCY_REF_METADATA_ONLY;
                KernelConfig.WorldBluringDistanceMultiplier = 16.0;
                
                KernelConfig.BilateralSettings[0] = BILATERAL_POSITION_BASED(3);
                
                // SGPRs (Scalar General Purpose Registers)
                KernelConfig.BufferSizeAndInvSize = View.BufferSizeAndInvSize * float4(0.5, 0.5, 2.0, 2.0);
                KernelConfig.BufferBilinearUVMinMax = View.BufferBilinearUVMinMax;
            }
            #else
            {
                KernelConfig.SampleSet = SAMPLE_SET_1X1;
                KernelConfig.bNormalizeSample = true;
                
                // SGPRs
                KernelConfig.BufferSizeAndInvSize = View.BufferSizeAndInvSize;
                KernelConfig.BufferBilinearUVMinMax = View.BufferBilinearUVMinMax;
            }
            #endif

            // VGPRs (Vector General Purpose Registers)
            KernelConfig.BufferUV = SceneBufferUV; 
            {
                KernelConfig.CompressedRefSceneMetadata = GBufferDataToCompressedSceneMetadata(GBuffer);
                KernelConfig.RefBufferUV = SceneBufferUV;
                KernelConfig.RefSceneMetadataLayout = METADATA_BUFFER_LAYOUT_DISABLED;
            }
            KernelConfig.HammersleySeed = Rand3DPCG16(int3(SvPosition.xy, View.StateFrameIndexMod8)).xy;
                
            FSSDSignalAccumulatorArray UncompressedAccumulators = CreateSignalAccumulatorArray();
            FSSDCompressedSignalAccumulatorArray CompressedAccumulators = CompressAccumulatorArray(
                UncompressedAccumulators, CONFIG_ACCUMULATOR_VGPR_COMPRESSION);

            // Accumulate the convolution kernel.
            AccumulateKernel(
                KernelConfig,
                DiffuseIndirect_Textures_0,
                DiffuseIndirect_Textures_1,
                DiffuseIndirect_Textures_2,
                DiffuseIndirect_Textures_3,
                /* inout */ UncompressedAccumulators,
                /* inout */ CompressedAccumulators);

            // Fetch the sample.
            FSSDSignalSample Sample;
            #if DIM_UPSCALE_DIFFUSE_INDIRECT
                Sample = NormalizeToOneSample(UncompressedAccumulators.Array[0].Moment1);
            #else
                Sample = UncompressedAccumulators.Array[0].Moment1;
            #endif
            
            // When DIM_APPLY_DIFFUSE_INDIRECT is 1 or 3, there is only diffuse indirect lighting.
            #if DIM_APPLY_DIFFUSE_INDIRECT == 1 || DIM_APPLY_DIFFUSE_INDIRECT == 3
            {
                DiffuseIndirectLighting = Sample.SceneColor.rgb;
            }
            // When DIM_APPLY_DIFFUSE_INDIRECT is 2, there is both diffuse and specular indirect lighting.
            #elif DIM_APPLY_DIFFUSE_INDIRECT == 2
            {
                DiffuseIndirectLighting = UncompressedAccumulators.Array[0].Moment1.ColorArray[0];
                SpecularIndirectLighting = UncompressedAccumulators.Array[0].Moment1.ColorArray[1];
            }
            #else
                #error Unimplemented
            #endif
        }
        #endif

        float3 DiffuseColor = bVisualizeDiffuseIndirect ? float3(.18f, .18f, .18f) : GBuffer.DiffuseColor;
        float3 SpecularColor = GBuffer.SpecularColor;

        #if DIM_APPLY_DIFFUSE_INDIRECT == 4
            RemapClearCoatDiffuseAndSpecularColor(GBuffer, NoV, DiffuseColor, SpecularColor);
        #endif

        #if DIM_APPLY_DIFFUSE_INDIRECT == 2 || DIM_APPLY_DIFFUSE_INDIRECT == 4
            float DiffuseIndirectAO = 1;
        #else
            float DiffuseIndirectAO = lerp(1, FinalAmbientOcclusion, ApplyAOToDynamicDiffuseIndirect);
        #endif

        FDirectLighting IndirectLighting;
        if (GBuffer.ShadingModelID == SHADINGMODELID_HAIR)
        {
            IndirectLighting.Diffuse = DiffuseIndirectLighting * GBuffer.BaseColor;
            IndirectLighting.Specular = 0;
        }
        else
        {
            IndirectLighting.Diffuse = DiffuseIndirectLighting * DiffuseColor * DiffuseIndirectAO;
            IndirectLighting.Transmission = 0;

            #if DIM_APPLY_DIFFUSE_INDIRECT == 4
                IndirectLighting.Specular = CombineRoughSpecular(GBuffer, NoV, SpecularIndirectLighting, RoughSpecularIndirectLighting, SpecularColor);
            #else
                IndirectLighting.Specular = SpecularIndirectLighting * EnvBRDF(SpecularColor, GBuffer.Roughness, NoV);
            #endif
        }

        const bool bNeedsSeparateSubsurfaceLightAccumulation = UseSubsurfaceProfile(GBuffer.ShadingModelID);

        if (bNeedsSeparateSubsurfaceLightAccumulation &&
            View.bSubsurfacePostprocessEnabled > 0 && View.bCheckerboardSubsurfaceProfileRendering > 0)
        {
            bool bChecker = CheckerFromSceneColorUV(SceneBufferUV);

            // Adjust for checkerboard. only apply non-diffuse lighting (including emissive) 
            // to the specular component, otherwise lighting is applied twice
            IndirectLighting.Specular *= !bChecker;
        }

        // Accumulate the lighting result.
        FLightAccumulator LightAccumulator = (FLightAccumulator)0;
        LightAccumulator_Add(
            LightAccumulator,
            IndirectLighting.Diffuse + IndirectLighting.Specular,
            IndirectLighting.Diffuse,
            1.0f,
            bNeedsSeparateSubsurfaceLightAccumulation);
        // Fetch the lighting result.
        OutAddColor = LightAccumulator_GetResult(LightAccumulator);
    }
    #else
    {
        OutAddColor = 0;
    }
    #endif

    OutMultiplyColor = FinalAmbientOcclusion;
}
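Stripped of the denoiser sampling, the core arithmetic of MainPS is small: an AO term that is disabled for unlit pixels, and an indirect diffuse term modulated by the surface diffuse color and AO. The following C++ sketch restates just those two formulas from the shader above. The struct FRgbSketch and the function names are hypothetical helpers introduced for illustration.

```cpp
#include <cassert>

struct FRgbSketch { float R, G, B; };

inline float LerpSketch(float A, float B, float T) { return A + (B - A) * T; }

// Mirrors: lerp(1.0f, GBufferAO * DynamicAO, AOMask * AmbientOcclusionStaticFraction),
// where AOMask is 0 for unlit pixels and 1 otherwise.
inline float FinalAmbientOcclusionSketch(float GBufferAO, float DynamicAO, bool bUnlit, float StaticFraction)
{
    const float AOMask = bUnlit ? 0.0f : 1.0f;
    return LerpSketch(1.0f, GBufferAO * DynamicAO, AOMask * StaticFraction);
}

// Mirrors: IndirectLighting.Diffuse = DiffuseIndirectLighting * DiffuseColor * DiffuseIndirectAO.
inline FRgbSketch IndirectDiffuseSketch(FRgbSketch DiffuseIndirect, FRgbSketch DiffuseColor, float DiffuseIndirectAO)
{
    return {
        DiffuseIndirect.R * DiffuseColor.R * DiffuseIndirectAO,
        DiffuseIndirect.G * DiffuseColor.G * DiffuseIndirectAO,
        DiffuseIndirect.B * DiffuseColor.B * DiffuseIndirectAO };
}
```

Note that in the shader DiffuseIndirectAO is forced to 1 when DIM_APPLY_DIFFUSE_INDIRECT is 2 or 4 (the latter being the permutation that reads the three indirect textures directly), so in those cases the AO only reaches the scene through OutMultiplyColor.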

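6.5.8 Lumen Summary

Lumen has many complex steps, but they can be summarized as follows:

1. Build the MeshCards and LumenCards and keep them updated.

2. Using the Card information of the Lumen scene, trace and update the corresponding texels.

3. In the diffuse and specular stages, trace and compute the lighting of screen-space surfaces using several methods.

4. Combine the indirect diffuse and specular results of the previous steps to obtain the final scene color with indirect lighting applied.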

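In addition, tracing involves several methods whose results are blended by weight (figure below).

Hybrid tracing diagram. Red is screen tracing, green is mesh distance field tracing, and blue is Voxel Lighting tracing. Color gradients mark the transitions between trace types.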
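The hand-off between trace types can be read from the screen probe tracing shader earlier in this section, where the distant scene result is added as `TraceResult.Lighting += DistantTraceResult.Lighting * TraceResult.Transparency` followed by `TraceResult.Transparency *= DistantTraceResult.Transparency`. The sketch below isolates that front-to-back accumulation with a scalar lighting value. The struct FTraceResultSketch and the function CompositeBehindSketch are hypothetical names used only for illustration.

```cpp
#include <cassert>

struct FTraceResultSketch
{
    float Lighting;     // radiance gathered so far (scalar for brevity)
    float Transparency; // fraction of the cone that is still unoccluded
};

// A farther trace only contributes through whatever the nearer traces left
// unoccluded, and it further reduces the remaining transparency.
inline void CompositeBehindSketch(FTraceResultSketch& Accumulated, const FTraceResultSketch& Farther)
{
    Accumulated.Lighting += Farther.Lighting * Accumulated.Transparency;
    Accumulated.Transparency *= Farther.Transparency;
}
```

For example, if the near trace gathered 1.0 with half of the cone still open, a farther trace carrying 2.0 adds 2.0 * 0.5, so the total becomes 2.0, and whatever transparency remains is what the sky light evaluation would later fill in.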

Setting DEBUG_VISUALIZE_TRACE_TYPES to 1 and turning off ShowFlag.DirectLighting from the console enables the trace-weight visualization mode:

// Engine\Shaders\Private\Lumen\LumenScreenProbeTracing.usf

#define DEBUG_VISUALIZE_TRACE_TYPES 1 // Enable trace-weight visualization (default is 0)

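Overall, Lumen combines tracing techniques such as SSGI, SDF (Mesh SDF and Global SDF), Lumen Card and Voxel Cone tracing, uses a variety of techniques to generate many kinds of data (adaptive Screen Space Probes, Irradiance Probes, Surface Cache, Prefilter Radiance, Voxel Lighting, RSM, Virtual Texture, Clipmap), computes the indirect diffuse and specular lighting, and finally blends the results by weight into the scene color.

Lumen diffuse GI supports both software and hardware paths. With default settings, the trace types used by the software path are: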

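Trace type | Range | Description
Screen Trace | Entire scene | Equivalent to SSGI; whenever a hit is found, its bounce information is used first.
Voxel Lighting Trace | Within 200 m of the camera | Cone-based ray tracing; samples MIPs to quickly obtain information at different hit distances.
Detail MeshCard Trace | 2 to 40 m | When sampling MeshCard lighting, occlusion is estimated probabilistically in a way similar to VSM.
Distant MeshCard Trace | 200 to 1000 m | Traces the pre-generated global distance field; occlusion estimation is no longer used.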

Lumen specular GI also supports software and hardware paths; the software path combines SSR with SDF tracing (Mesh SDF and Global SDF).

6.6 Other Rendering Techniques

6.6.1 Temporal Super Resolution

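Temporal Super Resolution (TSR) is a new-generation temporal anti-aliasing algorithm that replaces the traditional (UE4) TAA. It produces high-resolution output from low-resolution input with quality approaching native resolution, shows less ghosting and less flickering on high-frequency detail, and is optimized for platforms such as PS5, but it requires a graphics platform with SM5.0 or above.

TSR uses techniques similar to NVIDIA's DLSS and AMD's FidelityFX Super Resolution (FSR), except that DLSS accelerates its deep learning on Tensor Cores while TSR does not depend on them. In other words, TSR can run on GPUs from other vendors without requiring an RTX card. Because TSR produces a high-resolution image from a low-resolution input, it not only improves anti-aliasing but can also improve rendering performance and reduce power consumption.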
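Because the history (output) buffer has a higher resolution than the rendered input, every output pixel has to be located inside the input viewport. The update-history shader shown later in this section does this with PPCo, PPCk and dKO; the C++ sketch below reproduces that mapping on the CPU. The types FVec2Sketch and FPixelMappingSketch and the function MapOutputToInputSketch are hypothetical names, and the sizes in the usage note are made up for illustration.

```cpp
#include <cassert>
#include <cmath>

struct FVec2Sketch { float X, Y; };

struct FPixelMappingSketch
{
    FVec2Sketch PPCo; // center of output pixel O, in input pixel coordinates
    FVec2Sketch PPCk; // center of the nearest input pixel K
    FVec2Sketch dKO;  // vector from K to O, each component in [-0.5, 0.5)
};

inline FPixelMappingSketch MapOutputToInputSketch(
    int OutputX, int OutputY,
    FVec2Sketch OutputSize, FVec2Sketch InputSize, FVec2Sketch InputJitter)
{
    assert(OutputSize.X > 0.0f && OutputSize.Y > 0.0f);
    // ViewportUV = (HistoryPixelPos + 0.5) * HistoryInfo_ViewportSizeInverse
    const FVec2Sketch ViewportUV = { (OutputX + 0.5f) / OutputSize.X, (OutputY + 0.5f) / OutputSize.Y };

    FPixelMappingSketch Result;
    // PPCo = ViewportUV * InputInfo_ViewportSize + InputJitter
    Result.PPCo = { ViewportUV.X * InputSize.X + InputJitter.X, ViewportUV.Y * InputSize.Y + InputJitter.Y };
    // PPCk = floor(PPCo) + 0.5
    Result.PPCk = { std::floor(Result.PPCo.X) + 0.5f, std::floor(Result.PPCo.Y) + 0.5f };
    Result.dKO = { Result.PPCo.X - Result.PPCk.X, Result.PPCo.Y - Result.PPCk.Y };
    return Result;
}
```

With a 4x4 output, a 2x2 input and no jitter, output pixel (1, 1) lands at input coordinate 0.75, its nearest input pixel center is 0.5, and dKO is 0.25 on both axes. That offset is what the shader later feeds into the spatial sample weight.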

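Unlike UE4, in UE5 the post-processing stage goes through the TSR path regardless of which anti-aliasing method is selected, as long as TemporalAA is not explicitly disabled in the settings. The call stack is as follows: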

// Engine\Source\Runtime\Renderer\Private\PostProcess\PostProcessing.cpp

void AddPostProcessingPasses(FRDGBuilder& GraphBuilder, const FViewInfo& View, ...)
{
    (......)
    
    // TAA anti-aliasing.
    EMainTAAPassConfig TAAConfig = ITemporalUpscaler::GetMainTAAPassConfig(View);
    // The TAA configuration is not disabled.
    if (TAAConfig != EMainTAAPassConfig::Disabled)
    {
        (......)
        
        // Calls FDefaultTemporalUpscaler::AddPasses, analyzed below.
        UpscalerToUse->AddPasses(
            GraphBuilder,
            View,
            UpscalerPassInputs,
            &SceneColor.Texture,
            &SecondaryViewRect,
            &DownsampledSceneColor.Texture,
            &DownsampledSceneColor.ViewRect);
    }
    
    (......)
}

// Engine\Source\Runtime\Renderer\Private\PostProcess\TemporalAA.cpp

void FDefaultTemporalUpscaler::AddPasses(FRDGBuilder& GraphBuilder, const FViewInfo& View,...) const final
{
    // If Gen5 TAA is enabled and supported, take the TSR path.
    if (CVarTAAAlgorithm.GetValueOnRenderThread() && DoesPlatformSupportGen5TAA(View.GetShaderPlatform()))
    {
        *OutSceneColorHalfResTexture = nullptr;

        return AddTemporalSuperResolutionPasses(
            GraphBuilder,
            View,
            PassInputs,
            OutSceneColorTexture,
            OutSceneColorViewRect);
    }
    (......)
}

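This leads into AddTemporalSuperResolutionPasses. Below is the TSR rendering process as captured with RenderDoc:

As shown, TSR has many more passes than UE4's TAA. The main stages are: clearing the previous frame's textures, dilating the velocity buffer, rejecting invalid velocities, filtering frequencies, comparing against the history, post-filtering the reprojection, dilating the reprojection, and updating the history.

The most important of these stages is the history update. It takes the input scene color, depth, dilated velocity, parallax factor and history data (dilated reprojection, reprojection, high frequency, low frequency, metadata, subpixel information) and produces the final anti-aliased scene color together with the history data for the current frame.

Left: scene color input; right: scene color output after TSR.

History data output by TSR: low frequency, high frequency, metadata and subpixel information.

Let's go straight to the compute shader used by the history update stage: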
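A recurring building block of the history update is clamping a reprojected history color to the range spanned by the current frame's neighborhood, which is what the shader's InputMinColor and InputMaxColor are used for. The C++ sketch below shows the idea with the standard YCoCg transform. Whether TransformColorForClampingBox uses exactly this transform is an assumption, and FRgbSketch, FYCoCgSketch and ClampHistoryToNeighborhoodSketch are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

struct FRgbSketch { float R, G, B; };
struct FYCoCgSketch { float Y, Co, Cg; };

inline FYCoCgSketch RgbToYCoCgSketch(FRgbSketch C)
{
    return { 0.25f * C.R + 0.5f * C.G + 0.25f * C.B,
             0.5f * C.R - 0.5f * C.B,
            -0.25f * C.R + 0.5f * C.G - 0.25f * C.B };
}

inline FRgbSketch YCoCgToRgbSketch(FYCoCgSketch C)
{
    return { C.Y + C.Co - C.Cg, C.Y + C.Cg, C.Y - C.Co - C.Cg };
}

inline float ClampSketch(float V, float Lo, float Hi) { return std::min(std::max(V, Lo), Hi); }

// Builds a per-channel min/max box from the current-frame samples and clamps
// the history color into it, so stale colors cannot survive as ghosting.
inline FRgbSketch ClampHistoryToNeighborhoodSketch(FRgbSketch History, const std::vector<FRgbSketch>& Neighborhood)
{
    assert(!Neighborhood.empty());
    FYCoCgSketch Min = RgbToYCoCgSketch(Neighborhood[0]);
    FYCoCgSketch Max = Min;
    for (const FRgbSketch& Sample : Neighborhood)
    {
        const FYCoCgSketch S = RgbToYCoCgSketch(Sample);
        Min = { std::min(Min.Y, S.Y), std::min(Min.Co, S.Co), std::min(Min.Cg, S.Cg) };
        Max = { std::max(Max.Y, S.Y), std::max(Max.Co, S.Co), std::max(Max.Cg, S.Cg) };
    }
    FYCoCgSketch H = RgbToYCoCgSketch(History);
    H = { ClampSketch(H.Y, Min.Y, Max.Y), ClampSketch(H.Co, Min.Co, Max.Co), ClampSketch(H.Cg, Min.Cg, Max.Cg) };
    return YCoCgToRgbSketch(H);
}
```

A history color that already lies inside the box passes through unchanged, while one that lies outside is pulled to the nearest face of the box; the shader additionally measures how far it had to be pulled (MeasureRejectionFactor) to decide how much history to reject.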

// /Engine/Private/TemporalAA/TAAUpdateHistory.usf

[numthreads(TILE_SIZE, TILE_SIZE, 1)]
void MainCS(
    uint2 GroupId : SV_GroupID,
    uint GroupThreadIndex : SV_GroupIndex)
{
    uint GroupWaveIndex = GetGroupWaveIndex(GroupThreadIndex, /* GroupSize = */ TILE_SIZE * TILE_SIZE);

    float4 Debug = 0.0;

    // History pixel position.
    taa_short2 HistoryPixelPos = (
        taa_short2(GroupId) * taa_short2(TILE_SIZE, TILE_SIZE) +
        Map8x8Tile2x2Lane(GroupThreadIndex));

    float2 ViewportUV = (float2(HistoryPixelPos) + 0.5f) * HistoryInfo_ViewportSizeInverse;
    float2 ScreenPos = ViewportUVToScreenPos(ViewportUV);
    
    // Pixel coordinate, in the input viewport, of the center of output pixel O.
    float2 PPCo = ViewportUV * InputInfo_ViewportSize + InputJitter;

    // Pixel coordinate of the center of the nearest input pixel K.
    float2 PPCk = floor(PPCo) + 0.5;
    
    taa_short2 InputPixelPos = ClampPixelOffset(
        taa_short2(InputPixelPosMin) + taa_short2(PPCo),
        InputPixelPosMin, InputPixelPosMax);

    // Fetch the reprojection-related information.
    float2 PrevScreenPos = ScreenPos;
    taa_half ParallaxRejectionMask = taa_half(1.0);
    taa_half LowFrequencyRejection = taa_half(1.0);
    taa_half OutputPixelVelocity = taa_half(0.0);
    #if 1
    {
        float2 EncodedVelocity = DilatedVelocityTexture[InputPixelPos];
        ParallaxRejectionMask = ParallaxRejectionMaskTexture[InputPixelPos];

        float2 ScreenVelocity = DecodeVelocityFromTexture(float4(EncodedVelocity, 0.0, 0.0)).xy;

        PrevScreenPos = ScreenPos - ScreenVelocity;
        OutputPixelVelocity = taa_half(length(ScreenVelocity * HistoryInfo_ViewportSize));

        taa_ushort2 RejectionPixelPos = (taa_ushort2(InputPixelPos) - taa_short2(InputPixelPosMin)) / 2;
        LowFrequencyRejection = HistoryRejectionTexture[RejectionPixelPos];
        
        #if !CONFIG_CLAMP
        {
            ParallaxRejectionMask = taa_half(1.0);
            LowFrequencyRejection = taa_half(1.0);
        }
        #endif
    }
    #endif

    // Check whether the pixel uses responsive AA.
    bool bIsResponsiveAAPixel = false;
    #if CONFIG_RESPONSIVE_STENCIL
    {
        const uint kResponsiveStencilMask = 1 << 3;
            
        uint SceneStencilRef = InputSceneStencilTexture.Load(int3(InputPixelPos, 0)) STENCIL_COMPONENT_SWIZZLE;

        bIsResponsiveAAPixel = (SceneStencilRef & kResponsiveStencilMask) != 0;
    }
    #endif
    
    // Check whether HistoryBufferUV is outside the viewport.
    bool bOffScreen = IsOffScreen(bCameraCut, PrevScreenPos, ParallaxRejectionMask);
    
    taa_half TotalRejection = bOffScreen ? 0.0 : saturate(LowFrequencyRejection * 4.0);


    // Filter the input scene color at the predicted frequency.
    taa_half3 FilteredInputColor;
    taa_half3 InputMinColor;
    taa_half3 InputMaxColor;
    taa_half InputPixelAlignement;
    taa_half ClosestInputLuma4;
    
    ISOLATE
    {
        // Vector from pixel K to pixel O.
        taa_half2 dKO = taa_half2(PPCo - PPCk);

        FilteredInputColor = taa_half(0.0);

        taa_half FilteredInputColorWeight = taa_half(0.0);
        
        #if 0 // shader compiler bug :'(
            taa_half InputToHistoryFactor = taa_half(HistoryInfo_ViewportSize.x * InputInfo_ViewportSizeInverse.x);
            taa_half FinalInputToHistoryFactor = bOffScreen ? taa_half(1.0) : InputToHistoryFactor;
        #else
            float InputToHistoryFactor = float(HistoryInfo_ViewportSize.x * InputInfo_ViewportSizeInverse.x);
            float FinalInputToHistoryFactor = lerp(1.0, InputToHistoryFactor, TotalRejection);
        #endif

        InputMinColor = taa_half(INFINITE_FLOAT);
        InputMaxColor = taa_half(-INFINITE_FLOAT);

        // Generate the sample coordinates in different ways depending on CONFIG_SAMPLES, then sample the input scene color.
        UNROLL_N(CONFIG_SAMPLES)
        for (uint SampleId = 0; SampleId < CONFIG_SAMPLES; SampleId++)
        {
            taa_short2 SampleInputPixelPos;
            taa_half2 PixelOffset;
            
            #if CONFIG_SAMPLES == 9
            {
                taa_short2 iPixelOffset = taa_short2(kOffsets3x3[kSquareIndexes3x3[SampleId]]);
                PixelOffset = taa_half2(iPixelOffset);
                
                SampleInputPixelPos = AddAndClampPixelOffset(
                    InputPixelPos,
                    iPixelOffset, iPixelOffset,
                    InputPixelPosMin, InputPixelPosMax);
            }
            #elif CONFIG_SAMPLES == 5 || CONFIG_SAMPLES == 6
            {
                if (SampleId == 5)
                {
                    taa_short2 iPixelOffset;
                    #if CONFIG_COMPILE_FP16
                        iPixelOffset = int16_t2(1, 1) - int16_t2((asuint16(dKO) & uint16_t(0x8000)) >> uint16_t(14));
                        PixelOffset = asfloat16(asuint16(half(1.0)).xx | (asuint16(dKO) & uint16_t(0x8000)));
                    #else
                        iPixelOffset = SignFastInt(dKO);
                        PixelOffset = asfloat(asuint(1.0).xx | (asuint(dKO) & uint(0x80000000)));
                    #endif
                        
                    SampleInputPixelPos = ClampPixelOffset(InputPixelPos, InputPixelPosMin, InputPixelPosMax);
                }
                else
                {
                    taa_short2 iPixelOffset = taa_short2(kOffsets3x3[kPlusIndexes3x3[SampleId]]);
                    PixelOffset = taa_half2(iPixelOffset);
                    
                    SampleInputPixelPos = AddAndClampPixelOffset(
                        InputPixelPos,
                        iPixelOffset, iPixelOffset,
                        InputPixelPosMin, InputPixelPosMax);
                }
            }
            #else
                #error Unknown sample count
            #endif

            taa_half3 InputColor = InputSceneColorTexture[SampleInputPixelPos];

            taa_half2 dPP = PixelOffset - dKO;
            taa_half SampleSpatialWeight = ComputeSampleWeigth(FinalInputToHistoryFactor, dPP, /* MinimalContribution = */ float(0.005));

            taa_half ToneWeight = HdrWeight4(InputColor);

            FilteredInputColor       += (SampleSpatialWeight * ToneWeight) * InputColor;
            FilteredInputColorWeight += (SampleSpatialWeight * ToneWeight);

            if (SampleId == 0)
            {
                ClosestInputLuma4 = Luma4(InputColor);
                InputMinColor = TransformColorForClampingBox(InputColor);
                InputMaxColor = TransformColorForClampingBox(InputColor);
            }
            else
            {
                InputMinColor = min(InputMinColor, TransformColorForClampingBox(InputColor));
                InputMaxColor = max(InputMaxColor, TransformColorForClampingBox(InputColor));
            }
        }
        
        FilteredInputColor *= rcp(FilteredInputColorWeight);

        InputPixelAlignement = ComputeSampleWeigth(InputToHistoryFactor, dKO, /* MinimalContribution = */ float(0.0));
    }
        
    // Spill to LDS to free up VGPRs for sampling the history.
    #if CONFIG_MANUAL_LDS_SPILL
    ISOLATE
    {
        uint LocalGroupThreadIndex = GetGroupThreadIndex(GroupThreadIndex, GroupWaveIndex);

        SharedArray0[LocalGroupThreadIndex] = taa_half4(FilteredInputColor, LowFrequencyRejection);
        SharedArray1[LocalGroupThreadIndex] = taa_half4(InputMinColor, InputPixelAlignement);
        SharedArray2[LocalGroupThreadIndex] = taa_half4(InputMaxColor, OutputPixelVelocity);
    }
    #endif
    
    // Reproject the history data.
    taa_half3 PrevHistoryMoment1;
    taa_half PrevHistoryValidity;
    
    taa_half3 PrevHistoryMommentMin;
    taa_half3 PrevHistoryMommentMax;

    taa_half3 PrevFallbackColor;
    taa_half PrevFallbackWeight;
    
    taa_subpixel_details PrevSubpixelDetails;

    ISOLATE
    {
        // Reproject the history data.
        taa_half3 RawHistory0 = taa_half(0);
        taa_half3 RawHistory1 = taa_half(0);
        taa_half2 RawHistory2 = taa_half(0);

        taa_half3 RawHistory1Min = INFINITE_FLOAT;
        taa_half3 RawHistory1Max = -INFINITE_FLOAT;

        // Sample the raw history data.
        {
            float2 PrevHistoryBufferUV = (PrevHistoryInfo_ScreenPosToViewportScale * PrevScreenPos + PrevHistoryInfo_ScreenPosToViewportBias) * PrevHistoryInfo_ExtentInverse;
            PrevHistoryBufferUV = clamp(PrevHistoryBufferUV, PrevHistoryInfo_UVViewportBilinearMin, PrevHistoryInfo_UVViewportBilinearMax);

            #if 1
            {
                FCatmullRomSamples Samples = GetBicubic2DCatmullRomSamples(PrevHistoryBufferUV, PrevHistoryInfo_Extent, PrevHistoryInfo_ExtentInverse);

                UNROLL
                for (uint i = 0; i < Samples.Count; i++)
                {
                    float2 SampleUV = clamp(Samples.UV[i], PrevHistoryInfo_UVViewportBilinearMin, PrevHistoryInfo_UVViewportBilinearMax);

                    taa_half3 Sample0 = PrevHistory_Textures_0.SampleLevel(GlobalBilinearClampedSampler, SampleUV, 0);
                    taa_half3 Sample1 = PrevHistory_Textures_1.SampleLevel(GlobalBilinearClampedSampler, SampleUV, 0);
                    taa_half2 Sample2 = PrevHistory_Textures_2.SampleLevel(GlobalBilinearClampedSampler, SampleUV, 0);

                    RawHistory1Min = min(RawHistory1Min, Sample1 * SafeRcp(Sample2.g));
                    RawHistory1Max = max(RawHistory1Max, Sample1 * SafeRcp(Sample2.g));

                    RawHistory0 += Sample0 * taa_half(Samples.Weight[i]);
                    RawHistory1 += Sample1 * taa_half(Samples.Weight[i]);
                    RawHistory2 += Sample2 * taa_half(Samples.Weight[i]);
                }
                RawHistory0 *= taa_half(Samples.FinalMultiplier);
                RawHistory1 *= taa_half(Samples.FinalMultiplier);
                RawHistory2 *= taa_half(Samples.FinalMultiplier);
            }
            #else
            {
                RawHistory0 = PrevHistory_Textures_0.SampleLevel(GlobalBilinearClampedSampler, PrevHistoryBufferUV, 0);
                RawHistory1 = PrevHistory_Textures_1.SampleLevel(GlobalBilinearClampedSampler, PrevHistoryBufferUV, 0);
                RawHistory2 = PrevHistory_Textures_2.SampleLevel(GlobalBilinearClampedSampler, PrevHistoryBufferUV, 0);
            }
            #endif
            
            FSubpixelNeighborhood SubpixelNeighborhood = GatherPrevSubpixelNeighborhood(PrevHistory_Textures_3, PrevHistoryBufferUV);
            {
                PrevSubpixelDetails = 0;
                UNROLL_N(SUB_PIXEL_COUNT)
                for (uint SubpixelId = 0; SubpixelId < SUB_PIXEL_COUNT; SubpixelId++)
                {
                    taa_subpixel_payload SubpixelPayload = GetSubpixelPayload(SubpixelNeighborhood, SubpixelId);
                    PrevSubpixelDetails |= SubpixelPayload << (SUB_PIXEL_BIT_COUNT * SubpixelId);
                }
            }

            RawHistory0 = -min(-RawHistory0, taa_half(0.0));
            RawHistory1 = -min(-RawHistory1, taa_half(0.0));
            RawHistory2 = -min(-RawHistory2, taa_half(0.0));
        }
        
        // Unpack the history data.
        {
            PrevFallbackColor = RawHistory0;
            PrevFallbackWeight = RawHistory2.r;
            
            PrevHistoryMommentMin = RawHistory1Min;
            PrevHistoryMommentMax = RawHistory1Max;

            PrevHistoryMoment1 = RawHistory1;
            PrevHistoryValidity = RawHistory2.g;
        }

        // Correct the history data.
        {
            PrevHistoryMommentMin *= taa_half(HistoryPreExposureCorrection);
            PrevHistoryMommentMax *= taa_half(HistoryPreExposureCorrection);
            PrevHistoryMoment1 *= taa_half(HistoryPreExposureCorrection);
            PrevFallbackColor *= taa_half(HistoryPreExposureCorrection);
        }
    }
    
    // Read the data back from LDS.
    #if CONFIG_MANUAL_LDS_SPILL
    ISOLATE
    {
        uint LocalGroupThreadIndex = GetGroupThreadIndex(GroupThreadIndex, GroupWaveIndex);

        taa_half4 RawLDS0 = SharedArray0[LocalGroupThreadIndex];
        taa_half4 RawLDS1 = SharedArray1[LocalGroupThreadIndex];
        taa_half4 RawLDS2 = SharedArray2[LocalGroupThreadIndex];

        FilteredInputColor = RawLDS0.rgb;
        InputMinColor = RawLDS1.rgb;
        InputMaxColor = RawLDS2.rgb;
        
        LowFrequencyRejection = RawLDS0.a;
        InputPixelAlignement = RawLDS1.a;
        OutputPixelVelocity = RawLDS2.a;
    }
    #endif

    // If the current low frequency drifts away from the history low frequency, reject the high-frequency details.
    #if CONFIG_LOW_FREQUENCY_DRIFT_REJECTION
    {
        taa_half3 PrevHighFrequencyYCoCg = TransformColorForClampingBox(PrevHistoryMoment1 * SafeRcp(PrevHistoryValidity));
        taa_half3 PrevYCoCg = TransformColorForClampingBox(PrevFallbackColor);
        taa_half3 ClampedPrevYCoCg = TransformColorForClampingBox(clamp(PrevFallbackColor, PrevHistoryMommentMin, PrevHistoryMommentMax));

        taa_half HighFrequencyRejection = MeasureRejectionFactor(
            PrevYCoCg, ClampedPrevYCoCg,
            PrevHighFrequencyYCoCg, InputMinColor, InputMaxColor);
        
        PrevHistoryMoment1 *= HighFrequencyRejection;
        PrevHistoryValidity *= HighFrequencyRejection;
    }
    #endif

    // Feed the current frame's input into the predictor for the next frame.
    const taa_half Histeresis = rcp(taa_half(MAX_SAMPLE_COUNT));
    const taa_half PredictionOnlyValidity = Histeresis * taa_half(2.0);
    
    // Clamp the fallback data.
    taa_half LumaMin;
    taa_half LumaMax;
    taa_half3 ClampedFallbackColor;
    taa_half FallbackRejection;
    {
        LumaMin = InputMinColor.x;
        LumaMax = InputMaxColor.x;

        taa_half3 PrevYCoCg = TransformColorForClampingBox(PrevFallbackColor);
        taa_half3 ClampedPrevYCoCg = clamp(PrevYCoCg, InputMinColor, InputMaxColor);
        taa_half3 InputCenterYCoCg = TransformColorForClampingBox(FilteredInputColor);

        ClampedFallbackColor = YCoCgToRGB(ClampedPrevYCoCg);
        
        FallbackRejection = MeasureRejectionFactor(
            PrevYCoCg, ClampedPrevYCoCg,
            InputCenterYCoCg, InputMinColor, InputMaxColor);

        #if !CONFIG_CLAMP
        {
            ClampedFallbackColor = PrevFallbackColor;
            FallbackRejection = taa_half(1.0);
        }
        #endif
    }

    taa_half3 FinalHistoryMoment1;
    taa_half FinalHistoryValidity;
    {
        // Compute how much of the history must be rejected, based on completeness.
        taa_half PrevHistoryRejectionWeight = LowFrequencyRejection;
            
        FLATTEN
        if (bOffScreen)
        {
            PrevHistoryRejectionWeight = taa_half(0.0);
        }

        taa_half DesiredCurrentContribution = max(Histeresis * InputPixelAlignement, taa_half(0.0));

        // Decide whether the prediction-based rejection is confident enough.
        taa_half RejectionConfidentEnough = taa_half(1); // saturate(RejectionValidity * MAX_SAMPLE_COUNT - 3.0);

        // Compute the validity that remains after rejection.
        taa_half RejectedValidity = (
            min(PrevHistoryValidity, PredictionOnlyValidity - DesiredCurrentContribution) +
            max(PrevHistoryValidity - PredictionOnlyValidity + DesiredCurrentContribution, taa_half(0.0)) * PrevHistoryRejectionWeight);

        RejectedValidity = PrevHistoryValidity * PrevHistoryRejectionWeight;

        // Compute the maximum output validity.
        taa_half OutputValidity = (
            clamp(RejectedValidity + DesiredCurrentContribution, taa_half(0.0), PredictionOnlyValidity) +
            clamp(RejectedValidity + DesiredCurrentContribution * PrevHistoryRejectionWeight * RejectionConfidentEnough - PredictionOnlyValidity, 0.0, 1.0 - PredictionOnlyValidity));

        FLATTEN
        if (bIsResponsiveAAPixel)
        {
            OutputValidity = taa_half(0.0);
        }
        
        taa_half InvPrevHistoryValidity = SafeRcp(PrevHistoryValidity);

        taa_half PrevMomentWeight = max(OutputValidity - DesiredCurrentContribution, taa_half(0.0));
        taa_half CurrentMomentWeight = min(DesiredCurrentContribution, OutputValidity);
        
        {
            taa_half PrevHistoryToneWeight = HdrWeightY(Luma4(PrevHistoryMoment1) * InvPrevHistoryValidity);
            taa_half FilteredInputToneWeight = HdrWeight4(FilteredInputColor);
            
            taa_half BlendPrevHistory = PrevMomentWeight * PrevHistoryToneWeight;
            taa_half BlendFilteredInput = CurrentMomentWeight * FilteredInputToneWeight;

            taa_half CommonWeight = OutputValidity * SafeRcp(BlendPrevHistory + BlendFilteredInput);

            FinalHistoryMoment1 = (
                PrevHistoryMoment1 * (CommonWeight * BlendPrevHistory * InvPrevHistoryValidity) +
                FilteredInputColor * (CommonWeight * BlendFilteredInput));
        }

        // Adjust for the 8-bit quantization of the validity to avoid numerical drift.
        taa_half OutputInvValidity = SafeRcp(OutputValidity);
        FinalHistoryValidity = ceil(taa_half(255.0) * OutputValidity) * rcp(taa_half(255.0));
        FinalHistoryMoment1 *= FinalHistoryValidity * OutputInvValidity;
    }

    // Compute the fallback history.
    taa_half3 FinalFallbackColor;
    taa_half FinalFallbackWeight;
    {
        const taa_half TargetHesteresisCurrentFrameWeight = rcp(taa_half(MAX_FALLBACK_SAMPLE_COUNT));

        taa_half LumaHistory = Luma4(PrevFallbackColor);
        taa_half LumaFiltered = Luma4(FilteredInputColor);

        {
            taa_half OutputBlend = ComputeFallbackContribution(FinalHistoryValidity);
        }

        taa_half BlendFinal;
        #if 1
        {
            taa_half CurrentFrameSampleCount = max(InputPixelAlignement, taa_half(0.005));
            
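            // Use a single sample count so history rejection recovers extremely fast, then stabilize immediately so sub-pixel frequencies become usable as soon as possible.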
            taa_half PrevFallbackSampleCount;
            FLATTEN
            if (PrevFallbackWeight < taa_half(1.0))
            {
                PrevFallbackSampleCount = PrevFallbackWeight;
            }
            else
            {
                PrevFallbackSampleCount = taa_half(MAX_FALLBACK_SAMPLE_COUNT);
            }

            // Reject the history based on the low frequency.
            #if 1
            {
                taa_half PrevFallbackRejectionFactor = saturate(LowFrequencyRejection * (CurrentFrameSampleCount + PrevFallbackSampleCount) / PrevFallbackSampleCount);

                PrevFallbackSampleCount *= PrevFallbackRejectionFactor;
            }
            #endif

            BlendFinal = CurrentFrameSampleCount / (CurrentFrameSampleCount + PrevFallbackSampleCount);

            // Increase the blend weight under motion.
            #if 1
            {
                BlendFinal = lerp(BlendFinal, max(taa_half(0.2), BlendFinal), saturate(OutputPixelVelocity * rcp(taa_half(40.0))));
            }
            #endif

            // Anti-flicker.
            #if 1
            {
                taa_half DistToClamp = min( abs(LumaHistory - LumaMin), abs(LumaHistory - LumaMax) ) / max3( LumaHistory, LumaFiltered, taa_half(1e-4) );
                BlendFinal *= taa_half(0.2) + taa_half(0.8) * saturate(taa_half(0.5) * DistToClamp);
            }
            #endif
            
            // Make sure there is always at least a small contribution.
            #if 1
            {
                BlendFinal = max( BlendFinal, saturate( taa_half(0.01) * LumaHistory / abs( LumaFiltered - LumaHistory ) ) );
            }
            #endif

            // Responsive AA pixels force a 1/4 contribution from the new frame.
            BlendFinal = bIsResponsiveAAPixel ? taa_half(1.0/4.0) : BlendFinal;

            // Reject the history completely.
            {
                PrevFallbackSampleCount *= TotalRejection;
                BlendFinal = lerp(1.0, BlendFinal, TotalRejection);
            }

            FinalFallbackWeight = saturate(CurrentFrameSampleCount + PrevFallbackSampleCount);
            
            #if 1
                FinalFallbackWeight = saturate(floor(255.0 * (CurrentFrameSampleCount + PrevFallbackSampleCount)) * rcp(255.0));
            #endif
        }
        #endif

        {
            taa_half FilterWeight = HdrWeight4(FilteredInputColor);
            taa_half ClampedHistoryWeight = HdrWeight4(ClampedFallbackColor);

            taa_half2 Weights = WeightedLerpFactors(ClampedHistoryWeight, FilterWeight, BlendFinal);

            FinalFallbackColor = ClampedFallbackColor * Weights.x + FilteredInputColor * Weights.y;
        }
    }

    // Update the subpixel details.
    taa_subpixel_details FinalSubpixelDetails;
    {
        taa_half2 dKO = taa_half2(PPCo - PPCk);

        bool bUpdate = all(abs(dKO) < 0.5 * (InputInfo_ViewportSize.x * HistoryInfo_ViewportSizeInverse.x));

        FinalSubpixelDetails = PrevSubpixelDetails;

        taa_subpixel_payload ParallaxFactorBits = ParallaxFactorTexture[InputPixelPos] & SUB_PIXEL_PARALLAX_FACTOR_BIT_MASK;

        {
            const uint ParallaxFactorMask = (
                (SUB_PIXEL_PARALLAX_FACTOR_BIT_MASK << (SUB_PIXEL_PARALLAX_FACTOR_BIT_OFFSET + 0 * SUB_PIXEL_BIT_COUNT)) | 
                (SUB_PIXEL_PARALLAX_FACTOR_BIT_MASK << (SUB_PIXEL_PARALLAX_FACTOR_BIT_OFFSET + 1 * SUB_PIXEL_BIT_COUNT)) | 
                (SUB_PIXEL_PARALLAX_FACTOR_BIT_MASK << (SUB_PIXEL_PARALLAX_FACTOR_BIT_OFFSET + 2 * SUB_PIXEL_BIT_COUNT)) | 
                (SUB_PIXEL_PARALLAX_FACTOR_BIT_MASK << (SUB_PIXEL_PARALLAX_FACTOR_BIT_OFFSET + 3 * SUB_PIXEL_BIT_COUNT)) | 
                0x0);
            
            // Reset the parallax factors.
            FLATTEN
            if (bOffScreen)
            {
                FinalSubpixelDetails = FinalSubpixelDetails & ~ParallaxFactorMask;
            }
        }

        FLATTEN
        if (bUpdate)
        {
            bool2 bBool = dKO < 0.0;

            uint SubpixelId = dot(uint2(bBool), uint2(1, SUB_PIXEL_GRID_SIZE));
            uint SubpixelShift = SubpixelId * SUB_PIXEL_BIT_COUNT;

            taa_subpixel_payload SubpixelPayload = (ParallaxFactorBits << SUB_PIXEL_PARALLAX_FACTOR_BIT_OFFSET);

            FinalSubpixelDetails = (FinalSubpixelDetails & (~(SUB_PIXEL_BIT_MASK << SubpixelShift))) | (SubpixelPayload << SubpixelShift);
        }
    }

    // Compute the final output.
    taa_half3 FinalOutputColor;
    taa_half FinalOutputValidity;
    {
        taa_half OutputBlend = ComputeFallbackContribution(FinalHistoryValidity);

        FinalOutputValidity = lerp(taa_half(1.0), saturate(FinalHistoryValidity), OutputBlend);

        taa_half3 NormalizedFinalHistoryMoment1 = taa_half3(FinalHistoryMoment1 * float(SafeRcp(FinalHistoryValidity)));

        taa_half FallbackWeight = HdrWeight4(FinalFallbackColor);
        taa_half Moment1Weight = HdrWeight4(NormalizedFinalHistoryMoment1);

        taa_half2 Weights = WeightedLerpFactors(FallbackWeight, Moment1Weight, OutputBlend);

        #if DEBUG_FALLBACK_BLENDING
            taa_half3 FallbackColor = taa_half3(1, 0.25, 0.25);
            taa_half3 HighFrequencyColor = taa_half3(0.25, 1, 0.25);

            FinalOutputColor = FinalFallbackColor * Weights.x * FallbackColor + NormalizedFinalHistoryMoment1 * Weights.y * HighFrequencyColor;
        #elif DEBUG_LOW_FREQUENCY_REJECTION
            taa_half3 DebugColor = lerp(taa_half3(1, 0.5, 0.5), taa_half3(0.5, 1, 0.5), LowFrequencyRejection);
            
            FinalOutputColor = FinalFallbackColor * Weights.x * DebugColor + NormalizedFinalHistoryMoment1 * Weights.y * DebugColor;
        #else
            FinalOutputColor = FinalFallbackColor * Weights.x + NormalizedFinalHistoryMoment1 * Weights.y;
        #endif
    }

    ISOLATE
    {
        uint LocalGroupThreadIndex = GetGroupThreadIndex(GroupThreadIndex, GroupWaveIndex);

        taa_short2 LocalHistoryPixelPos = (
            taa_short2(GroupId) * taa_short2(TILE_SIZE, TILE_SIZE) +
            Map8x8Tile2x2Lane(LocalGroupThreadIndex));
            
        LocalHistoryPixelPos = InvalidateOutputPixelPos(LocalHistoryPixelPos, HistoryInfo_ViewportMax);

        // Write out the final history.
        {
            #if CONFIG_ENABLE_STOCASTIC_QUANTIZATION
            {
                uint2 Random = Rand3DPCG16(int3(LocalHistoryPixelPos, View.StateFrameIndexMod8)).xy;
                float2 E = Hammersley16(0, 1, Random);

                FinalHistoryMoment1 = QuantizeForFloatRenderTarget(FinalHistoryMoment1, E.x, HistoryQuantizationError);
                FinalFallbackColor = QuantizeForFloatRenderTarget(FinalFallbackColor, E.x, HistoryQuantizationError);
            }
            #endif

            FinalFallbackColor = -min(-FinalFallbackColor, taa_half(0.0));
            FinalHistoryMoment1 = -min(-FinalHistoryMoment1, taa_half(0.0));
            FinalFallbackColor = min(FinalFallbackColor, taa_half(Max10BitsFloat));
            FinalHistoryMoment1 = min(FinalHistoryMoment1, taa_half(Max10BitsFloat));
            
            HistoryOutput_Textures_0[LocalHistoryPixelPos] = FinalFallbackColor;
            HistoryOutput_Textures_1[LocalHistoryPixelPos] = FinalHistoryMoment1;
            HistoryOutput_Textures_2[LocalHistoryPixelPos] = taa_half2(FinalFallbackWeight, FinalHistoryValidity);
            HistoryOutput_Textures_3[LocalHistoryPixelPos] = FinalSubpixelDetails;

            #if DEBUG_OUTPUT
            {
                DebugOutput[LocalHistoryPixelPos] = Debug;
            }
            #endif
        }

        // Write out the final scene color.
        {
            taa_half3 OutputColor = FinalOutputColor;
                
            OutputColor = -min(-OutputColor, taa_half(0.0));
            OutputColor = min(OutputColor, taa_half(Max10BitsFloat));

            SceneColorOutput[LocalHistoryPixelPos] = OutputColor;
        }
    }
}
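
The 8-bit validity adjustment near the end of the history update is easy to miss, so here is a minimal CPU-side C++ sketch of that one step. It is not engine code: `QuantizedHistory` and `QuantizeValidity` are hypothetical names, only one color channel is shown, and `SafeRcp` is assumed to return 0 for non-positive input. The point is that validity is rounded up to the next multiple of 1/255, and the validity-weighted moment is rescaled by the same ratio so that the normalized color (moment divided by validity) is unchanged.

```cpp
#include <cassert>
#include <cmath>

// Assumed behaviour of the shader's SafeRcp: reciprocal that yields 0 for non-positive input.
inline float SafeRcp(float x) { return x > 0.0f ? 1.0f / x : 0.0f; }

// Hypothetical container for one channel of the history (not an engine type).
struct QuantizedHistory
{
    float Moment1;   // validity-weighted color moment (single channel here)
    float Validity;  // validity snapped to an 8-bit step
};

// Round the validity up to the next multiple of 1/255 and scale the moment by
// the same ratio, so Moment1 / Validity stays the same after quantization.
inline QuantizedHistory QuantizeValidity(float moment1, float outputValidity)
{
    assert(outputValidity >= 0.0f && outputValidity <= 1.0f);
    const float quantized = std::ceil(255.0f * outputValidity) / 255.0f;
    return { moment1 * quantized * SafeRcp(outputValidity), quantized };
}
```

For example, a validity of 0.5 is stored as 128/255, and a moment of 0.3 is scaled up slightly so that the normalized value remains 0.6.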

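From this we can see that, compared with traditional TAA, TSR tracks much more data, including current and historical high-frequency and low-frequency signals, parallax factors, and reprojection data. It uses this information to reject or recover history step by step, derives the blend weights for the current frame, and finally produces the anti-aliased scene color together with the history data for the next frame.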
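
To make the "reject history, then derive a blend weight" idea concrete, here is a simplified C++ sketch of the sample-count blend used for the fallback history in the listing. `FallbackBlend` and `Saturate` are hypothetical helper names, and the motion, anti-flicker, and responsive-AA adjustments are deliberately left out, so treat it as an illustration under those assumptions rather than the shader's full logic.

```cpp
#include <algorithm>
#include <cassert>

// Clamp to [0, 1], like the HLSL saturate intrinsic.
inline float Saturate(float x) { return std::min(std::max(x, 0.0f), 1.0f); }

// currentSamples:        contribution of the current frame (InputPixelAlignement in the listing)
// prevSamples:           sample count accumulated in the fallback history
// lowFrequencyRejection: 1 keeps the history, 0 discards it
// Returns the weight of the current frame in the blend.
inline float FallbackBlend(float currentSamples, float prevSamples, float lowFrequencyRejection)
{
    assert(prevSamples > 0.0f);

    // Same floor as the listing, so the current frame never contributes nothing.
    currentSamples = std::max(currentSamples, 0.005f);

    // Shrink the history sample count according to the low-frequency rejection.
    const float rejection = Saturate(
        lowFrequencyRejection * (currentSamples + prevSamples) / prevSamples);
    prevSamples *= rejection;

    // Running-average weight: one new sample against the surviving history samples.
    return currentSamples / (currentSamples + prevSamples);
}
```

With 7 history samples and no rejection, the current frame gets a weight of 1/8. A rejection factor of 0.5 leaves 4 history samples and raises the weight to 1/5, and a factor of 0 makes the current frame replace the history entirely.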

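The code above is only the final stage of TSR, the one that updates the history. Many earlier steps generate the data this stage consumes; this article does not analyze them and leaves them for readers to study on their own.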
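
The listing also relies on helpers such as `HdrWeight4` and `WeightedLerpFactors` whose definitions are not shown. The C++ sketch below illustrates the general technique of tone-weighted blending. The luma formula (2G + R + B) and the weight 1 / (luma + 4) are assumptions modeled on UE's older TAA shaders and may not match TSR's exact definitions; `Color` and `ToneWeightedLerp` are hypothetical names.

```cpp
#include <cassert>
#include <utility>

// Hypothetical RGB triple (not an engine type).
struct Color { float r, g, b; };

// Assumed cheap luma approximation: 2G + R + B.
inline float Luma4(const Color& c) { return c.g * 2.0f + c.r + c.b; }

// Assumed tone weight: bright colors receive a smaller weight.
inline float HdrWeight4(const Color& c) { return 1.0f / (Luma4(c) + 4.0f); }

// Returns normalized weights for (a, b) given their tone weights and a blend factor in [0, 1].
inline std::pair<float, float> WeightedLerpFactors(float weightA, float weightB, float blend)
{
    assert(blend >= 0.0f && blend <= 1.0f);
    const float blendA = (1.0f - blend) * weightA;
    const float blendB = blend * weightB;
    const float sum = blendA + blendB;
    const float rcpSum = sum > 0.0f ? 1.0f / sum : 0.0f;
    return { blendA * rcpSum, blendB * rcpSum };
}

// Blend two HDR colors so that a very bright sample cannot dominate the result.
inline Color ToneWeightedLerp(const Color& a, const Color& b, float blend)
{
    const std::pair<float, float> w =
        WeightedLerpFactors(HdrWeight4(a), HdrWeight4(b), blend);
    return { a.r * w.first + b.r * w.second,
             a.g * w.first + b.g * w.second,
             a.b * w.first + b.b * w.second };
}
```

Blending black with (4, 4, 4) at a factor of 0.5 gives roughly 0.67 per channel instead of the 2.0 a plain lerp would produce, which is how this kind of weighting suppresses fireflies from isolated bright samples.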

6.6.2 Strata

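I took a rough look at the Strata code. Strata looks similar to UE4's Material Layer, but it seems mainly aimed at material projection, blending, and lighting for Nanite geometry. Strata has its own dedicated materials, material nodes, shading models, visualization modes, and shader processing modules. However, the current EA build is still experimental and has many limitations. The main files involved in Strata are: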

  • Strata.h/cpp
  • StrataMaterial.h/cpp
  • StrataDefinitions.h
  • MaterialExpressionStrata.h
  • Strata.ush
  • BasePassPixelShader.usf
  • DeferredLightPixelShaders.usf
  • Code related to the scene rendering pipeline and lighting.

Interested readers can study the relevant source code on their own.

6.7 Summary

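This part mainly covered UE5's editor features, Nanite, Lumen, and related rendering techniques. Because UE5 changes so much, it cannot cover every technical point. Beyond what is discussed here, a great deal remains untouched, and interested readers will need to explore the UE source code themselves.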

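In the UE5 EA stage, both Nanite and Lumen still have many flaws: Nanite only supports static objects, Lumen suffers from noise and light leaking, TSR from flickering and blur, shadow precision is insufficient (see the figure below), and a large number of traditional features are not supported.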

Object blur and shadow artifacts that appear when the camera is close enough to an object.

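Although UE5 currently has many flaws, it is a sapling bathed in sunshine and rain. With careful cultivation by Epic Games, it will in time grow into a towering, leafy tree that shelters every industry connected to the UE engine. UE5 really No.1!!!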

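Special Notes

  • Thanks to the authors of all the references. Some images come from the references and the web; they will be removed on request if they infringe.
  • This series is the author's original work and is published only on cnblogs (博客园). Sharing a link to this article is welcome, but reposting without permission is not allowed.
  • The series is ongoing; see the table of contents for the full list.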
