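Preface
This article looks at super-resolution techniques such as DLSS and FSR from the engine-source perspective: how they are hooked into a game's rendering pipeline and how they consume pipeline resources to run the upscaling algorithm. It does not go deep into the algorithms or the shader code themselves; both DLSS and FSR have detailed official documentation on the underlying principles. Everything here is based on reading the DLSS and FSR plugin code for Unreal Engine. Because most of the DLSS source is closed, the bulk of the analysis focuses on FSR2.
A brief introduction to super resolution
Super resolution exploits the temporal continuity of game frames: the information and cost needed to reconstruct a high-resolution image are amortized over several previous frames. A set of auxiliary buffers is used to reject invalid history and to determine blend weights, and the current frame is blended with past frames to produce the final high-resolution image.
Below are the engine inputs that super-resolution algorithms commonly use.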
Color buffer
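The color buffer computed for the current frame at render resolution
Uses
1. In the Compute Luminance Pyramid stage, used to compute the exposure and the luminance texture (at 1/2 render resolution)
2. In the Depth Clip stage, used to locate the history pixel position and check whether it is usable; the result is written to the alpha channel of the Adjusted color buffer
3. In the Reproject & Accumulate stage, serves as the basis for upscaling and is blended with the history pixels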
Depth buffer
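The depth buffer computed at render resolution
Uses
1. In the Reconstruct & Dilate stage, used to reconstruct the previous frame's depth buffer (output: Dilated Depth buffer)
2. In the Depth Clip stage, helps reject history pixels that fail validation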
Motion vector
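Motion vectors computed at render resolution, expressed in 2D screen space
Uses
1. In the Reconstruct & Dilate stage, used to compute motion vectors in UV space (output: Dilated motion vectors)
2. In the Depth Clip stage, used to find the previous-frame pixel that corresponds to each current-frame pixel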
Reactive Mask
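A per-pixel mask at render resolution, with values between 0 and 1 (in practice 0 to 0.9)
Uses
Marks alpha-blended (translucent) objects. Because translucent objects do not write depth or motion vectors, the Reactive Mask is needed to adjust the blend weight. Typically the alpha value is clamped to 0 to 0.9 and written as the mask value. The blend weight in the Accumulate stage is driven by the Reactive Mask, which can be thought of as how strongly each pixel responds to blending.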
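To make the idea concrete, here is a minimal sketch of how a reactive value could reduce the weight given to history. The function names and the simple linear blend are assumptions made for illustration only; the weighting used by FSR2's real Accumulate pass is more involved.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper: turn a translucent pixel's alpha into a reactive value.
// The 0.9 ceiling follows the clamp range mentioned in the text.
inline float ReactiveFromAlpha(float Alpha)
{
    return std::clamp(Alpha, 0.0f, 0.9f);
}

// Hypothetical blend: the more reactive a pixel is, the less history it keeps.
// BaseHistoryWeight is the weight history would receive for a fully opaque pixel.
inline float BlendWithHistory(float Current, float History, float BaseHistoryWeight, float Reactive)
{
    assert(BaseHistoryWeight >= 0.0f && BaseHistoryWeight <= 1.0f);
    const float HistoryWeight = BaseHistoryWeight * (1.0f - Reactive);
    return History * HistoryWeight + Current * (1.0f - HistoryWeight);
}
```

With a reactive value of 0 the pixel keeps its full history weight; as the value approaches 0.9 the result leans almost entirely on the current frame, which is the behaviour wanted for translucent surfaces.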
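Exposure
A single value, either set explicitly or computed from the Color buffer in the Compute Luminance Pyramid stage
Uses
Usually does not need to be set, because the Compute Luminance Pyramid stage can compute it automatically. It is used for tone mapping; if the application uses features such as HDR, the corresponding value can be set manually.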
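Transparency & Composition mask
A mask at render resolution; defaults to 0 and can be written by the application
Uses
Used for special rendering such as ray-traced reflections and vertex animation. It influences the protection mechanism for history pixels (the pixel locks) and removes unstable lighting factors. A mask value of 1 for a pixel means that the history protection for that pixel is removed completely, that is, its lock is released.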
Jitter Offset
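The sub-pixel jitter used when blending pixels across frames. It is implemented by applying an offset to the camera's transform matrix. The offset usually comes from a pseudo-random sequence (for example a Halton sequence) that is sampled cyclically, one value per frame. Both TSR and TAA rely on this parameter; in Unreal the value for the current frame can be read directly from the ViewInfo passed to post-processing.
Uses
During blending, sampling with the offsets of past frames lets the sample data be blended more evenly and at a higher effective resolution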
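As an illustration of how such a jitter sequence can be generated, the sketch below computes Halton values with bases 2 and 3 and cycles through them per frame. `JitterForFrame`, the sequence length and the [-0.5, 0.5) range are assumptions for this example; they are not taken from Unreal or the FSR2 plugin.

```cpp
#include <cassert>
#include <cstdint>

// Radical inverse of Index in the given Base: the classic Halton construction.
inline float Halton(std::uint32_t Index, std::uint32_t Base)
{
    assert(Base >= 2);
    float Result = 0.0f;
    float Fraction = 1.0f / static_cast<float>(Base);
    while (Index > 0)
    {
        Result += static_cast<float>(Index % Base) * Fraction;
        Index /= Base;
        Fraction /= static_cast<float>(Base);
    }
    return Result;
}

struct FJitter
{
    float X;
    float Y;
};

// Hypothetical helper: sub-pixel offset in [-0.5, 0.5) for a given frame,
// cycling through SequenceLength samples of the (2, 3) Halton sequence.
inline FJitter JitterForFrame(std::uint32_t FrameIndex, std::uint32_t SequenceLength)
{
    assert(SequenceLength > 0);
    const std::uint32_t Sample = (FrameIndex % SequenceLength) + 1; // skip index 0, which is always 0
    return { Halton(Sample, 2) - 0.5f, Halton(Sample, 3) - 0.5f };
}
```

Because the sequence wraps around after `SequenceLength` frames, frame 0 and frame `SequenceLength` receive the same offset, which matches the "sample the sequence cyclically" description above.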
Source code analysis
The FSR2 plugin source is split into the modules below; we go through them one by one
FFXFSR2Api/FFXFSR2D3D12/FFXFSR2Vulkan
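Each of these modules includes different headers and defines different macros for its target platform. Apart from Vulkan, they are all empty modules (StartupModule and ShutdownModule do nothing), and only the {ModuleName}.h file matters. FSR2Include.h checks the platform and includes the matching headers, and is in turn included by other files in the main module FSR2TemporalUpscaling.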
FSR2
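The main module, but not the one that does the actual work. Its job is to hook the real functional module, TemporalUpscaler, into the rendering pipeline: it registers and instantiates an FFSR2ViewExtension object, and that ViewExtension extends the post-processing stage of the pipeline.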
class FFSR2Module final : public IModuleInterface
{
public:
// IModuleInterface implementation
void StartupModule() override;
void ShutdownModule() override;
private:
TSharedPtr<FFSR2ViewExtension, ESPMode::ThreadSafe> FSR2ViewExtension;
};
void FFSR2Module::StartupModule()
{
// FSR2's view extension will always be enabled, but that isn't the same as enabling FSR2 itself.
// This allows FSR2 to coexist with other upscalers.
FSR2ViewExtension = FSceneViewExtensions::NewExtension<FFSR2ViewExtension>();
}
void FFSR2Module::ShutdownModule()
{
// This is a smart pointer. Setting it to null is the correct way to release its memory.
FSR2ViewExtension = nullptr;
}
FSR2TemporalUpscaling
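This is the real core module. Its RHI part contains all the GlobalShaders (passes) used for upscaling, and the rest is responsible for talking to the engine. We start from the entry point mentioned earlier, the ViewExtension.
Hooking into the rendering pipeline
Unreal reserves many hook points throughout the rendering flow that a ViewExtension can register for. The goal is to make rendering modular: parts of the pipeline can be customized from a plugin without modifying engine source.
For the post-processing part that super resolution cares about, there are two main ways to extend it. The first is to use the four slots reserved in the post-processing pipeline where passes can be inserted.
From a source-level reading of the Unreal rendering pipeline we know that the whole pipeline lives in FDeferredShadingSceneRenderer::Render, and that post-processing is implemented by the function AddPostProcessingPasses. Inside that function there is a lambda, AddAfterPass, which uses GetAfterPassCallbacks to look up the callbacks that match the current condition and then invokes them. The condition is simply the insertion point, and the callbacks are the functions we registered in our ViewExtension.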
const auto AddAfterPass = [&](EPass InPass, FScreenPassTexture InSceneColor) -> FScreenPassTexture
{
// In some cases (e.g. OCIO color conversion) we want View Extensions to be able to add extra custom post processing after the pass.
FAfterPassCallbackDelegateArray& PassCallbacks = PassSequence.GetAfterPassCallbacks(InPass);
if (PassCallbacks.Num())
{
FPostProcessMaterialInputs InOutPostProcessAfterPassInputs = GetPostProcessMaterialInputs(InSceneColor);
for (int32 AfterPassCallbackIndex = 0; AfterPassCallbackIndex < PassCallbacks.Num(); AfterPassCallbackIndex++)
{
InOutPostProcessAfterPassInputs.SetInput(EPostProcessMaterialInput::SceneColor, InSceneColor);
FAfterPassCallbackDelegate& AfterPassCallback = PassCallbacks[AfterPassCallbackIndex];
PassSequence.AcceptOverrideIfLastPass(InPass, InOutPostProcessAfterPassInputs.OverrideOutput, AfterPassCallbackIndex);
InSceneColor = AfterPassCallback.Execute(GraphBuilder, View, InOutPostProcessAfterPassInputs);
}
}
return MoveTemp(InSceneColor);
};
......
if(PassSequence.IsEnabled(EPass::MotionBlur)){...}
FScreenPassTexture NewSceneColor = AddAfterPass(EPass::MotionBlur, SceneColor);
if(PassSequence.IsEnabled(EPass::Tonemap)){...}
SceneColor = AddAfterPass(EPass::Tonemap, SceneColor);
if(PassSequence.IsEnabled(EPass::FXAA)){...}
SceneColor = AddAfterPass(EPass::FXAA, SceneColor);
if(PassSequence.IsEnabled(EPass::VisualizeDepthOfField)){...}
SceneColor = AddAfterPass(EPass::VisualizeDepthOfField, SceneColor);
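As shown, the four insertion points are after MotionBlur, Tonemap, FXAA and VisualizeDepthOfField. In a ViewExtension we can override SubscribeToPostProcessingPass to choose where to insert into the post-processing pipeline, and override IsActiveThisFrame_Internal to decide whether anything is inserted this frame.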
void FPixelInspectorSceneViewExtension::SubscribeToPostProcessingPass(EPostProcessingPass PassId, FAfterPassCallbackDelegateArray& InOutPassCallbacks, bool bIsPassEnabled)
{
if (PassId == EPostProcessingPass::FXAA)
{
InOutPassCallbacks.Add(FAfterPassCallbackDelegate::CreateRaw(this, &FPixelInspectorSceneViewExtension::PostProcessPassAfterFxaa_RenderThread));
}
if (PassId == EPostProcessingPass::MotionBlur)
{
InOutPassCallbacks.Add(FAfterPassCallbackDelegate::CreateRaw(this, &FPixelInspectorSceneViewExtension::PostProcessPassAfterMotionBlur_RenderThread));
}
}
Taking the PixelInspector plugin as an example, it overrides the ViewExtension's SubscribeToPostProcessingPass to insert one pass after FXAA and another after MotionBlur.
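The subscription mechanism can be modelled outside the engine in a few lines. The sketch below is a simplified stand-in, not engine code: `FPassSequence`, `Subscribe` and the string-based `FSceneColor` are hypothetical names chosen for this illustration, and only the "register a callback per slot, run the callbacks after the pass" idea is taken from the excerpts above.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

enum class EPass { MotionBlur, Tonemap, FXAA, VisualizeDepthOfField };

// Stand-in for FScreenPassTexture: a string that records which passes touched it.
using FSceneColor = std::string;
using FAfterPassCallback = std::function<FSceneColor(const FSceneColor&)>;

class FPassSequence
{
public:
    // Analogue of SubscribeToPostProcessingPass: an extension registers a callback for one slot.
    void Subscribe(EPass Pass, FAfterPassCallback Callback)
    {
        assert(Callback != nullptr);
        Callbacks[Pass].push_back(std::move(Callback));
    }

    // Analogue of AddAfterPass: run every callback registered for this slot, in registration order.
    FSceneColor AddAfterPass(EPass Pass, FSceneColor SceneColor) const
    {
        const auto It = Callbacks.find(Pass);
        if (It != Callbacks.end())
        {
            for (const FAfterPassCallback& Callback : It->second)
            {
                SceneColor = Callback(SceneColor);
            }
        }
        return SceneColor;
    }

private:
    std::map<EPass, std::vector<FAfterPassCallback>> Callbacks;
};

// Example pipeline: one extension pass after MotionBlur and one after FXAA.
inline FSceneColor RunPostProcessing()
{
    FPassSequence Sequence;
    Sequence.Subscribe(EPass::FXAA, [](const FSceneColor& In) { return In + ">AfterFXAA"; });
    Sequence.Subscribe(EPass::MotionBlur, [](const FSceneColor& In) { return In + ">AfterMotionBlur"; });

    FSceneColor SceneColor = "Scene";
    SceneColor = Sequence.AddAfterPass(EPass::MotionBlur, SceneColor);
    SceneColor = Sequence.AddAfterPass(EPass::Tonemap, SceneColor);
    SceneColor = Sequence.AddAfterPass(EPass::FXAA, SceneColor);
    SceneColor = Sequence.AddAfterPass(EPass::VisualizeDepthOfField, SceneColor);
    return SceneColor;
}
```

Note that the order of execution is decided by the pipeline's slots, not by the order in which the extension subscribed: the FXAA callback was registered first but still runs after the MotionBlur one.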
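This is the most common way to extend the post-processing pipeline, but for upscaler-type extensions Unreal provides another mechanism, and it is the one both DLSS and FSR use.
The reason is that the position of a TemporalUpscaler is effectively fixed: it sits essentially at the very front of the post-processing pipeline.
Before any of the four insertable post-processing passes mentioned above run, Unreal calls the TemporalUpscaler registered on the ViewFamily to add the upscaler passes.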
if (TAAConfig != EMainTAAPassConfig::Disabled)
{
const ITemporalUpscaler* UpscalerToUse = (TAAConfig == EMainTAAPassConfig::ThirdParty) ? View.Family->GetTemporalUpscalerInterface() : ITemporalUpscaler::GetDefaultTemporalUpscaler();
ITemporalUpscaler::FPassInputs UpscalerPassInputs;
UpscalerPassInputs.DownsampleOverrideFormat = DownsampleOverrideFormat;
UpscalerPassInputs.SceneColorTexture = SceneColor.Texture;
UpscalerPassInputs.SceneDepthTexture = SceneDepth.Texture;
UpscalerPassInputs.SceneVelocityTexture = Velocity.Texture;
UpscalerPassInputs.PostDOFTranslucencyResources = PostDOFTranslucencyResources;
UpscalerPassInputs.MoireInputTexture = TSRMoireInput;
ITemporalUpscaler::FOutputs Outputs = UpscalerToUse->AddPasses(
GraphBuilder,
View,
UpscalerPassInputs);
SceneColor = Outputs.FullRes;
HalfResSceneColor = Outputs.HalfRes;
QuarterResSceneColor = Outputs.QuarterRes;
VelocityFlattenTextures = Outputs.VelocityFlattenTextures;
}
Registration is normally done in the ViewExtension's BeginRenderViewFamily:
void FDLSSUpscalerViewExtension::BeginRenderViewFamily(FSceneViewFamily& ViewFamily)
{
if (!ViewFamily.GetTemporalUpscalerInterface())
{
GetGlobalDLSSUpscaler()->SetupViewFamily(ViewFamily);
}
}
void FFSR2ViewExtension::BeginRenderViewFamily(FSceneViewFamily& InViewFamily)
{
InViewFamily.SetTemporalUpscalerInterface(new FFSR2TemporalUpscalerProxy(Upscaler));
}
This is how DLSS and FSR embed themselves in the post-processing pipeline.
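The two registration styles differ slightly: the DLSS snippet only registers when no upscaler has been set yet, while the FSR2 snippet sets its proxy unconditionally. The following standalone sketch models that slot-based registration and the fallback to a default upscaler. All types here (`FViewFamily`, `IUpscaler`, `RegisterIfFree`, `PickUpscalerName`) are simplified stand-ins invented for the example, and the selection logic assumes that "third-party upscaler present" is the only condition, which is narrower than the engine's TAAConfig check.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Stand-in for ITemporalUpscaler.
struct IUpscaler
{
    virtual ~IUpscaler() = default;
    virtual std::string Name() const = 0;
};

struct FDefaultUpscaler final : IUpscaler
{
    std::string Name() const override { return "Default"; }
};

struct FVendorUpscaler final : IUpscaler
{
    explicit FVendorUpscaler(std::string InName) : VendorName(std::move(InName)) {}
    std::string Name() const override { return VendorName; }
    std::string VendorName;
};

// Stand-in for FSceneViewFamily: holds at most one third-party upscaler for the frame.
class FViewFamily
{
public:
    const IUpscaler* GetTemporalUpscalerInterface() const { return Upscaler.get(); }
    void SetTemporalUpscalerInterface(std::unique_ptr<IUpscaler> InUpscaler)
    {
        Upscaler = std::move(InUpscaler);
    }

private:
    std::unique_ptr<IUpscaler> Upscaler;
};

// "Only register if the slot is still free", mirroring the guard in the DLSS snippet.
inline void RegisterIfFree(FViewFamily& Family, std::unique_ptr<IUpscaler> Candidate)
{
    if (!Family.GetTemporalUpscalerInterface())
    {
        Family.SetTemporalUpscalerInterface(std::move(Candidate));
    }
}

// Simplified analogue of the UpscalerToUse selection: third-party if present, else the default.
inline std::string PickUpscalerName(const FViewFamily& Family)
{
    static const FDefaultUpscaler Default{};
    const IUpscaler* ThirdParty = Family.GetTemporalUpscalerInterface();
    return ThirdParty ? ThirdParty->Name() : Default.Name();
}
```

With the guard in place, the first plugin to register wins for that frame; an unconditional `SetTemporalUpscalerInterface` call would instead replace whatever was registered before it.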
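DLSS overrides only this one function in its whole ViewExtension, whereas FSR adds a good deal more code to capture resources at certain points of the pipeline, as we will see when looking at the Upscaler. Let us go straight into the most important and most complex part, the Upscaler.
Resource preparation
Upscaling needs a number of resources from the rendering pipeline, some of which are outputs of specific pipeline stages, as listed earlier. This section looks at a few resources that are specially computed and stored in member variables of the Upscaler. DLSS hides this part, so it is not analyzed here.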
ReflectionTexture
ReflectionTexture is obtained by overriding IScreenSpaceDenoiser::DenoiseReflections in the Upscaler. This function is called in FDeferredShadingSceneRenderer::RenderDeferredReflectionsAndSkyLighting once reflections have been computed.
IScreenSpaceDenoiser::FReflectionsOutputs FFSR2TemporalUpscaler::DenoiseReflections(
FRDGBuilder& GraphBuilder,
const FViewInfo& View,
FPreviousViewInfo* PreviousViewInfos,
const FSceneTextureParameters& SceneTextures,
const FReflectionsInputs& ReflectionInputs,
const FReflectionsRayTracingConfig RayTracingConfig) const
{
IScreenSpaceDenoiser::FReflectionsOutputs Outputs;
Outputs.Color = ReflectionInputs.Color;
if (FSR2ShouldRenderRayTracingReflections(View) || CVarFSR2UseExperimentalSSRDenoiser.GetValueOnRenderThread())
{
Outputs = WrappedDenoiser->DenoiseReflections(GraphBuilder, View, PreviousViewInfos, SceneTextures, ReflectionInputs, RayTracingConfig);
}
else if (IsFSR2SSRTemporalPassRequired(View))
{
const bool bComposePlanarReflections = FSR2HasDeferredPlanarReflections(View);
check(View.ViewState);
FTAAPassParameters TAASettings(View);
TAASettings.Pass = ETAAPassConfig::ScreenSpaceReflections;
TAASettings.SceneDepthTexture = SceneTextures.SceneDepthTexture;
TAASettings.SceneVelocityTexture = SceneTextures.GBufferVelocityTexture;
TAASettings.SceneColorInput = ReflectionInputs.Color;
TAASettings.bOutputRenderTargetable = bComposePlanarReflections;
FTAAOutputs TAAOutputs = AddTemporalAAPass(
GraphBuilder,
View,
TAASettings,
View.PrevViewInfo.SSRHistory,
&View.ViewState->PrevFrameViewInfo.SSRHistory);
Outputs.Color = TAAOutputs.SceneColor;
}
ReflectionTexture = Outputs.Color;
return Outputs;
}
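The pattern used here is essentially a decorator: the upscaler poses as the screen-space denoiser, optionally forwards to the denoiser it wraps, and keeps a copy of whatever comes out. Below is a minimal standalone sketch of that pattern; `IDenoiser`, `FEngineDenoiser` and `FCapturingDenoiser` are hypothetical types for illustration, and strings stand in for textures.

```cpp
#include <cassert>
#include <string>

// Stand-in for IScreenSpaceDenoiser, reduced to the one method of interest.
struct IDenoiser
{
    virtual ~IDenoiser() = default;
    virtual std::string DenoiseReflections(const std::string& Color) const = 0;
};

// Stand-in for the engine's own denoiser.
struct FEngineDenoiser final : IDenoiser
{
    std::string DenoiseReflections(const std::string& Color) const override
    {
        return Color + "+denoised";
    }
};

// Wraps another denoiser and records the reflection result that passes through it.
class FCapturingDenoiser final : public IDenoiser
{
public:
    FCapturingDenoiser(const IDenoiser* InWrapped, bool bInForward)
        : Wrapped(InWrapped), bForward(bInForward)
    {
        assert(Wrapped != nullptr);
    }

    std::string DenoiseReflections(const std::string& Color) const override
    {
        // Either delegate to the wrapped denoiser or pass the input through untouched.
        std::string Output = bForward ? Wrapped->DenoiseReflections(Color) : Color;
        // The interface method is const, so the captured copy has to be a mutable member.
        CapturedReflection = Output;
        return Output;
    }

    const std::string& GetCapturedReflection() const { return CapturedReflection; }

private:
    const IDenoiser* Wrapped;
    bool bForward;
    mutable std::string CapturedReflection;
};
```

The caller still receives a normal denoiser result, so the rest of the pipeline is unaffected; the only side effect is that the wrapper now holds the reflection output for later use.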
SceneColorPreAlpha
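SceneColorPreAlpha is obtained through FFSR2FXSystem, a class whose only purpose is to grab the SceneColor before translucent (alpha-blended) objects are rendered. It overrides just FFXSystemInterface::PostRenderOpaque, which is called from RenderOpaqueFX in the pipeline, between the Atmosphere pass and Render Translucency. After fetching the current SceneTextures->Color, it adds a pass that copies it into SceneColorPreAlphaRT via Upscaler::CopyOpaqueSceneColor.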
void PostRenderOpaque(FRDGBuilder& GraphBuilder, TConstArrayView<FViewInfo> Views, bool bAllowGPUParticleUpdate)
{
FRHIUniformBuffer* ViewUniformBuffer = GetReferenceViewUniformBuffer(Views);
const FSceneTextures* SceneTextures = GetViewFamilyInfo(Views).GetSceneTexturesChecked();
FRDGTextureMSAA PreAlpha = SceneTextures->Color;
auto const& Config = SceneTextures->Config;
EPixelFormat SceneColorFormat = Config.ColorFormat;
uint32 NumSamples = Config.NumSamples;
if (Upscaler->SceneColorPreAlpha.GetReference() == nullptr)
{
FRHITextureCreateDesc SceneColorPreAlphaCreateDesc = FRHITextureCreateDesc::Create2D(TEXT("FSR2SceneColorPreAlpha"), QuantizedSize.X, QuantizedSize.Y, SceneColorFormat);
SceneColorPreAlphaCreateDesc.SetNumMips(1);
SceneColorPreAlphaCreateDesc.SetNumSamples(NumSamples);
SceneColorPreAlphaCreateDesc.SetFlags((ETextureCreateFlags)(ETextureCreateFlags::RenderTargetable | ETextureCreateFlags::ShaderResource));
Upscaler->SceneColorPreAlpha = RHICreateTexture(SceneColorPreAlphaCreateDesc);
Upscaler->SceneColorPreAlphaRT = CreateRenderTarget(Upscaler->SceneColorPreAlpha.GetReference(), TEXT("FSR2SceneColorPreAlpha"));
}
FFSR2FXPass::FParameters* PassParameters = GraphBuilder.AllocParameters<FFSR2FXPass::FParameters>();
FRDGTextureRef SceneColorPreAlphaRDG = GraphBuilder.RegisterExternalTexture(Upscaler->SceneColorPreAlphaRT);
PassParameters->InputColorTexture = PreAlpha.Target;
PassParameters->OutputColorTexture = SceneColorPreAlphaRDG;
GraphBuilder.AddPass(RDG_EVENT_NAME("FFSR2FXSystem::PostRenderOpaque"), PassParameters, ERDGPassFlags::Copy,
[this, PassParameters, ViewUniformBuffer, PreAlpha](FRHICommandListImmediate& RHICmdList)
{
PassParameters->InputColorTexture->MarkResourceAsUsed();
PassParameters->OutputColorTexture->MarkResourceAsUsed();
Upscaler->PreAlpha = PreAlpha;
Upscaler->CopyOpaqueSceneColor(RHICmdList, ViewUniformBuffer, nullptr, this->SceneTexturesUniformParams);
}
);
}
PostInput
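PostProcessInput is captured by overriding ViewExtension::PrePostProcessPass_RenderThread, which receives the input parameters of the post-processing stage. It is called just before AddPostProcessingPasses starts the post-processing part. Many places in the Upscaler use the data from these inputs rather than the PassInputs handed to AddPasses, so keep the two apart (I am not yet sure what the exact difference is).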
void FFSR2ViewExtension::PrePostProcessPass_RenderThread(FRDGBuilder& GraphBuilder, const FSceneView& View, const FPostProcessingInputs& Inputs)
{
// FSR2 requires the separate translucency data which is only available through the post-inputs so bind them to the upscaler now.
if (View.GetFeatureLevel() >= ERHIFeatureLevel::SM5)
{
if (CVarEnableFSR2.GetValueOnAnyThread())
{
IFSR2TemporalUpscalingModule& FSR2ModuleInterface = FModuleManager::GetModuleChecked<IFSR2TemporalUpscalingModule>(TEXT("FSR2TemporalUpscaling"));
FSR2ModuleInterface.GetFSR2Upscaler()->SetPostProcessingInputs(Inputs);
}
}
}
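These inputs include the TranslucencyResourcesMap; SceneTextures is effectively the complete GBuffer, including the VelocityBuffer and so on. Overall the content is largely the same as the PassInputs passed to the Upscaler's AddPasses.
Entering the upscaling flow
As shown above, the entry point called from post-processing is the Upscaler's AddPasses function.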
DLSS
ITemporalUpscaler::FOutputs FDLSSSceneViewFamilyUpscaler::AddPasses(
FRDGBuilder& GraphBuilder,
const FViewInfo& View,
const FPassInputs& PassInputs
) const
{
const FTemporalAAHistory& InputHistory = View.PrevViewInfo.TemporalAAHistory;
const TRefCountPtr<ICustomTemporalAAHistory> InputCustomHistory = View.PrevViewInfo.CustomTemporalAAHistory;
FTemporalAAHistory* OutputHistory = View.ViewState ? &(View.ViewState->PrevFrameViewInfo.TemporalAAHistory) : nullptr;
TRefCountPtr < ICustomTemporalAAHistory >* OutputCustomHistory = View.ViewState ? &(View.ViewState->PrevFrameViewInfo.CustomTemporalAAHistory) : nullptr;
FDLSSPassParameters DLSSParameters(View);
const FIntRect SecondaryViewRect = DLSSParameters.OutputViewRect;
ITemporalUpscaler::FOutputs Outputs;
{
const bool bDilateMotionVectors = CVarNGXDLSSDilateMotionVectors.GetValueOnRenderThread() != 0;
FRDGTextureRef CombinedVelocityTexture = AddVelocityCombinePass(GraphBuilder, View, PassInputs.SceneDepthTexture, PassInputs.SceneVelocityTexture, bDilateMotionVectors);
DLSSParameters.SceneColorInput = PassInputs.SceneColorTexture;
DLSSParameters.SceneVelocityInput = CombinedVelocityTexture;
DLSSParameters.SceneDepthInput = PassInputs.SceneDepthTexture;
DLSSParameters.bHighResolutionMotionVectors = bDilateMotionVectors;
const FDLSSOutputs DLSSOutputs = AddDLSSPass(
GraphBuilder,
View,
DLSSParameters,
InputHistory,
OutputHistory,
InputCustomHistory,
OutputCustomHistory
);
Outputs.FullRes.Texture = DLSSOutputs.SceneColor;
Outputs.FullRes.ViewRect = SecondaryViewRect;
}
return Outputs;
}
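The DLSS AddPasses essentially takes the VelocityBuffer computed natively by Unreal, adjusts it with AddVelocityCombinePass, and writes the result back into the pass parameters. It also fetches TemporalAAHistory as an input and then calls AddDLSSPass.
AddDLSSPass first allocates the output buffer, then allocates RDG resources as the PassParameters and adds the actual pass to the RDG (up to this point everything has been direct function-to-function calls, and the resources were ordinary ones as well).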
FDLSSOutputs Outputs;
{
FRDGTextureDesc SceneColorDesc = FRDGTextureDesc::Create2D(
OutputExtent,
PF_FloatRGBA,
FClearValueBinding::Black,
TexCreate_ShaderResource | TexCreate_UAV);
const TCHAR* OutputName = TEXT("DLSSOutputSceneColor");
Outputs.SceneColor = GraphBuilder.CreateTexture(
SceneColorDesc,
OutputName);
}
FDLSSStateRef DLSSState = (InputCustomHistory && InputCustomHistory->DLSSState) ? InputCustomHistory->DLSSState : MakeShared<FDLSSState, ESPMode::ThreadSafe>();
{
FDLSSShaderParameters* PassParameters = GraphBuilder.AllocParameters<FDLSSShaderParameters>();
// Input buffer shader parameters
{
PassParameters->SceneColorInput = Inputs.SceneColorInput;
PassParameters->SceneDepthInput = Inputs.SceneDepthInput;
PassParameters->SceneVelocityInput = Inputs.SceneVelocityInput;
PassParameters->EyeAdaptation = GetEyeAdaptationTexture(GraphBuilder, View);
}
// Outputs
{
PassParameters->SceneColorOutput = Outputs.SceneColor;
}
{fetch JitterOffset, DeltaWorldTime and the other parameters captured by the pass}
GraphBuilder.AddPass(
RDG_EVENT_NAME("DLSS %s%s %dx%d -> %dx%d",
PassName,
Sharpness != 0.0f ? TEXT(" Sharpen") : TEXT(""),
SrcRect.Width(), SrcRect.Height(),
DestRect.Width(), DestRect.Height()),
PassParameters,
ERDGPassFlags::Compute | ERDGPassFlags::Raster | ERDGPassFlags::SkipRenderPass,
// FRHICommandListImmediate forces it to run on render thread, FRHICommandList doesn't
[LocalNGXRHIExtensions, PassParameters, Inputs, bCameraCut, JitterOffset, DeltaWorldTime, PreExposure, Sharpness, NGXDLAAPreset, NGXDLSSPreset, NGXPerfQuality, DLSSState, bUseAutoExposure, bReleaseMemoryOnDelete](FRHICommandListImmediate& RHICmdList)
{
FRHIDLSSArguments DLSSArguments;
FMemory::Memzero(&DLSSArguments, sizeof(DLSSArguments));
// input parameters
DLSSArguments.SrcRect = Inputs.InputViewRect;
{...}
PassParameters->SceneColorInput->MarkResourceAsUsed();
DLSSArguments.InputColor = PassParameters->SceneColorInput->GetRHI();
PassParameters->SceneVelocityInput->MarkResourceAsUsed();
DLSSArguments.InputMotionVectors = PassParameters->SceneVelocityInput->GetRHI();
PassParameters->SceneDepthInput->MarkResourceAsUsed();
DLSSArguments.InputDepth = PassParameters->SceneDepthInput->GetRHI();
PassParameters->EyeAdaptation->MarkResourceAsUsed();
DLSSArguments.InputExposure = PassParameters->EyeAdaptation->GetRHI();
DLSSArguments.PreExposure = PreExposure;
DLSSArguments.bUseAutoExposure = bUseAutoExposure;
// output images
PassParameters->SceneColorOutput->MarkResourceAsUsed();
DLSSArguments.OutputColor = PassParameters->SceneColorOutput->GetRHI();
RHICmdList.Transition(FRHITransitionInfo(DLSSArguments.OutputColor, ERHIAccess::Unknown, ERHIAccess::UAVMask));
RHICmdList.EnqueueLambda(
[LocalNGXRHIExtensions, DLSSArguments, DLSSState](FRHICommandListImmediate& Cmd) mutable
{
LocalNGXRHIExtensions->ExecuteDLSS(Cmd, DLSSArguments, DLSSState);
});
});
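Inside the RDG pass, the RHI resource of each parameter is fetched and packed into DLSSArguments, which is passed to NGXRHI::ExecuteDLSS running on the RHI. That function executes the individual passes in order, but from its implementation onward the source is closed.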
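The key point in this flow is that AddPass only records a lambda; the work inside it runs later, when the graph is executed. The sketch below models that deferred execution with a tiny stand-in graph builder. `FMiniGraphBuilder` and `RecordAndRun` are hypothetical names for illustration, and the real RDG does far more (resource lifetime tracking, barriers, pass culling).

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

using FLog = std::vector<std::string>;

// Minimal stand-in for a render graph builder: passes are recorded now and executed later.
class FMiniGraphBuilder
{
private:
    struct FPass
    {
        std::string Name;
        std::function<void(FLog&)> ExecuteLambda;
    };

public:
    // Setup time: only stores the lambda, nothing inside it runs yet.
    void AddPass(std::string Name, std::function<void(FLog&)> ExecuteLambda)
    {
        assert(ExecuteLambda != nullptr);
        Passes.push_back(FPass{ std::move(Name), std::move(ExecuteLambda) });
    }

    // Execution time: runs the recorded lambdas in submission order.
    FLog Execute()
    {
        FLog Log;
        for (const FPass& Pass : Passes)
        {
            Log.push_back("begin " + Pass.Name);
            Pass.ExecuteLambda(Log);
        }
        Passes.clear();
        return Log;
    }

private:
    std::vector<FPass> Passes;
};

// Shows the ordering: all setup code finishes before any pass body runs.
inline FLog RecordAndRun()
{
    FMiniGraphBuilder GraphBuilder;
    FLog Timeline;
    Timeline.push_back("setup parameters"); // runs immediately, like the body of AddDLSSPass
    GraphBuilder.AddPass("Upscale", [](FLog& Log) { Log.push_back("dispatch upscaler"); });
    Timeline.push_back("setup done");       // still before the pass body has run

    const FLog Executed = GraphBuilder.Execute();
    Timeline.insert(Timeline.end(), Executed.begin(), Executed.end());
    return Timeline;
}
```

This is why values such as the jitter offset have to be captured by the lambda at setup time: by the time the pass body runs, the function that added it has long since returned.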
FSR
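Unlike DLSS, whose AddPasses merely repacks parameters without doing any real work, the FSR Upscaler does all of its data preprocessing inside AddPasses, so the function is considerably more complex.
Normally, with AutoExposure enabled, execution takes the if branch shown below. If the condition is not met, the plugin falls back to the upscaler built into Unreal (TSR), so from here on we only analyze the flow inside the if branch.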
if (IsApiSupported() && (View.PrimaryScreenPercentageMethod == EPrimaryScreenPercentageMethod::TemporalUpscale) && bHasAutoExposure && (InputExtents.X < OutputExtents.X) && (InputExtents.Y < OutputExtents.Y))
{
...
}
else
{
return GetDefaultTemporalUpscaler()->AddPasses(
GraphBuilder,
View,
PassInputs);
}
CreateReactiveMaskAndCompositeMask
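The first step computes the ReactiveMask, which is FSR's way of handling translucent objects (they write neither depth nor velocity). The opacity is clamped to the 0 to 0.9 range and written into the ReactiveMask, and this value is later used to adjust the blend weight against history frames in the Accumulate stage.
Computing the ReactiveMask and CompositeMask needs the following data (to see what a pass needs, just look at which fields are set on the PassParameters passed to it):
1. Rendering information for all translucent objects in the scene (SeparateTranslucency)
2. Part of the GBuffer
3. The scene reflection result (ReflectionTexture)
4. The depth texture (DepthBuffer)
5. The ColorBuffer before translucent objects are rendered (SceneColorPreAlpha)
6. The ColorBuffer rendered for the current frame (SceneColor)
7. VelocityTexture
8. Others (such as LumenSpecular)
The GBuffer, DepthBuffer, SeparateTranslucency and ColorBuffer are taken directly from the PassInputs.
In the RenderTranslucency stage of the pipeline, Unreal renders all translucent objects into a TranslucencyResourcesMap. That map is packed into the post-processing inputs before AddPostProcessingPasses and ends up here as SeparateTranslucency.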
PostProcessingInputs.TranslucencyViewResourcesMap = FTranslucencyViewResourcesMap(TranslucencyResourceMap, ViewIndex);
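All other required resources were computed earlier (most of them live in member variables of the Upscaler), so the CreateReactiveMask part only has to create the corresponding RDG resources, pack them all into the PassParameters, and add a pass for the CreateReactiveMask shader to compute the result.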
{Create RDGResources}
{Build PassParameter}
TShaderMapRef<FFSR2CreateReactiveMaskCS> ComputeShaderFSR(View.ShaderMap);
FComputeShaderUtils::AddPass(
GraphBuilder,
RDG_EVENT_NAME("FidelityFX-FSR2/CreateReactiveMask (CS)"),
ComputeShaderFSR,
PassParameters,
FComputeShaderUtils::GetGroupCount(FIntVector(InputExtents.X, InputExtents.Y, 1),
FIntVector(FFSR2ConvertVelocityCS::ThreadgroupSizeX, FFSR2ConvertVelocityCS::ThreadgroupSizeY, FFSR2ConvertVelocityCS::ThreadgroupSizeZ))
);
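The group count passed to the compute pass is a round-up division of the thread count by the thread-group size, so that partially covered tiles at the right and bottom edges still get a group. The sketch below reimplements that arithmetic in isolation. `FIntVector3` is a hypothetical stand-in type, and the 8x8x1 group size used in the usage note is an assumed example value, not the value of FFSR2ConvertVelocityCS::ThreadgroupSizeX/Y/Z.

```cpp
#include <cassert>

// Hypothetical stand-in for FIntVector.
struct FIntVector3
{
    int X;
    int Y;
    int Z;
};

// Integer division that rounds up instead of truncating.
inline int DivideAndRoundUp(int Dividend, int Divisor)
{
    assert(Dividend >= 0 && Divisor > 0);
    return (Dividend + Divisor - 1) / Divisor;
}

// Number of thread groups needed so that every thread in ThreadCount is covered.
inline FIntVector3 GetGroupCount(FIntVector3 ThreadCount, FIntVector3 GroupSize)
{
    return {
        DivideAndRoundUp(ThreadCount.X, GroupSize.X),
        DivideAndRoundUp(ThreadCount.Y, GroupSize.Y),
        DivideAndRoundUp(ThreadCount.Z, GroupSize.Z)
    };
}
```

For a 1920x1080 input and an assumed 8x8x1 group size this yields 240x135x1 groups; a 1921x1081 input needs 241x136x1, with the extra row and column of groups only partially filled, which is why such shaders normally bounds-check their thread ID.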
Dedither
De-dithering, mainly applied to SHADINGMODELID_HAIR
ConsolidateMotionVectors
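Converts the MotionVector that UE computes in its default format into the format FSR requires
This is done by adding a pass for the ConvertVelocity shader. Besides the View uniform buffer and the output UAV (OutputTexture), the parameters it needs are
1. DepthTexture (the scene depth texture)
2. InputDepth (SRV)
3. InputVelocity (SRV)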
FRDGTextureDesc MotionVectorDesc = FRDGTextureDesc::Create2D(InputExtentsQuantized, PF_G16R16F, FClearValueBinding::Black, TexCreate_ShaderResource | TexCreate_UAV | TexCreate_RenderTargetable);
FRDGTextureRef MotionVectorTexture = GraphBuilder.CreateTexture(MotionVectorDesc, TEXT("FSR2MotionVectorTexture"));
{
FFSR2ConvertVelocityCS::FParameters* PassParameters = GraphBuilder.AllocParameters<FFSR2ConvertVelocityCS::FParameters>();
FRDGTextureUAVDesc OutputDesc(MotionVectorTexture);
PassParameters->DepthTexture = SceneDepth;
PassParameters->InputDepth = GraphBuilder.CreateSRV(DepthDesc);
PassParameters->InputVelocity = GraphBuilder.CreateSRV(VelocityDesc);
PassParameters->View = View.ViewUniformBuffer;
PassParameters->OutputTexture = GraphBuilder.CreateUAV(OutputDesc);
TShaderMapRef<FFSR2ConvertVelocityCS> ComputeShaderFSR(View.ShaderMap);
FComputeShaderUtils::AddPass(
GraphBuilder,
RDG_EVENT_NAME("FidelityFX-FSR2/ConvertVelocity (CS)"),
ComputeShaderFSR,
PassParameters,
FComputeShaderUtils::GetGroupCount(FIntVector(SceneDepth->Desc.Extent.X, SceneDepth->Desc.Extent.Y, 1),
FIntVector(FFSR2ConvertVelocityCS::ThreadgroupSizeX, FFSR2ConvertVelocityCS::ThreadgroupSizeY, FFSR2ConvertVelocityCS::ThreadgroupSizeZ))
);
}
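Entering the RenderGraph
In the previous steps we fetched and computed essentially every parameter the upscale needs from the rendering pipeline, and also did some preparation such as creating resources and initializing the FSR2 context. Next, everything is packed into the PassParameters, and FSR's real execution routine, ffxFsr2ContextDispatch, is either added to the RDG or called directly.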
FFSR2Pass::FParameters* PassParameters = GraphBuilder.AllocParameters<FFSR2Pass::FParameters>();
PassParameters->DummyBuffer = DummyBuf;
PassParameters->ColorTexture = SceneColor;
PassParameters->DepthTexture = SceneDepth;
PassParameters->VelocityTexture = MotionVectorTexture;
if (bValidEyeAdaptation)
{
PassParameters->ExposureTexture = GetEyeAdaptationTexture(GraphBuilder, View);
}
PassParameters->ReactiveMaskTexture = ReactiveMaskTexture;
PassParameters->CompositeMaskTexture = CompositeMaskTexture;
PassParameters->OutputTexture = OutputTexture;
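There is a check on the graphics API here. For DX12 or Vulkan, ffxFsr2ContextDispatch is added to the RDG as a pass; for the generic Unreal backend used on other platforms at SM5 or above (DX11 and so on), it is called directly. The reason becomes clear further down.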
if (CurrentApi == EFSR2TemporalUpscalerAPI::Unreal)
{
Fsr2DispatchParams.color = ffxGetResourceFromUEResource(&FSR2State->Params.callbacks, PassParameters->ColorTexture.GetTexture());
Fsr2DispatchParams.depth = ffxGetResourceFromUEResource(&FSR2State->Params.callbacks, PassParameters->DepthTexture.GetTexture());
Fsr2DispatchParams.motionVectors = ffxGetResourceFromUEResource(&FSR2State->Params.callbacks, PassParameters->VelocityTexture.GetTexture());
Fsr2DispatchParams.exposure = ffxGetResourceFromUEResource(&FSR2State->Params.callbacks, PassParameters->ExposureTexture.GetTexture());
if (PassParameters->ReactiveMaskTexture)
{
Fsr2DispatchParams.reactive = ffxGetResourceFromUEResource(&FSR2State->Params.callbacks, PassParameters->ReactiveMaskTexture.GetTexture());
}
Fsr2DispatchParams.output = ffxGetResourceFromUEResource(&FSR2State->Params.callbacks, PassParameters->OutputTexture.GetTexture(), FFX_RESOURCE_STATE_UNORDERED_ACCESS);
Fsr2DispatchParams.commandList = (FfxCommandList)CurrentGraphBuilder;
ffxFsr2SetFeatureLevel(&FSR2State->Params.callbacks, View.GetFeatureLevel());
FfxErrorCode Code = ffxFsr2ContextDispatch(&FSR2State->Fsr2, &Fsr2DispatchParams);
}
else
{
GraphBuilder.AddPass(RDG_EVENT_NAME("FidelityFX-FSR2"), PassParameters, ERDGPassFlags::Compute | ERDGPassFlags::Raster | ERDGPassFlags::SkipRenderPass, [&View, &PassInputs, CurrentApi, ApiAccess, PassParameters, PrevCustomHistory, Fsr2DispatchParamsPtr, FSR2State](FRHICommandListImmediate& RHICmdList)
{
{repack PassParameters into Fsr2DispatchParams, as above but for a different API}
switch (CurrentApi)
{
#if FSR2_ENABLE_DX12
case EFSR2TemporalUpscalerAPI::D3D12:
{
RHICmdList.EnqueueLambda([FSR2State, DispatchParams, ApiAccess](FRHICommandListImmediate& cmd) mutable
{
ID3D12GraphicsCommandList* cmdList = (ID3D12GraphicsCommandList*)ApiAccess->GetNativeCommandBuffer(cmd);
DispatchParams.commandList = ffxGetCommandListDX12(cmdList);
FfxErrorCode Code = ffxFsr2ContextDispatch(&FSR2State->Fsr2, &DispatchParams);
});
break;
}
#endif
#if FSR2_ENABLE_VK
case EFSR2TemporalUpscalerAPI::Vulkan:
{
RHICmdList.EnqueueLambda([FSR2State, DispatchParams, ApiAccess](FRHICommandListImmediate& cmd) mutable
{
VkCommandBuffer cmdList = (VkCommandBuffer)ApiAccess->GetNativeCommandBuffer(cmd);
DispatchParams.commandList = ffxGetCommandListVK(cmdList);
FfxErrorCode Code = ffxFsr2ContextDispatch(&FSR2State->Fsr2, &DispatchParams);
});
break;
}
#endif
}
}}
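From ffxFsr2ContextDispatch onward we are in FSR's ThirdParty code, meaning it is not built on Unreal and is plain C++ and shader code.
The first part of fsr2Dispatch is again resource conversion: fpRegisterResources re-registers many of the resources in params as context resources. It also performs some simple computations such as the bias, plus various setup steps, and then uses scheduleDispatch to push the familiar passes onto the pending queue held in FfxFsr2Interface::scratchBuffer.
At the end of fsr2Dispatch, fpExecuteGpuJobs is called to execute the whole queue.
fpExecuteGpuJobs, fpUnregisterResources and the like are actually callbacks (function pointers) that point to different APIs depending on the platform; under Unreal they point to UE-side handler functions.
Taking execute as an example, the corresponding function is FlushRenderJob_UE, which supports three job types: Clear, Copy and Compute.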
switch (job->jobType)
{
case FFX_GPU_JOB_CLEAR_FLOAT:
{
FRDGTexture* RdgTex = Context->GetRDGTexture(*GraphBuilder, job->clearJobDescriptor.target.internalIndex);
if (RdgTex)
{
FRDGTextureUAVRef UAV = GraphBuilder->CreateUAV(RdgTex);
AddClearUAVPass(*GraphBuilder, UAV, job->clearJobDescriptor.color);
}
else
{
FRDGBufferUAVRef UAV = GraphBuilder->CreateUAV(Context->GetRDGBuffer(*GraphBuilder, job->clearJobDescriptor.target.internalIndex), PF_R32_FLOAT);
AddClearUAVFloatPass(*GraphBuilder, UAV, job->clearJobDescriptor.color[0]);
}
break;
}
case FFX_GPU_JOB_COPY:
{
if ((Context->GetType(job->copyJobDescriptor.src.internalIndex) == FFX_RESOURCE_TYPE_BUFFER) && (Context->GetType(job->copyJobDescriptor.dst.internalIndex) == FFX_RESOURCE_TYPE_BUFFER))
{
check(false);
}
else
{
FRHITexture* Src = (FRHITexture*)Context->GetResource(job->copyJobDescriptor.src.internalIndex);
FRHITexture* Dst = (FRHITexture*)Context->GetResource(job->copyJobDescriptor.dst.internalIndex);
FRHICopyTextureInfo Info;
Info.NumMips = FMath::Min(Src->GetNumMips(), Dst->GetNumMips());
AddCopyTexturePass(*GraphBuilder, Context->GetRDGTexture(*GraphBuilder, job->copyJobDescriptor.src.internalIndex), Context->GetRDGTexture(*GraphBuilder, job->copyJobDescriptor.dst.internalIndex), Info);
}
break;
}
case FFX_GPU_JOB_COMPUTE:
{
IFSR2SubPass* Pipeline = (IFSR2SubPass*)job->computeJobDescriptor.pipeline.pipeline;
check(Pipeline);
Pipeline->Dispatch(*GraphBuilder, Context, job);
break;
}
}
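Taking the Compute job as an example, Dispatch simply adds an RDG pass with the corresponding compute shader. (This is why, in the per-API branch earlier, the Unreal path calls ffxFsr2ContextDispatch directly instead of adding an RDG pass: when the jobs are actually executed, control returns to Unreal, which adds an RDG pass for each individual job.)
So why take such a long detour, with all the callbacks, resource conversions and context initialization? Because FSR is a multi-platform plugin, and the part that handles pass execution is ThirdParty code. It acts as a relay: it is called from one platform's API and ends up calling another platform's API to do the work. For Unreal, however, this amounts to a pointless loop, because the caller is Unreal code and what it eventually calls is Unreal code as well.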
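The relay structure described above can be sketched as a backend interface made of function pointers plus a job queue: the backend-agnostic core only schedules jobs and asks the backend to flush them, and each host supplies its own implementations. Everything in this sketch (`FBackendInterface`, `ScheduleJob_Host`, `ExecuteJobs_Host`, `ContextDispatch`, the job names) is a simplified assumption for illustration and does not reproduce the actual FfxFsr2Interface layout.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum class EJobType { Clear, Copy, Compute };

struct FGpuJob
{
    EJobType Type;
    std::string Name;
};

// Backend interface as plain function pointers, in the spirit of fpExecuteGpuJobs and friends.
struct FBackendInterface
{
    void (*ScheduleJob)(FBackendInterface& Self, const FGpuJob& Job);
    void (*ExecuteJobs)(FBackendInterface& Self, std::vector<std::string>& OutLog);
    std::vector<FGpuJob> PendingJobs; // stands in for the queue kept in the scratch buffer
};

// Host-specific implementations; a different host would plug in different functions.
inline void ScheduleJob_Host(FBackendInterface& Self, const FGpuJob& Job)
{
    Self.PendingJobs.push_back(Job);
}

inline void ExecuteJobs_Host(FBackendInterface& Self, std::vector<std::string>& OutLog)
{
    for (const FGpuJob& Job : Self.PendingJobs)
    {
        switch (Job.Type)
        {
        case EJobType::Clear:   OutLog.push_back("host clear: " + Job.Name); break;
        case EJobType::Copy:    OutLog.push_back("host copy: " + Job.Name); break;
        case EJobType::Compute: OutLog.push_back("host compute pass: " + Job.Name); break;
        }
    }
    Self.PendingJobs.clear();
}

inline FBackendInterface MakeHostBackend()
{
    return FBackendInterface{ &ScheduleJob_Host, &ExecuteJobs_Host, {} };
}

// Backend-agnostic core: it only ever talks to the function pointers.
inline std::vector<std::string> ContextDispatch(FBackendInterface& Backend)
{
    assert(Backend.ScheduleJob && Backend.ExecuteJobs);
    Backend.ScheduleJob(Backend, FGpuJob{ EJobType::Clear, "ExampleTarget" });
    Backend.ScheduleJob(Backend, FGpuJob{ EJobType::Compute, "DepthClip" });
    Backend.ScheduleJob(Backend, FGpuJob{ EJobType::Compute, "Accumulate" });

    std::vector<std::string> Log;
    Backend.ExecuteJobs(Backend, Log); // flush the whole queue at the end of the dispatch
    return Log;
}
```

The core never includes a host header, which is what makes it portable; the price is the extra indirection noted above when the host on both ends happens to be the same engine.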
Writing the current frame's result back to History
if (CanWritePrevViewInfo)
{
// Copy the new history into the history wrapper
GraphBuilder.QueueTextureExtraction(OutputTexture, &View.ViewState->PrevFrameViewInfo.TemporalAAHistory.RT[0]);
}