[UE4 Source Code Observations] Observing the DDC (DerivedDataCache)

Concept

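DDC stands for DerivedDataCache. I have known for a long time that UE4 has a DDC, noticed that it takes up a lot of disk space, and run into problems that were caused by the DDC and disappeared once it was cleared. But I had never looked into its details, and curiosity made me want to learn more.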

The official documentation describes the DDC as follows:
●The Derived Data Cache (DDC) stores versions of Assets in the formats used by the Unreal Engine on its target platforms, as opposed to the source formats artists create that are imported into the Editor and stored in .uasset files.
●Content stored in the DDC is disposable in that it can always be regenerated at any time using the data stored in the .uasset file.
●Storing these derived formats externally makes it possible to easily add or change the formats used by the engine without needing to modify the source asset file. (The "source asset file" here is the .uasset file.)

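The documentation states the concept first, then adds a property, then gives the rationale. That seems to explain what the DDC is, but I still have many questions:
In short, the DDC holds versions of assets in the formats used by specific platforms. But what does "platform" mean here? Operating systems such as Windows and Android? (Probably not that simple.) Also, if DDC entries corresponded to .uasset files, the DDC folder hierarchy should mirror that of the .uasset files, yet it does not, and it is not even clear whether .uasset files and DDC files map one-to-one.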

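To get closer to the answers, I decided to read the source code of the DDC module. The module lives in the Developer category. On opening it I was pleasantly surprised to find that it contains few files and, more importantly, that it depends only on the Core module. Understanding it should cost less than I had imagined.
The rest of this post first takes inventory of the classes in the DDC module: their inheritance, which objects exist at run time, how they relate to other classes, and which functions or variables are worth mentioning. After that it tries to answer some questions about the DDC.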

Inventory of the classes in the DDC module

IDerivedDataCacheModule

This is the module class of the DDC. It contains very little:

/**
 * Module for the DDC
 */
class IDerivedDataCacheModule : public IModuleInterface
{
public:
	/** Return the DDC interface **/
	virtual FDerivedDataCacheInterface& GetDDC() = 0;
};

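The .cpp contains little as well. Essentially it holds a pointer of type FDerivedDataCache (a singleton), obtains it when the module starts up, and provides a function that returns the singleton: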

class FDerivedDataCacheModule : public IDerivedDataCacheModule
{
	/** Cached reference to DDC singleton, helpful to control singleton's lifetime. */
	FDerivedDataCache* DDC;

public:
	virtual FDerivedDataCacheInterface& GetDDC() override
	{
		return InternalSingleton();
	}
	virtual void StartupModule() override
	{
		// make sure DDC gets created early, previously it might have happened in ShutdownModule() (for PrintLeaks()) when it was already too late
		DDC = static_cast< FDerivedDataCache* >( &GetDDC() );
	}
	virtual void ShutdownModule() override
	{
		FDDCCleanup::Shutdown();
		if (DDC)
		{
			DDC->PrintLeaks();
		}
	}
	FDerivedDataCacheModule():	DDC(nullptr)
	{
	}
};
IMPLEMENT_MODULE( FDerivedDataCacheModule, DerivedDataCache);

FDerivedDataCacheInterface

This is an interface class with a large number of pure virtual functions. It has no base class:

/** 
 * Interface for the derived data cache
 * This API is fully threadsafe (with the possible exception of the system interface: NotfiyBootComplete, etc).
**/
class FDerivedDataCacheInterface

One of its functions is GetSynchronous:

/** 
* Synchronously checks the cache and if the item is present, it returns the cached results, otherwise tells the deriver to build the data and then updates the cache
 * @param	DataDeriver	plugin to produce cache key and in the event of a miss, return the data.
 * @param	bDataWasBuilt if non-null, set to true if the data returned had to be built instead of retrieved from the DDC. Used for stat tracking.
 * @return	true if the data was retrieved from the cache or the deriver built the data sucessfully. false can only occur if the plugin returns false.
**/
virtual bool GetSynchronous(class FDerivedDataPluginInterface* DataDeriver, TArray<uint8>& OutData, bool* bDataWasBuilt = nullptr) = 0;

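The name looks a little odd at first. Synchronous is used adverbially here (Synchronously), so the name means "get (the derived data) synchronously". The comment makes the behavior clear: the call does not return until it has a result. It first checks whether the item is in the cache; if so, the cached result is stored in OutData, otherwise the deriver builds the data and the cache is updated. Its parameter is a FDerivedDataPluginInterface pointer.
Its counterpart is the asynchronous GetAsynchronous: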

/** 
 * Starts the async process of checking the cache and if the item is present, retrieving the cached results, otherwise telling the deriver to build the data and then updating the cache
 * If the plugin does not support threading, all of the above will be completed before the call returns.
 * @param	DataDeriver	plugin to produce cache key and in the event of a miss, return the data.
 * @return	a handle that can be used for PollAsynchronousCompletion, WaitAsynchronousCompletion and GetAsynchronousResults
**/
virtual uint32 GetAsynchronous(class FDerivedDataPluginInterface* DataDeriver) = 0;

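Note: the synchronous version returns whether the data was obtained, whereas the asynchronous version returns a handle to the task it created.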
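To make the "check the cache, otherwise build and store" behavior from the GetSynchronous comment concrete, here is a minimal sketch. It is not engine code: ToyDerivedDataCache, its in-memory map, and the BuildFn callback (standing in for the deriver's Build) are all names I made up for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for the cache, NOT the engine class.
class ToyDerivedDataCache
{
public:
    using Bytes = std::vector<std::uint8_t>;
    using BuildFn = std::function<bool(Bytes&)>; // plays the role of the deriver's Build()

    // Returns true on a cache hit or a successful build.
    // Returns false only when the builder itself fails.
    bool GetSynchronous(const std::string& CacheKey, const BuildFn& Build,
                        Bytes& OutData, bool* bDataWasBuilt = nullptr)
    {
        assert(!CacheKey.empty());

        // 1. Cache hit: hand back the stored bytes.
        auto It = Store.find(CacheKey);
        if (It != Store.end())
        {
            OutData = It->second;
            if (bDataWasBuilt) { *bDataWasBuilt = false; }
            return true;
        }

        // 2. Cache miss: ask the builder for the data.
        Bytes Built;
        if (!Build(Built))
        {
            return false; // failed build: the cache is left untouched
        }

        // 3. Store the result so the next request is a hit.
        Store[CacheKey] = Built;
        OutData = Built;
        if (bDataWasBuilt) { *bDataWasBuilt = true; }
        return true;
    }

private:
    std::unordered_map<std::string, Bytes> Store;
};
```

Calling GetSynchronous twice with the same key runs the builder only once; the second call is served from the map and reports bDataWasBuilt as false.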

There is also an overload that takes a TCHAR* CacheKey:

/** 
* Synchronously checks the cache and if the item is present, it returns the cached results, otherwise it returns false
 * @param	CacheKey	Key to identify the data
 * @return	true if the data was retrieved from the cache
**/
virtual bool GetSynchronous(const TCHAR* CacheKey, TArray<uint8>& OutData) = 0; 

FDerivedDataCacheInterface is implemented by FDerivedDataCache:

/**
 * Implementation of the derived data cache
 * This API is fully threadsafe
**/
class FDerivedDataCache : public FDerivedDataCacheInterface

It has a subclass of its own:

/** 
 * Implementation of the derived data cache, this layer implements rollups
**/
class FDerivedDataCacheWithRollups : public FDerivedDataCache

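Both classes are defined in the .cpp file rather than in a header. They are therefore completely private and never accessed directly from outside; everything external code needs is declared in FDerivedDataCacheInterface. At run time there is only one FDerivedDataCache singleton, and the logic that chooses between the two classes is in the InternalSingleton function in DerivedDataCache.cpp:
[Screenshot: the InternalSingleton function in DerivedDataCache.cpp]
As the screenshot shows, the choice is made according to the DDCNoRollups command-line parameter.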
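The selection can be pictured roughly like this. This is a sketch, not the engine's InternalSingleton: ToyCache, ToyCacheWithRollups, HasSwitch, and CreateCache are hypothetical, and I am assuming (from the switch name alone) that passing DDCNoRollups selects the plain class while the rollup variant is the default.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-ins for FDerivedDataCache and FDerivedDataCacheWithRollups.
struct ToyCache
{
    virtual ~ToyCache() = default;
    virtual const char* Name() const { return "FDerivedDataCache"; }
};

struct ToyCacheWithRollups : ToyCache
{
    const char* Name() const override { return "FDerivedDataCacheWithRollups"; }
};

// Hypothetical helper: true if "-<Switch>" appears among the arguments.
bool HasSwitch(const std::vector<std::string>& Args, const std::string& Switch)
{
    assert(!Switch.empty());
    for (const std::string& Arg : Args)
    {
        if (Arg == "-" + Switch)
        {
            return true;
        }
    }
    return false;
}

// Assumption: the switch disables rollups; without it the rollup variant is used.
std::unique_ptr<ToyCache> CreateCache(const std::vector<std::string>& Args)
{
    if (HasSwitch(Args, "DDCNoRollups"))
    {
        return std::unique_ptr<ToyCache>(new ToyCache());
    }
    return std::unique_ptr<ToyCache>(new ToyCacheWithRollups());
}
```

Either way the caller only ever sees the base type, which matches the observation that both concrete classes stay hidden in the .cpp file.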

FDerivedDataBackend

This class is also abstract, with pure virtual functions, and serves as an interface. It has no base class:

class FDerivedDataBackend
{
public:
	// Singleton to retrieve the GLOBAL backend
	// @return Reference to the global cache backend
	static FDerivedDataBackend& Get();

	// Singleton to retrieve the root cache
	// @return Reference to the global cache root
	virtual FDerivedDataBackendInterface& GetRoot() = 0;

	// System Interface, copied from FDerivedDataCacheInterface
	virtual void NotifyBootComplete() = 0;
	virtual void AddToAsyncCompletionCounter(int32 Addend) = 0;
	virtual void WaitForQuiescence(bool bShutdown = false) = 0;
	virtual void GetDirectories(TArray<FString>& OutResults) = 0;
	virtual bool GetUsingSharedDDC() const = 0;

	// Mounts a read-only pak file.
	// @param PakFilename Pak filename
	virtual FDerivedDataBackendInterface* MountPakFile(const TCHAR* PakFilename) = 0;

	// Unmounts a read-only pak file.
	// @param PakFilename Pak filename
	virtual bool UnmountPakFile(const TCHAR* PakFilename) = 0;

	virtual void GatherUsageStats(TMap<FString, FDerivedDataCacheUsageStats>& UsageStats) = 0;
};

FDerivedDataBackendGraph is its implementation:

/**
  * This class is used to create a singleton that represents the derived data cache hierarchy and all of the wrappers necessary
  * ideally this would be data driven and the backends would be plugins...
**/
class FDerivedDataBackendGraph : public FDerivedDataBackend

It too exists as a single global singleton:

/**
* Singleton to retrieve the GLOBAL backend
 *
 * @return Reference to the global cache backend
 */
static FORCEINLINE FDerivedDataBackendGraph& Get()
{
	static FDerivedDataBackendGraph SingletonInstance;
	return SingletonInstance;
}

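This class is defined in the .cpp file rather than in a header, so external code does not need to care about its details; every interface that must be accessible is already declared in FDerivedDataBackend.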

FDerivedDataBackendInterface

This class has no parent class:

/** 
 * Interface for cache server backends. 
 * The entire API should be callable from any thread (except the singleton can be assumed to be called at least once before concurrent access).
**/
class FDerivedDataBackendInterface

One of its functions is GetCachedData:

/**Synchronous retrieve of a cache item
 *
 * @param	CacheKey	Alphanumeric+underscore key of this cache item
 * @param	OutData		Buffer to receive the results, if any were found
 * @return				true if any data was found, and in this case OutData is non-empty*/
virtual bool GetCachedData(const TCHAR* CacheKey, TArray<uint8>& OutData)=0;

It takes a TCHAR* CacheKey and stores the retrieved data in OutData.

FDerivedDataBackendInterface has the following subclasses:
FDerivedDataBackendAsyncPutWrapper
FDerivedDataBackendCorruptionWrapper
FDerivedDataBackendVerifyWrapper
FDerivedDataLimitKeyLengthWrapper
FFileSystemDerivedDataBackend
FHierarchicalDerivedDataBackend
FMemoryDerivedDataBackend
FPakFileDerivedDataBackend

FDerivedDataBackendGraph holds several pointers to FDerivedDataBackendInterface and its subclasses:

/** Root of the graph */
FDerivedDataBackendInterface*					RootCache;

/** References to all created backed interfaces */
TArray< FDerivedDataBackendInterface* > CreatedBackends;

/** Instances of backend interfaces which exist in only one copy */
FMemoryDerivedDataBackend*		BootCache;
FPakFileDerivedDataBackend*		WritePakCache;
FDerivedDataBackendInterface*	AsyncPutWrapper;
FDerivedDataBackendInterface*	KeyLengthWrapper;
FHierarchicalDerivedDataBackend* HierarchicalWrapper;
/** Support for multiple read only pak files. */
TArray<FPakFileDerivedDataBackend*>		ReadPakCache;

FDerivedDataPluginInterface

This is also an interface class with no parent class:

/** Interface for data deriving backends
 * This API will not be called concurrently, except that Build might be called on different instances if IsBuildThreadsafe.**/
class FDerivedDataPluginInterface

Its full interface is:

/** Get the plugin name, this is used as the first part of the cache key 
* @return	Name of the plugin**/
virtual const TCHAR* GetPluginName() const = 0;

/** Get the version of the plugin, this is used as part of the cache key. This is supposed to
* be a guid string ( ex. "69C8C8A6-A9F8-4EFC-875C-CFBB72E66486" )
* @return	Version string of the plugin**/
virtual const TCHAR* GetVersionString() const = 0;

/** Returns the largest and plugin specific part of the cache key. This must be a alphanumeric+underscore
* @return	Version number of the plugin, for licensees.**/
virtual FString GetPluginSpecificCacheKeySuffix() const = 0;

/** Indicates that this plugin is threadsafe. Note, the system itself will not call it concurrently if this false, however, then you are responsible for not calling the system itself concurrently.
* @return	true if this plugin is threadsafe**/
virtual bool IsBuildThreadsafe() const = 0;

/** Indicated that this plugin generates deterministic data. This is used for DDC verification */
virtual bool IsDeterministic() const { return false; }

/** Indicated that this plugin generates deterministic data. This is used for DDC verification */
virtual FString GetDebugContextString() const { return TEXT("Unknown Context"); }

/** Does the work of deriving the data. 
* @param	OutData	Array of bytes to fill in with the result data
* @return	true if successful, in the event of failure the cache is not updated and failure is propagated to the original caller.**/
virtual bool Build(TArray<uint8>& OutData) = 0;

The most notable of these is Build, which does the actual work of constructing the derived data and writes the result to OutData.
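To get a feel for what implementing this interface involves, here is a toy deriver written against a simplified copy of the interface. Everything in it is hypothetical: ToyPluginInterface mirrors the method names listed above using standard C++ types instead of TCHAR and TArray, and ToyUppercaseDeriver, its name string, and its version string are invented placeholders.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstdint>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Simplified, hypothetical mirror of the plugin interface (std types only).
struct ToyPluginInterface
{
    virtual ~ToyPluginInterface() = default;
    virtual const char* GetPluginName() const = 0;
    virtual const char* GetVersionString() const = 0;
    virtual std::string GetPluginSpecificCacheKeySuffix() const = 0;
    virtual bool IsBuildThreadsafe() const = 0;
    virtual bool Build(std::vector<std::uint8_t>& OutData) = 0;
};

// Toy deriver: the "derived data" is just the upper-cased source text.
class ToyUppercaseDeriver : public ToyPluginInterface
{
public:
    explicit ToyUppercaseDeriver(std::string InSource) : Source(std::move(InSource)) {}

    const char* GetPluginName() const override { return "TOYUPPERCASE"; }

    // Placeholder version; changing it would invalidate previously cached entries.
    const char* GetVersionString() const override { return "00000000000000000000000000000001"; }

    // Everything that influences the output must be reflected in the suffix;
    // here that is only the source text, reduced to a hash.
    std::string GetPluginSpecificCacheKeySuffix() const override
    {
        return std::to_string(std::hash<std::string>{}(Source));
    }

    bool IsBuildThreadsafe() const override { return true; }

    bool Build(std::vector<std::uint8_t>& OutData) override
    {
        OutData.resize(Source.size());
        std::transform(Source.begin(), Source.end(), OutData.begin(),
            [](char C) { return static_cast<std::uint8_t>(std::toupper(static_cast<unsigned char>(C))); });
        assert(OutData.size() == Source.size());
        return true;
    }

private:
    std::string Source;
};
```

The point of the split is that the cache never needs to know what the bytes mean: the deriver supplies the key parts and, on a miss, the data.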

FDDCCleanup

FDDCCleanup is an FRunnable:

/** 
 * DDC Filesystem Cache cleanup thread.
 */
class DERIVEDDATACACHE_API FDDCCleanup : public FRunnable

Where is my DDC data stored?

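The official documentation says that the DDC paths are specified in the DefaultEngine.ini configuration file, and that an engine installed through the Epic Games Launcher behaves differently from one built from source. Reading the source code confirms this.
FDerivedDataBackendGraph has a member variable: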

/** List of directories used by the DDC */
TArray<FString> Directories;

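Although Directories is a list, only one place in the code adds elements to it, and a breakpoint set there was hit only once:
[Screenshot: call stack at the breakpoint where a path is added to Directories]
The call stack shows that the path is obtained during engine initialization, when the FDerivedDataBackendGraph singleton is created.
Further up the stack, the FDerivedDataBackendGraph constructor contains this logic:
[Screenshot: graph selection logic in the FDerivedDataBackendGraph constructor]
It uses FApp::IsEngineInstalled() (whether an installed build of the engine is running) to decide between InstalledDerivedDataBackendGraph and DerivedDataBackendGraph.
With an engine built from source (my case), the configuration in DerivedDataBackendGraph is used: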

[DerivedDataBackendGraph]
MinimumDaysToKeepFile=7
Root=(Type=KeyLength, Length=120, Inner=AsyncPut)
AsyncPut=(Type=AsyncPut, Inner=Hierarchy)
Hierarchy=(Type=Hierarchical, Inner=Boot, Inner=Pak, Inner=EnginePak, Inner=Local, Inner=Shared)
Boot=(Type=Boot, Filename="%GAMEDIR%DerivedDataCache/Boot.ddc", MaxCacheSize=512)
Local=(Type=FileSystem, ReadOnly=false, Clean=false, Flush=false, PurgeTransient=true, DeleteUnused=true, UnusedFileAge=34, FoldersToClean=-1, Path=%ENGINEDIR%DerivedDataCache, EnvPathOverride=UE-LocalDataCachePath, EditorOverrideSetting=LocalDerivedDataCache)
Shared=(Type=FileSystem, ReadOnly=false, Clean=false, Flush=false, DeleteUnused=true, UnusedFileAge=10, FoldersToClean=10, MaxFileChecksPerSec=1, Path=?EpicDDC, EnvPathOverride=UE-SharedDataCachePath, EditorOverrideSetting=SharedDerivedDataCache, CommandLineOverride=SharedDataCachePath)
AltShared=(Type=FileSystem, ReadOnly=true, Clean=false, Flush=false, DeleteUnused=true, UnusedFileAge=23, FoldersToClean=10, MaxFileChecksPerSec=1, Path=?EpicDDC2, EnvPathOverride=UE-SharedDataCachePath2)
Pak=(Type=ReadPak, Filename="%GAMEDIR%DerivedDataCache/DDC.ddp")
EnginePak=(Type=ReadPak, Filename=%ENGINEDIR%DerivedDataCache/DDC.ddp)

Here Local and Shared refer to the local DDC and the shared DDC respectively. Mine is local, so look at the Path of Local:

Path=%ENGINEDIR%DerivedDataCache

That is the DerivedDataCache folder under the Engine directory, which matches what I saw while debugging.
For InstalledDerivedDataBackendGraph, the Path of Local is:

Path=%ENGINEVERSIONAGNOSTICUSERDIR%DerivedDataCache

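ENGINEVERSIONAGNOSTICUSERDIR (the engine-version-agnostic user directory) turned out to be C:\Users\admin\AppData\Local\UnrealEngine\Common on my machine. Its parent directory also contains folders for specific engine versions:
[Screenshot: the UnrealEngine folder, with Common next to the version-specific folders]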
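The two steps described in this section, picking the graph section and turning a Path value into a real directory, can be sketched as follows. This is an illustration under assumptions, not the engine's parsing code: ChooseGraphName, ReplaceAll, ExpandDdcPath, and the example directory are hypothetical, and I am assuming that tokens such as %ENGINEDIR% are replaced by plain text substitution.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>

// Hypothetical: pick the config section based on whether the engine is an installed build.
const char* ChooseGraphName(bool bEngineInstalled)
{
    return bEngineInstalled ? "InstalledDerivedDataBackendGraph" : "DerivedDataBackendGraph";
}

// Replace every occurrence of From in Text with To.
std::string ReplaceAll(std::string Text, const std::string& From, const std::string& To)
{
    assert(!From.empty());
    std::size_t Pos = 0;
    while ((Pos = Text.find(From, Pos)) != std::string::npos)
    {
        Text.replace(Pos, From.size(), To);
        Pos += To.size(); // skip past the inserted text
    }
    return Text;
}

// Hypothetical: expand %TOKEN% placeholders in a Path value from the config.
std::string ExpandDdcPath(std::string Path, const std::map<std::string, std::string>& Tokens)
{
    for (const auto& Pair : Tokens)
    {
        Path = ReplaceAll(std::move(Path), Pair.first, Pair.second);
    }
    return Path;
}
```

For example, with %ENGINEDIR% mapped to a made-up "D:/UE4/Engine/", the value "%ENGINEDIR%DerivedDataCache" expands to "D:/UE4/Engine/DerivedDataCache".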

When is DDC data built, and when is it used?

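The comment on FDerivedDataCacheInterface::GetAsynchronous already tells us: when DDC data is requested and does not yet exist, a build is triggered. So this is the function to focus on.
It comes in synchronous and asynchronous versions, each with a FDerivedDataPluginInterface* DataDeriver overload and a TCHAR* CacheKey overload, 2 x 2 = 4 versions in total. Their bodies are very similar: each creates a FAsyncTask<FBuildAsyncWorker>. The difference is that the synchronous versions run the task immediately, while the asynchronous versions run it on another thread. As for the parameter:
The FDerivedDataPluginInterface* DataDeriver overload passes that object to the task: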

FAsyncTask<FBuildAsyncWorker> PendingTask(DataDeriver, *CacheKey, true);

The TCHAR* CacheKey overload does not:

FAsyncTask<FBuildAsyncWorker> PendingTask((FDerivedDataPluginInterface*)NULL, CacheKey, true);

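As for when it is called, I had hoped to find a pattern, but the DDC is used in too many places. It appears to be a general-purpose mechanism, and each asset type uses it differently.
Call stacks observed so far:

ShaderMap:
[Screenshot: call stack, ShaderMap]
UObjects such as UBodySetup:
[Screenshot: call stack, UBodySetup]
Texture:
[Screenshot: call stack, Texture]
Texture (another call stack):
[Screenshot: call stack, Texture]
I know there are many more call sites. The best approach is probably to study how the DDC data is obtained whenever a particular asset type is being investigated.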

How is the CacheKey computed?

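For the DDC I think the CacheKey is important to examine, because whatever contributes to the CacheKey must be exactly the "platform" and "format" that an asset's derived data is based on.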

Here too, each asset type has its own answer. Broadly there are two cases:

1. Types that use FDerivedDataPluginInterface:
/** 
* Internal function to build a cache key out of the plugin name, versions and plugin specific info
 * @param	DataDeriver	plugin to produce the elements of the cache key.
 * @return				Assembled cache key
**/
static FString BuildCacheKey(FDerivedDataPluginInterface* DataDeriver)
{
	FString Result = FDerivedDataCacheInterface::BuildCacheKey(DataDeriver->GetPluginName(), DataDeriver->GetVersionString(), *DataDeriver->GetPluginSpecificCacheKeySuffix());
	return Result;
}

The details depend on the class that implements FDerivedDataPluginInterface. So far I have found six implementations:
FDerivedDataAnimationCompression
FChaosDerivedDataCooker
FDerivedDataPhysXCooker
FDerivedAudioDataCompressor
FDerivedDataGeometryCollectionCooker
FDerivedDataNavCollisionCooker

2. Types that do not use FDerivedDataPluginInterface:

Here the CacheKey computation is even more specific to the type.
For Texture, for example:

/**
 * Constructs a derived data key from the key suffix.
 * @param KeySuffix - The key suffix.
 * @param OutKey - The full derived data key.
 */
static void GetTextureDerivedDataKeyFromSuffix(const FString& KeySuffix, FString& OutKey)
{
	OutKey = FDerivedDataCacheInterface::BuildCacheKey(
		TEXT("TEXTURE"),
		TEXTURE_DERIVEDDATA_VER,
		*KeySuffix
		);
}

KeySuffix comes from the GetTextureDerivedDataKeySuffix function:

/**
 * Computes the derived data key suffix for a texture with the specified compression settings.
 * @param Texture - The texture for which to compute the derived data key.
 * @param BuildSettings - Build settings for which to compute the derived data key.
 * @param OutKeySuffix - The derived data key suffix.
 */
 void GetTextureDerivedDataKeySuffix(const UTexture& Texture, const FTextureBuildSettings* BuildSettingsPerLayer, FString& OutKeySuffix)

An excerpt:

// build the key, but don't use include the version if it's 0 to be backwards compatible
OutKeySuffix = FString::Printf(TEXT("%s_%s%s%s_%02u_%s"),
	*BuildSettings.TextureFormatName.GetPlainNameString(),
	Version == 0 ? TEXT("") : *FString::Printf(TEXT("%d_"), Version),
	*Texture.Source.GetIdString(),
	*CompositeTextureStr,
	(uint32)NUM_INLINE_DERIVED_MIPS,
	(TextureFormat == NULL) ? TEXT("") : *TextureFormat->GetDerivedDataKeyString(Texture)
	);

Note that it uses TextureFormat (of type ITextureFormat):

/**
 * Interface for texture compression modules.
 */
class ITextureFormat
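To see what such a key might look like, here is a small sketch that reproduces the "%s_%s%s%s_%02u_%s" layout of the texture suffix with standard C++ strings and then joins name, version, and suffix. Both functions are hypothetical. In particular I am assuming that BuildCacheKey joins its three parts with underscores; the excerpts above do not show its body, and any sanitizing the engine may apply to the final key is left out. All values in the example are made up.

```cpp
#include <cassert>
#include <string>

// Hypothetical: mirrors the "%s_%s%s%s_%02u_%s" layout from the excerpt above.
std::string BuildToyTextureSuffix(const std::string& FormatName, int Version,
                                  const std::string& SourceId, const std::string& CompositeStr,
                                  unsigned NumInlineMips, const std::string& FormatKey)
{
    std::string Out = FormatName + "_";
    if (Version != 0)
    {
        Out += std::to_string(Version) + "_"; // the version is omitted when it is 0
    }
    Out += SourceId + CompositeStr + "_";
    if (NumInlineMips < 10)
    {
        Out += "0"; // emulate %02u zero padding
    }
    Out += std::to_string(NumInlineMips) + "_" + FormatKey;
    return Out;
}

// Assumption: the full key is name, version, and suffix joined by underscores.
std::string BuildToyCacheKey(const std::string& PluginName, const std::string& Version,
                             const std::string& Suffix)
{
    assert(!PluginName.empty());
    return PluginName + "_" + Version + "_" + Suffix;
}
```

With the made-up inputs ("DXT1", 0, "ABCD", "", 7, "") the suffix is "DXT1_ABCD_07_", and BuildToyCacheKey("TEXTURE", "VER", ...) then gives "TEXTURE_VER_DXT1_ABCD_07_". The pieces that feed the key (format name, source id, format-specific string) are what distinguish one derived version of a texture from another.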

What algorithm does the build use?

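Again, the answer differs per asset type.
For any given asset type, search for the keyword FDerivedDataCacheInterface:
[Screenshot: search results for FDerivedDataCacheInterface]
and follow the trail to the algorithm.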

For Texture, for instance, FTexturePlatformData::Cache creates an FTextureCacheDerivedDataWorker and sets it to work, and the actual derivation logic is in FTextureCacheDerivedDataWorker::DoWork().

Summary

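Many questions about the DDC remain, but one important thing is now clear: it is closely tied to the individual asset types. The DDC module is small, but what it implements is a framework; the logic for the keys and contents of DDC entries lives in the code of each asset type.
This post did not get me to a full understanding of the DDC, but it should help when I later study the DDC of specific asset types.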
