Studying VTM Data Structures

pinkelectronic

Last modified 2022-10-31 10:38:22

Category: H.266 video coding. Tags: learning, video codecs

First published 2022-10-13 16:14:04

Article link: https://blog.csdn.net/pinkelectronic/article/details/127301600

Learning means looking at the same things over and over and still not getting them.

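This post exists only to reinforce my own memory. For both layout and completeness, the following article is the better one:

"Data Structures of the VVC Reference Software VTM"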

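VTM does not have many basic data structures, and the definitions I can actually follow are fewer still (lol). Even so, going through them once gives a rough picture of the inheritance relationships. Here is the diagram.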

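Clearly the ancestors are Size and Position, and Area combines the two. For Size and Position the basics are enough: once inherited, they are only mentioned in passing in constructors, and they carry hardly any member functions anyway. UnitArea through CS forms one connected group, while UnitBuf ties into Picture. These two groups are probably the key parts.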
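The relationship described above can be sketched in a few lines. This is a minimal illustration with simplified stand-ins, not VTM's actual definitions: the member sets are reduced, the 4:2:0 layout is just one example, and `makeUnitArea420` is a hypothetical helper invented here.

```cpp
#include <cassert>
#include <vector>

// Simplified stand-ins; the real VTM types carry more members.
struct Position {
  int x;
  int y;
  Position(int x_, int y_) : x(x_), y(y_) {}
};

struct Size {
  int width;
  int height;
  Size(int w, int h) : width(w), height(h) {}
};

// Area combines both bases: a top-left corner plus an extent.
struct Area : Position, Size {
  Area(int x_, int y_, int w, int h) : Position(x_, y_), Size(w, h) {}

  // True if p lies inside this rectangle (right and bottom edges exclusive).
  bool contains(const Position& p) const {
    return p.x >= x && p.x < x + width && p.y >= y && p.y < y + height;
  }
};

// A unit groups one Area per colour component.
struct UnitArea {
  std::vector<Area> blocks;
};

// Hypothetical helper: derive the two half-size chroma areas of a 4:2:0 unit.
inline UnitArea makeUnitArea420(const Area& luma) {
  assert(luma.width % 2 == 0 && luma.height % 2 == 0);
  const Area chroma(luma.x / 2, luma.y / 2, luma.width / 2, luma.height / 2);
  UnitArea ua;
  ua.blocks = {luma, chroma, chroma};
  return ua;
}
```

Because Area inherits from both bases, code that only needs a corner or only needs an extent can take a `Position` or a `Size` and still accept an `Area`.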

The table below compares some HM and VTM concepts.

HM	NextSoftware
Z-index, (CTU)-RS-address, Depth	Position, Size, (Comp)Area, UnitArea
TComDataCU	CodingUnit, PredictionUnit, TransformUnit; operations in the CU, PU, TU namespaces; CodingStructure
TComTU	Partitioning is governed by Partitioner
TComPicYuv	Picture
TComPic	Picture
TComYuv	UnitAreaBuf

In the reference document CS is highlighted in red; here I set it in bold. CS runs through the entire document and is very important.

Below are the original text, followed by my translation and my own notes:

1. CodingStructure Basics:

Contains CodingUnit etc. objects and maps them to the picture // "map" is a verb here: to associate with a location
A TComDataCU replacement, but globally allocated
Top-level CodingStructure contains all CU’s, PU’s and TU’s in the frame
Sub-level CodingStructure contains a representation of a specific UnitArea
After creation it’s empty and needs to be filled
addCU/PU/TU methods create and map the specific object
getCU/PU/TU fetches the specific objects addressed using global Position
Dynamically allocates the required resources
Uses dynamic_cache for increased performance

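Contains CodingUnit and similar objects and maps them onto the picture // a one-to-one correspondence
Replaces TComDataCU, but is allocated globally // TComDataCU is no longer used
The top-level CodingStructure contains every CU, PU and TU in the frame // CS holds everything, so no further subdivision is needed
A sub-level CodingStructure contains a representation of one specific UnitArea
After creation it is empty and needs to be filled
The addCU/PU/TU methods create and map the specific object // add and get mean what they say
getCU/PU/TU fetches the specific object, addressed by its global Position
Dynamically allocates the resources it needs
Uses dynamic_cache to improve performance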
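To make "create and map" concrete, here is a toy sketch under stated assumptions. The per-sample index map is an illustration of the idea, not a copy of how VTM lays out its storage, and the class keeps only `addCU` and `getCU` from the list above.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

struct Position { int x; int y; };
struct Area { int x; int y; int width; int height; };

struct CodingUnit {
  Area area;
  std::size_t idx;  // 1-based, so 0 can mean "no CU here"
};

// Toy coding structure: starts empty, then maps every sample to the CU covering it.
class CodingStructure {
public:
  CodingStructure(int width, int height)
      : m_width(width), m_height(height),
        m_cuIdx(static_cast<std::size_t>(width) * height, 0) {}

  // Create a CU and record its index over every sample it covers.
  CodingUnit& addCU(const Area& a) {
    assert(a.x >= 0 && a.y >= 0);
    assert(a.x + a.width <= m_width && a.y + a.height <= m_height);
    m_cus.push_back(CodingUnit{a, m_cus.size() + 1});
    for (int y = a.y; y < a.y + a.height; ++y)
      for (int x = a.x; x < a.x + a.width; ++x)
        m_cuIdx[static_cast<std::size_t>(y) * m_width + x] = m_cus.back().idx;
    return m_cus.back();
  }

  // Fetch the CU at a global position; nullptr if nothing was added there yet.
  CodingUnit* getCU(const Position& p) {
    if (p.x < 0 || p.y < 0 || p.x >= m_width || p.y >= m_height) return nullptr;
    const std::size_t idx = m_cuIdx[static_cast<std::size_t>(p.y) * m_width + p.x];
    return idx == 0 ? nullptr : &m_cus[idx - 1];
  }

private:
  int m_width;
  int m_height;
  std::vector<std::size_t> m_cuIdx;  // sample -> CU index, 0 = unmapped
  std::deque<CodingUnit> m_cus;      // deque keeps references valid on push_back
};
```

A lookup costs one array read regardless of how many CUs exist, which is the point of mapping objects onto the picture instead of searching a list.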

2. RD-Search under CS: // I don't fully understand this part yet

Designed for Top-Down approach
Allows for local test encoding with “transparent” global context
Follows the well known best-temp scheme with up-propagation // i.e. best/temp keeps the best candidate so far, I think
Hierarchically cascaded
A CodingStructure is set up to represent a local UnitArea
Calls outside of this UnitArea are forwarded to the parent CodingStructure
Parent nodes are not aware of the children nodes
Best candidates need to be propagated to the parents

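Designed for a top-down approach
Allows local test encoding with a "transparent" global context
Follows the well-known best-temp scheme with upward propagation
Hierarchically cascaded
A CodingStructure is set up to represent a local UnitArea, and calls outside this UnitArea are forwarded to the parent CodingStructure // i.e. reaching a CU outside the area means going through the parent node

Put another way: a CodingStructure is built to represent one local UnitArea, and accessing information outside that UnitArea means going back up to the higher-level CodingStructure.

Parent nodes are not aware of their child nodes
The best candidates need to be propagated to the parent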
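The forwarding and up-propagation rules are easier to see in a toy model. The sketch below is my own simplification, not VTM code: a "mode" stored per position stands in for real CU data, and `keepBetter` and `propagateToParent` are hypothetical names.

```cpp
#include <cassert>
#include <map>
#include <utility>

struct Area {
  int x; int y; int width; int height;
  bool contains(int px, int py) const {
    return px >= x && px < x + width && py >= y && py < y + height;
  }
};

struct CodingStructure {
  Area area;
  CodingStructure* parent;                    // a child knows its parent, never the reverse
  std::map<std::pair<int, int>, int> modes;   // position -> chosen mode (toy payload)
  double cost;

  // Queries outside the local area are forwarded up the chain.
  const int* getMode(int x, int y) const {
    if (!area.contains(x, y)) return parent ? parent->getMode(x, y) : nullptr;
    auto it = modes.find(std::make_pair(x, y));
    return it == modes.end() ? nullptr : &it->second;
  }
};

// Best-temp scheme: after testing a candidate in temp, keep the cheaper one in best.
inline void keepBetter(CodingStructure*& best, CodingStructure*& temp) {
  if (temp->cost < best->cost) std::swap(best, temp);
}

// Up-propagation: the parent cannot see its children, so the winner is copied into it.
inline void propagateToParent(const CodingStructure& best) {
  assert(best.parent != nullptr);
  for (const auto& kv : best.modes) best.parent->modes[kv.first] = kv.second;
  best.parent->cost += best.cost;
}
```

The "transparent" global context falls out of `getMode`: a local test encode can ask about a neighbour outside its area and get the parent's answer without knowing where it came from.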

3. Partitioner

A simple class governing splitting (CU and TU, quad-tree and possibly others)
Modelled as a stack – new splits are created as levels on the currently processed area
For HEVC
Contains accessors for current split info (partitioner.curr*)
Depth (CU, TU) as well as the actual current UnitArea
For QTBT and further (additionally to HEVC-features)
Allows to set split restrictions (e.g. constraint splits at a certain level)
Allows to perform split plausibility checks (canSplit)

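A class that manages splitting (CU and TU; quad-tree, MTT and BTT)
Stack model: splits proceed level by level, and each new split is applied to an area that is already being processed
Contains accessors for the current split info, i.e. the partitioning structure (partitioner.curr*): depth (CU, TU) and the actual current UnitArea
Allows setting split restrictions (e.g. constraining splits at a certain level)
Allows split plausibility checks (canSplit)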
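The stack model can be sketched with a quad-tree-only partitioner. This is an assumption-laden toy: the method names echo the `curr*` and `canSplit` accessors quoted above, but the class, its signatures and the minimum-size rule are invented for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Area { int x; int y; int width; int height; };

// Toy partitioner: every split pushes a new level onto a stack.
class QuadPartitioner {
public:
  QuadPartitioner(const Area& root, int minSize) : m_minSize(minSize) {
    Level l;
    l.parts.push_back(root);
    l.idx = 0;
    m_stack.push_back(l);
  }

  // Accessors for the current split state.
  const Area& currArea() const { return m_stack.back().parts[m_stack.back().idx]; }
  int currDepth() const { return static_cast<int>(m_stack.size()) - 1; }

  // Plausibility check: both halves must stay at or above the minimum size.
  bool canSplit() const {
    const Area& a = currArea();
    return a.width / 2 >= m_minSize && a.height / 2 >= m_minSize;
  }

  // Split the current area into four quadrants and descend into the first.
  void splitCurrArea() {
    assert(canSplit());
    const Area a = currArea();  // copy first: push_back may reallocate the stack
    const int w = a.width / 2;
    const int h = a.height / 2;
    Level l;
    l.parts = {{a.x, a.y, w, h}, {a.x + w, a.y, w, h},
               {a.x, a.y + h, w, h}, {a.x + w, a.y + h, w, h}};
    l.idx = 0;
    m_stack.push_back(l);
  }

  // Move to the next sibling; false when the current level is exhausted.
  bool nextPart() {
    Level& l = m_stack.back();
    if (l.idx + 1 >= l.parts.size()) return false;
    ++l.idx;
    return true;
  }

  // Leave the current level and return to the area that was split.
  void exitCurrSplit() {
    assert(m_stack.size() > 1);
    m_stack.pop_back();
  }

private:
  struct Level {
    std::vector<Area> parts;
    std::size_t idx;
  };
  int m_minSize;
  std::vector<Level> m_stack;
};
```

Because the stack depth is the split depth, the encoder never has to pass depth counters around; it asks the partitioner where it currently is.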

4. Data Ownership // the lineage of the data, i.e. who owns what

Each piece of data is owned by some object, which needs to allocate and release it
Picture
Owned by EncLib or DecLib
Owns signal buffers, Slice objects, SEI messages and TileMap
AreaBuf, UnitBuf
Do not own any data
PelStorage
Might own the buffers (depends if create or createFromBuf used for creation)
Owned data is stored in m_origin member

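Every piece of data belongs to some object, and that object has to allocate and release it
Picture
Owned by EncLib or DecLib
Owns the signal buffers, Slice objects, SEI messages and the TileMap
AreaBuf, UnitBuf
Do not own any data
PelStorage
May own its buffers (depending on whether create or createFromBuf was used to create it)
Owned data is stored in the m_origin member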
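The difference between a view and an owner is worth a small sketch. The names `create`, `createFromBuf` and `m_origin` come from the list above; everything else, including the single-plane layout, the `std::vector` backing store and the `ownsData` and `view` helpers, is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Pel = std::int16_t;

// Non-owning view: a pointer plus geometry, never allocates or frees.
struct AreaBuf {
  Pel* buf;
  int stride;
  int width;
  int height;

  AreaBuf() : buf(nullptr), stride(0), width(0), height(0) {}
  AreaBuf(Pel* b, int s, int w, int h) : buf(b), stride(s), width(w), height(h) {}

  Pel& at(int x, int y) const { return buf[y * stride + x]; }

  // A sub-view shares the same memory and keeps the parent's stride.
  AreaBuf subBuf(int x, int y, int w, int h) const {
    return AreaBuf(buf + y * stride + x, stride, w, h);
  }
};

// Storage that owns its samples only when built with create().
class PelStorage {
public:
  PelStorage() {}
  PelStorage(const PelStorage&) = delete;             // copying would leave a dangling view
  PelStorage& operator=(const PelStorage&) = delete;

  // Allocate and own a zero-filled w x h plane.
  void create(int w, int h) {
    m_origin.assign(static_cast<std::size_t>(w) * h, 0);
    m_view = AreaBuf(m_origin.data(), w, w, h);
  }

  // Borrow somebody else's memory; nothing is allocated or owned.
  void createFromBuf(const AreaBuf& other) {
    m_origin.clear();
    m_view = other;
  }

  bool ownsData() const { return !m_origin.empty(); }
  AreaBuf view() const { return m_view; }

private:
  std::vector<Pel> m_origin;  // owned samples, empty when borrowing
  AreaBuf m_view;
};
```

A borrowed storage must not outlive the object it borrows from, which is exactly why the ownership list spells out who allocates and who releases.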

CodingStructure
Top-Layer: owned by Picture
Links to signal buffers of Picture, does not own them
Other (temporary in RD-Search): owned by EncCu or IntraSearch
Contains own signal buffers, owns them
Always owns buffers describing the structure and layout (not signal)
Owns transformation coefficient buffers
Does not own CodingUnit etc., only links to them through dynamic_cache

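CodingStructure
Top-level CS: owned by Picture
Links to the signal buffers of Picture but does not own them
Others (temporary, during RD-Search): owned by EncCu or IntraSearch
Contain their own signal buffers and own them
Always owns the buffers that describe structure and layout (not signal)
Owns the transform coefficient buffers
Does not own CodingUnit etc.; only links to them through dynamic_cache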

CodingUnit, PredictionUnit, TransformUnit
Owned by dynamic_cache – objects need to be acquired by get and freed by cache
TransformUnit
Does not own transformation coefficient buffers
Links to buffers from CodingStructure
dynamic_cache
Top-Level cache is global (dynamically allocated on runtime and freed on exit)
RD-search cache is owned by EncCu and IntraSearch

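CodingUnit, PredictionUnit, TransformUnit
Owned by dynamic_cache: objects are acquired with get and released with cache
TransformUnit
Does not own transform coefficient buffers
Links to buffers from the CodingStructure
dynamic_cache
The top-level cache is global (allocated dynamically at runtime and freed on exit)
The RD-search cache is owned by EncCu and IntraSearch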
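The get/cache pairing behaves like an object pool. Below is a minimal sketch of that idea; the `get` and `cache` names follow the text above, while the class itself, its template form and the `allocated` counter are assumptions and not VTM's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Toy object pool: the pool owns every object, callers only borrow pointers.
template <typename T>
class DynamicCache {
public:
  // Hand out a recycled object if one is free, otherwise allocate a new one.
  T* get() {
    if (m_free.empty()) {
      m_all.push_back(std::unique_ptr<T>(new T()));
      return m_all.back().get();
    }
    T* obj = m_free.back();
    m_free.pop_back();
    return obj;
  }

  // Give an object back; it stays allocated until the pool itself is destroyed.
  // Note: recycled objects are not reset here, so callers must reinitialise them.
  void cache(T* obj) {
    assert(obj != nullptr);
    m_free.push_back(obj);
  }

  std::size_t allocated() const { return m_all.size(); }

private:
  std::vector<std::unique_ptr<T>> m_all;  // ownership lives here
  std::vector<T*> m_free;                 // objects currently available for reuse
};

struct CodingUnit {
  int x;
  int y;
  CodingUnit() : x(0), y(0) {}
};
```

RD search creates and discards units constantly, so reusing released objects avoids repeated heap allocation, which fits the "increased performance" remark in the CodingStructure section.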