JVET-W2002 prediction section translation (unfinished)

青椒鸡汤

Category: video coding. Tags: c++

Article link: https://blog.csdn.net/dfhg54/article/details/124849797

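Summary (auto-generated by CSDN): This article covers intra prediction in the VVC (Versatile Video Coding) standard, including the extension to 67 prediction modes and the wide-angle modes for non-square blocks. It also describes the 4-tap interpolation filters and reference sample smoothing used to improve prediction accuracy. VVC supports the 4:2:2 and 4:4:4 chroma formats and refines chroma prediction, using the cross-component linear model (CCLM) to reduce redundancy between components. These tools aim to improve compression efficiency and lower bandwidth requirements.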

Related post: Intra prediction flow (with a few more small details), 青椒鸡汤's blog, CSDN

3.3 Intra prediction

3.3.1 Intra prediction with 67 modes

To capture the arbitrary edge directions presented in natural video, the number of directional intra modes in VVC is extended from 33, as used in HEVC, to 65. The new directional modes not in HEVC are depicted as red dotted arrows in Figure 12, and the planar and DC modes remain the same. These denser directional intra prediction modes apply for all block sizes and for both luma and chroma intra predictions.

In VVC, several conventional angular intra prediction modes are adaptively replaced with wide-angle intra prediction modes for the non-square blocks. Wide angle intra prediction is described in 3.3.1.2.

In HEVC, every intra-coded block has a square shape and the length of each of its sides is a power of 2. Thus, no division operations are required to generate an intra-predictor using DC mode. In VVC, blocks can have a rectangular shape, which in the general case necessitates a division operation per block. To avoid division operations for DC prediction, only the longer side is used to compute the average for non-square blocks.

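To capture the arbitrary edge directions found in natural video, the number of directional intra modes grows from 33 in HEVC to 65. The modes that are new relative to HEVC are drawn as red dotted arrows in Figure 12, and Planar and DC are retained. The denser set of modes applies to all block sizes and to both luma and chroma intra prediction.

For non-square blocks, VVC adaptively replaces several conventional angular modes with wide-angle modes.

In HEVC every intra-coded block is square with power-of-two sides, so the DC predictor needs no division. VVC blocks can be rectangular, which in general would cost one division per block. To avoid it, only the longer side is averaged for non-square blocks.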
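
The division-free DC rule can be sketched as follows. This is only an illustration under stated assumptions: the function name `dc_predictor`, the list-based reference rows, and the rounding offset are my own choices and are not taken from the text above. It assumes both side lengths are powers of two.

```python
def dc_predictor(top, left):
    """DC value for a block of width len(top) and height len(left).

    Hypothetical helper: square blocks average both sides, non-square
    blocks average only the longer side, so the sample count is always
    a power of two and the mean reduces to a shift.
    """
    w, h = len(top), len(left)
    if w == h:
        total, count = sum(top) + sum(left), w + h   # square: both sides
    elif w > h:
        total, count = sum(top), w                   # wide block: top row only
    else:
        total, count = sum(left), h                  # tall block: left column only
    shift = count.bit_length() - 1                   # log2(count) for a power of two
    return (total + (count >> 1)) >> shift           # rounded mean (rounding is an assumption)


print(dc_predictor([10] * 8, [50] * 4))
```

For the 8x4 block in the usage line, the left column is ignored entirely, which is the price paid for avoiding the division by 12.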

3.3.1.1 Intra mode coding

Figure 12 – 67 intra prediction modes

To keep the complexity of the most probable mode (MPM) list generation low, an intra mode coding method with 6 MPMs is used by considering two available neighboring intra modes. The following three aspects are considered to construct the MPM list:

Default intra modes

Neighbouring intra modes

Derived intra modes

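In other words, the 6-entry MPM list is built from only two neighbouring intra modes, which keeps its generation simple. Its entries come from three sources: default modes, the neighbouring blocks' modes, and modes derived from them.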

A unified 6-MPM list is used for intra blocks irrespective of whether MRL and ISP coding tools are applied or not. The MPM list is constructed based on intra modes of the left and above neighboring block. Suppose the mode of the left is denoted as Left and the mode of the above block is denoted as Above, the unified MPM list is constructed as follows:

The same 6-MPM list is used for intra blocks whether or not MRL (multiple reference line intra prediction) or ISP (intra sub-partitions) is applied. It is built from the intra modes of the left and above neighbouring blocks, denoted Left and Above, as follows:

When a neighboring block is not available, its intra mode is set to Planar by default.

An unavailable neighbouring block is therefore treated as Planar.

If both modes Left and Above are non-angular modes:

MPM list → {Planar, DC, V, H, V − 4, V + 4}

That is, when neither neighbour is angular, the list holds only default modes, where V and H are the vertical and horizontal modes.

If one of modes Left and Above is angular mode, and the other is non-angular:

Set a mode Max as the larger mode in Left and Above

MPM list → {Planar, Max, Max − 1, Max + 1, Max − 2, Max + 2}

That is, when exactly one neighbour is angular, Max (the larger of Left and Above) is that angular mode, and the list holds Planar, Max, and the four angular modes nearest to Max.

If Left and Above are both angular and they are different:

Set a mode Max as the larger mode in Left and Above

Set a mode Min as the smaller mode in Left and Above

When both neighbours are angular and differ, Max is the larger and Min the smaller of the two modes.

If Max − Min is equal to 1:

MPM list → {Planar, Left, Above, Min − 1, Max + 1, Min − 2}

Otherwise, if Max − Min is greater than or equal to 62:

MPM list → {Planar, Left, Above, Min + 1, Max − 1, Min + 2}

Otherwise, if Max − Min is equal to 2:

MPM list → {Planar, Left, Above, Min + 1, Min − 1, Max + 1}

Otherwise:

MPM list → {Planar, Left, Above, Min − 1, Min + 1, Max − 1}

If Left and Above are both angular and they are the same:

MPM list → {Planar, Left, Left − 1, Left + 1, Left − 2, Left + 2}
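
The case analysis above can be sketched in a few lines. The names `build_mpm_list` and `adj` are hypothetical. The mode numbers Planar = 0, DC = 1, H = 18 and V = 50 follow the 67-mode numbering used in this text. The wrap-around in `adj`, which keeps "mode ± k" inside the angular range 2..66, is an assumption on my part, since the text above does not say what happens at the ends of the range. Unavailable neighbours are expected to be passed in as Planar.

```python
PLANAR, DC, HOR, VER = 0, 1, 18, 50   # mode numbers in the 67-mode scheme


def adj(mode, delta):
    """Angular neighbour of `mode`, wrapped to stay in 2..66 (wrap rule is an assumption)."""
    return 2 + ((mode - 2 + delta) % 64)


def build_mpm_list(left, above):
    """Build the 6-entry MPM list from the left and above neighbour modes."""
    left_ang, above_ang = left > DC, above > DC
    if not left_ang and not above_ang:              # both non-angular
        return [PLANAR, DC, VER, HOR, VER - 4, VER + 4]
    if left_ang != above_ang:                       # exactly one angular neighbour
        mx = max(left, above)
        return [PLANAR, mx, adj(mx, -1), adj(mx, 1), adj(mx, -2), adj(mx, 2)]
    if left == above:                               # both angular and identical
        return [PLANAR, left, adj(left, -1), adj(left, 1), adj(left, -2), adj(left, 2)]
    mx, mn = max(left, above), min(left, above)     # both angular and different
    diff = mx - mn
    if diff == 1:
        tail = [adj(mn, -1), adj(mx, 1), adj(mn, -2)]
    elif diff >= 62:
        tail = [adj(mn, 1), adj(mx, -1), adj(mn, 2)]
    elif diff == 2:
        tail = [adj(mn, 1), adj(mn, -1), adj(mx, 1)]
    else:
        tail = [adj(mn, -1), adj(mn, 1), adj(mx, -1)]
    return [PLANAR, left, above] + tail


print(build_mpm_list(10, 30))
```

The four sub-cases for two different angular neighbours exist to keep the derived entries from colliding with Left and Above, which is why the offsets change with Max − Min.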

Besides, the first bin of the mpm index codeword is CABAC context coded. In total three contexts are used, corresponding to whether the current intra block is MRL enabled, ISP enabled, or a normal intra block.

During 6 MPM list generation process, pruning is used to remove duplicated modes so that only unique modes can be included into the MPM list. For entropy coding of the 61 non-MPM modes, a Truncated Binary Code (TBC) is used.

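The first bin of the MPM index codeword is CABAC context coded, with three contexts selected by whether the current intra block uses MRL, uses ISP, or is a normal intra block.

While the 6-MPM list is generated, pruning removes duplicates so that every entry is unique. The 61 non-MPM modes are entropy coded with a truncated binary code (TBC).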
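
As a reminder of how a truncated binary code works for an alphabet of 61 symbols, here is a generic sketch. The function name `truncated_binary` is mine, and the code shows the textbook TBC only. How VVC orders the non-MPM modes before coding them is not covered here.

```python
def truncated_binary(value, n):
    """Encode `value` in 0..n-1 as a truncated binary bit string."""
    k = n.bit_length() - 1          # floor(log2(n))
    u = (1 << (k + 1)) - n          # number of short (k-bit) codewords
    if value < u:
        return format(value, f"0{k}b")          # first u symbols: k bits
    return format(value + u, f"0{k + 1}b")      # remaining symbols: k + 1 bits


for v in (0, 2, 3, 60):
    print(v, truncated_binary(v, 61))
```

With n = 61, three symbols get 5-bit codewords and the other 58 get 6-bit codewords, so the code wastes no codeword of the 6-bit space.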

3.3.1.2 Wide-angle intra prediction for non-square blocks

Conventional angular intra prediction directions are defined from 45 degrees to −135 degrees in clockwise direction. In VVC, several conventional angular intra prediction modes are adaptively replaced with wide-angle intra prediction modes for non-square blocks. The replaced modes are signalled using the original mode indexes, which are remapped to the indexes of wide angular modes after parsing. The total number of intra prediction modes is unchanged, i.e., 67, and the intra mode coding method is unchanged.

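Conventional angular intra prediction directions run clockwise from 45 degrees to −135 degrees. For non-square blocks, VVC adaptively replaces several conventional angular modes with wide-angle modes. A replaced mode is still signalled with its original index, which the decoder remaps to the wide-angle index after parsing. The total number of intra prediction modes stays at 67, and the intra mode coding method is unchanged.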

Figure 13 – Reference samples for wide-angular intra prediction

To support these prediction directions, the top reference with length 2W+1, and the left reference with length 2H+1, are defined as shown in Figure 13.

The top reference therefore has 2W+1 samples and the left reference has 2H+1 samples.

Table 3-2 – Intra prediction modes replaced by wide-angular modes

The number of conventional modes replaced by wide-angle modes depends on the aspect ratio of the block, as listed in Table 3-2.

Aspect ratio	Replaced intra prediction modes
W / H == 16	Modes 2,3,4,5,6,7,8,9,10,11,12, 13,14,15
W / H == 8	Modes 2,3,4,5,6,7,8,9,10,11,12, 13
W / H == 4	Modes 2,3,4,5,6,7,8,9,10,11
W / H == 2	Modes 2,3,4,5,6,7,8,9
W / H == 1	None
W / H == 1/2	Modes 59,60,61,62,63,64,65,66
W / H == 1/4	Modes 57,58,59,60,61,62,63,64,65,66
W / H == 1/8	Modes 55, 56,57,58,59,60,61,62,63,64,65,66
W / H == 1/16	Modes 53, 54, 55, 56,57,58,59,60,61,62,63,64,65,66
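
Table 3-2 can be turned into a small remapping sketch. The function names are hypothetical. The counts follow the table exactly as printed here (8, 10, 12 and 14 replaced modes for ratios 2, 4, 8 and 16). The offsets +65 and −67, which send the replaced modes to the wide-angle indices 67..80 and −1..−14, are an assumption chosen to be consistent with the index range quoted in this section.

```python
def replaced_mode_count(w, h):
    """Number of conventional modes replaced, per Table 3-2 as printed above."""
    if w == h:
        return 0
    # log2 of the aspect ratio; w and h are assumed to be powers of two
    ratio_log2 = abs((w.bit_length() - 1) - (h.bit_length() - 1))
    return 6 + 2 * ratio_log2


def remap_wide_angle(mode, w, h):
    """Map a signalled mode (2..66) to a wide-angle index when the table replaces it."""
    n = replaced_mode_count(w, h)
    if w > h and 2 <= mode < 2 + n:
        return mode + 65            # assumed offset: 2 -> 67, 3 -> 68, ...
    if h > w and 66 - n < mode <= 66:
        return mode - 67            # assumed offset: 66 -> -1, 65 -> -2, ...
    return mode                     # not replaced


print(remap_wide_angle(2, 16, 8), remap_wide_angle(66, 8, 16))
```

Because the remapping happens after parsing, the entropy coder never sees a wide-angle index, which is why the mode count and the MPM machinery stay unchanged.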

Figure 14 – Problem of discontinuity in case of directions beyond 45°

As shown in Figure 14, two vertically adjacent predicted samples may use two non-adjacent reference samples in the case of wide-angle intra prediction. Hence, a low-pass reference sample filter and side smoothing are applied to wide-angle prediction to reduce the negative effect of the increased gap ∆pα. If a wide-angle mode represents a non-fractional offset, the samples in the reference buffer are copied directly without applying any interpolation. Eight wide-angle modes satisfy this condition: [−14, −12, −10, −6, 72, 76, 78, 80]. This modification reduces the number of samples that need to be smoothed. It also aligns the design of non-fractional modes in the conventional prediction modes and the wide-angle modes.

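As Figure 14 shows, with wide-angle prediction two vertically adjacent predicted samples may draw on two non-adjacent reference samples. Because this gap grows with the prediction angle α, wide-angle prediction applies a low-pass reference sample filter and side smoothing to limit its effect. Eight wide-angle modes have non-fractional offsets: [−14, −12, −10, −6, 72, 76, 78, 80]. For a block predicted with one of them, the reference samples are used directly without interpolation, which reduces the number of samples that need smoothing and aligns the handling of non-fractional modes across conventional and wide-angle prediction.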

In VVC, 4:2:2 and 4:4:4 chroma formats are supported as well as 4:2:0. The chroma derived mode (DM) derivation table for the 4:2:2 chroma format was initially ported from HEVC, extending the number of entries from 35 to 67 to align with the extension of intra prediction modes. Since the HEVC specification does not support prediction angles below −135 degrees or above 45 degrees, luma intra prediction modes ranging from 2 to 5 are mapped to 2. Therefore the chroma DM derivation table for the 4:2:2 chroma format is updated by replacing some entries of the mapping table to convert prediction angles more precisely for chroma blocks.

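VVC supports the 4:2:2 and 4:4:4 chroma formats in addition to 4:2:0. The chroma derived mode (DM) table for 4:2:2 was first ported from HEVC, with its entries extended from 35 to 67 to match the larger set of intra prediction modes. Because HEVC does not support prediction angles below −135 degrees or above 45 degrees, luma modes 2 to 5 were all mapped to 2. The 4:2:2 DM table was therefore updated, replacing some entries so that prediction angles are converted more precisely for chroma blocks.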

3.3.1.3 Four-tap interpolation filters and reference sample smoothing

Four-tap intra interpolation filters are utilized to improve the directional intra prediction accuracy. In HEVC, a two-tap linear interpolation filter has been used to generate the intra prediction block in the directional prediction modes (i.e., excluding Planar and DC predictors). In VVC, the two sets of 4-tap IFs replace lower precision linear interpolation as in HEVC, where one is a DCT-based interpolation filter (DCTIF) and the other one is a 4-tap smoothing interpolation filter (SIF). The DCTIF is constructed in the same way as the one used for chroma component motion compensation in both HEVC and VVC. The SIF is obtained by convolving the 2-tap linear interpolation filter with [1 2 1] /4 filter.

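Four-tap interpolation filters improve the accuracy of directional prediction. HEVC generated the directional intra prediction block (Planar and DC excluded) with a 2-tap linear interpolation filter, and VVC replaces it with two sets of 4-tap filters. One is a DCT-based interpolation filter (DCTIF), the other a 4-tap smoothing interpolation filter (SIF). The DCTIF is built the same way as the filter used for chroma motion compensation in HEVC and VVC. The SIF is the convolution of the 2-tap linear interpolation filter with the [1 2 1]/4 filter.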
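
The construction of the SIF by convolution can be checked numerically. In this sketch the fractional position is expressed in 1/32 sample units, which is an assumption on my part, and the taps are left unnormalised (they sum to 128 = 32 × 4) rather than rounded to any particular table used by a codec. All names are hypothetical.

```python
def linear_taps(frac):
    """2-tap linear interpolation weights for a fractional position frac/32."""
    return [32 - frac, frac]


def convolve(a, b):
    """Full discrete convolution of two short integer sequences."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def sif_taps(frac):
    """4 taps from linear * [1, 2, 1]; unnormalised, the taps sum to 128."""
    return convolve(linear_taps(frac), [1, 2, 1])


print(sif_taps(0), sif_taps(16))
```

At frac = 0 the result is just the [1 2 1] smoothing kernel, which shows that the SIF smooths the reference samples even at integer positions, unlike the DCTIF.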

Depending on the intra prediction mode, the following reference samples processing is performed:

The directional intra-prediction mode is classified into one of the following groups:
- Group A: vertical or horizontal modes (HOR_IDX, VER_IDX),
- Group B: directional modes that represent non-fractional angles (−14, −12, −10, −6, 2, 34, 66, 72, 76, 78, 80) and Planar mode,
- Group C: remaining directional modes;

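The directional intra prediction modes thus fall into three groups:

Group A: horizontal and vertical modes (HOR_IDX, VER_IDX)

Group B: angular modes with non-fractional angles, plus Planar mode

Group C: all other angular modes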

If the directional intra-prediction mode is classified as belonging to group A, then no filters are applied to reference samples to generate predicted samples;
Otherwise, if a mode falls into group B and the mode is a directional mode, and all of the following conditions are true, then a [1, 2, 1] reference sample filter may be applied (depending on the MDIS condition) to the reference samples, and the filtered values are then copied into the intra predictor according to the selected direction, with no interpolation filter applied:
- refIdx is equal to 0 (no MRL)
- TU size is greater than 32
- Luma
- No ISP block

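For a group A mode, the predicted samples are generated directly from unfiltered reference samples.

For a directional mode in group B, when all of the conditions below hold, the reference samples may be filtered with [1, 2, 1] (subject to the MDIS condition) and the filtered values are copied into the predictor without interpolation filtering:

refIdx is equal to 0 (MRL is not used)
TU size is greater than 32
the block is luma
the block is not an ISP block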

Otherwise, if a mode is classified as belonging to group C, MRL index is equal to 0, and the current block is not ISP block, then only an intra reference sample interpolation filter is applied to reference samples to generate a predicted sample that falls into a fractional or integer position between reference samples according to a selected direction (no reference sample filtering is performed). The interpolation filter type is determined as follows :
- Set minDistVerHor equal to Min( Abs( predModeIntra − 50 ), Abs( predModeIntra − 18 ) )
- Set nTbS equal to ( Log2 (W) + Log2 (H) ) >> 1
- Set intraHorVerDistThres[ nTbS ] as specified below:

	nTbS = 2	nTbS = 3	nTbS = 4	nTbS = 5	nTbS = 6	nTbS = 7
intraHorVerDistThres[ nTbS ]	24	14	2	0	0	0

- If minDistVerHor is greater than intraHorVerDistThres[ nTbS ], SIF is used for the interpolation
- Otherwise, DCTIF is used for the interpolation

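For a group C mode with MRL index 0 in a block that is not an ISP block, only the interpolation filter is applied to the reference samples, producing predicted samples that fall on fractional or integer positions between them. No reference sample smoothing is performed. The filter type is chosen as follows:

Set minDistVerHor = Min( Abs( predModeIntra − 50 ), Abs( predModeIntra − 18 ) )
Set nTbS = ( Log2 (W) + Log2 (H) ) >> 1
Set intraHorVerDistThres[ nTbS ] from the table above
If minDistVerHor is greater than intraHorVerDistThres[ nTbS ], SIF is used
Otherwise, DCTIF is used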
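
The filter choice for group C modes is simple enough to write out. This is a sketch with a hypothetical function name. It uses the thresholds from the table in this section, assumes W and H are powers of two, and does not check that the mode really belongs to group C or that the MRL and ISP conditions hold.

```python
# Thresholds from the table above, indexed by nTbS
INTRA_HOR_VER_DIST_THRES = {2: 24, 3: 14, 4: 2, 5: 0, 6: 0, 7: 0}


def select_interpolation_filter(pred_mode_intra, w, h):
    """Return "SIF" or "DCTIF" for a group C mode in a w x h block."""
    # Distance to the nearer of the vertical (50) and horizontal (18) modes
    min_dist_ver_hor = min(abs(pred_mode_intra - 50), abs(pred_mode_intra - 18))
    # nTbS = (Log2(W) + Log2(H)) >> 1
    n_tbs = ((w.bit_length() - 1) + (h.bit_length() - 1)) >> 1
    if min_dist_ver_hor > INTRA_HOR_VER_DIST_THRES[n_tbs]:
        return "SIF"
    return "DCTIF"


print(select_interpolation_filter(35, 8, 8), select_interpolation_filter(40, 8, 8))
```

The table makes small blocks favour the sharper DCTIF, while large blocks use the smoothing filter for every mode that is not exactly horizontal or vertical.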

3.3.2 CCLM prediction

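H.266 intra chroma prediction mode: cross-component linear model (CCLM) prediction, 岳麓吹雪's blog, CSDN

VVC study notes, part 5: chroma intra prediction, CCLM and code walkthrough, Aidoneus_y's blog, CSDN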

To reduce the cross-component redundancy, a cross-component linear model (CCLM) prediction mode is used in the VVC, for which the chroma samples are predicted based on the reconstructed luma samples of the same CU by using a linear model as follows:

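$pred_C(i,j) = \alpha \cdot rec_L'(i,j) + \beta$ (3-1)

where $pred_C(i,j)$ represents the predicted chroma samples in a CU and $rec_L'(i,j)$ represents the downsampled reconstructed luma samples of the same CU.

Within one CU, equation (3-1) thus predicts each chroma sample from the downsampled reconstructed luma samples of that CU, which removes redundancy between the components.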

The CCLM parameters ( $\alpha$ and $\beta$ ) are derived with at most four neighbouring chroma samples and their corresponding down-sampled luma samples. Suppose the current chroma block dimensions are W×H, then W’ and H’ are set as

W’ = W, H’ = H when LM mode is applied;
W’ =W + H when LM-A mode is applied;
H’ = H + W when LM-L mode is applied;

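The CCLM parameters α and β are derived from at most four neighbouring chroma samples and the corresponding downsampled luma samples. For a chroma block of size W×H, W’ and H’ are set as follows:

LM mode: W’ = W, H’ = H
LM-A mode: W’ = W + H
LM-L mode: H’ = H + W

LM-A uses only the above neighbouring samples, and LM-L uses only the left neighbouring samples.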

The above neighbouring positions are denoted as S[ 0, −1 ]…S[ W’ − 1, −1 ] and the left neighbouring positions are denoted as S[ −1, 0 ]…S[ −1, H’ − 1 ]. Then the four samples are selected as

S[W’ / 4, −1 ], S[ 3 * W’ / 4, −1 ], S[ −1, H’ / 4 ], S[ −1, 3 * H’ / 4 ] when LM mode is applied and both above and left neighbouring samples are available;
S[ W’ / 8, −1 ], S[ 3 * W’ / 8, −1 ], S[ 5 * W’ / 8, −1 ], S[ 7 * W’ / 8, −1 ] when LM-A mode is applied or only the above neighbouring samples are available;
S[ −1, H’ / 8 ], S[ −1, 3 * H’ / 8 ], S[ −1, 5 * H’ / 8 ], S[ −1, 7 * H’ / 8 ] when LM-L mode is applied or only the left neighbouring samples are available;

The above neighbouring positions are S[ 0, −1 ]…S[ W’ − 1, −1 ] and the left neighbouring positions are S[ −1, 0 ]…S[ −1, H’ − 1 ]. The four samples are chosen as follows:

LM mode with both above and left neighbours available: S[W’ / 4, −1 ], S[ 3 * W’ / 4, −1 ], S[ −1, H’ / 4 ], S[ −1, 3 * H’ / 4 ]
LM-A mode, or only above neighbours available: S[ W’ / 8, −1 ], S[ 3 * W’ / 8, −1 ], S[ 5 * W’ / 8, −1 ], S[ 7 * W’ / 8, −1 ]
LM-L mode, or only left neighbours available: S[ −1, H’ / 8 ], S[ −1, 3 * H’ / 8 ], S[ −1, 5 * H’ / 8 ], S[ −1, 7 * H’ / 8 ]
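
The position rules can be written as a short helper. The function name and the string mode labels are hypothetical. The sketch assumes that all neighbours are available, so the fallback cases ("only the above samples are available" in LM mode, and so on) are not handled, and it reads expressions such as 3 * W’ / 8 as integer division of the product.

```python
def cclm_sample_positions(w, h, mode):
    """Return the four neighbour positions (x, y) used to derive alpha and beta."""
    if mode == "LM":                       # W' = W, H' = H: two above, two left
        return [(w // 4, -1), (3 * w // 4, -1), (-1, h // 4), (-1, 3 * h // 4)]
    if mode == "LM_A":                     # W' = W + H: four above samples
        wp = w + h
        return [(k * wp // 8, -1) for k in (1, 3, 5, 7)]
    if mode == "LM_L":                     # H' = H + W: four left samples
        hp = h + w
        return [(-1, k * hp // 8) for k in (1, 3, 5, 7)]
    raise ValueError(f"unknown CCLM mode: {mode}")


print(cclm_sample_positions(8, 8, "LM"))
```

Picking four evenly spread positions keeps the number of comparisons fixed regardless of block size.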

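The four neighbouring luma samples at the selected positions are downsampled and compared four times to find the two larger values, $x^{0}_{A}$ and $x^{1}_{A}$, and the two smaller values, $x^{0}_{B}$ and $x^{1}_{B}$. Their corresponding chroma sample values are denoted $y^{0}_{A}$, $y^{1}_{A}$, $y^{0}_{B}$ and $y^{1}_{B}$. Then $x_{A}$, $x_{B}$, $y_{A}$ and $y_{B}$ are derived as

$x_{A} = (x^{0}_{A} + x^{1}_{A} + 1) \gg 1$, $x_{B} = (x^{0}_{B} + x^{1}_{B} + 1) \gg 1$, $y_{A} = (y^{0}_{A} + y^{1}_{A} + 1) \gg 1$, $y_{B} = (y^{0}_{B} + y^{1}_{B} + 1) \gg 1$ (3-2)

Finally, the linear model parameters α and β are obtained as

$\alpha = \dfrac{y_{A} - y_{B}}{x_{A} - x_{B}}$ (3-3)

$\beta = y_{B} - \alpha \cdot x_{B}$ (3-4)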
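
A floating-point sketch of the parameter derivation may help. It assumes that each pair (the two larger and the two smaller luma values, with their chroma partners) is averaged with rounding, that α is the slope of the line through the two averaged points, and that β is its offset. The function names are hypothetical, a sort stands in for the four comparisons, and the real design avoids the division with a look-up table.

```python
def derive_cclm_params(luma, chroma):
    """luma/chroma: four co-located neighbour values. Returns (alpha, beta) as floats."""
    order = sorted(range(4), key=lambda i: luma[i])
    b0, b1, a0, a1 = order                      # two smaller, then two larger luma values
    x_a = (luma[a0] + luma[a1] + 1) >> 1        # averaged larger luma
    x_b = (luma[b0] + luma[b1] + 1) >> 1        # averaged smaller luma
    y_a = (chroma[a0] + chroma[a1] + 1) >> 1    # chroma paired with the larger luma
    y_b = (chroma[b0] + chroma[b1] + 1) >> 1    # chroma paired with the smaller luma
    if x_a == x_b:                              # flat luma: fall back to a constant (assumption)
        return 0.0, float(y_b)
    alpha = (y_a - y_b) / (x_a - x_b)
    return alpha, y_b - alpha * x_b


def predict_chroma(rec_luma_ds, alpha, beta):
    """Linear model: predicted chroma from one downsampled reconstructed luma sample."""
    return alpha * rec_luma_ds + beta


alpha, beta = derive_cclm_params([10, 20, 30, 40], [15, 25, 35, 45])
print(alpha, beta, predict_chroma(100, alpha, beta))
```

Averaging two points at each end makes the fitted line less sensitive to a single outlier than a plain min/max fit.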

Figure 15 shows an example of the location of the left and above samples and the sample of the current block involved in the CCLM mode

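Figure 15 gives an example: it marks the left and above neighbouring samples and the samples of the current block that take part in CCLM mode, from which α and β are derived.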

Figure 15 – Locations of the samples used for the derivation of α and β

The division operation to calculate parameter α is implemented with a look-up table. To reduce the memory required for storing the table, the diff value (difference between maximum and minimum values) and the parameter are expressed by an exponential notation. For example, diff is approximated with a 4-bit significant part and an exponent. Consequently, the table for 1/diff is reduced into 16 elements for 16 values of the significand as follows:

DivTable [ ] = { 0, 7, 6, 5, 5, 4, 4, 3, 3, 2, 2, 1, 1, 1, 1, 0 } (3-5)

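The division needed to compute α is implemented with a look-up table. To shrink the table, the diff value (the difference between the maximum and minimum values) and the parameter α are expressed in exponential notation.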

This would have a benefit of both reducing the complexity of the calculation as well as the memory size required for storing the needed tables
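
To see how a 16-entry table can stand in for 1/diff, the sketch below splits diff into an exponent and a 4-bit significand and looks the significand up in DivTable. The table values are the ones in (3-5). The way the table is applied here (an implicit leading bit added with `| 8`, and the exponent bumped by one when the significand is non-zero) is my reading and is not stated in the text above, so treat it as an assumption. The function name is hypothetical.

```python
DIV_TABLE = [0, 7, 6, 5, 5, 4, 4, 3, 3, 2, 2, 1, 1, 1, 1, 0]


def approx_reciprocal(diff):
    """Approximate 1/diff as (significand, shift): 1/diff ~= significand / 2**shift."""
    x = diff.bit_length() - 1                    # exponent: floor(log2(diff))
    norm_diff = ((diff << 4) >> x) & 15          # the 4 bits after the leading one
    if norm_diff != 0:
        x += 1                                   # reciprocal drops below 2**-x
    return DIV_TABLE[norm_diff] | 8, x + 3       # 4-bit significand with implicit top bit


sig, shift = approx_reciprocal(100)
print(sig / 2 ** shift, 1 / 100)
```

Under this reading the approximation stays within about 10% of the true reciprocal for every diff from 1 to 1023, while storing only sixteen 3-bit values.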

Besides being used together to calculate the linear model coefficients, the above template and the left template can also be used alternatively in the other 2 LM modes, called LM_A and LM_L.

In LM_A mode, only the above template is used to calculate the linear model coefficients. To get more samples, the above template is extended to (W+H) samples. In LM_L mode, only the left template is used to calculate the linear model coefficients. To get more samples, the left template is extended to (H+W) samples.

In LM_LT mode, left and above templates are used to calculate the linear model coefficients.

To match the chroma sample locations for 4:2:0 video sequences, two types of downsampling filter are applied to luma samples to achieve a 2 to 1 downsampling ratio in both horizontal and vertical directions. The selection of the downsampling filter is specified by an SPS level flag. The two downsampling filters are as follows, corresponding to “type-0” and “type-2” content, respectively.

Note that only one luma line (general line buffer in intra prediction) is used to make the downsampled luma samples when the upper reference line is at the CTU boundary.

This parameter computation is performed as part of the decoding process, and is not just as an encoder search operation. As a result, no syntax is used to convey the α and β values to the decoder.

For chroma intra mode coding, a total of 8 intra modes are allowed. Those modes include five traditional intra modes and three cross-component linear model modes (CCLM, LM_A, and LM_L). The chroma mode signalling and derivation process are shown in Table 3-3. Chroma mode coding directly depends on the intra prediction mode of the corresponding luma block. Since a separate block partitioning structure for luma and chroma components is enabled in I slices, one chroma block may correspond to multiple luma blocks. Therefore, for Chroma DM mode, the intra prediction mode of the corresponding luma block covering the center position of the current chroma block is directly inherited.

Table 3-3 – Derivation of chroma prediction mode from luma mode when CCLM is enabled

Chroma prediction mode	Corresponding luma intra prediction mode
Chroma prediction mode	0	50	18	1	X ( 0 <= X <= 66 )
0	66	0	0	0	0
1	50	66	50	50	50
2	18	18	66	18	18
3	1	1	1	66	1
4	0	50	18	1	X
5	81	81	81	81	81
6	82	82	82	82	82
7	83	83	83	83	83
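
Table 3-3 follows a simple pattern that a few lines of code can reproduce. The function name is hypothetical, and the sketch covers only the CCLM-enabled table shown here.

```python
DEFAULT_CHROMA_MODES = [0, 50, 18, 1]     # Planar, vertical, horizontal, DC


def derive_chroma_mode(intra_chroma_pred_mode, luma_mode):
    """Reproduce Table 3-3: chroma prediction mode from the signalled index and luma mode."""
    if intra_chroma_pred_mode >= 5:
        return 81 + (intra_chroma_pred_mode - 5)     # 81, 82, 83: the three CCLM modes
    if intra_chroma_pred_mode == 4:
        return luma_mode                             # DM: inherit the luma mode
    candidate = DEFAULT_CHROMA_MODES[intra_chroma_pred_mode]
    # If the default mode would duplicate the luma mode (already reachable
    # through DM), mode 66 is used in its place.
    return 66 if candidate == luma_mode else candidate


print(derive_chroma_mode(1, 50), derive_chroma_mode(4, 37))
```

The substitution by mode 66 is what keeps the five traditional choices distinct from one another for every luma mode.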

A single binarization table is used regardless of the value of sps_cclm_enabled_flag as shown in Table 3-4.

Table 3-4– Unified binarization table for chroma prediction mode

Value of intra_chroma_pred_mode	Bin string
4	00
0	0100
1	0101
2	0110
3	0111
5	10
6	110
7	111

In Table 3-4, the first bin indicates whether the mode is a regular mode (0) or an LM mode (1). If it is an LM mode, then the next bin indicates whether it is LM_CHROMA (0) or not. If it is not LM_CHROMA, the next bin indicates whether it is LM_L (0) or LM_A (1). When sps_cclm_enabled_flag is 0, the first bin of the binarization table for the corresponding intra_chroma_pred_mode can be discarded prior to the entropy coding. In other words, the first bin is inferred to be 0 and hence not coded. This single binarization table is used for both the sps_cclm_enabled_flag equal to 0 and equal to 1 cases. The first two bins in Table 3-4 are context coded, each with its own context model, and the remaining bins are bypass coded.
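
The unified binarization can be exercised with a small encoder and decoder pair. The function names are hypothetical, and bin strings are modelled as Python strings of "0" and "1" purely for illustration. Context modelling and bypass coding are outside the scope of this sketch.

```python
# Bin strings from Table 3-4, keyed by intra_chroma_pred_mode
BIN_STRINGS = {4: "00", 0: "0100", 1: "0101", 2: "0110", 3: "0111",
               5: "10", 6: "110", 7: "111"}


def binarize(mode, cclm_enabled=True):
    """Bin string for `mode`; the first bin is dropped when CCLM is disabled."""
    bins = BIN_STRINGS[mode]
    if cclm_enabled:
        return bins
    if bins[0] != "0":
        raise ValueError("CCLM modes cannot be signalled when CCLM is disabled")
    return bins[1:]                       # first bin is inferred to be 0


def debinarize(bits, cclm_enabled=True):
    """Decode one mode from the front of `bits`; returns (mode, bins consumed)."""
    if not cclm_enabled:
        bits = "0" + bits                 # restore the inferred first bin
    for mode, code in BIN_STRINGS.items():
        if bits.startswith(code):         # the table is prefix-free, so at most one match
            used = len(code) if cclm_enabled else len(code) - 1
            return mode, used
    raise ValueError("incomplete bin string")


print(binarize(6), binarize(0, cclm_enabled=False))
```

Using one table for both settings means the decoder needs no second parsing path, only the inference of the first bin.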

In addition, in order to reduce luma-chroma latency in dual tree, when the 64x64 luma coding tree node is partitioned with Not Split (and ISP is not used for the 64x64 CU) or QT, the chroma CUs in 32x32 / 32x16 chroma coding tree node are allowed to use CCLM in the following way:

If the 32x32 chroma node is not split or is partitioned with QT split, all chroma CUs in the 32x32 node can use CCLM
If the 32x32 chroma node is partitioned with Horizontal BT, and the 32x16 child node is not split or uses Vertical BT split, all chroma CUs in the 32x16 chroma node can use CCLM.

In all the other luma and chroma coding tree split conditions, CCLM is not allowed for chroma CU.