Learning HEVC: Bitstream Analysis

很难绷得住

Last modified: 2022-07-14 19:30:32

First published: 2022-06-20 22:20:13

Article link: https://blog.csdn.net/qq_42567607/article/details/125228275

1. From the layered codec framework to NAL units

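Taking H.264 as an example:
H.264 adapts to transmission across different networks mainly because it introduces a layered structure made up of the Video Coding Layer (VCL) and the Network Abstraction Layer (NAL), which separates compression coding from network transport.
Data compressed by the H.264 algorithm is encapsulated into NAL packets through the NAL-VCL interface.
[Figure: NAL-VCL encapsulation]
The basic unit of the NAL is the NALU. From top to bottom, the VCL hierarchy is: sequence, picture, slice group, slice, macroblock.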

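One purpose of dividing a picture into slices is to fit the maximum transmission unit (MTU) of different transport networks.
The purpose of slice groups is to make each group's data independent of the other groups for a specific goal, such as limiting error propagation to protect picture quality, or separating foreground from background so that they can be coded differently.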

Each NALU consists of a header followed by payload from the VCL; a VCL NALU carries one slice:
[Figure: structure of a NALU]

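Here RBSP stands for Raw Byte Sequence Payload,
and SODB for String Of Data Bits, the raw coded bits.
The padding bits, rbsp_trailing_bits (a single 1 bit followed by 0 bits), byte-align the bitstream, so RBSP = SODB + rbsp_trailing_bits.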
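
The relationship between SODB and RBSP can be sketched in a few lines of Python. This is only an illustration, assuming the SODB is given as a string of '0'/'1' characters; the helper name `sodb_to_rbsp` is hypothetical and not part of any codec library.

```python
def sodb_to_rbsp(sodb_bits: str) -> bytes:
    """Append rbsp_trailing_bits (a '1' stop bit, then '0' bits) to a bit string."""
    bits = sodb_bits + "1"            # stop bit
    bits += "0" * (-len(bits) % 8)    # zero bits up to the next byte boundary
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))

print(sodb_to_rbsp("10110").hex())     # 5 data bits fit into one byte
print(sodb_to_rbsp("11111111").hex())  # already aligned: a whole trailing byte is added
```

Because the stop bit is always appended, a decoder can find the end of the SODB by locating the last 1 bit of the RBSP.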

2. NALU header information

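The H.264 NAL header occupies one byte (8 bits). There are only 32 NAL types (0-31), which need just 5 bits. Leaving the other three bits unused would add up to considerable waste over a large number of NAL units, so they are put to use as well:
Bit 1: a 1-bit forbidden bit (forbidden_zero_bit). When the network detects bit errors in the unit, it can set this bit to 1 so that the receiver can discard the unit.
Bits 2-3: a 2-bit priority field (NRI, nal_ref_idc). Priority decreases in the order 11, 10, 01, 00, and 00 means the unit is not used for reference. When the decoder is busy, it decodes higher-priority units first and can drop the 00 units.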
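
The one-byte layout described above can be unpacked with shifts and masks. The following is a minimal sketch; the function name `parse_h264_nal_header` is made up for illustration, and the sample byte 0x67 is a header value commonly seen on SPS units.

```python
def parse_h264_nal_header(byte: int) -> dict:
    """Split the one-byte H.264 NAL header into its three fields."""
    return {
        "forbidden_zero_bit": (byte >> 7) & 0x1,  # bit 1
        "nal_ref_idc": (byte >> 5) & 0x3,         # bits 2-3 (NRI)
        "nal_unit_type": byte & 0x1F,             # bits 4-8, values 0-31
    }

# 0x67 = 0110 0111: F=0, NRI=3, type=7 (SPS)
print(parse_h264_nal_header(0x67))
```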

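The 32 NAL unit types are as follows:
[Figure: table of H.264 NAL unit types]
The NAL headers of HEVC and AVC differ in three main ways:
01. The AVC header occupies one byte, while the HEVC header occupies two bytes, enough to support the scalable, multiview and 3D video coding extensions of HEVC.
02. In AVC, video parameters are stored in PPS and SPS NAL units; HEVC adds the VPS (video parameter set), which carries information such as profile and level.
03. The HEVC NAL header adds the temporal layer identifier of the NAL unit and drops NRI; the information NRI carried is folded into nal_unit_type instead.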

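[Figure: HEVC NAL unit header layout]
01. A 1-bit forbidden bit F. Unlike in AVC, its role is to prevent bit patterns that could be interpreted as MPEG-2 start codes in legacy MPEG-2 systems environments.
02. A 6-bit type field NAL_TYPE (nal_unit_type), giving 64 types: values 0-31 are VCL units, and the 32 new values 32-63 are non-VCL units.
03. A 6-bit Layer_ID (nuh_layer_id) that identifies which layer the NAL unit belongs to. In the scalable extension it jointly identifies spatial and quality layers; in the 3D extension it identifies view and depth.
04. A 3-bit TID (temporal_id, coded in the header as nuh_temporal_id_plus1, i.e. the temporal ID plus 1) that indicates which temporal sub-layer the HEVC access unit belongs to.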
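
The four fields listed above can be pulled out of the two header bytes as follows. This is a sketch under the 1 + 6 + 6 + 3 bit layout described here; the function name `parse_hevc_nal_header` and the sample bytes 26 01 are chosen only for illustration.

```python
def parse_hevc_nal_header(b0: int, b1: int) -> dict:
    """Split the two-byte HEVC NAL header into F, type, layer ID and temporal ID."""
    value = (b0 << 8) | b1
    return {
        "forbidden_zero_bit": (value >> 15) & 0x1,  # 1 bit
        "nal_unit_type": (value >> 9) & 0x3F,       # 6 bits
        "nuh_layer_id": (value >> 3) & 0x3F,        # 6 bits
        "temporal_id": (value & 0x7) - 1,           # 3 bits, stored as temporal ID + 1
    }

# 0x26 0x01 = 0010 0110 0000 0001: type 19, layer 0, temporal ID 0
print(parse_hevc_nal_header(0x26, 0x01))
```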

HEVC NAL unit types:
[Figure: table of HEVC NAL unit types]

3. Bitstream analysis with software

First, compress the BasketballDrill test sequence with HM.

The configuration is as follows:

#======== File I/O ===============
InputFile                     : C:\Users\梁昊霖\Desktop\HM-16.20\BasketballDrill_832x480_50.yuv
InputBitDepth                 : 8           # Input bitdepth
InputChromaFormat             : 420         # Ratio of luminance to chrominance samples
FrameRate                     : 50          # Frame Rate per second
FrameSkip                     : 0           # Number of frames to be skipped in input
SourceWidth                   : 832         # Input  frame width
SourceHeight                  : 480         # Input  frame height
FramesToBeEncoded             : 5        # Number of frames to be coded

Level                         : 3.1

Low-delay mode is used, i.e. every frame except the first is a B-frame.

#======== File I/O =====================
BitstreamFile                 : 50LP.bin
ReconFile                     : 50LP.yuv

#======== Profile ================
Profile                       : main

#======== Unit definition ================
MaxCUWidth                    : 64          # Maximum coding unit width in pixel
MaxCUHeight                   : 64          # Maximum coding unit height in pixel
MaxPartitionDepth             : 4           # Maximum coding unit depth
QuadtreeTULog2MaxSize         : 5           # Log2 of maximum transform size for
                                            # quadtree-based TU coding (2...6)
QuadtreeTULog2MinSize         : 2           # Log2 of minimum transform size for
                                            # quadtree-based TU coding (2...6)
QuadtreeTUMaxDepthInter       : 3
QuadtreeTUMaxDepthIntra       : 3

#======== Coding Structure =============
IntraPeriod                   : -1          # Period of I-Frame ( -1 = only first)
DecodingRefreshType           : 0           # Random Accesss 0:none, 1:CRA, 2:IDR, 3:Recovery Point SEI
GOPSize                       : 4           # GOP Size (number of B slice = GOPSize-1)
ReWriteParamSetsFlag          : 1           # Write parameter sets with every IRAP

IntraQPOffset                 : -1 
LambdaFromQpEnable            : 1           # see JCTVC-X0038 for suitable parameters for IntraQPOffset, QPoffset, QPOffsetModelOff, QPOffsetModelScale when enabled
#        Type POC QPoffset QPOffsetModelOff QPOffsetModelScale CbQPoffset CrQPoffset QPfactor tcOffsetDiv2 betaOffsetDiv2 temporal_id #ref_pics_active #ref_pics reference pictures     predict deltaRPS #ref_idcs reference idcs
Frame1:  B    1   5       -6.5                      0.2590         0          0          1.0      0            0               0           4                4         -1 -5 -9 -13       0
Frame2:  B    2   4       -6.5                      0.2590         0          0          1.0      0            0               0           4                4         -1 -2 -6 -10       1      -1       5         1 1 1 0 1
Frame3:  B    3   5       -6.5                      0.2590         0          0          1.0      0            0               0           4                4         -1 -3 -7 -11       1      -1       5         0 1 1 1 1
Frame4:  B    4   1        0.0                      0.0            0          0          1.0      0            0               0           4                4         -1 -4 -8 -12       1      -1       5         0 1 1 1 1

#=========== Motion Search =============
FastSearch                    : 1           # 0:Full search  1:TZ search
SearchRange                   : 64          # (0: Search range is a Full frame)
BipredSearchRange             : 4           # Search range for bi-prediction refinement
HadamardME                    : 1           # Use of hadamard measure for fractional ME
FEN                           : 1           # Fast encoder decision
FDM                           : 1           # Fast Decision for Merge RD cost

#======== Quantization =============
QP                            : 32          # Quantization parameter(0-51)
MaxDeltaQP                    : 0           # CU-based multi-QP optimization
MaxCuDQPDepth                 : 0           # Max depth of a minimum CuDQP for sub-LCU-level delta QP
DeltaQpRD                     : 0           # Slice-based multi-QP optimization
RDOQ                          : 1           # RDOQ
RDOQTS                        : 1           # RDOQ for transform skip
SliceChromaQPOffsetPeriodicity: 0           # Used in conjunction with Slice Cb/Cr QpOffsetIntraOrPeriodic. Use 0 (default) to disable periodic nature.
SliceCbQpOffsetIntraOrPeriodic: 0           # Chroma Cb QP Offset at slice level for I slice or for periodic inter slices as defined by SliceChromaQPOffsetPeriodicity. Replaces offset in the GOP table.
SliceCrQpOffsetIntraOrPeriodic: 0           # Chroma Cr QP Offset at slice level for I slice or for periodic inter slices as defined by SliceChromaQPOffsetPeriodicity. Replaces offset in the GOP table.

#=========== Deblock Filter ============
LoopFilterOffsetInPPS         : 1           # Dbl params: 0=varying params in SliceHeader, param = base_param + GOP_offset_param; 1 (default) =constant params in PPS, param = base_param)
LoopFilterDisable             : 0           # Disable deblocking filter (0=Filter, 1=No Filter)
LoopFilterBetaOffset_div2     : 0           # base_param: -6 ~ 6
LoopFilterTcOffset_div2       : 0           # base_param: -6 ~ 6
DeblockingFilterMetric        : 0           # blockiness metric (automatically configures deblocking parameters in bitstream). Applies slice-level loop filter offsets (LoopFilterOffsetInPPS and LoopFilterDisable must be 0)

#=========== Misc. ============
InternalBitDepth              : 8           # codec operating bit-depth

#=========== Coding Tools =================
SAO                           : 1           # Sample adaptive offset  (0: OFF, 1: ON)
AMP                           : 1           # Asymmetric motion partitions (0: OFF, 1: ON)
TransformSkip                 : 1           # Transform skipping (0: OFF, 1: ON)
TransformSkipFast             : 1           # Fast Transform skipping (0: OFF, 1: ON)
SAOLcuBoundary                : 0           # SAOLcuBoundary using non-deblocked pixels (0: OFF, 1: ON)

#============ Slices ================
SliceMode                : 0                # 0: Disable all slice options.
                                            # 1: Enforce maximum number of LCU in an slice,
                                            # 2: Enforce maximum number of bytes in an 'slice'
                                            # 3: Enforce maximum number of tiles in a slice
SliceArgument            : 1500             # Argument for 'SliceMode'.
                                            # If SliceMode==1 it represents max. SliceGranularity-sized blocks per slice.
                                            # If SliceMode==2 it represents max. bytes per slice.
                                            # If SliceMode==3 it represents max. tiles per slice.

LFCrossSliceBoundaryFlag : 1                # In-loop filtering, including ALF and DB, is across or not across slice boundary.
                                            # 0:not across, 1: across

#============ PCM ================
PCMEnabledFlag                      : 0                # 0: No PCM mode
PCMLog2MaxSize                      : 5                # Log2 of maximum PCM block size.
PCMLog2MinSize                      : 3                # Log2 of minimum PCM block size.
PCMInputBitDepthFlag                : 1                # 0: PCM bit-depth is internal bit-depth. 1: PCM bit-depth is input bit-depth.
PCMFilterDisableFlag                : 0                # 0: Enable loop filtering on I_PCM samples. 1: Disable loop filtering on I_PCM samples.

#============ Tiles ================
TileUniformSpacing                  : 0                # 0: the column boundaries are indicated by TileColumnWidth array, the row boundaries are indicated by TileRowHeight array
                                                       # 1: the column and row boundaries are distributed uniformly
NumTileColumnsMinus1                : 0                # Number of tile columns in a picture minus 1
TileColumnWidthArray                : 2 3              # Array containing tile column width values in units of CTU (from left to right in picture)   
NumTileRowsMinus1                   : 0                # Number of tile rows in a picture minus 1
TileRowHeightArray                  : 2                # Array containing tile row height values in units of CTU (from top to bottom in picture)

LFCrossTileBoundaryFlag             : 1                # In-loop filtering is across or not across tile boundary.
                                                       # 0:not across, 1: across 

#============ WaveFront ================
WaveFrontSynchro                    : 0                # 0:  No WaveFront synchronisation (WaveFrontSubstreams must be 1 in this case).
                                                       # >0: WaveFront synchronises with the LCU above and to the right by this many LCUs.

#=========== Quantization Matrix =================
ScalingList                   : 0                      # ScalingList 0 : off, 1 : default, 2 : file read
ScalingListFile               : scaling_list.txt       # Scaling List file name. If file is not exist, use Default Matrix.

#============ Lossless ================
TransquantBypassEnable     : 0                         # Value of PPS flag.
CUTransquantBypassFlagForce: 0                         # Force transquant bypass mode, when transquant_bypass_enable_flag is enabled

#============ Rate Control ======================
RateControl                         : 0                # Rate control: enable rate control
TargetBitrate                       : 1000000          # Rate control: target bitrate, in bps
KeepHierarchicalBit                 : 2                # Rate control: 0: equal bit allocation; 1: fixed ratio bit allocation; 2: adaptive ratio bit allocation
LCULevelRateControl                 : 1                # Rate control: 1: LCU level RC; 0: picture level RC
RCLCUSeparateModel                  : 1                # Rate control: use LCU level separate R-lambda model
InitialQP                           : 0                # Rate control: initial QP
RCForceIntraQP                      : 0                # Rate control: force intra QP to be equal to initial QP

### DO NOT ADD ANYTHING BELOW THIS LINE ###
### DO NOT DELETE THE EMPTY LINE BELOW ###

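Next, analyze the bitstream of the encoded binary file:
[Figure: bitstream analyzer view of the encoded frames]
Only the first frame is an I-frame, and it has the highest quality; all remaining frames are B-frames.
Among the B-frames, a higher-quality B-frame appears every 4 frames, because the configuration file sets a GOP size of 4 and the fourth frame of each GOP (Frame4) has the smallest QPoffset (1, versus 4 or 5 for the other three):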

GOPSize                       : 4           # GOP Size (number of B slice = GOPSize-1)
(Number of B slices = GOP size - 1, because the first slice is an I slice? Does a GOP not necessarily end with an I-frame?)
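
The periodic quality pattern can be illustrated from the configuration values alone. The sketch below assumes the slice QP is simply QP + QPoffset for B-frames and QP + IntraQPOffset for the first frame. HM applies further adjustments (for example the QPOffsetModel terms in the GOP table), so these are nominal values only, and the function name `nominal_qp` is hypothetical.

```python
BASE_QP = 32                   # QP in the configuration
INTRA_QP_OFFSET = -1           # IntraQPOffset
GOP_QP_OFFSETS = [5, 4, 5, 1]  # QPoffset column of Frame1..Frame4

def nominal_qp(poc: int) -> int:
    """Nominal QP of a frame, ignoring HM's model-based adjustments (assumption)."""
    if poc == 0:
        return BASE_QP + INTRA_QP_OFFSET
    return BASE_QP + GOP_QP_OFFSETS[(poc - 1) % len(GOP_QP_OFFSETS)]

for poc in range(9):
    print(poc, nominal_qp(poc))
```

Under this assumption every fourth B-frame (POC 4, 8, ...) gets a QP of 33 while the others get 36 or 37, which matches a higher-quality B-frame every 4 frames.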

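In the usual convention, an I-frame contains only I slices, a P-frame only P slices, and a B-frame only B slices.
An I slice contains only intra-coded blocks (I macroblocks in H.264 terms); a P slice can contain both P blocks and I blocks; likewise, a B slice can contain both B blocks and I blocks.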

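Frames 4 and 5 both contain intra-coded blocks, and the higher the intra share, the higher the quality of the B-frame.
[Figure: block types in frames 4 and 5]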

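Viewing the bitstream in hexadecimal:
[Figure: hex dump of the bitstream]
The boxed column gives the starting address of each row. Each address holds one byte, i.e. two hexadecimal digits. Take EF: E is 14 in decimal, or 1110 in binary; F is 15 in decimal, or 1111 in binary; so [1110 1111] is stored in one address unit.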

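With packet-based transport protocols, one UDP packet is one NAL unit, so the decoder can easily detect NAL boundaries and decode. In the byte-stream format, however, NAL units are written out as a continuous stream of bytes and the decoder cannot tell where each NAL unit starts and ends. A start code prefix, 0x 00 00 01, is therefore defined for synchronization; it is marked with red boxes in the figure above.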
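
Splitting a byte stream at the start code prefix can be sketched as below. This is an illustrative helper (the name `split_annexb` and the sample payload bytes are made up), and it does not remove emulation prevention bytes inside the NAL units.

```python
def split_annexb(stream: bytes) -> list:
    """Split a byte stream into NAL units at each 00 00 01 start code prefix."""
    prefix = b"\x00\x00\x01"
    starts = []
    i = stream.find(prefix)
    while i != -1:
        starts.append(i + 3)             # first byte after the prefix
        i = stream.find(prefix, i + 3)
    nalus = []
    for n, begin in enumerate(starts):
        end = starts[n + 1] - 3 if n + 1 < len(starts) else len(stream)
        # Zero bytes just before the next prefix belong to the start code, not the NAL unit.
        nalus.append(stream[begin:end].rstrip(b"\x00"))
    return nalus

# Three NAL units with dummy payloads, using both 4-byte and 3-byte start codes.
sample = bytes.fromhex("00000001 40010c01 00000001 420101 000001 4401c1")
for nalu in split_annexb(sample):
    print(nalu.hex())
```

Stripping trailing zero bytes is safe here because a NAL unit never ends in 0x00; any zero byte before a prefix is part of a 4-byte start code or of padding.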
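NAL units are separated by 0x 00 00 01, and the NAL header follows immediately after the start code prefix. For example, 40 01 is 0100 0000 0000 0001 in binary. Compare this with the NAL unit header structure:
[Figure: HEVC NAL unit header layout]
The NAL_TYPE bits are 0[100 000]0 0000 0001; the bits inside [ ] equal 32 in decimal, which is the VPS NAL type. In the same way, the next two NALUs are the SPS and the PPS.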
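
The same reading can be reproduced programmatically. In this sketch, 40 01 is the header from the text; 42 01 and 44 01 are the headers an SPS and a PPS would have assuming the same layer ID and temporal ID, since the text does not list their bytes. The helper name `hevc_nal_type` is hypothetical.

```python
PARAMETER_SET_NAMES = {32: "VPS", 33: "SPS", 34: "PPS"}

def hevc_nal_type(header: bytes) -> int:
    """nal_unit_type is the 6 bits that follow the forbidden bit in the first header byte."""
    return (header[0] >> 1) & 0x3F

for hex_header in ("4001", "4201", "4401"):
    nal_type = hevc_nal_type(bytes.fromhex(hex_header))
    bits = format(int(hex_header, 16), "016b")
    print(hex_header, bits, nal_type, PARAMETER_SET_NAMES.get(nal_type, "other"))
```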
