x264 Code Analysis (18): Core Algorithms - Filtering
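In the H.264/MPEG-4 AVC video coding standard, blocking artifacts appear in the image after the codec's inverse quantization and inverse transform. There are three main causes: 1) the residuals of block-based intra and inter prediction are DCT-transformed and the transform coefficients are quantized relatively coarsely, so the coefficients recovered by inverse quantization carry errors, which show up as visual discontinuities at block boundaries; 2) motion-compensated blocks may be copied from interpolated samples at different positions in different frames, and because block matching is never perfectly accurate, the data is discontinuous at the edges of the copied blocks; 3) discontinuities that already exist in the reference frame are copied into the blocks being compensated.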
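The first cause can be illustrated with a toy example. The sketch below is not x264 code: it assumes a one-dimensional row of samples split into 4-sample blocks, and it models very coarse quantization by keeping only each block's mean, as if every AC coefficient had been quantized to zero. The names `reconstruct_dc_only` and `boundary_step` are hypothetical helpers introduced only for this illustration.

```c
#include <assert.h>
#include <stdlib.h>

#define BLOCK 4   /* toy block size: a 1-D stand-in for the 4x4 transform */

/* Replace every BLOCK-sample block by its integer mean (keep only the DC term).
 * n must be a multiple of BLOCK. */
static void reconstruct_dc_only( const int *src, int *dst, int n )
{
    assert( n % BLOCK == 0 );
    for( int b = 0; b < n; b += BLOCK )
    {
        int sum = 0;
        for( int i = 0; i < BLOCK; i++ )
            sum += src[b+i];
        int dc = sum / BLOCK;
        for( int i = 0; i < BLOCK; i++ )
            dst[b+i] = dc;
    }
}

/* Absolute jump between sample edge-1 and sample edge (edge must be >= 1). */
static int boundary_step( const int *row, int edge )
{
    assert( edge >= 1 );
    return abs( row[edge] - row[edge-1] );
}
```

For the smooth ramp 0, 10, 20, ..., 70, the two blocks are reconstructed as 15 and 55, so `boundary_step( row, BLOCK )` grows from 10 in the source to 40 after reconstruction. That artificial jump at the block edge is the kind of discontinuity a deblocking filter is meant to smooth.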
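Although the small 4×4 transform size used by H.264 reduces this discontinuity, a deblocking filter is still needed to get the best coding performance. In x264, x264_slice_write() calls x264_fdec_filter_row(), which corresponds to the filtering module of x264. The filtering module performs three main tasks:
(1) loop filtering (deblocking filter);
(2) half-pixel interpolation;
(3) computation of the video quality metrics PSNR and SSIM.
The function call graph of the filtering module is shown below:
The main functions of the x264 filtering module are analyzed in turn below.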
1. The x264_slice_write() function
x264_slice_write() calls x264_fdec_filter_row(), which corresponds to the filtering module. For a detailed code analysis, see "x264 Code Analysis (9): The x264_slice's'_write() Function of x264_encoder_encode()".
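2. The x264_fdec_filter_row() function
x264_fdec_filter_row() filters one row of macroblocks. It is defined in encoder\encoder.c and performs three steps:
(1) Loop filtering (deblocking filter), implemented by calling x264_frame_deblock_row().
(2) Half-pixel interpolation, implemented by calling x264_frame_filter().
(3) Computation of the video quality metrics SSIM and PSNR. For PSNR, x264_pixel_ssd_wxh() is called, and only the SSD is computed at this point; SSIM is computed by x264_pixel_ssim_wxh().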
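Step (3) only accumulates the SSD for the rows that have just been filtered. PSNR can then be derived once the whole frame has been covered, as PSNR = 10 * log10(MAX^2 * N / SSD), where N is the number of samples and MAX is the peak sample value (255 for 8-bit video). Below is a minimal sketch of that band-by-band accumulation. It is not the x264 implementation: `frame_stat_t`, `ssd_wxh` and `accumulate_rows` are hypothetical names, samples are assumed to be 8-bit, and the real x264_pixel_ssd_wxh() takes a different argument list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-frame accumulator, for illustration only. */
typedef struct
{
    uint64_t ssd;      /* sum of squared differences so far */
    uint64_t samples;  /* number of samples covered so far */
} frame_stat_t;

/* Plain-C SSD over a width x height region of two 8-bit planes. */
static uint64_t ssd_wxh( const uint8_t *rec, ptrdiff_t rec_stride,
                         const uint8_t *org, ptrdiff_t org_stride,
                         int width, int height )
{
    uint64_t ssd = 0;
    for( int y = 0; y < height; y++ )
    {
        for( int x = 0; x < width; x++ )
        {
            int d = rec[x] - org[x];
            ssd += (uint64_t)(d * d);
        }
        rec += rec_stride;
        org += org_stride;
    }
    return ssd;
}

/* Add pixel rows [min_row, max_row) of one plane to the running totals.
 * Both planes are assumed to share the same stride. */
static void accumulate_rows( frame_stat_t *st,
                             const uint8_t *rec, const uint8_t *org,
                             ptrdiff_t stride, int width,
                             int min_row, int max_row )
{
    assert( min_row <= max_row );
    st->ssd += ssd_wxh( rec + min_row * stride, stride,
                        org + min_row * stride, stride,
                        width, max_row - min_row );
    st->samples += (uint64_t)width * (uint64_t)(max_row - min_row);
}
```

Calling `accumulate_rows()` once per filtered band and dividing `ssd` by `samples` at the end of the frame gives the mean squared error, from which PSNR follows by the formula above. Accumulating per band produces the same total as a single pass over the whole plane.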
The corresponding code analysis is as follows:
/******************************************************************/
/******************************************************************/
/*
======Analysed by RuiDong Fang
======Csdn Blog:http://blog.csdn.net/frd2009041510
======Date:2016.04.06
*/
/******************************************************************/
/******************************************************************/
/************====== x264_fdec_filter_row() function ======************/
/*
Purpose: filter one row of macroblocks (deblocking filter, half-pixel interpolation, SSIM/PSNR computation, etc.)
*/
static void x264_fdec_filter_row( x264_t *h, int mb_y, int pass )
{
/* mb_y is the mb to be encoded next, not the mb to be filtered here */
int b_hpel = h->fdec->b_kept_as_ref;
int b_deblock = h->sh.i_disable_deblocking_filter_idc != 1;
int b_end = mb_y == h->i_threadslice_end;
int b_measure_quality = 1;
int min_y = mb_y - (1 << SLICE_MBAFF);
int b_start = min_y == h->i_threadslice_start;
/* Even in interlaced mode, deblocking never modifies more than 4 pixels
* above each MB, as bS=4 doesn't happen for the top of interlaced mbpairs. */
int minpix_y = min_y*16 - 4 * !b_start;
int maxpix_y = mb_y*16 - 4 * !b_end;
b_deblock &= b_hpel || h->param.b_full_recon || h->param.psz_dump_yuv;
if( h->param.b_sliced_threads )
{
switch( pass )
{
/* During encode: only do deblock if asked for */
default:
case 0:
b_deblock &= h->param.b_full_recon;
b_hpel = 0;
break;
/* During post-encode pass: do deblock if not done yet, do hpel for all
* rows except those between slices. */
case 1:
b_deblock &= !h->param.b_full_recon;
b_hpel &= !(b_start && min_y > 0);
b_measure_quality = 0;
break;
/* Final pass: do the rows between slices in sequence. */
case 2:
b_deblock = 0;
b_measure_quality = 0;
break;
}
}
if( mb_y & SLICE_MBAFF )
return;
if( min_y < h->i_threadslice_start )
return;
if( b_deblock )
for( int y = min_y; y < mb_y; y += (1 << SLICE_MBAFF) )
x264_frame_deblock_row( h, y ); /* deblocking filter */
/* FIXME: Prediction requires different borders for interlaced/progressive mc,
* but the actual image data is equivalent. For now, maintain this
* consistency by copying deblocked pixels between planes. */
if( PARAM_INTERLACED && (!h->param.b_sliced_threads || pass == 1) )
for( int p = 0; p < h->fdec->i_plane; p++ )
for( int i = minpix_y>>(CHROMA_V_SHIFT && p); i < maxpix_y>>(CHROMA_V_SHIFT && p); i++ )
memcpy( h->fdec->plane_fld[p] + i*h->fdec->i_stride[p],
h->fdec->plane[p] + i*h->fdec->i_stride[p],
h->mb.i_mb_width*16*sizeof(pixel) );
if( h->fdec->b_kept_as_ref && (!h->param.b_sliced_threads || pass == 1) )