Decoder-Side Motion Vector Refinement (DMVR)
To improve the accuracy of merge-mode MVs, VVC adopts a decoder-side motion vector refinement technique based on bilateral matching (BM). In bi-prediction, one motion vector is taken from each reference list (MV0 from list0, MV1 from list1), and the prediction blocks they point to are weighted to form the final prediction block. During bi-prediction, DMVR takes MV0 and MV1 as the initial MVs and searches around each of them for more accurate vectors MV0' and MV1'.
The BM method computes the distortion between the two candidate blocks in reference picture lists L0 and L1. As shown in the figure below, MV0' and MV1' are obtained by searching around the initial MVs, and the SAD between the red blocks is computed; the MV pair with the lowest SAD becomes the final refined MV pair and is used to generate the bi-prediction signal.
In VVC, DMVR is applied only to CUs that satisfy all of the following conditions:
- CU-level merge mode with bi-prediction
- One reference picture precedes the current picture and the other follows it
- The forward and backward reference pictures are at the same distance (POC difference) from the current picture
- Both reference pictures are short-term reference pictures
- The CU has more than 64 luma samples (the code below checks width × height ≥ 128)
- Both CU height and CU width are greater than or equal to 8
- BCW uses equal weights
- Weighted prediction (WP) is not enabled for the current block
- The current block does not use CIIP mode
bool PU::checkDMVRCondition(const PredictionUnit& pu)
{
if (pu.cs->sps->getUseDMVR() && !pu.cs->picHeader->getDisDmvrFlag())
{
const int refIdx0 = pu.refIdx[REF_PIC_LIST_0];
const int refIdx1 = pu.refIdx[REF_PIC_LIST_1];
const WPScalingParam *wp0 = pu.cu->slice->getWpScaling(REF_PIC_LIST_0, refIdx0);
const WPScalingParam *wp1 = pu.cu->slice->getWpScaling(REF_PIC_LIST_1, refIdx1);
const bool ref0IsScaled = refIdx0 < 0 || refIdx0 >= MAX_NUM_REF
? false
: pu.cu->slice->getRefPic(REF_PIC_LIST_0, refIdx0)->isRefScaled(pu.cs->pps);
const bool ref1IsScaled = refIdx1 < 0 || refIdx1 >= MAX_NUM_REF
? false
: pu.cu->slice->getRefPic(REF_PIC_LIST_1, refIdx1)->isRefScaled(pu.cs->pps);
return pu.mergeFlag && pu.mergeType == MRG_TYPE_DEFAULT_N && !pu.ciipFlag && !pu.cu->affine && !pu.mmvdMergeFlag
&& !pu.cu->mmvdSkip && PU::isBiPredFromDifferentDirEqDistPoc(pu) && (pu.lheight() >= 8) && (pu.lwidth() >= 8)
&& ((pu.lheight() * pu.lwidth()) >= 128) && (pu.cu->BcwIdx == BCW_DEFAULT)
&& !WPScalingParam::isWeighted(wp0) && !WPScalingParam::isWeighted(wp1) && !ref0IsScaled && !ref1IsScaled;
}
else
{
return false;
}
}
The refined MV produced by DMVR is used to generate the inter prediction samples and serves as the temporal motion vector predictor (TMVP) for subsequent pictures; the original MV is used in deblocking filtering and as the spatial motion vector predictor for subsequent CUs.
The maximum unit size of the DMVR search process is limited to 16x16. When the CU width and/or height exceeds 16 luma samples, the CU is further split into sub-blocks with width and/or height equal to 16 luma samples, as shown in the code below. For convenience, a sub-block on which DMVR performs its search is referred to below as a DMVR search unit.
static const int DMVR_SUBCU_WIDTH = 16;
static const int DMVR_SUBCU_HEIGHT = 16;
int dy = std::min<int>(pu.lumaSize().height, DMVR_SUBCU_HEIGHT);
int dx = std::min<int>(pu.lumaSize().width, DMVR_SUBCU_WIDTH);
Search scheme
In DMVR, the search points surround the initial MVs, and the MV offsets obey the MV-difference mirroring rule. In other words, any search point examined by DMVR, denoted by the candidate MV pair (MV0', MV1'), satisfies the following two equations:
MV0' = MV0 + MV_offset
MV1' = MV1 - MV_offset
where MV_offset denotes the offset between the initial MV and the search MV' in the reference picture. The search range is two integer luma samples around the initial MVs. The search consists of two stages: an integer-sample offset search followed by a fractional-sample refinement.
Integer-sample search:
The search covers the initial MV and the 25 points within an offset of at most 2 integer samples around it. The SAD of the initial MV0/MV1 pair (the merge MVs serve as the initial MVs) is computed first. If it is below a threshold, the integer search terminates early; the threshold is the number of samples in the DMVR search unit (256 for a 16x16 sub-PU), and, to reduce the penalty from DMVR refinement uncertainty, the SAD used in the comparison is the original SAD minus one quarter of itself. Otherwise, the SADs of the remaining 24 points are computed and checked in raster-scan order, and the point with the minimum SAD is selected as the result of the integer search.
Note: the SAD here is computed between the predicted sub-PU obtained from reference list L0 and the predicted sub-PU obtained from reference list L1.
The 25 search points are listed below:
Mv m_pSearchOffset[25] = { Mv(-2,-2), Mv(-1,-2), Mv(0,-2), Mv(1,-2), Mv(2,-2),
Mv(-2,-1), Mv(-1,-1), Mv(0,-1), Mv(1,-1), Mv(2,-1),
Mv(-2, 0), Mv(-1, 0), Mv(0, 0), Mv(1, 0), Mv(2, 0),
Mv(-2, 1), Mv(-1, 1), Mv(0, 1), Mv(1, 1), Mv(2, 1),
Mv(-2, 2), Mv(-1, 2), Mv(0, 2), Mv(1, 2), Mv(2, 2) };
Fractional-sample search:
To save computational complexity, the fractional-sample search uses a parametric error surface equation instead of further SAD comparisons. If the SAD of the initial MV0/MV1 pair falls below the threshold during the integer search, the integer search terminates early and the fractional search is skipped as well.
In the parametric-error-surface-based sub-pel offset estimation, the cost of the integer-search result and the costs of its four neighbouring positions are used to fit a 2-D parabolic error surface of the form:
E(x, y) = A(x - xmin)^2 + B(y - ymin)^2 + C
where (xmin, ymin) is the fractional position with the smallest cost and C is the minimum cost value. Solving the equation using the costs of the five search points gives:
xmin = (E(-1, 0) - E(1, 0)) / (2(E(-1, 0) + E(1, 0) - 2E(0, 0)))
ymin = (E(0, -1) - E(0, 1)) / (2(E(0, -1) + E(0, 1) - 2E(0, 0)))
Since all cost values are positive and the smallest value is E(0, 0), xmin and ymin are automatically constrained to the range -8 to 8, which corresponds to half a sample at 1/16-pel MV accuracy. The computed fractional offset (xmin, ymin) is added to the integer-refined MV to obtain the sub-pel accurate refined MV.
Bilinear interpolation and sample padding
In VVC, MVs have 1/16 luma-sample accuracy, and fractional samples are normally interpolated with an 8-tap filter. To reduce computational complexity, DMVR instead uses a bilinear interpolation filter to generate the fractional samples needed by its search process. Thanks to the bilinear filter, within the 2-sample search range DMVR accesses no more reference samples than the normal motion compensation process.
After the refined MV is obtained by the DMVR search process, the regular 8-tap interpolation filter is applied to generate the final prediction.
Note: to keep the encoder and decoder consistent, DMVR is performed on both sides.
The overall DMVR processing flow is:
- Use the merge MVs as the initial MVs
- Determine the DMVR search unit size (at most 16x16)
- Traverse the whole PU, taking one DMVR search unit as a sub-PU at a time
- Initialization: obtain the sub-PU and copy its corresponding blocks in the reference pictures into the DMVR buffers
- Integer-sample search: xBIPMVRefine
- Fractional-sample search: xDMVRSubPixelErrorSurface
- Motion compensation: xFinalPaddedMCForDMVR
- Bi-directional weighted prediction: xWeightedAverage
The annotated DMVR code is shown below (based on VTM 10.0):
// DMVR process
void InterPrediction::xProcessDMVR(PredictionUnit& pu, PelUnitBuf &pcYuvDst, const ClpRngs &clpRngs, const bool bioApplied)
{
int iterationCount = 1;
/*Always High Precision*/
int mvShift = MV_FRACTIONAL_BITS_INTERNAL;
/*use merge MV as starting MV*/
Mv mergeMv[] = { pu.mv[REF_PIC_LIST_0] , pu.mv[REF_PIC_LIST_1] }; // use the merge MVs as the initial MVs
m_biLinearBufStride = (MAX_CU_SIZE + (2 * DMVR_NUM_ITERATION));
// the DMVR search unit is at most 16x16; dy/dx give the unit size of the DMVR search process
int dy = std::min<int>(pu.lumaSize().height, DMVR_SUBCU_HEIGHT);
int dx = std::min<int>(pu.lumaSize().width, DMVR_SUBCU_WIDTH);
Position puPos = pu.lumaPos(); // position of the PU
int bd = pu.cs->slice->getClpRngs().comp[COMPONENT_Y].bd;
int bioEnabledThres = 2 * dy * dx;
bool bioAppliedType[MAX_NUM_SUBCU_DMVR];
#if JVET_J0090_MEMORY_BANDWITH_MEASURE
JVET_J0090_SET_CACHE_ENABLE(true);
for (int k = 0; k < NUM_REF_PIC_LIST_01; k++)
{
RefPicList refId = (RefPicList)k;
const Picture* refPic = pu.cu->slice->getRefPic(refId, pu.refIdx[refId]);
for (int compID = 0; compID < MAX_NUM_COMPONENT; compID++)
{
Mv cMv = pu.mv[refId];
int mvshiftTemp = mvShift + getComponentScaleX((ComponentID)compID, pu.chromaFormat);
int filtersize = (compID == (COMPONENT_Y)) ? NTAPS_LUMA : NTAPS_CHROMA;
cMv += Mv(-(((filtersize >> 1) - 1) << mvshiftTemp), -(((filtersize >> 1) - 1) << mvshiftTemp));
bool wrapRef = false;
if ( pu.cs->pps->getWrapAroundEnabledFlag() )
{
wrapRef = wrapClipMv(cMv, pu.blocks[0].pos(), pu.blocks[0].size(), pu.cs->sps, pu.cs->pps);
}
else
{
clipMv(cMv, pu.lumaPos(), pu.lumaSize(), *pu.cs->sps, *pu.cs->pps);
}
int width = pcYuvDst.bufs[compID].width + (filtersize - 1);
int height = pcYuvDst.bufs[compID].height + (filtersize - 1);
CPelBuf refBuf;
Position recOffset = pu.blocks[compID].pos().offset(cMv.getHor() >> mvshiftTemp, cMv.getVer() >> mvshiftTemp);
refBuf = refPic->getRecoBuf(CompArea((ComponentID)compID, pu.chromaFormat, recOffset, pu.blocks[compID].size()), wrapRef);
JVET_J0090_SET_REF_PICTURE(refPic, (ComponentID)compID);
for (int row = 0; row < height; row++)
{
for (int col = 0; col < width; col++)
{
JVET_J0090_CACHE_ACCESS(((Pel *)refBuf.buf) + row * refBuf.stride + col, __FILE__, __LINE__);
}
}
}
}
JVET_J0090_SET_CACHE_ENABLE(false);
#endif
{
int num = 0; // index of the current DMVR search sub-PU
int scaleX = getComponentScaleX(COMPONENT_Cb, pu.chromaFormat);
int scaleY = getComponentScaleY(COMPONENT_Cb, pu.chromaFormat);
m_biLinearBufStride = (dx + (2 * DMVR_NUM_ITERATION));
// point the MC buffer at the centre sample so each iteration can reach its start without a multiplication
Pel *biLinearPredL0 = m_cYuvPredTempDMVRL0 + (DMVR_NUM_ITERATION * m_biLinearBufStride) + DMVR_NUM_ITERATION;
Pel *biLinearPredL1 = m_cYuvPredTempDMVRL1 + (DMVR_NUM_ITERATION * m_biLinearBufStride) + DMVR_NUM_ITERATION;
PredictionUnit subPu = pu;
subPu.UnitArea::operator=(UnitArea(pu.chromaFormat, Area(puPos.x, puPos.y, dx, dy)));
// buffers for the padded reference data
m_cYuvRefBuffDMVRL0 = (pu.chromaFormat == CHROMA_400 ?
PelUnitBuf(pu.chromaFormat, PelBuf(m_cRefSamplesDMVRL0[0], pcYuvDst.Y())) :
PelUnitBuf(pu.chromaFormat, PelBuf(m_cRefSamplesDMVRL0[0], pcYuvDst.Y()),
PelBuf(m_cRefSamplesDMVRL0[1], pcYuvDst.Cb()), PelBuf(m_cRefSamplesDMVRL0[2], pcYuvDst.Cr())));
m_cYuvRefBuffDMVRL0 = m_cYuvRefBuffDMVRL0.subBuf(UnitAreaRelative(pu, subPu));
m_cYuvRefBuffDMVRL1 = (pu.chromaFormat == CHROMA_400 ?
PelUnitBuf(pu.chromaFormat, PelBuf(m_cRefSamplesDMVRL1[0], pcYuvDst.Y())) :
PelUnitBuf(pu.chromaFormat, PelBuf(m_cRefSamplesDMVRL1[0], pcYuvDst.Y()), PelBuf(m_cRefSamplesDMVRL1[1], pcYuvDst.Cb()),
PelBuf(m_cRefSamplesDMVRL1[2], pcYuvDst.Cr())));
m_cYuvRefBuffDMVRL1 = m_cYuvRefBuffDMVRL1.subBuf(UnitAreaRelative(pu, subPu));
PelUnitBuf srcPred0 = (pu.chromaFormat == CHROMA_400 ?
PelUnitBuf(pu.chromaFormat, PelBuf(m_acYuvPred[0][0], pcYuvDst.Y())) :
PelUnitBuf(pu.chromaFormat, PelBuf(m_acYuvPred[0][0], pcYuvDst.Y()), PelBuf(m_acYuvPred[0][1], pcYuvDst.Cb()), PelBuf(m_acYuvPred[0][2], pcYuvDst.Cr())));
PelUnitBuf srcPred1 = (pu.chromaFormat == CHROMA_400 ?
PelUnitBuf(pu.chromaFormat, PelBuf(m_acYuvPred[1][0], pcYuvDst.Y())) :
PelUnitBuf(pu.chromaFormat, PelBuf(m_acYuvPred[1][0], pcYuvDst.Y()), PelBuf(m_acYuvPred[1][1], pcYuvDst.Cb()), PelBuf(m_acYuvPred[1][2], pcYuvDst.Cr())));
srcPred0 = srcPred0.subBuf(UnitAreaRelative(pu, subPu));
srcPred1 = srcPred1.subBuf(UnitAreaRelative(pu, subPu));
int yStart = 0;
// traverse the whole PU, one DMVR search unit at a time
for (int y = puPos.y; y < (puPos.y + pu.lumaSize().height); y = y + dy, yStart = yStart + dy)
{
for (int x = puPos.x, xStart = 0; x < (puPos.x + pu.lumaSize().width); x = x + dx, xStart = xStart + dx)
{
PredictionUnit subPu = pu;
// obtain a sub-PU of the current DMVR search-unit size
subPu.UnitArea::operator=(UnitArea(pu.chromaFormat, Area(x, y, dx, dy)));
// copy the corresponding reference-picture regions into the DMVR buffers
xPrefetch(subPu, m_cYuvRefBuffDMVRL0, REF_PIC_LIST_0, 1);
xPrefetch(subPu, m_cYuvRefBuffDMVRL1, REF_PIC_LIST_1, 1);
xinitMC(subPu, clpRngs); // initial motion compensation (bilinear interpolation)
uint64_t minCost = MAX_UINT64;
bool notZeroCost = true; // whether the fractional search should still be performed
int16_t totalDeltaMV[2] = { 0,0 };
int16_t deltaMV[2] = { 0, 0 };
uint64_t *pSADsArray;
for (int i = 0; i < (((2 * DMVR_NUM_ITERATION) + 1) * ((2 * DMVR_NUM_ITERATION) + 1)); i++)
{
// initialize the SAD values of the 25 integer search points
m_SADsArray[i] = MAX_UINT64;
}
// pSADsArray points at the middle of m_SADsArray, i.e. the MV(0,0) entry of the 25-point search
pSADsArray = &m_SADsArray[(((2 * DMVR_NUM_ITERATION) + 1) * ((2 * DMVR_NUM_ITERATION) + 1)) >> 1];
//=============================== integer-sample search =============================
for (int i = 0; i < iterationCount; i++)
{
deltaMV[0] = 0;
deltaMV[1] = 0;
// predicted sub-PU of the current sub-PU from reference list 0
Pel *addrL0 = biLinearPredL0 + totalDeltaMV[0] + (totalDeltaMV[1] * m_biLinearBufStride);
// predicted sub-PU of the current sub-PU from reference list 1
Pel *addrL1 = biLinearPredL1 - totalDeltaMV[0] - (totalDeltaMV[1] * m_biLinearBufStride);
if (i == 0)
{
// cost of the initial MVs
minCost = xDMVRCost(clpRngs.comp[COMPONENT_Y].bd, addrL0, m_biLinearBufStride, addrL1, m_biLinearBufStride, dx, dy);
minCost -= (minCost >>2); // subtract a quarter of the cost of the initial MVs
if (minCost < (dx * dy))
{
// the cost of the initial MVs is below the threshold: terminate the integer search early
notZeroCost = false;
break;
}
pSADsArray[0] = minCost; // store the initial cost at the MV(0,0) entry
}
if (!minCost)
{
notZeroCost = false;
break;
}
// evaluate the costs of the 25 integer points
xBIPMVRefine(bd, addrL0, addrL1, minCost, deltaMV, pSADsArray, dx, dy);
if (deltaMV[0] == 0 && deltaMV[1] == 0) // the integer search has converged: the best offset is (0,0)
{
break;
}
totalDeltaMV[0] += deltaMV[0]; // accumulate the horizontal offset of the integer search
totalDeltaMV[1] += deltaMV[1]; // accumulate the vertical offset of the integer search
pSADsArray += ((deltaMV[1] * (((2 * DMVR_NUM_ITERATION) + 1))) + deltaMV[0]); // move the SAD pointer to the entry of the MV found by the integer search
}
//=============================== fractional-sample search =============================
bioAppliedType[num] = (minCost < bioEnabledThres) ? false : bioApplied;
totalDeltaMV[0] = (totalDeltaMV[0] << mvShift);
totalDeltaMV[1] = (totalDeltaMV[1] << mvShift);
// estimate the fractional-sample offset
xDMVRSubPixelErrorSurface(notZeroCost, totalDeltaMV, deltaMV, pSADsArray);
pu.mvdL0SubPu[num] = Mv(totalDeltaMV[0], totalDeltaMV[1]); // store the refined MV delta in mvdL0SubPu
PelUnitBuf subPredBuf = pcYuvDst.subBuf(UnitAreaRelative(pu, subPu));
bool blockMoved = false;
if (pu.mvdL0SubPu[num] != Mv(0, 0)) // the refined MV delta is not (0,0)
{
blockMoved = true;
if (isChromaEnabled(pu.chromaFormat))
{
xPrefetch(subPu, m_cYuvRefBuffDMVRL0, REF_PIC_LIST_0, 0);
xPrefetch(subPu, m_cYuvRefBuffDMVRL1, REF_PIC_LIST_1, 0);
}
xPad(subPu, m_cYuvRefBuffDMVRL0, REF_PIC_LIST_0);
xPad(subPu, m_cYuvRefBuffDMVRL1, REF_PIC_LIST_1);
}
int dstStride[MAX_NUM_COMPONENT] = { pcYuvDst.bufs[COMPONENT_Y].stride,
isChromaEnabled(pu.chromaFormat) ? pcYuvDst.bufs[COMPONENT_Cb].stride : 0,
isChromaEnabled(pu.chromaFormat) ? pcYuvDst.bufs[COMPONENT_Cr].stride : 0};
subPu.mv[0] = mergeMv[REF_PIC_LIST_0] + pu.mvdL0SubPu[num];
subPu.mv[1] = mergeMv[REF_PIC_LIST_1] - pu.mvdL0SubPu[num];
subPu.mv[0].clipToStorageBitDepth();
subPu.mv[1].clipToStorageBitDepth();
//=============================== search finished: perform motion compensation to obtain the prediction samples =============================
xFinalPaddedMCForDMVR(subPu, srcPred0, srcPred1, m_cYuvRefBuffDMVRL0, m_cYuvRefBuffDMVRL1, bioAppliedType[num],
mergeMv, blockMoved);
subPredBuf.bufs[COMPONENT_Y].buf = pcYuvDst.bufs[COMPONENT_Y].buf + xStart + yStart * dstStride[COMPONENT_Y];
if (isChromaEnabled(pu.chromaFormat))
{
subPredBuf.bufs[COMPONENT_Cb].buf = pcYuvDst.bufs[COMPONENT_Cb].buf + (xStart >> scaleX) + ((yStart >> scaleY) * dstStride[COMPONENT_Cb]);
subPredBuf.bufs[COMPONENT_Cr].buf = pcYuvDst.bufs[COMPONENT_Cr].buf + (xStart >> scaleX) + ((yStart >> scaleY) * dstStride[COMPONENT_Cr]);
}
// bi-directional weighted prediction
xWeightedAverage(subPu, srcPred0, srcPred1, subPredBuf, subPu.cu->slice->getSPS()->getBitDepths(), subPu.cu->slice->clpRngs(), bioAppliedType[num]);
num++;
}
}
}
JVET_J0090_SET_CACHE_ENABLE(true);
}
The xBIPMVRefine function performs the integer-sample search, traversing the 25 search points:
void InterPrediction::xBIPMVRefine(int bd, Pel *pRefL0, Pel *pRefL1, uint64_t& minCost, int16_t *deltaMV, uint64_t *pSADsArray, int width, int height)
{
const int32_t refStrideL0 = m_biLinearBufStride;
const int32_t refStrideL1 = m_biLinearBufStride;
Pel *pRefL0Orig = pRefL0;
Pel *pRefL1Orig = pRefL1;
// traverse the 25 search points
for (int nIdx = 0; (nIdx < 25); ++nIdx)
{ // evaluate the cost of each integer point and keep track of the minimum
int32_t sadOffset = ((m_pSearchOffset[nIdx].getVer() * ((2 * DMVR_NUM_ITERATION) + 1)) + m_pSearchOffset[nIdx].getHor());
// bilateral matching: MV-difference mirroring rule
// predicted sub-PU from reference list 0
pRefL0 = pRefL0Orig + m_pSearchOffset[nIdx].hor + (m_pSearchOffset[nIdx].ver * refStrideL0);
// predicted sub-PU from reference list 1
pRefL1 = pRefL1Orig - m_pSearchOffset[nIdx].hor - (m_pSearchOffset[nIdx].ver * refStrideL1);
if (*(pSADsArray + sadOffset) == MAX_UINT64)
{ // compute the DMVR SAD
const uint64_t cost = xDMVRCost(bd, pRefL0, refStrideL0, pRefL1, refStrideL1, width, height);
*(pSADsArray + sadOffset) = cost;
}
if (*(pSADsArray + sadOffset) < minCost)
{
minCost = *(pSADsArray + sadOffset);
deltaMV[0] = m_pSearchOffset[nIdx].getHor();
deltaMV[1] = m_pSearchOffset[nIdx].getVer();
}
}
}
The fractional-sample search is performed by calling xSubPelErrorSrfc from within xDMVRSubPixelErrorSurface:
void xDMVRSubPixelErrorSurface(bool notZeroCost, int16_t *totalDeltaMV, int16_t *deltaMV, uint64_t *pSADsArray)
{
int sadStride = (((2 * DMVR_NUM_ITERATION) + 1));
uint64_t sadbuffer[5];
if (notZeroCost && (abs(totalDeltaMV[0]) != (2 << MV_FRACTIONAL_BITS_INTERNAL))
&& (abs(totalDeltaMV[1]) != (2 << MV_FRACTIONAL_BITS_INTERNAL)))
{
int32_t tempDeltaMv[2] = { 0,0 };
// costs of the integer-search result and its left, top, right and bottom neighbours
sadbuffer[0] = pSADsArray[0];
sadbuffer[1] = pSADsArray[-1];
sadbuffer[2] = pSADsArray[-sadStride];
sadbuffer[3] = pSADsArray[1];
sadbuffer[4] = pSADsArray[sadStride];
xSubPelErrorSrfc(sadbuffer, tempDeltaMv);
totalDeltaMV[0] += tempDeltaMv[0];
totalDeltaMV[1] += tempDeltaMv[1];
}
}
void xSubPelErrorSrfc(uint64_t *sadBuffer, int32_t *deltaMv)
{
int64_t numerator, denominator;
int32_t mvDeltaSubPel;
int32_t mvSubPelLvl = 4;/*1: half pel, 2: Qpel, 3:1/8, 4: 1/16*/
/* horizontal x */
numerator = (int64_t)((sadBuffer[1] - sadBuffer[3]) << mvSubPelLvl);
denominator = (int64_t)((sadBuffer[1] + sadBuffer[3] - (sadBuffer[0] << 1)));
if (0 != denominator)
{
if ((sadBuffer[1] != sadBuffer[0]) && (sadBuffer[3] != sadBuffer[0]))
{
mvDeltaSubPel = div_for_maxq7(numerator, denominator);
deltaMv[0] = (mvDeltaSubPel);
}
else
{
if (sadBuffer[1] == sadBuffer[0])
{
deltaMv[0] = -8; // half pel
}
else
{
deltaMv[0] = 8; // half pel
}
}
}
/* vertical y */
numerator = (int64_t)((sadBuffer[2] - sadBuffer[4]) << mvSubPelLvl);
denominator = (int64_t)((sadBuffer[2] + sadBuffer[4] - (sadBuffer[0] << 1)));
if (0 != denominator)
{
if ((sadBuffer[2] != sadBuffer[0]) && (sadBuffer[4] != sadBuffer[0]))
{
mvDeltaSubPel = div_for_maxq7(numerator, denominator);
deltaMv[1] = (mvDeltaSubPel);
}
else
{
if (sadBuffer[2] == sadBuffer[0])
{
deltaMv[1] = -8; // half pel
}
else
{
deltaMv[1] = 8; // half pel
}
}
}
return;
}
The xFinalPaddedMCForDMVR function performs motion compensation, producing the predicted sub-PUs from the list-0 and list-1 reference pictures:
void InterPrediction::xFinalPaddedMCForDMVR(PredictionUnit &pu, PelUnitBuf &pcYuvSrc0, PelUnitBuf &pcYuvSrc1,
PelUnitBuf &pcPad0, PelUnitBuf &pcPad1, const bool bioApplied,
const Mv mergeMV[NUM_REF_PIC_LIST_01], bool blockMoved)
{
int offset, deltaIntMvX, deltaIntMvY;
PelUnitBuf pcYUVTemp = pcYuvSrc0;
PelUnitBuf pcPadTemp = pcPad0;
/* always high-precision MVs are used */
int mvShift = MV_FRACTIONAL_BITS_INTERNAL;
for (int k = 0; k < NUM_REF_PIC_LIST_01; k++)
{
RefPicList refId = (RefPicList)k;
Mv cMv = pu.mv[refId]; // refined MV
m_iRefListIdx = refId;
// reference picture
const Picture* refPic = pu.cu->slice->getRefPic( refId, pu.refIdx[refId] )->unscaledPic;
Mv cMvClipped = cMv;
if( !pu.cs->pps->getWrapAroundEnabledFlag() )
{
clipMv( cMvClipped, pu.lumaPos(), pu.lumaSize(), *pu.cs->sps, *pu.cs->pps );
}
Mv startMv = mergeMV[refId]; // initial MV (the merge MV)
if( g_mctsDecCheckEnabled && !MCTSHelper::checkMvForMCTSConstraint( pu, startMv, MV_PRECISION_INTERNAL ) )
{
const Area& tileArea = pu.cs->picture->mctsInfo.getTileArea();
printf( "Attempt an access over tile boundary at block %d,%d %d,%d with MV %d,%d (in Tile TL: %d,%d BR: %d,%d)\n",
pu.lx(), pu.ly(), pu.lwidth(), pu.lheight(), startMv.getHor(), startMv.getVer(), tileArea.topLeft().x, tileArea.topLeft().y, tileArea.bottomRight().x, tileArea.bottomRight().y );
THROW( "MCTS constraint failed!" );
}
// generate the final prediction with the regular 8-tap filter
for (int compID = 0; compID < getNumberValidComponents(pu.chromaFormat); compID++)
{
Pel *srcBufPelPtr = NULL;
int pcPadstride = 0;
if (blockMoved || (compID == 0))
{
pcPadstride = pcPadTemp.bufs[compID].stride;
int mvshiftTempHor = mvShift + getComponentScaleX((ComponentID)compID, pu.chromaFormat);
int mvshiftTempVer = mvShift + getComponentScaleY((ComponentID)compID, pu.chromaFormat);
int leftPixelExtra;
if (compID == COMPONENT_Y)
{
leftPixelExtra = (NTAPS_LUMA >> 1) - 1;
}
else
{
leftPixelExtra = (NTAPS_CHROMA >> 1) - 1;
}
PelBuf &srcBuf = pcPadTemp.bufs[compID];
deltaIntMvX = (cMv.getHor() >> mvshiftTempHor) - (startMv.getHor() >> mvshiftTempHor);
deltaIntMvY = (cMv.getVer() >> mvshiftTempVer) - (startMv.getVer() >> mvshiftTempVer);
CHECK((abs(deltaIntMvX) > DMVR_NUM_ITERATION) || (abs(deltaIntMvY) > DMVR_NUM_ITERATION), "not expected DMVR movement");
offset = (DMVR_NUM_ITERATION + leftPixelExtra) * (pcPadTemp.bufs[compID].stride + 1);
offset += (deltaIntMvY)* pcPadTemp.bufs[compID].stride;
offset += (deltaIntMvX);
srcBufPelPtr = (srcBuf.buf + offset);
}
JVET_J0090_SET_CACHE_ENABLE(false);
xPredInterBlk((ComponentID) compID, pu, refPic, cMvClipped, pcYUVTemp, true,
pu.cs->slice->getClpRngs().comp[compID], bioApplied, false,
pu.cu->slice->getScalingRatio(refId, pu.refIdx[refId]), 0, 0, 0, srcBufPelPtr, pcPadstride);
JVET_J0090_SET_CACHE_ENABLE(false);
}
pcYUVTemp = pcYuvSrc1;
pcPadTemp = pcPad1;
}
}