x265 Rate Control: Basic Principles of the ABR Algorithm, with a Source-Code Walkthrough

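Purpose: functionally, the rate control module of a video encoder is responsible for supplying the quantization stage with a suitable quantization parameter (QP) at encode time. For a given frame, or even a given macroblock, whether a high QP or a low QP gives the better coding performance is a policy decision, and the rate control module is what makes it.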

For background on the principles, see the CSDN blog post "x265代码阅读:码率控制(一)" (x265 code reading: rate control, part 1) on 编码视界的博客.

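The rate control algorithm in x265 is essentially the same as the one in x264. It is largely empirical and differs from the rate control algorithms recommended by the various ITU-T/MPEG standards.
Rate control in x265 appears to operate at the frame level only. There are rate-control parameters tied to CUs, but those belong to block-level rate-distortion optimization rather than to block-level rate control. x265 supports three rate control modes: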

/* rate tolerance method */
typedef enum
{
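    X265_RC_ABR, // average bit rate; corresponds to what the literature calls CBR
    X265_RC_CQP, // constant QP; corresponds to HM with rate control disabled
    X265_RC_CRF  // constant rate factor
                // X265_RC_CRF is "quality-controlled VBR", known in the literature as consistent-quality rate control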
} X265_RC_METHODS;

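I. Implementation: details of the ABR algorithm (computing QP)

Reference: the CSDN blog post "x265--速率控制模块理解" (understanding the x265 rate control module) on 进击的研究僧的博客.

The overall flow is as follows.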

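1. Compute the blurred complexity cplx_blur(i) of the current frame

The blurred complexity of the picture is computed from the SATD of the current frame. Let SATD(i) be the SATD of the current frame and cplx_sum(i) the accumulated complexity:

$$\mathrm{cplx\_sum}(i) = 0.5 \cdot \mathrm{cplx\_sum}(i-1) + \mathrm{SATD}(i)$$

The blurred complexity of the current frame is

$$\mathrm{cplx\_blur}(i) = \frac{\mathrm{cplx\_sum}(i)}{\mathrm{cplx\_count}(i)}$$

where cplx_count is the accumulated weighted frame count:

$$\mathrm{cplx\_count}(i) = 0.5 \cdot \mathrm{cplx\_count}(i-1) + 1$$

In the x265 code that follows, SATD(i) is additionally normalized by the frame duration before it is accumulated.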

 /* 1pass ABR */

            /* Calculate the quantizer which would have produced the desired
             * average bitrate if it had been applied to all frames so far.
             * Then modulate that quant based on the current frame's complexity
             * relative to the average complexity so far (using the 2pass RCEQ).
             * Then bias the quant up or down if total size so far was far from
             * the target.
             * Result: Depending on the value of rate_tolerance, there is a
             * tradeoff between quality and bitrate precision. But at large
             * tolerances, the bit distribution approaches that of 2pass. */
            // blurred complexity of the current frame
            double overflow = 1;
            double lqmin = MIN_QPSCALE, lqmax = MAX_MAX_QPSCALE;
            m_shortTermCplxSum *= 0.5;
            m_shortTermCplxCount *= 0.5;
            m_shortTermCplxSum += m_currentSatd / (CLIP_DURATION(m_frameDuration) / BASE_FRAME_DURATION);
            m_shortTermCplxCount++;
            /* coeffBits to be used in 2-pass */
            rce->coeffBits = (int)m_currentSatd;
            // blurred complexity of the current frame: decayed SATD sum / decayed frame count
            rce->blurredComplexity = m_shortTermCplxSum / m_shortTermCplxCount;
            rce->mvBits = 0;
            rce->sliceType = m_sliceType;

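2. Compute the raw quantizer scale qscale_raw

$$\mathrm{qscale\_raw}(i) = \mathrm{cplx\_blur}(i)^{\,1 - qc}$$

where qc is the compression parameter (qCompress in the code), which controls how strongly qscale_raw varies with complexity.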

else
    // raw quantizer scale qscale_raw; pow(x, y) returns x raised to the power y
        q = pow(rce->blurredComplexity, 1 - m_param->rc.qCompress);
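3. qscale_raw is corrected twice (qscale is recomputed)

The first correction uses rate_factor:

$$\mathrm{qscale}(i) = \frac{\mathrm{qscale\_raw}(i)}{\mathrm{rate\_factor}}, \qquad \text{where} \qquad \mathrm{rate\_factor} = \frac{\mathrm{wanted\_bits\_window}}{\mathrm{cplxr\_sum}}$$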

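CRF and ABR pass different rateFactor arguments to getQScale: under CRF it is a constant, while under ABR it equals m_wantedBitsWindow / m_cplxrSum, which makes the resulting QScale proportional to the complexity of the video encoded so far.
Here wanted_bits_window is the accumulated target bits of all frames up to the current one, and cplxr_sum(i) is a complexity estimate for the current frame derived from how the preceding frames were quantized. It is updated iteratively, starting from the initial value
m_cplxrSum = .01 * pow(7.0e5, m_qCompress) * pow(m_ncu, 0.5) * tuneCplxFactor;
where
double tuneCplxFactor = (m_param->rc.cuTree && m_ncu > 3600) ? 2.5 :1; // 2.5 with cuTree on and m_ncu > 3600 (above 720p), otherwise 1.0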
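
Steps 2 and 3 chain together into a few lines of arithmetic. The sketch below is only an illustration under stated assumptions: the function names are hypothetical (they are not x265 functions), and the formulas are the ones quoted above, namely the raw scale `pow(blurredComplexity, 1 - qCompress)`, the ABR rate factor `wantedBitsWindow / cplxrSum`, and the initial value of `m_cplxrSum`.

```cpp
#include <cmath>

// Raw quantizer scale (step 2): blurred complexity compressed by qCompress.
double rawQScale(double blurredComplexity, double qCompress)
{
    return std::pow(blurredComplexity, 1.0 - qCompress);
}

// ABR rate factor (step 3): accumulated target bits over accumulated complexity.
double abrRateFactor(double wantedBitsWindow, double cplxrSum)
{
    return wantedBitsWindow / cplxrSum;
}

// First correction: divide the raw scale by the rate factor.
double correctedQScale(double blurredComplexity, double qCompress, double rateFactor)
{
    return rawQScale(blurredComplexity, qCompress) / rateFactor;
}

// Initial value of the accumulated complexity, as quoted in the text.
// ncu is the CU count used by rate control; cuTree mirrors m_param->rc.cuTree.
double initialCplxrSum(double qCompress, double ncu, bool cuTree)
{
    double tuneCplxFactor = (cuTree && ncu > 3600) ? 2.5 : 1.0;
    return 0.01 * std::pow(7.0e5, qCompress) * std::pow(ncu, 0.5) * tuneCplxFactor;
}
```

With qCompress = 1 the raw scale is 1 for every frame, so complexity no longer influences the quantizer; with qCompress = 0 the raw scale equals the blurred complexity itself.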

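Complexity update: after the rate_factor correction, the accumulated complexity is updated once the frame has been encoded:

$$\mathrm{cplxr\_sum}(i) = \mathrm{cplxr\_sum}(i-1) + \frac{\mathrm{bits}(i) \cdot \mathrm{qscale}(i)}{\mathrm{rceq}(i)}$$

where bits(i) is the number of bits the frame produced, qscale(i) is the quantizer scale it was actually coded with, and rceq(i) is its qscale_raw. For B frames, rceq(i) is additionally multiplied by |pbFactor|.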

/* After encoding one frame, update rate control state */
int RateControl::rateControlEnd(Frame* curFrame, int64_t bits, RateControlEntry* rce)
{
    int orderValue = m_startEndOrder.get();
    int endOrdinal = (rce->encodeOrder + m_param->frameNumThreads) * 2 - 1;
    while (orderValue < endOrdinal && !m_bTerminated)
    {
        /* no more frames are being encoded, so fake the start event if we would
         * have blocked on it. Note that this does not enforce rateControlEnd()
         * ordering during flush, but this has no impact on the outputs */
        if (m_finalFrameCount && orderValue >= 2 * m_finalFrameCount)
            break;
        orderValue = m_startEndOrder.waitForChange(orderValue);
    }

    FrameData& curEncData = *curFrame->m_encData;
    int64_t actualBits = bits;
    Slice *slice = curEncData.m_slice;

    if (m_param->rc.aqMode || m_isVbv)
    {
        if (m_isVbv)
        {
            /* determine avg QP decided by VBV rate control */
            for (uint32_t i = 0; i < slice->m_sps->numCuInHeight; i++)
                curEncData.m_avgQpRc += curEncData.m_rowStat[i].sumQpRc;

            curEncData.m_avgQpRc /= slice->m_sps->numCUsInFrame;
            rce->qpaRc = curEncData.m_avgQpRc;
        }

        if (m_param->rc.aqMode)
        {
            /* determine actual avg encoded QP, after AQ/cutree adjustments */
            for (uint32_t i = 0; i < slice->m_sps->numCuInHeight; i++)
                curEncData.m_avgQpAq += curEncData.m_rowStat[i].sumQpAq;

            curEncData.m_avgQpAq /= (slice->m_sps->numCUsInFrame * NUM_4x4_PARTITIONS);
        }
        else
            curEncData.m_avgQpAq = curEncData.m_avgQpRc;
    }

    if (m_isAbr)
    {
        if (m_param->rc.rateControlMode == X265_RC_ABR && !m_param->rc.bStatRead)
            checkAndResetABR(rce, true);

        if (m_param->rc.rateControlMode == X265_RC_CRF)
        {
            if (int(curEncData.m_avgQpRc + 0.5) == slice->m_sliceQp)
                curEncData.m_rateFactor = m_rateFactorConstant;
            else
            {
                /* If vbv changed the frame QP recalculate the rate-factor */
                double baseCplx = m_ncu * (m_param->bframes ? 120 : 80);
                double mbtree_offset = m_param->rc.cuTree ? (1.0 - m_param->rc.qCompress) * 13.5 : 0;
                curEncData.m_rateFactor = pow(baseCplx, 1 - m_qCompress) /
                    x265_qp2qScale(int(curEncData.m_avgQpRc + 0.5) + mbtree_offset);
            }
        }
    }

    if (m_isAbr && !m_isAbrReset)
    {
        /* amortize part of each I slice over the next several frames, up to
         * keyint-max, to avoid over-compensating for the large I slice cost */
        if (!m_param->rc.bStatWrite && !m_param->rc.bStatRead)
        {
            if (rce->sliceType == I_SLICE)
            {
                /* previous I still had a residual; roll it into the new loan */
                if (m_residualFrames)
                    bits += m_residualCost * m_residualFrames;
                m_residualFrames = X265_MIN((int)rce->amortizeFrames, m_param->keyframeMax);
                m_residualCost = (int)((bits * rce->amortizeFraction) / m_residualFrames);
                bits -= m_residualCost * m_residualFrames;
            }
            else if (m_residualFrames)
            {
                bits += m_residualCost;
                m_residualFrames--;
            }
        }
        if (rce->sliceType != B_SLICE)
        {
            /* The factor 1.5 is to tune up the actual bits, otherwise the cplxrSum is scaled too low
                * to improve short term compensation for next frame. */
            m_cplxrSum += (bits * x265_qp2qScale(rce->qpaRc) / rce->qRceq) - (rce->rowCplxrSum);
        }
        else
        {
            /* Depends on the fact that B-frame's QP is an offset from the following P-frame's.
                * Not perfectly accurate with B-refs, but good enough. */
            m_cplxrSum += (bits * x265_qp2qScale(rce->qpaRc) / (rce->qRceq * fabs(m_param->rc.pbFactor))) - (rce->rowCplxrSum);
        }
        m_wantedBitsWindow += m_frameDuration * m_bitrate;
        m_totalBits += bits - rce->rowTotalBits;
        m_encodedBits += actualBits;
        int pos = m_sliderPos - m_param->frameNumThreads;
        if (pos >= 0)
            m_encodedBitsWindow[pos % s_slidingWindowFrames] = actualBits;
    }

    if (m_2pass)
    {
        m_expectedBitsSum += qScale2bits(rce, x265_qp2qScale(rce->newQp));
        m_totalBits += bits - rce->rowTotalBits;
    }

    if (m_isVbv)
    {
        updateVbv(actualBits, rce);

        if (m_param->bEmitHRDSEI)
        {
            const VUI *vui = &curEncData.m_slice->m_sps->vuiParameters;
            const HRDInfo *hrd = &vui->hrdParameters;
            const TimingInfo *time = &vui->timingInfo;
            if (!curFrame->m_poc)
            {
                // first access unit initializes the HRD
                rce->hrdTiming->cpbInitialAT = 0;
                rce->hrdTiming->cpbRemovalTime = m_nominalRemovalTime = (double)m_bufPeriodSEI.m_initialCpbRemovalDelay / 90000;
            }
            else
            {
                rce->hrdTiming->cpbRemovalTime = m_nominalRemovalTime + (double)rce->picTimingSEI->m_auCpbRemovalDelay * time->numUnitsInTick / time->timeScale;
                double cpbEarliestAT = rce->hrdTiming->cpbRemovalTime - (double)m_bufPeriodSEI.m_initialCpbRemovalDelay / 90000;
                if (!curFrame->m_lowres.bKeyframe)
                    cpbEarliestAT -= (double)m_bufPeriodSEI.m_initialCpbRemovalDelayOffset / 90000;

                rce->hrdTiming->cpbInitialAT = hrd->cbrFlag ? m_prevCpbFinalAT : X265_MAX(m_prevCpbFinalAT, cpbEarliestAT);
            }

            uint32_t cpbsizeUnscale = hrd->cpbSizeValue << (hrd->cpbSizeScale + CPB_SHIFT);
            rce->hrdTiming->cpbFinalAT = m_prevCpbFinalAT = rce->hrdTiming->cpbInitialAT + actualBits / cpbsizeUnscale;
            rce->hrdTiming->dpbOutputTime = (double)rce->picTimingSEI->m_picDpbOutputDelay * time->numUnitsInTick / time->timeScale + rce->hrdTiming->cpbRemovalTime;
        }
    }
    rce->isActive = false;
    // Allow rateControlStart of next frame only when rateControlEnd of previous frame is over
    m_startEndOrder.incr();
    return 0;
}
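
The two model updates that matter most for ABR in the function above are the ones to `m_cplxrSum` and `m_wantedBitsWindow`. The following is a simplified sketch of just those two lines; `AbrState` and `updateAfterFrame` are hypothetical names, and the row-level correction terms (`rowCplxrSum`, `rowTotalBits`) and the I-frame amortization are deliberately left out.

```cpp
#include <cmath>

// Hypothetical container for the two accumulators discussed in the text.
struct AbrState
{
    double cplxrSum = 0.0;         // accumulated complexity
    double wantedBitsWindow = 0.0; // accumulated target bits
};

// Same conversion as x265_qp2qScale.
double qp2qScale(double qp)
{
    return 0.85 * std::pow(2.0, (qp - 12.0) / 6.0);
}

// Simplified per-frame update (assumption: no row-level terms, no amortization).
void updateAfterFrame(AbrState& s, double bits, double avgQp, double rceq,
                      bool isBFrame, double pbFactor,
                      double frameDuration, double bitrate)
{
    // B-frame QPs are an offset from the following P-frame, hence the pbFactor scaling.
    double denom = isBFrame ? rceq * std::fabs(pbFactor) : rceq;
    s.cplxrSum += bits * qp2qScale(avgQp) / denom;
    s.wantedBitsWindow += frameDuration * bitrate;
}
```

A frame that costs more bits than expected at a given quantizer raises `cplxrSum`, which lowers the next rate factor and therefore raises the next frame's qscale.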

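4. Second correction of qscale

The second correction uses the overflow factor, which expresses the deviation between the total target bits and the total bits actually produced:

$$\mathrm{overflow} = \mathrm{clip3}\!\left(0.5,\ 2,\ 1 + \frac{\mathrm{total\_bits}(i-1) - \mathrm{wanted\_bits}(i-1)}{\mathrm{abr\_buffer}(i)}\right), \qquad \mathrm{qscale}(i) \leftarrow \mathrm{qscale}(i) \cdot \mathrm{overflow}$$

overflow is clamped to the range 0.5 to 2.


Here total_bits(i-1) is the sum of the bits actually produced up to and including the previous frame, and wanted_bits(i-1) is the accumulated target bits up to the previous frame. abr_buffer(i) is the average-bitrate buffer: its initial value is twice the product of the target bitrate and the rate tolerance (default 1), and it grows with the elapsed encoding time (frames done times frame duration), being scaled by max(1, sqrt(timeDone)).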

Corresponding code (tuneAbrQScaleFromFeedback()):

// Second correction: the overflow factor expresses the deviation between the total target bits and the total bits actually produced
double RateControl::tuneAbrQScaleFromFeedback(double qScale)
{
    double abrBuffer = 2 * m_rateTolerance * m_bitrate;
    if (m_currentSatd)
    {
        /* use framesDone instead of POC as poc count is not serial with bframes enabled */
        double overflow = 1.0;
        double timeDone = (double)(m_framesDone - m_param->frameNumThreads + 1) * m_frameDuration;
        double wantedBits = timeDone * m_bitrate;
        int64_t encodedBits = m_totalBits;
        if (m_param->totalFrames && m_param->totalFrames <= 2 * m_fps)
        {
            abrBuffer = m_param->totalFrames * (m_bitrate / m_fps);
            encodedBits = m_encodedBits;
        }

        if (wantedBits > 0 && encodedBits > 0 && (!m_partialResidualFrames || 
            m_param->rc.bStrictCbr))
        {
            abrBuffer *= X265_MAX(1, sqrt(timeDone));
            overflow = x265_clip3(.5, 2.0, 1.0 + (encodedBits - wantedBits) / abrBuffer);
            qScale *= overflow;
        }
    }
    return qScale;
}
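
To get a feel for the numbers, the overflow factor can be computed on its own. This is a minimal sketch of the clipped expression in tuneAbrQScaleFromFeedback(); `abrOverflow` is a hypothetical name, and the guards of the real function (non-zero SATD, positive bit counts, the short-clip branch, residual frames) are omitted.

```cpp
#include <algorithm>
#include <cmath>

// Overflow factor for ABR feedback: 1.0 means on target, above 1.0 means the
// encoder has overspent and qscale must rise, below 1.0 means it has underspent.
double abrOverflow(double encodedBits, double timeDone, double bitrate, double rateTolerance)
{
    double wantedBits = timeDone * bitrate;
    double abrBuffer = 2.0 * rateTolerance * bitrate * std::max(1.0, std::sqrt(timeDone));
    double raw = 1.0 + (encodedBits - wantedBits) / abrBuffer;
    return std::min(2.0, std::max(0.5, raw)); // same range as x265_clip3(.5, 2.0, ...)
}
```

For example, after 4 seconds at a 1 Mbit/s target with tolerance 1, the buffer is 4 Mbit; having spent 5 Mbit instead of 4 Mbit gives 1 + 1/4 = 1.25, so the next qscale is raised by 25%. Because the buffer grows with sqrt(timeDone), the same absolute overshoot produces a gentler correction later in the stream.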

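5. Compute QP from the resulting qscale

R-D model: see the Zhihu article "x264和x265编码器码率控制之基本模型" (the basic rate control model of the x264 and x265 encoders).

$$\mathrm{qscale} = \frac{\alpha \cdot X}{R}$$

R is the target rate (in bits), X is the complexity of the current frame, and α is a rate-control parameter. The formula means that the more complex the current frame, the larger its qscale (qscale plays the role of the Lagrange multiplier λ), and, because QP increases monotonically with qscale, the larger the frame-level QP. In terms of the x265 model: the larger the SATD cost, the larger the frame-level QP.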

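Expanding the formula shows that the SATD cost of every frame has to be computed:

$$\mathrm{qscale}_n = \mathrm{SATD}_n^{\,1-\mathrm{comp}} \cdot \frac{\mathrm{cplxr\_sum}_0 + \sum_{i=1}^{n} \dfrac{\mathrm{bits}_{i-1} \cdot \mathrm{qscale}_{i-1}}{\mathrm{rceq}_{i-1}}}{\mathrm{wanted\_bits\_window}_n}$$

Here the subscript n is the current frame number, counted from 0, and cplxr_sum_0 is the initial value of the accumulated complexity. The complexity of the current frame is characterized by its SATD cost, whose value is in fact not just the SATD of the current frame but also includes the weighted SATD of previously encoded frames. comp is a nonlinear mapping of SATD that accounts for the characteristics of human vision; it defaults to 0.6 and is 0 for CBR.

In the denominator of the formula above, the target bitrate and the frame rate from the encoder configuration give the expected number of bits per frame, which is accumulated as more frames are encoded. The right-hand term of the numerator carries the index i-1, meaning that it needs coding information from the previous frame: its actual bit count, its actual qscale, and its actual rceq.

For the first frame to be encoded (which is also an IDR frame) there is no previous frame, so the boundary case n = 0 of the formula has to be handled separately. This is the "first-frame QP problem" that every rate control algorithm has to address.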

/* The qscale - qp conversion is specified in the standards.
 * Approx qscale increases by 12%  with every qp increment */
double x265_qScale2qp(double qScale)
{
    return 12.0 + 6.0 * (double)X265_LOG2(qScale / 0.85);
}

double x265_qp2qScale(double qp)
{
    return 0.85 * pow(2.0, (qp - 12.0) / 6.0);
}
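
A quick numeric check makes the "12% per QP step" comment concrete. The sketch below restates the two conversions with standard-library calls (`std::log2` in place of the `X265_LOG2` macro) and adds one hypothetical helper, `qScaleStepRatio`, which is not part of x265.

```cpp
#include <cmath>

// qscale for a given QP; same formula as x265_qp2qScale.
double qp2qScale(double qp)
{
    return 0.85 * std::pow(2.0, (qp - 12.0) / 6.0);
}

// QP for a given qscale; same formula as x265_qScale2qp.
double qScale2qp(double qScale)
{
    return 12.0 + 6.0 * std::log2(qScale / 0.85);
}

// Hypothetical helper: multiplicative growth of qscale per +1 QP.
double qScaleStepRatio()
{
    return qp2qScale(1.0) / qp2qScale(0.0); // equals 2^(1/6), about 1.12
}
```

QP 12 maps to qscale 0.85, every +6 in QP doubles qscale (QP 18 maps to 1.7), and the two functions are exact inverses of each other, which is what allows rate control to work in the qscale domain and convert to QP only at the end.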

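II. Code module details

1. Encoding thread and preprocessing thread
x265 consists mainly of two threads: the coding thread and the lookahead (preprocessing) thread. Within the coding thread, rate control operates at two levels: MB level (CTU level) and frame level.


The RateControl constructor is called only once, at startup. It initializes global resources and parameters (including the RateFactor when CRF mode is selected, the rate control parameters, and VBV).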
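2. RateControlStart module

(1) RateControlStart is called once per frame and consists of three functional blocks:

Check the RC reset condition
FrameStartQp: compute the starting QP (unless qpfile mode is enabled)
Overflow correction
(2) RCReset: if a scene change is detected between the current frame and the previous one, RC is reset. Reset condition: current frame average satdCost > 4 * moving average of satdCosts

(3) StartQP: this block computes the starting QP of the current frame, that is, the QP that would have produced the desired bit size had it been applied since the start of the stream or since the last RC reset. startQP = 12.0 + 6.0*log2(qscale/0.85)

(4) Overflow correction: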

timeDone = framesDone * m_frameDuration
wantedBits = timeDone * bitrate
if (wantedBits > 0 && m_totalBits > 0 && !m_partialResidualFrames)
{
    abrBuffer *= MAX(1, sqrt(timeDone));
    overflow = Clip3(.5, 2.0, 1.0 + (m_totalBits - wantedBits) / abrBuffer);
    startFrameQp *= overflow;
}

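(5) RC model parameter update: the complexity parameters of the RC model are updated once per frame, but not at the start of the frame. With frame-level parallelism, the next frame is not allowed to enter rateControlStart until the current frame has updated its mid-frame model parameters.
m_refLagRows = 1 + (search_range + 63) / 64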
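
The reset condition in item (2) is a simple threshold test against a running average. The sketch below illustrates the idea only: `SatdAverage` is a hypothetical type, and a plain cumulative mean is assumed for the "moving average", whereas the window x265 actually uses is not described here.

```cpp
// Hypothetical tracker for the scene-change reset test (not an x265 type).
struct SatdAverage
{
    double avg = 0.0; // running mean of SATD costs seen so far (assumed form of the moving average)
    int count = 0;

    // True when the new frame's cost exceeds four times the running average.
    bool shouldReset(double satdCost) const
    {
        return count > 0 && satdCost > 4.0 * avg;
    }

    // Fold a frame's cost into the running mean.
    void add(double satdCost)
    {
        avg = (avg * count + satdCost) / (count + 1);
        ++count;
    }
};
```

A caller would test `shouldReset` before `add`, so that a scene cut is judged against the statistics of the preceding frames rather than against an average it has already inflated.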

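3. VBV module
VBV keeps the receiver-side buffer from overflowing or underflowing; in essence it constrains the short-term bitrate of the video.

[Figure: VBV flowchart]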

 


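4. RateControlEnd module
rateControlEnd is called at the end of every frame. It performs the following operations:

Update statistics
I-frame amortization
The following statistics are updated or computed:

(1) The average QP (after HVS QP adjustment), stored in m_avgQpAq

(2) m_wantedBitsWindow is increased by bitrate * frameDuration

(3) The cumulative complexity m_cplxrSum is increased by bits * qscale(avgQp) / rceq

(4) m_totalBits is updated

To keep I-frame spikes from disturbing RC, RC is not given the actual I-frame size but a smaller number of bits; the remainder is amortized over the frames that follow.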
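
The amortization can be followed step by step in a small stand-alone version. This sketch mirrors the `m_residualFrames` / `m_residualCost` logic of rateControlEnd shown earlier; `Amortizer` and `apply` are hypothetical names, and it assumes `amortizeFrames` and `keyframeMax` are both positive, since the division is unguarded just as in the original.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical stand-alone version of the I-frame amortization bookkeeping.
struct Amortizer
{
    int residualFrames = 0; // frames that still carry a share of the last I-frame
    int residualCost = 0;   // bits added to each of those frames

    // Returns the bit count that is reported to the rate-control model.
    int64_t apply(int64_t bits, bool isISlice, int amortizeFrames,
                  double amortizeFraction, int keyframeMax)
    {
        if (isISlice)
        {
            // The previous I-frame still had a residual: roll it into the new loan.
            if (residualFrames)
                bits += (int64_t)residualCost * residualFrames;
            residualFrames = std::min(amortizeFrames, keyframeMax);
            residualCost = (int)((bits * amortizeFraction) / residualFrames);
            bits -= (int64_t)residualCost * residualFrames;
        }
        else if (residualFrames)
        {
            bits += residualCost; // repay one instalment of the loan
            residualFrames--;
        }
        return bits;
    }
};
```

With a 100000-bit I-frame, a fraction of 0.5, and 10 amortization frames, the model is told the I-frame cost 50000 bits, and each of the next 10 frames is charged an extra 5000 bits, so the total accounted for is unchanged.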
 
