[x265 Encoder] Chapter 1 — Analysis of the x265 lookahead Module


 Series Index

   Introduction to the HEVC Video Coding Standard

[x264 Encoder] Chapter 1 — x264 Encoding Flow and an Encoder Demo Based on x264

[x264 Encoder] Chapter 2 — Analysis of the x264 lookahead Flow

[x264 Encoder] Chapter 3 — Rate Control in x264

[x264 Encoder] Chapter 4 — The x264 Intra Prediction Flow

[x264 Encoder] Chapter 5 — The x264 Inter Prediction Flow

[x264 Encoder] Chapter 6 — Transform and Quantization in x264

[x265 Encoder] Chapter 1 — Analysis of the lookahead Module

[x265 Encoder] Chapter 2 — Encoding Flow and an Encoder Demo Based on x265

[x265 Encoder] Chapter 3 — The Intra Prediction Flow

[x265 Encoder] Chapter 4 — The Inter Prediction Flow

[x265 Encoder] Chapter 5 — The x265 Inter Motion Estimation Flow

[x265 Encoder] Chapter 6 — Rate Control in x265

[x265 Encoder] Chapter 7 — The Filtering Module

[x265 Encoder] Chapter 8 — The Transform and Quantization Module


Contents

 Series Index

I. Module Functions

1. Scene-cut detection

2. Frame structure decision

3. CU tree

4. VBV

5. Overall lookahead flow

II. Lookahead Module Analysis

1. Call flow

2. Code analysis

1. Adding pictures to the lookahead: Lookahead::addPicture()

2. Checking the lookahead queue: Lookahead::checkLookaheadQueue()

3. Fetching decided pictures: Lookahead::getDecidedPicture()

4. Finding and running jobs: Lookahead::findJob()

5. Slice type decision: Lookahead::slicetypeDecide()

6. Task dispatch: PreLookaheadGroup::processTasks

7. Lowres intra estimation: LookaheadTLD::lowresIntraEstimate()

8. Slice type analysis: Lookahead::slicetypeAnalyse

9. Lowres inter frame cost estimation: CostEstimateGroup::estimateFrameCost

10. Lowres per-CU inter cost estimation: CostEstimateGroup::estimateCUCost

11. VBV rate and buffer: Lookahead::vbvLookahead

12. Scene-cut detection: Lookahead::scenecut

13. Frame structure path cost: Lookahead::slicetypePathCost

14. CU tree construction and processing: Lookahead::cuTree

Likes and bookmarks are what keep me writing! Give someone a rose, and its fragrance lingers on your hand.


Preface

The complete processing framework of x265 is shown below:

I. Module Functions

In x265, lookahead ("forward prediction") is a technique for improving coding efficiency and quality. It analyses upcoming video frames so that better decisions can be made while encoding the current frame. Intra and inter prediction are performed here as well; compared with the main encode there are two differences:

1. It works at 1/4 resolution (half the original width and height); as in x264, the lookahead CU size is 8x8, and both intra and inter prediction are performed (see the sketch after item 2);

2. During inter prediction the CUs are traversed from bottom to top and from right to left; see estimateFrameCost() for details.
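
For a sense of the sizes involved, here is a minimal standalone sketch (the constants and variable names below are illustrative, not taken from x265): each frame is halved in both dimensions and then covered by an 8x8 CU grid.

#include <cstdio>

// Illustrative only: x265's lookahead works on a half-width, half-height copy of
// each frame and analyses it on an 8x8 grid (X265_LOWRES_CU_SIZE == 8).
int main()
{
    const int width = 1920, height = 1080;    // example source resolution
    const int lowresCuSize = 8;                // lookahead CU size

    int lowresW = width / 2;                   // 960
    int lowresH = height / 2;                  // 540

    // number of 8x8 lookahead CUs, rounding up at the right/bottom borders
    int widthInCU  = (lowresW + lowresCuSize - 1) / lowresCuSize;   // 120
    int heightInCU = (lowresH + lowresCuSize - 1) / lowresCuSize;   // 68

    printf("lowres %dx%d -> %dx%d CUs (%d total)\n",
           lowresW, lowresH, widthInCU, heightInCU, widthInCU * heightInCU);
    return 0;
}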

The module has four main responsibilities: scene-cut detection, frame structure decision, CU tree, and VBV.

1. Scene-cut detection

The overall flow is similar to x264's scenecut, with differences in the details: the first round of search and filtering matches the x264 scheme, while the later processing differs. The overall flow of scene-cut detection in x265 is shown below; a detailed code walkthrough is given under Lookahead::scenecut:

The intra and inter costs computed for scene-cut detection are shown below (intra on the left, inter on the right):
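
For intuition, here is a minimal sketch of the core decision rule (a simplification of Lookahead::scenecutInternal; the bias handling is reduced to a single parameter and the function name is only illustrative): a frame is treated as a scene cut when coding it as a P frame is nearly as expensive as coding it as an I frame.

#include <cstdint>

// Simplified scene-cut test: icost is the intra (I) cost estimate of the frame,
// pcost the inter (P) cost estimate against the previous reference, and
// bias (0..1) grows with the distance to the last keyframe.
static bool isScenecutSimplified(int64_t icost, int64_t pcost, double bias)
{
    if (icost <= 0)
        return false;                       // nothing meaningful to compare
    return pcost >= (1.0 - bias) * icost;   // P nearly as costly as I -> cut
}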

2. Frame structure decision

There are currently three frame-structure schemes, X265_B_ADAPT_NONE, X265_B_ADAPT_FAST and X265_B_ADAPT_TRELLIS; their overall flows are as follows:

X265_B_ADAPT_NONE: the same as the x264 scheme; the GOP is simply expanded with a fixed pattern such as IBBBPBBBP. See Lookahead::slicetypeAnalyse; a minimal sketch of this fixed expansion follows.
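
The sketch below mirrors the assignment frames[j]->sliceType = (j % (numBFrames + 1)) ? X265_TYPE_B : X265_TYPE_P in slicetypeAnalyse; the standalone program is only illustrative.

#include <cstdio>

// With numBFrames = 3 every 4th frame becomes a P frame and the frames in
// between become B frames, giving the fixed ...BBBP BBBP... pattern.
int main()
{
    const int numFrames = 12, numBFrames = 3;
    for (int j = 1; j <= numFrames; j++)
        printf("frame %2d -> %c\n", j, (j % (numBFrames + 1)) ? 'B' : 'P');
    return 0;
}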

X265_B_ADAPT_FAST: largely the same as the x264 scheme, with some differences. The first step computes the cost of the BP and PP frame-structure candidates and keeps the cheaper frame type; the next step then checks whether every frame in the range (i+2, bframes) should also be a B frame, and if so appends a P frame at the end, which becomes the start of the next round, repeating the expansion. See Lookahead::slicetypeAnalyse.

X265_B_ADAPT_TRELLIS: the same as the x264 scheme. It keeps the best frame-structure path found at each step, inserting 0 to bframes B frames in turn to find the best scheme of the current length, then uses that as the basis for the best scheme of length + 1, iterating until the whole GOP has been processed. See Lookahead::slicetypePath.

3. CU tree

The CU tree is essentially the same as x264's MB tree. Put simply: frames reference one another, and if a referenced frame has higher quality, then improving that single frame improves a whole set of frames. The CU tree therefore measures a frame's importance by how heavily it is referenced, i.e. by how much information it passes on to other frames.

Because the propagated information accumulates, the CU tree computes it by traversing the frames in reverse order. For example, if b references p0, the information propagated to p0 is:

propagation = (propagate_in + intra_cost * inv_qscales * fps_factor) * (1 - inter_cost / intra_cost) * dist_scale_factor

The propagated amount must be positive, so only CUs with inter_cost < intra_cost, i.e. CUs that chose inter prediction, contribute; the smaller inter_cost is relative to intra_cost, the more important the propagated information becomes. CUs coded in intra mode propagate 0.

propagate_in: the propagation already accumulated on frame b from the frames that reference it; it is added in as a correction when computing p0's propagation;

dist_scale_factor: a distance scale; the formula needs a correction for the reference distance;

inv_qscales: the inverse quantization scale; a more coarsely quantized (blurrier) block carries less information, so another correction factor is needed. Quantization divides the coefficients by QStep;

fps_factor: for variable frame rate content each frame covers a different span of time, and a frame that stays on screen longer should matter more, so this parameter corrects the formula for that;
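
As a quick worked example with made-up numbers: if propagate_in = 80, intra_cost = 200, inter_cost = 50 and all three correction factors equal 1, the amount propagated to the reference is (80 + 200) * (1 - 50/200) * 1 = 280 * 0.75 = 210.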

The corresponding code is shown below; see also Lookahead::estimateCUPropagate():

/* Estimate the total amount of influence on future quality that could be had if we
 * were to improve the reference samples used to inter predict any given CU. */
static void estimateCUPropagateCost(int* dst, const uint16_t* propagateIn, const int32_t* intraCosts, const uint16_t* interCosts,
                                    const int32_t* invQscales, const double* fpsFactor, int len)
{
    double fps = *fpsFactor / 256;  // range[0.01, 1.00]
    for (int i = 0; i < len; i++)
    {
        int intraCost = intraCosts[i];
        int interCost = X265_MIN(intraCosts[i], interCosts[i] & LOWRES_COST_MASK);
        double propagateIntra = intraCost * invQscales[i]; // Q16 x Q8.8 = Q24.8
        double propagateAmount = (double)propagateIn[i] + propagateIntra * fps; // Q16.0 + Q24.8 x Q0.x = Q25.0
        double propagateNum = (double)(intraCost - interCost); // Q32 - Q32 = Q33.0
        double propagateDenom = (double)intraCost;             // Q32
        dst[i] = (int)(propagateAmount * propagateNum / propagateDenom + 0.5);
    }
}

The QP is then adjusted according to the propagated information, using the formula:

qpoffset = 5 \cdot (1 - qcompress) \cdot \log_{2}\left(1 + \frac{propagate \cdot fpsFactor}{intra \cdot invQscaleFactor}\right)

Here propagate is the information propagated to later frames and intra is the information the block itself carries; qcompress is the user-facing parameter that controls how strongly QP is adjusted (qcompress = 0 lets QP move freely to hold the bitrate, qcompress = 1 means a constant QP with no adjustment); fpsFactor targets variable-frame-rate content, where a frame that stays on screen longer matters more.
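
As a quick worked example with made-up numbers: with qcompress = 0.6 the strength is 5 * (1 - 0.6) = 2; if the propagated information equals the block's own information, the log term is log2(1 + 1) = 1, so that CU's QP is lowered by 2 * 1 = 2 relative to its AQ offset.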

Corresponding code:

void Lookahead::cuTreeFinish(Lowres *frame, double averageDuration, int ref0Distance)
{   // irrelevant code omitted
    for (int cuIndex = 0; cuIndex < m_cuCount; cuIndex++)
            {   // intra cost of the CU (information the block itself carries)
                int intracost = (frame->intraCost[cuIndex] * frame->invQscaleFactor[cuIndex] + 128) >> 8;
                if (intracost)
                {   // propagateCost (information propagated to later frames)
                    int propagateCost = (frame->propagateCost[cuIndex] * fpsFactor + 128) >> 8;
                    double log2_ratio = X265_LOG2(intracost + propagateCost) - X265_LOG2(intracost) + weightdelta;
                    frame->qpCuTreeOffset[cuIndex] = frame->qpAqOffset[cuIndex] - m_cuTreeStrength * log2_ratio;
                }
            }
}

4. VBV

Essentially the same as x264: Lookahead::vbvLookahead computes plannedSatd on the lowres frames, providing data for VBV rate control during the actual encode.

5. Overall lookahead flow

Essentially the same as in x264.

II. Lookahead Module Analysis

1. Call flow

The lookahead flow is shown in figure 1 below; its relationship to the overall x265 pipeline is shown in figure 2 (the yellow part):

The complete x265 encoding flow is as follows:

2. Code analysis

1. Adding pictures to the lookahead: Lookahead::addPicture()

addPicture(): called by the API thread to add a picture to the lookahead module.

void Lookahead::addPicture(Frame& curFrame, int sliceType)
{   // If analysisLoad is enabled and lookahead is disabled (bDisableLookahead), push the picture straight onto the output queue and increment m_inputCount.
    if (m_param->analysisLoad && m_param->bDisableLookahead)
    {
        if (!m_filled)
            m_filled = true;
        m_outputLock.acquire();
        m_outputQueue.pushBack(curFrame);
        m_outputLock.release();
        m_inputCount++;
    }
    // Otherwise check the state of the input queue via checkLookaheadQueue() and add the picture to the lookahead.
    else
    {
        checkLookaheadQueue(m_inputCount);
        curFrame.m_lowres.sliceType = sliceType;
        addPicture(curFrame);
    }
}

2. Checking the lookahead queue: Lookahead::checkLookaheadQueue()

Checks the state of the lookahead queue. The code is explained below:

void Lookahead::checkLookaheadQueue(int &frameCnt)
{
    /* determine if the lookahead is (over) filled enough for frames to begin to
     * be consumed by frame encoders */
    // If m_filled is false (the lookahead queue is not yet full):
    if (!m_filled)
    {   // If both bframes and lookaheadDepth are zero we are in zero-latency mode, so mark the queue as filled.
        if (!m_param->bframes & !m_param->lookaheadDepth)
            m_filled = true; /* zero-latency */
        // Otherwise, once the number of input frames (frameCnt) reaches lookaheadDepth + 2 + bframes, mark the queue as filled.
        else if (frameCnt >= m_param->lookaheadDepth + 2 + m_param->bframes)
            m_filled = true; /* full capacity plus mini-gop lag */
    }

    m_inputLock.acquire();
    // If a thread pool exists and the input queue has reached m_fullQueueSize, try to wake a worker thread.
    if (m_pool && m_inputQueue.size() >= m_fullQueueSize)
        tryWakeOne();
    m_inputLock.release();
}

3. Fetching decided pictures: Lookahead::getDecidedPicture()

This part of the lookahead module fetches a decided picture from the output queue. The method removes pictures from the output queue and only blocks when none is available. It only starts removing pictures once m_filled is true, and m_filled is only set after more pictures than the lookahead depth have been input, so slicetypeDecide() should already be running before the first output picture is taken. The very first slicetypeDecide() obviously still has to be waited for, but subsequent calls stay ahead of the encoder (each picture removed from the output queue is matched by one added to the input queue) and decide slice types before the encoder needs them. The code is explained below:

Frame* Lookahead::getDecidedPicture()
{   // Check whether m_filled is true, i.e. whether enough pictures have been queued for output.
    if (m_filled)//pictures can now be taken from the output queue
    {   // Take the output lock (m_outputLock) for thread-safe access to the output queue.
        m_outputLock.acquire();
        // Pop a picture from the output queue into out.
        Frame *out = m_outputQueue.popFront();
        // Release the output lock.
        m_outputLock.release();
        // If a picture was obtained (out is non-NULL), decrement m_inputCount and return it.
        if (out)
        {
            m_inputCount--;
            return out;
        }
        // If no picture was obtained (out is NULL), decide from analysisLoad and bDisableLookahead whether slicetypeDecide() needs to run.
        if (m_param->analysisLoad && m_param->bDisableLookahead)
            return NULL;

        findJob(-1); /* run slicetypeDecide() if necessary */

        m_inputLock.acquire();
        // Decide from m_sliceTypeBusy whether we must wait for the output signal.
        bool wait = m_outputSignalRequired = m_sliceTypeBusy;
        m_inputLock.release();
        // If needed, wait for the output signal.
        if (wait)
            m_outputSignal.wait();
        // Pop from the output queue again into out.
        out = m_outputQueue.popFront();
        // If a picture was obtained, decrement m_inputCount and return it.
        if (out)
            m_inputCount--;
        return out;
    }
    else//not enough pictures have been queued yet; return NULL
        return NULL;
}

4. Finding and running jobs: Lookahead::findJob()

Finds and runs work. The method polls the occupancy of the input queue; when the queue is full it runs slicetypeDecide() and emits a group of frames (a mini-GOP) to the output queue. If flush() has been called (no more pictures will arrive), the input queue is considered full as soon as a single picture remains in it.

void Lookahead::findJob(int /*workerThreadID*/)
{
    bool doDecide;
    // Take the input lock (m_inputLock) for thread-safe access to the shared state.
    m_inputLock.acquire();
    // If the input queue has reached m_fullQueueSize, slice-type work is not already running (!m_sliceTypeBusy) and the lookahead is active (m_isActive), set both doDecide and m_sliceTypeBusy to true.
    if (m_inputQueue.size() >= m_fullQueueSize && !m_sliceTypeBusy && m_isActive)
        doDecide = m_sliceTypeBusy = true;
    else//otherwise clear doDecide and m_helpWanted
        doDecide = m_helpWanted = false;
    m_inputLock.release();//release the input lock

    if (!doDecide)
        return;
    // Record timing and counts for the slice-type decision.
    ProfileLookaheadTime(m_slicetypeDecideElapsedTime, m_countSlicetypeDecide);
    ProfileScopeEvent(slicetypeDecideEV);
    // Run the actual slice-type decision (slicetypeDecide()).
    slicetypeDecide();
    // Take the input lock.
    m_inputLock.acquire();
    // If an output signal is required (m_outputSignalRequired), trigger it and clear the flag.
    if (m_outputSignalRequired)
    {
        m_outputSignal.trigger();
        m_outputSignalRequired = false;
    }
    m_sliceTypeBusy = false;
    m_inputLock.release();//release the input lock
}

5. Slice type decision: Lookahead::slicetypeDecide()

The code is explained below:

void Lookahead::slicetypeDecide()
{   // Create a PreLookaheadGroup instance pre, passing in a reference to this Lookahead.
    PreLookaheadGroup pre(*this);
    // Create the Lowres pointer array frames and the Frame pointer array list, zero-initialized.
    Lowres* frames[X265_LOOKAHEAD_MAX + X265_BFRAME_MAX + 4];
    Frame*  list[X265_BFRAME_MAX + 4];
    memset(frames, 0, sizeof(frames));
    memset(list, 0, sizeof(list));
    // Compute the maximum search range maxSearch as min(m_param->lookaheadDepth, X265_LOOKAHEAD_MAX), clamped to at least 1.
    int maxSearch = X265_MIN(m_param->lookaheadDepth, X265_LOOKAHEAD_MAX);   
    maxSearch = X265_MAX(1, maxSearch);

    {   // Take exclusive access to the input lock m_inputLock.
        ScopedLock lock(m_inputLock);
        // Get the first frame of the input queue into curFrame and declare the loop index j.
        Frame *curFrame = m_inputQueue.first();
        int j;
		if (m_param->bResetZoneConfig)
		{   // Walk every zone configuration in m_param->rc.zones.
			for (int i = 0; i < m_param->rc.zonefileCount; i++)
			{   // If the current frame's m_poc equals the zone's startFrame, switch m_param to that zone's zoneParam.
				if (m_param->rc.zones[i].startFrame == curFrame->m_poc)
					m_param = m_param->rc.zones[i].zoneParam;
			}
		}
        // Loop up to m_param->bframes + 2 times, appending curFrame to list and advancing to the next frame.
        for (j = 0; j < m_param->bframes + 2; j++)
        {
            if (!curFrame) break;
            list[j] = curFrame;
            curFrame = curFrame->m_next;
        }
        // Reset curFrame to the first frame of the input queue and put m_lastNonB into frames[0].
        curFrame = m_inputQueue.first();
        frames[0] = m_lastNonB;
        // Loop up to maxSearch times, storing each frame's lowres (curFrame->m_lowres) at the matching slot of frames.
        for (j = 0; j < maxSearch; j++)
        {
            if (!curFrame) break;
            frames[j + 1] = &curFrame->m_lowres;
            // If this frame's lowres has not been initialized yet, queue the frame in pre.m_preframes and increment pre.m_jobTotal.
            if (!curFrame->m_lowresInit)
                pre.m_preframes[pre.m_jobTotal++] = curFrame;

            curFrame = curFrame->m_next;
        }
        // Clamp maxSearch to the number of frames actually found.
        maxSearch = j;
        // End of the input-lock scope.
    }
    // If any frames need pre-analysis (pre.m_jobTotal > 0):
    /* perform pre-analysis on frames which need it, using a bonded task group */
    if (pre.m_jobTotal)
    {   // If a thread pool exists, try to bond worker threads to the pre-analysis tasks.
        if (m_pool)
            pre.tryBondPeers(*m_pool, pre.m_jobTotal);
        // Run the pre-analysis tasks on the calling thread as well.
        pre.processTasks(-1);
        // Wait for all tasks to finish.
        pre.waitForExit();
    }
    // If fade detection is enabled, scan the frame list for fade-in regions.
    if(m_param->bEnableFades)
    {   // Initialize endIndex, length and the m_frameVariance array.
        int j, endIndex = 0, length = X265_BFRAME_MAX + 4;
        for (j = 0; j < length; j++)
            m_frameVariance[j] = -1;
        // Walk list and store each frame's lowres frameVariance at the matching slot of m_frameVariance.
        for (j = 0; list[j] != NULL; j++)
            m_frameVariance[list[j]->m_poc % length] = list[j]->m_lowres.frameVariance;
        // Use the m_frameVariance values to decide whether a fade-in region exists, walking index k over the array:
        for (int k = list[0]->m_poc % length; k <= list[j - 1]->m_poc % length; k++)
        {   // If m_frameVariance[k] is -1, stop.
            if (m_frameVariance[k]  == -1)
                break;
            // If m_frameVariance[k] is >= the previous entry (or, for k == 0, >= the last entry of the array), we are inside a fade-in region.
            if((k > 0 && m_frameVariance[k] >= m_frameVariance[k - 1]) || 
                (k == 0 && m_frameVariance[k] >= m_frameVariance[length - 1]))
            {
                m_isFadeIn = true;
                // If m_fadeCount and m_fadeStart still hold their initial values (0 and -1), derive m_fadeStart from the POCs (Presentation Order Counts) of the frames in list.
                if (m_fadeCount == 0 && m_fadeStart == -1)
                {
                    for(int temp = list[0]->m_poc; temp <= list[j - 1]->m_poc; temp++)
                        if (k == temp % length) {
                            m_fadeStart = temp ? temp - 1 : 0;
                            break;
                        }
                }
                // Update m_fadeCount to list[endIndex]->m_poc - m_fadeStart, where endIndex indexes the current frame list.
                m_fadeCount = list[endIndex]->m_poc > m_fadeStart ? list[endIndex]->m_poc - m_fadeStart : 0;
                endIndex++;
            }
            else
            {   // Otherwise, if we were inside a fade-in and m_fadeCount has reached m_param->fpsNum / m_param->fpsDenom (about one second of frames), the fade-in has ended: mark that frame's m_lowres.bIsFadeEnd as true.
                if (m_isFadeIn && m_fadeCount >= m_param->fpsNum / m_param->fpsDenom)
                {
                    for (int temp = 0; list[temp] != NULL; temp++)
                    {
                        if (list[temp]->m_poc == m_fadeStart + (int)m_fadeCount)
                        {
                            list[temp]->m_lowres.bIsFadeEnd = true;
                            break;
                        }
                    }
                }
                m_isFadeIn = false;
                m_fadeCount = 0;
                m_fadeStart = -1;
            }
            // If k has reached the last index (length - 1), reset it to -1 so the next iteration wraps around to 0.
            if (k == length - 1)
                k = -1;
        }
    }
    // Frame analysis and rate-control related work, performed when the conditions below hold.
    /* First, the following conditions are checked: */
    if (m_lastNonB &&
        ((m_param->bFrameAdaptive && m_param->bframes) ||
         m_param->rc.cuTree || m_param->scenecutThreshold || m_param->bHistBasedSceneCut ||
         (m_param->lookaheadDepth && m_param->rc.vbvBufferSize)))
    {   // If m_param->rc.bStatRead is false, call slicetypeAnalyse to analyse the frames.
        if (!m_param->rc.bStatRead)
            slicetypeAnalyse(frames, false);
        // Decide whether VBV (Video Buffering Verifier) lookahead is needed.
        bool bIsVbv = m_param->rc.vbvBufferSize > 0 && m_param->rc.vbvMaxBitrate > 0;
        if ((m_param->analysisLoad && m_param->scaleFactor && bIsVbv) || m_param->bliveVBV2pass)
        {
            int numFrames;
            // Walk frames, incrementing numFrames until maxSearch is reached or a NULL frame is found.
            for (numFrames = 0; numFrames < maxSearch; numFrames++)
            {
                Lowres *fenc = frames[numFrames + 1];
                if (!fenc)
                    break;
            }
            // Run vbvLookahead over frames with numFrames entries (not a keyframe analysis pass).
            vbvLookahead(frames, numFrames, false);
        }
    }

    int bframes, brefs;
    if (!m_param->analysisLoad || m_param->bAnalysisType == HEVC_INFO)
    {
        bool isClosedGopRadl = m_param->radl && (m_param->keyframeMax != m_param->keyframeMin);
        for (bframes = 0, brefs = 0;; bframes++)
        {
            Lowres& frm = list[bframes]->m_lowres;

            if (frm.sliceType == X265_TYPE_BREF && !m_param->bBPyramid && brefs == m_param->bBPyramid)
            {
                frm.sliceType = X265_TYPE_B;
                x265_log(m_param, X265_LOG_WARNING, "B-ref at frame %d incompatible with B-pyramid\n",
                    frm.frameNum);
            }

            /* pyramid with multiple B-refs needs a big enough dpb that the preceding P-frame stays available.
             * smaller dpb could be supported by smart enough use of mmco, but it's easier just to forbid it. */
            else if (frm.sliceType == X265_TYPE_BREF && m_param->bBPyramid && brefs &&
                m_param->maxNumReferences <= (brefs + 3))
            {
                frm.sliceType = X265_TYPE_B;
                x265_log(m_param, X265_LOG_WARNING, "B-ref at frame %d incompatible with B-pyramid and %d reference frames\n",
                    frm.sliceType, m_param->maxNumReferences);
            }// Check whether the distance from frm.frameNum to the last keyframe satisfies the m_param->keyframeMax and m_extendGopBoundary conditions; depending on the case, change the slice type to X265_TYPE_I or X265_TYPE_IDR.
            if (((!m_param->bIntraRefresh || frm.frameNum == 0) && frm.frameNum - m_lastKeyframe >= m_param->keyframeMax &&
                (!m_extendGopBoundary || frm.frameNum - m_lastKeyframe >= m_param->keyframeMax + m_param->gopLookahead)) ||
                (frm.frameNum == (m_param->chunkStart - 1)) || (frm.frameNum == m_param->chunkEnd))
            {
                if (frm.sliceType == X265_TYPE_AUTO || frm.sliceType == X265_TYPE_I)
                    frm.sliceType = m_param->bOpenGOP && m_lastKeyframe >= 0 ? X265_TYPE_I : X265_TYPE_IDR;
                bool warn = frm.sliceType != X265_TYPE_IDR;
                if (warn && m_param->bOpenGOP)
                    warn &= frm.sliceType != X265_TYPE_I;
                if (warn)
                {
                    x265_log(m_param, X265_LOG_WARNING, "specified frame type (%d) at %d is not compatible with keyframe interval\n",
                        frm.sliceType, frm.frameNum);
                    frm.sliceType = m_param->bOpenGOP && m_lastKeyframe >= 0 ? X265_TYPE_I : X265_TYPE_IDR;
                }
            }
            if (frm.bIsFadeEnd){
                frm.sliceType = m_param->bOpenGOP && m_lastKeyframe >= 0 ? X265_TYPE_I : X265_TYPE_IDR;
            }
            if (m_param->bResetZoneConfig)
            {
                for (int i = 0; i < m_param->rc.zonefileCount; i++)
                {
                    int curZoneStart = m_param->rc.zones[i].startFrame;
                    curZoneStart += curZoneStart ? m_param->rc.zones[i].zoneParam->radl : 0;
                    if (curZoneStart == frm.frameNum)
                        frm.sliceType = X265_TYPE_IDR;
                }
            }
            if ((frm.sliceType == X265_TYPE_I && frm.frameNum - m_lastKeyframe >= m_param->keyframeMin) || (frm.frameNum == (m_param->chunkStart - 1)) || (frm.frameNum == m_param->chunkEnd))
            {
                if (m_param->bOpenGOP)
                {
                    m_lastKeyframe = frm.frameNum;
                    frm.bKeyframe = true;
                }
                else
                    frm.sliceType = X265_TYPE_IDR;
            }
            if (frm.sliceType == X265_TYPE_IDR && frm.bScenecut && isClosedGopRadl)
            {
                for (int i = bframes; i < bframes + m_param->radl; i++)
                    list[i]->m_lowres.sliceType = X265_TYPE_B;
                list[(bframes + m_param->radl)]->m_lowres.sliceType = X265_TYPE_IDR;
            }
            if (frm.sliceType == X265_TYPE_IDR)
            {
                /* Closed GOP */
                m_lastKeyframe = frm.frameNum;
                frm.bKeyframe = true;
                int zoneRadl = 0;
                if (m_param->bResetZoneConfig)
                {
                    for (int i = 0; i < m_param->rc.zonefileCount; i++)
                    {
                        int zoneStart = m_param->rc.zones[i].startFrame;
                        zoneStart += zoneStart ? m_param->rc.zones[i].zoneParam->radl : 0;
                        if (zoneStart == frm.frameNum)
                        {
                            zoneRadl = m_param->rc.zones[i].zoneParam->radl;
                            m_param->radl = 0;
                            m_param->rc.zones->zoneParam->radl = i < m_param->rc.zonefileCount - 1 ? m_param->rc.zones[i + 1].zoneParam->radl : 0;
                            break;
                        }
                    }
                }
                if (bframes > 0 && !m_param->radl && !zoneRadl)
                {
                    list[bframes - 1]->m_lowres.sliceType = X265_TYPE_P;
                    bframes--;
                }
            }
            if (bframes == m_param->bframes || !list[bframes + 1])
            {
                if (IS_X265_TYPE_B(frm.sliceType))
                    x265_log(m_param, X265_LOG_WARNING, "specified frame type is not compatible with max B-frames\n");
                if (frm.sliceType == X265_TYPE_AUTO || IS_X265_TYPE_B(frm.sliceType))
                    frm.sliceType = X265_TYPE_P;
            }
            if (frm.sliceType == X265_TYPE_BREF)
                brefs++;
            if (frm.sliceType == X265_TYPE_AUTO)
                frm.sliceType = X265_TYPE_B;
            else if (!IS_X265_TYPE_B(frm.sliceType))
                break;
        }
    }
    else
    {
        for (bframes = 0, brefs = 0;; bframes++)
        {
            Lowres& frm = list[bframes]->m_lowres;
            if (frm.sliceType == X265_TYPE_BREF)
                brefs++;
            if ((IS_X265_TYPE_I(frm.sliceType) && frm.frameNum - m_lastKeyframe >= m_param->keyframeMin)
                || (frm.frameNum == (m_param->chunkStart - 1)) || (frm.frameNum == m_param->chunkEnd))
            {
                m_lastKeyframe = frm.frameNum;
                frm.bKeyframe = true;
            }
            if (!IS_X265_TYPE_B(frm.sliceType))
                break;
        }
    }

    if (m_param->bEnableTemporalSubLayers > 2)
    {
        //Split the partial mini GOP into sub mini GOPs when temporal sub layers are enabled
        if (bframes < m_param->bframes)
        {
            int leftOver = bframes + 1;
            int8_t gopId = m_gopId - 1;
            int gopLen = x265_gop_ra_length[gopId];
            int listReset = 0;

            m_outputLock.acquire();

            while ((gopId >= 0) && (leftOver > 3))
            {
                if (leftOver < gopLen)
                {
                    gopId = gopId - 1;
                    gopLen = x265_gop_ra_length[gopId];
                    continue;
                }
                else
                {
                    int newbFrames = listReset + gopLen - 1;
                    //Re-assign GOP
                    list[newbFrames]->m_lowres.sliceType = IS_X265_TYPE_I(list[newbFrames]->m_lowres.sliceType) ? list[newbFrames]->m_lowres.sliceType : X265_TYPE_P;
                    if (newbFrames)
                        list[newbFrames - 1]->m_lowres.bLastMiniGopBFrame = true;
                    list[newbFrames]->m_lowres.leadingBframes = newbFrames;
                    m_lastNonB = &list[newbFrames]->m_lowres;

                    /* insert a bref into the sequence */
                    if (m_param->bBPyramid && newbFrames)
                    {
                        placeBref(list, listReset, newbFrames, newbFrames + 1, &brefs);
                    }
                    if (m_param->rc.rateControlMode != X265_RC_CQP)
                    {
                        int p0, p1, b;
                        /* For zero latency tuning, calculate frame cost to be used later in RC */
                        if (!maxSearch)
                        {
                            for (int i = listReset; i <= newbFrames; i++)
                                frames[i + 1] = &list[listReset + i]->m_lowres;
                        }

                        /* estimate new non-B cost */
                        p1 = b = newbFrames + 1;
                        p0 = (IS_X265_TYPE_I(frames[newbFrames + 1]->sliceType)) ? b : listReset;

                        CostEstimateGroup estGroup(*this, frames);

                        estGroup.singleCost(p0, p1, b);

                        if (newbFrames)
                            compCostBref(frames, listReset, newbFrames, newbFrames + 1);
                    }

                    m_inputLock.acquire();
                    /* dequeue all frames from inputQueue that are about to be enqueued
                     * in the output queue. The order is important because Frame can
                     * only be in one list at a time */
                    int64_t pts[X265_BFRAME_MAX + 1];
                    for (int i = 0; i < gopLen; i++)
                    {
                        Frame *curFrame;
                        curFrame = m_inputQueue.popFront();
                        pts[i] = curFrame->m_pts;
                        maxSearch--;
                    }
                    m_inputLock.release();

                    int idx = 0;
                    /* add non-B to output queue */
                    list[newbFrames]->m_reorderedPts = pts[idx++];
                    list[newbFrames]->m_gopOffset = 0;
                    list[newbFrames]->m_gopId = gopId;
                    list[newbFrames]->m_tempLayer = x265_gop_ra[gopId][0].layer;
                    m_outputQueue.pushBack(*list[newbFrames]);

                    /* add B frames to output queue */
                    int i = 1, j = 1;
                    while (i < gopLen)
                    {
                        int offset = listReset + (x265_gop_ra[gopId][j].poc_offset - 1);
                        if (!list[offset] || offset == newbFrames)
                            continue;

                        // Assign gop offset and temporal layer of frames
                        list[offset]->m_gopOffset = j;
                        list[bframes]->m_gopId = gopId;
                        list[offset]->m_tempLayer = x265_gop_ra[gopId][j++].layer;

                        list[offset]->m_reorderedPts = pts[idx++];
                        m_outputQueue.pushBack(*list[offset]);
                        i++;
                    }

                    listReset += gopLen;
                    leftOver = leftOver - gopLen;
                    gopId -= 1;
                    gopLen = (gopId >= 0) ? x265_gop_ra_length[gopId] : 0;
                }
            }

            if (leftOver > 0 && leftOver < 4)
            {
                int64_t pts[X265_BFRAME_MAX + 1];
                int idx = 0;

                int newbFrames = listReset + leftOver - 1;
                list[newbFrames]->m_lowres.sliceType = IS_X265_TYPE_I(list[newbFrames]->m_lowres.sliceType) ? list[newbFrames]->m_lowres.sliceType : X265_TYPE_P;
                if (newbFrames)
                        list[newbFrames - 1]->m_lowres.bLastMiniGopBFrame = true;
                list[newbFrames]->m_lowres.leadingBframes = newbFrames;
                m_lastNonB = &list[newbFrames]->m_lowres;

                /* insert a bref into the sequence */
                if (m_param->bBPyramid && (newbFrames- listReset) > 1)
                    placeBref(list, listReset, newbFrames, newbFrames + 1, &brefs);

                if (m_param->rc.rateControlMode != X265_RC_CQP)
                {
                    int p0, p1, b;
                    /* For zero latency tuning, calculate frame cost to be used later in RC */
                    if (!maxSearch)
                    {
                        for (int i = listReset; i <= newbFrames; i++)
                            frames[i + 1] = &list[listReset + i]->m_lowres;
                    }

                        /* estimate new non-B cost */
                    p1 = b = newbFrames + 1;
                    p0 = (IS_X265_TYPE_I(frames[newbFrames + 1]->sliceType)) ? b : listReset;

                    CostEstimateGroup estGroup(*this, frames);

                    estGroup.singleCost(p0, p1, b);

                    if (newbFrames)
                        compCostBref(frames, listReset, newbFrames, newbFrames + 1);
                }

                m_inputLock.acquire();
                /* dequeue all frames from inputQueue that are about to be enqueued
                 * in the output queue. The order is important because Frame can
                 * only be in one list at a time */
                for (int i = 0; i < leftOver; i++)
                {
                    Frame *curFrame;
                    curFrame = m_inputQueue.popFront();
                    pts[i] = curFrame->m_pts;
                    maxSearch--;
                }
                m_inputLock.release();

                m_lastNonB = &list[newbFrames]->m_lowres;
                list[newbFrames]->m_reorderedPts = pts[idx++];
                list[newbFrames]->m_gopOffset = 0;
                list[newbFrames]->m_gopId = -1;
                list[newbFrames]->m_tempLayer = 0;
                m_outputQueue.pushBack(*list[newbFrames]);
                if (brefs)
                {
                    for (int i = listReset; i < newbFrames; i++)
                    {
                        if (list[i]->m_lowres.sliceType == X265_TYPE_BREF)
                        {
                            list[i]->m_reorderedPts = pts[idx++];
                            list[i]->m_gopOffset = 0;
                            list[i]->m_gopId = -1;
                            list[i]->m_tempLayer = 0;
                            m_outputQueue.pushBack(*list[i]);
                        }
                    }
                }

                /* add B frames to output queue */
                for (int i = listReset; i < newbFrames; i++)
                {
                    /* push all the B frames into output queue except B-ref, which already pushed into output queue */
                    if (list[i]->m_lowres.sliceType != X265_TYPE_BREF)
                    {
                        list[i]->m_reorderedPts = pts[idx++];
                        list[i]->m_gopOffset = 0;
                        list[i]->m_gopId = -1;
                        list[i]->m_tempLayer = 1;
                        m_outputQueue.pushBack(*list[i]);
                    }
                }
            }
        }
        else
        // Fill the complete mini GOP when temporal sub layers are enabled
        {

            list[bframes - 1]->m_lowres.bLastMiniGopBFrame = true;
            list[bframes]->m_lowres.leadingBframes = bframes;
            m_lastNonB = &list[bframes]->m_lowres;

            /* insert a bref into the sequence */
            if (m_param->bBPyramid && !brefs)
            {
                placeBref(list, 0, bframes, bframes + 1, &brefs);
            }

            /* calculate the frame costs ahead of time for estimateFrameCost while we still have lowres */
            if (m_param->rc.rateControlMode != X265_RC_CQP)
            {
                int p0, p1, b;
                /* For zero latency tuning, calculate frame cost to be used later in RC */
                if (!maxSearch)
                {
                    for (int i = 0; i <= bframes; i++)
                        frames[i + 1] = &list[i]->m_lowres;
                }

                /* estimate new non-B cost */
                p1 = b = bframes + 1;
                p0 = (IS_X265_TYPE_I(frames[bframes + 1]->sliceType)) ? b : 0;

                CostEstimateGroup estGroup(*this, frames);
                estGroup.singleCost(p0, p1, b);

                compCostBref(frames, 0, bframes, bframes + 1);
            }

            m_inputLock.acquire();
            /* dequeue all frames from inputQueue that are about to be enqueued
            * in the output queue. The order is important because Frame can
            * only be in one list at a time */
            int64_t pts[X265_BFRAME_MAX + 1];
            for (int i = 0; i <= bframes; i++)
            {
                Frame *curFrame;
                curFrame = m_inputQueue.popFront();
                pts[i] = curFrame->m_pts;
                maxSearch--;
            }
            m_inputLock.release();

            m_outputLock.acquire();

            int idx = 0;
            /* add non-B to output queue */
            list[bframes]->m_reorderedPts = pts[idx++];
            list[bframes]->m_gopOffset = 0;
            list[bframes]->m_gopId = m_gopId;
            list[bframes]->m_tempLayer = x265_gop_ra[m_gopId][0].layer;
            m_outputQueue.pushBack(*list[bframes]);

            int i = 1, j = 1;
            while (i <= bframes)
            {
                int offset = x265_gop_ra[m_gopId][j].poc_offset - 1;
                if (!list[offset] || offset == bframes)
                    continue;

                // Assign gop offset and temporal layer of frames
                list[offset]->m_gopOffset = j;
                list[offset]->m_gopId = m_gopId;
                list[offset]->m_tempLayer = x265_gop_ra[m_gopId][j++].layer;

                /* add B frames to output queue */
                list[offset]->m_reorderedPts = pts[idx++];
                m_outputQueue.pushBack(*list[offset]);
                i++;
            }
        }

        bool isKeyFrameAnalyse = (m_param->rc.cuTree || (m_param->rc.vbvBufferSize && m_param->lookaheadDepth));
        if (isKeyFrameAnalyse && IS_X265_TYPE_I(m_lastNonB->sliceType))
        {
            m_inputLock.acquire();
            Frame *curFrame = m_inputQueue.first();
            frames[0] = m_lastNonB;
            int j;
            for (j = 0; j < maxSearch; j++)
            {
                frames[j + 1] = &curFrame->m_lowres;
                curFrame = curFrame->m_next;
            }
            m_inputLock.release();

            frames[j + 1] = NULL;
            if (!m_param->rc.bStatRead)
                slicetypeAnalyse(frames, true);
            bool bIsVbv = m_param->rc.vbvBufferSize > 0 && m_param->rc.vbvMaxBitrate > 0;
            if ((m_param->analysisLoad && m_param->scaleFactor && bIsVbv) || m_param->bliveVBV2pass)
            {
                int numFrames;
                for (numFrames = 0; numFrames < maxSearch; numFrames++)
                {
                    Lowres *fenc = frames[numFrames + 1];
                    if (!fenc)
                        break;
                }
                vbvLookahead(frames, numFrames, true);
            }
        }


        m_outputLock.release();
    }
    else
    {

        if (bframes)
            list[bframes - 1]->m_lowres.bLastMiniGopBFrame = true;
        list[bframes]->m_lowres.leadingBframes = bframes;
        m_lastNonB = &list[bframes]->m_lowres;
        // The next block inserts a B reference frame: if m_param->bBPyramid is enabled, bframes > 1 and brefs == 0, placeBref() inserts a B-ref into the sequence.
        /* insert a bref into the sequence */
        if (m_param->bBPyramid && bframes > 1 && !brefs)
        {
            placeBref(list, 0, bframes, bframes + 1, &brefs);
        }
        /* calculate the frame costs ahead of time for estimateFrameCost while we still have lowres */
        if (m_param->rc.rateControlMode != X265_RC_CQP)
        {
            int p0, p1, b;
            /* For zero latency tuning, calculate frame cost to be used later in RC */
            if (!maxSearch)
            {
                for (int i = 0; i <= bframes; i++)
                    frames[i + 1] = &list[i]->m_lowres;
            }

            /* estimate new non-B cost */
            p1 = b = bframes + 1;
            p0 = (IS_X265_TYPE_I(frames[bframes + 1]->sliceType)) ? b : 0;

            CostEstimateGroup estGroup(*this, frames);
            estGroup.singleCost(p0, p1, b);

            if (m_param->bEnableTemporalSubLayers > 1 && bframes)
            {
                compCostBref(frames, 0, bframes, bframes + 1);
            }
            else
            {
                if (bframes)
                {
                    p0 = 0; // last nonb
                    bool isp0available = frames[bframes + 1]->sliceType == X265_TYPE_IDR ? false : true;

                    for (b = 1; b <= bframes; b++)
                    {
                        if (!isp0available)
                            p0 = b;

                        if (frames[b]->sliceType == X265_TYPE_B)
                            for (p1 = b; frames[p1]->sliceType == X265_TYPE_B; p1++)
                                ; // find new nonb or bref
                        else
                            p1 = bframes + 1;

                        estGroup.singleCost(p0, p1, b);

                        if (frames[b]->sliceType == X265_TYPE_BREF)
                        {
                            p0 = b;
                            isp0available = true;
                        }
                    }
                }
            }
        }
        // Lock m_inputLock for thread safety.
        m_inputLock.acquire();
        /* dequeue all frames from inputQueue that are about to be enqueued
         * in the output queue. The order is important because Frame can
         * only be in one list at a time */
        int64_t pts[X265_BFRAME_MAX + 1];
        for (int i = 0; i <= bframes; i++)
        {
            Frame *curFrame;
            curFrame = m_inputQueue.popFront();
            pts[i] = curFrame->m_pts;
            maxSearch--;
        }
        m_inputLock.release();

        m_outputLock.acquire();

        /* add non-B to output queue */
        int idx = 0;
        list[bframes]->m_reorderedPts = pts[idx++];
        m_outputQueue.pushBack(*list[bframes]);
        // If there are B reference frames (brefs is non-zero), walk list, find the X265_TYPE_BREF frames and push them onto m_outputQueue, taking their timestamps from pts.
        /* Add B-ref frame next to P frame in output queue, the B-ref encode before non B-ref frame */
        if (brefs)
        {
            for (int i = 0; i < bframes; i++)
            {
                if (list[i]->m_lowres.sliceType == X265_TYPE_BREF)
                {
                    list[i]->m_reorderedPts = pts[idx++];
                    m_outputQueue.pushBack(*list[i]);
                }
            }
        }
        // Walk the B frames (excluding B-refs), push them onto m_outputQueue and take their timestamps from pts.
        /* add B frames to output queue */
        for (int i = 0; i < bframes; i++)
        {
            /* push all the B frames into output queue except B-ref, which already pushed into output queue */
            if (list[i]->m_lowres.sliceType != X265_TYPE_BREF)
            {
                list[i]->m_reorderedPts = pts[idx++];
                m_outputQueue.pushBack(*list[i]);
            }
        }

        // If isKeyFrameAnalyse is true and the last non-B frame is an I frame, run the keyframe analysis path.
        bool isKeyFrameAnalyse = (m_param->rc.cuTree || (m_param->rc.vbvBufferSize && m_param->lookaheadDepth));
        if (isKeyFrameAnalyse && IS_X265_TYPE_I(m_lastNonB->sliceType))
        {
            m_inputLock.acquire();
            Frame *curFrame = m_inputQueue.first();
            frames[0] = m_lastNonB;
            int j;
            for (j = 0; j < maxSearch; j++)
            {
                frames[j + 1] = &curFrame->m_lowres;
                curFrame = curFrame->m_next;
            }
            m_inputLock.release();

            frames[j + 1] = NULL;
            if (!m_param->rc.bStatRead)
                slicetypeAnalyse(frames, true);
            bool bIsVbv = m_param->rc.vbvBufferSize > 0 && m_param->rc.vbvMaxBitrate > 0;
            if ((m_param->analysisLoad && m_param->scaleFactor && bIsVbv) || m_param->bliveVBV2pass)
            {
                int numFrames;
                for (numFrames = 0; numFrames < maxSearch; numFrames++)
                {
                    Lowres *fenc = frames[numFrames + 1];
                    if (!fenc)
                        break;
                }
                vbvLookahead(frames, numFrames, true);
            }
        }

        m_outputLock.release();
    }
}

6. Task dispatch: PreLookaheadGroup::processTasks

This is the processTasks function of the PreLookaheadGroup class. The code is explained below:

void PreLookaheadGroup::processTasks(int workerThreadID)
{
    // If workerThreadID is negative, fall back to the pool's worker count as the TLD index (or 0 if there is no pool).
    if (workerThreadID < 0)
        workerThreadID = m_lookahead.m_pool ? m_lookahead.m_pool->m_numWorkers : 0;
    // Get the LookaheadTLD for this worker thread, i.e. the thread-local data used by pre-analysis.
    LookaheadTLD& tld = m_lookahead.m_tld[workerThreadID];
    // Take the m_lock mutex.
    m_lock.acquire();
    // Loop while the number of acquired jobs m_jobAcquired is below the total m_jobTotal.
    while (m_jobAcquired < m_jobTotal)
    {   // Take the next frame to pre-analyse into preFrame and increment m_jobAcquired.
        Frame* preFrame = m_preframes[m_jobAcquired++];
        // Profiling hooks at the start of the pre-analysis task.
        ProfileLookaheadTime(m_lookahead.m_preLookaheadElapsedTime, m_lookahead.m_countPreLookahead);
        ProfileScopeEvent(prelookahead);
        // Release m_lock while working.
        m_lock.release();
        // Initialize the frame's lowres (preFrame->m_lowres) from preFrame->m_fencPic and preFrame->m_poc.
        preFrame->m_lowres.init(preFrame->m_fencPic, preFrame->m_poc);
        // If adaptive quantization is enabled (m_lookahead.m_bAdaptiveQuant), compute the AQ offsets for this frame.
        if (m_lookahead.m_bAdaptiveQuant)
            tld.calcAdaptiveQuantFrame(preFrame, m_lookahead.m_param);
        // If histogram-based scene-cut detection is enabled (bHistBasedSceneCut), collect the picture statistics.
        if (m_lookahead.m_param->bHistBasedSceneCut)
            tld.collectPictureStatistics(preFrame);
        // Run lowres intra estimation on the frame.
        tld.lowresIntraEstimate(preFrame->m_lowres, m_lookahead.m_param->rc.qgSize);
        preFrame->m_lowresInit = true;
        // Re-acquire m_lock before checking for more jobs.
        m_lock.acquire();
    }
    // Release m_lock.
    m_lock.release();
}

7. Lowres intra estimation: LookaheadTLD::lowresIntraEstimate()

This method performs intra estimation on the lowres frame. The code is explained below:

// Intra estimation on the lowres frame.
void LookaheadTLD::lowresIntraEstimate(Lowres& fenc, uint32_t qgSize)
{   // Local buffers: the pixel arrays prediction, fencIntra and neighbours, plus the pointers samples and filtered into the two rows of neighbours.
    ALIGN_VAR_32(pixel, prediction[X265_LOWRES_CU_SIZE * X265_LOWRES_CU_SIZE]);
    pixel fencIntra[X265_LOWRES_CU_SIZE * X265_LOWRES_CU_SIZE];
    pixel neighbours[2][X265_LOWRES_CU_SIZE * 4 + 1];
    pixel* samples = neighbours[0], *filtered = neighbours[1];
    // Constants: the lookahead lambda, the intra penalty, and the lowres CU size/index.
    const int lookAheadLambda = (int)x265_lambda_tab[X265_LOOKAHEAD_QP];
    const int intraPenalty = 5 * lookAheadLambda;
    const int lowresPenalty = 4; /* fixed CU cost overhead */

    const int cuSize  = X265_LOWRES_CU_SIZE;
    const int cuSize2 = cuSize << 1;
    const int sizeIdx = X265_LOWRES_CU_BITS - 2;

    pixelcmp_t satd = primitives.pu[sizeIdx].satd;
    int planar = !!(cuSize >= 8);

    int costEst = 0, costEstAq = 0;
    // Loop over CU rows, cuY in [0, heightInCU).
    for (int cuY = 0; cuY < heightInCU; cuY++)
    {
        fenc.rowSatds[0][0][cuY] = 0;
        // Loop over CU columns, cuX in [0, widthInCU).
        for (int cuX = 0; cuX < widthInCU; cuX++)
        {   // Compute the CU index cuXY and its pixel offset pelOffset.
            const int cuXY = cuX + cuY * widthInCU;
            const intptr_t pelOffset = cuSize * cuX + cuSize * cuY * fenc.lumaStride;
            pixel *pixCur = fenc.lowresPlane[0] + pelOffset;

            /* copy fenc pixels */// copy the current CU's pixels into fencIntra
            primitives.cu[sizeIdx].copy_pp(fencIntra, cuSize, pixCur, fenc.lumaStride);

            /* collect reference sample pixels */
            // Gather the neighbouring reference samples into samples: the row above and the column to the left.
            pixCur -= fenc.lumaStride + 1;
            memcpy(samples, pixCur, (2 * cuSize + 1) * sizeof(pixel)); /* top */
            for (int i = 1; i <= 2 * cuSize; i++)
                samples[cuSize2 + i] = pixCur[i * fenc.lumaStride];    /* left */

            primitives.cu[sizeIdx].intra_filter(samples, filtered);

            int cost, icost = me.COST_MAX;
            uint32_t ilowmode = 0;
            // Try the DC and planar modes, computing the SATD (Sum of Absolute Transformed Differences) of each prediction and keeping the cheaper one as the best mode so far.
            /* DC and planar */
            primitives.cu[sizeIdx].intra_pred[DC_IDX](prediction, cuSize, samples, 0, cuSize <= 16);
            cost = satd(fencIntra, cuSize, prediction, cuSize);
            COPY2_IF_LT(icost, cost, ilowmode, DC_IDX);

            primitives.cu[sizeIdx].intra_pred[PLANAR_IDX](prediction, cuSize, neighbours[planar], 0, 0);
            cost = satd(fencIntra, cuSize, prediction, cuSize);
            COPY2_IF_LT(icost, cost, ilowmode, PLANAR_IDX);

            /* scan angular predictions */
            int filter, acost = me.COST_MAX;
            uint32_t mode, alowmode = 4;
            // Coarsely scan the angular modes (step 5), computing the SATD cost of each and keeping the cheapest.
            for (mode = 5; mode < 35; mode += 5)
            {
                filter = !!(g_intraFilterFlags[mode] & cuSize);
                primitives.cu[sizeIdx].intra_pred[mode](prediction, cuSize, neighbours[filter], mode, cuSize <= 16);
                cost = satd(fencIntra, cuSize, prediction, cuSize);
                COPY2_IF_LT(acost, cost, alowmode, mode);
            }
            // Refine around the best angular mode (distance 2, then 1), keeping the cheapest as the final mode.
            for (uint32_t dist = 2; dist >= 1; dist--)
            {
                int minusmode = alowmode - dist;
                int plusmode = alowmode + dist;

                mode = minusmode;
                filter = !!(g_intraFilterFlags[mode] & cuSize);
                primitives.cu[sizeIdx].intra_pred[mode](prediction, cuSize, neighbours[filter], mode, cuSize <= 16);
                cost = satd(fencIntra, cuSize, prediction, cuSize);
                COPY2_IF_LT(acost, cost, alowmode, mode);

                mode = plusmode;
                filter = !!(g_intraFilterFlags[mode] & cuSize);
                primitives.cu[sizeIdx].intra_pred[mode](prediction, cuSize, neighbours[filter], mode, cuSize <= 16);
                cost = satd(fencIntra, cuSize, prediction, cuSize);
                COPY2_IF_LT(acost, cost, alowmode, mode);
            }
            COPY2_IF_LT(icost, acost, ilowmode, alowmode);
            // Add the penalties to form the estimated intra cost of this CU and record it.
            icost += intraPenalty + lowresPenalty; /* estimate intra signal cost */

            fenc.lowresCosts[0][0][cuXY] = (uint16_t)(X265_MIN(icost, LOWRES_COST_MASK) | (0 << LOWRES_COST_SHIFT));
            fenc.intraCost[cuXY] = icost;
            fenc.intraMode[cuXY] = (uint8_t)ilowmode;
            /* do not include edge blocks in the 
            frame cost estimates, they are not very accurate */
            // Only non-edge CUs are accumulated into the frame-level cost estimates.
            const bool bFrameScoreCU = (cuX > 0 && cuX < widthInCU - 1 &&
                                        cuY > 0 && cuY < heightInCU - 1) || widthInCU <= 2 || heightInCU <= 2;
            int icostAq;
            if (qgSize == 8)
                icostAq = (bFrameScoreCU && fenc.invQscaleFactor) ? ((icost * fenc.invQscaleFactor8x8[cuXY] + 128) >> 8) : icost;
            else
                icostAq = (bFrameScoreCU && fenc.invQscaleFactor) ? ((icost * fenc.invQscaleFactor[cuXY] +128) >> 8) : icost;

            if (bFrameScoreCU)
            {
                costEst += icost;
                costEstAq += icostAq;
            }

            fenc.rowSatds[0][0][cuY] += icostAq;
        }
    }
    // Store the frame-level cost estimates.
    fenc.costEst[0][0] = costEst;
    fenc.costEstAq[0][0] = costEstAq;
}

8. Slice type analysis: Lookahead::slicetypeAnalyse

Performs the slice type analysis over the lookahead window:

void Lookahead::slicetypeAnalyse(Lowres **frames, bool bKeyframe)
{
    int numFrames, origNumFrames, keyintLimit, framecnt;
    // Maximum number of frames to search: min(m_param->lookaheadDepth, X265_LOOKAHEAD_MAX).
    int maxSearch = X265_MIN(m_param->lookaheadDepth, X265_LOOKAHEAD_MAX);
    int cuCount = m_8x8Blocks;
    int resetStart;
    bool bIsVbvLookahead = m_param->rc.vbvBufferSize && m_param->lookaheadDepth;

    /* count undecided frames */
    // Count the undecided frames: walk frames, incrementing framecnt until maxSearch is reached or a frame whose slice type is not X265_TYPE_AUTO is found.
    for (framecnt = 0; framecnt < maxSearch; framecnt++)
    {
        Lowres *fenc = frames[framecnt + 1];
        if (!fenc || fenc->sliceType != X265_TYPE_AUTO)
            break;
    }
    // If framecnt is 0 there are no undecided frames; run cuTree if enabled and return.
    if (!framecnt)
    {
        if (m_param->rc.cuTree)
            cuTree(frames, 0, bKeyframe);
        return;
    }// Terminate the list: frames[framecnt + 1] = NULL marks the end of the undecided frames.
    frames[framecnt + 1] = NULL;
    // If zone-config reset is enabled (m_param->bResetZoneConfig), update m_param->keyframeMax from the zone settings.
    if (m_param->bResetZoneConfig)
    {
        for (int i = 0; i < m_param->rc.zonefileCount; i++)
        {
            int curZoneStart = m_param->rc.zones[i].startFrame, nextZoneStart = 0;
            curZoneStart += curZoneStart ? m_param->rc.zones[i].zoneParam->radl : 0;
            nextZoneStart += (i + 1 < m_param->rc.zonefileCount) ? m_param->rc.zones[i + 1].startFrame + m_param->rc.zones[i + 1].zoneParam->radl : m_param->totalFrames;
            if (curZoneStart <= frames[0]->frameNum && nextZoneStart > frames[0]->frameNum)
                m_param->keyframeMax = nextZoneStart - curZoneStart;
            if (m_param->rc.zones[m_param->rc.zonefileCount - 1].startFrame <= frames[0]->frameNum && nextZoneStart == 0)
                m_param->keyframeMax = m_param->rc.zones[0].keyframeMax;
        }
    }// Update keylimit according to the current frame number and the chunk settings.
    int keylimit = m_param->keyframeMax;
    if (frames[0]->frameNum < m_param->chunkEnd)
    {
        int chunkStart = (m_param->chunkStart - m_lastKeyframe - 1);
        int chunkEnd = (m_param->chunkEnd - m_lastKeyframe);
        if ((chunkStart > 0) && (chunkStart < m_param->keyframeMax))
            keylimit = chunkStart;
        else if ((chunkEnd > 0) && (chunkEnd < m_param->keyframeMax))
            keylimit = chunkEnd;
    }
    // Compute keyFrameLimit from the GOP settings and the remaining keyframe budget.
    int keyFrameLimit = keylimit + m_lastKeyframe - frames[0]->frameNum - 1;
    if (m_param->gopLookahead && keyFrameLimit <= m_param->bframes + 1)
        keyintLimit = keyFrameLimit + m_param->gopLookahead;
    else
        keyintLimit = keyFrameLimit;
    // Choose numFrames depending on VBV lookahead, open GOP and the number of undecided frames.
    origNumFrames = numFrames = m_param->bIntraRefresh ? framecnt : X265_MIN(framecnt, keyintLimit);
    if (bIsVbvLookahead)
        numFrames = framecnt;
    else if (m_param->bOpenGOP && numFrames < framecnt)
        numFrames++;
    else if (numFrames == 0)
    {
        frames[1]->sliceType = X265_TYPE_I;
        return;
    }
    // First decide whether batched motion searches are needed.
    if (m_bBatchMotionSearch)
    {   // Create a CostEstimateGroup estGroup and, with nested loops over each frame b, its earlier reference p0 and later reference p1, queue the motion searches.
        /* pre-calculate all motion searches, using many worker threads */
        CostEstimateGroup estGroup(*this, frames);
        for (int b = 2; b < numFrames; b++)
        {   // This loop only queues reference pairs where p0 and p1 are equidistant from b.
            for (int i = 1; i <= m_param->bframes + 1; i++)
            {
                int p0 = b - i;
                if (p0 < 0)
                    continue;

                /* Skip search if already done */
                if (frames[b]->lowresMvs[0][i][0].x != 0x7FFF)
                    continue;

                /* perform search to p1 at same distance, if possible */
                int p1 = b + i;
                if (p1 >= numFrames || frames[b]->lowresMvs[1][i][0].x != 0x7FFF)
                    p1 = b;

                estGroup.add(p0, p1, b);
            }
            }// Auto-disable batched motion search (m_bBatchMotionSearch) if the pool has fewer than 4 workers.
        /* auto-disable after the first batch if pool is small */
        m_bBatchMotionSearch &= m_pool->m_numWorkers >= 4;
        estGroup.finishBatch();

        if (m_bBatchFrameCosts)
        {   // On top of the equidistant pairs above, queue the remaining (p0, p1) combinations.
            /* pre-calculate all frame cost estimates, using many worker threads */
            for (int b = 2; b < numFrames; b++)
            {
                for (int i = 1; i <= m_param->bframes + 1; i++)
                {   
                    if (b < i)
                        continue;

                    /* only measure frame cost in this pass if motion searches
                     * are already done */
                    if (frames[b]->lowresMvs[0][i][0].x == 0x7FFF)
                        continue;

                    int p0 = b - i;

                    for (int j = 0; j <= m_param->bframes; j++)
                    {
                        int p1 = b + j;
                        if (p1 >= numFrames)
                            break;

                        /* ensure P1 search is done */
                        if (j && frames[b]->lowresMvs[1][j][0].x == 0x7FFF)
                            continue;

                        /* ensure frame cost is not done */
                        if (frames[b]->costEst[i][j] >= 0)
                            continue;

                        estGroup.add(p0, p1, b);
                    }
                }
            }

            /* auto-disable after the first batch if the pool is not large */
            m_bBatchFrameCosts &= m_pool->m_numWorkers > 12;
            estGroup.finishBatch();
        }
    }

    int numBFrames = 0;
    int numAnalyzed = numFrames;
    bool isScenecut = false;

    if (m_param->bHistBasedSceneCut)
        isScenecut = histBasedScenecut(frames, 0, 1, origNumFrames);
    else// otherwise decide whether the current frame is a scene cut
        isScenecut = scenecut(frames, 0, 1, true, origNumFrames);

    /* When scenecut threshold is set, use scenecut detection for I frame placements */
    if (m_param->scenecutThreshold && isScenecut)
    {   // Mark the second frame as an I frame and return.
        frames[1]->sliceType = X265_TYPE_I;
        return;
    }
    if (m_param->gopLookahead && (keyFrameLimit >= 0) && (keyFrameLimit <= m_param->bframes + 1))
    {
        bool sceneTransition = m_isSceneTransition;
        m_extendGopBoundary = false;
        for (int i = m_param->bframes + 1; i < origNumFrames; i += m_param->bframes + 1)
        {
            scenecut(frames, i, i + 1, true, origNumFrames);

            for (int j = i + 1; j <= X265_MIN(i + m_param->bframes + 1, origNumFrames); j++)
            {
                if (frames[j]->bScenecut && scenecutInternal(frames, j - 1, j, true))
                {
                    m_extendGopBoundary = true;
                    break;
                }
            }
            if (m_extendGopBoundary)
                break;
        }
        m_isSceneTransition = sceneTransition;
    }
    if (m_param->bframes)
    {
        if (m_param->bFrameAdaptive == X265_B_ADAPT_TRELLIS)
        {
            if (numFrames > 1)
            {   // best_paths starts as the empty string and "P".
                char best_paths[X265_BFRAME_MAX + 1][X265_LOOKAHEAD_MAX + 1] = { "", "P" };
                int best_path_index = numFrames % (X265_BFRAME_MAX + 1);
                // Call slicetypePath to find the best slice-type path, storing the result in best_paths.
                /* Perform the frame type analysis. */
                for (int j = 2; j <= numFrames; j++)
                    slicetypePath(frames, j, best_paths);
                // strspn counts the leading run of 'B' characters in best_paths[best_path_index], giving the number of B frames (numBFrames).
                numBFrames = (int)strspn(best_paths[best_path_index], "B");
                // Load the analysis result back into the frame types.
                /* Load the results of the analysis into the frame types. */
                for (int j = 1; j < numFrames; j++)
                    frames[j]->sliceType = best_paths[best_path_index][j - 1] == 'B' ? X265_TYPE_B : X265_TYPE_P;
            }// The last frame (frames[numFrames]) is forced to be a P frame.
            frames[numFrames]->sliceType = X265_TYPE_P;
        }
        else if (m_param->bFrameAdaptive == X265_B_ADAPT_FAST)
        {
            CostEstimateGroup estGroup(*this, frames);

            int64_t cost1p0, cost2p0, cost1b1, cost2p1;

            for (int i = 0; i <= numFrames - 2; )
            {
                cost2p1 = estGroup.singleCost(i + 0, i + 2, i + 2, true);
                if (frames[i + 2]->intraMbs[2] > cuCount / 2)
                {
                    frames[i + 1]->sliceType = X265_TYPE_P;
                    frames[i + 2]->sliceType = X265_TYPE_P;
                    i += 2;
                    continue;
                }

                cost1b1 = estGroup.singleCost(i + 0, i + 2, i + 1);
                cost1p0 = estGroup.singleCost(i + 0, i + 1, i + 1);
                cost2p0 = estGroup.singleCost(i + 1, i + 2, i + 2);

                if (cost1p0 + cost2p0 < cost1b1 + cost2p1)
                {
                    frames[i + 1]->sliceType = X265_TYPE_P;
                    i += 1;
                    continue;
                }

// arbitrary and untuned
#define INTER_THRESH 300
#define P_SENS_BIAS (50 - m_param->bFrameBias)
                frames[i + 1]->sliceType = X265_TYPE_B;

                int j;
                for (j = i + 2; j <= X265_MIN(i + m_param->bframes, numFrames - 1); j++)
                {
                    int64_t pthresh = X265_MAX(INTER_THRESH - P_SENS_BIAS * (j - i - 1), INTER_THRESH / 10);
                    int64_t pcost = estGroup.singleCost(i + 0, j + 1, j + 1, true);
                    if (pcost > pthresh * cuCount || frames[j + 1]->intraMbs[j - i + 1] > cuCount / 3)
                        break;
                    frames[j]->sliceType = X265_TYPE_B;
                }

                frames[j]->sliceType = X265_TYPE_P;
                i = j;
            }
            frames[numFrames]->sliceType = X265_TYPE_P;
            numBFrames = 0;
            while (numBFrames < numFrames && frames[numBFrames + 1]->sliceType == X265_TYPE_B)
                numBFrames++;
        }
        else
        {
            numBFrames = X265_MIN(numFrames - 1, m_param->bframes);
            for (int j = 1; j < numFrames; j++)
                frames[j]->sliceType = (j % (numBFrames + 1)) ? X265_TYPE_B : X265_TYPE_P;

            frames[numFrames]->sliceType = X265_TYPE_P;
        }
        //根据条件判断是否强制使用RADL
        int zoneRadl = m_param->rc.zonefileCount && m_param->bResetZoneConfig ? m_param->rc.zones->zoneParam->radl : 0;
        bool bForceRADL = zoneRadl || (m_param->radl && (m_param->keyframeMax == m_param->keyframeMin));
        bool bLastMiniGop = (framecnt >= m_param->bframes + 1) ? false : true;//根据条件判断是否为最后一个小GOP
        int radl = m_param->radl ? m_param->radl : zoneRadl;
        int preRADL = m_lastKeyframe + m_param->keyframeMax - radl - 1; /*Frame preceding RADL in POC order*/
        if (bForceRADL && (frames[0]->frameNum == preRADL) && !bLastMiniGop)
        {//如果满足强制使用RADL的条件,并且第一个帧的frameNum等于preRADL,并且不是最后一个小GOP,则执行以下操作
            int j = 1;
            numBFrames = m_param->radl ? m_param->radl : zoneRadl;
            for (; j <= numBFrames; j++)//循环设置帧类型为B帧,从第2帧到第numBFrames帧
                frames[j]->sliceType = X265_TYPE_B;
            frames[j]->sliceType = X265_TYPE_I;
        }
        else /* Check scenecut and RADL on the first minigop. */
        {
            for (int j = 1; j < numBFrames + 1; j++)
            {   //对于每个帧,检查是否满足场景切换条件或者强制使用RADL的条件,如果满足条件,将该帧的帧类型设置为P帧,并将numAnalyzed设置为当前帧的索引,并跳出循环
                if (scenecut(frames, j, j + 1, false, origNumFrames) ||
                    (bForceRADL && (frames[j]->frameNum == preRADL)))
                {
                    frames[j]->sliceType = X265_TYPE_P;
                    numAnalyzed = j;
                    break;
                }
            }
        }
        resetStart = bKeyframe ? 1 : X265_MIN(numBFrames + 2, numAnalyzed + 1);
    }
    else
    {
        for (int j = 1; j <= numFrames; j++)
            frames[j]->sliceType = X265_TYPE_P;

        resetStart = bKeyframe ? 1 : 2;
    }
    if (m_param->bAQMotion)
        aqMotion(frames, bKeyframe);
    //调用cuTree函数处理帧的CU树
    if (m_param->rc.cuTree)
        cuTree(frames, X265_MIN(numFrames, m_param->keyframeMax), bKeyframe);

    if (m_param->gopLookahead && (keyFrameLimit >= 0) && (keyFrameLimit <= m_param->bframes + 1) && !m_extendGopBoundary)
        keyintLimit = keyFrameLimit;

    if (!m_param->bIntraRefresh)
        for (int j = keyintLimit + 1; j <= numFrames; j += m_param->keyframeMax)
        {
            frames[j]->sliceType = X265_TYPE_I;
            resetStart = X265_MIN(resetStart, j + 1);
        }
    
    if (bIsVbvLookahead)
        vbvLookahead(frames, numFrames, bKeyframe);
    int maxp1 = X265_MIN(m_param->bframes + 1, origNumFrames);

    /* Restore frame types for all frames that haven't actually been decided yet. */
    for (int j = resetStart; j <= numFrames; j++)
    {
        frames[j]->sliceType = X265_TYPE_AUTO;
        /* If any frame marked as scenecut is being restarted for sliceDecision, 
         * undo scene Transition flag */
        if (j <= maxp1 && frames[j]->bScenecut && m_isSceneTransition)
            m_isSceneTransition = false;
    }
}
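To make the X265_B_ADAPT_NONE branch (the final else above) concrete: it simply expands a fixed pattern, placing a P every (numBFrames + 1) frames and forcing the last frame of the window to P. The standalone sketch below illustrates only this arithmetic; it is not x265 code, and the parameter values are invented for the example.

#include <algorithm>
#include <cstdio>
#include <vector>

// Illustrative sketch of the fixed-pattern branch (X265_B_ADAPT_NONE).
// Frame 0 is the already-decided anchor; frames 1..numFrames get a type here.
int main()
{
    const int bframes = 3;      // assumed --bframes setting
    const int numFrames = 9;    // frames available in the lookahead window

    int numBFrames = std::min(numFrames - 1, bframes);
    std::vector<char> sliceType(numFrames + 1, '?');

    for (int j = 1; j < numFrames; j++)
        sliceType[j] = (j % (numBFrames + 1)) ? 'B' : 'P';
    sliceType[numFrames] = 'P'; // the last frame of the window is always P

    for (int j = 1; j <= numFrames; j++)
        printf("frame %d -> %c\n", j, sliceType[j]); // BBBP BBBP P
    return 0;
}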

9.低分辨率帧间估计CostEstimateGroup::estimateFrameCost

用于估算一个Frame的成本

int64_t CostEstimateGroup::estimateFrameCost(LookaheadTLD& tld, int p0, int p1, int b, bool bIntraPenalty)
{
    Lowres*     fenc  = m_frames[b];
    x265_param* param = m_lookahead.m_param;
    int64_t     score = 0;

    if (fenc->costEst[b - p0][p1 - b] >= 0 && fenc->rowSatds[b - p0][p1 - b][0] != -1)
        score = fenc->costEst[b - p0][p1 - b];
    else
    {
        bool bDoSearch[2];
        bDoSearch[0] = fenc->lowresMvs[0][b - p0][0].x == 0x7FFF;
        bDoSearch[1] = p1 > b && fenc->lowresMvs[1][p1 - b][0].x == 0x7FFF;

#if CHECKED_BUILD
        X265_CHECK(!(p0 < b && fenc->lowresMvs[0][b - p0][0].x == 0x7FFE), "motion search batch duplication L0\n");
        X265_CHECK(!(p1 > b && fenc->lowresMvs[1][p1 - b][0].x == 0x7FFE), "motion search batch duplication L1\n");
        if (bDoSearch[0]) fenc->lowresMvs[0][b - p0][0].x = 0x7FFE;
        if (bDoSearch[1]) fenc->lowresMvs[1][p1 - b][0].x = 0x7FFE;
#endif

        fenc->weightedRef[b - p0].isWeighted = false;
        if (param->bEnableWeightedPred && bDoSearch[0])
            tld.weightsAnalyse(*m_frames[b], *m_frames[p0]);

        fenc->costEst[b - p0][p1 - b] = 0;
        fenc->costEstAq[b - p0][p1 - b] = 0;
        //如果不处于批处理模式,并且协同模式的切片数大于1,并且需要进行运动搜索或双向测量,则进入协同模式
        if (!m_batchMode && m_lookahead.m_numCoopSlices > 1 && ((p1 > b) || bDoSearch[0] || bDoSearch[1]))
        {
            /* Use cooperative mode if a thread pool is available and the cost estimate is
             * going to need motion searches or bidir measurements */

            memset(&m_slice, 0, sizeof(Slice) * m_lookahead.m_numCoopSlices);

            m_lock.acquire();
            X265_CHECK(!m_batchMode, "single CostEstimateGroup instance cannot mix batch modes\n");
            m_coop.p0 = p0;
            m_coop.p1 = p1;
            m_coop.b = b;
            m_coop.bDoSearch[0] = bDoSearch[0];
            m_coop.bDoSearch[1] = bDoSearch[1];
            m_jobTotal = m_lookahead.m_numCoopSlices;
            m_jobAcquired = 0;
            m_lock.release();

            tryBondPeers(*m_lookahead.m_pool, m_jobTotal);

            processTasks(-1);

            waitForExit();
            //通过使用线程池来并行处理多个任务,计算每个任务的成本估算值,并将结果累加到costEst和costEstAq中
            for (int i = 0; i < m_lookahead.m_numCoopSlices; i++)
            {
                fenc->costEst[b - p0][p1 - b] += m_slice[i].costEst;
                fenc->costEstAq[b - p0][p1 - b] += m_slice[i].costEstAq;
                if (p1 == b)
                    fenc->intraMbs[b - p0] += m_slice[i].intraMbs;
            }
        }
        else
        {   // sequential (non-cooperative) path: if HME is enabled, first estimate MVs on the 1/16th-resolution plane
            /* Calculate MVs for 1/16th resolution*/
            bool lastRow;
            if (param->bEnableHME)
            {
                lastRow = true;
                for (int cuY = m_lookahead.m_4x4Height - 1; cuY >= 0; cuY--)
                {
                    for (int cuX = m_lookahead.m_4x4Width - 1; cuX >= 0; cuX--)
                        estimateCUCost(tld, cuX, cuY, p0, p1, b, bDoSearch, lastRow, -1, 1);
                    lastRow = false;
                }
            }
            lastRow = true;
            for (int cuY = m_lookahead.m_8x8Height - 1; cuY >= 0; cuY--)
            {
                fenc->rowSatds[b - p0][p1 - b][cuY] = 0;

                for (int cuX = m_lookahead.m_8x8Width - 1; cuX >= 0; cuX--)
                    estimateCUCost(tld, cuX, cuY, p0, p1, b, bDoSearch, lastRow, -1, 0);

                lastRow = false;
            }
        }

        score = fenc->costEst[b - p0][p1 - b];

        if (b != p1)
            score = score * 100 / (130 + param->bFrameBias);

        fenc->costEst[b - p0][p1 - b] = score;
    }

    if (bIntraPenalty)
        // arbitrary penalty for I-blocks after B-frames
        score += score * fenc->intraMbs[b - p0] / (tld.ncu * 8);

    return score;
}
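Two adjustments at the tail of estimateFrameCost() are easy to overlook: a B-frame score (b != p1) is scaled by 100 / (130 + bFrameBias) so the bFrameBias option can favour or penalise B placement, and when bIntraPenalty is set the score is inflated in proportion to the fraction of intra CUs. The sketch below isolates just this arithmetic; the function name and the sample numbers are illustrative, not x265 API.

#include <cstdint>
#include <cstdio>

// Sketch of the final score adjustments in estimateFrameCost() (illustrative only).
int64_t adjustScore(int64_t score, bool isBFrame, int bFrameBias,
                    bool bIntraPenalty, int intraMbs, int ncu)
{
    if (isBFrame)        // b != p1 in the real code
        score = score * 100 / (130 + bFrameBias);
    if (bIntraPenalty)   // arbitrary penalty for I-blocks after B-frames
        score += score * intraMbs / (ncu * 8);
    return score;
}

int main()
{
    // assumed numbers: SATD 12000, 100 intra CUs out of 500 lowres CUs, no bias
    printf("%lld\n", (long long)adjustScore(12000, true, 0, true, 100, 500));
    return 0;
}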

10.低分辨率单个CU帧间估计CostEstimateGroup::estimateCUCost

用于估算一个Coding Unit(CU)的成本

void CostEstimateGroup::estimateCUCost(LookaheadTLD& tld, int cuX, int cuY, int p0, int p1, int b, bool bDoSearch[2], bool lastRow, int slice, bool hme)
{
    Lowres *fref0 = m_frames[p0];
    Lowres *fref1 = m_frames[p1];
    Lowres *fenc  = m_frames[b];

    ReferencePlanes *wfref0 = fenc->weightedRef[b - p0].isWeighted && !hme ? &fenc->weightedRef[b - p0] : fref0;
    //根据帧的宽度和高度,确定CU在帧中的位置,计算CU的大小、像素偏移量等参数
    const int widthInCU = hme ? m_lookahead.m_4x4Width : m_lookahead.m_8x8Width;
    const int heightInCU = hme ? m_lookahead.m_4x4Height : m_lookahead.m_8x8Height;
    const int bBidir = (b < p1);
    const int cuXY = cuX + cuY * widthInCU;
    const int cuXY_4x4 = (cuX / 2) + (cuY / 2) * widthInCU / 2;
    const int cuSize = X265_LOWRES_CU_SIZE;
    const intptr_t pelOffset = cuSize * cuX + cuSize * cuY * (hme ? fenc->lumaStride/2 : fenc->lumaStride);

    if ((bBidir || bDoSearch[0] || bDoSearch[1]) && hme)
        tld.me.setSourcePU(fenc->lowerResPlane[0], fenc->lumaStride / 2, pelOffset, cuSize, cuSize, X265_HEX_SEARCH, m_lookahead.m_param->hmeSearchMethod[0], m_lookahead.m_param->hmeSearchMethod[1], 1);
    else if((bBidir || bDoSearch[0] || bDoSearch[1]) && !hme)
        tld.me.setSourcePU(fenc->lowresPlane[0], fenc->lumaStride, pelOffset, cuSize, cuSize, X265_HEX_SEARCH, m_lookahead.m_param->hmeSearchMethod[0], m_lookahead.m_param->hmeSearchMethod[1], 1);

    //设置一个小的偏置值lowresPenalty,用于避免由于零残差的预测块导致VBV(Video Buffering Verifier)问题
    /* A small, arbitrary bias to avoid VBV problems caused by zero-residual lookahead blocks. */
    int lowresPenalty = 4;
    int listDist[2] = { b - p0, p1 - b};

    MV mvmin, mvmax;
    int bcost = tld.me.COST_MAX;
    int listused = 0;

    // TODO: restrict to slices boundaries
    // establish search bounds that don't cross extended frame boundaries
    mvmin.x = (int32_t)(-cuX * cuSize - 8);
    mvmin.y = (int32_t)(-cuY * cuSize - 8);
    mvmax.x = (int32_t)((widthInCU - cuX - 1) * cuSize + 8);
    mvmax.y = (int32_t)((heightInCU - cuY - 1) * cuSize + 8);
    //对每个参考列表(单向或双向)进行运动估计和成本计算
    for (int i = 0; i < 1 + bBidir; i++)
    {
        int& fencCost = hme ? fenc->lowerResMvCosts[i][listDist[i]][cuXY] : fenc->lowresMvCosts[i][listDist[i]][cuXY];
        int skipCost = INT_MAX;

        if (!bDoSearch[i])
        {
            COPY2_IF_LT(bcost, fencCost, listused, i + 1);
            continue;
        }

        int numc = 0;
        MV mvc[5], mvp;
        MV* fencMV = hme ? &fenc->lowerResMvs[i][listDist[i]][cuXY] : &fenc->lowresMvs[i][listDist[i]][cuXY];
        ReferencePlanes* fref = i ? fref1 : wfref0;
        //根据特定的条件填充了数组 mvc,将运动矢量存储其中
        /* Reverse-order MV prediction */
#define MVC(mv) mvc[numc++] = mv;
        if (cuX < widthInCU - 1)
            MVC(fencMV[1]);
        if (!lastRow)
        {
            MVC(fencMV[widthInCU]);
            if (cuX > 0)
                MVC(fencMV[widthInCU - 1]);
            if (cuX < widthInCU - 1)
                MVC(fencMV[widthInCU + 1]);
        }
        if (fenc->lowerResMvs[0][0] && !hme && fenc->lowerResMvCosts[i][listDist[i]][cuXY_4x4] > 0)
        {
            MVC((fenc->lowerResMvs[i][listDist[i]][cuXY_4x4]) * 2);
        }
#undef MVC

        if (!numc)
            mvp = 0;
        else
        {
            ALIGN_VAR_32(pixel, subpelbuf[X265_LOWRES_CU_SIZE * X265_LOWRES_CU_SIZE]);
            int mvpcost = MotionEstimate::COST_MAX;

            /* measure SATD cost of each neighbor MV (estimating merge analysis)
             * and use the lowest cost MV as MVP (estimating AMVP). Since all
             * mvc[] candidates are measured here, none are passed to motionEstimate */
            for (int idx = 0; idx < numc; idx++)
            {
                intptr_t stride = X265_LOWRES_CU_SIZE;
                pixel *src = fref->lowresMC(pelOffset, mvc[idx], subpelbuf, stride, hme);
                int cost = tld.me.bufSATD(src, stride);
                COPY2_IF_LT(mvpcost, cost, mvp, mvc[idx]);
                /* Except for mv0 case, everything else is likely to have enough residual to not trigger the skip. */
                if (!mvp.notZero() && bBidir)
                    skipCost = cost;
            }
        }

        int searchRange = m_lookahead.m_param->bEnableHME ? (hme ? m_lookahead.m_param->hmeRange[0] : m_lookahead.m_param->hmeRange[1]) : s_merange;
        /* ME will never return a cost larger than the cost @MVP, so we do not
         * have to check that ME cost is more than the estimated merge cost */
        if(!hme)//使用运动估计技术计算了 fencCost
            fencCost = tld.me.motionEstimate(fref, mvmin, mvmax, mvp, 0, NULL, searchRange, *fencMV, m_lookahead.m_param->maxSlices);
        else
            fencCost = tld.me.motionEstimate(fref, mvmin, mvmax, mvp, 0, NULL, searchRange, *fencMV, m_lookahead.m_param->maxSlices, fref->lowerResPlane[0]);
        if (skipCost < 64 && skipCost < fencCost && bBidir)
        {
            fencCost = skipCost;
            *fencMV = 0;
        } // COPY2_IF_LT copies fencCost into bcost only if it is smaller, and records which list was used
        COPY2_IF_LT(bcost, fencCost, listused, i + 1);
    }
    if (hme)
        return;
    // if bBidir is true (B frame), also evaluate the bidirectional candidates below; otherwise (P frame) compare the best inter cost against the intra cost
    if (bBidir) /* B, also consider bidir */
    {
        /* NOTE: the wfref0 (weightp) is not used for BIDIR */
        //调用 fref0->lowresMC 和 fref1->lowresMC 函数,对参考帧进行亚像素运动补偿,得到两个亚像素平面 src0 和 src1
        /* avg(l0-mv, l1-mv) candidate */
        ALIGN_VAR_32(pixel, subpelbuf0[X265_LOWRES_CU_SIZE * X265_LOWRES_CU_SIZE]);
        ALIGN_VAR_32(pixel, subpelbuf1[X265_LOWRES_CU_SIZE * X265_LOWRES_CU_SIZE]);
        intptr_t stride0 = X265_LOWRES_CU_SIZE, stride1 = X265_LOWRES_CU_SIZE;
        pixel *src0 = fref0->lowresMC(pelOffset, fenc->lowresMvs[0][listDist[0]][cuXY], subpelbuf0, stride0, 0);
        pixel *src1 = fref1->lowresMC(pelOffset, fenc->lowresMvs[1][listDist[1]][cuXY], subpelbuf1, stride1, 0);
        //创建用于存储像素平均值的缓冲区 ref
        ALIGN_VAR_32(pixel, ref[X265_LOWRES_CU_SIZE * X265_LOWRES_CU_SIZE]);
        //使用像素平均值函数
        primitives.pu[LUMA_8x8].pixelavg_pp[NONALIGNED](ref, X265_LOWRES_CU_SIZE, src0, stride0, src1, stride1, 32);
        //计算 ref的 SATD
        int bicost = tld.me.bufSATD(ref, X265_LOWRES_CU_SIZE);
        COPY2_IF_LT(bcost, bicost, listused, 3);
        /* coloc candidate */
        //再次使用像素平均值函数,将 fref0->lowresPlane[0] 和 fref1->lowresPlane[0] 的像素平均值存储到 ref 缓冲区中
        src0 = fref0->lowresPlane[0] + pelOffset;
        src1 = fref1->lowresPlane[0] + pelOffset;
        primitives.pu[LUMA_8x8].pixelavg_pp[NONALIGNED](ref, X265_LOWRES_CU_SIZE, src0, fref0->lumaStride, src1, fref1->lumaStride, 32);
        bicost = tld.me.bufSATD(ref, X265_LOWRES_CU_SIZE);
        COPY2_IF_LT(bcost, bicost, listused, 3);
        bcost += lowresPenalty;
    }
    else /* P, also consider intra */
    {
        bcost += lowresPenalty;

        if (fenc->intraCost[cuXY] < bcost)
        {
            bcost = fenc->intraCost[cuXY];
            listused = 0;
        }
    }
    // bFrameScoreCU is true only for interior (non-edge) blocks, or when the frame is tiny; edge blocks are excluded from the frame cost totals because their estimates are less accurate
    /* do not include edge blocks in the frame cost estimates, they are not very accurate */
    const bool bFrameScoreCU = (cuX > 0 && cuX < widthInCU - 1 &&
                                cuY > 0 && cuY < heightInCU - 1) || widthInCU <= 2 || heightInCU <= 2;
    int bcostAq;
    if (m_lookahead.m_param->rc.qgSize == 8)
        bcostAq = (bFrameScoreCU && fenc->invQscaleFactor) ? ((bcost * fenc->invQscaleFactor8x8[cuXY] + 128) >> 8) : bcost;
    else
        bcostAq = (bFrameScoreCU && fenc->invQscaleFactor) ? ((bcost * fenc->invQscaleFactor[cuXY] +128) >> 8) : bcost;

    if (bFrameScoreCU)
    {   //具体的更新根据当前是整个帧还是分片进行不同的处理
        if (slice < 0)//如果 slice 小于零,表示当前处理的是整个帧(不是分片)
        {
            fenc->costEst[b - p0][p1 - b] += bcost;
            fenc->costEstAq[b - p0][p1 - b] += bcostAq;
            if (!listused && !bBidir)
                fenc->intraMbs[b - p0]++;
        }
        else
        {
            m_slice[slice].costEst += bcost;
            m_slice[slice].costEstAq += bcostAq;
            if (!listused && !bBidir)
                m_slice[slice].intraMbs++;
        }
    }

    fenc->rowSatds[b - p0][p1 - b][cuY] += bcostAq;
    fenc->lowresCosts[b - p0][p1 - b][cuXY] = (uint16_t)(X265_MIN(bcost, LOWRES_COST_MASK) | (listused << LOWRES_COST_SHIFT));
}
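The last two statements above pack the per-CU result: the AQ-weighted cost is a Q8 fixed-point multiply with rounding, and lowresCosts stores the clamped cost together with the chosen reference-list flags in a single uint16_t. The sketch below reproduces that packing under the assumption of the usual x265 layout (14 cost bits plus 2 list bits); the sample values are made up.

#include <algorithm>
#include <cstdint>
#include <cstdio>

// Assumed layout (matching the usual x265 definitions): 14 cost bits + 2 list bits.
static const int LOWRES_COST_MASK  = (1 << 14) - 1;
static const int LOWRES_COST_SHIFT = 14;

int main()
{
    int bcost = 5321, listused = 1;   // e.g. list0 was chosen
    int invQscaleFactor = 300;        // Q8 AQ factor (illustrative)

    // AQ-weighted cost: integer cost * Q8 factor, rounded back to integer
    int bcostAq = (bcost * invQscaleFactor + 128) >> 8;

    // pack cost + list flags into one uint16_t, as lowresCosts[][][] does
    uint16_t packed = (uint16_t)(std::min(bcost, LOWRES_COST_MASK) | (listused << LOWRES_COST_SHIFT));

    printf("bcostAq=%d cost=%d list=%d\n", bcostAq,
           packed & LOWRES_COST_MASK, packed >> LOWRES_COST_SHIFT);
    return 0;
}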

11.VBV码率和缓冲区Lookahead::vbvLookahead

VBV预测用于估计视频编码过程中的码率和缓冲区占用情况,以便进行码率控制和缓冲区管理。

void Lookahead::vbvLookahead(Lowres **frames, int numFrames, int keyframe)
{
    int prevNonB = 0, curNonB = 1, idx = 0;
    //根据帧类型,确定非B帧(curNonB)和下一个非B帧(nextNonB)的索引
    while (curNonB < numFrames && IS_X265_TYPE_B(frames[curNonB]->sliceType))
        curNonB++;
    int nextNonB = keyframe ? prevNonB : curNonB;
    int nextB = prevNonB + 1;
    int nextBRef = 0, curBRef = 0;
    if (m_param->bBPyramid && curNonB - prevNonB > 1)
        curBRef = (prevNonB + curNonB + 1) / 2;
    int miniGopEnd = keyframe ? prevNonB : curNonB;
    //遍历帧数组中的每个非B帧(curNonB)
    while (curNonB <= numFrames)
    {   //对于P帧或I帧,计算其与下一个非B帧之间的预测代价(plannedSatd)和帧类型(plannedType)
        /* P/I cost: This shouldn't include the cost of nextNonB */
        if (nextNonB != curNonB)
        {
            int p0 = IS_X265_TYPE_I(frames[curNonB]->sliceType) ? curNonB : prevNonB;
            frames[nextNonB]->plannedSatd[idx] = vbvFrameCost(frames, p0, curNonB, curNonB);
            frames[nextNonB]->plannedType[idx] = frames[curNonB]->sliceType;

            /* Save the nextNonB Cost in each B frame of the current miniGop */
            if (curNonB > miniGopEnd)
            {
                for (int j = nextB; j < miniGopEnd; j++)
                {
                    frames[j]->plannedSatd[frames[j]->indB] = frames[nextNonB]->plannedSatd[idx];
                    frames[j]->plannedType[frames[j]->indB++] = frames[nextNonB]->plannedType[idx];
                }
            }
            idx++;
        }
        
        /* Handle the B-frames: coded order */
        if (m_param->bBPyramid && curNonB - prevNonB > 1)
            nextBRef = (prevNonB + curNonB + 1) / 2;

        for (int i = prevNonB + 1; i < curNonB; i++, idx++)
        {
            int64_t satdCost = 0;
            int type = X265_TYPE_B;
            //如果当前非B帧之后还有B帧(curNonB - prevNonB > 1),计算B帧的预测代价和帧类型
            if (nextBRef)
            {
                if (i == nextBRef)
                {
                    satdCost = vbvFrameCost(frames, prevNonB, curNonB, nextBRef);
                    type = X265_TYPE_BREF;
                }
                else if (i < nextBRef)
                    satdCost = vbvFrameCost(frames, prevNonB, nextBRef, i);
                else
                    satdCost = vbvFrameCost(frames, nextBRef, curNonB, i);
            }
            else
                satdCost = vbvFrameCost(frames, prevNonB, curNonB, i);
            //将计算得到的预测代价和帧类型存储在下一个非B帧(nextNonB)的相应数组中
            frames[nextNonB]->plannedSatd[idx] = satdCost;
            frames[nextNonB]->plannedType[idx] = type;
            /* Save the nextB Cost in each B frame of the current miniGop */
            //根据具体情况,将预测代价和帧类型保存在当前miniGop中的每个B帧中

            for (int j = nextB; j < miniGopEnd; j++)
            {
                if (curBRef && curBRef == i)
                    break;
                if (j >= i && j !=nextBRef)
                    continue;
                frames[j]->plannedSatd[frames[j]->indB] = satdCost;
                frames[j]->plannedType[frames[j]->indB++] = type;
            }
        }
        //更新索引和计数器,继续下一个非B帧的处理,直到遍历完所有帧。
        prevNonB = curNonB;
        curNonB++;
        while (curNonB <= numFrames && IS_X265_TYPE_B(frames[curNonB]->sliceType))
            curNonB++;
    }
    //设置最后一个非B帧(nextNonB)的帧类型为自动X265_TYPE_AUTO
    frames[nextNonB]->plannedType[idx] = X265_TYPE_AUTO;
}
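When bBPyramid is enabled, both vbvLookahead() and cuTree() place the B-reference of a mini-GOP at the rounded midpoint (prevNonB + curNonB + 1) / 2 between the two surrounding non-B frames. The tiny sketch below only prints which index that rule picks for a few mini-GOP sizes (purely illustrative).

#include <cstdio>

// Midpoint rule used for the B-ref when bBPyramid is enabled (sketch).
int main()
{
    int prevNonB = 0;
    for (int curNonB = 2; curNonB <= 5; curNonB++)
    {
        if (curNonB - prevNonB > 1) // at least one B frame in between
        {
            int bref = (prevNonB + curNonB + 1) / 2;
            printf("mini-GOP [%d..%d] -> BREF at %d\n", prevNonB, curNonB, bref);
        }
    }
    return 0;
}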

12.场景切换检测Lookahead::scenecut

该函数用于检测场景切换,并返回是否发生了真正的场景切换,代码如下:

bool Lookahead::scenecut(Lowres **frames, int p0, int p1, bool bRealScenecut, int numFrames)
{
    /* Only do analysis during a normal scenecut check. */
    if (bRealScenecut && m_param->bframes)
    {
        int origmaxp1 = p0 + 1;
        /* Look ahead to avoid coding short flashes as scenecuts. */
        origmaxp1 += m_param->bframes;
        int maxp1 = X265_MIN(origmaxp1, numFrames);
        bool fluctuate = false;
        bool noScenecuts = false;
        int64_t avgSatdCost = 0;
        if (frames[p0]->costEst[p1 - p0][0] > -1)
            avgSatdCost = frames[p0]->costEst[p1 - p0][0];
        int cnt = 1;
        /* Where A and B are scenes: AAAAAABBBAAAAAA
         * If BBB is shorter than (maxp1-p0), it is detected as a flash
         * and not considered a scenecut. */
            // a short flash of this kind must not be mistaken for a real scenecut
        for (int cp1 = p1; cp1 <= maxp1; cp1++)
        {
            if (!scenecutInternal(frames, p0, cp1, false))
            {
                /* Any frame in between p0 and cur_p1 cannot be a real scenecut. */
                for (int i = cp1; i > p0; i--)
                {
                    frames[i]->bScenecut = false;
                    noScenecuts = false;
                }
            }
            else if (scenecutInternal(frames, cp1 - 1, cp1, false))
            {   //判断前一帧与当前帧是否也是场景切换帧
                /* If current frame is a Scenecut from p0 frame as well as Scenecut from
                 * preceding frame, mark it as a Scenecut */
                frames[cp1]->bScenecut = true;
                noScenecuts = true;
            }

            /* compute average satdcost of all the frames in the mini-gop to confirm 
             * whether there is any great fluctuation among them to rule out false positives */
            X265_CHECK(frames[cp1]->costEst[cp1 - p0][0]!= -1, "costEst is not done \n");
            avgSatdCost += frames[cp1]->costEst[cp1 - p0][0];
            cnt++;
        }

        /* Identify possible scene fluctuations by comparing the satd cost of the frames.
         * This could denote the beginning or ending of scene transitions.
         * During a scene transition(fade in/fade outs), if fluctuate remains false,
         * then the scene had completed its transition or stabilized */
        if (noScenecuts)
        {
            fluctuate = false;
            avgSatdCost /= cnt;
            for (int i = p1; i <= maxp1; i++)
            {
                int64_t curCost  = frames[i]->costEst[i - p0][0];
                int64_t prevCost = frames[i - 1]->costEst[i - 1 - p0][0];
                if (fabs((double)(curCost - avgSatdCost)) > 0.1 * avgSatdCost || 
                    fabs((double)(curCost - prevCost)) > 0.1 * prevCost) // set fluctuate when this frame's SATD cost deviates by more than 10% from the mini-GOP average or from the previous frame's cost
                {
                    fluctuate = true;
                    if (!m_isSceneTransition && frames[i]->bScenecut)
                    {
                        m_isSceneTransition = true;//只需要检测到第一个场景切换帧即可
                        /* just mark the first scenechange in the scene transition as a scenecut. */
                        for (int j = i + 1; j <= maxp1; j++)
                            frames[j]->bScenecut = false;
                        break;
                    }
                }
                frames[i]->bScenecut = false;
            }
        }
        if (!fluctuate && !noScenecuts)
            m_isSceneTransition = false; /* Signal end of scene transitioning */
    }

    if (m_param->csvLogLevel >= 2)
    {
        int64_t icost = frames[p1]->costEst[0][0];
        int64_t pcost = frames[p1]->costEst[p1 - p0][0];
        frames[p1]->ipCostRatio = (double)icost / pcost;
    }

    /* A frame is always analysed with bRealScenecut = true first, and then bRealScenecut = false,
       the former for I decisions and the latter for P/B decisions. It's possible that the first 
       analysis detected scenecuts which were later nulled due to scene transitioning, in which 
       case do not return a true scenecut for this frame */

    if (!frames[p1]->bScenecut)
        return false;
    // finally, return whether p1 is a real scenecut relative to p0
    return scenecutInternal(frames, p0, p1, bRealScenecut);
}
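The fluctuation test above treats a frame as part of a fade or scene transition when its SATD cost drifts more than 10% away from the mini-GOP average or from the previous frame. The stripped-down sketch below illustrates that rule with invented cost sequences; it is not the x265 function itself.

#include <cmath>
#include <cstdint>
#include <cstdio>
#include <vector>

// Sketch of the 10% fluctuation test used to spot fades / scene transitions.
bool fluctuates(const std::vector<int64_t>& cost)
{
    int64_t avg = 0;
    for (int64_t c : cost)
        avg += c;
    avg /= (int64_t)cost.size();

    for (size_t i = 1; i < cost.size(); i++)
    {
        if (std::fabs((double)(cost[i] - avg)) > 0.1 * avg ||
            std::fabs((double)(cost[i] - cost[i - 1])) > 0.1 * cost[i - 1])
            return true;
    }
    return false;
}

int main()
{
    std::vector<int64_t> fade = { 1000, 1150, 1330, 1540 }; // steadily rising cost
    std::vector<int64_t> flat = { 1000, 1020, 990, 1010 };  // stable cost
    printf("fade fluctuates: %d, flat fluctuates: %d\n", fluctuates(fade), fluctuates(flat));
    return 0;
}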

13.帧结构路径成本计算Lookahead::slicetypePathCost

实现了X265_B_ADAPT_TRELLIS帧结构的方案,代码如下:

int64_t Lookahead::slicetypePathCost(Lowres **frames, char *path, int64_t threshold)
{
    int64_t cost = 0;
    int loc = 1;//初始化变量 loc 为 1,表示路径的索引位置,从第一个路径元素开始
    int cur_p = 0;//初始化变量 cur_p 为 0,表示当前p帧的索引位置

    CostEstimateGroup estGroup(*this, frames);
    //将路径指针 path 减1,这是因为第一个路径元素实际上是第二帧
    path--; /* Since the 1st path element is really the second frame */
    while (path[loc])//在循环中,遍历路径元素,直到遇到空字符结束循环
    {
        int next_p = loc;
        /* Find the location of the next P-frame. */
        while (path[next_p] != 'P')
            next_p++;
        //根据找到的下一个P帧位置,计算该帧的代价,并将其添加到总代价 cost 中
        /* Add the cost of the P-frame found above */
        cost += estGroup.singleCost(cur_p, next_p, next_p);

        /* Early terminate if the cost we have found is larger than the best path cost so far */
        if (cost > threshold)
            break;
        //如果启用了B帧金字塔(B-frame pyramid)且下一个P帧与当前P帧的间隔大于2,则进行特殊处理
        if (m_param->bBPyramid && next_p - cur_p > 2)
        {
            int middle = cur_p + (next_p - cur_p) / 2;
            cost += estGroup.singleCost(cur_p, next_p, middle);

            for (int next_b = loc; next_b < middle && cost < threshold; next_b++)
                cost += estGroup.singleCost(cur_p, middle, next_b);

            for (int next_b = middle + 1; next_b < next_p && cost < threshold; next_b++)
                cost += estGroup.singleCost(middle, next_p, next_b);
        }
        else//如果未启用B帧金字塔或间隔小于等于2,则遍历当前P帧和下一个P帧之间的每一帧,计算其代价并添加到总代价 cost 中
        {
            for (int next_b = loc; next_b < next_p && cost < threshold; next_b++)
                cost += estGroup.singleCost(cur_p, next_p, next_b);
        }

        loc = next_p + 1;
        cur_p = next_p;
    }

    return cost;
}
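slicetypePathCost() is driven by slicetypePath(), which for every window length keeps the cheapest 'B'/'P' decision string found so far. The hedged sketch below only shows how such candidate paths are formed: each one reuses the best shorter path as a prefix, appends k consecutive 'B' frames and closes with a 'P'. Costing and selection are omitted, and the prefix strings in main() are assumed for the example.

#include <algorithm>
#include <cstdio>
#include <string>
#include <vector>

// Sketch: build the candidate decision strings examined for one window length.
std::vector<std::string> candidatePaths(const std::vector<std::string>& bestShorter,
                                        int length, int bframes)
{
    std::vector<std::string> paths;
    int numPaths = std::min(bframes + 1, length);
    for (int k = 0; k < numPaths; k++)
    {
        int prefixLen = length - (k + 1);
        std::string p = bestShorter[prefixLen]; // best decision for the first prefixLen frames (assumed)
        p.append(k, 'B');                       // k consecutive B frames
        p.push_back('P');                       // closed by a P frame
        paths.push_back(p);
    }
    return paths;
}

int main()
{
    // assumed best paths for window lengths 0..3 (purely illustrative)
    std::vector<std::string> best = { "", "P", "BP", "BBP" };
    for (const std::string& p : candidatePaths(best, 4, 3))
        printf("%s\n", p.c_str()); // BBPP, BPBP, PBBP, BBBP
    return 0;
}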

14.CU tree的构建和处理Lookahead::cuTree

Builds the CU tree over the lookahead window: mini-GOP by mini-GOP it runs lowres cost estimation in reverse order and propagates cost information from frames back to the frames they reference (the counterpart of x264's MB tree). The code is as follows:

//对给定的帧数组进行CU树的构建和处理
void Lookahead::cuTree(Lowres **frames, int numframes, bool bIntra)
{
    int idx = !bIntra;
    int lastnonb, curnonb = 1;
    int bframes = 0;

    x265_emms();
    double totalDuration = 0.0;
    for (int j = 0; j <= numframes; j++)
        totalDuration += (double)m_param->fpsDenom / m_param->fpsNum;

    double averageDuration = totalDuration / (numframes + 1);

    int i = numframes;

    while (i > 0 && frames[i]->sliceType == X265_TYPE_B)
        i--;

    lastnonb = i;

    /* Lookaheadless MB-tree is not a theoretically distinct case; the same extrapolation could
     * be applied to the end of a lookahead buffer of any size.  However, it's most needed when
     * lookahead=0, so that's what's currently implemented. */
    if (!m_param->lookaheadDepth)
    {
        if (bIntra)
        {   //如果没有启用前向预测(lookaheadDepth为0),则根据帧类型进行处理,设置传播代价(propagateCost)和QP偏移
            memset(frames[0]->propagateCost, 0, m_cuCount * sizeof(uint16_t));
            if (m_param->rc.qgSize == 8)
                memcpy(frames[0]->qpCuTreeOffset, frames[0]->qpAqOffset, m_cuCount * 4 * sizeof(double));
            else
                memcpy(frames[0]->qpCuTreeOffset, frames[0]->qpAqOffset, m_cuCount * sizeof(double));
            return;
        }
        std::swap(frames[lastnonb]->propagateCost, frames[0]->propagateCost);
        memset(frames[0]->propagateCost, 0, m_cuCount * sizeof(uint16_t));
    }
    else
    {
        if (lastnonb < idx)
            return;
        memset(frames[lastnonb]->propagateCost, 0, m_cuCount * sizeof(uint16_t));
    }

    CostEstimateGroup estGroup(*this, frames);

    while (i-- > idx)
    {   //从最后一个非B帧开始,向前遍历帧序列
        curnonb = i;
        while (frames[curnonb]->sliceType == X265_TYPE_B && curnonb > 0)
            curnonb--;

        if (curnonb < idx)
            break;

        estGroup.singleCost(curnonb, lastnonb, lastnonb);

        memset(frames[curnonb]->propagateCost, 0, m_cuCount * sizeof(uint16_t));
        bframes = lastnonb - curnonb - 1;
        if (m_param->bBPyramid && bframes > 1)
        {
            int middle = (bframes + 1) / 2 + curnonb;
            estGroup.singleCost(curnonb, lastnonb, middle);
            memset(frames[middle]->propagateCost, 0, m_cuCount * sizeof(uint16_t));
            while (i > curnonb)
            {
                int p0 = i > middle ? middle : curnonb;
                int p1 = i < middle ? middle : lastnonb;
                if (i != middle)
                {   //从当前帧向前遍历,计算每一帧与参考帧之间的帧类型成本,并进行CU tree 遗传信息的传递操作
                    estGroup.singleCost(p0, p1, i);
                    estimateCUPropagate(frames, averageDuration, p0, p1, i, 0);
                }
                i--;
            }

            estimateCUPropagate(frames, averageDuration, curnonb, lastnonb, middle, 1);
        }
        else
        {
            while (i > curnonb)
            {   //向前遍历,计算所有帧的cost
                estGroup.singleCost(curnonb, lastnonb, i);
                estimateCUPropagate(frames, averageDuration, curnonb, lastnonb, i, 0);
                i--;
            }
        }
        estimateCUPropagate(frames, averageDuration, curnonb, lastnonb, lastnonb, 1);
        lastnonb = curnonb;
    }

    if (!m_param->lookaheadDepth)
    {
        estGroup.singleCost(0, lastnonb, lastnonb);
        estimateCUPropagate(frames, averageDuration, 0, lastnonb, lastnonb, 1);
        std::swap(frames[lastnonb]->propagateCost, frames[0]->propagateCost);
    }
    //在所有的帧类型成本计算和宏块树的传递操作完成后,进行宏块树的最终处理,并输出结果
    cuTreeFinish(frames[lastnonb], averageDuration, lastnonb);
    if (m_param->bBPyramid && bframes > 1 && !m_param->rc.vbvBufferSize)
        cuTreeFinish(frames[lastnonb + (bframes + 1) / 2], averageDuration, 0);
}
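After propagation, cuTreeFinish() converts each CU's accumulated propagateCost into a QP offset. Assuming it follows the same log2-ratio form as x264's MB-tree (stated here as an assumption; the strength factor and the weight-delta correction are simplified away), a minimal sketch looks like this:

#include <cmath>
#include <cstdio>

// Hedged sketch of the MB-tree / CU-tree style QP offset:
// the more future cost a CU feeds into other frames (propagateCost), the lower its QP.
double cuTreeQpOffset(double qpAqOffset, double intraCost, double propagateCost,
                      double strength /* roughly derived from qcomp in the encoder */)
{
    double log2Ratio = std::log2(intraCost + propagateCost) - std::log2(intraCost);
    return qpAqOffset - strength * log2Ratio;
}

int main()
{
    // illustrative numbers: a heavily referenced CU vs. one that is never referenced
    printf("referenced:   %.2f\n", cuTreeQpOffset(0.0, 1000.0, 3000.0, 2.0));
    printf("unreferenced: %.2f\n", cuTreeQpOffset(0.0, 1000.0, 0.0, 2.0));
    return 0;
}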

点赞、收藏,会是我继续写作的动力!赠人玫瑰,手有余香。

Open Source (GPL) H.265/HEVC video encoder, download URL: https://bitbucket.org/multicoreware/x265/src (x265 developer wiki).

To compile x265 you must first install Mercurial (or TortoiseHg on Windows) and CMake; for the most definitive instructions, consult the build README.

Linux instructions:

# ubuntu packages (if the packaged yasm is older than 1.2, download and build yasm-1.2):
$ sudo apt-get install mercurial cmake cmake-curses-gui build-essential yasm
$ hg clone https://bitbucket.org/multicoreware/x265
$ cd x265/build/linux
$ ./make-Makefiles.bash
$ make

Windows (Visual Studio) instructions:

$ hg clone https://bitbucket.org/multicoreware/x265

Then run make-solutions.bat in the build\ folder that corresponds to your favorite compiler, configure your build options, click 'configure', click 'generate', then close cmake-gui. You will be rewarded with an x265.sln file. Also see the cmake documentation.

Intel Compiler instructions:

On Windows, open an Intel Compiler command prompt and run one of the make-makefiles.bat scripts in build/icl32 or build/icl64, then run nmake. On Linux, you can tell cmake to build Makefiles for icpc directly; this requires Intel's compiler environment to be configured (by sourcing the appropriate shell script). For example:

$ source /opt/intel/composer_xe_2013/bin/compilervars.sh intel64
$ cd repos/x265/build/linux
$ export CXX=icpc
$ export CC=icc
$ ./make-Makefiles
$ make

Command line interface: the Makefile/solution builds a static encoder.lib library and a standalone x265 executable that aims to be similar to x264 in its command line interface. Running it without arguments shows the command line help.

Other wiki pages: Mission Statement, Road Map, TODO, HOWTO add a new encoder performance primitive, HOWTO contribute patches to x265, HOWTO cross compile from Linux to Windows, Coding Style, Helpful links.