Series index
【x264 encoder】Chapter 1 — x264 encoding flow and an encoder demo based on x264
【x264 encoder】Chapter 2 — Analysis of the x264 lookahead flow
【x265 encoder】Chapter 2 — Encoding flow and an encoder demo based on x265
Contents
1. Adding a picture to the lookahead module: Lookahead::addPicture()
2. Checking the lookahead queue: Lookahead::checkLookaheadQueue()
3. Fetching a decided picture: Lookahead::getDecidedPicture()
4. Finding and running work: Lookahead::findJob()
5. Slice-type decision: Lookahead::slicetypeDecide()
6. Task dispatch: PreLookaheadGroup::processTasks
7. Lowres intra estimation: LookaheadTLD::lowresIntraEstimate()
8. Slice-type analysis: Lookahead::slicetypeAnalyse
9. Lowres frame cost estimation: CostEstimateGroup::estimateFrameCost
10. Lowres per-CU inter estimation: CostEstimateGroup::estimateCUCost
11. VBV rate and buffer: Lookahead::vbvLookahead
13. Frame-structure path cost: Lookahead::slicetypePathCost
14. Building and processing the CU tree: Lookahead::cuTree
Preface
The complete x265 pipeline is shown below:
I. Module functionality
In x265, lookahead is a technique for improving encoding efficiency and quality: it analyses future frames so that better decisions can be made while encoding the current frame. Like the main encode, it performs both intra and inter prediction, with two differences:
1. It works at quarter resolution (half the original width and height); as in x264, the lookahead CU size is 8x8, and both intra and inter prediction are performed;
2. During inter prediction, CUs are traversed bottom-to-top and right-to-left; see estimateFrameCost() for details.
The module has four main responsibilities: scene-cut detection, frame-structure decision, CU tree, and VBV.
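To make the lowres geometry concrete, here is a small sketch of the sizing described above (the rounding is illustrative; `lowresDims` is a helper written for this post, not x265 API):

```cpp
#include <cassert>

// Illustrative only: derive the lowres picture size (half width, half height,
// i.e. one quarter of the pixels) and the resulting grid of 8x8 lowres CUs.
struct LowresDims { int width, height, widthInCU, heightInCU; };

LowresDims lowresDims(int srcWidth, int srcHeight)
{
    const int cuSize = 8;                 // lowres CU size used by lookahead
    LowresDims d;
    d.width  = srcWidth  / 2;             // half resolution in each dimension
    d.height = srcHeight / 2;
    d.widthInCU  = (d.width  + cuSize - 1) / cuSize;  // round up to whole CUs
    d.heightInCU = (d.height + cuSize - 1) / cuSize;
    return d;
}
```

For a 1920x1080 source this gives a 960x540 lowres plane and a 120x68 grid of 8x8 CUs.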
1. Scene-cut detection
The basic flow is similar to x264's scenecut, with differences in the details: the first-pass search and filtering match the x264 scheme, while the subsequent handling differs. The overall flow of x265 scene-cut detection is shown below; see Lookahead::scenecut for the detailed code analysis:
The intra and inter cost computations used during scene-cut detection are shown below (intra on the left, inter on the right):
2. Frame-structure decision
There are currently three frame-structure schemes: X265_B_ADAPT_NONE, X265_B_ADAPT_FAST and X265_B_ADAPT_TRELLIS. Their overall flows are:
X265_B_ADAPT_NONE: identical to the x264 scheme; frames are laid out in a fixed IBBBPBBBP pattern. See Lookahead::slicetypeAnalyse.
X265_B_ADAPT_FAST: broadly the same as the x264 scheme, with some differences. The first step computes the cost of the BP and PP frame structures and picks the cheaper type; the following steps check whether every frame in the range (i+2, bframes) should be a B. If they are all B frames, a P is appended at the end and becomes the start of the next round, and the expansion repeats forward. See Lookahead::slicetypeAnalyse.
X265_B_ADAPT_TRELLIS: same as the x264 scheme. It keeps the best frame structure found for each previous length; by inserting 0 to bframes B frames it obtains the optimal scheme for the current length, then builds the optimal scheme for length + 1 on top of it, iterating until the whole GOP is processed. See Lookahead::slicetypePath.
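The trellis idea can be sketched as a small dynamic program. Here the hypothetical `segCost` values stand in for the SATD-based path costs that Lookahead::slicetypePath actually evaluates; the function is an illustration, not x265 code:

```cpp
#include <cassert>
#include <vector>
#include <algorithm>
#include <climits>

// Illustrative DP in the spirit of B_ADAPT_TRELLIS.
// best[i] = cheapest way to structure the first i frames, where each segment
// is b consecutive B-frames followed by one P-frame (segment length b + 1).
// segCost[b] stands in for the SATD-based cost of such a segment.
long long bestPathCost(int numFrames, int maxB, const std::vector<long long>& segCost)
{
    std::vector<long long> best(numFrames + 1, LLONG_MAX);
    best[0] = 0;
    for (int i = 1; i <= numFrames; i++)
        for (int b = 0; b <= maxB && b + 1 <= i; b++)
            if (best[i - b - 1] != LLONG_MAX)
                best[i] = std::min(best[i], best[i - b - 1] + segCost[b]);
    return best[numFrames];
}
```

With segment costs {P: 10, BP: 7, BBP: 9}, four frames are cheapest as BP + BP (cost 14): each length reuses the best solutions of shorter lengths, exactly the iteration described above.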
3. CU tree
The CU tree is essentially the same as x264's MB tree. A simple way to describe its purpose: frames reference each other, so if a referenced frame has higher quality, then improving that one frame improves a whole chain of frames. The CU tree therefore measures a frame's importance by how heavily it is referenced, i.e. how much information it propagates to other frames.
Because propagated information accumulates, the CU tree traverses frames in reverse order when computing it. For example, if b references p0, the information propagated to p0 follows this formula:
propagate = (propagate_in + intra_cost * inv_qscale * fps_factor) * (1 - inter_cost / intra_cost) * dist_scale_factor
Propagated information must be positive, so only CUs with inter_cost < intra_cost (i.e. CUs that chose inter prediction) contribute; the smaller inter_cost is relative to intra_cost, the more information is propagated. CUs coded in intra mode propagate 0.
propagate_in: information that frame b, acting as a reference for other frames, has itself received; it is added as a correction when computing p0's propagated information;
dist_scale_factor: a distance-based scale factor correcting the formula for reference distance;
inv_qscale: inverse quantization scale; heavily quantized (blurrier) blocks carry less information, so a correction factor is needed. Quantization divides coefficients by QStep;
fps_factor: for variable frame rates, frames occupy different amounts of display time; a frame shown for longer matters more, so this factor corrects for duration;
The corresponding code is below, called from the Lookahead::estimateCUPropagate path:
/* Estimate the total amount of influence on future quality that could be had if we
* were to improve the reference samples used to inter predict any given CU. */
static void estimateCUPropagateCost(int* dst, const uint16_t* propagateIn, const int32_t* intraCosts, const uint16_t* interCosts,
const int32_t* invQscales, const double* fpsFactor, int len)
{
double fps = *fpsFactor / 256; // range[0.01, 1.00]
for (int i = 0; i < len; i++)
{
int intraCost = intraCosts[i];
int interCost = X265_MIN(intraCosts[i], interCosts[i] & LOWRES_COST_MASK);
double propagateIntra = intraCost * invQscales[i]; // Q16 x Q8.8 = Q24.8
double propagateAmount = (double)propagateIn[i] + propagateIntra * fps; // Q16.0 + Q24.8 x Q0.x = Q25.0
double propagateNum = (double)(intraCost - interCost); // Q32 - Q32 = Q33.0
double propagateDenom = (double)intraCost; // Q32
dst[i] = (int)(propagateAmount * propagateNum / propagateDenom + 0.5);
}
}
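To make the fixed-point bookkeeping concrete, here is a one-CU walk through the same arithmetic with made-up inputs (256 represents 1.0 in the Q8.8 invQscale and fpsFactor domains; `propagateCostOneCU` is a helper written for this post, not an x265 function):

```cpp
#include <cassert>
#include <cstdint>
#include <algorithm>

// One-CU version of the loop above, with illustrative inputs.
int propagateCostOneCU(uint16_t propagateIn, int32_t intraCost,
                       int32_t interCost, int32_t invQscale, double fpsFactor)
{
    double fps = fpsFactor / 256;                        // Q8.8 -> real
    int32_t inter = std::min(intraCost, interCost);      // inter never exceeds intra
    double propagateIntra  = (double)intraCost * invQscale;          // information the CU carries
    double propagateAmount = (double)propagateIn + propagateIntra * fps;
    double num   = (double)(intraCost - inter);          // share that is inherited
    double denom = (double)intraCost;
    return (int)(propagateAmount * num / denom + 0.5);
}
```

With propagateIn = 50, intraCost = 100, interCost = 60, invQscale = 256 and fpsFactor = 256, the CU propagates (50 + 25600) * 0.4 = 10260; an intra-coded CU (interCost >= intraCost) propagates 0, matching the discussion above.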
The QP is then adjusted according to the propagated information. Reconstructing the formula from the code below:
qpCuTreeOffset = qpAqOffset - cuTreeStrength * (log2(intra + propagate) - log2(intra))
where propagate is the information passed on to later frames and intra is the frame's own information. The external qcompress parameter controls the adjustment strength: qcompress = 0 means pure ABR with QP adjusted freely, while qcompress = 1 means a constant quantizer (fixed QP). fpsFactor again handles variable frame rates: the longer the current frame stays on screen, the more it matters.
Corresponding code:
void Lookahead::cuTreeFinish(Lowres *frame, double averageDuration, int ref0Distance)
{ // surrounding code omitted
for (int cuIndex = 0; cuIndex < m_cuCount; cuIndex++)
{ // the CU's intra cost (the information the block itself carries)
int intracost = (frame->intraCost[cuIndex] * frame->invQscaleFactor[cuIndex] + 128) >> 8;
if (intracost)
{ // propagateCost (the information passed on to later frames)
int propagateCost = (frame->propagateCost[cuIndex] * fpsFactor + 128) >> 8;
double log2_ratio = X265_LOG2(intracost + propagateCost) - X265_LOG2(intracost) + weightdelta;
frame->qpCuTreeOffset[cuIndex] = frame->qpAqOffset[cuIndex] - m_cuTreeStrength * log2_ratio;
}
}
}
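A quick numeric check of the adjustment (`qpCuTreeOffset` below is a standalone helper with illustrative inputs; the strength value used in the example is an assumption, not an x265 default):

```cpp
#include <cassert>
#include <cmath>

// Per-CU mirror of the adjustment in cuTreeFinish, on illustrative numbers.
// 256 plays the role of 1.0 in the Q8.8 invQscale / fpsFactor domains.
double qpCuTreeOffset(int intraCost, int invQscale, int propagateCostRaw,
                      int fpsFactor, double qpAqOffset, double strength)
{
    int intra = (intraCost * invQscale + 128) >> 8;            // CU's own information
    if (!intra)
        return qpAqOffset;
    int propagate = (propagateCostRaw * fpsFactor + 128) >> 8; // inherited information
    double log2Ratio = std::log2((double)(intra + propagate)) - std::log2((double)intra);
    return qpAqOffset - strength * log2Ratio;                  // heavily-referenced CUs get lower QP
}
```

With intra information 100 and propagated information 5120, the log2 ratio is about 5.7, so a strength of 2.0 lowers the QP offset by roughly 11.4: CUs that feed many later frames are encoded at higher quality.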
4. VBV
Basically the same as x264: Lookahead::vbvLookahead computes lowres plannedSatd values, which feed the VBV rate control during actual encoding.
5. Overall lookahead flow
Basically the same as x264.
II. Lookahead module analysis
1. Call flow
The lookahead flow is shown in figure 1 below; its relationship to the overall x265 pipeline is shown in figure 2 (the yellow part):
The complete x265 encoding flow:
2. Code analysis
1. Adding a picture to the lookahead module: Lookahead::addPicture()
addPicture(): called by the API thread to add a picture to the lookahead module.
void Lookahead::addPicture(Frame& curFrame, int sliceType)
{ // if analysisLoad is set and the lookahead is disabled (bDisableLookahead), push the picture straight onto the output queue and bump the m_inputCount counter
if (m_param->analysisLoad && m_param->bDisableLookahead)
{
if (!m_filled)
m_filled = true;
m_outputLock.acquire();
m_outputQueue.pushBack(curFrame);
m_outputLock.release();
m_inputCount++;
}
//otherwise call checkLookaheadQueue() to check the input-queue state, then add the picture to the lookahead module
else
{
checkLookaheadQueue(m_inputCount);
curFrame.m_lowres.sliceType = sliceType;
addPicture(curFrame);
}
}
2. Checking the lookahead queue: Lookahead::checkLookaheadQueue()
Checks the state of the lookahead queue. The code is explained below:
void Lookahead::checkLookaheadQueue(int &frameCnt)
{
/* determine if the lookahead is (over) filled enough for frames to begin to
* be consumed by frame encoders */
//if m_filled is false (the lookahead queue is not yet full):
if (!m_filled)
{ // if both bframes and lookaheadDepth are zero we are in zero-latency mode; set m_filled to true (the lookahead queue counts as full)
if (!m_param->bframes & !m_param->lookaheadDepth)
m_filled = true; /* zero-latency */
//otherwise, once the number of frames received (frameCnt) reaches lookaheadDepth + 2 + bframes, set m_filled to true (full capacity plus the mini-GOP lag)
else if (frameCnt >= m_param->lookaheadDepth + 2 + m_param->bframes)
m_filled = true; /* full capacity plus mini-gop lag */
}
m_inputLock.acquire();
//if a thread pool exists (m_pool) and the input queue (m_inputQueue) has reached m_fullQueueSize, try to wake one worker
if (m_pool && m_inputQueue.size() >= m_fullQueueSize)
tryWakeOne();
m_inputLock.release();
}
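The fill condition can be checked in isolation (`lookaheadFilled` is a standalone predicate mirroring the logic above, not x265 API; the parameter values in the example are hypothetical):

```cpp
#include <cassert>

// Illustrative predicate matching checkLookaheadQueue's fill logic.
bool lookaheadFilled(int frameCnt, int bframes, int lookaheadDepth)
{
    if (!bframes && !lookaheadDepth)
        return true;                                   // zero-latency mode
    return frameCnt >= lookaheadDepth + 2 + bframes;   // full capacity + mini-GOP lag
}
```

With lookaheadDepth = 20 and bframes = 4, the lookahead reports itself full only once 26 frames have been queued.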
3. Fetching a decided picture: Lookahead::getDecidedPicture()
This is the part of the lookahead module that fetches a decided picture from the output queue. The method removes a picture from the output queue and blocks only when no picture is available. It starts removing pictures only once m_filled is true, and m_filled is set only after more pictures than the lookahead depth have been input, so slicetypeDecide() should already be running before output pictures are consumed. The very first slicetypeDecide() obviously still has to block, but subsequent calls stay ahead of the encoder (each picture removed from the output queue adds one to the input queue) and decide slice types before the encoder needs them. The code is explained below:
Frame* Lookahead::getDecidedPicture()
{ //check whether m_filled is true, i.e. whether enough pictures have been fed for output to begin
if (m_filled)//pictures can now be taken from the output queue
{ //take the output lock (m_outputLock) to access the output queue safely
m_outputLock.acquire();
//pop a picture from the output queue with popFront() and store it in the pointer out
Frame *out = m_outputQueue.popFront();
//release the output lock
m_outputLock.release();
//if a picture was obtained (out is non-null), decrement m_inputCount and return it
if (out)
{
m_inputCount--;
return out;
}
//if no picture was obtained (out is null), decide from analysisLoad and bDisableLookahead whether slicetypeDecide() needs to run
if (m_param->analysisLoad && m_param->bDisableLookahead)
return NULL;
findJob(-1); /* run slicetypeDecide() if necessary */
m_inputLock.acquire();
//use the busy flag of slicetypeDecide() (m_sliceTypeBusy) to decide whether to wait for the output signal
bool wait = m_outputSignalRequired = m_sliceTypeBusy;
m_inputLock.release();
//if a wait is needed, block on wait() until the signal arrives
if (wait)
m_outputSignal.wait();
//pop a picture from the output queue again with popFront() and store it in out
out = m_outputQueue.popFront();
//if a picture was obtained (out is non-null), decrement m_inputCount
if (out)
m_inputCount--;
return out;
}
else//not enough pictures have been queued yet; return a null pointer
return NULL;
}
4. Finding and running work: Lookahead::findJob()
Finds and executes work. The method polls the occupancy of the input queue; when the queue is full it runs slicetypeDecide() and emits a batch of frames (a mini-GOP) to the output queue. Once flush() has been called (meaning no more pictures will arrive), the input queue is considered full as long as even one picture remains in it.
void Lookahead::findJob(int /*workerThreadID*/)
{
bool doDecide;
//take the input lock (m_inputLock) to access the shared state safely
m_inputLock.acquire();
//if the input queue size (m_inputQueue.size()) has reached m_fullQueueSize, no slice-type job is running (!m_sliceTypeBusy) and the lookahead is active (m_isActive), set both doDecide and m_sliceTypeBusy to true
if (m_inputQueue.size() >= m_fullQueueSize && !m_sliceTypeBusy && m_isActive)
doDecide = m_sliceTypeBusy = true;
else//otherwise set doDecide to false and clear m_helpWanted
doDecide = m_helpWanted = false;
m_inputLock.release();//release the input lock
if (!doDecide)
return;
//record timing and counters for the slice-type decision
ProfileLookaheadTime(m_slicetypeDecideElapsedTime, m_countSlicetypeDecide);
ProfileScopeEvent(slicetypeDecideEV);
//run the actual slice-type decision (slicetypeDecide())
slicetypeDecide();
//take the input lock
m_inputLock.acquire();
//if an output signal is pending (m_outputSignalRequired), trigger it (m_outputSignal.trigger()) and clear the flag
if (m_outputSignalRequired)
{
m_outputSignal.trigger();
m_outputSignalRequired = false;
}
m_sliceTypeBusy = false;
m_inputLock.release();//release the input lock
}
5. Slice-type decision: Lookahead::slicetypeDecide()
The code is explained below:
void Lookahead::slicetypeDecide()
{ //create a PreLookaheadGroup instance pre, passing a reference to this Lookahead object
PreLookaheadGroup pre(*this);
//declare the Lowres pointer array frames and the Frame pointer array list, and zero both
Lowres* frames[X265_LOOKAHEAD_MAX + X265_BFRAME_MAX + 4];
Frame* list[X265_BFRAME_MAX + 4];
memset(frames, 0, sizeof(frames));
memset(list, 0, sizeof(list));
//compute the maximum search range maxSearch as the minimum of m_param->lookaheadDepth and X265_LOOKAHEAD_MAX, clamped to at least 1
int maxSearch = X265_MIN(m_param->lookaheadDepth, X265_LOOKAHEAD_MAX);
maxSearch = X265_MAX(1, maxSearch);
{ //take exclusive access to the input lock m_inputLock
ScopedLock lock(m_inputLock);
//get the current frame curFrame at the head of the input queue, and declare the integer j
Frame *curFrame = m_inputQueue.first();
int j;
if (m_param->bResetZoneConfig)
{ //iterate over every zone configuration in m_param->rc.zones
for (int i = 0; i < m_param->rc.zonefileCount; i++)
{ //if the current frame's m_poc equals a zone's startFrame, switch m_param to that zone's zoneParam
if (m_param->rc.zones[i].startFrame == curFrame->m_poc)
m_param = m_param->rc.zones[i].zoneParam;
}
}
//loop m_param->bframes + 2 times, appending curFrame to list and advancing curFrame to the next frame
for (j = 0; j < m_param->bframes + 2; j++)
{
if (!curFrame) break;
list[j] = curFrame;
curFrame = curFrame->m_next;
}
//reset curFrame to the head of the input queue, and set frames[0] to m_lastNonB
curFrame = m_inputQueue.first();
frames[0] = m_lastNonB;
//loop up to maxSearch times, storing each frame's lowres (curFrame->m_lowres) at the corresponding slot of frames
for (j = 0; j < maxSearch; j++)
{
if (!curFrame) break;
frames[j + 1] = &curFrame->m_lowres;
//if a frame's lowres is not yet initialized, add the frame to pre.m_preframes and bump pre.m_jobTotal
if (!curFrame->m_lowresInit)
pre.m_preframes[pre.m_jobTotal++] = curFrame;
curFrame = curFrame->m_next;
}
//update maxSearch to the number of frames actually traversed
maxSearch = j;
//end of the input-lock scope
}
//if any frames need pre-analysis (pre.m_jobTotal > 0), do the following
/* perform pre-analysis on frames which need it, using a bonded task group */
if (pre.m_jobTotal)
{ //if the thread pool m_pool exists, try to bond worker threads to the pre-analysis tasks
if (m_pool)
pre.tryBondPeers(*m_pool, pre.m_jobTotal);
//run the pre-analysis tasks via pre.processTasks(-1)
pre.processTasks(-1);
//wait for all tasks to finish
pre.waitForExit();
}
//handle the encoder's frame list according to the fade-in detection setting
if(m_param->bEnableFades)
{ //initialize endIndex, length and the m_frameVariance array
int j, endIndex = 0, length = X265_BFRAME_MAX + 4;
for (j = 0; j < length; j++)
m_frameVariance[j] = -1;
//walk the frame list, storing each frame's lowres variance (frameVariance) at the corresponding slot of m_frameVariance
for (j = 0; list[j] != NULL; j++)
m_frameVariance[list[j]->m_poc % length] = list[j]->m_lowres.frameVariance;
//use the m_frameVariance values to decide whether a fade-in region exists; iterate index k over m_frameVariance and:
for (int k = list[0]->m_poc % length; k <= list[j - 1]->m_poc % length; k++)
{ //if m_frameVariance[k] is -1, break out of the loop
if (m_frameVariance[k] == -1)
break;
//if k > 0 and m_frameVariance[k] >= the previous entry, or k == 0 and m_frameVariance[k] >= m_frameVariance[length - 1] (the last entry of the array), we have entered a fade-in region
if((k > 0 && m_frameVariance[k] >= m_frameVariance[k - 1]) ||
(k == 0 && m_frameVariance[k] >= m_frameVariance[length - 1]))
{
m_isFadeIn = true;
//if m_fadeCount and m_fadeStart still hold their initial values (0 and -1), derive m_fadeStart from the POC (Picture Order Count) values of the frames in the list
if (m_fadeCount == 0 && m_fadeStart == -1)
{
for(int temp = list[0]->m_poc; temp <= list[j - 1]->m_poc; temp++)
if (k == temp % length) {
m_fadeStart = temp ? temp - 1 : 0;
break;
}
}
//update m_fadeCount to list[endIndex]->m_poc - m_fadeStart, where endIndex indexes the current frame list
m_fadeCount = list[endIndex]->m_poc > m_fadeStart ? list[endIndex]->m_poc - m_fadeStart : 0;
endIndex++;
}
else
{ //otherwise, if we were inside a fade-in and m_fadeCount >= m_param->fpsNum / m_param->fpsDenom (frames per second), the fade-in has ended; set m_lowres.bIsFadeEnd to true on the frame that closes it
if (m_isFadeIn && m_fadeCount >= m_param->fpsNum / m_param->fpsDenom)
{
for (int temp = 0; list[temp] != NULL; temp++)
{
if (list[temp]->m_poc == m_fadeStart + (int)m_fadeCount)
{
list[temp]->m_lowres.bIsFadeEnd = true;
break;
}
}
}
m_isFadeIn = false;
m_fadeCount = 0;
m_fadeStart = -1;
}
//if k is the last index (length - 1), reset k to -1 so it wraps to 0 on the next iteration
if (k == length - 1)
k = -1;
}
}
//frame analysis and rate-control work, performed when the conditions below hold
/* first, the code checks the following conditions: */
if (m_lastNonB &&
((m_param->bFrameAdaptive && m_param->bframes) ||
m_param->rc.cuTree || m_param->scenecutThreshold || m_param->bHistBasedSceneCut ||
(m_param->lookaheadDepth && m_param->rc.vbvBufferSize)))
{ //if m_param->rc.bStatRead is false, call slicetypeAnalyse to analyse the frames
if (!m_param->rc.bStatRead)
slicetypeAnalyse(frames, false);
//decide from several conditions whether VBV (Video Buffering Verifier) lookahead is needed
bool bIsVbv = m_param->rc.vbvBufferSize > 0 && m_param->rc.vbvMaxBitrate > 0;
if ((m_param->analysisLoad && m_param->scaleFactor && bIsVbv) || m_param->bliveVBV2pass)
{
int numFrames;
//walk the frames list until maxSearch is reached or a null frame is found, incrementing numFrames each time.
for (numFrames = 0; numFrames < maxSearch; numFrames++)
{
Lowres *fenc = frames[numFrames + 1];
if (!fenc)
break;
}
//call vbvLookahead with frames, numFrames and false to run the VBV lookahead
vbvLookahead(frames, numFrames, false);
}
}
int bframes, brefs;
if (!m_param->analysisLoad || m_param->bAnalysisType == HEVC_INFO)
{
bool isClosedGopRadl = m_param->radl && (m_param->keyframeMax != m_param->keyframeMin);
for (bframes = 0, brefs = 0;; bframes++)
{
Lowres& frm = list[bframes]->m_lowres;
if (frm.sliceType == X265_TYPE_BREF && !m_param->bBPyramid && brefs == m_param->bBPyramid)
{
frm.sliceType = X265_TYPE_B;
x265_log(m_param, X265_LOG_WARNING, "B-ref at frame %d incompatible with B-pyramid\n",
frm.frameNum);
}
/* pyramid with multiple B-refs needs a big enough dpb that the preceding P-frame stays available.
* smaller dpb could be supported by smart enough use of mmco, but it's easier just to forbid it. */
else if (frm.sliceType == X265_TYPE_BREF && m_param->bBPyramid && brefs &&
m_param->maxNumReferences <= (brefs + 3))
{
frm.sliceType = X265_TYPE_B;
x265_log(m_param, X265_LOG_WARNING, "B-ref at frame %d incompatible with B-pyramid and %d reference frames\n",
frm.sliceType, m_param->maxNumReferences);
}//check whether the distance between frm.frameNum and the last keyframe satisfies m_param->keyframeMax and m_extendGopBoundary; depending on the conditions, change the slice type to X265_TYPE_I or X265_TYPE_IDR
if (((!m_param->bIntraRefresh || frm.frameNum == 0) && frm.frameNum - m_lastKeyframe >= m_param->keyframeMax &&
(!m_extendGopBoundary || frm.frameNum - m_lastKeyframe >= m_param->keyframeMax + m_param->gopLookahead)) ||
(frm.frameNum == (m_param->chunkStart - 1)) || (frm.frameNum == m_param->chunkEnd))
{
if (frm.sliceType == X265_TYPE_AUTO || frm.sliceType == X265_TYPE_I)
frm.sliceType = m_param->bOpenGOP && m_lastKeyframe >= 0 ? X265_TYPE_I : X265_TYPE_IDR;
bool warn = frm.sliceType != X265_TYPE_IDR;
if (warn && m_param->bOpenGOP)
warn &= frm.sliceType != X265_TYPE_I;
if (warn)
{
x265_log(m_param, X265_LOG_WARNING, "specified frame type (%d) at %d is not compatible with keyframe interval\n",
frm.sliceType, frm.frameNum);
frm.sliceType = m_param->bOpenGOP && m_lastKeyframe >= 0 ? X265_TYPE_I : X265_TYPE_IDR;
}
}
if (frm.bIsFadeEnd){
frm.sliceType = m_param->bOpenGOP && m_lastKeyframe >= 0 ? X265_TYPE_I : X265_TYPE_IDR;
}
if (m_param->bResetZoneConfig)
{
for (int i = 0; i < m_param->rc.zonefileCount; i++)
{
int curZoneStart = m_param->rc.zones[i].startFrame;
curZoneStart += curZoneStart ? m_param->rc.zones[i].zoneParam->radl : 0;
if (curZoneStart == frm.frameNum)
frm.sliceType = X265_TYPE_IDR;
}
}
if ((frm.sliceType == X265_TYPE_I && frm.frameNum - m_lastKeyframe >= m_param->keyframeMin) || (frm.frameNum == (m_param->chunkStart - 1)) || (frm.frameNum == m_param->chunkEnd))
{
if (m_param->bOpenGOP)
{
m_lastKeyframe = frm.frameNum;
frm.bKeyframe = true;
}
else
frm.sliceType = X265_TYPE_IDR;
}
if (frm.sliceType == X265_TYPE_IDR && frm.bScenecut && isClosedGopRadl)
{
for (int i = bframes; i < bframes + m_param->radl; i++)
list[i]->m_lowres.sliceType = X265_TYPE_B;
list[(bframes + m_param->radl)]->m_lowres.sliceType = X265_TYPE_IDR;
}
if (frm.sliceType == X265_TYPE_IDR)
{
/* Closed GOP */
m_lastKeyframe = frm.frameNum;
frm.bKeyframe = true;
int zoneRadl = 0;
if (m_param->bResetZoneConfig)
{
for (int i = 0; i < m_param->rc.zonefileCount; i++)
{
int zoneStart = m_param->rc.zones[i].startFrame;
zoneStart += zoneStart ? m_param->rc.zones[i].zoneParam->radl : 0;
if (zoneStart == frm.frameNum)
{
zoneRadl = m_param->rc.zones[i].zoneParam->radl;
m_param->radl = 0;
m_param->rc.zones->zoneParam->radl = i < m_param->rc.zonefileCount - 1 ? m_param->rc.zones[i + 1].zoneParam->radl : 0;
break;
}
}
}
if (bframes > 0 && !m_param->radl && !zoneRadl)
{
list[bframes - 1]->m_lowres.sliceType = X265_TYPE_P;
bframes--;
}
}
if (bframes == m_param->bframes || !list[bframes + 1])
{
if (IS_X265_TYPE_B(frm.sliceType))
x265_log(m_param, X265_LOG_WARNING, "specified frame type is not compatible with max B-frames\n");
if (frm.sliceType == X265_TYPE_AUTO || IS_X265_TYPE_B(frm.sliceType))
frm.sliceType = X265_TYPE_P;
}
if (frm.sliceType == X265_TYPE_BREF)
brefs++;
if (frm.sliceType == X265_TYPE_AUTO)
frm.sliceType = X265_TYPE_B;
else if (!IS_X265_TYPE_B(frm.sliceType))
break;
}
}
else
{
for (bframes = 0, brefs = 0;; bframes++)
{
Lowres& frm = list[bframes]->m_lowres;
if (frm.sliceType == X265_TYPE_BREF)
brefs++;
if ((IS_X265_TYPE_I(frm.sliceType) && frm.frameNum - m_lastKeyframe >= m_param->keyframeMin)
|| (frm.frameNum == (m_param->chunkStart - 1)) || (frm.frameNum == m_param->chunkEnd))
{
m_lastKeyframe = frm.frameNum;
frm.bKeyframe = true;
}
if (!IS_X265_TYPE_B(frm.sliceType))
break;
}
}
if (m_param->bEnableTemporalSubLayers > 2)
{
//Split the partial mini GOP into sub mini GOPs when temporal sub layers are enabled
if (bframes < m_param->bframes)
{
int leftOver = bframes + 1;
int8_t gopId = m_gopId - 1;
int gopLen = x265_gop_ra_length[gopId];
int listReset = 0;
m_outputLock.acquire();
while ((gopId >= 0) && (leftOver > 3))
{
if (leftOver < gopLen)
{
gopId = gopId - 1;
gopLen = x265_gop_ra_length[gopId];
continue;
}
else
{
int newbFrames = listReset + gopLen - 1;
//Re-assign GOP
list[newbFrames]->m_lowres.sliceType = IS_X265_TYPE_I(list[newbFrames]->m_lowres.sliceType) ? list[newbFrames]->m_lowres.sliceType : X265_TYPE_P;
if (newbFrames)
list[newbFrames - 1]->m_lowres.bLastMiniGopBFrame = true;
list[newbFrames]->m_lowres.leadingBframes = newbFrames;
m_lastNonB = &list[newbFrames]->m_lowres;
/* insert a bref into the sequence */
if (m_param->bBPyramid && newbFrames)
{
placeBref(list, listReset, newbFrames, newbFrames + 1, &brefs);
}
if (m_param->rc.rateControlMode != X265_RC_CQP)
{
int p0, p1, b;
/* For zero latency tuning, calculate frame cost to be used later in RC */
if (!maxSearch)
{
for (int i = listReset; i <= newbFrames; i++)
frames[i + 1] = &list[listReset + i]->m_lowres;
}
/* estimate new non-B cost */
p1 = b = newbFrames + 1;
p0 = (IS_X265_TYPE_I(frames[newbFrames + 1]->sliceType)) ? b : listReset;
CostEstimateGroup estGroup(*this, frames);
estGroup.singleCost(p0, p1, b);
if (newbFrames)
compCostBref(frames, listReset, newbFrames, newbFrames + 1);
}
m_inputLock.acquire();
/* dequeue all frames from inputQueue that are about to be enqueued
* in the output queue. The order is important because Frame can
* only be in one list at a time */
int64_t pts[X265_BFRAME_MAX + 1];
for (int i = 0; i < gopLen; i++)
{
Frame *curFrame;
curFrame = m_inputQueue.popFront();
pts[i] = curFrame->m_pts;
maxSearch--;
}
m_inputLock.release();
int idx = 0;
/* add non-B to output queue */
list[newbFrames]->m_reorderedPts = pts[idx++];
list[newbFrames]->m_gopOffset = 0;
list[newbFrames]->m_gopId = gopId;
list[newbFrames]->m_tempLayer = x265_gop_ra[gopId][0].layer;
m_outputQueue.pushBack(*list[newbFrames]);
/* add B frames to output queue */
int i = 1, j = 1;
while (i < gopLen)
{
int offset = listReset + (x265_gop_ra[gopId][j].poc_offset - 1);
if (!list[offset] || offset == newbFrames)
continue;
// Assign gop offset and temporal layer of frames
list[offset]->m_gopOffset = j;
list[bframes]->m_gopId = gopId;
list[offset]->m_tempLayer = x265_gop_ra[gopId][j++].layer;
list[offset]->m_reorderedPts = pts[idx++];
m_outputQueue.pushBack(*list[offset]);
i++;
}
listReset += gopLen;
leftOver = leftOver - gopLen;
gopId -= 1;
gopLen = (gopId >= 0) ? x265_gop_ra_length[gopId] : 0;
}
}
if (leftOver > 0 && leftOver < 4)
{
int64_t pts[X265_BFRAME_MAX + 1];
int idx = 0;
int newbFrames = listReset + leftOver - 1;
list[newbFrames]->m_lowres.sliceType = IS_X265_TYPE_I(list[newbFrames]->m_lowres.sliceType) ? list[newbFrames]->m_lowres.sliceType : X265_TYPE_P;
if (newbFrames)
list[newbFrames - 1]->m_lowres.bLastMiniGopBFrame = true;
list[newbFrames]->m_lowres.leadingBframes = newbFrames;
m_lastNonB = &list[newbFrames]->m_lowres;
/* insert a bref into the sequence */
if (m_param->bBPyramid && (newbFrames- listReset) > 1)
placeBref(list, listReset, newbFrames, newbFrames + 1, &brefs);
if (m_param->rc.rateControlMode != X265_RC_CQP)
{
int p0, p1, b;
/* For zero latency tuning, calculate frame cost to be used later in RC */
if (!maxSearch)
{
for (int i = listReset; i <= newbFrames; i++)
frames[i + 1] = &list[listReset + i]->m_lowres;
}
/* estimate new non-B cost */
p1 = b = newbFrames + 1;
p0 = (IS_X265_TYPE_I(frames[newbFrames + 1]->sliceType)) ? b : listReset;
CostEstimateGroup estGroup(*this, frames);
estGroup.singleCost(p0, p1, b);
if (newbFrames)
compCostBref(frames, listReset, newbFrames, newbFrames + 1);
}
m_inputLock.acquire();
/* dequeue all frames from inputQueue that are about to be enqueued
* in the output queue. The order is important because Frame can
* only be in one list at a time */
for (int i = 0; i < leftOver; i++)
{
Frame *curFrame;
curFrame = m_inputQueue.popFront();
pts[i] = curFrame->m_pts;
maxSearch--;
}
m_inputLock.release();
m_lastNonB = &list[newbFrames]->m_lowres;
list[newbFrames]->m_reorderedPts = pts[idx++];
list[newbFrames]->m_gopOffset = 0;
list[newbFrames]->m_gopId = -1;
list[newbFrames]->m_tempLayer = 0;
m_outputQueue.pushBack(*list[newbFrames]);
if (brefs)
{
for (int i = listReset; i < newbFrames; i++)
{
if (list[i]->m_lowres.sliceType == X265_TYPE_BREF)
{
list[i]->m_reorderedPts = pts[idx++];
list[i]->m_gopOffset = 0;
list[i]->m_gopId = -1;
list[i]->m_tempLayer = 0;
m_outputQueue.pushBack(*list[i]);
}
}
}
/* add B frames to output queue */
for (int i = listReset; i < newbFrames; i++)
{
/* push all the B frames into output queue except B-ref, which already pushed into output queue */
if (list[i]->m_lowres.sliceType != X265_TYPE_BREF)
{
list[i]->m_reorderedPts = pts[idx++];
list[i]->m_gopOffset = 0;
list[i]->m_gopId = -1;
list[i]->m_tempLayer = 1;
m_outputQueue.pushBack(*list[i]);
}
}
}
}
else
// Fill the complete mini GOP when temporal sub layers are enabled
{
list[bframes - 1]->m_lowres.bLastMiniGopBFrame = true;
list[bframes]->m_lowres.leadingBframes = bframes;
m_lastNonB = &list[bframes]->m_lowres;
/* insert a bref into the sequence */
if (m_param->bBPyramid && !brefs)
{
placeBref(list, 0, bframes, bframes + 1, &brefs);
}
/* calculate the frame costs ahead of time for estimateFrameCost while we still have lowres */
if (m_param->rc.rateControlMode != X265_RC_CQP)
{
int p0, p1, b;
/* For zero latency tuning, calculate frame cost to be used later in RC */
if (!maxSearch)
{
for (int i = 0; i <= bframes; i++)
frames[i + 1] = &list[i]->m_lowres;
}
/* estimate new non-B cost */
p1 = b = bframes + 1;
p0 = (IS_X265_TYPE_I(frames[bframes + 1]->sliceType)) ? b : 0;
CostEstimateGroup estGroup(*this, frames);
estGroup.singleCost(p0, p1, b);
compCostBref(frames, 0, bframes, bframes + 1);
}
m_inputLock.acquire();
/* dequeue all frames from inputQueue that are about to be enqueued
* in the output queue. The order is important because Frame can
* only be in one list at a time */
int64_t pts[X265_BFRAME_MAX + 1];
for (int i = 0; i <= bframes; i++)
{
Frame *curFrame;
curFrame = m_inputQueue.popFront();
pts[i] = curFrame->m_pts;
maxSearch--;
}
m_inputLock.release();
m_outputLock.acquire();
int idx = 0;
/* add non-B to output queue */
list[bframes]->m_reorderedPts = pts[idx++];
list[bframes]->m_gopOffset = 0;
list[bframes]->m_gopId = m_gopId;
list[bframes]->m_tempLayer = x265_gop_ra[m_gopId][0].layer;
m_outputQueue.pushBack(*list[bframes]);
int i = 1, j = 1;
while (i <= bframes)
{
int offset = x265_gop_ra[m_gopId][j].poc_offset - 1;
if (!list[offset] || offset == bframes)
continue;
// Assign gop offset and temporal layer of frames
list[offset]->m_gopOffset = j;
list[offset]->m_gopId = m_gopId;
list[offset]->m_tempLayer = x265_gop_ra[m_gopId][j++].layer;
/* add B frames to output queue */
list[offset]->m_reorderedPts = pts[idx++];
m_outputQueue.pushBack(*list[offset]);
i++;
}
}
bool isKeyFrameAnalyse = (m_param->rc.cuTree || (m_param->rc.vbvBufferSize && m_param->lookaheadDepth));
if (isKeyFrameAnalyse && IS_X265_TYPE_I(m_lastNonB->sliceType))
{
m_inputLock.acquire();
Frame *curFrame = m_inputQueue.first();
frames[0] = m_lastNonB;
int j;
for (j = 0; j < maxSearch; j++)
{
frames[j + 1] = &curFrame->m_lowres;
curFrame = curFrame->m_next;
}
m_inputLock.release();
frames[j + 1] = NULL;
if (!m_param->rc.bStatRead)
slicetypeAnalyse(frames, true);
bool bIsVbv = m_param->rc.vbvBufferSize > 0 && m_param->rc.vbvMaxBitrate > 0;
if ((m_param->analysisLoad && m_param->scaleFactor && bIsVbv) || m_param->bliveVBV2pass)
{
int numFrames;
for (numFrames = 0; numFrames < maxSearch; numFrames++)
{
Lowres *fenc = frames[numFrames + 1];
if (!fenc)
break;
}
vbvLookahead(frames, numFrames, true);
}
}
m_outputLock.release();
}
else
{
if (bframes)
list[bframes - 1]->m_lowres.bLastMiniGopBFrame = true;
list[bframes]->m_lowres.leadingBframes = bframes;
m_lastNonB = &list[bframes]->m_lowres;
//the next block inserts a B reference frame: if m_param->bBPyramid is true, bframes > 1 and brefs is 0, placeBref inserts a B-ref into the sequence
/* insert a bref into the sequence */
if (m_param->bBPyramid && bframes > 1 && !brefs)
{
placeBref(list, 0, bframes, bframes + 1, &brefs);
}
/* calculate the frame costs ahead of time for estimateFrameCost while we still have lowres */
if (m_param->rc.rateControlMode != X265_RC_CQP)
{
int p0, p1, b;
/* For zero latency tuning, calculate frame cost to be used later in RC */
if (!maxSearch)
{
for (int i = 0; i <= bframes; i++)
frames[i + 1] = &list[i]->m_lowres;
}
/* estimate new non-B cost */
p1 = b = bframes + 1;
p0 = (IS_X265_TYPE_I(frames[bframes + 1]->sliceType)) ? b : 0;
CostEstimateGroup estGroup(*this, frames);
estGroup.singleCost(p0, p1, b);
if (m_param->bEnableTemporalSubLayers > 1 && bframes)
{
compCostBref(frames, 0, bframes, bframes + 1);
}
else
{
if (bframes)
{
p0 = 0; // last nonb
bool isp0available = frames[bframes + 1]->sliceType == X265_TYPE_IDR ? false : true;
for (b = 1; b <= bframes; b++)
{
if (!isp0available)
p0 = b;
if (frames[b]->sliceType == X265_TYPE_B)
for (p1 = b; frames[p1]->sliceType == X265_TYPE_B; p1++)
; // find new nonb or bref
else
p1 = bframes + 1;
estGroup.singleCost(p0, p1, b);
if (frames[b]->sliceType == X265_TYPE_BREF)
{
p0 = b;
isp0available = true;
}
}
}
}
}
//lock m_inputLock to ensure thread safety
m_inputLock.acquire();
/* dequeue all frames from inputQueue that are about to be enqueued
* in the output queue. The order is important because Frame can
* only be in one list at a time */
int64_t pts[X265_BFRAME_MAX + 1];
for (int i = 0; i <= bframes; i++)
{
Frame *curFrame;
curFrame = m_inputQueue.popFront();
pts[i] = curFrame->m_pts;
maxSearch--;
}
m_inputLock.release();
m_outputLock.acquire();
/* add non-B to output queue */
int idx = 0;
list[bframes]->m_reorderedPts = pts[idx++];
m_outputQueue.pushBack(*list[bframes]);
//if there are B reference frames (brefs is nonzero), walk list, find the frames of type X265_TYPE_BREF and push them to m_outputQueue, taking their timestamps from the pts array
/* Add B-ref frame next to P frame in output queue, the B-ref encode before non B-ref frame */
if (brefs)
{
for (int i = 0; i < bframes; i++)
{
if (list[i]->m_lowres.sliceType == X265_TYPE_BREF)
{
list[i]->m_reorderedPts = pts[idx++];
m_outputQueue.pushBack(*list[i]);
}
}
}
//walk the B frames (excluding B-refs), push them to m_outputQueue and take their timestamps from the pts array
/* add B frames to output queue */
for (int i = 0; i < bframes; i++)
{
/* push all the B frames into output queue except B-ref, which already pushed into output queue */
if (list[i]->m_lowres.sliceType != X265_TYPE_BREF)
{
list[i]->m_reorderedPts = pts[idx++];
m_outputQueue.pushBack(*list[i]);
}
}
//if isKeyFrameAnalyse is true and the last non-B frame is an I frame, enter the keyframe analysis logic
bool isKeyFrameAnalyse = (m_param->rc.cuTree || (m_param->rc.vbvBufferSize && m_param->lookaheadDepth));
if (isKeyFrameAnalyse && IS_X265_TYPE_I(m_lastNonB->sliceType))
{
m_inputLock.acquire();
Frame *curFrame = m_inputQueue.first();
frames[0] = m_lastNonB;
int j;
for (j = 0; j < maxSearch; j++)
{
frames[j + 1] = &curFrame->m_lowres;
curFrame = curFrame->m_next;
}
m_inputLock.release();
frames[j + 1] = NULL;
if (!m_param->rc.bStatRead)
slicetypeAnalyse(frames, true);
bool bIsVbv = m_param->rc.vbvBufferSize > 0 && m_param->rc.vbvMaxBitrate > 0;
if ((m_param->analysisLoad && m_param->scaleFactor && bIsVbv) || m_param->bliveVBV2pass)
{
int numFrames;
for (numFrames = 0; numFrames < maxSearch; numFrames++)
{
Lowres *fenc = frames[numFrames + 1];
if (!fenc)
break;
}
vbvLookahead(frames, numFrames, true);
}
}
m_outputLock.release();
}
}
6. Task dispatch: PreLookaheadGroup::processTasks
This is the processTasks function of the PreLookaheadGroup class. The code is explained below:
void PreLookaheadGroup::processTasks(int workerThreadID)
{
//if workerThreadID is negative, set it to the number of workers in m_lookahead's thread pool, or 0 if there is no pool
if (workerThreadID < 0)
workerThreadID = m_lookahead.m_pool ? m_lookahead.m_pool->m_numWorkers : 0;
//get the LookaheadTLD reference tld for this worker thread, i.e. the thread-local data used by pre-analysis
LookaheadTLD& tld = m_lookahead.m_tld[workerThreadID];
//take exclusive access to the lock m_lock
m_lock.acquire();
//loop while the number of acquired jobs m_jobAcquired is below the total m_jobTotal
while (m_jobAcquired < m_jobTotal)
{ //take the next pre-analysis frame preFrame and increment m_jobAcquired
Frame* preFrame = m_preframes[m_jobAcquired++];
//profile the start of the pre-analysis task
ProfileLookaheadTime(m_lookahead.m_preLookaheadElapsedTime, m_lookahead.m_countPreLookahead);
ProfileScopeEvent(prelookahead);
//release the lock m_lock
m_lock.release();
//initialize the frame's lowres preFrame->m_lowres from preFrame->m_fencPic and preFrame->m_poc
preFrame->m_lowres.init(preFrame->m_fencPic, preFrame->m_poc);
//if adaptive quantization is enabled (m_lookahead.m_bAdaptiveQuant), call tld.calcAdaptiveQuantFrame to compute the AQ frame
if (m_lookahead.m_bAdaptiveQuant)
tld.calcAdaptiveQuantFrame(preFrame, m_lookahead.m_param);
//if histogram-based scene-cut detection is enabled (m_lookahead.m_param->bHistBasedSceneCut), call tld.collectPictureStatistics to gather picture statistics
if (m_lookahead.m_param->bHistBasedSceneCut)
tld.collectPictureStatistics(preFrame);
//call tld.lowresIntraEstimate to run intra estimation on the lowres frame
tld.lowresIntraEstimate(preFrame->m_lowres, m_lookahead.m_param->rc.qgSize);
preFrame->m_lowresInit = true;
//re-acquire the lock m_lock
m_lock.acquire();
}
//release the lock m_lock
m_lock.release();
}
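The acquire-under-lock, work-outside-lock pattern used here generalizes well; a minimal sketch with standard C++ primitives (this illustrates the pattern only and is not x265's ThreadPool or BondedTaskGroup API):

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Minimal bonded-task-group style loop: several workers claim job indices
// under a lock, then do the actual work with the lock released.
struct JobGroup {
    std::mutex lock;
    int jobAcquired = 0;
    const int jobTotal;
    std::vector<int> done;                 // one slot per job, written once
    explicit JobGroup(int n) : jobTotal(n), done(n, 0) {}

    void processTasks()
    {
        std::unique_lock<std::mutex> lk(lock);
        while (jobAcquired < jobTotal) {
            int job = jobAcquired++;       // claim a job under the lock
            lk.unlock();                   // heavy work happens outside the lock
            done[job] = 1;                 // stand-in for lowres init / intra estimation
            lk.lock();                     // re-acquire before claiming the next job
        }
    }
};
```

Every job is processed exactly once no matter how many workers call processTasks() concurrently, because claiming the index and advancing the counter happen atomically under the lock.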
7. Lowres intra estimation: LookaheadTLD::lowresIntraEstimate()
This method performs intra estimation on a lowres frame. The code is explained below:
void LookaheadTLD::lowresIntraEstimate(Lowres& fenc, uint32_t qgSize)
{ //declare locals and constants: the pixel arrays prediction, fencIntra and neighbours, plus the pointers samples and filtered into the two halves of neighbours
ALIGN_VAR_32(pixel, prediction[X265_LOWRES_CU_SIZE * X265_LOWRES_CU_SIZE]);
pixel fencIntra[X265_LOWRES_CU_SIZE * X265_LOWRES_CU_SIZE];
pixel neighbours[2][X265_LOWRES_CU_SIZE * 4 + 1];
pixel* samples = neighbours[0], *filtered = neighbours[1];
//initialize parameters: the lambda used for prediction, the intra penalty, the CU (Coding Unit) size and index, etc.
const int lookAheadLambda = (int)x265_lambda_tab[X265_LOOKAHEAD_QP];
const int intraPenalty = 5 * lookAheadLambda;
const int lowresPenalty = 4; /* fixed CU cost overhead */
const int cuSize = X265_LOWRES_CU_SIZE;
const int cuSize2 = cuSize << 1;
const int sizeIdx = X265_LOWRES_CU_BITS - 2;
pixelcmp_t satd = primitives.pu[sizeIdx].satd;
int planar = !!(cuSize >= 8);
int costEst = 0, costEstAq = 0;
//loop over each CU row coordinate cuY from 0 to heightInCU - 1
for (int cuY = 0; cuY < heightInCU; cuY++)
{
fenc.rowSatds[0][0][cuY] = 0;
//loop over each CU column coordinate cuX from 0 to widthInCU - 1
for (int cuX = 0; cuX < widthInCU; cuX++)
{ //compute the CU index cuXY and the pixel offset pelOffset for this CU
const int cuXY = cuX + cuY * widthInCU;
const intptr_t pelOffset = cuSize * cuX + cuSize * cuY * fenc.lumaStride;
pixel *pixCur = fenc.lowresPlane[0] + pelOffset;
/* copy fenc pixels *///copy the CU's pixels into the fencIntra array
primitives.cu[sizeIdx].copy_pp(fencIntra, cuSize, pixCur, fenc.lumaStride);
/* collect reference sample pixels */
//Gather neighbouring reference samples into the samples array: the row above, then the column to the left
pixCur -= fenc.lumaStride + 1;
memcpy(samples, pixCur, (2 * cuSize + 1) * sizeof(pixel)); /* top */
for (int i = 1; i <= 2 * cuSize; i++)
samples[cuSize2 + i] = pixCur[i * fenc.lumaStride]; /* left */
primitives.cu[sizeIdx].intra_filter(samples, filtered);
int cost, icost = me.COST_MAX;
uint32_t ilowmode = 0;
//Try DC and planar prediction; keep the mode with the lower SATD (Sum of Absolute Transformed Differences) cost of the prediction residual
/* DC and planar */
primitives.cu[sizeIdx].intra_pred[DC_IDX](prediction, cuSize, samples, 0, cuSize <= 16);
cost = satd(fencIntra, cuSize, prediction, cuSize);
COPY2_IF_LT(icost, cost, ilowmode, DC_IDX);
primitives.cu[sizeIdx].intra_pred[PLANAR_IDX](prediction, cuSize, neighbours[planar], 0, 0);
cost = satd(fencIntra, cuSize, prediction, cuSize);
COPY2_IF_LT(icost, cost, ilowmode, PLANAR_IDX);
/* scan angular predictions */
int filter, acost = me.COST_MAX;
uint32_t mode, alowmode = 4;
//Coarse scan of the angular modes (every 5th mode), keeping the lowest SATD cost
for (mode = 5; mode < 35; mode += 5)
{
filter = !!(g_intraFilterFlags[mode] & cuSize);
primitives.cu[sizeIdx].intra_pred[mode](prediction, cuSize, neighbours[filter], mode, cuSize <= 16);
cost = satd(fencIntra, cuSize, prediction, cuSize);
COPY2_IF_LT(acost, cost, alowmode, mode);
}
//Refine around the coarse winner: probe the modes at distance 2, then distance 1, keeping the lowest SATD cost
for (uint32_t dist = 2; dist >= 1; dist--)
{
int minusmode = alowmode - dist;
int plusmode = alowmode + dist;
mode = minusmode;
filter = !!(g_intraFilterFlags[mode] & cuSize);
primitives.cu[sizeIdx].intra_pred[mode](prediction, cuSize, neighbours[filter], mode, cuSize <= 16);
cost = satd(fencIntra, cuSize, prediction, cuSize);
COPY2_IF_LT(acost, cost, alowmode, mode);
mode = plusmode;
filter = !!(g_intraFilterFlags[mode] & cuSize);
primitives.cu[sizeIdx].intra_pred[mode](prediction, cuSize, neighbours[filter], mode, cuSize <= 16);
cost = satd(fencIntra, cuSize, prediction, cuSize);
COPY2_IF_LT(acost, cost, alowmode, mode);
}
COPY2_IF_LT(icost, acost, ilowmode, alowmode);
//Add the mode-signalling and fixed CU overhead penalties, then record the intra cost and chosen mode
icost += intraPenalty + lowresPenalty; /* estimate intra signal cost */
fenc.lowresCosts[0][0][cuXY] = (uint16_t)(X265_MIN(icost, LOWRES_COST_MASK) | (0 << LOWRES_COST_SHIFT));
fenc.intraCost[cuXY] = icost;
fenc.intraMode[cuXY] = (uint8_t)ilowmode;
/* do not include edge blocks in the frame cost estimates, they are not very accurate */
//Accumulate the intra cost into the whole-frame estimate only for non-edge CUs
const bool bFrameScoreCU = (cuX > 0 && cuX < widthInCU - 1 &&
cuY > 0 && cuY < heightInCU - 1) || widthInCU <= 2 || heightInCU <= 2;
int icostAq;
if (qgSize == 8)
icostAq = (bFrameScoreCU && fenc.invQscaleFactor) ? ((icost * fenc.invQscaleFactor8x8[cuXY] + 128) >> 8) : icost;
else
icostAq = (bFrameScoreCU && fenc.invQscaleFactor) ? ((icost * fenc.invQscaleFactor[cuXY] + 128) >> 8) : icost;
if (bFrameScoreCU)
{
costEst += icost;
costEstAq += icostAq;
}
fenc.rowSatds[0][0][cuY] += icostAq;
}
}
//Store the whole-frame cost estimates
fenc.costEst[0][0] = costEst;
fenc.costEstAq[0][0] = costEstAq;
}
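The two-stage angular search above (a coarse scan of every 5th mode, then refinement at distance 2 and 1 around the winner) can be isolated into a small sketch. The cost callback is a hypothetical stand-in for the SATD-of-prediction call; the bounds check is added here for safety and is not in the x265 loop:

```cpp
#include <cassert>
#include <climits>
#include <functional>
#include <initializer_list>

// Sketch of the coarse-then-refine angular-mode search: scan modes
// 5, 10, ..., 30 in steps of 5, then probe the neighbours at distance
// 2 and 1 around the best so far. `cost` stands in for predicting the
// block with a mode and measuring SATD against the source.
int searchAngularMode(const std::function<int(int)>& cost, int* bestMode)
{
    int acost = INT_MAX;
    int alowmode = 4;
    // coarse pass: every 5th angular mode
    for (int mode = 5; mode < 35; mode += 5)
    {
        int c = cost(mode);
        if (c < acost) { acost = c; alowmode = mode; }
    }
    // refinement: distance 2, then distance 1, around the current winner
    for (int dist = 2; dist >= 1; dist--)
    {
        for (int mode : { alowmode - dist, alowmode + dist })
        {
            if (mode < 2 || mode > 34)   // bounds check added for safety
                continue;
            int c = cost(mode);
            if (c < acost) { acost = c; alowmode = mode; }
        }
    }
    *bestMode = alowmode;
    return acost;
}
```

With a cost surface that dips at mode 13, the coarse pass lands on 15 and the distance-2 probe then finds 13, showing why 7 coarse probes plus at most 4 refinements are enough to approximate a full 33-mode scan at lookahead precision.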
8. Slice-type analysis: Lookahead::slicetypeAnalyse
This method decides the slice type (I/P/B) of each undecided frame in the lookahead window:
void Lookahead::slicetypeAnalyse(Lowres **frames, bool bKeyframe)
{
int numFrames, origNumFrames, keyintLimit, framecnt;
//Cap the search depth maxSearch at the smaller of m_param->lookaheadDepth and X265_LOOKAHEAD_MAX
int maxSearch = X265_MIN(m_param->lookaheadDepth, X265_LOOKAHEAD_MAX);
int cuCount = m_8x8Blocks;
int resetStart;
bool bIsVbvLookahead = m_param->rc.vbvBufferSize && m_param->lookaheadDepth;
/* count undecided frames */
//Count undecided frames: walk the frames list until maxSearch is reached or a frame already has a slice type other than X265_TYPE_AUTO, incrementing framecnt
for (framecnt = 0; framecnt < maxSearch; framecnt++)
{
Lowres *fenc = frames[framecnt + 1];
if (!fenc || fenc->sliceType != X265_TYPE_AUTO)
break;
}
//framecnt == 0 means there are no undecided frames: run cuTree if enabled, then return
if (!framecnt)
{
if (m_param->rc.cuTree)
cuTree(frames, 0, bKeyframe);
return;
}//Terminate the list by setting frames[framecnt + 1] to NULL
frames[framecnt + 1] = NULL;
//If zone-config reset is enabled (m_param->bResetZoneConfig), update m_param->keyframeMax from the matching zone
if (m_param->bResetZoneConfig)
{
for (int i = 0; i < m_param->rc.zonefileCount; i++)
{
int curZoneStart = m_param->rc.zones[i].startFrame, nextZoneStart = 0;
curZoneStart += curZoneStart ? m_param->rc.zones[i].zoneParam->radl : 0;
nextZoneStart += (i + 1 < m_param->rc.zonefileCount) ? m_param->rc.zones[i + 1].startFrame + m_param->rc.zones[i + 1].zoneParam->radl : m_param->totalFrames;
if (curZoneStart <= frames[0]->frameNum && nextZoneStart > frames[0]->frameNum)
m_param->keyframeMax = nextZoneStart - curZoneStart;
if (m_param->rc.zones[m_param->rc.zonefileCount - 1].startFrame <= frames[0]->frameNum && nextZoneStart == 0)
m_param->keyframeMax = m_param->rc.zones[0].keyframeMax;
}
}//Adjust keylimit using the chunk boundaries around the current frame number
int keylimit = m_param->keyframeMax;
if (frames[0]->frameNum < m_param->chunkEnd)
{
int chunkStart = (m_param->chunkStart - m_lastKeyframe - 1);
int chunkEnd = (m_param->chunkEnd - m_lastKeyframe);
if ((chunkStart > 0) && (chunkStart < m_param->keyframeMax))
keylimit = chunkStart;
else if ((chunkEnd > 0) && (chunkEnd < m_param->keyframeMax))
keylimit = chunkEnd;
}
//Derive keyintLimit from the GOP-lookahead setting and the remaining keyframe budget
int keyFrameLimit = keylimit + m_lastKeyframe - frames[0]->frameNum - 1;
if (m_param->gopLookahead && keyFrameLimit <= m_param->bframes + 1)
keyintLimit = keyFrameLimit + m_param->gopLookahead;
else
keyintLimit = keyFrameLimit;
//Choose numFrames depending on VBV lookahead, open GOP, and intra refresh
origNumFrames = numFrames = m_param->bIntraRefresh ? framecnt : X265_MIN(framecnt, keyintLimit);
if (bIsVbvLookahead)
numFrames = framecnt;
else if (m_param->bOpenGOP && numFrames < framecnt)
numFrames++;
else if (numFrames == 0)
{
frames[1]->sliceType = X265_TYPE_I;
return;
}
//Batched motion search, if enabled
if (m_bBatchMotionSearch)
{ //A CostEstimateGroup pre-computes motion searches across worker threads: for each frame b, add its preceding reference p0 and, where possible, the matching following reference p1
/* pre-calculate all motion searches, using many worker threads */
CostEstimateGroup estGroup(*this, frames);
for (int b = 2; b < numFrames; b++)
{ //This loop only adds reference pairs at equal forward/backward distances
for (int i = 1; i <= m_param->bframes + 1; i++)
{
int p0 = b - i;
if (p0 < 0)
continue;
/* Skip search if already done */
if (frames[b]->lowresMvs[0][i][0].x != 0x7FFF)
continue;
/* perform search to p1 at same distance, if possible */
int p1 = b + i;
if (p1 >= numFrames || frames[b]->lowresMvs[1][i][0].x != 0x7FFF)
p1 = b;
estGroup.add(p0, p1, b);
}
}//Auto-disable batched motion search (m_bBatchMotionSearch) when the thread pool (m_pool) has fewer than 4 workers
/* auto-disable after the first batch if pool is small */
m_bBatchMotionSearch &= m_pool->m_numWorkers >= 4;
estGroup.finishBatch();
if (m_bBatchFrameCosts)
{ //On top of the equal-distance searches above, fill in the remaining (p0, p1) combinations
/* pre-calculate all frame cost estimates, using many worker threads */
for (int b = 2; b < numFrames; b++)
{
for (int i = 1; i <= m_param->bframes + 1; i++)
{
if (b < i)
continue;
/* only measure frame cost in this pass if motion searches
* are already done */
if (frames[b]->lowresMvs[0][i][0].x == 0x7FFF)
continue;
int p0 = b - i;
for (int j = 0; j <= m_param->bframes; j++)
{
int p1 = b + j;
if (p1 >= numFrames)
break;
/* ensure P1 search is done */
if (j && frames[b]->lowresMvs[1][j][0].x == 0x7FFF)
continue;
/* ensure frame cost is not done */
if (frames[b]->costEst[i][j] >= 0)
continue;
estGroup.add(p0, p1, b);
}
}
}
/* auto-disable after the first batch if the pool is not large */
m_bBatchFrameCosts &= m_pool->m_numWorkers > 12;
estGroup.finishBatch();
}
}
int numBFrames = 0;
int numAnalyzed = numFrames;
bool isScenecut = false;
if (m_param->bHistBasedSceneCut)
isScenecut = histBasedScenecut(frames, 0, 1, origNumFrames);
else//Otherwise use the SATD-based detector to decide whether the current frame is a scene cut
isScenecut = scenecut(frames, 0, 1, true, origNumFrames);
/* When scenecut threshold is set, use scenecut detection for I frame placements */
if (m_param->scenecutThreshold && isScenecut)
{ //Set frames[1] to an I frame (keyframe) and return
frames[1]->sliceType = X265_TYPE_I;
return;
}
if (m_param->gopLookahead && (keyFrameLimit >= 0) && (keyFrameLimit <= m_param->bframes + 1))
{
bool sceneTransition = m_isSceneTransition;
m_extendGopBoundary = false;
for (int i = m_param->bframes + 1; i < origNumFrames; i += m_param->bframes + 1)
{
scenecut(frames, i, i + 1, true, origNumFrames);
for (int j = i + 1; j <= X265_MIN(i + m_param->bframes + 1, origNumFrames); j++)
{
if (frames[j]->bScenecut && scenecutInternal(frames, j - 1, j, true))
{
m_extendGopBoundary = true;
break;
}
}
if (m_extendGopBoundary)
break;
}
m_isSceneTransition = sceneTransition;
}
if (m_param->bframes)
{
if (m_param->bFrameAdaptive == X265_B_ADAPT_TRELLIS)
{
if (numFrames > 1)
{ //best_paths row 0 is initialized to "" and row 1 to "P"
char best_paths[X265_BFRAME_MAX + 1][X265_LOOKAHEAD_MAX + 1] = { "", "P" };
int best_path_index = numFrames % (X265_BFRAME_MAX + 1);
//slicetypePath() evaluates candidate paths of each length and stores the best in best_paths
/* Perform the frame type analysis. */
for (int j = 2; j <= numFrames; j++)
slicetypePath(frames, j, best_paths);
//strspn counts the leading 'B' characters of the best path, giving the number of B frames (numBFrames)
numBFrames = (int)strspn(best_paths[best_path_index], "B");
/* Load the results of the analysis into the frame types. */
for (int j = 1; j < numFrames; j++)
frames[j]->sliceType = best_paths[best_path_index][j - 1] == 'B' ? X265_TYPE_B : X265_TYPE_P;
}//Force the last frame (frames[numFrames]) to P
frames[numFrames]->sliceType = X265_TYPE_P;
}
else if (m_param->bFrameAdaptive == X265_B_ADAPT_FAST)
{
CostEstimateGroup estGroup(*this, frames);
int64_t cost1p0, cost2p0, cost1b1, cost2p1;
for (int i = 0; i <= numFrames - 2; )
{
cost2p1 = estGroup.singleCost(i + 0, i + 2, i + 2, true);
if (frames[i + 2]->intraMbs[2] > cuCount / 2)
{
frames[i + 1]->sliceType = X265_TYPE_P;
frames[i + 2]->sliceType = X265_TYPE_P;
i += 2;
continue;
}
cost1b1 = estGroup.singleCost(i + 0, i + 2, i + 1);
cost1p0 = estGroup.singleCost(i + 0, i + 1, i + 1);
cost2p0 = estGroup.singleCost(i + 1, i + 2, i + 2);
if (cost1p0 + cost2p0 < cost1b1 + cost2p1)
{
frames[i + 1]->sliceType = X265_TYPE_P;
i += 1;
continue;
}
// arbitrary and untuned
#define INTER_THRESH 300
#define P_SENS_BIAS (50 - m_param->bFrameBias)
frames[i + 1]->sliceType = X265_TYPE_B;
int j;
for (j = i + 2; j <= X265_MIN(i + m_param->bframes, numFrames - 1); j++)
{
int64_t pthresh = X265_MAX(INTER_THRESH - P_SENS_BIAS * (j - i - 1), INTER_THRESH / 10);
int64_t pcost = estGroup.singleCost(i + 0, j + 1, j + 1, true);
if (pcost > pthresh * cuCount || frames[j + 1]->intraMbs[j - i + 1] > cuCount / 3)
break;
frames[j]->sliceType = X265_TYPE_B;
}
frames[j]->sliceType = X265_TYPE_P;
i = j;
}
frames[numFrames]->sliceType = X265_TYPE_P;
numBFrames = 0;
while (numBFrames < numFrames && frames[numBFrames + 1]->sliceType == X265_TYPE_B)
numBFrames++;
}
else
{
numBFrames = X265_MIN(numFrames - 1, m_param->bframes);
for (int j = 1; j < numFrames; j++)
frames[j]->sliceType = (j % (numBFrames + 1)) ? X265_TYPE_B : X265_TYPE_P;
frames[numFrames]->sliceType = X265_TYPE_P;
}
//Decide whether RADL is forced
int zoneRadl = m_param->rc.zonefileCount && m_param->bResetZoneConfig ? m_param->rc.zones->zoneParam->radl : 0;
bool bForceRADL = zoneRadl || (m_param->radl && (m_param->keyframeMax == m_param->keyframeMin));
bool bLastMiniGop = (framecnt >= m_param->bframes + 1) ? false : true; //whether this is the last mini-GOP
int radl = m_param->radl ? m_param->radl : zoneRadl;
int preRADL = m_lastKeyframe + m_param->keyframeMax - radl - 1; /* Frame preceding RADL in POC order */
if (bForceRADL && (frames[0]->frameNum == preRADL) && !bLastMiniGop)
{//Forced RADL: frames[0] is the frame preceding the RADL section (preRADL) and this is not the last mini-GOP
int j = 1;
numBFrames = m_param->radl ? m_param->radl : zoneRadl;
for (; j <= numBFrames; j++)//Mark frames 1..numBFrames as B (the RADL leading pictures)
frames[j]->sliceType = X265_TYPE_B;
frames[j]->sliceType = X265_TYPE_I;
}
else /* Check scenecut and RADL on the first minigop. */
{
for (int j = 1; j < numBFrames + 1; j++)
{ //If a frame is a scene cut, or is the forced-RADL position, make it a P frame, record numAnalyzed, and stop
if (scenecut(frames, j, j + 1, false, origNumFrames) ||
(bForceRADL && (frames[j]->frameNum == preRADL)))
{
frames[j]->sliceType = X265_TYPE_P;
numAnalyzed = j;
break;
}
}
}
resetStart = bKeyframe ? 1 : X265_MIN(numBFrames + 2, numAnalyzed + 1);
}
else
{
for (int j = 1; j <= numFrames; j++)
frames[j]->sliceType = X265_TYPE_P;
resetStart = bKeyframe ? 1 : 2;
}
if (m_param->bAQMotion)
aqMotion(frames, bKeyframe);
//Build the CU tree over the decided frames
if (m_param->rc.cuTree)
cuTree(frames, X265_MIN(numFrames, m_param->keyframeMax), bKeyframe);
if (m_param->gopLookahead && (keyFrameLimit >= 0) && (keyFrameLimit <= m_param->bframes + 1) && !m_extendGopBoundary)
keyintLimit = keyFrameLimit;
if (!m_param->bIntraRefresh)
for (int j = keyintLimit + 1; j <= numFrames; j += m_param->keyframeMax)
{
frames[j]->sliceType = X265_TYPE_I;
resetStart = X265_MIN(resetStart, j + 1);
}
if (bIsVbvLookahead)
vbvLookahead(frames, numFrames, bKeyframe);
int maxp1 = X265_MIN(m_param->bframes + 1, origNumFrames);
/* Restore frame types for all frames that haven't actually been decided yet. */
for (int j = resetStart; j <= numFrames; j++)
{
frames[j]->sliceType = X265_TYPE_AUTO;
/* If any frame marked as scenecut is being restarted for sliceDecision,
* undo scene Transition flag */
if (j <= maxp1 && frames[j]->bScenecut && m_isSceneTransition)
m_isSceneTransition = false;
}
}
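The X265_B_ADAPT_NONE branch above (the fixed IBBBPBBBP-style layout) reduces to one modulo expression. The helper below is illustrative, not part of x265; it returns the pattern as a string for easy inspection:

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Sketch of the X265_B_ADAPT_NONE branch: every (numBFrames+1)-th
// frame is a P and the rest are B, with the last lookahead frame
// forced to P. Returns e.g. "BBBPBBBP" for frames 1..numFrames.
std::string fixedGopPattern(int numFrames, int bframes)
{
    int numBFrames = std::min(numFrames - 1, bframes);
    std::string types(numFrames, '?');
    for (int j = 1; j < numFrames; j++)
        types[j - 1] = (j % (numBFrames + 1)) ? 'B' : 'P';
    types[numFrames - 1] = 'P';  // frames[numFrames] is always P
    return types;
}
```

For example, 8 frames with bframes = 3 yields three B frames between each pair of P frames, exactly the fixed expansion described in the introduction.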
9. Low-resolution inter estimation: CostEstimateGroup::estimateFrameCost
Estimates the cost of coding frame b from references p0 (past) and p1 (future):
int64_t CostEstimateGroup::estimateFrameCost(LookaheadTLD& tld, int p0, int p1, int b, bool bIntraPenalty)
{
Lowres* fenc = m_frames[b];
x265_param* param = m_lookahead.m_param;
int64_t score = 0;
if (fenc->costEst[b - p0][p1 - b] >= 0 && fenc->rowSatds[b - p0][p1 - b][0] != -1)
score = fenc->costEst[b - p0][p1 - b];
else
{
bool bDoSearch[2];
bDoSearch[0] = fenc->lowresMvs[0][b - p0][0].x == 0x7FFF;
bDoSearch[1] = p1 > b && fenc->lowresMvs[1][p1 - b][0].x == 0x7FFF;
#if CHECKED_BUILD
X265_CHECK(!(p0 < b && fenc->lowresMvs[0][b - p0][0].x == 0x7FFE), "motion search batch duplication L0\n");
X265_CHECK(!(p1 > b && fenc->lowresMvs[1][p1 - b][0].x == 0x7FFE), "motion search batch duplication L1\n");
if (bDoSearch[0]) fenc->lowresMvs[0][b - p0][0].x = 0x7FFE;
if (bDoSearch[1]) fenc->lowresMvs[1][p1 - b][0].x = 0x7FFE;
#endif
fenc->weightedRef[b - p0].isWeighted = false;
if (param->bEnableWeightedPred && bDoSearch[0])
tld.weightsAnalyse(*m_frames[b], *m_frames[p0]);
fenc->costEst[b - p0][p1 - b] = 0;
fenc->costEstAq[b - p0][p1 - b] = 0;
//Use cooperative mode when not in batch mode, more than one cooperative slice is configured, and motion searches or bidir measurements are still needed
if (!m_batchMode && m_lookahead.m_numCoopSlices > 1 && ((p1 > b) || bDoSearch[0] || bDoSearch[1]))
{
/* Use cooperative mode if a thread pool is available and the cost estimate is
* going to need motion searches or bidir measurements */
memset(&m_slice, 0, sizeof(Slice) * m_lookahead.m_numCoopSlices);
m_lock.acquire();
X265_CHECK(!m_batchMode, "single CostEstimateGroup instance cannot mix batch modes\n");
m_coop.p0 = p0;
m_coop.p1 = p1;
m_coop.b = b;
m_coop.bDoSearch[0] = bDoSearch[0];
m_coop.bDoSearch[1] = bDoSearch[1];
m_jobTotal = m_lookahead.m_numCoopSlices;
m_jobAcquired = 0;
m_lock.release();
tryBondPeers(*m_lookahead.m_pool, m_jobTotal);
processTasks(-1);
waitForExit();
//Accumulate each cooperative slice's results (computed in parallel on the thread pool) into costEst and costEstAq
for (int i = 0; i < m_lookahead.m_numCoopSlices; i++)
{
fenc->costEst[b - p0][p1 - b] += m_slice[i].costEst;
fenc->costEstAq[b - p0][p1 - b] += m_slice[i].costEstAq;
if (p1 == b)
fenc->intraMbs[b - p0] += m_slice[i].intraMbs;
}
}
else
{ /* Optional HME first pass: calculate MVs at 1/16th resolution, scanning CUs from bottom-right to top-left */
bool lastRow;
if (param->bEnableHME)
{
lastRow = true;
for (int cuY = m_lookahead.m_4x4Height - 1; cuY >= 0; cuY--)
{
for (int cuX = m_lookahead.m_4x4Width - 1; cuX >= 0; cuX--)
estimateCUCost(tld, cuX, cuY, p0, p1, b, bDoSearch, lastRow, -1, 1);
lastRow = false;
}
}
lastRow = true;
for (int cuY = m_lookahead.m_8x8Height - 1; cuY >= 0; cuY--)
{
fenc->rowSatds[b - p0][p1 - b][cuY] = 0;
for (int cuX = m_lookahead.m_8x8Width - 1; cuX >= 0; cuX--)
estimateCUCost(tld, cuX, cuY, p0, p1, b, bDoSearch, lastRow, -1, 0);
lastRow = false;
}
}
score = fenc->costEst[b - p0][p1 - b];
if (b != p1)
score = score * 100 / (130 + param->bFrameBias);
fenc->costEst[b - p0][p1 - b] = score;
}
if (bIntraPenalty)
// arbitrary penalty for I-blocks after B-frames
score += score * fenc->intraMbs[b - p0] / (tld.ncu * 8);
return score;
}
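The two final adjustments above, scaling B-frame costs by 100/(130 + bFrameBias) and adding a penalty proportional to the share of intra-coded CUs, can be restated in isolation. The helper name and parameters are illustrative; the formulas mirror the code above:

```cpp
#include <cassert>
#include <cstdint>

// Sketch of the cost adjustments at the end of estimateFrameCost():
// B-frame scores are biased downward before caching, and when
// bIntraPenalty is set an extra cost proportional to intraMbs/(ncu*8)
// is added (the "arbitrary penalty for I-blocks after B-frames").
int64_t adjustFrameCost(int64_t rawCost, bool isB, int bFrameBias,
                        bool bIntraPenalty, int intraMbs, int ncu)
{
    int64_t score = rawCost;
    if (isB)
        score = score * 100 / (130 + bFrameBias);  // bias against B frames
    if (bIntraPenalty)
        score += score * intraMbs / (ncu * 8);     // penalize intra-heavy frames
    return score;
}
```

A positive bFrameBias therefore makes B frames look cheaper relative to P frames, which is how the --b-adapt decisions are steered toward or away from B placement.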
10. Low-resolution per-CU inter estimation: CostEstimateGroup::estimateCUCost
Estimates the cost of a single Coding Unit (CU):
void CostEstimateGroup::estimateCUCost(LookaheadTLD& tld, int cuX, int cuY, int p0, int p1, int b, bool bDoSearch[2], bool lastRow, int slice, bool hme)
{
Lowres *fref0 = m_frames[p0];
Lowres *fref1 = m_frames[p1];
Lowres *fenc = m_frames[b];
ReferencePlanes *wfref0 = fenc->weightedRef[b - p0].isWeighted && !hme ? &fenc->weightedRef[b - p0] : fref0;
//Locate the CU within the frame and derive the CU size, pixel offset, and related parameters
const int widthInCU = hme ? m_lookahead.m_4x4Width : m_lookahead.m_8x8Width;
const int heightInCU = hme ? m_lookahead.m_4x4Height : m_lookahead.m_8x8Height;
const int bBidir = (b < p1);
const int cuXY = cuX + cuY * widthInCU;
const int cuXY_4x4 = (cuX / 2) + (cuY / 2) * widthInCU / 2;
const int cuSize = X265_LOWRES_CU_SIZE;
const intptr_t pelOffset = cuSize * cuX + cuSize * cuY * (hme ? fenc->lumaStride/2 : fenc->lumaStride);
if ((bBidir || bDoSearch[0] || bDoSearch[1]) && hme)
tld.me.setSourcePU(fenc->lowerResPlane[0], fenc->lumaStride / 2, pelOffset, cuSize, cuSize, X265_HEX_SEARCH, m_lookahead.m_param->hmeSearchMethod[0], m_lookahead.m_param->hmeSearchMethod[1], 1);
else if((bBidir || bDoSearch[0] || bDoSearch[1]) && !hme)
tld.me.setSourcePU(fenc->lowresPlane[0], fenc->lumaStride, pelOffset, cuSize, cuSize, X265_HEX_SEARCH, m_lookahead.m_param->hmeSearchMethod[0], m_lookahead.m_param->hmeSearchMethod[1], 1);
/* A small, arbitrary bias to avoid VBV problems caused by zero-residual lookahead blocks. */
int lowresPenalty = 4;
int listDist[2] = { b - p0, p1 - b};
MV mvmin, mvmax;
int bcost = tld.me.COST_MAX;
int listused = 0;
// TODO: restrict to slices boundaries
// establish search bounds that don't cross extended frame boundaries
mvmin.x = (int32_t)(-cuX * cuSize - 8);
mvmin.y = (int32_t)(-cuY * cuSize - 8);
mvmax.x = (int32_t)((widthInCU - cuX - 1) * cuSize + 8);
mvmax.y = (int32_t)((heightInCU - cuY - 1) * cuSize + 8);
//Run motion estimation and cost computation for each reference list (one list, or two when bidir)
for (int i = 0; i < 1 + bBidir; i++)
{
int& fencCost = hme ? fenc->lowerResMvCosts[i][listDist[i]][cuXY] : fenc->lowresMvCosts[i][listDist[i]][cuXY];
int skipCost = INT_MAX;
if (!bDoSearch[i])
{
COPY2_IF_LT(bcost, fencCost, listused, i + 1);
continue;
}
int numc = 0;
MV mvc[5], mvp;
MV* fencMV = hme ? &fenc->lowerResMvs[i][listDist[i]][cuXY] : &fenc->lowresMvs[i][listDist[i]][cuXY];
ReferencePlanes* fref = i ? fref1 : wfref0;
//Gather MV candidates from the right/below neighbours into mvc (the CU scan runs bottom-up, right-to-left, so those neighbours are already searched)
/* Reverse-order MV prediction */
#define MVC(mv) mvc[numc++] = mv;
if (cuX < widthInCU - 1)
MVC(fencMV[1]);
if (!lastRow)
{
MVC(fencMV[widthInCU]);
if (cuX > 0)
MVC(fencMV[widthInCU - 1]);
if (cuX < widthInCU - 1)
MVC(fencMV[widthInCU + 1]);
}
if (fenc->lowerResMvs[0][0] && !hme && fenc->lowerResMvCosts[i][listDist[i]][cuXY_4x4] > 0)
{
MVC((fenc->lowerResMvs[i][listDist[i]][cuXY_4x4]) * 2);
}
#undef MVC
if (!numc)
mvp = 0;
else
{
ALIGN_VAR_32(pixel, subpelbuf[X265_LOWRES_CU_SIZE * X265_LOWRES_CU_SIZE]);
int mvpcost = MotionEstimate::COST_MAX;
/* measure SATD cost of each neighbor MV (estimating merge analysis)
* and use the lowest cost MV as MVP (estimating AMVP). Since all
* mvc[] candidates are measured here, none are passed to motionEstimate */
for (int idx = 0; idx < numc; idx++)
{
intptr_t stride = X265_LOWRES_CU_SIZE;
pixel *src = fref->lowresMC(pelOffset, mvc[idx], subpelbuf, stride, hme);
int cost = tld.me.bufSATD(src, stride);
COPY2_IF_LT(mvpcost, cost, mvp, mvc[idx]);
/* Except for the mv0 case, everything else is likely to have enough residual to not trigger the skip. */
if (!mvp.notZero() && bBidir)
skipCost = cost;
}
}
int searchRange = m_lookahead.m_param->bEnableHME ? (hme ? m_lookahead.m_param->hmeRange[0] : m_lookahead.m_param->hmeRange[1]) : s_merange;
/* ME will never return a cost larger than the cost @MVP, so we do not
* have to check that ME cost is more than the estimated merge cost */
if (!hme)//Full motion search refines the MVP into fencCost
fencCost = tld.me.motionEstimate(fref, mvmin, mvmax, mvp, 0, NULL, searchRange, *fencMV, m_lookahead.m_param->maxSlices);
else
fencCost = tld.me.motionEstimate(fref, mvmin, mvmax, mvp, 0, NULL, searchRange, *fencMV, m_lookahead.m_param->maxSlices, fref->lowerResPlane[0]);
if (skipCost < 64 && skipCost < fencCost && bBidir)
{
fencCost = skipCost;
*fencMV = 0;
}//Keep the cheaper cost in bcost and record the list used (COPY2_IF_LT)
COPY2_IF_LT(bcost, fencCost, listused, i + 1);
}
if (hme)
return;
//For a B frame (bBidir), also evaluate bidirectional candidates; for a P frame, also consider intra
if (bBidir) /* B, also consider bidir */
{
/* NOTE: the wfref0 (weightp) is not used for BIDIR */
//Subpel motion compensation of both references (fref0->lowresMC, fref1->lowresMC) into src0 and src1
/* avg(l0-mv, l1-mv) candidate */
ALIGN_VAR_32(pixel, subpelbuf0[X265_LOWRES_CU_SIZE * X265_LOWRES_CU_SIZE]);
ALIGN_VAR_32(pixel, subpelbuf1[X265_LOWRES_CU_SIZE * X265_LOWRES_CU_SIZE]);
intptr_t stride0 = X265_LOWRES_CU_SIZE, stride1 = X265_LOWRES_CU_SIZE;
pixel *src0 = fref0->lowresMC(pelOffset, fenc->lowresMvs[0][listDist[0]][cuXY], subpelbuf0, stride0, 0);
pixel *src1 = fref1->lowresMC(pelOffset, fenc->lowresMvs[1][listDist[1]][cuXY], subpelbuf1, stride1, 0);
//Buffer holding the averaged bidir prediction
ALIGN_VAR_32(pixel, ref[X265_LOWRES_CU_SIZE * X265_LOWRES_CU_SIZE]);
//Average the two predictions
primitives.pu[LUMA_8x8].pixelavg_pp[NONALIGNED](ref, X265_LOWRES_CU_SIZE, src0, stride0, src1, stride1, 32);
//SATD of the bidir average against the source CU
int bicost = tld.me.bufSATD(ref, X265_LOWRES_CU_SIZE);
COPY2_IF_LT(bcost, bicost, listused, 3);
/* coloc candidate */
//Co-located (zero-MV) candidate: average fref0->lowresPlane[0] and fref1->lowresPlane[0] directly at pelOffset into ref
src0 = fref0->lowresPlane[0] + pelOffset;
src1 = fref1->lowresPlane[0] + pelOffset;
primitives.pu[LUMA_8x8].pixelavg_pp[NONALIGNED](ref, X265_LOWRES_CU_SIZE, src0, fref0->lumaStride, src1, fref1->lumaStride, 32);
bicost = tld.me.bufSATD(ref, X265_LOWRES_CU_SIZE);
COPY2_IF_LT(bcost, bicost, listused, 3);
bcost += lowresPenalty;
}
else /* P, also consider intra */
{
bcost += lowresPenalty;
if (fenc->intraCost[cuXY] < bcost)
{
bcost = fenc->intraCost[cuXY];
listused = 0;
}
}
//bFrameScoreCU is false for edge CUs (unless the frame is only 1-2 CUs wide or tall)
/* do not include edge blocks in the frame cost estimates, they are not very accurate */
const bool bFrameScoreCU = (cuX > 0 && cuX < widthInCU - 1 &&
cuY > 0 && cuY < heightInCU - 1) || widthInCU <= 2 || heightInCU <= 2;
int bcostAq;
if (m_lookahead.m_param->rc.qgSize == 8)
bcostAq = (bFrameScoreCU && fenc->invQscaleFactor) ? ((bcost * fenc->invQscaleFactor8x8[cuXY] + 128) >> 8) : bcost;
else
bcostAq = (bFrameScoreCU && fenc->invQscaleFactor) ? ((bcost * fenc->invQscaleFactor[cuXY] + 128) >> 8) : bcost;
if (bFrameScoreCU)
{ //Accumulate either into the frame totals or into this cooperative slice's totals
if (slice < 0)//slice < 0 means the whole frame is being processed, not a cooperative slice
{
fenc->costEst[b - p0][p1 - b] += bcost;
fenc->costEstAq[b - p0][p1 - b] += bcostAq;
if (!listused && !bBidir)
fenc->intraMbs[b - p0]++;
}
else
{
m_slice[slice].costEst += bcost;
m_slice[slice].costEstAq += bcostAq;
if (!listused && !bBidir)
m_slice[slice].intraMbs++;
}
}
fenc->rowSatds[b - p0][p1 - b][cuY] += bcostAq;
fenc->lowresCosts[b - p0][p1 - b][cuXY] = (uint16_t)(X265_MIN(bcost, LOWRES_COST_MASK) | (listused << LOWRES_COST_SHIFT));
}
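The final line above packs the CU's result into a single 16-bit word: the low bits hold the clamped cost and the high bits record which list won (0 = intra, 1 = L0, 2 = L1, 3 = bidir). A sketch of the pack/unpack pair, with the constant values as defined in x265's lowres.h:

```cpp
#include <cassert>
#include <cstdint>

// Constants mirroring x265's lowres.h: 14 bits of cost, 2 bits of list.
static const int LOWRES_COST_MASK  = (1 << 14) - 1;
static const int LOWRES_COST_SHIFT = 14;

// Pack a CU cost and the winning reference list into 16 bits, as
// lowresCosts[][] stores them. Helper names are illustrative.
uint16_t packLowresCost(int cost, int listused)
{
    int clamped = cost < LOWRES_COST_MASK ? cost : LOWRES_COST_MASK;
    return (uint16_t)(clamped | (listused << LOWRES_COST_SHIFT));
}

int unpackCost(uint16_t packed) { return packed & LOWRES_COST_MASK; }
int unpackList(uint16_t packed) { return packed >> LOWRES_COST_SHIFT; }
```

Clamping to the 14-bit mask means very expensive CUs saturate at 16383; later consumers (e.g. cuTree) read the cost and the list bits back out with the same mask and shift.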
11. VBV rate and buffer lookahead: Lookahead::vbvLookahead
VBV lookahead estimates the bitrate and buffer occupancy of upcoming frames so that rate control can manage the decoder buffer.
void Lookahead::vbvLookahead(Lowres **frames, int numFrames, int keyframe)
{
int prevNonB = 0, curNonB = 1, idx = 0;
//From the slice types, locate the current non-B frame (curNonB) and the next non-B frame (nextNonB)
while (curNonB < numFrames && IS_X265_TYPE_B(frames[curNonB]->sliceType))
curNonB++;
int nextNonB = keyframe ? prevNonB : curNonB;
int nextB = prevNonB + 1;
int nextBRef = 0, curBRef = 0;
if (m_param->bBPyramid && curNonB - prevNonB > 1)
curBRef = (prevNonB + curNonB + 1) / 2;
int miniGopEnd = keyframe ? prevNonB : curNonB;
//Walk each non-B frame in the array
while (curNonB <= numFrames)
{ //Record the planned cost (plannedSatd) and type (plannedType) for this P or I frame
/* P/I cost: This shouldn't include the cost of nextNonB */
if (nextNonB != curNonB)
{
int p0 = IS_X265_TYPE_I(frames[curNonB]->sliceType) ? curNonB : prevNonB;
frames[nextNonB]->plannedSatd[idx] = vbvFrameCost(frames, p0, curNonB, curNonB);
frames[nextNonB]->plannedType[idx] = frames[curNonB]->sliceType;
/* Save the nextNonB Cost in each B frame of the current miniGop */
if (curNonB > miniGopEnd)
{
for (int j = nextB; j < miniGopEnd; j++)
{
frames[j]->plannedSatd[frames[j]->indB] = frames[nextNonB]->plannedSatd[idx];
frames[j]->plannedType[frames[j]->indB++] = frames[nextNonB]->plannedType[idx];
}
}
idx++;
}
/* Handle the B-frames: coded order */
if (m_param->bBPyramid && curNonB - prevNonB > 1)
nextBRef = (prevNonB + curNonB + 1) / 2;
for (int i = prevNonB + 1; i < curNonB; i++, idx++)
{
int64_t satdCost = 0;
int type = X265_TYPE_B;
//When the mini-GOP contains B frames (curNonB - prevNonB > 1), compute each B frame's planned cost and type, split around the B-ref if the pyramid is enabled
if (nextBRef)
{
if (i == nextBRef)
{
satdCost = vbvFrameCost(frames, prevNonB, curNonB, nextBRef);
type = X265_TYPE_BREF;
}
else if (i < nextBRef)
satdCost = vbvFrameCost(frames, prevNonB, nextBRef, i);
else
satdCost = vbvFrameCost(frames, nextBRef, curNonB, i);
}
else
satdCost = vbvFrameCost(frames, prevNonB, curNonB, i);
//Store the computed planned cost and type in the next non-B frame's (nextNonB) arrays
frames[nextNonB]->plannedSatd[idx] = satdCost;
frames[nextNonB]->plannedType[idx] = type;
/* Save the nextB Cost in each B frame of the current miniGop */
for (int j = nextB; j < miniGopEnd; j++)
{
if (curBRef && curBRef == i)
break;
if (j >= i && j != nextBRef)
continue;
frames[j]->plannedSatd[frames[j]->indB] = satdCost;
frames[j]->plannedType[frames[j]->indB++] = type;
}
}
//Advance to the next non-B frame and repeat until all frames are processed
prevNonB = curNonB;
curNonB++;
while (curNonB <= numFrames && IS_X265_TYPE_B(frames[curNonB]->sliceType))
curNonB++;
}
//Terminate nextNonB's plannedType list with X265_TYPE_AUTO
frames[nextNonB]->plannedType[idx] = X265_TYPE_AUTO;
}
12. Scene-cut detection: Lookahead::scenecut
This function detects scene changes and returns whether a real scene cut occurred:
bool Lookahead::scenecut(Lowres **frames, int p0, int p1, bool bRealScenecut, int numFrames)
{
/* Only do analysis during a normal scenecut check. */
if (bRealScenecut && m_param->bframes)
{
int origmaxp1 = p0 + 1;
/* Look ahead to avoid coding short flashes as scenecuts. */
origmaxp1 += m_param->bframes;
int maxp1 = X265_MIN(origmaxp1, numFrames);
bool fluctuate = false;
bool noScenecuts = false;
int64_t avgSatdCost = 0;
if (frames[p0]->costEst[p1 - p0][0] > -1)
avgSatdCost = frames[p0]->costEst[p1 - p0][0];
int cnt = 1;
/* Where A and B are scenes: AAAAAABBBAAAAAA
* If BBB is shorter than (maxp1-p0), it is detected as a flash
* and not considered a scenecut. */
//Guard against mistaking a brief flash for a new scene
for (int cp1 = p1; cp1 <= maxp1; cp1++)
{
if (!scenecutInternal(frames, p0, cp1, false))
{
/* Any frame in between p0 and cur_p1 cannot be a real scenecut. */
for (int i = cp1; i > p0; i--)
{
frames[i]->bScenecut = false;
noScenecuts = false;
}
}
else if (scenecutInternal(frames, cp1 - 1, cp1, false))
{ //The frame is also a cut relative to its immediately preceding frame
/* If current frame is a Scenecut from p0 frame as well as Scenecut from
* preceding frame, mark it as a Scenecut */
frames[cp1]->bScenecut = true;
noScenecuts = true;
}
/* compute average satdcost of all the frames in the mini-gop to confirm
* whether there is any great fluctuation among them to rule out false positives */
X265_CHECK(frames[cp1]->costEst[cp1 - p0][0]!= -1, "costEst is not done \n");
avgSatdCost += frames[cp1]->costEst[cp1 - p0][0];
cnt++;
}
/* Identify possible scene fluctuations by comparing the satd cost of the frames.
* This could denote the beginning or ending of scene transitions.
* During a scene transition(fade in/fade outs), if fluctuate remains false,
* then the scene had completed its transition or stabilized */
if (noScenecuts)
{
fluctuate = false;
avgSatdCost /= cnt;
for (int i = p1; i <= maxp1; i++)
{
int64_t curCost = frames[i]->costEst[i - p0][0];
int64_t prevCost = frames[i - 1]->costEst[i - 1 - p0][0];
//Flag a fluctuation when the SATD cost deviates by more than 10% from the mini-GOP average or from the previous frame's cost
if (fabs((double)(curCost - avgSatdCost)) > 0.1 * avgSatdCost ||
fabs((double)(curCost - prevCost)) > 0.1 * prevCost)
{
fluctuate = true;
if (!m_isSceneTransition && frames[i]->bScenecut)
{
m_isSceneTransition = true;//Only the first scene change of the transition needs to be marked
/* just mark the first scenechange in the scene transition as a scenecut. */
for (int j = i + 1; j <= maxp1; j++)
frames[j]->bScenecut = false;
break;
}
}
frames[i]->bScenecut = false;
}
}
if (!fluctuate && !noScenecuts)
m_isSceneTransition = false; /* Signal end of scene transitioning */
}
if (m_param->csvLogLevel >= 2)
{
int64_t icost = frames[p1]->costEst[0][0];
int64_t pcost = frames[p1]->costEst[p1 - p0][0];
frames[p1]->ipCostRatio = (double)icost / pcost;
}
/* A frame is always analysed with bRealScenecut = true first, and then bRealScenecut = false,
the former for I decisions and the latter for P/B decisions. It's possible that the first
analysis detected scenecuts which were later nulled due to scene transitioning, in which
case do not return a true scenecut for this frame */
if (!frames[p1]->bScenecut)
return false;
//Finally, return whether p1 is an actual scene cut
return scenecutInternal(frames, p0, p1, bRealScenecut);
}
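The flash filter described in the AAAAAABBBAAAAAA comment can be sketched on its own: a frame keeps its scenecut flag only if it is a cut both from p0 and from its immediate predecessor, while any frame inside a window whose content returns to scene A is treated as a flash and cleared. Here `isCut(a, b)` is a hypothetical stand-in for scenecutInternal(frames, a, b, false), and the per-frame flags stand in for bScenecut (which x265 initializes to true elsewhere):

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Simplified flash filter over frames p0..maxp1: returns the surviving
// scenecut flags. Frames inside a window that is not a cut from p0 are
// flashes; a frame that is a cut from both p0 and its predecessor is
// confirmed as a genuine cut.
std::vector<bool> filterFlashes(int p0, int p1, int maxp1,
                                const std::function<bool(int, int)>& isCut)
{
    std::vector<bool> scenecut(maxp1 + 1, true);
    for (int cp1 = p1; cp1 <= maxp1; cp1++)
    {
        if (!isCut(p0, cp1))
        {
            for (int i = cp1; i > p0; i--)
                scenecut[i] = false;   // flash: content returned to scene A
        }
        else if (isCut(cp1 - 1, cp1))
        {
            scenecut[cp1] = true;      // cut from both references: keep it
        }
    }
    return scenecut;
}
```

A three-frame flash (A A B B A A A) is fully suppressed, while a persistent change (A A A B B B B) keeps its cut at the transition frame.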
13. Frame-structure path cost: Lookahead::slicetypePathCost
Implements the cost evaluation used by the X265_B_ADAPT_TRELLIS frame-structure scheme:
int64_t Lookahead::slicetypePathCost(Lowres **frames, char *path, int64_t threshold)
{
int64_t cost = 0;
int loc = 1;//Index into the path, starting at its first element
int cur_p = 0;//Index of the current P frame
CostEstimateGroup estGroup(*this, frames);
path--; /* Since the 1st path element is really the second frame */
while (path[loc])//Walk the path elements until the terminating NUL
{
int next_p = loc;
/* Find the location of the next P-frame. */
while (path[next_p] != 'P')
next_p++;
/* Add the cost of the P-frame found above */
cost += estGroup.singleCost(cur_p, next_p, next_p);
/* Early terminate if the cost we have found is larger than the best path cost so far */
if (cost > threshold)
break;
//With the B pyramid and a gap larger than 2, cost the middle B-ref against (cur_p, next_p), then each remaining B frame against its half of the mini-GOP
if (m_param->bBPyramid && next_p - cur_p > 2)
{
int middle = cur_p + (next_p - cur_p) / 2;
cost += estGroup.singleCost(cur_p, next_p, middle);
for (int next_b = loc; next_b < middle && cost < threshold; next_b++)
cost += estGroup.singleCost(cur_p, middle, next_b);
for (int next_b = middle + 1; next_b < next_p && cost < threshold; next_b++)
cost += estGroup.singleCost(middle, next_p, next_b);
}
else//No pyramid (or gap <= 2): each B frame between cur_p and next_p references the two P frames
{
for (int next_b = loc; next_b < next_p && cost < threshold; next_b++)
cost += estGroup.singleCost(cur_p, next_p, next_b);
}
loc = next_p + 1;
cur_p = next_p;
}
return cost;
}
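The path walk above can be restated compactly: given a path such as "BBP" describing frames 1..n, sum the cost of each P frame against the previous P, plus each intervening B frame against the surrounding P pair. The sketch below omits B-pyramid handling and the early-termination threshold for brevity; `singleCost(p0, p1, b)` is a stand-in for estGroup.singleCost():

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

// Simplified slicetypePathCost: path[0] describes frame 1, so frame j
// maps to path[j - 1]. Each mini-GOP contributes the P frame's cost
// plus the cost of every B frame inside it.
int64_t pathCost(const char* path,
                 const std::function<int64_t(int, int, int)>& singleCost)
{
    int64_t cost = 0;
    int loc = 1, curP = 0;
    while (path[loc - 1])
    {
        int nextP = loc;
        while (path[nextP - 1] != 'P')
            nextP++;                   // find the next P frame
        cost += singleCost(curP, nextP, nextP);
        for (int b = loc; b < nextP; b++)
            cost += singleCost(curP, nextP, b);   // B frames in between
        loc = nextP + 1;
        curP = nextP;
    }
    return cost;
}
```

With a toy cost equal to the reference distance p1 - p0, "PP" costs 2 while "BP" costs 4, showing how the trellis trades per-frame B savings against the longer prediction distances they create.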
14. CU-tree construction and processing: Lookahead::cuTree
Builds and processes the CU tree over the given frame array, propagating each frame's cost information back to the frames it references:
//Build and process the CU tree for the given array of frames
void Lookahead::cuTree(Lowres **frames, int numframes, bool bIntra)
{
int idx = !bIntra;
int lastnonb, curnonb = 1;
int bframes = 0;
x265_emms();
double totalDuration = 0.0;
for (int j = 0; j <= numframes; j++)
totalDuration += (double)m_param->fpsDenom / m_param->fpsNum;
double averageDuration = totalDuration / (numframes + 1);
int i = numframes;
while (i > 0 && frames[i]->sliceType == X265_TYPE_B)
i--;
lastnonb = i;
/* Lookaheadless MB-tree is not a theoretically distinct case; the same extrapolation could
* be applied to the end of a lookahead buffer of any size. However, it's most needed when
* lookahead=0, so that's what's currently implemented. */
if (!m_param->lookaheadDepth)
{
if (bIntra)
{ //lookaheadDepth == 0: for a keyframe, clear propagateCost and seed qpCuTreeOffset from the AQ offsets, then return
memset(frames[0]->propagateCost, 0, m_cuCount * sizeof(uint16_t));
if (m_param->rc.qgSize == 8)
memcpy(frames[0]->qpCuTreeOffset, frames[0]->qpAqOffset, m_cuCount * 4 * sizeof(double));
else
memcpy(frames[0]->qpCuTreeOffset, frames[0]->qpAqOffset, m_cuCount * sizeof(double));
return;
}
std::swap(frames[lastnonb]->propagateCost, frames[0]->propagateCost);
memset(frames[0]->propagateCost, 0, m_cuCount * sizeof(uint16_t));
}
else
{
if (lastnonb < idx)
return;
memset(frames[lastnonb]->propagateCost, 0, m_cuCount * sizeof(uint16_t));
}
CostEstimateGroup estGroup(*this, frames);
while (i-- > idx)
{ //Walk backwards through the frame sequence from the last non-B frame
curnonb = i;
while (frames[curnonb]->sliceType == X265_TYPE_B && curnonb > 0)
curnonb--;
if (curnonb < idx)
break;
estGroup.singleCost(curnonb, lastnonb, lastnonb);
memset(frames[curnonb]->propagateCost, 0, m_cuCount * sizeof(uint16_t));
bframes = lastnonb - curnonb - 1;
if (m_param->bBPyramid && bframes > 1)
{
int middle = (bframes + 1) / 2 + curnonb;
estGroup.singleCost(curnonb, lastnonb, middle);
memset(frames[middle]->propagateCost, 0, m_cuCount * sizeof(uint16_t));
while (i > curnonb)
{
int p0 = i > middle ? middle : curnonb;
int p1 = i < middle ? middle : lastnonb;
if (i != middle)
{ //Cost each B frame against its references and propagate its CU-tree information backwards
estGroup.singleCost(p0, p1, i);
estimateCUPropagate(frames, averageDuration, p0, p1, i, 0);
}
i--;
}
estimateCUPropagate(frames, averageDuration, curnonb, lastnonb, middle, 1);
}
else
{
while (i > curnonb)
{ //No pyramid: cost and propagate every B frame against (curnonb, lastnonb)
estGroup.singleCost(curnonb, lastnonb, i);
estimateCUPropagate(frames, averageDuration, curnonb, lastnonb, i, 0);
i--;
}
}
estimateCUPropagate(frames, averageDuration, curnonb, lastnonb, lastnonb, 1);
lastnonb = curnonb;
}
if (!m_param->lookaheadDepth)
{
estGroup.singleCost(0, lastnonb, lastnonb);
estimateCUPropagate(frames, averageDuration, 0, lastnonb, lastnonb, 1);
std::swap(frames[lastnonb]->propagateCost, frames[0]->propagateCost);
}
//After all frame costs and propagation passes are done, finalize the CU tree into QP offsets
cuTreeFinish(frames[lastnonb], averageDuration, lastnonb);
if (m_param->bBPyramid && bframes > 1 && !m_param->rc.vbvBufferSize)
cuTreeFinish(frames[lastnonb + (bframes + 1) / 2], averageDuration, 0);
}