A Qualitative Overview of x264's Ratecontrol Methods [Translation]

Category: H264

Article link: https://blog.csdn.net/beidoubushixing/article/details/107313081

By Loren Merritt

*Historical note: This document is outdated, but a significant part of it is still accurate. Here are some important ways ratecontrol has changed since the authoring of this document:

By default, MB-tree is used instead of qcomp for weighting frame quality based on complexity. MB-tree is effectively a generalization of qcomp to the macroblock level. MB-tree also replaces the constant offsets for B-frame quantizers. The legacy algorithm is still available for low-latency applications.
Adaptive quantization is now used to distribute quality among each frame; frames are no longer constant quantizer, even if MB-tree is off.
VBV runs per-row rather than per-frame to improve accuracy.*

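Note: this document is outdated, but a significant part of it is still accurate. Since it was written, ratecontrol has changed in several important ways:
1. By default, MB-tree replaces qcomp for weighting frame quality by complexity. MB-tree is effectively a generalization of qcomp to the macroblock level. MB-tree also replaces the constant offsets for B-frame quantizers. The legacy algorithm remains available for low-latency applications.
2. Adaptive quantization now distributes quality within each frame; a frame no longer uses a single constant quantizer, even when MB-tree is off.
3. VBV runs per row rather than per frame, which improves accuracy.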

*x264’s ratecontrol is based on libavcodec’s, and is mostly empirical. But I can retroactively propose the following theoretical points which underlie most of the algorithms:

You want the movie to be somewhere approaching constant quality. However, constant quality does not mean constant PSNR nor constant QP. Details are less noticeable in high-complexity or high-motion scenes, so you can get away with somewhat higher QP for the same perceived quality.
On the other hand, you get more quality per bit if you spend those bits in scenes where motion compensation works well: A given artifact may stick around several seconds in a low-motion scene, and you only have to fix it in one frame to improve the quality of the whole scene.
Both of the above are correlated with the number of bits it takes to encode a frame at a given QP.
Given one encoding of a frame, we can predict the number of bits needed to encode it at a different QP. This prediction gets less accurate if the QPs are far apart.
The importance of a frame depends on the number of other frames that are predicted from it. Hence I-frames get reduced QP depending on the number and complexity of following inter-frames, disposable B-frames get higher QP than P-frames, and referenced B-frames are between P-frames and disposable B-frames.*

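x264's ratecontrol is based on libavcodec's and is mostly empirical. In hindsight, though, the following theoretical points can be seen to underlie most of the algorithms:

- You want the movie to be close to constant quality. However, constant quality means neither constant PSNR nor constant QP. Detail is less noticeable in high-complexity or high-motion scenes, so a somewhat higher QP there still gives the same perceived quality.
- On the other hand, bits buy more quality when they are spent in scenes where motion compensation works well: an artifact can persist for several seconds in a low-motion scene, and fixing it in a single frame improves the quality of the whole scene.
- Both effects are correlated with the number of bits it takes to encode a frame at a given QP.
- Given one encoding of a frame, the number of bits needed to encode it at a different QP can be predicted. The prediction becomes less accurate as the two QPs move further apart.
- The importance of a frame depends on how many other frames are predicted from it. I-frames therefore get a lower QP, by an amount that depends on the number and complexity of the inter-frames that follow; disposable B-frames get a higher QP than P-frames; and referenced B-frames fall between P-frames and disposable B-frames.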
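
The last point can be illustrated with a minimal Python sketch that maps frame type to QP. This is an illustration under stated assumptions, not x264's code: the constants in `qp_to_qscale` (0.85 and 12) and the ratios `IP_RATIO` and `PB_RATIO` are assumed values chosen to resemble common x264 defaults, and placing referenced B-frames at the geometric midpoint between P and disposable B is a simplification made here.

```python
import math

# Assumed ratios between quantizer scales of different frame types.
IP_RATIO = 1.4  # assumption: I-frame qscale is P-frame qscale / 1.4
PB_RATIO = 1.3  # assumption: disposable B-frame qscale is P-frame qscale * 1.3


def qp_to_qscale(qp: float) -> float:
    """Map QP to a linear quantizer scale; the scale doubles every 6 QP."""
    return 0.85 * 2.0 ** ((qp - 12.0) / 6.0)


def qscale_to_qp(qscale: float) -> float:
    """Inverse of qp_to_qscale."""
    return 12.0 + 6.0 * math.log2(qscale / 0.85)


def frame_qps(p_qp: float) -> dict:
    """Derive a QP per frame type from the P-frame QP."""
    p_qscale = qp_to_qscale(p_qp)
    return {
        "I": qscale_to_qp(p_qscale / IP_RATIO),               # most referenced
        "P": p_qp,
        "B_ref": qscale_to_qp(p_qscale * math.sqrt(PB_RATIO)),  # in between
        "B": qscale_to_qp(p_qscale * PB_RATIO),               # never referenced
    }


if __name__ == "__main__":
    for frame_type, qp in frame_qps(26.0).items():
        print(f"{frame_type}: QP {qp:.2f}")
```

The ordering I < P < B_ref < B follows directly from how many other frames depend on each type. With MB-tree enabled, as the historical note says, these constant offsets are no longer what decides B-frame quantizers.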

The modes:

2pass:

Given some data about each frame of a 1st pass (e.g. generated by 1pass ABR, below), we try to choose QPs to maximize quality while matching a specified total size. This is separated into 3 parts: (1) Before starting the 2nd pass, select the relative number of bits to allocate between frames. This pays no attention to the total size of the encode. The default formula, empirically selected to balance between the 1st 2 theoretical points, is “complexity ** 0.6”, where complexity is defined to be the bit size of the frame at a constant QP (estimated from the 1st pass). (2) Scale the results of (1) to fill the requested total size. Optional: Impose VBV limitations. Due to nonlinearities in the frame size predictor and in VBV, this is an iterative process. (3) Now start encoding. After each frame, update future QPs to compensate for mispredictions in size. If the 2nd pass is consistently off from the predicted size (usually because we use slower compression options than the 1st pass), then we multiply all future frames’ qscales by the reciprocal of the error. Additionally, there is a short-term compensation to prevent us from deviating too far from the desired size near the beginning (when we don’t have much data for the global compensation) and near the end (when global doesn’t have time to react).

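Given per-frame data from a 1st pass (for example, one produced by the 1pass ABR mode described below), we try to choose QPs that maximize quality while matching a specified total size. This is done in three parts:
(1) Before the 2nd pass starts, select the relative number of bits to allocate to each frame. The total size of the encode is ignored at this stage. The default formula, chosen empirically to balance the first two theoretical points, is "complexity ** 0.6", where complexity is defined as the bit size of the frame at a constant QP (estimated from the 1st pass).
(2) Scale the results of (1) to fill the requested total size. Optionally, impose VBV limitations. Because both the frame size predictor and VBV are nonlinear, this is an iterative process.
(3) Now start encoding. After each frame, update future QPs to compensate for size mispredictions. If the 2nd pass is consistently off from the predicted size (usually because it uses slower compression options than the 1st pass), all future frames' qscales are multiplied by the reciprocal of the error. In addition, a short-term compensation keeps the encode from deviating too far from the desired size near the beginning (when there is not yet much data for the global compensation) and near the end (when the global compensation has no time left to react).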
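
Parts (1) to (3) can be sketched in a few lines of Python. This is a simplified illustration, not x264's implementation: `allocate_bits` and `compensate_qscale` are hypothetical helper names, the scaling is a single linear step (the real process is iterative because of the nonlinearities mentioned above), and VBV and the short-term compensation are left out.

```python
def allocate_bits(complexities, target_total_bits, qcomp=0.6):
    """Parts (1) and (2): weight each frame by complexity ** qcomp,
    then scale the weights so they sum to the requested total size."""
    weights = [c ** qcomp for c in complexities]  # relative allocation
    scale = target_total_bits / sum(weights)      # fill the target size
    return [w * scale for w in weights]


def compensate_qscale(qscale, predicted_bits, actual_bits):
    """Part (3): multiply a future qscale by the reciprocal of the error,
    where error = predicted size / actual size so far."""
    error = predicted_bits / actual_bits
    return qscale * (1.0 / error)


if __name__ == "__main__":
    # Complexity = estimated bit size of each frame at a constant QP.
    complexities = [1000.0, 8000.0, 2000.0]
    print(allocate_bits(complexities, target_total_bits=9000.0))
    # The encode is running 25% over its prediction, so qscale is raised.
    print(compensate_qscale(10.0, predicted_bits=100.0, actual_bits=125.0))
```

Because the exponent is below 1, a frame that is 8 times as complex receives only about 3.5 times as many bits, which in turn means it is encoded at a higher QP than a simple frame.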

1pass, average bitrate:

The goal is the same as in 2pass, but here we don’t have the benefit of a previous encode, so all ratecontrol must be done during the encode. (1) This is the same as in 2pass, except that instead of estimating complexity from a previous encode, we run a fast motion estimation algo over a half-resolution version of the frame, and use the SATD residuals (these are also used in the decision between P- and B-frames). Also, we don’t know the size or complexity of the following GOP, so I-frame bonus is based on the past. (2) We don’t know the complexities of future frames, so we can only scale based on the past. The scaling factor is chosen to be the one that would have resulted in the desired bitrate if it had been applied to all frames so far. (3) Overflow compensation is the same as in 2pass. By tuning the strength of compensation, you can get anywhere from near the quality of 2pass (but unpredictable size, like ± 10%) to reasonably strict filesize but lower quality.

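The goal is the same as in 2pass, but here there is no previous encode to draw on, so all ratecontrol must be done during the encode itself.
(1) This is the same as in 2pass, except that complexity is not estimated from a previous encode. Instead, a fast motion estimation algorithm is run over a half-resolution version of the frame, and the SATD residuals are used (these are also used in the decision between P- and B-frames). In addition, the size and complexity of the following GOP are unknown, so the I-frame bonus is based on the past.
(2) The complexities of future frames are unknown, so scaling can only be based on the past. The scaling factor is chosen to be the one that would have produced the desired bitrate had it been applied to all frames so far.
(3) Overflow compensation is the same as in 2pass. By tuning the strength of the compensation, you can get anywhere from near-2pass quality (but with unpredictable size, for example ±10%) to a reasonably strict file size at lower quality.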
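
Step (2) of ABR can be sketched as a running computation. This is a minimal sketch under assumptions: `abr_targets` is a hypothetical helper, complexity is taken as given (in the text it comes from SATD residuals on a half-resolution frame), and the feedback from actual encoded sizes and overflow compensation is omitted.

```python
def abr_targets(complexities, bits_per_frame, qcomp=0.6):
    """For each frame, pick the scaling factor that would have hit the
    desired bitrate if applied to all frames seen so far, then use it
    to size the current frame."""
    weight_sum = 0.0
    targets = []
    for n, complexity in enumerate(complexities, start=1):
        weight = complexity ** qcomp
        weight_sum += weight
        # Factor such that factor * sum(weights so far) == wanted bits so far.
        factor = bits_per_frame * n / weight_sum
        targets.append(factor * weight)
    return targets


if __name__ == "__main__":
    # Three simple frames followed by a complex one.
    for target in abr_targets([1.0, 1.0, 1.0, 16.0], bits_per_frame=1000.0):
        print(round(target))
```

With constant complexity every frame simply gets the average size. When a complex frame arrives, it is given more than the average, and the shortfall has to be recovered later, which is why 1pass ABR cannot guarantee the final size as tightly as 2pass.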

1pass, constant bitrate (VBV compliant):

(1) Same as ABR. (2) Scaling factor is based on a local average (dependent on VBV buffer size) instead of all past frames. (3) Overflow compensation is stricter, and has an additional term to hard limit the QPs if the VBV is near empty. Note that no hard limit is done for a full VBV, so CBR may use somewhat less than the requested bitrate. Note also that if a frame violates VBV constraints despite the best efforts of prediction, it is not re-encoded.

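(1) Same as ABR.
(2) The scaling factor is based on a local average (which depends on the VBV buffer size) instead of on all past frames.
(3) Overflow compensation is stricter, and has an additional term that hard-limits the QPs when the VBV is nearly empty. Note that no hard limit is applied when the VBV is full, so CBR may use somewhat less than the requested bitrate. Note also that if a frame violates the VBV constraints despite the best efforts of prediction, it is not re-encoded.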
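
The VBV behaviour described in (3) can be illustrated with a small leaky-bucket simulation. This is a sketch under assumptions, not x264's VBV code: `simulate_vbv` is a hypothetical helper, the buffer is modelled per frame (the historical note says x264 now works per row), and the buffer is assumed to start full.

```python
def simulate_vbv(frame_bits, bitrate, fps, buffer_size):
    """Track buffer fill per frame. Returns the indices of frames that
    needed more bits than the buffer held (violations) and the number of
    bits lost because the buffer was already full (unused bitrate)."""
    fill = float(buffer_size)     # assumption: buffer starts full
    refill = bitrate / fps        # bits arriving per frame interval
    violations = []
    unused_bits = 0.0
    for index, bits in enumerate(frame_bits):
        if bits > fill:
            violations.append(index)  # too large; not re-encoded
        fill = max(fill - bits, 0.0)
        fill += refill
        if fill > buffer_size:
            # A full buffer cannot store more: this bitrate goes unused.
            unused_bits += fill - buffer_size
            fill = float(buffer_size)
    return violations, unused_bits


if __name__ == "__main__":
    # 25 fps at 25000 bit/s gives 1000 bits per frame interval.
    print(simulate_vbv([3000, 500], bitrate=25000, fps=25, buffer_size=2000))
    print(simulate_vbv([100, 100], bitrate=25000, fps=25, buffer_size=2000))
```

The second call shows why CBR can end up below the requested bitrate: small frames leave the buffer full, and the bits that would have arrived in that time are never spent.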