Video coding in WebRTC

Introduction to layered video coding

Video coding is the process of encoding a stream of uncompressed video frames into a compressed bitstream, whose bitrate is lower than that of the original stream.
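
For a sense of the magnitudes involved (the numbers below are illustrative and not taken from WebRTC): an uncompressed 1280x720 stream at 30 fps in 8-bit YUV 4:2:0 occupies roughly 330 Mbit/s, while a typical real-time encode of the same content might target 1-2 Mbit/s.

```cpp
// Illustrative bitrate arithmetic only; not WebRTC code.
#include <cstdio>

int main() {
  const int width = 1280, height = 720, fps = 30;
  const double bytes_per_pixel = 1.5;  // 8-bit YUV 4:2:0: full-size Y, quarter-size U and V.
  const double raw_bps = width * height * bytes_per_pixel * 8 * fps;
  const double encoded_bps = 1.5e6;    // An assumed, typical real-time target bitrate.
  std::printf("raw: %.0f Mbit/s, encoded: %.1f Mbit/s, ratio: %.0f:1\n",
              raw_bps / 1e6, encoded_bps / 1e6, raw_bps / encoded_bps);
}
```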

Block-based hybrid video coding

All video codecs in WebRTC are based on the block-based hybrid video coding paradigm, which entails prediction of the original video frame using either information from previously encoded frames or information from previously encoded portions of the current frame, subtraction of the prediction from the original video, and transform and quantization of the resulting difference. The output of the quantization process, quantized transform coefficients, is losslessly entropy coded along with other encoder parameters (e.g., those related to the prediction process) and then a reconstruction is constructed by inverse quantizing and inverse transforming the quantized transform coefficients and adding the result to the prediction. Finally, in-loop filtering is applied and the resulting reconstruction is stored as a reference frame to be used to develop predictions for future frames.
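
The loop described above can be made concrete with a toy one-dimensional example. The sketch below is not WebRTC code and omits blocks, transforms, entropy coding and in-loop filtering; it only illustrates the closed-loop aspect of hybrid coding: the encoder reconstructs exactly what the decoder will see and predicts the next sample from that reconstruction, not from the original.

```cpp
// Toy predict / subtract / quantize / reconstruct loop on a 1-D signal.
#include <cstdio>
#include <vector>

int main() {
  const std::vector<int> original = {100, 102, 105, 104, 110, 120, 119};
  const int step = 4;  // Quantization step: coarser step -> lower bitrate, larger error.
  int reference = 0;   // Reconstructed previous sample, shared with the decoder.
  for (int sample : original) {
    int prediction = reference;            // Predict from previously coded data.
    int residual = sample - prediction;    // Subtract the prediction.
    int quantized = (residual >= 0 ? residual + step / 2 : residual - step / 2) / step;
    // `quantized` is what would be entropy coded and transmitted.
    int dequantized = quantized * step;    // Inverse quantization, as the decoder does it.
    reference = prediction + dequantized;  // The reconstruction becomes the next reference.
    std::printf("original %3d -> reconstructed %3d\n", sample, reference);
  }
}
```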

Frame types

When an encoded frame depends on previously encoded frames (i.e., it has one or more inter-frame dependencies), the prior frames must be available at the receiver before the current frame can be decoded. In order for a receiver to start decoding an encoded bitstream, a frame which has no prior dependencies is required. Such a frame is called a “key frame”. For real-time-communications encoding, key frames typically compress less efficiently than “delta frames” (i.e., frames whose predictions are derived from previously encoded frames).
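
The dependency structure can be thought of as a directed graph over frames: a frame is decodable only once all frames it references have been decoded, and a key frame is simply a frame with no outgoing dependencies. A minimal sketch of that check, with hypothetical types rather than WebRTC's internal representation:

```cpp
// A frame can be decoded only if every frame it depends on has been decoded.
#include <set>
#include <vector>

struct Frame {
  int id;
  std::vector<int> dependencies;  // Ids of referenced frames; empty => key frame.
};

bool IsDecodable(const Frame& frame, const std::set<int>& decoded_ids) {
  for (int dep : frame.dependencies) {
    if (decoded_ids.count(dep) == 0) return false;
  }
  return true;
}
```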

Single-layer coding

In 1:1 calls, the encoded bitstream has a single recipient. Using end-to-end bandwidth estimation, the target bitrate can thus be well tailored for the intended recipient. The number of key frames can be kept to a minimum and the compressibility of the stream can be maximized. One way of achieving this is by using “single-layer coding”, where each delta frame only depends on the frame that was most recently encoded.
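
In terms of the hypothetical Frame type sketched above, single-layer coding produces a plain chain of references: frame 0 is a key frame and every frame N > 0 depends only on frame N-1.

```cpp
// Single-layer dependency chain, reusing the hypothetical Frame type from the
// previous sketch: frame 0 is a key frame, frame N references frame N-1.
std::vector<Frame> BuildSingleLayerStream(int num_frames) {
  std::vector<Frame> frames;
  for (int i = 0; i < num_frames; ++i) {
    Frame frame;
    frame.id = i;
    if (i > 0) frame.dependencies = {i - 1};  // Depend only on the most recently encoded frame.
    frames.push_back(frame);
  }
  return frames;
}
```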

Scalable video coding

In multiway conferences, on the other hand, the encoded bitstream has multiple recipients, each of whom may have a different downlink bandwidth. In order to tailor the encoded bitstreams to a heterogeneous network of receivers, scalable video coding can be used. The idea is to introduce structure into the dependency graph of the encoded bitstream, such that layers of the full stream can be decoded using only available lower layers. This structure allows a selective forwarding unit to discard upper layers of the bitstream in order to achieve the intended downlink bandwidth.
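
A hedged sketch of the forwarding decision such a structure enables: the selective forwarding unit keeps a frame only if its layer is at or below the layers selected for a given receiver. The field names below (spatial_id, temporal_id) mirror the kind of layer information signaled alongside the bitstream, but the types are illustrative, not WebRTC's.

```cpp
// Illustrative SFU-side layer filtering; real SFUs must also make sure the
// remaining stream stays decodable (e.g. only switch layers at suitable frames).
struct LayerInfo {
  int spatial_id = 0;   // 0 = lowest resolution layer.
  int temporal_id = 0;  // 0 = lowest framerate layer.
};

bool ShouldForward(const LayerInfo& frame, int max_spatial_id, int max_temporal_id) {
  return frame.spatial_id <= max_spatial_id && frame.temporal_id <= max_temporal_id;
}
```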

There are multiple types of scalability:

  • Temporal scalability: layers whose framerate (and bitrate) is lower than that of the upper layer(s)
  • Spatial scalability: layers whose resolution (and bitrate) is lower than that of the upper layer(s)
  • Quality scalability: layers whose bitrate is lower than that of the upper layer(s)

WebRTC supports temporal scalability for VP8, VP9, and AV1, and spatial scalability for VP9 and AV1.
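
In the API these structures are requested with scalability mode strings such as "L1T3" (one spatial layer, three temporal layers) or "L3T3" (three spatial and three temporal layers). As a hedged sketch, assuming a recent WebRTC version where webrtc::RtpEncodingParameters exposes a scalability_mode field (check the headers of the version you build against):

```cpp
#include "api/rtp_parameters.h"

// Request one spatial layer and three temporal layers on a send encoding.
webrtc::RtpEncodingParameters MakeTemporalSvcEncoding() {
  webrtc::RtpEncodingParameters encoding;
  encoding.scalability_mode = "L1T3";    // Field availability depends on the WebRTC version.
  encoding.max_bitrate_bps = 1'500'000;  // Optional cap for the whole encoding.
  return encoding;
}
```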

Simulcast

Simulcast is another approach for multiway conferencing, where multiple independent bitstreams are produced by the encoder.

In cases where multiple encodings of the same source are required (e.g., uplink transmission in a multiway call), spatial scalability with inter-layer prediction generally offers superior coding efficiency compared with simulcast. When a single encoding is required (e.g., downlink transmission in any call), simulcast generally provides better coding efficiency for the upper spatial layers. The K-SVC concept, where spatial inter-layer dependencies are only used to encode key frames, for which inter-layer prediction is typically significantly more effective than it is for delta frames, can be seen as a compromise between full spatial scalability and simulcast.
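
As a hedged, configuration-level illustration of the difference, assuming the webrtc::RtpEncodingParameters fields below exist in the WebRTC version in use: simulcast is several independent encodings of the same source at different scales, while K-SVC is a single encoding with a "*_KEY" scalability mode.

```cpp
#include <vector>
#include "api/rtp_parameters.h"

// Simulcast: three independent encodings at 1/4, 1/2 and full resolution.
std::vector<webrtc::RtpEncodingParameters> MakeSimulcastEncodings() {
  std::vector<webrtc::RtpEncodingParameters> encodings(3);
  encodings[0].scale_resolution_down_by = 4.0;
  encodings[1].scale_resolution_down_by = 2.0;
  encodings[2].scale_resolution_down_by = 1.0;
  return encodings;
}

// K-SVC: a single encoding with three spatial and three temporal layers, where
// inter-layer prediction is only used on key frames.
webrtc::RtpEncodingParameters MakeKsvcEncoding() {
  webrtc::RtpEncodingParameters encoding;
  encoding.scalability_mode = "L3T3_KEY";
  return encoding;
}
```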

Overview of implementation in modules/video_coding

Given the general introduction to video coding above, we now describe some specifics of the modules/video_coding folder in WebRTC.

Built-in software codecs in modules/video_coding/codecs

This folder contains WebRTC-specific classes that wrap software codec implementations for different video coding standards (VP8, VP9, AV1, and H.264).

Users of the library can also inject their own codecs, using the VideoEncoderFactory and VideoDecoderFactory interfaces. This is how platform-supported codecs, such as hardware-backed codecs, are implemented.
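
A hedged sketch of the injection point follows. The factories are handed over when the peer connection factory is created; the exact CreatePeerConnectionFactory overload has changed between releases, so adapt the signature to the version you build against. A custom or hardware-backed factory can be passed in place of the built-in ones.

```cpp
#include "api/audio_codecs/builtin_audio_decoder_factory.h"
#include "api/audio_codecs/builtin_audio_encoder_factory.h"
#include "api/create_peerconnection_factory.h"
#include "api/video_codecs/builtin_video_decoder_factory.h"
#include "api/video_codecs/builtin_video_encoder_factory.h"
#include "rtc_base/thread.h"

rtc::scoped_refptr<webrtc::PeerConnectionFactoryInterface> CreateFactory(
    rtc::Thread* network_thread,
    rtc::Thread* worker_thread,
    rtc::Thread* signaling_thread) {
  return webrtc::CreatePeerConnectionFactory(
      network_thread, worker_thread, signaling_thread,
      /*default_adm=*/nullptr,
      webrtc::CreateBuiltinAudioEncoderFactory(),
      webrtc::CreateBuiltinAudioDecoderFactory(),
      // A VideoEncoderFactory / VideoDecoderFactory implementation wrapping
      // platform or hardware codecs can be injected here instead.
      webrtc::CreateBuiltinVideoEncoderFactory(),
      webrtc::CreateBuiltinVideoDecoderFactory(),
      /*audio_mixer=*/nullptr,
      /*audio_processing=*/nullptr);
}
```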

Video codec test framework in modules/video_coding/codecs/test

This folder contains a test framework that can be used to evaluate video quality performance of different video codec implementations.

SVC helper classes in modules/video_coding/svc

Utility classes in modules/video_coding/utility

  • FrameDropper - drops incoming frames when the encoder systematically overshoots its target bitrate
  • FramerateController - drops incoming frames to achieve a target framerate (a simplified version of this idea is sketched after this list)
  • QpParser - parses the quantization parameter from a bitstream
  • QualityScaler - signals when an encoder generates encoded frames whose quantization parameter is outside the window of acceptable values
  • SimulcastRateAllocator - bitrate allocation to simulcast layers
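
The sketch below illustrates the idea behind FramerateController in a much simplified form: keep a frame only if enough time has passed since the previously kept frame. It is not the actual implementation in modules/video_coding/utility.

```cpp
#include <cstdint>

// Simplified framerate limiter: drop frames that arrive too soon after the
// previously kept frame.
class SimpleFramerateLimiter {
 public:
  explicit SimpleFramerateLimiter(double max_fps)
      : min_frame_interval_us_(static_cast<int64_t>(1e6 / max_fps)) {}

  // Returns true if the frame with this capture timestamp should be kept.
  bool KeepFrame(int64_t timestamp_us) {
    if (last_kept_timestamp_us_ >= 0 &&
        timestamp_us - last_kept_timestamp_us_ < min_frame_interval_us_) {
      return false;  // Too soon: drop this frame.
    }
    last_kept_timestamp_us_ = timestamp_us;
    return true;
  }

 private:
  const int64_t min_frame_interval_us_;
  int64_t last_kept_timestamp_us_ = -1;
};
```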

General helper classes in modules/video_coding

Receiver buffer classes in modules/video_coding

  • PacketBuffer - (re-)combines RTP packets into frames
  • RtpFrameReferenceFinder - determines dependencies between frames based on information in the RTP header, payload header and RTP extensions
  • FrameBuffer - orders frames based on their dependencies before they are fed to the decoder (a simplified version of this idea is sketched after this list)
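
The sketch below shows the idea behind FrameBuffer in a much simplified form: hold on to assembled frames and release them, in order, once everything they reference has been decoded. It is not the actual FrameBuffer implementation, which also deals with timing, reordering limits and old-frame cleanup.

```cpp
#include <map>
#include <set>
#include <vector>

struct ReceivedFrame {
  int id;
  std::vector<int> dependencies;  // Ids of frames this frame references.
};

// Simplified dependency-based ordering of received frames.
class SimpleFrameOrderer {
 public:
  // Inserts an assembled frame and returns all frames that became decodable.
  std::vector<ReceivedFrame> Insert(ReceivedFrame frame) {
    pending_.emplace(frame.id, frame);
    std::vector<ReceivedFrame> decodable;
    bool progress = true;
    while (progress) {
      progress = false;
      for (auto it = pending_.begin(); it != pending_.end(); ++it) {
        if (AllDecoded(it->second.dependencies)) {
          decoded_ids_.insert(it->first);
          decodable.push_back(it->second);
          pending_.erase(it);
          progress = true;
          break;  // The iterator is invalidated; restart the scan.
        }
      }
    }
    return decodable;
  }

 private:
  bool AllDecoded(const std::vector<int>& deps) const {
    for (int dep : deps) {
      if (decoded_ids_.count(dep) == 0) return false;
    }
    return true;
  }

  std::map<int, ReceivedFrame> pending_;
  std::set<int> decoded_ids_;
};
```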