Optimized shot-based encodes for 4K: Now streaming

by Aditya Mavlankar, Liwei Guo, Anush Moorthy and Anne Aaron

Netflix has an ever-expanding collection of titles which customers can enjoy in 4K resolution with a suitable device and subscription plan. Netflix creates premium bitstreams for those titles in addition to the catalog-wide 8-bit stream profiles¹. Premium features comprise a title-dependent combination of 10-bit bit-depth, 4K resolution, high frame rate (HFR) and high dynamic range (HDR) and pave the way for an extraordinary viewing experience.

The premium bitstreams, launched several years ago, were rolled out with a fixed-bitrate ladder whose 4K bitrates (8, 10, 12 and 16 Mbps) were the same regardless of content characteristics. Since then, we’ve developed algorithms such as per-title encode optimizations and per-shot dynamic optimization, but these innovations were not back-ported to the premium bitstreams. Moreover, the encoding group of pictures (GoP) duration (or keyframe period) was constant throughout the stream, causing additional inefficiency because shot boundaries did not align with GoP boundaries.

As the number of 4K titles in our catalog continues to grow and more devices support the premium features, we expect these video streams to have an increasing impact on our members and the network. We’ve worked hard over the last year to leapfrog to our most advanced encoding innovations — shot-optimized encoding and 4K VMAF model — and applied those to the premium bitstreams. More specifically, we’ve improved the traditional 4K and 10-bit ladder by employing

In this blog post, we present benefits of applying the above-mentioned optimizations to standard dynamic range (SDR) 10-bit and 4K streams (some titles are also HFR). As for HDR, our team is currently developing an HDR extension to VMAF, Netflix’s video quality metric, which will then be used to optimize the HDR streams.

¹ The 8-bit stream profiles go up to 1080p resolution.

Bitrate versus quality comparison

For a sample of titles from the 4K collection, the following plots show the rate-quality comparison of the fixed-bitrate ladder and the optimized ladder. The plots have been arranged in decreasing order of the new highest bitrate — which is now content adaptive and commensurate with the overall complexity of the respective title.

Fig. 2: Example of a sitcom episode with some action showing new highest bitrate of 8.5 Mbps
Fig. 3: Example of a sitcom episode with less action showing new highest bitrate of 6.6 Mbps
Fig. 4: Example of a 4K animation episode showing new highest bitrate of 1.8 Mbps

The bitrate as well as quality shown for any point is the average for the corresponding stream, computed over the duration of the title. The annotation next to the point is the corresponding encoding resolution; it should be noted that video received by the client device is decoded and scaled to the device’s display resolution. As for VMAF score computation, for encoding resolutions less than 4K, we follow the VMAF best practice to upscale to 4K assuming bicubic upsampling. Aside from the encoding resolution, each point is also associated with an appropriate pixel aspect ratio (PAR) to achieve a target 16:9 display aspect ratio (DAR). For example, the 640x480 encoding resolution is paired with a 4:3 PAR to achieve 16:9 DAR, consistent with the DAR for other points on the ladder.
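
The PAR arithmetic in the example above can be checked with a few lines of Python. This is a minimal sketch for illustration only; the function names `display_aspect_ratio` and `par_for_target_dar` are hypothetical and are not part of any Netflix tooling.

```python
from fractions import Fraction


def display_aspect_ratio(width, height, par):
    """DAR = (width / height) * PAR, kept exact with rational arithmetic."""
    return Fraction(width, height) * par


def par_for_target_dar(width, height, dar=Fraction(16, 9)):
    """PAR that an encoding resolution needs in order to reach the target DAR."""
    return dar / Fraction(width, height)


# 640x480 stored pixels with a 4:3 PAR are displayed at 16:9.
print(display_aspect_ratio(640, 480, Fraction(4, 3)))  # 16/9
# A 3840x2160 encode already has a 16:9 storage ratio, so it needs square pixels.
print(par_for_target_dar(3840, 2160))  # 1
```

Using `Fraction` avoids rounding issues when comparing ratios such as 16:9 across ladder points.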

The last example, showing the new highest bitrate to be 1.8 Mbps, is for a 4K animation title episode which can be very efficiently encoded. It serves as an extreme example of content-adaptive ladder optimization; it should not, however, be interpreted as all animation titles landing on similarly low bitrates.

The resolutions and bitrates for the fixed-bitrate ladder are pre-determined; minor deviation in the achieved bitrate is due to rate control in the encoder implementation not hitting the target bitrate precisely. On the other hand, each point on the optimized ladder is associated with optimal bit allocation across all shots with the goal of maximizing a video quality objective function while resulting in the corresponding average bitrate. Consequently, for the optimized encodes, the bitrate varies shot to shot depending on relative complexity and overall bit budget and in theory can reach the respective codec level maximum. Various points are constrained to different codec levels, so receivers with different decoder level capabilities can stream the corresponding subset of points up to the corresponding level.
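
One textbook way to frame bit allocation across shots is Lagrangian selection over per-shot candidate encodes. The sketch below illustrates that general idea and is not the production algorithm: the input layout, the function name `allocate`, and the bisection on the multiplier are all assumptions made for this example. It also assumes each shot's candidates trade bitrate for quality with diminishing returns.

```python
def allocate(shots, target_kbps, iters=60):
    """Pick one candidate encode per shot (hypothetical data layout).

    shots: list of {"duration": seconds, "candidates": [(kbps, quality), ...]}
    Returns the chosen candidate index for each shot. If even the cheapest
    candidates exceed the target, the cheapest selection is returned.
    """
    total_duration = sum(s["duration"] for s in shots)

    def pick(lam):
        # For a fixed multiplier, each shot independently maximizes
        # quality - lam * bitrate (the shot duration factors out).
        return [
            max(range(len(s["candidates"])),
                key=lambda i: s["candidates"][i][1] - lam * s["candidates"][i][0])
            for s in shots
        ]

    def average_kbps(choice):
        bits = sum(s["duration"] * s["candidates"][i][0]
                   for s, i in zip(shots, choice))
        return bits / total_duration

    lo, hi = 0.0, 1.0
    # Grow the upper bound until the selection fits the budget.
    while average_kbps(pick(hi)) > target_kbps and hi < 1e9:
        hi *= 2.0
    # Bisect: lo stays over budget, hi stays within budget.
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if average_kbps(pick(mid)) > target_kbps:
            lo = mid
        else:
            hi = mid
    return pick(hi)


shots = [
    # An easy shot: quality barely improves with more bits.
    {"duration": 10, "candidates": [(500, 90), (1000, 92), (2000, 93)]},
    # A complex shot: quality improves a lot with more bits.
    {"duration": 10, "candidates": [(500, 50), (1000, 70), (2000, 85)]},
]
print(allocate(shots, target_kbps=1250))  # [0, 2]
```

In this toy input the budget is spent on the complex shot, which is the shot-to-shot bitrate variation the paragraph above describes.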

The fixed-bitrate ladder often looks like a series of steps: since it is not title adaptive, it switches “late” to most encoding resolutions, and as a result the quality stays flat within that resolution even with increasing bitrate. For example, the ladder can contain two 1080p points with identical VMAF score or four 4K points with identical VMAF score, resulting in wasted bits and an increased storage footprint.

On the other hand, the optimized ladder appears closer to a monotonically increasing curve — increasing bitrate results in an increasing VMAF score. As a side note, we do have some additional points, not shown in the plots, that are used in resolution limited scenarios — such as a streaming session limited to 720p or 1080p highest encoding resolution. Such points lie under (or to the right of) the convex hull main ladder curve but allow quality to ramp up in resolution limited scenarios.
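
To make the convex hull idea concrete, the following sketch keeps only the candidate points that lie on the upper convex hull of a set of (bitrate, VMAF) pairs, using the standard monotone chain construction. The function name `upper_hull` and the sample numbers are made up for illustration; the post does not describe how the hull is computed in production.

```python
def upper_hull(points):
    """Return the points on the upper convex hull, ordered by bitrate.

    points: iterable of (kbps, vmaf) pairs. Points past the maximum-quality
    point are dropped, since more bits there do not buy more quality.
    """
    def cross(o, a, b):
        # > 0 for a counter-clockwise turn o -> a -> b, < 0 for clockwise.
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    hull = []
    for p in sorted(set(points)):
        # Walking left to right, the upper hull only makes clockwise turns.
        while len(hull) >= 2 and cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)

    best = max(range(len(hull)), key=lambda i: hull[i][1])
    return hull[:best + 1]


candidates = [(500, 40), (1000, 60), (1500, 62), (2000, 80), (3000, 85), (4000, 84)]
print(upper_hull(candidates))
# [(500, 40), (1000, 60), (2000, 80), (3000, 85)]
```

Here the (1500, 62) point falls under the hull and (4000, 84) costs more bits for lower quality, so neither would belong on the main ladder curve, although such points can still be useful in the resolution-limited scenarios mentioned above.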

Challenging-to-encode content

For the optimized ladders we have logic to detect quality saturation at the high end, meaning an increase in bitrate not resulting in material improvement in quality. Once such a bitrate is reached it is a good candidate for the topmost rung of the ladder. An additional limit can be imposed as a safeguard to avoid excessively high bitrates.
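
A simple way to picture this kind of saturation check is shown below. It is a sketch under assumed rules, not the actual logic: the function name `top_rung`, the `min_gain` threshold of 0.5 VMAF points, and the optional `max_kbps` cap are all hypothetical.

```python
def top_rung(ladder, min_gain=0.5, max_kbps=None):
    """Index of the topmost rung worth keeping.

    ladder: list of (kbps, vmaf) sorted by increasing bitrate.
    min_gain: assumed minimum VMAF improvement that justifies a higher rung.
    max_kbps: optional safeguard against excessively high bitrates.
    """
    top = 0
    for i in range(1, len(ladder)):
        kbps, vmaf = ladder[i]
        if max_kbps is not None and kbps > max_kbps:
            break  # safeguard limit reached
        if vmaf - ladder[top][1] < min_gain:
            break  # quality has saturated
        top = i
    return top


ladder = [(1000, 80.0), (2000, 90.0), (4000, 95.0), (8000, 95.2), (16000, 95.3)]
print(top_rung(ladder))                  # 2 -> the 4000 kbps point
print(top_rung(ladder, max_kbps=3000))   # 1 -> capped at the 2000 kbps point
```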

Sometimes we ingest a title that would need more bits at the highest end of the quality spectrum — even higher than the 16 Mbps limit of the fixed-bitrate ladder. For example,

  • a rock concert with fast-changing lighting effects and other details or

  • a wildlife documentary with fast action and/or challenging spatial details.

This scenario is generally rare. Nevertheless, the plot below highlights such a case, where the optimized ladder exceeds the fixed-bitrate ladder in terms of the highest bitrate, thereby achieving an improvement in the highest quality.

As expected, the quality is higher for the same bitrate, even when compared in the low or medium bitrate regions.

Fig. 5: Example of a movie with action and great amount of rich spatial details showing new highest bitrate of 17.2 Mbps

Visual examples

As an example, we compare the 1.75 Mbps encode from the fixed-bitrate ladder with the 1.45 Mbps encode from the optimized ladder for one of the titles from our 4K collection. Since 4K resolution entails a rather large number of pixels, we show 1024x512 pixel cutouts from the two encodes. The encodes are decoded and scaled to a 4K canvas prior to extracting the cutouts. We toggle between the cutouts so it is convenient to spot differences. We also show the corresponding full frame which helps to get a sense of how the cutout fits in the corresponding video frame.

Fig. 6: Pristine full frame — the purpose is to give a sense of how below cutouts fit in the frame
Fig. 7: Toggling between 1024x512 pixel cutouts from two encodes as annotated. Corresponding to pristine frame shown in Figure 6.
Fig. 8: Pristine full frame — the purpose is to give a sense of how below cutouts fit in the frame
Fig. 9: Toggling between 1024x512 pixel cutouts from two encodes as annotated. Corresponding to pristine frame shown in Figure 8.
Fig. 10: Pristine full frame — the purpose is to give a sense of how below cutouts fit in the frame
Fig. 11: Toggling between 1024x512 pixel cutouts from two encodes as annotated. Corresponding to pristine frame shown in Figure 10.
Fig. 12: Pristine full frame — the purpose is to give a sense of how below cutouts fit in the frame
Fig. 13: Toggling between 1024x512 pixel cutouts from two encodes as annotated. Corresponding to pristine frame shown in Figure 12.
Fig. 14: Pristine full frame — the purpose is to give a sense of how below cutouts fit in the frame
Fig. 15: Toggling between 1024x512 pixel cutouts from two encodes as annotated. Corresponding to pristine frame shown in Figure 14.

As can be seen, the encode from the optimized ladder delivers crisper textures and higher detail for fewer bits. At 1.45 Mbps it is by no means a perfect 4K rendition, but still very commendable for that bitrate. There exist higher bitrate points on the optimized ladder that deliver impeccable 4K quality, also for fewer bits compared to the fixed-bitrate ladder.

Compression and bitrate ladder improvements

Even before testing the new streams in the field, we observe the following advantages of the optimized ladders vs the fixed ladders, evaluated over 100 sample titles:

  • Computing the Bjøntegaard Delta (BD) rate shows 50% gains on average over the fixed-bitrate ladder. Meaning, on average we need 50% less bitrate to achieve the same quality with the optimized ladder.

  • The highest 4K bitrate on average is 8 Mbps which is also a 50% reduction compared to 16 Mbps of the fixed-bitrate ladder.

  • As mobile devices continue to improve, they adopt premium features (other than 4K resolution) like 10-bit and HFR. These video encodes can be delivered to mobile devices as well. The fixed-bitrate ladder starts at 560 kbps which may be too high for some cellular networks. The optimized ladder, on the other hand, has lower bitrate points that are viable in most cellular scenarios.

  • The optimized ladder entails a smaller storage footprint compared to the fixed-bitrate ladder.

  • The new ladder considers adding 1440p resolution (aka QHD) points if they lie on the convex hull of rate-quality tradeoff and most titles seem to get the 1440p treatment. As a result, when averaged over 100 titles, the bitrate required to jump to a resolution higher than 1080p (meaning either QHD or 4K) is 1.7 Mbps compared to 8 Mbps of the fixed-bitrate ladder. When averaged over 100 titles, the bitrate required to jump to 4K resolution is 3.2 Mbps compared to 8 Mbps of the fixed-bitrate ladder.
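
For readers unfamiliar with the BD-rate figure quoted in the first bullet, the sketch below shows the idea: average the difference in log-bitrate between two rate-quality curves over their shared quality range. The classic Bjøntegaard method fits a cubic polynomial to each curve; this simplified version uses piecewise-linear interpolation instead so that it needs only the standard library. The function names and sample ladders are invented for illustration, and quality values within a ladder are assumed to be strictly increasing (plateau points would have to be removed first).

```python
import math


def _interp(x, xs, ys):
    """Piecewise-linear interpolation; xs must be strictly increasing."""
    for i in range(len(xs) - 1):
        if xs[i] <= x <= xs[i + 1]:
            t = (x - xs[i]) / (xs[i + 1] - xs[i])
            return ys[i] + t * (ys[i + 1] - ys[i])
    raise ValueError("x is outside the interpolation range")


def bd_rate(anchor, test, steps=1000):
    """Average bitrate change (in %) of `test` versus `anchor` at equal quality.

    anchor, test: lists of (kbps, quality) points. Negative means `test`
    needs fewer bits for the same quality.
    """
    def curve(points):
        pts = sorted(points, key=lambda p: p[1])
        return [q for _, q in pts], [math.log(r) for r, _ in pts]

    qa, la = curve(anchor)
    qt, lt = curve(test)
    lo, hi = max(qa[0], qt[0]), min(qa[-1], qt[-1])
    if hi <= lo:
        raise ValueError("the two curves share no quality range")

    # Midpoint rule over the shared quality interval.
    total = 0.0
    for i in range(steps):
        q = lo + (hi - lo) * (i + 0.5) / steps
        total += _interp(q, qt, lt) - _interp(q, qa, la)
    return (math.exp(total / steps) - 1.0) * 100.0


fixed = [(1000, 60), (2000, 70), (4000, 80), (8000, 90)]
optimized = [(500, 60), (1000, 70), (2000, 80), (4000, 90)]
print(round(bd_rate(fixed, optimized), 2))  # -50.0
```

In this toy input the second ladder reaches every quality level at exactly half the bitrate, so the result is a 50% bitrate saving.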

Benefits to members

At Netflix we perform A/B testing of encoding optimizations to detect any playback issues on client devices as well as gauge the benefits experienced by our members. One set of streaming sessions receives the default encodes and the other set of streaming sessions receives the new encodes. This in turn allows us to compare error rates as well as various metrics related to quality of experience (QoE). Although our streams are standards-compliant, the A/B testing can and does sometimes find device-side implementations with minor gaps; in such cases we work with our device partners to find the best remedy.

Overall, while A/B testing these new encodes, we have seen the following benefits, which are in line with the offline evaluation covered in the previous section:

  • For members with high-bandwidth connections we deliver the same great quality at half the bitrate on average.

  • For members with constrained bandwidth we deliver higher quality at the same (or even lower) bitrate — higher VMAF at the same encoding resolution and bitrate or even higher resolutions than they could stream before. For example, members who were limited by their network to 720p can now be served 1080p or higher resolution instead.

  • Most streaming sessions start with a higher initial quality.

  • The number of rebuffers per hour goes down by over 65%; members also experience fewer quality drops while streaming.

  • The reduced bitrate together with some Digital Rights Management (DRM) system improvements (not covered in this blog) result in reducing the initial play delay by about 10%.

Next steps

We have started re-encoding the 4K titles in our catalog to generate the optimized streams and we expect to complete in a couple of months. We continue to work on applying similar optimizations to our HDR streams.

Acknowledgements

We thank Lishan Zhu for help rendered during A/B testing.

This is a collective effort on the part of our larger team, known as Encoding Technologies, and various other teams that we have crucial partnerships with, such as:

If you are passionate about video compression research and would like to contribute to this field, we have an open position.

Original article: https://netflixtechblog.com/optimized-shot-based-encodes-for-4k-now-streaming-47b516b10bbb
