"VP9 Video Codec" Overview

Source: "VP9 Video Codec"

Progress

  • 2021-06-30: ongoing. Many concepts are still unclear to me; I am filling them in as I go

VP9 Video Codec

VP9 is the WebM Project's next-generation open video codec, available since June 17, 2013. This page summarizes the open VP9 topics of interest to the WebM community.

Draft VP9 Bitstream and Decoding Process Specification

Notes:

  • The spec is not final
  • It has been reviewed internally, and external review is now being sought
  • We have compiled the syntax tables into an application, and verified that Argon Design's test streams produce output identical to vpxenc.
    • Argon Design is an audio/video solutions company. It has worked on the design of CPUs, semiconductors, embedded systems, and 2D/3D graphics engines, and on that basis developed Argon Streams, a verification suite for video decoders

Draft link

Related link: Draft: RTP Payload Format for VP9 Video

VP9 Profiles and Levels

HDR10+ metadata handling

HDR10+ metadata can be specified in the form of ITU-T T.35 terminal codes. See the BlockAddID element in the WebM Container Guidelines. SMPTE 2094-40/CTA-861.4 define one of the possible ways to specify HDR10+ metadata.

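Introduction to HDR

  • HDR stands for High Dynamic Range. Its purpose is to reproduce the high dynamic range of natural light. Both photography and video have HDR techniques, but their principles and results differ
  • In photography, the technique is "multiple exposures, merged in software". Today, at every level (system, software, hardware), images are stored in an 8-bit RGB color space, and the color depth and range this format can represent are poorer than natural light (in other words, storage is lossy). Current cameras can save high-dynamic-range images losslessly (RAW), but the final display is still limited by software and hardware, and images with more than 8 bits of color depth usually cannot be displayed directly
    • In a scene with strong light contrast, exposing for the highlights leaves the shadows very dark, while exposing for the shadows blows out the highlights. HDR addresses this by merging the separately exposed shots
    • Some photos look stunning but unreal; the cause is that the photo's actual dynamic range is insufficient and too far from that of natural light
  • HDR in video is more involved. From capture and production through storage to playback, the whole chain can carry 10 bits or more of dynamic range, so natural light is presented more faithfully
  • Questions
    • Color space: the color range (which colors can be displayed)
    • Color depth: the maximum number of bits used to store a color, for example 8 bit or 16 bit
    • Dynamic range: the distance between the brightest and the darkest value. With 8-bit storage, anything <= 0 becomes 0 (pure black) and anything >= 255 becomes 255 (pure white)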
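
As a rough illustration of the color-depth and dynamic-range points above, the following sketch counts how many levels one channel can hold at a few bit depths, and clamps out-of-range samples the way 8-bit storage does. The helper names `levels` and `clip_to_depth` are made up for this example and do not come from any library.

```python
def levels(bits: int) -> int:
    """Number of distinct values one channel can hold at the given bit depth."""
    return 2 ** bits


def clip_to_depth(value: float, bits: int = 8) -> int:
    """Round a sample, then clamp it to the range the bit depth can store."""
    max_code = levels(bits) - 1
    return max(0, min(max_code, round(value)))


for bits in (8, 10, 16):
    print(f"{bits}-bit: {levels(bits)} levels per channel")

# Anything darker than 0 or brighter than 255 is lost in 8-bit storage.
print([clip_to_depth(v) for v in (-20, 0, 128, 255, 300)])
```

With 10 bits the same sample value of 300 fits without clipping, which is one way to see why a deeper pipeline preserves more of the scene's range.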

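Introduction to HDR10+

  • HDR10+ and HDR10 are free, open formats. They cannot match Dolby Vision on quality, but on the whole they already meet people's needs for watching video
  • HDR10+ supports dynamic metadata; HDR10 uses static metadata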

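Image basics

  • An image can be represented with three channels
    • Hue, saturation, and lightness (HSL for short)
    • RGB channels
    • YCbCr channels
  • 8 bits × 3 channels = 24 bits, and 2^24 ≈ 16.77 million colors; the human eye can distinguish roughly 10 million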
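
The arithmetic and the three representations above can be checked with a small sketch. The notes do not say which YCbCr variant is meant, so the full-range BT.601 (JPEG-style) coefficients used here are an assumption; `rgb_to_ycbcr` and `rgb_to_hsl` are illustrative helper names, and only `colorsys.rgb_to_hls` comes from the Python standard library.

```python
import colorsys

# 8 bits per channel, 3 channels -> 24 bits per pixel.
TOTAL_COLORS = 2 ** (8 * 3)
print(f"24-bit RGB can encode {TOTAL_COLORS:,} colors")


def rgb_to_ycbcr(r: int, g: int, b: int) -> tuple:
    """Convert 8-bit RGB to 8-bit YCbCr (assumed: full-range BT.601)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    # Round and clamp each component back into the 8-bit range.
    return tuple(max(0, min(255, round(c))) for c in (y, cb, cr))


def rgb_to_hsl(r: int, g: int, b: int) -> tuple:
    """Convert 8-bit RGB to (hue in degrees, saturation, lightness)."""
    # colorsys works on floats in [0, 1] and returns (h, l, s) in that order.
    h, l, s = colorsys.rgb_to_hls(r / 255, g / 255, b / 255)
    return (h * 360, s, l)


print(rgb_to_ycbcr(255, 255, 255))
print(rgb_to_hsl(255, 0, 0))
```

White carries no chroma, so its Cb and Cr land on the neutral midpoint of 128; that separation of brightness from color is what makes YCbCr convenient for video codecs.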

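Acquiring VP9 (libvpx)

Since June 17, 2013, VP9 encoding and decoding are supported in libvpx

libvpx build prerequisites: describes what is needed to build it

Question: what is libvpx?

  • A VP8/VP9 codec SDK (software development kit)

User contributions page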

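Product support

  • Microsoft Edge
  • WebRTC
  • Google Chrome
  • Mozilla Firefox
  • VLC media player
  • FFmpeg/Libav: what are these two?
    • FFmpeg: code that can encode, decode, and play video. Think of it as a grab bag of audio/video functionality
    • Libav: a fork of FFmpeg. The two groups of developers had different design philosophies and parted ways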

The world's fastest VP9 video decoder

As before, I was very excited when Google released VP9 – for one, because I was one of the people involved in creating it back when I worked for Google (I no longer do). How good is it, and how much better can it be? To evaluate that question, Clément Bœsch and I set out to write a VP9 decoder from scratch for FFmpeg.

The goals never changed from the original ffvp8 situation (community-developed, fast, free from the beginning). We also wanted to answer new questions: how does a well-written decoder compare, speed-wise, with a well-written decoder for other codecs? TLDR (see rest of post for details):

  • As a codec, VP9 is quite impressive – it beats x264 in many cases. However, the encoder is slow, very slow. At higher speed settings, the quality gain melts away. This seems to be similar to what people report about HEVC (using e.g. x265 as an encoder).
  • Single-threaded decoding speed of libvpx isn't great. FFvp9 beats it by 25-50% on a variety of machines. FFvp9 is somewhat slower than ffvp8, and somewhat faster than ffh264 decoding speed (for files encoded to matching SSIM scores).
  • Multi-threading performance in libvpx is deplorable, it gains virtually nothing from its loopfilter-mt algorithm. FFvp9 multi-threading gains nearly as much as ffh264/ffvp8 multithreading, but there's a cap (material-, settings- and resolution-dependent, we found it to be around 3 threads in one of our clips although it's typically higher) after which further threads don't cause any more gain.

The codec itself

To start, we did some tests on the encoder itself. The direct goal here was to identify bitrates at which encodings would give matching SSIM-scores so we could do same-quality decoder performance measurements. However, as such, it also allows us to compare encoder performance in itself. We used settings very close to recommended settings for VP8, VP9 and x264, optimized for SSIM as a metric.

As source clips, we chose Sintel (1920×1080 CGI content, source), a 2-minute clip from Tears of Steel (1920×800 cinematic content, source), and a 3-minute clip from Enter the Void (1920×818 high-grain/noise content, screenshot). For each, we encoded at various bitrates and plotted effective bitrate versus SSIM.

[Figures: sintel_ssim, tos_ssim, etv_ssim]

You'll notice that in most cases, VP9 can indeed beat x264, but there's some big caveats:

  • VP9 encoding (using libvpx) is horrendously slow – like, 50x slower than VP8/x264 encoding. This means that encoding a 3-minute 1080p clip takes several days on a high-end machine. Higher --cpu-used=X parameters make the quality gains melt away.
  • libvpx' VP9 encodes miss the target bitrates by a long shot (100% off) for the ETV clip, possibly because of our use of --aq-mode=1.
  • libvpx tends to slowly crumble at higher bitrates for hard content – again, look at the ETV clip, where x264 shows some serious mature killer instinct at the high bitrate end of things.

Overall, these results are promising, although the lack-of-speed is a serious issue.

Decoder performance

For decoding performance measurements, we chose:

  • Sintel: 500 (VP9), 1200 (VP8) and 700 (x264) kbps (SSIM=19.8)
  • Tears of Steel: 4.0 (VP9), 7.9 (VP8) and 6.3 (x264) mbps (SSIM=19.2)
  • Enter the Void: 9.7 (VP9), 16.6 (VP8) and 10.7 (x264) mbps (SSIM=16.2)

We used FFmpeg to decode each of these files, either using the built-in decoder (to compare between codecs), or using libvpx-vp9 (to compare ffvp9 versus libvpx). Decoding time was measured in seconds using "time ffmpeg -threads 1 [-c:v libvpx-vp9] -i $file -f null -v 0 -nostats - 2>&1 | grep user", with this FFmpeg and this libvpx revision (downloaded on Feb 20th, 2014).

[Figures: sintel_archs, tos_archs, etv_archs]

A few notes on ffvp9 vs. libvpx-vp9 performance:

  • ffvp9 beats libvpx consistently by 25-50%. In practice, this means that typical middle- to high-end hardware will be able to playback 4K content using ffvp9, but not using libvpx. Low-end hardware will struggle to playback even 720p content using libvpx (but do so fine using ffvp9).
  • On Haswell, the difference is significantly smaller than on sandybridge, likely because libvpx has some AVX2 optimizations (e.g. for MC and loop filtering), whereas ffvp9 doesn't have that yet; this means this difference might grow over time as ffvp9 gets AVX2 optimizations also.
  • On the Atom, the differences are significantly smaller than on other systems; the reason for this is likely that we haven't done any significant work on Atom-performance yet. Atom has unusually large latencies between GPRs and XMM registers, which means you need to take special care in ordering your instructions to prevent unnecessary halts – we haven't done anything in that area yet (for ffvp9).
  • Some users may find that ffvp9 is a lot slower than advertised on 32bit; this is correct, most of our SIMD only works on 64bit machines. If you have 32bit software, port it to 64bit. Can't port it? Ditch it. Nobody owns 32bit x86 hardware anymore these days.

So how does VP9 decoding performance compare to that of other codecs? There's basically two ways to measure this: same-bitrate (e.g. a 500kbps VP8 file vs. a 500kbps VP9 file, where the VP9 file likely looks much better), or same-quality (e.g. a VP8 file with SSIM=19.2 vs. a VP9 file with SSIM=19.2, where the VP9 file likely has a much lower bitrate). We did same-quality measurements, and found:

  • ffvp9 tends to beat ffh264 by a tiny bit (10%), except on Atom (which is likely because ffh264 has received more Atom-specific attention than ffvp9).
  • ffvp9 tends to be quite a bit slower than ffvp8 (15%), although the massive bitrate differences in Enter the Void actually makes it win for that clip (by about 15%, except on Atom). Given that Google promised VP9 would be no more than 40% more complex than VP8, it seems they kept that promise.
  • We did some same-bitrate comparisons, and found that x264 and ffvp9 are essentially identical in that scenario (with x264 having slightly lower SSIM scores); vp8 tends to be about 50% faster, but looks significantly worse.

Multithreading

One of the killer-features in FFmpeg is frame-level multithreading, which allows multiple cores to decode different video frames in parallel. Libvpx also supports multithreading. So which is better?
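
The single-threaded timing procedure quoted in the decoder performance part can be sketched in Python as follows. This is only an approximation of the shell one-liner, not the authors' actual script: the function names `build_decode_cmd` and `user_seconds` are invented for this example, and the child-process user time comes from `os.times()`, which reports zero for children on Windows.

```python
import os
import subprocess
from typing import List


def build_decode_cmd(path: str, use_libvpx: bool = False, threads: int = 1) -> List[str]:
    """Build the ffmpeg decode-only command quoted in the post."""
    cmd = ["ffmpeg", "-threads", str(threads)]
    if use_libvpx:
        # Placed before -i, so it selects the decoder for the input.
        cmd += ["-c:v", "libvpx-vp9"]
    # Decode to the null muxer, with logging and progress stats silenced.
    cmd += ["-i", path, "-f", "null", "-v", "0", "-nostats", "-"]
    return cmd


def user_seconds(cmd: List[str]) -> float:
    """Run the command and return the user CPU time its process consumed."""
    before = os.times().children_user
    subprocess.run(cmd, check=True,
                   stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return os.times().children_user - before


print(" ".join(build_decode_cmd("clip.webm", use_libvpx=True)))
```

Calling `user_seconds(build_decode_cmd("clip.webm"))` and then the same with `use_libvpx=True` gives the two numbers to compare, where `clip.webm` is a placeholder for a real VP9 file and an `ffmpeg` binary built with libvpx is assumed to be on the PATH.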