Jiang Jian: New Features in VP9 Scalable Video Coding (SVC)

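In this article, Google software engineer Jiang Jian takes an in-depth look at the new features of VP9 scalable video coding (SVC), including SVC reference frame prediction, intra-coded frames, and long-term temporal prediction. He compares VP9 SVC with VP8 Simulcast, describes how denoising is applied, and shows the advantages of VP9 SVC for video conferencing and real-time communication.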



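Compared with VP8, VP9 includes a large number of design improvements aimed at achieving the highest possible video coding quality. Google software engineer Jiang Jian describes in detail how several new features in VP9 scalable video coding (SVC) are implemented, along with the corresponding APIs. This article is based on Jiang Jian's online talk for LiveVideoStack and was compiled by LiveVideoStack.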


By Jiang Jian

Edited by LiveVideoStack

Live replay

https://www2.tutormeetplus.com/v2/render/playback?mode=playback&token=e9d457fedba34b69844f3cba29345704


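Hello everyone, I'm Jiang Jian from Google. Today I'd like to share some of the features we recently added to VP9 SVC, along with the corresponding API settings. Overall, the talk leans toward the technical side.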

 



The talk covers the following topics:


1. An introduction to VP9 SVC;

2. A comparison of some parameters between SVC and VP8;

3. How denoising is implemented in SVC.


I. SVC (Scalable Video Coding) in VP9


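Let me start with a question: why use SVC at all? Isn't conventional video coding good enough? In a video conference there may be many participants. If one participant has a poor network connection and SVC is not used, the stream exists at only one resolution, both spatially and temporally. The sender then has to drop packets or lower the spatial resolution to suit the participant with the poor connection. Because every participant receives the same packets, participants with good connections are affected as well. SVC solves this problem. It encodes the video at several resolutions, and when the server distributes the stream it can forward high-resolution or low-resolution frames depending on each receiver's network conditions. A participant with a poor connection receives the low-resolution frames, while participants with good connections continue to receive the high-resolution frames unaffected.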
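The per-receiver forwarding decision described above can be sketched as follows. This is a minimal illustration rather than how any particular server implements it: the `Layer` type, the `select_layer` helper, the layer names, and all bitrate figures are hypothetical and chosen only for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    name: str
    height: int
    cumulative_kbps: int  # bitrate needed for this layer plus all layers below it


# Hypothetical layer ladder, ordered from lowest to highest resolution.
# The bitrates are illustrative assumptions, not measured values.
LAYERS = [
    Layer("low", 180, 150),
    Layer("mid", 360, 500),
    Layer("high", 720, 1500),
]


def select_layer(available_kbps, layers=LAYERS):
    """Pick the highest layer that fits the receiver's estimated bandwidth.

    Falls back to the lowest layer when even that one does not fit.
    """
    best = layers[0]
    for layer in layers:
        if layer.cumulative_kbps <= available_kbps:
            best = layer
    return best


# Each receiver gets its own decision, so one slow receiver
# does not lower the quality seen by the others.
receivers = {"alice": 2000, "bob": 600, "carol": 100}
choices = {name: select_layer(kbps).name for name, kbps in receivers.items()}
print(choices)
```

The sender encodes once, and the choice of layer is made independently for each receiver on the server side.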




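VP9 SVC is still being improved in WebRTC, particularly some of the parameters for screen sharing. Google is currently dogfooding it, meaning large-scale internal testing. Google has more than 90,000 employees, so we rely on our own staff for internal testing; if they run into problems, they can send us feedback promptly so that we can make fixes. After a few months of internal testing, it will be opened to the public. Later I will describe some of the feedback and issues we received during dogfooding.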

 



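Today I will mainly cover several VP9 SVC features. The first is SVC reference frame prediction: because SVC involves different spatial and temporal resolutions, reference frame prediction differs considerably. We have also added some special features, such as an intra-coded frame that is not a key frame. In addition, the SVC prediction mode can still be changed after it has been set, that is, it can be modified at any time during encoding. Another is the long-term temporal prediction that we added recently, and the last is denoising.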


1. SVC Superframe

 



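Here I will explain how SVC packs frames of different resolutions together. The example in the figure above contains frames at three spatial resolutions. In SVC, each layer has its own resolution; this example has three layers and therefore three resolutions. Suppose the camera captures an HD 720p frame. First, we scale the 720p frame down 4×4 to 180p and encode it. Next, for the middle layer, we scale the 720p frame down 2×2 to 360p and encode it. Finally, we encode the HD 720p frame itself. After encoding is finished, …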
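The layer resolutions in this example can be worked out with a few lines of arithmetic. The sketch below assumes a 1280×720 capture (the usual 16:9 size for 720p, which the talk does not state explicitly); the `spatial_layers` helper is hypothetical and only shows the scaling, not the encoding.

```python
def spatial_layers(width, height, factors=(4, 2, 1)):
    """Return (width, height) for each spatial layer, lowest resolution first.

    Each factor divides both dimensions, so a factor of 4 corresponds to the
    4x4 scale-down and a factor of 2 to the 2x2 scale-down described above.
    """
    return [(width // f, height // f) for f in factors]


# Assumed 720p capture size.
layers = spatial_layers(1280, 720)
print(layers)  # [(320, 180), (640, 360), (1280, 720)]
```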
