VC-1 Technical Overview


Introduction

VC-1 is a video codec specification that has been standardized by the Society of Motion Picture and Television Engineers (SMPTE) and implemented by Microsoft as Microsoft® Windows Media® Video (WMV) 9. Formal standardization of VC-1 represents the culmination of years of technical scrutiny by over 75 companies. SMPTE 421M details the complete bit stream syntax and is accompanied by two companion documents (SMPTE RP227 and SMPTE RP228) that describe VC-1 transport and conformance. These documents provide comprehensive guidance to ensure content delivery and interoperability.



Overview of VC-1

The VC-1 codec is designed to achieve state-of-the-art compressed video quality at bit rates that may range from very low to very high. The codec can easily handle 1920 pixel × 1080 pixel presentation at 6 to 30 megabits per second (Mbps) for high-definition video. VC-1 is capable of higher resolutions such as 2048 pixels × 1536 pixels for digital cinema, and of a maximum bit rate of 135 Mbps. An example of very low bit rate video would be 160 pixel × 120 pixel presentation at 10 kilobits per second (Kbps) for modem applications.
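To put these bit rates in perspective, the arithmetic below estimates how much smaller a coded 1080p stream is than the uncompressed video. It is only a rough sketch: the 30 frames per second and the 12 bits per pixel (8-bit samples with 4:2:0 chroma subsampling) are assumptions made for this example, and the function names are hypothetical.

```python
def raw_bit_rate_bps(width, height, fps, bits_per_pixel=12):
    """Uncompressed video bit rate in bits per second.

    bits_per_pixel=12 assumes 8-bit samples with 4:2:0 chroma subsampling.
    """
    return width * height * bits_per_pixel * fps


def compression_ratio(width, height, fps, coded_bps, bits_per_pixel=12):
    """How many times smaller the coded stream is than the raw video."""
    return raw_bit_rate_bps(width, height, fps, bits_per_pixel) / coded_bps


# Assumed frame rate of 30 frames per second for the HD example.
raw_hd = raw_bit_rate_bps(1920, 1080, 30)
print(f"raw 1080p30: {raw_hd / 1e6:.1f} Mbps")
for mbps in (6, 30):
    ratio = compression_ratio(1920, 1080, 30, mbps * 1_000_000)
    print(f"{mbps} Mbps -> about {ratio:.0f}:1")
```

Under these assumptions, the 6 to 30 Mbps range corresponds to compression ratios of roughly 125:1 down to 25:1.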

The basic functionality of VC-1 involves a block-based motion compensation and spatial transform scheme similar to that used in other video compression standards since MPEG-1 and H.261. However, VC-1 includes a number of innovations and optimizations that make it distinct from the basic compression scheme, resulting in excellent quality and efficiency.

Unlike earlier versions of the Windows Media Video implementation, VC-1 is transport and container independent. This provides even greater flexibility for device manufacturers and content services.

Innovations

VC-1 includes a number of innovations that enable it to produce high quality content. This section provides brief descriptions of some of these features.

Adaptive Block Size Transform

Traditionally, 8 × 8 transforms have been used for image and video coding. However, there is evidence to suggest that 4 × 4 transforms can reduce ringing artifacts at edges and discontinuities. VC-1 is capable of coding an 8 × 8 block using either an 8 × 8 transform, two 8 × 4 transforms, two 4 × 8 transforms, or four 4 × 4 transforms. This feature enables coding that takes advantage of the different transform sizes as needed for optimal image quality.
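The four ways of splitting an 8 × 8 block can be sketched as a small lookup. This only illustrates the tiling, not the transform itself or how an encoder picks a mode; the names are hypothetical, and reading "8 × 4" as 8 wide by 4 tall is an assumption of the sketch.

```python
# Sub-block sizes as (width, height). Reading "8 x 4" as 8 wide by 4 tall
# is an assumption of this sketch.
TRANSFORM_MODES = {
    "8x8": (8, 8),
    "8x4": (8, 4),
    "4x8": (4, 8),
    "4x4": (4, 4),
}


def partition_block(mode):
    """Return (x, y, width, height) tiles that cover one 8x8 block."""
    w, h = TRANSFORM_MODES[mode]
    return [(x, y, w, h) for y in range(0, 8, h) for x in range(0, 8, w)]


for mode in TRANSFORM_MODES:
    tiles = partition_block(mode)
    print(mode, len(tiles), tiles)
```

Each mode yields one, two, two, or four tiles respectively, and every mode covers all 64 samples of the block exactly once.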

16-Bit Transforms

In order to minimize the computational complexity of the decoder, VC-1 uses 16-bit transforms. This also has the advantage of easy implementation on the large amount of digital signal processing (DSP) hardware built with 16-bit processors. Among the constraints put on VC-1 transforms is the requirement that the 16-bit values used produce results that can fit in 16 bits. The constraints on transforms ensure that decoding is as efficient as possible on a wide range of devices.
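The kind of constraint described here can be checked with a worst-case bound: if every input sample is bounded in magnitude, the largest output of an integer transform is the largest row sum of absolute coefficients times that bound. The sketch below applies this to a single one-dimensional stage. The matrix is an illustrative integer transform chosen for this example, not quoted from the specification, and the function names are hypothetical.

```python
INT16_MAX = 32767

# Illustrative 4-point integer transform matrix, chosen for this sketch only.
EXAMPLE_MATRIX = [
    [17, 17, 17, 17],
    [22, 10, -10, -22],
    [17, -17, -17, 17],
    [10, -22, 22, -10],
]


def worst_case_output(matrix, max_abs_input):
    """Largest magnitude any output can reach when |input| <= max_abs_input."""
    return max(sum(abs(c) for c in row) for row in matrix) * max_abs_input


def fits_in_16_bits(matrix, max_abs_input):
    """True if a single transform stage can never overflow a signed 16-bit value."""
    return worst_case_output(matrix, max_abs_input) <= INT16_MAX


print(worst_case_output(EXAMPLE_MATRIX, 255), fits_in_16_bits(EXAMPLE_MATRIX, 255))
print(worst_case_output(EXAMPLE_MATRIX, 512), fits_in_16_bits(EXAMPLE_MATRIX, 512))
```

A real decoder also has to account for the second transform dimension and for intermediate rounding, which this bound leaves out.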

Motion Compensation

Motion compensation is the process of generating a prediction of a video frame by displacing the reference frame. Typically, the prediction is formed for a block (an 8 × 8 pixel tile) or a macroblock (a 16 × 16 pixel tile) of data. The displacement of data due to motion is defined by a motion vector, which captures the shift along both the x- and y-axes.

The efficiency of the codec is affected by the size of the predicted block, the granularity of sub-pixel data that can be captured, and the type of filter used for generating sub-pixel predictors. VC-1 uses 16 × 16 blocks for prediction, with the ability to generate mixed frames of 16 × 16 and 8 × 8 blocks. The finest granularity of sub-pixel information supported by VC-1 is 1/4 pixel. Two sets of filters are used by VC-1 for motion compensation. The first is an approximate bicubic filter with four taps. The second is a bilinear filter with two taps.
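As a rough illustration of quarter-pixel prediction with a two-tap bilinear filter, the sketch below samples a reference frame at positions given in quarter-pixel units. It is a minimal sketch under stated assumptions: the rounding, the edge clamping, and the function names are choices made for this example and do not reproduce the exact rules of the specification.

```python
def clamp(value, low, high):
    """Limit value to the range [low, high]."""
    return max(low, min(value, high))


def bilinear_sample(ref, x_q, y_q):
    """Sample ref (a list of rows) at a position given in quarter-pixel units."""
    height, width = len(ref), len(ref[0])
    x0, fx = divmod(x_q, 4)  # integer pixel and quarter-pel fraction (0..3)
    y0, fy = divmod(y_q, 4)
    # Clamp to the frame edges (an assumption of this sketch).
    xa, xb = clamp(x0, 0, width - 1), clamp(x0 + 1, 0, width - 1)
    ya, yb = clamp(y0, 0, height - 1), clamp(y0 + 1, 0, height - 1)
    # Two taps horizontally, then two taps vertically, weights in quarters.
    top = ref[ya][xa] * (4 - fx) + ref[ya][xb] * fx
    bottom = ref[yb][xa] * (4 - fx) + ref[yb][xb] * fx
    return (top * (4 - fy) + bottom * fy + 8) >> 4


def predict_block(ref, bx, by, mv_x_q, mv_y_q, size=16):
    """Predict the size x size block at (bx, by), displaced by a quarter-pel motion vector."""
    return [
        [bilinear_sample(ref, 4 * (bx + x) + mv_x_q, 4 * (by + y) + mv_y_q)
         for x in range(size)]
        for y in range(size)
    ]


ref = [[0, 40], [80, 120]]
print(predict_block(ref, 0, 0, 2, 0, size=1))  # half a pixel to the right
```

A motion vector of (2, 0) in quarter-pixel units lands halfway between two horizontal neighbors, so the predicted sample is their average.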

VC-1 combines the motion vector settings defined by the block size, sub-pixel granularity, and filter type into modes. The result is four motion compensation modes that suit a range of different situations. This classification of settings into modes also helps compact decoder implementations.

Loop Filtering

VC-1 uses an in-loop deblocking filter that attempts to remove block-boundary discontinuities introduced by quantization errors in interpolated frames. These discontinuities can cause visible artifacts in the decompressed video frames and can impact the quality of the frame as a predictor for future interpolated frames.

The loop filter takes into account the adaptive block size transforms. The filter is also optimized to reduce the number of operations required.

Interlace Coding

Interlaced video content is widely used in television broadcasting. When encoding interlaced content, the VC-1 codec can take advantage of the characteristics of interlaced frames to improve compression. This is achieved by using data from both fields to predict motion compensation in interpolated frames.

Advanced B Frame Coding

A bi-directional or B frame is a frame that is interpolated from data both in previous and subsequent frames. B frames are distinct from I frames (also called key frames), which are encoded without reference to other frames. B frames are also distinct from P frames, which are interpolated from previous frames only. VC-1 includes several optimizations that make B frames more efficient.

Fading Compensation

Due to the nature of compression that uses motion compensation, encoding of video frames that contain fades to or from black is very inefficient. With a uniform fade, every macroblock needs adjustments to luminance. VC-1 includes fading compensation, which detects fades and uses alternate methods to adjust luminance. This feature improves compression efficiency for sequences with fading and other global illumination changes.
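The benefit can be seen in a small example. The text above does not describe the method, so this sketch assumes one simple model, a global linear remap of the reference luminance (a scale and a shift); the function names and sample values are hypothetical.

```python
def compensate_fade(ref_luma, scale, shift):
    """Remap every luma sample as scale * sample + shift, clipped to 8 bits."""
    return [
        [max(0, min(255, round(scale * s + shift))) for s in row]
        for row in ref_luma
    ]


def sum_abs_diff(a, b):
    """Sum of absolute differences between two equally sized frames."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


reference = [[100, 200], [50, 250]]
current = [[50, 100], [25, 125]]  # same picture, faded to half brightness

print(sum_abs_diff(current, reference))                            # residual without compensation
print(sum_abs_diff(current, compensate_fade(reference, 0.5, 0)))   # residual with compensation
```

Without compensation, every sample leaves a residual that must be coded; with the remapped reference, the residual in this toy case vanishes.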

Profiles and Levels

VC-1 contains a number of profile and level combinations that support the encoding of many types of video. The profile determines the codec features that are available, and thereby determines the required decoder complexity (mathematical intensity). The following table lists VC-1 profiles and levels.

| Profile  | Level  | Max Bit Rate | Representative Resolutions by Frame Rate |
|----------|--------|--------------|------------------------------------------|
| Simple   | Low    | 96 Kbps      | 176 × 144 @ 15 Hz (QCIF) |
| Simple   | Medium | 384 Kbps     | 240 × 176 @ 30 Hz; 352 × 288 @ 15 Hz (CIF) |
| Main     | Low    | 2 Mbps       | 320 × 240 @ 24 Hz (QVGA) |
| Main     | Medium | 10 Mbps      | 720 × 480 @ 30 Hz (480p); 720 × 576 @ 25 Hz (576p) |
| Main     | High   | 20 Mbps      | 1920 × 1080 @ 30 Hz (1080p) |
| Advanced | L0     | 2 Mbps       | 352 × 288 @ 30 Hz (CIF) |
| Advanced | L1     | 10 Mbps      | 720 × 480 @ 30 Hz (NTSC-SD); 720 × 576 @ 25 Hz (PAL-SD) |
| Advanced | L2     | 20 Mbps      | 720 × 480 @ 60 Hz (480p); 1280 × 720 @ 30 Hz (720p) |
| Advanced | L3     | 45 Mbps      | 1920 × 1080 @ 24 Hz (1080p); 1920 × 1080 @ 30 Hz (1080i); 1280 × 720 @ 60 Hz (720p) |
| Advanced | L4     | 135 Mbps     | 1920 × 1080 @ 60 Hz (1080p); 2048 × 1536 @ 24 Hz |
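The bit rate limits in the table above lend themselves to a simple lookup. The sketch below picks the lowest level of a profile whose maximum bit rate covers a target stream. It considers bit rate only, treats 1 Kbps as 1,000 bits per second, and uses hypothetical names; a real conformance check would also look at resolution and frame rate.

```python
# (profile, level) -> maximum bit rate in bits per second, from the table above.
# Treating 1 Kbps as 1,000 bps is an assumption of this sketch.
MAX_BIT_RATE_BPS = {
    ("Simple", "Low"): 96_000,
    ("Simple", "Medium"): 384_000,
    ("Main", "Low"): 2_000_000,
    ("Main", "Medium"): 10_000_000,
    ("Main", "High"): 20_000_000,
    ("Advanced", "L0"): 2_000_000,
    ("Advanced", "L1"): 10_000_000,
    ("Advanced", "L2"): 20_000_000,
    ("Advanced", "L3"): 45_000_000,
    ("Advanced", "L4"): 135_000_000,
}


def lowest_level_for(profile, bit_rate_bps):
    """Return the lowest level of profile whose limit covers bit_rate_bps, or None."""
    candidates = [
        (limit, level)
        for (p, level), limit in MAX_BIT_RATE_BPS.items()
        if p == profile and limit >= bit_rate_bps
    ]
    return min(candidates)[1] if candidates else None


print(lowest_level_for("Advanced", 30_000_000))
print(lowest_level_for("Main", 8_000_000))
print(lowest_level_for("Simple", 1_000_000))
```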




VC-1 Compared to Other Codecs

VC-1 is very competitive when compared to other codecs in use today. This section compares the performance of VC-1 with MPEG-2 and H.264.

Quality Comparison

VC-1 achieves clearly superior quality to MPEG-2 at comparable bit rates, and has been judged superior to H.264 in several independent studies.

Measuring the quality of a video codec is not easy, because the reconstructed image is not meant to be identical to the original. Ideally, only information that is perceptually irrelevant will be lost in the compression/decompression process, but what counts as "irrelevant" depends on the viewer's subjective response.

One useful objective metric is the peak signal-to-noise ratio (PSNR) plotted against bit rate. PSNR is the ratio, usually expressed in decibels, between the maximum possible power of a signal (a peak value of 255 for 8-bit video) and the power of the noise introduced by compression. A higher PSNR indicates a less noisy signal. For any codec, PSNR is expected to increase at higher bit rates, because higher bit rates translate to less aggressive compression. Thus, a graph that plots PSNR against bit rate shows the performance of the codec over a range of compression settings.
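For reference, PSNR is commonly computed from the mean squared error (MSE) between the original and decoded frames as 10 · log10(peak² / MSE), in decibels. The sketch below uses that common definition with a hypothetical function name; it is not the measurement procedure used in the tests described in this article.

```python
import math


def psnr(original, decoded, peak=255):
    """PSNR in dB between two equally sized frames (lists of rows of samples)."""
    squared_errors = [
        (o - d) ** 2
        for row_o, row_d in zip(original, decoded)
        for o, d in zip(row_o, row_d)
    ]
    mse = sum(squared_errors) / len(squared_errors)
    if mse == 0:
        return math.inf  # identical frames: no noise at all
    return 10 * math.log10(peak ** 2 / mse)


frame = [[10, 20], [30, 40]]
noisy = [[11, 21], [31, 41]]  # every sample is off by one
print(f"{psnr(frame, noisy):.2f} dB")
```

Because the scale is logarithmic, halving the mean squared error raises PSNR by about 3 dB.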

In Microsoft's own internal tests, VC-1 performs 2 to 3 times better than MPEG-2. In other words, to achieve a given PSNR, MPEG-2 requires a bit rate up to 3 times higher than VC-1. These results were measured using both low-motion and high-motion video sequences. Microsoft also compared VC-1 with H.264 and found that both codecs have comparable performance when PSNR is plotted against bit rate.

The final arbiter of codec quality is the subjective appearance of the decoded video. In subjective tests, the perceived quality of VC-1 equals or exceeds that of H.264. The DVD Forum conducted tests in the winter of 2002 to select codecs for the next-generation red-laser HD-DVD. Viewers from Hollywood film studios and major consumer electronics companies rated video clips on a scale of 1 to 5 for resolution, noise, and overall impression. Multiple codecs were tested, including MPEG-2, VC-1, H.264, and MPEG-4 Advanced Simple Profile. The baselines against which the codecs were compared were D5 masters and D-VHS (24 Mbps). During the tests, viewers were not told which codec was used to encode each of the clips.

On all three measures (resolution, noise, and overall impression), the quality of VC-1 was judged closest to the original D5 master. By comparison, the H.264 codec was rated as comparable only to MPEG-2 on two of the three measures (resolution and overall impression), and was rated somewhat worse than VC-1 on noise.

VC-1 has performed well in other independent subjective quality tests:
  • DV Magazine found VC-1 to be superior to both MPEG-2 and MPEG-4.
  • TANDBERG Television found VC-1 produces significantly better quality than MPEG-2 and comparable quality to H.264. These results were presented at the 2003 International Broadcasting Convention (IBC).
  • C'T Magazine, Germany's premier audio-video magazine, compared various codecs, including VC-1, H.264, and MPEG-2, and selected VC-1 as producing the best subjective and objective quality for high-definition (HD) video.
  • The European Broadcasting Union (EBU) found VC-1 had the most consistent quality in tests that compared VC-1, RealMedia V9, the Envivio MPEG-4 encoder, and the Apple MPEG-4 encoder.

Complexity Comparison

It is not enough to deliver high-quality video. A video codec must also be efficient to decode, particularly when the codec is implemented in hardware. Lower complexity means less silicon, lower cost, and fewer problems with power consumption and heat.

Because they are more sophisticated, VC-1 and H.264 are both more complex to decode than MPEG-2. Yet VC-1 is less costly to decode than H.264. A study by 3GPP, a collaboration group that is setting 3G mobile phone standards, found that VC-1 Main Profile requires 25% fewer cycles than H.264 Baseline. It should be noted that H.264 Main Profile requires even more cycles than Baseline, because it includes context-adaptive binary arithmetic coding (CABAC), which is highly complex.

In fact, software decoding of VC-1 at 1080p (1920 × 1080 progressive) resolution is possible on today's off-the-shelf computer hardware. In the hardware domain, companies can do more with a single DSP because VC-1 is easier to implement.



VC-1 Adoption

VC-1 has already been adopted by the digital video industry and a number of standards bodies and industry organizations in addition to SMPTE.

Next-Generation Optical Media. All of the leading next-generation optical media formats have adopted VC-1 as a mandatory codec. The DVD Forum has mandated VC-1, H.264, and MPEG-2 for the HD DVD format. The Blu-ray Disc Association has mandated the same three codecs for their blue-laser Blu-ray Disc format. And the recent FVD standard from Taiwan has adopted VC-1 as the only mandated video codec.

Chips. Numerous DSP and chip manufacturers have begun to support VC-1.

Professional Video Equipment. VC-1 is being used for professional video broadcast and delivery today. Leading industry companies already have products on the market that support VC-1, ranging from encoders and decoders to professional video test equipment.

Home Networks. VC-1 is an optional format in the Digital Living Network Alliance (DLNA) standards. DLNA is developing a set of interoperability guidelines for home networks. These guidelines will enable computers, portable devices, and home consumer electronic devices such as stereos and set-top boxes to share digital media seamlessly over a home network.

Mobile Devices. VC-1 is one of the formats included in the Digital Video Broadcasting - Handheld (DVB-H) specification, and is a key component of Modeo's new DVB-H solution. VC-1 is also part of new broadband, Wi-Fi, and cellular delivery solutions such as MobiTV.

Transport Independence

The VC-1 codec is not tied to any particular transport mechanism. From the beginning, the codec was designed to take into account the existing MPEG-2 Systems layer. As a result, it can be used easily with existing broadcast infrastructures.
  • Closed captions, active format descriptions (AFD), and other information can be carried in user data.
  • The organization of the video stream into I, B, and P frames enables conventional tuning and trick modes.

In addition to participating in standards development, Microsoft is also working with companies to support broadcast solutions. For example, at the 2004 IBC convention, a prototype system was demonstrated that delivers VC-1 over satellite using DVB-S2. Pre-encoded video files were multiplexed, encapsulated in transport streams, and streamed to a DVB-S2 modulator. The satellite uplink was located in the United Kingdom, and the signal was received at the IBC convention center in Amsterdam, where it was demodulated, decoded, and played back on the convention floor.

Tools

Microsoft has a number of tools available to help companies adopt VC-1 technology.
  • Windows Media Encoder Studio Edition Beta 1 is a new addition to the Windows Media tools family. It is a tool for video professionals, optimized for high-quality offline encoding using Microsoft's implementation of the VC-1 video standard (WMV9). It provides the key features needed to create next-generation video content for optical media and video-on-demand scenarios.
  • Windows Media Encoder 9 Series is a general purpose software encoding application. It is freely downloadable. It supports a range of codecs and bit rates, from real-time encoding (suitable for live videoconferencing) to best quality. Users doing their encoding on Windows machines can update their versions of Windows Media Encoder 9 Series with a separately provided package.
  • A VC-1 porting kit is available for companies interested in porting their own VC-1 implementations in software or hardware. It covers encoders and decoders, is fully aligned with the latest SMPTE 421M specification, and comes with a full set of conformance test vectors. For additional information about licensing, see the Windows Media Licensing page.
  • Windows Media licensees have access to a collection of utilities that convert between ASF and MPEG-2 transport streams. These utilities allow studios to convert existing VC-1 content. The package contains an encoder utility that directly supports encapsulating VC-1 elementary streams inside MPEG-2 transport streams. For additional information about licensing, see the Windows Media Licensing page.
  • The SMPTE test materials are the official tools for VC-1 conformance testing. These materials include the source code for a reference decoder and a sample encoder as well as bitstreams for testing. The SMPTE test materials are acquired through the SMPTE VC-1 Test Materials Access program found at the SMPTE Web site.



SMPTE Standardization Background

SMPTE is the preeminent society of film and video experts, with members in 85 countries worldwide. The standards that SMPTE produces are widely used by professionals in the fields of video, motion pictures, and digital cinema.

The SMPTE standard for VC-1, SMPTE 421M, was originally based on Microsoft's Windows Media® Video 9 codec. The Windows Media Video 9 codec is functionally equivalent to VC-1; it is Microsoft's implementation of the VC-1 standard. VC-1 includes the Simple, Main, and Advanced Profiles that are described in the Overview of VC-1 section of this article.

The standardization process was undertaken by the SMPTE Video Compression Technology Committee, also known as C24. This committee is responsible for technologies that encode, process, switch, and decode video signals to, in, and from the compressed domain.

Microsoft chose to standardize the Windows Media Video 9 codec for a number of reasons, including accessibility and interoperability. Standardizing enables independent implementations and ensures those implementations will be interoperable. Standardizing the bitstream syntax and decoding process gives hardware manufacturers the resources and stability required to invest in creating decoders on chips in a variety of hardware devices. Isolating the video codec standard from the other parts of a complete video system enables the use of the codec on many types of hardware and in many different systems.

In addition to the other reasons for standardization, having an SMPTE standard that can be referenced by other format and system specifications makes the inclusion of VC-1 easier for independent companies and discourages the adoption of different versions of the technology by different organizations. Standardization also helps gain adoption of the technology by organizations that are committed to the use of open industry standards.

There are three documents produced for SMPTE to describe VC-1. The first document, SMPTE 421M, is the VC-1 specification itself. This is the main document providing comprehensive details of the VC-1 bitstream syntax and decoder semantics. The second document, SMPTE RP228, is the VC-1 conformance specification. This document describes the test procedures and criteria for determining conformance to the SMPTE 421M specification and includes reference source code and bitstreams. The third document, SMPTE RP227, is the VC-1 transport specification. The transport document provides details about carrying VC-1 elementary streams in MPEG-2 Program and Transport Streams.



Conclusion

VC-1 is a cutting-edge codec that offers very high image quality with excellent compression efficiency. The quality of the reconstructed content that VC-1 produces has been deemed superior to competing video compression standards in independent tests. At the same time, the Advanced Profile of VC-1 can encode content with up to three times the compression efficiency of MPEG-2. VC-1 is also capable of delivering high-definition video at bit rates as low as 6 to 8 Mbps.

The emphasis during development on reducing the computational power required by the VC-1 decoder provides advantages for a broad range of media consumers. Personal computer users can decode full 1080i/p resolution video with off-the-shelf hardware, making HD video delivery a reality for the home computer. Perhaps more important than the benefits of VC-1 to the personal computer market is its value in the consumer electronics space. Hardware supporting VC-1 includes next generation DVD players, set-top boxes, portable media devices, wireless phones, and more. Major industry players are selecting VC-1 for its scalability and quality.

VC-1 is leading the next wave of digital video. It is a high-quality codec that benefits from the resources of Microsoft® while being an open standard. Adopters can choose to develop custom implementations and solutions or to use the existing support provided by Windows Media technologies. VC-1 offers something for every digital video solution.
