Ref frames and Video compression picture types

Latest recommended article published on 2022-07-17 02:27:41

taolinke

Reference frame (video)

Reference frames are frames of a compressed video that are used to define future frames. As such, they are only used in inter-frame compression techniques. In older video encoding standards, such as MPEG-2, only one reference frame (the previous frame) was used for P-frames. Two reference frames (one past and one future) were used for B-frames.

Multiple reference frames

Some modern video encoding standards, such as H.264/AVC, allow the use of multiple reference frames. This allows the video encoder to choose among more than one previously decoded frame on which to base each macroblock in the next frame. While the best frame for this purpose is usually the previous frame, the extra reference frames can improve compression efficiency, video quality, or both. The maximum number of concurrent reference frames supported by H.264 is 16. Different reference frames can be chosen for different macroblocks in the same frame, and even for each 8x8 partition of a macroblock.

Another video format that supports multiple reference frames is Snow, which can handle up to eight. The Theora codec provides a limited form of multiple reference frames, allowing references to both the preceding frame and the most recent intra frame.
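The per-block choice described above can be sketched in a few lines. This is a minimal illustration, not how any particular encoder works: it compares only the co-located block in each reference (no motion search), uses the sum of absolute differences (SAD) as the only cost, and ignores the bits needed to signal the chosen reference. The names `sad`, `get_block`, and `pick_reference` are hypothetical.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized 2-D blocks."""
    return sum(
        abs(a - b)
        for row_a, row_b in zip(block_a, block_b)
        for a, b in zip(row_a, row_b)
    )


def get_block(frame, top, left, size):
    """Cut a size x size block out of a frame stored as a list of rows."""
    return [row[left:left + size] for row in frame[top:top + size]]


def pick_reference(current, references, top, left, size):
    """Return (reference index, SAD) for the best-matching co-located block.

    Assumption: no motion search, so only the block at the same position
    in each reference frame is tested.
    """
    target = get_block(current, top, left, size)
    costs = [sad(target, get_block(ref, top, left, size)) for ref in references]
    best = min(range(len(costs)), key=costs.__getitem__)
    return best, costs[best]


# Two tiny 2x4 reference frames: index 0 is the most recent, index 1 is older.
ref_recent = [[10, 10, 50, 50], [10, 10, 50, 50]]
ref_older = [[0, 0, 90, 90], [0, 0, 90, 90]]
current = [[10, 11, 90, 90], [10, 10, 89, 90]]

left_choice = pick_reference(current, [ref_recent, ref_older], 0, 0, 2)
right_choice = pick_reference(current, [ref_recent, ref_older], 0, 2, 2)
```

In this toy input the left block is closest to the most recent frame while the right block is closest to the older one, which is the situation where a second reference frame pays off.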

Disadvantages

Encoding

Multiple reference frames can considerably increase encoding time because many of the decisions, such as motion estimation, that are ordinarily carried out on only one reference frame have to be repeated on all of the reference frames. Heuristics can reduce this slowdown, at some cost in quality. Very high numbers of reference frames are rarely useful in terms of quality for live-action material because frames from farther back in time generally have less and less correlation with the current frame. This is not as true for animated sources, where repetitive motion can make high numbers of reference frames more useful.
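To get a feel for the scaling, the rough count below assumes an exhaustive (full) motion search with a fixed block size and a square search window, which real encoders usually avoid through the heuristics mentioned above. The function name and all parameter values are illustrative assumptions.

```python
def full_search_evaluations(width, height, block, search_range, num_refs):
    """Count block comparisons for an exhaustive motion search.

    Assumptions: fixed block size, integer-pixel positions only, and every
    candidate offset in a (2 * search_range + 1)^2 window is tested in
    every reference frame.
    """
    blocks = (width // block) * (height // block)
    candidates = (2 * search_range + 1) ** 2
    return blocks * candidates * num_refs


one_ref = full_search_evaluations(1280, 720, 16, 16, 1)
four_refs = full_search_evaluations(1280, 720, 16, 16, 4)
ratio = four_refs // one_ref  # the work grows linearly with the reference count
```

Under these assumptions the number of comparisons is directly proportional to the number of reference frames, which is why practical encoders prune the search rather than test every reference exhaustively.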

Decoding

When decoding, reference frames must be stored in memory until they are no longer needed for further decoding. This can considerably raise the memory usage of the decoder for videos with large numbers of reference frames. The use of several reference frames also decreases locality of reference, which can slow decoding.
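A back-of-the-envelope estimate of that memory cost is sketched below. It assumes 8-bit 4:2:0 frames (one full-resolution luma plane plus two quarter-resolution chroma planes) with no padding or alignment, and it counts only the reference frames themselves. Real decoders hold additional buffers, and H.264 levels further limit how many reference frames are allowed at a given resolution. The function names are hypothetical.

```python
def frame_bytes_yuv420_8bit(width, height):
    """Bytes for one 8-bit 4:2:0 frame: luma plus two quarter-size chroma planes."""
    luma = width * height
    chroma = 2 * (width // 2) * (height // 2)
    return luma + chroma


def reference_memory_bytes(width, height, num_refs):
    """Memory needed to keep num_refs decoded reference frames (no padding)."""
    return frame_bytes_yuv420_8bit(width, height) * num_refs


one_frame = frame_bytes_yuv420_8bit(1920, 1080)
sixteen_refs = reference_memory_bytes(1920, 1080, 16)
sixteen_refs_mib = sixteen_refs / (1024 * 1024)
```

With these assumptions a single 1920x1080 frame takes about 3 MB, so sixteen reference frames come to roughly 47 MiB before any other decoder state is counted.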

Video compression picture types

In the field of video compression, a video frame can be compressed using different algorithms, each with its own advantages and disadvantages, which differ mainly in how much data compression they achieve. These different algorithms for video frames are called picture types or frame types. The three major picture types used in the different video algorithms are I, P and B. They differ in the following characteristics:

I‑frames are the least compressible but don't require other video frames to decode.
P‑frames can use data from previous frames to decompress and are more compressible than I‑frames.
B‑frames can use both previous and following frames for data reference to get the highest amount of data compression.
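One consequence of the list above is that a B‑frame cannot be decoded until the later frame it refers to is available, so frames are decoded in a different order from the one in which they are displayed. The sketch below assumes the classic arrangement in which each B‑frame depends on the nearest I‑ or P‑frame on either side and is never used as a reference itself; newer standards allow more flexible structures. The function name `decode_order` is hypothetical.

```python
def decode_order(display_types):
    """Map a display-order string of picture types to decode-order indices.

    Assumption: B-frames reference only the nearest I/P frame before and
    after them, so each I/P frame is decoded ahead of the B-frames that
    precede it in display order.
    """
    order = []
    pending_b = []
    for index, kind in enumerate(display_types):
        if kind == "B":
            pending_b.append(index)   # must wait for the next I/P frame
        else:
            order.append(index)       # decode the I/P frame first
            order.extend(pending_b)   # then the B-frames that were waiting
            pending_b = []
    order.extend(pending_b)           # trailing B-frames with no later I/P frame
    return order


display = "IBBPBBP"
order = decode_order(display)
decoded_types = "".join(display[i] for i in order)
```

For the display sequence `IBBPBBP`, each P‑frame moves ahead of the two B‑frames that are shown before it.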

There are three types of pictures (or frames) used in video compression: I‑frames, P‑frames, and B‑frames. An I‑frame is an 'Intra-coded picture', in effect a fully-specified picture, like a conventional static image file. P‑frames and B‑frames hold only part of the image information, so they need less space to store than an I‑frame, and thus improve video compression rates.

A P‑frame ('Predicted picture') holds only the changes in the image from the previous frame. For example, in a scene where a car moves across a stationary background, only the car's movements need to be encoded. The encoder does not need to store the unchanging background pixels in the P‑frame, thus saving space. P‑frames are also known as delta‑frames. A B‑frame ('Bi-predictive picture') saves even more space by using differences between the current frame and both the preceding and following frames to specify its content.
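The idea of storing only differences can be illustrated with one-dimensional "frames" of pixel values. This is a simplified sketch: it uses plain sample-by-sample differences with no motion compensation, transform, or entropy coding, and the bi-prediction is a rounded average of the two neighbouring frames. All function names and pixel values are made up for the example.

```python
def residual(prediction, current):
    """Difference that must be stored when `prediction` is already known."""
    return [c - p for p, c in zip(prediction, current)]


def reconstruct(prediction, res):
    """Rebuild the current frame from the prediction plus the stored residual."""
    return [p + r for p, r in zip(prediction, res)]


def bi_prediction(previous, following):
    """Rounded average of the preceding and following frames."""
    return [(a + b + 1) // 2 for a, b in zip(previous, following)]


# P-frame case: a "car" (value 9) moves one pixel over a static background (0).
prev_row = [0, 9, 9, 0, 0, 0, 0, 0]
cur_row = [0, 0, 9, 9, 0, 0, 0, 0]
p_res = residual(prev_row, cur_row)
changed_pixels = sum(1 for r in p_res if r != 0)

# B-frame case: a steady fade, where the middle frame lies between its neighbours.
fade_prev = [10, 10, 10, 10]
fade_cur = [15, 15, 15, 15]
fade_next = [20, 20, 20, 20]
p_fade = residual(fade_prev, fade_cur)
b_fade = residual(bi_prediction(fade_prev, fade_next), fade_cur)
```

In the car example only the pixels at the edges of the moving object produce non-zero differences. In the fade example, predicting from the previous frame alone leaves a difference at every pixel, while predicting from both neighbours leaves none.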