H.264 extradata (partially) explained - for dummies

While this article will seem obvious and redundant to anyone who is fluent in H.264, I'm hoping it will be useful to people who stumble upon this issue.

I'm not going to go into any details about H.264 internals, parameters or anything like that. Instead, in this short article I'm going to treat H.264 as a big opaque black box which has to be fed an annoying piece of information known as "extradata".

What's so annoying about extradata?
  1. it comes in two different flavors
  2. you need rudimentary knowledge of the H.264 bitstream in order to retrieve it
  3. you need rudimentary knowledge of the H.264 bitstream in order to know which flavor you need

But first, we need to learn about the two different flavors of H.264 bitstream.

Annex B format

In this format, each NAL is preceded by a start code: 0x00 0x00 0x00 0x01 (a three-byte form, 0x00 0x00 0x01, is also allowed).
Thus, in order to know where a NAL starts and where it stops, you need to read each byte of the bitstream looking for these start codes, which can be a pain if you need to convert between this format and the other format.
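To make the "read each byte looking for start codes" part concrete, here is a minimal Python sketch. The function name split_annexb is my own (hypothetical), and the sketch assumes the whole stream is already in memory. It relies on the H.264 rule that a NAL unit never ends in a zero byte, so any trailing zeros are treated as the first byte of a four-byte start code.

```python
def split_annexb(data: bytes) -> list:
    """Return the NAL units of an Annex B stream, without their start codes."""
    nals = []
    i = data.find(b"\x00\x00\x01")  # also matches the tail of 00 00 00 01
    while i != -1:
        start = i + 3
        nxt = data.find(b"\x00\x00\x01", start)
        end = len(data) if nxt == -1 else nxt
        # Trailing zeros belong to the next (four-byte) start code or to padding.
        nal = data[start:end].rstrip(b"\x00")
        if nal:
            nals.append(nal)
        i = nxt
    return nals


stream = (b"\x00\x00\x00\x01\x67\x42\x00\x1e"   # four-byte start code
          b"\x00\x00\x01\x68\xce"               # three-byte start code
          b"\x00\x00\x00\x01\x65\x88\x84")
nals = split_annexb(stream)
```

In this toy stream the three payloads are made-up bytes, not a decodable video.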

AVCC format

This "non Annex B" format is known as AVCC format. In this format, each NAL is preceded by a nal_size field holding the length of the NAL in bytes (big-endian). The field itself is 4 bytes long in many cases, but it can also be 1 or 2 bytes, so its size cannot be assumed. In fact, this is part of the reason why a decoder needs any "extra data" in the first place.
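For comparison, walking an AVCC buffer needs no searching at all, as long as you know how wide the nal_size field is. A rough Python sketch, where split_avcc and its length_size parameter are names I made up for illustration:

```python
def split_avcc(data: bytes, length_size: int = 4) -> list:
    """Return the NAL units of an AVCC buffer, given the nal_size field width."""
    nals = []
    pos = 0
    while pos + length_size <= len(data):
        # nal_size is a big-endian unsigned integer of length_size bytes.
        size = int.from_bytes(data[pos:pos + length_size], "big")
        pos += length_size
        if pos + size > len(data):
            raise ValueError("truncated NAL unit")
        nals.append(data[pos:pos + size])
        pos += size
    return nals


avcc = b"\x00\x00\x00\x02\x68\xce" + b"\x00\x00\x00\x03\x65\x88\x84"
nals = split_avcc(avcc, length_size=4)
```

Passing the wrong length_size produces garbage sizes straight away, which is exactly why the decoder has to be told the field width up front.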

So, why does a decoder need extradata anyway?
  1. it needs to know what flavor of the bitstream to expect
  2. if AVCC format is used, it needs to know the size of the nal_size field, in bytes
  3. if the parameters for decoding are not repeated every keyframe, but rather specified only once (such as in a file), it needs those parameters (the SPS & PPS in H.264 speak)

How do I get this extradata?

When reading from a file, the extradata is usually part of the file headers, and you (or the demuxer) need to extract it from there.

If the extradata is repeated with every keyframe, you can try to extract it from the bitstream itself. Most of the time it will be bundled in the same buffer or packet as the keyframe, preceding it.
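Picking the SPS and PPS out of such a packet is a matter of looking at the NAL type, which lives in the low 5 bits of the first byte of each NAL (7 is an SPS, 8 is a PPS). A small Python sketch, assuming the packet has already been split into individual NALs; the function name is hypothetical:

```python
NAL_TYPE_SPS = 7
NAL_TYPE_PPS = 8


def extract_parameter_sets(nals):
    """Return (sps_list, pps_list) from a list of NAL units without start codes."""
    sps_list = [n for n in nals if n and (n[0] & 0x1F) == NAL_TYPE_SPS]
    pps_list = [n for n in nals if n and (n[0] & 0x1F) == NAL_TYPE_PPS]
    return sps_list, pps_list


# Made-up payloads: an SPS (0x67), a PPS (0x68) and an IDR slice (0x65).
packet_nals = [b"\x67\x42\x00\x1e", b"\x68\xce", b"\x65\x88\x84"]
sps_list, pps_list = extract_parameter_sets(packet_nals)
```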

If the bitstream is in annex-b format, you're in luck! You don't really need the extradata, because the codec can figure it out by itself from the bitstream. At most, you will need to tell the decoder to treat the bitstream as annex-b, which is often achieved by NOT supplying any extradata to begin with.

On the other hand, if the bitstream is in avcc format, you desperately need this extradata. Without it the decoder doesn't know how long the nal_size field is, and thus cannot even parse the bitstream.
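To show what the decoder actually digs out of AVCC extradata, here is a Python sketch that reads the nal_size width plus the SPS and PPS, following the byte layout given in the AVCC pseudo code later in this article. The function name is my own, and the validation is deliberately minimal.

```python
def parse_avcc_extradata(extradata: bytes):
    """Return (length_size, sps_list, pps_list) from AVCC extradata."""
    if len(extradata) < 7 or extradata[0] != 1:
        raise ValueError("not AVCC extradata")
    length_size = (extradata[4] & 0x03) + 1   # low 2 bits hold size - 1
    num_sps = extradata[5] & 0x1F             # low 5 bits hold the SPS count
    pos = 6
    sps_list = []
    for _ in range(num_sps):
        size = int.from_bytes(extradata[pos:pos + 2], "big")
        pos += 2
        sps_list.append(extradata[pos:pos + size])
        pos += size
    num_pps = extradata[pos]
    pos += 1
    pps_list = []
    for _ in range(num_pps):
        size = int.from_bytes(extradata[pos:pos + 2], "big")
        pos += 2
        pps_list.append(extradata[pos:pos + size])
        pos += size
    return length_size, sps_list, pps_list


# Made-up example: one 4-byte SPS and one 2-byte PPS.
example = bytes([0x01, 0x42, 0x00, 0x1E, 0xFF, 0xE1,
                 0x00, 0x04, 0x67, 0x42, 0x00, 0x1E,
                 0x01, 0x00, 0x02, 0x68, 0xCE])
length_size, sps_list, pps_list = parse_avcc_extradata(example)
```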

Suppose I have the SPS and PPS information, how do I create the extradata?

Again, annex-b format is the easy case. You just use the following pseudo code:

write(0x00)
write(0x00)
write(0x00)
write(0x01)
for each byte b in SPS
  write(b)

for each PPS p in PPS_array
  write(0x00)
  write(0x00)
  write(0x00)
  write(0x01)
  for each byte b in p
    write(b)
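
The same thing as a runnable Python sketch, assuming the SPS and each PPS are available as bytes objects without start codes (the function name is hypothetical):

```python
START_CODE = b"\x00\x00\x00\x01"


def build_annexb_extradata(sps: bytes, pps_list) -> bytes:
    """Concatenate the SPS and every PPS, each preceded by a start code."""
    out = bytearray(START_CODE + sps)
    for pps in pps_list:
        out += START_CODE + pps
    return bytes(out)


# Made-up parameter sets, just to show the layout.
extradata = build_annexb_extradata(b"\x67\x42\x00\x1e", [b"\x68\xce"])
```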

On the other hand, AVCC format extradata is more complicated:

write(0x1);                  // version
write(sps[0].data[1]);       // profile (data[0] is the NAL header byte)
write(sps[0].data[2]);       // compatibility
write(sps[0].data[3]);       // level
write(0xFC | 3);             // reserved (6 bits), NALU length size - 1 (2 bits)
write(0xE0 | 1);             // reserved (3 bits), num of SPS (5 bits)
write_word(sps[0].size);     // 2 bytes (big-endian) for length of SPS
for(size_t i=0 ; i < sps[0].size ; ++i)
  write(sps[0].data[i]);     // data of SPS

write(pps.size());           // num of PPS (1 byte)
for(size_t i=0 ; i < pps.size() ; ++i) {
  write_word(pps[i].size);   // 2 bytes (big-endian) for length of PPS
  for(size_t j=0 ; j < pps[i].size ; ++j)
    write(pps[i].data[j]);   // data of PPS
}
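
For those who prefer something they can run, here is roughly the same layout in Python. It is a sketch under a few assumptions: the parameter sets are bytes objects without start codes, the SPS begins with its NAL header byte, and the names build_avcc_extradata and length_size are my own.

```python
import struct


def build_avcc_extradata(sps_list, pps_list, length_size: int = 4) -> bytes:
    """Build AVCC extradata from lists of SPS and PPS NAL units."""
    sps0 = sps_list[0]
    out = bytearray([
        0x01,                       # version
        sps0[1],                    # profile (sps0[0] is the NAL header byte)
        sps0[2],                    # compatibility
        sps0[3],                    # level
        0xFC | (length_size - 1),   # reserved (6 bits), nal_size length - 1 (2 bits)
        0xE0 | len(sps_list),       # reserved (3 bits), number of SPS (5 bits)
    ])
    for sps in sps_list:
        out += struct.pack(">H", len(sps)) + sps   # 2-byte big-endian length, data
    out.append(len(pps_list))                      # number of PPS (1 byte)
    for pps in pps_list:
        out += struct.pack(">H", len(pps)) + pps
    return bytes(out)


# Made-up parameter sets, just to show the layout.
extradata = build_avcc_extradata([b"\x67\x42\x00\x1e"], [b"\x68\xce"])
```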


Notice how the first byte of the avcc extradata is 1, which makes it obvious that it is not the start of an annex-b extradata (which must begin with 0x00).
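
That first-byte trick can be turned into a tiny check. A Python sketch with a made-up function name; treating empty extradata as annex-b follows the remark earlier in this article and is a convention, not a rule:

```python
def extradata_flavor(extradata: bytes) -> str:
    """Guess the extradata flavor from its first bytes."""
    if not extradata:
        return "annexb"   # no extradata: assume an Annex B bitstream
    if extradata[0] == 1:
        return "avcc"     # AVCC extradata starts with version 1
    if extradata.startswith((b"\x00\x00\x01", b"\x00\x00\x00\x01")):
        return "annexb"   # starts with a start code
    return "unknown"


flavor = extradata_flavor(b"\x01\x42\x00\x1e\xff\xe1")
```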

Notes about .mov files and Quicktime

Internally (at least with version 7.0), QuickTime codecs work only with avcc format and not with annex b format. That means that if you are unlucky enough to have H.264 in annex b format and need to decode it with QuickTime codecs (for instance on an iPhone), you would need to:
  1. convert the annex b start codes (0x00 0x00 0x00 0x01) into 4-byte long avcc nal_size fields.
    this requires a loop through the entire buffer, searching for these start codes
  2. extract the SPS and PPS NALs, and create an extradata buffer from them in the special format outlined above.

Additionally, since the .mov container is basically a QuickTime container, it is natural that H.264 is stored in .mov files in AVCC format. Thus .mov muxers need to know how to convert annex-b formatted H.264 buffers into AVCC formatted H.264 buffers, and also how to convert the extradata buffer into one usable with AVCC format.
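
The buffer conversion can be sketched in a few lines of Python. This is not how QuickTime or any particular muxer does it, just an illustration with a hypothetical function name. It assumes the whole buffer is in memory and relies on the H.264 rule that a NAL unit never ends in a zero byte.

```python
def annexb_to_avcc(data: bytes, length_size: int = 4) -> bytes:
    """Replace Annex B start codes with big-endian nal_size fields."""
    out = bytearray()
    # Splitting on 00 00 01 also handles 00 00 00 01: the extra leading
    # zero ends up at the end of the previous chunk and is stripped below.
    chunks = data.split(b"\x00\x00\x01")
    for chunk in chunks[1:]:          # chunks[0] is whatever precedes the first start code
        nal = chunk.rstrip(b"\x00")
        if nal:
            out += len(nal).to_bytes(length_size, "big") + nal
    return bytes(out)


annexb = b"\x00\x00\x00\x01\x68\xce" + b"\x00\x00\x01\x65\x88\x84"
avcc = annexb_to_avcc(annexb)
```

The length_size used here has to match the value written into the AVCC extradata, otherwise the decoder will misread every NAL.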



