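H.264 video inside an MP4 file can be packaged in one of two formats, h264 or avc1, and this detail is easy to overlook. I ran into it myself while adapting the LIVE555 streaming media server to support MP4 files.
(1) First, the difference between the two formats in principle:
AVC1: H.264 bitstream without start codes. Video transcoded with ffmpeg generally does not carry the 0x00000001 start code.
H264: H.264 bitstream with start codes. Movie rips in formats such as HD-DVD generally do carry the 0x00000001 start code.
Source: http://msdn.microsoft.com/zh-cn/library/dd757808(v=vs.85).aspx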
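To make the distinction concrete, here is a minimal sketch of a check for a leading start code. The function name `annexb_start_code_len` is hypothetical and not part of FFmpeg or live555. It is only a heuristic: a length-prefixed NALU whose 4-byte length happens to be 1, or falls in the range 256 to 511, begins with the same bytes as a start code.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns the start-code length (3 or 4) if buf begins with an Annex B
 * start code (00 00 01 or 00 00 00 01), or 0 otherwise.
 * Hypothetical helper for illustration only. */
static int annexb_start_code_len(const uint8_t *buf, size_t size)
{
    if (size >= 3 && buf[0] == 0 && buf[1] == 0 && buf[2] == 1)
        return 3;
    if (size >= 4 && buf[0] == 0 && buf[1] == 0 && buf[2] == 0 && buf[3] == 1)
        return 4;
    return 0;
}
```

A result of 0 on the first packets of a stream suggests the avc1 (length-prefixed) layout.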
(2) Next, the VLC player can show which format a file uses. After opening the video, choose [Tools] / [Codec Information] and look at the [Codec] field. For example:
Codec: H264 – MPEG-4 AVC (part 10) (avc1)
Codec: H264 – MPEG-4 AVC (part 10) (h264)
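(3) Finally, here is what I learned about turning the video stream that ffmpeg demuxes from an MP4 file into an H.264 ES stream that live555 can use directly:
For (avc1): after av_read_frame, the first four bytes of the packet are a length field. Simply overwrite those four bytes with 0x00, 0x00, 0x00, 0x01. Note that one frame may contain several NALUs, so the replacement has to be repeated for each of them: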
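AVPacket *packet = &pkt;
av_init_packet(packet);
if (av_read_frame(ctx, packet) >= 0 && packet->stream_index == 0)
{ // is video stream (this code assumes the video stream has index 0)
    const uint8_t start_code[4] = { 0, 0, 0, 1 };
    if (is_avc_ || (packet->size >= 4 && memcmp(start_code, packet->data, 4) != 0))
    { // is avc1 code, has no H.264 start code
        uint8_t *p = packet->data;
        uint8_t *end = packet->data + packet->size;
        is_avc_ = True;
        // add a start code to each NAL; one frame may hold multiple NALs
        while (end - p >= 4)
        {
            // read the 4-byte big-endian length byte by byte: casting to long*
            // reads 8 bytes on LP64 platforms and may be misaligned
            uint32_t len = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
                           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
            if (len > (uint32_t)(end - p - 4))
                break; // corrupt length: stop rather than run past the packet
            memcpy(p, start_code, 4);
            p += 4 + len;
        }
    }
}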
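The same in-place rewrite can be tried without FFmpeg on a plain byte buffer. The sketch below assumes a 4-byte length field, which is the only size for which the length and the start code occupy the same number of bytes. The function name `avcc4_to_annexb_inplace` is made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Overwrites every 4-byte big-endian length prefix in buf with 00 00 00 01.
 * Returns the number of NAL units converted, or -1 if a length field runs
 * past the end of the buffer (the buffer may then be partly rewritten).
 * Hypothetical helper for illustration only. */
static int avcc4_to_annexb_inplace(uint8_t *buf, size_t size)
{
    size_t pos = 0;
    int count = 0;

    while (pos < size) {
        if (size - pos < 4)
            return -1;                      /* truncated length field */
        uint32_t len = ((uint32_t)buf[pos] << 24) |
                       ((uint32_t)buf[pos + 1] << 16) |
                       ((uint32_t)buf[pos + 2] << 8) |
                       (uint32_t)buf[pos + 3];
        if (len > size - pos - 4)
            return -1;                      /* NAL would overrun the buffer */
        buf[pos] = 0;
        buf[pos + 1] = 0;
        buf[pos + 2] = 0;
        buf[pos + 3] = 1;
        pos += 4 + (size_t)len;
        count++;
    }
    return count;
}
```

Returning the NAL count makes it easy to confirm that a frame really did contain more than one NALU.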
For the other format, (h264), simply pass each packet through av_bitstream_filter_filter:
if (pkt->stream_index == 0)
{ // is video stream
    AVBitStreamFilterContext *bsfc = bsfc_;
    int a;
    while (bsfc) {
        AVPacket new_pkt = *pkt;
        a = av_bitstream_filter_filter(bsfc, encode_ctx_, NULL,
                                       &new_pkt.data, &new_pkt.size,
                                       pkt->data, pkt->size,
                                       pkt->flags & AV_PKT_FLAG_KEY);
        if (a == 0 && new_pkt.data != pkt->data && new_pkt.destruct) {
            // the filter returned a pointer into the old buffer: copy it into a
            // newly allocated, padded buffer (the new data should be a subset
            // of the old, so it cannot overflow)
            uint8_t *t = (uint8_t *)av_malloc(new_pkt.size + FF_INPUT_BUFFER_PADDING_SIZE);
            if (t) {
                memcpy(t, new_pkt.data, new_pkt.size);
                memset(t + new_pkt.size, 0, FF_INPUT_BUFFER_PADDING_SIZE);
                new_pkt.data = t;
                a = 1;
            } else
                a = AVERROR(ENOMEM);
        }
        if (a > 0 && pkt->data != new_pkt.data) {
            // the filter allocated new data: release the old packet
            av_free_packet(pkt);
            new_pkt.destruct = av_destruct_packet;
        } else if (a < 0) {
            envir() << "!!!!!!!!!!av_bitstream_filter_filter failed" << ",res=" << a << "\n";
        }
        *pkt = new_pkt;
        bsfc = bsfc->next;
    }
}
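I had long wondered why some videos are reported as H264 when decoded, while most are reported as AVC1.
I found the answer on Microsoft's MSDN while searching for programming material:
Original: http://msdn.microsoft.com/en-us/library/dd757808(v=vs.85).aspx
FOURCC: AVC1. Description: H.264 bitstream without start codes.
FOURCC: H264. Description: H.264 bitstream with start codes.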
H.264 Bitstream with Start Codes
H.264 bitstreams that are transmitted over the air, or contained in MPEG-2 program or transport streams, or recorded on HD-DVD, are formatted as described in Annex B of ITU-T Rec. H.264. According to this specification, the bitstream consists of a sequence of network abstraction layer units (NALUs), each of which is prefixed with a start code equal to 0x000001 or 0x00000001.
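Roughly, this says: H.264 video with start codes is generally used for over-the-air transmission, MPEG-2 program or transport streams, and HD-DVD. In these streams every NALU is preceded by a start code, either 0x000001 or 0x00000001.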
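Since an Annex B stream marks NALU boundaries only by start codes, a reader has to scan for them. The following is a rough sketch of that scan. The helper names are hypothetical, and it relies on the fact that a 4-byte start code is just a 3-byte one preceded by a zero byte.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns the offset of the next 00 00 01 pattern at or after `from`,
 * or `size` if there is none. */
static size_t find_start_code(const uint8_t *buf, size_t size, size_t from)
{
    for (size_t i = from; i + 3 <= size; i++) {
        if (buf[i] == 0 && buf[i + 1] == 0 && buf[i + 2] == 1)
            return i;
    }
    return size;
}

/* Counts the NAL units in an Annex B buffer. If sizes is not NULL, the
 * payload size of each of the first `max` units is stored in it. */
static int annexb_nal_sizes(const uint8_t *buf, size_t size,
                            size_t *sizes, int max)
{
    int count = 0;
    size_t sc = find_start_code(buf, size, 0);

    while (sc < size) {
        size_t nal_start = sc + 3;
        size_t next = find_start_code(buf, size, nal_start);
        size_t nal_end = next;

        /* Zero bytes before the next start code are not NAL payload:
         * they are the leading zero of a 4-byte start code or padding. */
        while (nal_end > nal_start && buf[nal_end - 1] == 0)
            nal_end--;

        if (sizes != NULL && count < max)
            sizes[count] = nal_end - nal_start;
        count++;
        sc = next;
    }
    return count;
}
```

Trimming trailing zero bytes is safe here because a NAL unit never ends in a 0x00 byte.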
H.264 Bitstream Without Start Codes
The MP4 container format stores H.264 data without start codes. Instead, each NALU is prefixed by a length field, which gives the length of the NALU in bytes. The size of the length field can vary, but is typically 1, 2, or 4 bytes.
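Roughly, this says: H.264 video without start codes is mainly what gets stored in MP4 files. Instead of a start code, every NALU is preceded by a length field of 1, 2, or 4 bytes.
"NALU" in the original text is, simply put, the most basic unit of the H.264 format: one packet of data.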
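When the length field is 1 or 2 bytes, a 4-byte start code no longer fits in its place, so the data has to be copied into a second buffer. Below is a minimal sketch under that assumption. `avcc_to_annexb` is a hypothetical name, and the caller is assumed to know the length-field size from the container.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Converts length-prefixed NAL units (len_size = 1, 2 or 4 bytes, big-endian)
 * into Annex B form with 4-byte start codes, writing to `out`.
 * Returns the number of bytes written, or -1 on malformed input, an
 * unsupported len_size, or insufficient output capacity. */
static long avcc_to_annexb(const uint8_t *in, size_t in_size, int len_size,
                           uint8_t *out, size_t out_cap)
{
    static const uint8_t start_code[4] = {0, 0, 0, 1};
    size_t pos = 0, written = 0;

    if (len_size != 1 && len_size != 2 && len_size != 4)
        return -1;

    while (pos < in_size) {
        size_t len = 0;

        if (in_size - pos < (size_t)len_size)
            return -1;                      /* truncated length field */
        for (int i = 0; i < len_size; i++)
            len = (len << 8) | in[pos + i];
        pos += (size_t)len_size;

        if (len > in_size - pos)
            return -1;                      /* NAL would overrun the input */
        if (out_cap - written < 4 || out_cap - written - 4 < len)
            return -1;                      /* output buffer too small */

        memcpy(out + written, start_code, 4);
        memcpy(out + written + 4, in + pos, len);
        written += 4 + len;
        pos += len;
    }
    return (long)written;
}
```

The output needs `in_size + (4 - len_size) * n` bytes for `n` NAL units, so it grows only when the length field is shorter than 4 bytes.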
http://www.mysilu.com/archiver/?tid-721741.html
The following is reproduced from: https://msdn.microsoft.com/zh-cn/library/dd757808(v=vs.85).aspx
H.264 Video Types
The following media subtypes are defined for H.264 video.
Subtype | FOURCC | Description |
---|---|---|
MEDIASUBTYPE_AVC1 | 'AVC1' | H.264 bitstream without start codes. |
MEDIASUBTYPE_H264 | 'H264' | H.264 bitstream with start codes. |
MEDIASUBTYPE_h264 | 'h264' | Equivalent to MEDIASUBTYPE_H264, with a different FOURCC. |
MEDIASUBTYPE_X264 | 'X264' | Equivalent to MEDIASUBTYPE_H264, with a different FOURCC. |
MEDIASUBTYPE_x264 | 'x264' | Equivalent to MEDIASUBTYPE_H264, with a different FOURCC. |
These subtype GUIDs are declared in wmcodecdsp.h.
The main difference between these media types is the presence of start codes in the bitstream. If the subtype is MEDIASUBTYPE_AVC1, the bitstream does not contain start codes.
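The table above reduces to a single question per FOURCC: are start codes present or not? A small sketch of that lookup follows. The function name is hypothetical. Accepting lowercase `avc1`, the spelling used by MP4 sample entries and shown by VLC, is an addition that is not in the MSDN table.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if the FOURCC denotes H.264 with start codes, 0 if it denotes
 * H.264 without start codes, and -1 for any other FOURCC.
 * Hypothetical helper for illustration only. */
static int h264_fourcc_has_start_codes(const char *fourcc)
{
    static const char *const with_start_codes[] = {
        "H264", "h264", "X264", "x264"
    };

    /* "avc1" in lowercase is not in the MSDN table; it is accepted here
     * as an assumption because MP4 files spell it that way. */
    if (strcmp(fourcc, "AVC1") == 0 || strcmp(fourcc, "avc1") == 0)
        return 0;

    for (size_t i = 0; i < sizeof with_start_codes / sizeof with_start_codes[0]; i++) {
        if (strcmp(fourcc, with_start_codes[i]) == 0)
            return 1;
    }
    return -1;
}
```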
H.264 Bitstream with Start Codes
H.264 bitstreams that are transmitted over the air, or contained in MPEG-2 program or transport streams, or recorded on HD-DVD, are formatted as described in Annex B of ITU-T Rec. H.264. According to this specification, the bitstream consists of a sequence of network abstraction layer units (NALUs), each of which is prefixed with a start code equal to 0x000001 or 0x00000001.
When start codes are present in the bitstream, the following media type is used:
Major type | MEDIATYPE_Video |
---|---|
Subtypes | MEDIASUBTYPE_H264, MEDIASUBTYPE_h264, MEDIASUBTYPE_X264, or MEDIASUBTYPE_x264 |
Format type | FORMAT_VideoInfo, FORMAT_VideoInfo2, FORMAT_MPEG2Video, or GUID_NULL |
If the format type is GUID_NULL, no format structure is present.
When the bitstream contains start codes, any of the format types listed here is sufficient, because the decoder does not require any additional information to parse the stream. The bitstream already contains all of the information needed by the decoder, and the start codes enable the decoder to locate the start of each NALU.
The following subtypes are equivalent: MEDIASUBTYPE_H264, MEDIASUBTYPE_h264, MEDIASUBTYPE_X264, and MEDIASUBTYPE_x264.
H.264 Bitstream Without Start Codes
The MP4 container format stores H.264 data without start codes. Instead, each NALU is prefixed by a length field, which gives the length of the NALU in bytes. The size of the length field can vary, but is typically 1, 2, or 4 bytes.
When start codes are not present in the bitstream, the following media type is used.
Major type | MEDIATYPE_Video |
---|---|
Subtype | MEDIASUBTYPE_AVC1 |
Format type | FORMAT_MPEG2Video |
The format block is an MPEG2VIDEOINFO structure. This structure should be filled in as follows:
- hdr: A VIDEOINFOHEADER2 structure that describes the bitstream. No color table is present after the BITMAPINFOHEADER portion of the structure, and biClrUsed must be zero.
- dwStartTimeCode: Not used. Set to zero.
- cbSequenceHeader: The length of the dwSequenceHeader array in bytes.
- dwProfile: Specifies the H.264 profile.
- dwLevel: Specifies the H.264 level.
- dwFlags: The number of bytes used for the length field that appears before each NALU. The length field indicates the size of the following NALU in bytes. For example, if dwFlags is 4, each NALU is preceded by a 4-byte length field. The valid values are 1, 2, and 4.
- dwSequenceHeader: A byte array that may contain sequence parameter set (SPS) and picture parameter set (PPS) NALUs.
The MP4 container might contain sequence parameter sets (SPS) or picture parameter sets (PPS) as special NAL units in file headers or in a separate stream (distinct from the video stream). When the format is established, the media type can specify SPS and PPS NAL units in the dwSequenceHeader array. If cbSequenceHeader is greater than zero, dwSequenceHeader is the start of a byte array containing SPS and PPS NALUs, delimited by 2-byte length fields, all in network byte order (big-endian). It is possible to have both SPS and PPS, only one of these types, or none. The actual type of each NALU can be determined by examining the nal_unit_type field of the NALU itself.
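As an illustration of the layout just described, the sketch below walks a byte array of NALUs delimited by 2-byte big-endian length fields and classifies each one by nal_unit_type (the low 5 bits of the first NALU byte: 7 for SPS, 8 for PPS). The function name is hypothetical, and the caller is assumed to pass dwSequenceHeader as a byte pointer together with cbSequenceHeader as its size.

```c
#include <stddef.h>
#include <stdint.h>

/* Counts the SPS and PPS NAL units in a buffer laid out as
 * [2-byte big-endian length][NALU][2-byte length][NALU]...
 * Returns 0 on success, or -1 if the buffer is malformed. */
static int count_sps_pps(const uint8_t *hdr, size_t size, int *sps, int *pps)
{
    size_t pos = 0;

    *sps = 0;
    *pps = 0;
    while (pos < size) {
        if (size - pos < 2)
            return -1;                      /* truncated length field */
        size_t len = ((size_t)hdr[pos] << 8) | hdr[pos + 1];
        pos += 2;
        if (len == 0 || len > size - pos)
            return -1;                      /* empty or overrunning NALU */

        int nal_unit_type = hdr[pos] & 0x1F;
        if (nal_unit_type == 7)
            (*sps)++;
        else if (nal_unit_type == 8)
            (*pps)++;
        pos += len;
    }
    return 0;
}
```

An empty array is reported as success with both counts at zero, which matches the case where neither SPS nor PPS is supplied.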
When this media type is used, each media sample starts at the beginning of a NALU, and NAL units do not span samples. This enables the decoder to recover from data corruption or dropped samples.