Multimedia Container Formats in Detail (01): MP4

MP4 (MPEG-4 Part 14) is a common multimedia container format, defined in the ISO/IEC 14496-14 standard.

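1. The smallest unit: the box

Just as FLV has tags, MKV has EBML elements, and ASF files have ASF objects, an MP4 file is made up of a series of boxes. The box is its smallest building block.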



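size: the total size of the box in bytes, including the header.
type: the type of the box (see Appendix Table 1).
largesize: if the box is too large for its size to fit in a uint32, size is set to 1 and the actual size is stored in the largesize field that follows.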
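The header layout above can be sketched in Python. This is only an illustration under stated assumptions: the function name `read_box_header` and the hand-built sample bytes are made up for this example, and fields are read as big-endian.

```python
import struct

def read_box_header(buf, offset=0):
    """Return (box_type, box_size, header_size) for the box at offset."""
    size, box_type = struct.unpack_from(">I4s", buf, offset)
    header_size = 8
    if size == 1:
        # size == 1: the real size is in the 64-bit largesize field
        (size,) = struct.unpack_from(">Q", buf, offset + 8)
        header_size = 16
    return box_type.decode("latin-1"), size, header_size

# Hand-built samples: a 'free' box with an 8-byte payload,
# once with a 32-bit size and once using largesize.
small = struct.pack(">I4s", 16, b"free") + bytes(8)
large = struct.pack(">I4sQ", 1, b"free", 24) + bytes(8)
print(read_box_header(small))
print(read_box_header(large))
```

The returned size always includes the header, so `offset + size` is where the next box starts.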

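2. Overall structure of an MP4 file

Put simply, an MP4 file is a series of boxes, with smaller boxes nested inside larger ones.
The following sections go into the individual boxes to analyze the MP4 format in detail.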
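The "boxes inside boxes" layout can be walked with a short loop. The sketch below is not a full parser: `iter_boxes` and `make_box` are hypothetical helper names, the sample file is hand-built, and a size of 0 is not handled.

```python
import struct

def iter_boxes(buf, start=0, end=None):
    """Yield (box_type, payload_start, box_end) for each box in buf[start:end]."""
    end = len(buf) if end is None else end
    pos = start
    while pos + 8 <= end:
        size, box_type = struct.unpack_from(">I4s", buf, pos)
        header = 8
        if size == 1:  # 64-bit largesize follows the type
            (size,) = struct.unpack_from(">Q", buf, pos + 8)
            header = 16
        if size < header:  # malformed input; stop instead of looping forever
            break
        yield box_type.decode("latin-1"), pos + header, pos + size
        pos += size

def make_box(box_type, payload=b""):
    """Hypothetical helper: build a box with a 32-bit size field."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload

# Hand-built sample: ftyp, then moov containing mvhd and trak
data = make_box(b"ftyp", b"isom" + bytes(4)) + make_box(
    b"moov", make_box(b"mvhd", bytes(4)) + make_box(b"trak"))

top = list(iter_boxes(data))
names = [box_type for box_type, _, _ in top]
_, moov_start, moov_end = top[1]
children = [box_type for box_type, _, _ in iter_boxes(data, moov_start, moov_end)]
print(names, children)
```

Calling `iter_boxes` again on the payload range of a container box is all it takes to descend one level.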


Appendix Table 1

Code | Abstract | Defined in/by
ainf | Asset information to identify, license and play | DECE
albm | Album title and track number (user-data) | 3GPP
auth | Media author name (user-data) | 3GPP
avcn | AVC NAL Unit Storage Box | DECE
bloc | Base location and purchase location for license acquisition | DECE
bpcc | Bits per component | JP2
buff | Buffering information | AVC
bxml | binary XML container | ISO
ccid | OMA DRM Content ID | OMA DRM 2.1
cdef | type and ordering of the components within the codestream | JP2
clsf | Media classification (user-data) | 3GPP
cmap | mapping between a palette and codestream components | JP2
co64 | 64-bit chunk offset | ISO
colr | specifies the colourspace of the image | JP2
cprt | copyright etc. (user-data) | ISO
crhd | reserved for ClockReferenceStream header | MP4V1
cslg | composition to decode timeline mapping | ISO
ctts | (composition) time to sample | ISO
cvru | OMA DRM Cover URI | OMA DRM 2.1
dcfD | Marlin DCF Duration, user-data atom type | OMArlin
dinf | data information box, container | ISO
dref | data reference box, declares source(s) of media data in track | ISO
dscp | Media description (user-data) | 3GPP
dsgd | DVB Sample Group Description Box | DVB
dstg | DVB Sample to Group Box | DVB
edts | edit list container | ISO
elst | an edit list | ISO
feci | FEC Information | ISO
fecr | FEC Reservoir | ISO
fiin | FD Item Information | ISO
fire | File Reservoir | ISO
fpar | File Partition | ISO
free | free space | ISO
frma | original format box | ISO
ftyp | file type and compatibility | JP2, ISO
gitn | Group ID to name | ISO
gnre | Media genre (user-data) | 3GPP
grpi | OMA DRM Group ID | OMA DRM 2.0
hdlr | handler, declares the media (handler) type | ISO
hmhd | hint media header, overall information (hint track only) | ISO
hpix | Hipix Rich Picture (user-data or meta-data) | HIPIX
icnu | OMA DRM Icon URI | OMA DRM 2.0
ID32 | ID3 version 2 container | inline
idat | Item data | ISO
ihdr | Image Header | JP2
iinf | item information | ISO
iloc | item location | ISO
imif | IPMP Information box | ISO
infu | OMA DRM Info URL | OMA DRM 2.0
iods | Object Descriptor container box | MP4V1
iphd | reserved for IPMP Stream header | MP4V1
ipmc | IPMP Control Box | ISO
ipro | item protection | ISO
iref | Item reference | ISO
jP$20$20 | JPEG 2000 Signature | JP2
jp2c | JPEG 2000 contiguous codestream | JP2
jp2h | Header | JP2
jp2i | intellectual property information | JP2
kywd | Media keywords (user-data) | 3GPP
loci | Media location information (user-data) | 3GPP
lrcu | OMA DRM Lyrics URI | OMA DRM 2.1
m7hd | reserved for MPEG7Stream header | MP4V1
mdat | media data container | ISO
mdhd | media header, overall information about the media | ISO
mdia | container for the media information in a track | ISO
mdri | Mutable DRM information | OMA DRM 2.0
meco | additional metadata container | ISO
mehd | movie extends header box | ISO
mere | metabox relation | ISO
meta | Metadata container | ISO
mfhd | movie fragment header | ISO
mfra | Movie fragment random access | ISO
mfro | Movie fragment random access offset | ISO
minf | media information container | ISO
mjhd | reserved for MPEG-J Stream header | MP4V1
moof | movie fragment | ISO
moov | container for all the meta-data | ISO
mvcg | Multiview group | AVC
mvci | Multiview Information | AVC
mvex | movie extends box | ISO
mvhd | movie header, overall declarations | ISO
mvra | Multiview Relation Attribute | AVC
nmhd | Null media header, overall information (some tracks only) | ISO
ochd | reserved for ObjectContentInfoStream header | MP4V1
odaf | OMA DRM Access Unit Format | OMA DRM 2.0
odda | OMA DRM Content Object | OMA DRM 2.0
odhd | reserved for ObjectDescriptorStream header | MP4V1
odhe | OMA DRM Discrete Media Headers | OMA DRM 2.0
odrb | OMA DRM Rights Object | OMA DRM 2.0
odrm | OMA DRM Container | OMA DRM 2.0
odtt | OMA DRM Transaction Tracking | OMA DRM 2.0
ohdr | OMA DRM Common headers | OMA DRM 2.0
padb | sample padding bits | ISO
paen | Partition Entry | ISO
pclr | palette which maps a single component in index space to a multiple-component image | JP2
pdin | Progressive download information | ISO
perf | Media performer name (user-data) | 3GPP
pitm | primary item reference | ISO
res$20 | grid resolution | JP2
resc | grid resolution at which the image was captured | JP2
resd | default grid resolution at which the image should be displayed | JP2
rtng | Media rating (user-data) | 3GPP
sbgp | Sample to Group box | AVC, ISO
schi | scheme information box | ISO
schm | scheme type box | ISO
sdep | Sample dependency | AVC
sdhd | reserved for SceneDescriptionStream header | MP4V1
sdtp | Independent and Disposable Samples Box | AVC, ISO
sdvp | SD Profile Box | SDV
segr | file delivery session group | ISO
senc | Sample specific encryption data | DECE
sgpd | Sample group definition box | AVC, ISO
sidx | Segment Index Box | 3GPP
sinf | protection scheme information box | ISO
skip | free space | ISO
smhd | sound media header, overall information (sound track only) | ISO
srmb | System Renewability Message | DVB
srmc | System Renewability Message container | DVB
srpp | STRP Process | ISO
stbl | sample table box, container for the time/space map | ISO
stco | chunk offset, partial data-offset information | ISO
stdp | sample degradation priority | ISO
sthd | Subtitle Media Header Box | DECE
stsc | sample-to-chunk, partial data-offset information | ISO
stsd | sample descriptions (codec types, initialization etc.) | ISO
stsh | shadow sync sample table | ISO
stss | sync sample table (random access points) | ISO
stsz | sample sizes (framing) | ISO
stts | (decoding) time-to-sample | ISO
styp | Segment Type Box | 3GPP
stz2 | compact sample sizes (framing) | ISO
subs | Sub-sample information | ISO
swtc | Multiview Group Relation | AVC
tfad | Track fragment adjustment box | 3GPP
tfhd | Track fragment header | ISO
tfma | Track fragment media adjustment box | 3GPP
tfra | Track fragment random access | ISO
tibr | Tier Bit rate | AVC
tiri | Tier Information | AVC
titl | Media title (user-data) | 3GPP
tkhd | Track header, overall information about the track | ISO
traf | Track fragment | ISO
trak | container for an individual track or stream | ISO
tref | track reference container | ISO
trex | track extends defaults | ISO
trgr | Track grouping information | ISO
trik | Facilitates random access and trick play modes | DECE
trun | track fragment run | ISO
tsel | Track selection (user-data) | 3GPP
udta | user-data | ISO
uinf | a tool by which a vendor may provide access to additional information associated with a UUID | JP2
UITS | Unique Identifier Technology Solution | Universal Music
ulst | a list of UUID’s | JP2
url$20 | a URL | JP2
uuid | user-extension box | ISO, JP2
vmhd | video media header, overall information (video track only) | ISO
vwdi | Multiview Scene Information | AVC
xml$20 | a tool by which vendors can add XML formatted information | JP2
xml$20 | XML container | ISO
yrrc | Year when media was recorded (user-data) | 3GPP

QuickTime Codes

Code | Abstract | Defined in/by
clip | Visual clipping region container | QT
crgn | Visual clipping region definition | QT
ctab | Track color-table | QT
elng | Extended Language Tag | QT
imap | Track input map definition | QT
kmat | Compressed visual track matte | QT
load | Track pre-load definitions | QT
matt | Visual track matte for compositing | QT
pnot | Preview container | QT
wide | Expansion space reservation | QT



1. File Type Box

Box Type: ‘ftyp’
This box normally appears at the very start of an MP4 file and serves as the identifying signature of the MP4 container format, in the same way that the 3-byte 'F' 'L' 'V' header identifies FLV, the leading bytes 1A 45 DF A3 identify MKV, and ASF_Header_Object identifies ASF.

The content of the ftyp box is structured as follows:

    aligned(8) class FileTypeBox
        extends Box('ftyp') {
        unsigned int(32) major_brand;
        unsigned int(32) minor_version;
        unsigned int(32) compatible_brands[]; // to end of the box
    }
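As a rough illustration of the layout above, the body of an ftyp box could be read like this in Python. The function name `parse_ftyp` and the sample brands are assumptions made for this sketch; the input is the box payload after the 8-byte box header.

```python
import struct

def parse_ftyp(payload):
    """Parse the body of an ftyp box (everything after the 8-byte header)."""
    major_brand, minor_version = struct.unpack_from(">4sI", payload, 0)
    # The rest of the box is a list of 4-character compatible brands
    brands = [payload[i:i + 4].decode("latin-1")
              for i in range(8, len(payload) - 3, 4)]
    return major_brand.decode("latin-1"), minor_version, brands

# Hand-built sample payload
sample = b"isom" + struct.pack(">I", 512) + b"isom" + b"iso2" + b"mp41"
print(parse_ftyp(sample))
```

Because compatible_brands has no count field, the list length is derived from the box size.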

2. Movie Box

Box Type: ‘moov’

The moov box contains many child boxes, as marked in the figure in the previous part. In most files moov comes right after ftyp. moov holds the metadata of the MP4 file, that is, the basic information about its audio and video. Let's look at the important boxes inside moov.

2.1 Movie Header Box

Box Type: ‘mvhd’
The structure of mvhd is as follows:

    aligned(8) class MovieHeaderBox extends FullBox('mvhd', version, 0) {
        if (version==1) {
            unsigned int(64) creation_time;
            unsigned int(64) modification_time;
            unsigned int(32) timescale;
            unsigned int(64) duration;
        } else { // version==0
            unsigned int(32) creation_time;
            unsigned int(32) modification_time;
            unsigned int(32) timescale;
            unsigned int(32) duration;
        }
        template int(32) rate = 0x00010000; // typically 1.0
        template int(16) volume = 0x0100; // typically, full volume
        const bit(16) reserved = 0;
        const unsigned int(32)[2] reserved = 0;
        template int(32)[9] matrix =
            { 0x00010000,0,0,0,0x00010000,0,0,0,0x40000000 };
            // Unity matrix
        bit(32)[6] pre_defined = 0;
        unsigned int(32) next_track_ID;
    }

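Field | Size (bytes) | Comment
box size | 4 | size of the box
box type | 4 | type of the box
version | 1 | box version, 0 or 1; usually 0
flags | 3 | flags
creation time | 4 (8 in version 1) | creation time, in seconds since 1904-01-01 00:00 UTC
modification time | 4 (8 in version 1) | modification time
time scale | 4 | number of time units in one second of media; for video, 90000 is a commonly used value
duration | 4 (8 in version 1) | duration in time scale units (for mvhd, that of the whole presentation); duration / time scale gives the length in seconds. For example, an audio track with time scale = 8000 and duration = 560128 lasts 70.016 s; a video track with time scale = 600 and duration = 42000 lasts 70 s
rate | 4 | recommended playback rate in [16.16] fixed-point format: the high 16 bits are the integer part and the low 16 bits the fractional part; 1.0 (0x00010000) means normal forward playback
volume | 2 | like rate but in [8.8] format; 1.0 (0x0100) means full volume
reserved | 10 | reserved
matrix | 36 | video transformation matrix
pre-defined | 24 |
next track id | 4 | ID to be used by the next track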
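The field table can be turned into a small reader. The following Python sketch assumes the input is the mvhd payload after the 8-byte box header; `parse_mvhd` and the hand-built sample are illustrative names and data, not part of any library.

```python
import struct

def parse_mvhd(payload):
    """Parse an mvhd body (bytes after the 8-byte box header)."""
    version = payload[0]  # followed by 3 bytes of flags
    if version == 1:
        _ctime, _mtime, timescale, duration = struct.unpack_from(">QQIQ", payload, 4)
        pos = 4 + 28
    else:
        _ctime, _mtime, timescale, duration = struct.unpack_from(">IIII", payload, 4)
        pos = 4 + 16
    rate, volume = struct.unpack_from(">ih", payload, pos)
    # Skip rate(4) + volume(2) + reserved(10) + matrix(36) + pre_defined(24)
    (next_track_id,) = struct.unpack_from(">I", payload, pos + 76)
    return {
        "timescale": timescale,
        "duration": duration,
        "seconds": duration / timescale,
        "rate": rate / 65536,    # [16.16] fixed point
        "volume": volume / 256,  # [8.8] fixed point
        "next_track_id": next_track_id,
    }

# Hand-built version-0 body using the table's video example (600 / 42000)
sample = (bytes(4) + struct.pack(">IIII", 0, 0, 600, 42000)
          + struct.pack(">ih", 0x00010000, 0x0100) + bytes(10)
          + struct.pack(">9i", 0x10000, 0, 0, 0, 0x10000, 0, 0, 0, 0x40000000)
          + bytes(24) + struct.pack(">I", 3))
info = parse_mvhd(sample)
print(info)
```

Dividing the raw integers by 65536 and 256 is what converts the [16.16] and [8.8] fields into ordinary numbers.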


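So by parsing this box we can obtain key information such as duration and rate. For example:

Parsing the example above gives time scale = 90000 and duration = 15051036 (0xE5A91C), so the playing time is 15051036 / 90000 ≈ 167 s.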
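The same arithmetic, written out as a short Python check. The variable names are chosen for this sketch only; the two input values are the ones quoted above.

```python
timescale = 90000
duration = 0xE5A91C  # 15051036 time units

seconds = duration / timescale
minutes, rem = divmod(round(seconds), 60)
print(f"{seconds:.3f} s is about {minutes} min {rem} s")
```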

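2.2 Track Box

Box Type: ‘trak’

The moov box contains one or more track boxes, and each track is relatively independent. A track box contains many other boxes, two of which are key: the Track Header Box and the Media Box. The figure below shows an ordinary MP4 file, in which the basic structure of the track box can be seen.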


2.2.1 Track Header Box
Box Type: ‘tkhd’

    aligned(8) class TrackHeaderBox
        extends FullBox('tkhd', version, flags){
        if (version==1) {
            unsigned int(64) creation_time;
            unsigned int(64) modification_time;
            unsigned int(32) track_ID;
            const unsigned int(32) reserved = 0;
            unsigned int(64) duration;
        } else { // version==0
            unsigned int(32) creation_time;
            unsigned int(32) modification_time;
            unsigned int(32) track_ID;
            const unsigned int(32) reserved = 0;
            unsigned int(32) duration;
        }
        const unsigned int(32)[2] reserved = 0;
        template int(16) layer = 0;
        template int(16) alternate_group = 0;
        template int(16) volume = {if track_is_audio 0x0100 else 0};
        const unsigned int(16) reserved = 0;
        template int(32)[9] matrix=
            { 0x00010000,0,0,0,0x00010000,0,0,0,0x40000000 };
            // unity matrix
        unsigned int(32) width;
        unsigned int(32) height;
    }

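Field | Size (bytes) | Comment
box size | 4 | size of the box
box type | 4 | type of the box
version | 1 | box version, 0 or 1; usually 0
flags | 3 | bitwise OR of the following predefined values: 0x000001 track_enabled (if not set, the track is not played); 0x000002 track_in_movie (the track is used in playback); 0x000004 track_in_preview (the track is used in the preview). The value is usually 7. If none of the tracks in a file set track_in_movie or track_in_preview, all tracks are treated as if both were set. For a hint track the value is 0
creation time | 4 (8 in version 1) | creation time
modification time | 4 (8 in version 1) | modification time
track id | 4 | track ID; must be unique and must not be 0
reserved | 4 | reserved
duration | 4 (8 in version 1) | duration of the track
reserved | 8 | reserved
layer | 2 | video layer; default 0; lower values are on top
alternate group | 2 | track grouping information; the default 0 means the track is not grouped with any other track
volume | 2 | [8.8] format; for an audio track 1.0 (0x0100) means full volume, otherwise 0
reserved | 2 | reserved
matrix | 36 | video transformation matrix
width | 4 | width
height | 4 | height; width and height are both [16.16] fixed-point values giving the presentation size used at playback, relative to the actual picture size in the sample description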
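Two of the fields above need a little decoding: flags is a bit mask, and width/height are [16.16] fixed-point numbers. A minimal Python sketch follows; the helper names and the sample values 0x02800000 / 0x01E00000 are assumptions for illustration.

```python
# Flag bits as listed in the table above
TKHD_FLAGS = {
    0x000001: "track_enabled",
    0x000002: "track_in_movie",
    0x000004: "track_in_preview",
}

def describe_tkhd_flags(flags):
    """Return the names of the flag bits that are set."""
    return [name for bit, name in TKHD_FLAGS.items() if flags & bit]

def fixed_16_16(value):
    """Convert a [16.16] fixed-point integer to a float."""
    return value / 65536

print(describe_tkhd_flags(7))
# Sample values only: 0x0280 = 640 and 0x01E0 = 480 in the integer halves
print(fixed_16_16(0x02800000), fixed_16_16(0x01E00000))
```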



2.2.2 Media Box
Box Type: ‘mdia’
The structure of the mdia box is quite complex. Here is an example.

2.2.2.1 Media Header Box
Box Type: ‘mdhd’

    aligned(8) class MediaHeaderBox extends FullBox('mdhd', version, 0) {
        if (version==1) {
            unsigned int(64) creation_time;
            unsigned int(64) modification_time;
            unsigned int(32) timescale;
            unsigned int(64) duration;
        } else { // version==0
            unsigned int(32) creation_time;
            unsigned int(32) modification_time;
            unsigned int(32) timescale;
            unsigned int(32) duration;
        }
        bit(1) pad = 0;
        unsigned int(5)[3] language; // ISO-639-2/T language code
        unsigned int(16) pre_defined = 0;
    }

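Field | Size (bytes) | Comment
box size | 4 | size of the box
box type | 4 | type of the box
version | 1 | box version, 0 or 1; usually 0
flags | 3 | flags (0)
creation time | 4 (8 in version 1) | creation time, in seconds since 1904-01-01 00:00 UTC
modification time | 4 (8 in version 1) | modification time
time scale | 4 | number of time units in one second of media; for video, 90000 is a commonly used value
duration | 4 (8 in version 1) | duration of the media in this track, in time scale units
language | 2 | media language code (ISO-639-2/T): one pad bit followed by three 5-bit values
pre-defined | 2 |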
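The packed language field is easy to get wrong, so here is a small Python sketch of how it could be unpacked. It relies on one detail that the article does not state and that comes from ISO/IEC 14496-12: each 5-bit value is the ASCII code of a lowercase letter minus 0x60. The function name `decode_language` is made up for this example.

```python
def decode_language(packed):
    """Unpack the 16-bit language field: 1 pad bit, then three 5-bit values."""
    values = [(packed >> shift) & 0x1F for shift in (10, 5, 0)]
    # Assumption from ISO/IEC 14496-12: each value is (ASCII code - 0x60)
    return "".join(chr(v + 0x60) for v in values)

print(decode_language(0x55C4))
print(decode_language(0x15C7))
```

0x55C4 unpacks to 21, 14, 4, which maps to "und" (undetermined), and 0x15C7 unpacks to 5, 14, 7, which maps to "eng".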


2.2.2.2 Handler Reference Box
Box Type: ‘hdlr’
The hdlr box tells us what type of track this is.

    aligned(8) class HandlerBox extends FullBox('hdlr', version = 0, 0) {
        unsigned int(32) pre_defined = 0;
        unsigned int(32) handler_type;
        const unsigned int(32)[3] reserved = 0;
        string name;
    }

Field | Size (bytes) | Comment
box size | 4 | size of the box
box type | 4 | type of the box
version | 1 | box version; 0 for this box
flags | 3 |
pre-defined | 4 |
handler_type | 4 | ‘vide’: video track; ‘soun’: audio track; ‘hint’: hint track
reserved | 12 | 0
name | string | string giving the track type name


Example:

00 00 00 21 68 64 6C 72 00 00 00 00 00 00 00 00 ; ...!hdlr........
76 69 64 65 00 00 00 00 00 00 00 00 00 00 00 00 ; vide............
00

In the example above the handler type is ‘vide’, so this is a video track.
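The 33 bytes of this example can be decoded field by field. The Python sketch below reads them in the order given by the HandlerBox definition; the variable names are chosen for this illustration.

```python
import struct

# The 33 bytes of the hdlr example, as hex
raw = bytes.fromhex(
    "00000021 68646C72 00000000 00000000"
    "76696465 00000000 00000000 00000000"
    "00"
)

size, box_type = struct.unpack_from(">I4s", raw, 0)
version_flags, pre_defined, handler_type = struct.unpack_from(">II4s", raw, 8)
# name starts after version/flags(4) + pre_defined(4) + handler_type(4) + reserved(12)
name = raw[8 + 24:size].split(b"\x00", 1)[0].decode("utf-8")
print(size, box_type, handler_type, repr(name))
```

The size field (0x21 = 33) matches 8 header bytes + 4 + 4 + 4 + 12 + a 1-byte empty name.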

2.2.2.3 Media Information Box

Box Type: ‘minf’

minf contains a series of boxes that describe the characteristics of the track.
minf usually contains a Media Information Header Box, a Data Information Box (dinf), and a Sample Table Box.
The Media Information Header Box comes in four types depending on the track: ‘vmhd’, ‘smhd’, ‘hmhd’, and ‘nmhd’.

2.2.2.3.1 Media Information Header Boxes

Box Types: ‘vmhd’, ‘smhd’, ‘hmhd’, ‘nmhd’

Video Media Header Box (vmhd)

    aligned(8) class VideoMediaHeaderBox
        extends FullBox('vmhd', version = 0, 1) {
        template unsigned int(16) graphicsmode = 0; // copy, see below
        template unsigned int(16)[3] opcolor = {0, 0, 0};
    }

Field | Size (bytes) | Comment
box size | 4 | size of the box
box type | 4 | type of the box
version | 1 | box version; 0 for this box
flags | 3 | flags (1 for this box)
graphicsmode | 2 | specifies a composition mode for this video track, from the following enumerated set, which may be extended by derived specifications: copy = 0, copy over the existing image
opcolor | 2*3 | a set of 3 colour values (red, green, blue) available for use by graphics modes


Example:

00 00 00 14 76 6D 68 64 00 00 00 01 00 00 00 00 ; ....vmhd........
00 00 00 00                                     ; ....
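For reference, the 20 bytes of this vmhd example could be unpacked like this in Python. This is a sketch with variable names chosen for illustration; it only splits the bytes into the fields listed in the table.

```python
import struct

# The 20 bytes of the vmhd example, as hex
raw = bytes.fromhex("00000014 766D6864 00000001 0000 0000 0000 0000")

size, box_type, version_flags = struct.unpack_from(">I4sI", raw, 0)
version, flags = version_flags >> 24, version_flags & 0xFFFFFF
graphicsmode, red, green, blue = struct.unpack_from(">4H", raw, 12)
print(size, box_type, version, flags, graphicsmode, (red, green, blue))
```

The result shows version 0 with flags 1, graphicsmode 0 (copy), and an all-zero opcolor.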



Sound Media Header Box (smhd)

    aligned(8) class SoundMediaHeaderBox
        extends FullBox('smhd', version = 0, 0) {
        template int(16) balance = 0;
        const unsigned int(16) reserved = 0;
    }

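Field | Size (bytes) | Comment
box size | 4 | size of the box
box type | 4 | type of the box
version | 1 | box version; 0 for this box
flags | 3 | flags (0)
balance | 2 | stereo balance as an [8.8] fixed-point value; usually 0 (centre); -1.0 means fully left and 1.0 means fully right
reserved | 2 | 0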


Example:

00 00 00 10 73 6D 68 64 00 00 00 00 00 00 00 00 ; ....smhd........
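The balance field is signed, which matters when converting it from [8.8] format. The Python sketch below decodes the smhd example and also shows a hypothetical fully-left value; `balance_from_bytes` is an illustrative helper, not an existing API.

```python
import struct

# The 16 bytes of the smhd example, as hex
raw = bytes.fromhex("00000010 736D6864 00000000 0000 0000")
size, box_type, version_flags, balance, reserved = struct.unpack(">I4sIhH", raw)
print(size, box_type, balance / 256)

def balance_from_bytes(two_bytes):
    """Decode a signed [8.8] fixed-point balance value."""
    (value,) = struct.unpack(">h", two_bytes)
    return value / 256

# Hypothetical values: 0xFF00 is -1.0 (full left), 0x0100 is 1.0 (full right)
print(balance_from_bytes(bytes.fromhex("FF00")), balance_from_bytes(bytes.fromhex("0100")))
```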


Hint Media Header Box (hmhd)

    aligned(8) class HintMediaHeaderBox
        extends FullBox('hmhd', version = 0, 0) {
        unsigned int(16) maxPDUsize;
        unsigned int(16) avgPDUsize;
        unsigned int(32) maxbitrate;
        unsigned int(32) avgbitrate;
        unsigned int(32) reserved = 0;
    }


2.2.2.3.2 Data Information Box

Box Type: ‘dinf’

dinf contains the Data Reference Box. The box types involved are ‘url ’, ‘urn ’, and ‘dref’.

    aligned(8) class DataEntryUrlBox (bit(24) flags)
        extends FullBox('url ', version = 0, flags) {
        string location;
    }
    aligned(8) class DataEntryUrnBox (bit(24) flags)
        extends FullBox('urn ', version = 0, flags) {
        string name;
        string location;
    }
    aligned(8) class DataReferenceBox
        extends FullBox('dref', version = 0, 0) {
        unsigned int(32) entry_count;
        for (i=1; i <= entry_count; i++) {
            DataEntryBox(entry_version, entry_flags) data_entry;
        }
    }
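Reading the entries of a dref box follows directly from the definition above. In this Python sketch, `parse_dref` and the sample bytes are assumptions for illustration; the input is the dref payload after the 8-byte box header. One detail not covered in the article, taken from ISO/IEC 14496-12: an entry with flags = 1 means the media data is in the same file as the box.

```python
import struct

def parse_dref(payload):
    """Parse a dref body; return a list of (entry_type, flags, data)."""
    (entry_count,) = struct.unpack_from(">I", payload, 4)  # skip version/flags
    entries, pos = [], 8
    for _ in range(entry_count):
        # Each entry is itself a full box: size, type, version/flags, data
        size, entry_type, version_flags = struct.unpack_from(">I4sI", payload, pos)
        data = payload[pos + 12:pos + size]
        entries.append((entry_type.decode("latin-1"), version_flags & 0xFFFFFF, data))
        pos += size
    return entries

# Hand-built sample: one 'url ' entry with flags = 1 and no location string
sample = bytes(4) + struct.pack(">I", 1) + struct.pack(">I4sI", 12, b"url ", 1)
print(parse_dref(sample))
```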
