Web Video: Parsing MP4 Files

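To learn how to package H.264 into an fMP4 file, I am starting today by studying how to parse H.264 data out of an fMP4 file.

An fMP4 file is laid out as ftyp + moov + (moof + mdat) * N.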

Box definition:
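As a minimal sketch of the box layout, assuming the usual ISO BMFF header (a 4-byte big-endian size followed by a 4-byte type, with size 1 meaning a 64-bit size follows and size 0 meaning "to the end of the container"), the boxes in a buffer can be walked like this. The helper name `iter_boxes` and the synthetic sample buffer are made up for illustration.

```python
import struct


def iter_boxes(buf, start=0, end=None):
    """Yield (type, payload_start, box_end) for each box in buf[start:end]."""
    end = len(buf) if end is None else end
    pos = start
    while pos + 8 <= end:
        size, box_type = struct.unpack_from(">I4s", buf, pos)
        header = 8
        if size == 1:
            # A 64-bit "largesize" follows the type field.
            if pos + 16 > end:
                break
            size = struct.unpack_from(">Q", buf, pos + 8)[0]
            header = 16
        elif size == 0:
            # The box runs to the end of the enclosing container.
            size = end - pos
        if size < header or pos + size > end:
            break  # malformed or truncated box: stop rather than guess
        yield box_type.decode("latin-1"), pos + header, pos + size
        pos += size


# Synthetic buffer: an empty "free" box, then an "mdat" box with 4 payload bytes.
sample = (
    struct.pack(">I4s", 8, b"free")
    + struct.pack(">I4s", 12, b"mdat")
    + b"\x01\x02\x03\x04"
)
boxes = list(iter_boxes(sample))
```

Container boxes such as moov, trak, or moof can be descended into by calling the same function again with `start` and `end` set to the payload range of the parent.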


ftyp format:

The byte stream:


size:00 00 00 1C  ===>28

type: 66 74 79 70 =====>ftyp

major_brand:6D 70 34 32 ===>mp42

minor_version :00 00 00 01 ===>1

compatible_brands: 6D 70 34 32 ===> mp42  ||  next 61 76 63 31 ===> avc1  ||  next 69 73 6F 35 ===> iso5
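The ftyp bytes listed above can be decoded with a few lines of Python. This is only a sketch: the function name `parse_ftyp` is hypothetical, and the input is the 28 bytes from the listing typed in by hand.

```python
import struct

# The 28 bytes from the listing above.
ftyp = bytes.fromhex(
    "0000001C 66747970"            # size = 28, type = "ftyp"
    "6D703432 00000001"            # major_brand = "mp42", minor_version = 1
    "6D703432 61766331 69736F35"   # compatible_brands
)


def parse_ftyp(box):
    """Return (size, major_brand, minor_version, compatible_brands)."""
    size, box_type, major, minor = struct.unpack_from(">I4s4sI", box, 0)
    if box_type != b"ftyp":
        raise ValueError("not an ftyp box")
    # Everything after the 16 fixed bytes is a list of 4-byte brand codes.
    brands = [box[i:i + 4].decode("ascii") for i in range(16, size, 4)]
    return size, major.decode("ascii"), minor, brands


result = parse_ftyp(ftyp)
```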


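**************************************** The moov boxes start here ***********************************

moov format:

The moov format is quite complex and carries a large amount of media information.

Let's look at the first level of the hierarchy first:


The moov box itself starts with just the 8-byte header: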


size:00 00 07 57  ===>1879

type: 6D 6F 6F 76 =====>moov

mvhd, which follows immediately, is defined as:



size:00 00 00 6C  ===>108

type: 6D 76 68 64 =====>mvhd

version:00 00 00 00 ===>0

creation_time: 00 00 00 00

modification_time:00 00 00 00

timescale:00 00 03 E8  ===>1000

duration:00 00 EA BF ===>60095

rate:00 01 00 00

volume:01 00

reserved16:00 00

reserved32:00 00 00 00      00 00 00 00

matrix:00 01 00 00    00 00 00 00    00 00 00 00    00 00 00 00   00 01 00 00  00 00 00 00   00 00 00 00   00 00 00 00  40 00 00 00

pre_defined: 00 00 00 00    00 00 00 00   00 00 00 00   00 00 00 00   00 00 00 00   00 00 00 00

next_track_ID: FF FF FF FF
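Several mvhd fields are fixed-point or tick counts rather than plain integers. As a rough sketch (the helper name `mvhd_summary` is made up), the values from the listing convert as follows, assuming the usual interpretation: duration is in timescale ticks, rate is 16.16 fixed point, volume is 8.8 fixed point, and the last matrix entry is 2.30 fixed point.

```python
def mvhd_summary(timescale, duration, rate, volume, matrix_w):
    """Convert raw mvhd integers into human-readable values."""
    return {
        "duration_seconds": duration / timescale,  # ticks / ticks-per-second
        "rate": rate / (1 << 16),                  # 16.16 fixed point
        "volume": volume / (1 << 8),               # 8.8 fixed point
        "matrix_w": matrix_w / (1 << 30),          # 2.30 fixed point
    }


# Raw values taken from the mvhd listing above.
summary = mvhd_summary(
    timescale=0x000003E8,
    duration=0x0000EABF,
    rate=0x00010000,
    volume=0x0100,
    matrix_w=0x40000000,
)
```

With these inputs the movie is about 60.095 seconds long, played at normal rate and full volume.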

Next comes the trak box:


size:00 00 01 A4  ===>420

type: 74  72 61 6B =====>trak


tkhd definition:


The byte stream:


size:00 00 00 5C  ===>92

type: 74  6B 68 64 =====>tkhd

version:00  ===>0

flags:00 00 07 ===>7 (track enabled, in movie, in preview)

creation_time:00 00 00 00 =>0

modification_time: 00 00 00 00 =>0

track_ID: 00 00 00 01 ===>1

reserved0 : 00 00 00 00

duration : 00 00 EA BF ===>60095

reserved1 : 00 00 00 00      00 00 00 00

layer: 00 00

alternate_group: 00 00

volume : 00  01

reserved2 : 00 00

matrix: 00 01 00 00      00 00 00 00   00 00 00 00   00 00 00 00    00 01 00 00   00 00 00 00   00 00 00 00   00 00 00 00   04 00 00 00

width: 00 00 00 00

height: 00 00 00 00

The mdia box:


size:00 00 01 40  ===>320

type: 6D 64 69 61 =====>mdia


mdhd definition:

The byte stream:

size:00 00 00 20  ===>32

type: 6D 64 68 64 =====>mdhd

version: 00 00 00 00 ====>0

creation_time: 00 00 00 00

modification_time: 00 00 00 00

timescale: 00 00 56 22  ===>22050

duration: 00 00 00 00 ====>0

pad: 0

language: 00101    01110  00111  ===>eng

pre_defined: 0
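The language field packs three lowercase letters into 15 bits, each letter stored as its ASCII code minus 0x60. A small sketch of the decoding (the function names are hypothetical), using the bits from the listing, which together with the pad bit form the 16-bit value 0x15C7:

```python
def decode_language(packed):
    """Decode the 16-bit mdhd language field (1 pad bit + 3 x 5 bits)."""
    packed &= 0x7FFF  # drop the pad bit
    return "".join(
        chr(((packed >> shift) & 0x1F) + 0x60) for shift in (10, 5, 0)
    )


def encode_language(code):
    """Inverse of decode_language for a 3-letter lowercase code."""
    if len(code) != 3:
        raise ValueError("expected a 3-letter language code")
    value = 0
    for ch in code:
        value = (value << 5) | (ord(ch) - 0x60)
    return value


# 0 00101 01110 00111 -> 0x15C7
language = decode_language(0b0001010111000111)
```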

hdlr definition:

The byte stream:



size:00 00 00 35 ===>53

type: 68 64 6C 72 =====>hdlr

version: 00 00 00 00

pre_defined:00 00 00 00

handler_type:73 6F 75 6E ====>soun

reserved:0

name: Bento4 Sound Handler

minf definition:


size:00 00 00 E3 ===>227

type: 6D 69 6E 66 =====>minf

smhd definition:


The byte stream:

size:00 00 00 10 ===>16

type: 73 6D 68 64 =====>smhd

version: 0

balance:0

reserved:0

dinf definition:

size:00 00 00 24 ===>36

type: 64 69 6E 66 =====>dinf

dref definition:


The byte stream:


size:00 00 00 1C ===>28

type: 64 72 65 66 =====>dref

version:00 00 00 00

entry_count:00 00 00 01

url definition:

The byte stream:

size:00 00 00 0C ===>12

version + flags: 00 00 00 01 (flags = 1 means the media data is in the same file)

location: same file

stbl definition:


size:00 00 00 A7 ===>167

type: 73 74 62 6C====>stbl

stsd definition:

The byte stream:


size:00 00 00 5B ===>91

type: 73 74 73 64====>stsd

version:00 00 00 00

entry_count: 00 00 00 01

mp4a definition:


The byte stream:


size:00 00 00 4B ===>75

type: 6D 70 34 61====>mp4a

reserved[6] :00 00 00 00  00 00

data_reference_index: 00 01

reserved1: 00 00 00 00   00 00 00 00

channelcount: 00 02

samplesize: 16

pre_defined: 00 00

reserved2: 00 00

samplerate: 22050


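esds definition:

I could not find a definition for this box; if anyone knows where it is specified, please let me know. Straight to the byte stream:


Screenshot of the tool's parse result:

See https://stackoverflow.com/questions/3987850/mp4-atom-how-to-discriminate-the-audio-codec-is-it-aac-or-mp3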

the value of the 11th Byte

  • 0x40 - MPEG-4 Audio
  • 0x6B - MPEG-1 Audio (MPEG-1 Layers 1, 2, and 3)
  • 0x69 - MPEG-2 Backward Compatible Audio (MPEG-2 Layers 1, 2, and 3)
  • 0x67 - MPEG-2 AAC LC

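After searching around I found one explanation, although I still have not tracked down the original source. (Reposted from https://blog.csdn.net/evsqiezi/article/details/73920290)

The esds payload has three nested layers, each containing the next: MP4ESDescr (starts with 0x03, typically 7 bytes), MP4DecConfigDescr (starts with 0x04, typically 13 bytes), and MP4DecSpecificDescr.

Example esds box analysis:

Here is a chunk of esds data: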
00001e7: 0000 0027 6573 6473 0000 0000 0319 0000  ...'esds........
00001f7: 0004 1140 1500 01f8 0001 2728 0000 f3e8  ...@......'(....
0000207: 0502 1388 0601 02                        .......

The analysis:
0000 0027:   :esds box size: 39
6573 6473:   :esds box type: esds
00           :version is 0
  00 0000:   :flags are 0
03           :ES_DescrTag, see 14496-1 Table 1
  19         :length field: 25
     0000:   :ES_ID: 0
00           :00 (hex) =
             :0000 0000 (bits)
             :0              :streamDependenceFlag; if 1, a 16-bit dependsOn_ES_ID follows
             : 0             :URL_Flag; if 1, an 8-bit URLlength and the URLstring (URLlength bytes) follow
             :  0            :OCRstreamFlag; if 1, a 16-bit OCR_ES_Id follows
             :   0 0000      :streamPriority
  04         :DecoderConfigDescriptor tag
     11      :length field: 17
       40:   :objectTypeIndication, 14496-1 Table 8; 0x40 is Audio ISO/IEC 14496-3
15           :15 (hex) =
             :0001 0101 (bits)
             :0001 01        :streamType; 5 is AudioStream, 14496-1 Table 9
             :       0       :upStream
             :        1      :reserved
  00 01f8:   :bufferSizeDB: 504
0001 2728:   :maxBitrate: 75560 (the maximum bitrate can be read here)
0000 f3e8:   :avgBitrate: 62440 (the average bitrate can be read here)
05           :DecSpecificInfoTag
  02         :length field: 2
     1388    :AudioSpecificConfig, 14496-3 1.6
             :1388 (hex) =
             :0001 0011 1000 1000 (bits)
             :0001 0                     :audioObjectType: 2, followed by GASpecificConfig
             :      011 1                :samplingFrequencyIndex: 7 (22050 Hz)
             :           000 1           :channelConfiguration: 1
             :                000        :GASpecificConfig: frameLengthFlag, dependsOnCoreCoder, extensionFlag (all 0)
06           :SLConfigDescrTag
  01         :length field: 1
     02      :predefined: 0x02, reserved for use in MP4 files
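The walkthrough above can be reproduced in code. The sketch below parses the bytes that follow the esds version/flags in the example dump. It assumes the common case only: an ES descriptor containing a DecoderConfigDescriptor, and a 2-byte AudioSpecificConfig with no escape values (audioObjectType below 31 and a sampling frequency index found in the table). The function names are made up for illustration.

```python
import struct

SAMPLE_RATES = [96000, 88200, 64000, 48000, 44100, 32000, 24000,
                22050, 16000, 12000, 11025, 8000, 7350]


def read_descriptor(buf, pos):
    """Return (tag, payload_start, payload_end) for the descriptor at pos."""
    tag = buf[pos]
    pos += 1
    length = 0
    for _ in range(4):  # length is 1 to 4 bytes, 7 bits each, MSB = "more"
        b = buf[pos]
        pos += 1
        length = (length << 7) | (b & 0x7F)
        if not b & 0x80:
            break
    return tag, pos, pos + length


def parse_esds_payload(buf):
    """Parse the bytes that follow the esds version/flags field."""
    tag, pos, _ = read_descriptor(buf, 0)
    if tag != 0x03:
        raise ValueError("expected ES_DescrTag (0x03)")
    _es_id, flags = struct.unpack_from(">HB", buf, pos)
    pos += 3
    if flags & 0x80:
        pos += 2             # dependsOn_ES_ID
    if flags & 0x40:
        pos += 1 + buf[pos]  # URLlength + URLstring
    if flags & 0x20:
        pos += 2             # OCR_ES_Id
    tag, pos, dc_end = read_descriptor(buf, pos)
    if tag != 0x04:
        raise ValueError("expected DecoderConfigDescriptor tag (0x04)")
    info = {
        "object_type_indication": buf[pos],
        "stream_type": buf[pos + 1] >> 2,
        "buffer_size_db": int.from_bytes(buf[pos + 2:pos + 5], "big"),
    }
    info["max_bitrate"], info["avg_bitrate"] = struct.unpack_from(">II", buf, pos + 5)
    pos += 13
    if pos < dc_end:
        tag, pos, ds_end = read_descriptor(buf, pos)
        if tag == 0x05 and ds_end - pos >= 2:
            bits = int.from_bytes(buf[pos:pos + 2], "big")
            index = (bits >> 7) & 0x0F
            info["audio_object_type"] = bits >> 11
            info["sample_rate"] = SAMPLE_RATES[index] if index < len(SAMPLE_RATES) else None
            info["channels"] = (bits >> 3) & 0x0F
    return info


# Bytes from the example dump, starting right after "0000 0000" (version/flags).
payload = bytes.fromhex(
    "03190000 00041140 150001f8 00012728 0000f3e8 05021388 060102"
)
esds_info = parse_esds_payload(payload)
```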


stsz definition:


The byte stream:


size:00 00 00 14 ===>20

type: 73 74 73 7A====>stsz

version: 00

flags:00 00 00

sample_size:00 00 00 00

sample_count: 00 00 00 00


stsc definition:

The byte stream:


size:00 00 00 10 ===>16

type: 73 74 73 63====>stsc

version: 00

flags:00 00 00

entry_count: 00 00 00 00


stts definition:


The byte stream:

size:00 00 00 10 ===>16

type: 73 74 74 73====>stts

version: 00

flags:00 00 00

entry_count: 00 00 00 00

stco definition:

The byte stream:


size:00 00 00 10 ===>16

type: 73 74 63 6F====>stco

version: 00

flags:00 00 00

entry_count: 00 00 00 00


vmhd definition (this box and the avc1 entry below belong to the video track):

The byte stream:

size:00 00 00 14 ===>20

type: 76 6D 68 64====>vmhd

version: 00

flags:00 00 00

graphicsmode: 00 00

opcolor: 00 00   00 00   00 00


avc1 definition:


The byte stream:

size:00 00 00 82===>130

type: 61 76 63 31====>avc1

reserved: 00 00 00 00 00 00

data_reference_index : 00 01

pre_defined0:  00 00

reserved1: 00 00

pre_defined1: 00 00 00 00    00 00 00 00    00 00 00 00

width: 02 80 ====>640

height: 01 68 ====>360

horizresolution: 00 48 00 00

vertresolution: 00 48 00 00

reserved2: 00 00 00 00

frame_count : 00 01

compressorname: 32 bytes, all 00

depth:  00 18

pre_defined2: FF FF


I could not find a specification for avcC.

The following reference helps (blog: https://blog.csdn.net/stn_lcd/article/details/74520031):

bits
8   version (always 0x01)
8   AVC profile (sps[0][1])
8   AVC compatibility (sps[0][2])
8   AVC level (sps[0][3])
6   reserved (all bits on)
2   NALULengthSizeMinusOne    // length-prefix size minus 1; a value of 3 means a 4-byte prefix, since 4 - 1 = 3
3   reserved (all bits on)
5   number of SPS NALUs (usually 1)
repeated once per SPS:
  16         SPS size
  variable   SPS NALU data
8   number of PPS NALUs (usually 1)
repeated once per PPS:
  16         PPS size
  variable   PPS NALU data


version: 01

sps[0][1]:0x42

sps[0][2]:0xE0

sps[0][3]:0x1E

reserved: 111111

NALULengthSizeMinusOne: 11 (binary) ===> 3, i.e. a 4-byte length prefix

reserved: 000

number of SPS NALUs: 00001

sps size: 0x00 0x15 ===>21

sps: 0x27 0x42 0xE0 0x1E 0xA9 0x18 0x14 0x05 0xFF 0x2E 0x00 0xD4 0x18 0x04 0x1A 0xDB 0x0A 0xD7 0xBD 0xF0 0x10

number of PPS NALUs: 01

pps size: 00  04 ==>4

pps: 28 DE  09 C0
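Following the layout table, the avcC payload can be parsed roughly as below. This is a sketch rather than a complete parser: `parse_avcc` is a hypothetical name, the record is rebuilt by hand from the values listed above, and the reserved bits are simply masked off (the example sets the three reserved bits before the SPS count as the table describes, but the mask makes their value irrelevant).

```python
import struct


def parse_avcc(record):
    """Parse an avcC payload (the bytes after the 8-byte box header)."""
    version, profile, compat, level = record[0:4]
    nal_length_size = (record[4] & 0x03) + 1  # NALULengthSizeMinusOne + 1
    num_sps = record[5] & 0x1F                # low 5 bits, reserved bits masked
    pos = 6
    sps_list = []
    for _ in range(num_sps):
        (size,) = struct.unpack_from(">H", record, pos)
        sps_list.append(record[pos + 2:pos + 2 + size])
        pos += 2 + size
    num_pps = record[pos]
    pos += 1
    pps_list = []
    for _ in range(num_pps):
        (size,) = struct.unpack_from(">H", record, pos)
        pps_list.append(record[pos + 2:pos + 2 + size])
        pos += 2 + size
    return {
        "version": version,
        "profile": profile,
        "compatibility": compat,
        "level": level,
        "nal_length_size": nal_length_size,
        "sps": sps_list,
        "pps": pps_list,
    }


sps = bytes.fromhex("2742E01EA9181405FF2E00D418041ADB0AD7BDF010")
pps = bytes.fromhex("28DE09C0")
record = (
    bytes([0x01, 0x42, 0xE0, 0x1E, 0xFF, 0xE1])
    + struct.pack(">H", len(sps)) + sps
    + bytes([0x01])
    + struct.pack(">H", len(pps)) + pps
)
avcc = parse_avcc(record)
```

The `nal_length_size` value is what a demuxer needs later to split samples in mdat into individual NAL units.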



**************************************** The moov boxes end here ***********************************

******************************* moof starts here ***********************************************

moof definition:

The byte stream:



size:00 00 02 60===>608

type: 6D 6F 6F 66====>moof

mfhd definition:

The byte stream:

size:00 00 00 10===>16

type: 6D 66 68 64====>mfhd

version: 00

flags:00 00 00

sequence_number: 00 00 00 01 ====>1


traf definition:

The byte stream:

size:00 00 02 08===>520

type: 74 72 61 66====>traf


tfhd definition:


The byte stream:


size:00 00 00 10===>16

type: 74 66 68 64====>tfhd

version: 00

flags:00 00 00

track_id: 00 00 00 01 ===>1


tfdt definition:


The byte stream:

size:00 00 00 14===>20

type: 74 66 64 74====>tfdt

version: 01

flags:00 00 00

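baseMediaDecodeTime: 00 00 00 00    00 00 00 00

This time is especially important: the browser schedules decoding based on it. I also worked out that if baseMediaDecodeTime is 4 bytes, the maximum playable duration is about 13 hours, so I later switched to 8 bytes. Note that version must then be changed to 01.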
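The 13-hour figure can be checked with a little arithmetic. The sketch below assumes a 90 kHz track timescale, which is not stated above but gives roughly that limit for a 32-bit counter; it also shows one way to write a version 1 tfdt box with the 64-bit field. Both function names are made up for illustration.

```python
import struct


def max_hours_32bit(timescale):
    """Hours until a 32-bit baseMediaDecodeTime wraps at the given timescale."""
    return (2 ** 32) / timescale / 3600


def build_tfdt_v1(base_media_decode_time):
    """Build a 20-byte version 1 tfdt box with a 64-bit baseMediaDecodeTime."""
    return struct.pack(
        ">I4sB3xQ",
        20,        # box size
        b"tfdt",   # box type
        1,         # version 1 selects the 64-bit field
        base_media_decode_time,  # flags (3 zero bytes) are written by "3x"
    )


hours = max_hours_32bit(90000)  # assumed 90 kHz timescale
tfdt = build_tfdt_v1(0)
```

With a smaller timescale the 32-bit limit is much further away (at 1000 ticks per second it is close to 50 days), so whether the 64-bit form is needed depends on the timescale chosen for the track.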


trun definition:


The byte stream:

size:00 00 02 18===>536

type: 74 72 75 6E====>trun

version: 00

flags:00 03 05

sample_count:00 00 00 40 ====>64

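data_offset: 00 00 02 68 ====>616, i.e. the total moof size + 8 (skipping the 8-byte mdat header)

first_sample_flags:02 00 00 00===>33554432

sample_duration: 00 00 00 19 ===>25

sample_size: 00 00 79 8B ===> 31115

sample_flags: (not present with flags 00 03 05)

sample_composition_time_offset: (not present with flags 00 03 05)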
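Which optional fields a trun box carries is determined entirely by its flags. As a sketch (the helper names are hypothetical), the flag bits can be decoded and the expected box size computed, which for flags 00 03 05 and 64 samples matches the 536 bytes seen above:

```python
TRUN_FLAGS = {
    0x000001: "data_offset",
    0x000004: "first_sample_flags",
    0x000100: "sample_duration",
    0x000200: "sample_size",
    0x000400: "sample_flags",
    0x000800: "sample_composition_time_offset",
}


def trun_fields(flags):
    """List the optional fields that are present for the given flags."""
    return [name for bit, name in TRUN_FLAGS.items() if flags & bit]


def trun_size(flags, sample_count):
    """Expected trun box size in bytes."""
    size = 8 + 4 + 4  # box header, version/flags, sample_count
    size += 4 * bool(flags & 0x000001)  # data_offset
    size += 4 * bool(flags & 0x000004)  # first_sample_flags
    per_sample = 4 * sum(
        bool(flags & bit) for bit in (0x000100, 0x000200, 0x000400, 0x000800)
    )
    return size + per_sample * sample_count


fields = trun_fields(0x000305)
size = trun_size(0x000305, 64)
```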


******************************* mdat starts here ***********************************************

mdat is fairly simple.
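For an avc1 track, each sample inside mdat is normally a sequence of NAL units, each preceded by a length field whose width comes from NALULengthSizeMinusOne in avcC (4 bytes in this file). A minimal sketch of turning one such sample into an Annex B byte stream with start codes follows; the function name is made up, and the sample here is synthetic, not taken from the file.

```python
def length_prefixed_to_annexb(sample, nal_length_size=4):
    """Replace each NAL length prefix with a 00 00 00 01 start code."""
    out = bytearray()
    pos = 0
    while pos + nal_length_size <= len(sample):
        nal_len = int.from_bytes(sample[pos:pos + nal_length_size], "big")
        pos += nal_length_size
        if pos + nal_len > len(sample):
            raise ValueError("truncated NAL unit")
        out += b"\x00\x00\x00\x01" + sample[pos:pos + nal_len]
        pos += nal_len
    return bytes(out)


# Synthetic sample: the 4-byte PPS from avcC followed by a made-up 2-byte NAL.
sample = bytes.fromhex("00000004 28DE09C0 00000002 6588")
annexb = length_prefixed_to_annexb(sample)
```

A decoder that expects Annex B input also needs the SPS and PPS from avcC, each with its own start code, before the first frame.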
