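Whether you push video or audio to an RTMP server, the data has to be packaged in FLV format and then sent by calling the librtmp interface functions. From the FLV file format it follows that, before sending any audio or video data packets to the RTMP server, we must first push an audio Tag [Audio Sequence Header] (hereafter the "audio sync packet") or a video Tag [AVC Sequence Header] (hereafter the "video sync packet"). Let's start with the audio Tag: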
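The packet starts with two bytes that describe the audio parameters, for example AF 00 for an AAC audio packet. In the first byte, the upper four bits (A) give the audio format, here AAC. Within the lower four bits (F), the first two bits give the sample rate, the next bit gives the sample size, and the last bit gives the channel type (mono or stereo). The second byte, 0x00, indicates that the payload is an Audio Sequence Header; a value of 0x01 indicates that the payload is audio data.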
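As a rough illustration of the bit layout just described, the following Python sketch unpacks the two leading bytes. The function name `parse_flv_audio_header` and the dictionary keys are invented for this example; they are not part of librtmp or of any FLV library.

```python
def parse_flv_audio_header(data: bytes) -> dict:
    """Decode the two leading bytes of an FLV audio tag body that carries AAC."""
    if len(data) < 2:
        raise ValueError("need at least 2 bytes")
    first, packet_type = data[0], data[1]
    return {
        "sound_format": first >> 4,         # upper 4 bits; 10 (0xA) means AAC
        "sound_rate": (first >> 2) & 0x03,  # 2-bit sample-rate code
        "sound_size": (first >> 1) & 0x01,  # 1-bit sample-size flag
        "sound_type": first & 0x01,         # 0 = mono, 1 = stereo
        "aac_packet_type": packet_type,     # 0 = sequence header, 1 = audio data
    }


info = parse_flv_audio_header(bytes([0xAF, 0x00]))
print(info)
```

For `AF 00` this yields format 10 (AAC), rate code 3, size flag 1, stereo, and packet type 0 (sequence header).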
These fields are defined in the audio tag section of the FLV format specification.
For AAC audio, these two bytes are followed by two more bytes of AAC-specific header, the AudioSpecificConfig, which specifies the AAC audio parameters in detail.
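The first 5 bits give the AAC object type (AAC LC and so on), the next 4 bits give the sampling frequency index, the next 4 bits give the output channel configuration, and the last three bits are three flag bits. For example, 0x13 0x90 (binary 00010 0111 0010 000) means ObjectProfile = 2 (AAC LC), SamplingFrequencyIndex = 7, ChannelConfiguration = 2 (two channels).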
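The 0x13 0x90 example can be checked with a few shifts and masks. This is a minimal sketch that assumes the simple two-byte layout described above (object type below 31 and a sampling frequency index other than 15); the function name and keys are hypothetical.

```python
def parse_audio_specific_config(data: bytes) -> dict:
    """Split a two-byte AudioSpecificConfig into its 5 + 4 + 4 + 3 bit fields."""
    if len(data) < 2:
        raise ValueError("need at least 2 bytes")
    value = (data[0] << 8) | data[1]  # treat the two bytes as one 16-bit number
    return {
        "object_type": value >> 11,                       # top 5 bits
        "sampling_frequency_index": (value >> 7) & 0x0F,  # next 4 bits
        "channel_configuration": (value >> 3) & 0x0F,     # next 4 bits
        "flags": value & 0x07,                            # last 3 flag bits
    }


config = parse_audio_specific_config(bytes([0x13, 0x90]))
print(config["object_type"])               # 2 (AAC LC)
print(config["sampling_frequency_index"])  # 7
print(config["channel_configuration"])     # 2
```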
The fields in detail:
Audio object type (5 bits): there are two cases. For MPEG-4, a larger set of types is supported:
- 1: AAC Main
- 2: AAC LC (Low Complexity)
- 3: AAC SSR (Scalable Sample Rate)
- 4: AAC LTP (Long Term Prediction)
- 5: SBR (Spectral Band Replication)
- 6: AAC Scalable
- 7: TwinVQ
- 8: CELP (Code Excited Linear Prediction)
- 9: HVXC (Harmonic Vector eXcitation Coding)
- 10: Reserved
- 11: Reserved
- 12: TTSI (Text-To-Speech Interface)
- 13: Main Synthesis
- 14: Wavetable Synthesis
- 15: General MIDI
- 16: Algorithmic Synthesis and Audio Effects
- 17: ER (Error Resilient) AAC LC
- 18: Reserved
- 19: ER AAC LTP
- 20: ER AAC Scalable
- 21: ER TwinVQ
- 22: ER BSAC (Bit-Sliced Arithmetic Coding)
- 23: ER AAC LD (Low Delay)
- 24: ER CELP
- 25: ER HVXC
- 26: ER HILN (Harmonic and Individual Lines plus Noise)
- 27: ER Parametric
- 28: SSC (SinuSoidal Coding)
- 29: PS (Parametric Stereo)
- 30: MPEG Surround
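For live streaming, video uses H.264 and audio uses AAC. A frame of audio compressed by FAAC needs some simple processing before it can be packaged and sent over RTMP to FMS, and the same applies when saving it to an FLV file: mainly, the AAC frame header has to be stripped and the relevant information extracted from it.
For comparison, 1024 bytes of G.711A data typically come to only a little over 300 bytes as AAC.
Frames compressed by FAAC can also be saved directly as an AAC file, which the player bundled with Windows 7 can play; this is convenient for testing.
The AAC frame header (the ADTS header) is usually 7 bytes, or 9 bytes if it includes a CRC checksum, and it carries the audio parameters.
Its structure is as follows: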
AAAAAAAA AAAABCCD EEFFFFGH HHIJKLMM MMMMMMMM MMMOOOOO OOOOOOPP (QQQQQQQQ QQQQQQQQ)
Header consists of 7 or 9 bytes (without or with CRC).
Letter | Length (bits) | Description |
---|---|---|
A | 12 | syncword 0xFFF, all bits must be 1 |
B | 1 | MPEG Version: 0 for MPEG-4, 1 for MPEG-2 |
C | 2 | Layer: always 0 |
D | 1 | protection absent, Warning, set to 1 if there is no CRC and 0 if there is CRC |
E | 2 | profile, the MPEG-4 Audio Object Type minus 1 |
F | 4 | MPEG-4 Sampling Frequency Index (15 is forbidden) |
G | 1 | private stream, set to 0 when encoding, ignore when decoding |
H | 3 | MPEG-4 Channel Configuration (in the case of 0, the channel configuration is sent via an inband PCE) |
I | 1 | originality, set to 0 when encoding, ignore when decoding |
J | 1 | home, set to 0 when encoding, ignore when decoding |
K | 1 | copyrighted stream, set to 0 when encoding, ignore when decoding |
L | 1 | copyright start, set to 0 when encoding, ignore when decoding |
M | 13 | frame length, this value must include 7 or 9 bytes of header length: FrameLength = (ProtectionAbsent == 1 ? 7 : 9) + size(AACFrame) |
O | 11 | Buffer fullness |
P | 2 | Number of AAC frames (RDBs) in ADTS frame minus 1, for maximum compatibility always use 1 AAC frame per ADTS frame |
Q | 16 | CRC if protection absent is 0 |
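The table above maps directly onto shifts and masks over the first seven bytes. The sketch below is one possible way to read the fields; `parse_adts_header` and its keys are hypothetical names, and the sample header bytes were constructed by hand for illustration rather than taken from a real stream.

```python
def parse_adts_header(frame: bytes) -> dict:
    """Extract the ADTS header fields, labelled with the letters from the table."""
    if len(frame) < 7:
        raise ValueError("an ADTS header is at least 7 bytes")
    if frame[0] != 0xFF or (frame[1] & 0xF0) != 0xF0:
        raise ValueError("syncword 0xFFF not found")  # A
    protection_absent = frame[1] & 0x01               # D
    return {
        "mpeg_version": (frame[1] >> 3) & 0x01,       # B
        "layer": (frame[1] >> 1) & 0x03,              # C
        "protection_absent": protection_absent,
        "profile": frame[2] >> 6,                     # E
        "sampling_frequency_index": (frame[2] >> 2) & 0x0F,                    # F
        "channel_configuration": ((frame[2] & 0x01) << 2) | (frame[3] >> 6),   # H
        "frame_length": ((frame[3] & 0x03) << 11) | (frame[4] << 3) | (frame[5] >> 5),  # M
        "buffer_fullness": ((frame[5] & 0x1F) << 6) | (frame[6] >> 2),         # O
        "raw_blocks_minus_1": frame[6] & 0x03,        # P
        "header_size": 7 if protection_absent else 9,
    }


# Hand-built header: no CRC, profile 1, frequency index 4, 2 channels, 371-byte frame.
header = parse_adts_header(bytes.fromhex("fff150802e7ffc"))
print(header["profile"], header["sampling_frequency_index"],
      header["channel_configuration"], header["frame_length"])
```

The payload size is then `frame_length - header_size`, which for this sample is 364 bytes.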
The most important fields are E, F and H.
E is the profile:
- 0: AAC Main
- 1: AAC LC (Low Complexity)
- 2: AAC SSR (Scalable Sample Rate)
- 3: AAC LTP (Long Term Prediction)
F is the sampling frequency:
- 0: 96000 Hz
- 1: 88200 Hz
- 2: 64000 Hz
- 3: 48000 Hz
- 4: 44100 Hz
- 5: 32000 Hz
- 6: 24000 Hz
- 7: 22050 Hz
- 8: 16000 Hz
- 9: 12000 Hz
- 10: 11025 Hz
- 11: 8000 Hz
- 12: 7350 Hz
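In code, the index list above is usually kept as a small lookup table. A minimal sketch, with hypothetical names:

```python
# Position in the tuple is the sampling frequency index from the list above.
SAMPLE_RATES = (96000, 88200, 64000, 48000, 44100, 32000, 24000,
                22050, 16000, 12000, 11025, 8000, 7350)


def sample_rate_from_index(index: int) -> int:
    """Return the sample rate in Hz for a sampling frequency index (0 to 12)."""
    if not 0 <= index < len(SAMPLE_RATES):
        raise ValueError(f"unsupported sampling frequency index: {index}")
    return SAMPLE_RATES[index]


def index_from_sample_rate(rate: int) -> int:
    """Return the index for a sample rate; raises ValueError if it is not listed."""
    return SAMPLE_RATES.index(rate)


print(sample_rate_from_index(7))      # 22050
print(index_from_sample_rate(44100))  # 4
```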
H is the channel configuration:
- 1: 1 channel: front-center
- 2: 2 channels: front-left, front-right
- 3: 3 channels: front-center, front-left, front-right
- 4: 4 channels: front-center, front-left, front-right, back-center
- 5: 5 channels: front-center, front-left, front-right, back-left, back-right
- 6: 6 channels: front-center, front-left, front-right, back-left, back-right, LFE-channel
- 7: 8 channels: front-center, front-left, front-right, side-left, side-right, back-left, back-right, LFE-channel
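Putting the pieces together, the E, F and H fields of the frame header are enough to build the two-byte AudioSpecificConfig, and stripping the header leaves the payload for the audio data packets. The following is only a sketch under stated assumptions: it reuses the `AF` byte from the example earlier in the article, assumes one AAC frame per ADTS frame, and produces just the tag body bytes (it does not build the full FLV tag or call librtmp). All function names are hypothetical, and the sample frame was constructed by hand.

```python
FLV_AAC_TAG_BYTE = 0xAF  # first byte from the article's AF 00 example


def adts_header_size(frame: bytes) -> int:
    """7 bytes when protection absent (D) is 1, otherwise 9 because a CRC follows."""
    return 7 if frame[1] & 0x01 else 9


def audio_specific_config_from_adts(frame: bytes) -> bytes:
    """Build the two-byte AudioSpecificConfig from the E, F and H header fields."""
    profile = frame[2] >> 6                                # E
    freq_index = (frame[2] >> 2) & 0x0F                    # F
    channels = ((frame[2] & 0x01) << 2) | (frame[3] >> 6)  # H
    object_type = profile + 1  # E is the MPEG-4 Audio Object Type minus 1
    value = (object_type << 11) | (freq_index << 7) | (channels << 3)
    return value.to_bytes(2, "big")


def sequence_header_body(frame: bytes) -> bytes:
    """Body of the audio sync packet: AF 00 followed by AudioSpecificConfig."""
    return bytes([FLV_AAC_TAG_BYTE, 0x00]) + audio_specific_config_from_adts(frame)


def raw_data_body(frame: bytes) -> bytes:
    """Body of an audio data packet: AF 01 followed by the frame without its header."""
    frame_length = ((frame[3] & 0x03) << 11) | (frame[4] << 3) | (frame[5] >> 5)  # M
    payload = frame[adts_header_size(frame):frame_length]
    return bytes([FLV_AAC_TAG_BYTE, 0x01]) + payload


# Hand-built frame: profile 1 (AAC LC), frequency index 7, 2 channels, 4 payload bytes.
adts_frame = bytes.fromhex("fff15c80017ffc") + b"\x01\x02\x03\x04"
seq = sequence_header_body(adts_frame)
raw = raw_data_body(adts_frame)
print(seq.hex())  # af001390
print(raw.hex())  # af0101020304
```

The sequence header body ends in `13 90`, the same AudioSpecificConfig value used as the example earlier in the article.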