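MPEG-4 Audio has a very important header called the Audio Specific Config. It carries key information about the audio encoder, such as the object type (codec profile), the sampling frequency, and the number of channels.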
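Example: the header for AAC LC, stereo, 48 kHz: 0001000110010000 ==> 0x11 0x90
Object Type = 2, written as 5 bits: 00010
frequency index = 3, written as 4 bits: 0011
channel configuration = 2, written as 4 bits: 0010
The trailing 000 fills the header out to 16 bits, i.e. two bytes (for AAC LC these 3 bits belong to the AOT Specific Config, and they are all zero here).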
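The bit packing above can be sketched in plain Java. This is a minimal illustration, not a full encoder: the class name AacLcConfig and the method build are hypothetical, and it only covers object types that fit in 5 bits and frequencies taken from the index table.

```java
final class AacLcConfig {
    /**
     * Packs the three header fields into two bytes:
     * 5 bits object type, 4 bits frequency index,
     * 4 bits channel configuration, then 3 zero bits.
     */
    static byte[] build(int objectType, int frequencyIndex, int channelConfig) {
        if (objectType < 1 || objectType > 30) {
            // 31 is the escape value; larger types need the 11-bit form.
            throw new IllegalArgumentException("object type does not fit in 5 bits: " + objectType);
        }
        if (frequencyIndex < 0 || frequencyIndex > 12) {
            // 13 and 14 are reserved, 15 means an explicit 24-bit frequency follows.
            throw new IllegalArgumentException("unsupported frequency index: " + frequencyIndex);
        }
        if (channelConfig < 0 || channelConfig > 15) {
            throw new IllegalArgumentException("channel configuration out of range: " + channelConfig);
        }
        // 16-bit layout: [15..11] object type, [10..7] frequency index,
        // [6..3] channel configuration, [2..0] zero.
        int bits = (objectType << 11) | (frequencyIndex << 7) | (channelConfig << 3);
        return new byte[] {(byte) (bits >> 8), (byte) bits};
    }
}
```

Calling AacLcConfig.build(2, 3, 2) yields the two bytes 0x11 0x90 from the example.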
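On Android, when initializing the audio decoder, you can pass this value to MediaFormat via setByteBuffer, using the key csd-0. Use ByteBuffer.wrap (or flip the buffer after put), so that the buffer position is 0 and the two bytes are still readable:
mediaFormat.setByteBuffer("csd-0", ByteBuffer.wrap(new byte[]{(byte) 0x11, (byte) 0x90}));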
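One detail worth checking when building the csd-0 buffer is the buffer position. After put, a ByteBuffer's position sits at the end of the written data, so a consumer that reads from position to limit sees zero bytes (as far as I know, this is how the Android framework reads csd buffers, but treat that as an assumption). The sketch below uses only java.nio; the class name CsdBuffers and its methods are hypothetical.

```java
import java.nio.ByteBuffer;

final class CsdBuffers {
    /** Wraps the array: position is 0, limit is the array length. */
    static ByteBuffer wrapped(byte[] csd) {
        return ByteBuffer.wrap(csd);
    }

    /** Allocates and copies, then flips so position goes back to 0. */
    static ByteBuffer allocatedAndFlipped(byte[] csd) {
        ByteBuffer buffer = ByteBuffer.allocate(csd.length);
        buffer.put(csd);   // position is now csd.length
        buffer.flip();     // limit = csd.length, position = 0
        return buffer;
    }
}
```

Both helpers return a buffer whose remaining() equals the config length, whereas ByteBuffer.allocate(2).put(bytes) on its own leaves remaining() at 0.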
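Example: the header for AAC ELD, stereo, 48 kHz, starts with these 19 bits: 1111100011100110010
Object Type = 39. 39 does not fit in 5 bits, so the if (object type == 31) branch is needed: the first 5 bits are 11111, i.e. 31.
The object type is then written as 6 bits + 32. Since the object type is 39, the 6 bits must hold 39 - 32 = 7, which in binary is 000111.
The object type therefore takes 11 bits in total: 11111000111
frequency index = 3, written as 4 bits: 0011
channel configuration = 2, written as 4 bits: 0010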
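The escape rule can be sketched as a small bit-string builder. This is only an illustration of the first three fields: the class name AudioSpecificConfigBits and the method headerBits are hypothetical, the explicit-frequency case (index 15) is left out, and a real AAC ELD config carries more AOT-specific bits after these.

```java
final class AudioSpecificConfigBits {
    /** Returns object type, frequency index and channel configuration as a bit string. */
    static String headerBits(int objectType, int frequencyIndex, int channelConfig) {
        if (objectType < 1 || objectType == 31 || objectType > 95) {
            // 31 is the escape value itself; 6 bits + 32 reaches at most 95.
            throw new IllegalArgumentException("invalid object type: " + objectType);
        }
        if (frequencyIndex < 0 || frequencyIndex > 12) {
            throw new IllegalArgumentException("unsupported frequency index: " + frequencyIndex);
        }
        if (channelConfig < 0 || channelConfig > 15) {
            throw new IllegalArgumentException("channel configuration out of range: " + channelConfig);
        }
        StringBuilder bits = new StringBuilder();
        if (objectType >= 32) {
            append(bits, 31, 5);               // escape: 11111
            append(bits, objectType - 32, 6);  // remainder in 6 bits
        } else {
            append(bits, objectType, 5);
        }
        append(bits, frequencyIndex, 4);
        append(bits, channelConfig, 4);
        return bits.toString();
    }

    /** Appends the lowest `width` bits of `value`, most significant bit first. */
    private static void append(StringBuilder bits, int value, int width) {
        for (int i = width - 1; i >= 0; i--) {
            bits.append((value >> i) & 1);
        }
    }
}
```

headerBits(39, 3, 2) gives 1111100011100110010, matching the AAC ELD example, and headerBits(2, 3, 2) gives the 13 bits 0001000110010 of the AAC LC example.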
Wiki: MPEG-4 Audio - MultimediaWiki
Audio Specific Config
The Audio Specific Config is the global header for MPEG-4 Audio:
5 bits: object type
if (object type == 31)
    6 bits + 32: object type
4 bits: frequency index
if (frequency index == 15)
    24 bits: frequency
4 bits: channel configuration
var bits: AOT Specific Config
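Reading the header back follows the same layout. The parser below is a minimal sketch under stated assumptions: the class name AudioSpecificConfigParser and its fields are hypothetical, it stops after the channel configuration, and it does not interpret the AOT Specific Config that follows.

```java
final class AudioSpecificConfigParser {
    private final byte[] data;
    private int bitPos = 0;

    int objectType;
    int frequencyIndex;
    int explicitFrequency = -1; // only set when frequencyIndex == 15
    int channelConfig;

    private AudioSpecificConfigParser(byte[] data) {
        this.data = data;
    }

    static AudioSpecificConfigParser parse(byte[] data) {
        AudioSpecificConfigParser p = new AudioSpecificConfigParser(data);
        p.objectType = p.read(5);
        if (p.objectType == 31) {
            p.objectType = 32 + p.read(6);   // escape: 6 bits + 32
        }
        p.frequencyIndex = p.read(4);
        if (p.frequencyIndex == 15) {
            p.explicitFrequency = p.read(24); // frequency written explicitly
        }
        p.channelConfig = p.read(4);
        return p;
    }

    /** Reads `count` bits, most significant bit first. */
    private int read(int count) {
        int value = 0;
        for (int i = 0; i < count; i++) {
            if (bitPos >= data.length * 8) {
                throw new IllegalArgumentException("config too short");
            }
            int bit = (data[bitPos >> 3] >> (7 - (bitPos & 7))) & 1;
            value = (value << 1) | bit;
            bitPos++;
        }
        return value;
    }
}
```

Parsing the bytes 0x11 0x90 gives object type 2, frequency index 3 and channel configuration 2.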
Audio Object Types
MPEG-4 Audio Object Types:
- 0: Null
- 1: AAC Main
- 2: AAC LC (Low Complexity)
- 3: AAC SSR (Scalable Sample Rate)
- 4: AAC LTP (Long Term Prediction)
- 5: SBR (Spectral Band Replication)
- 6: AAC Scalable
- 7: TwinVQ
- 8: CELP (Code Excited Linear Prediction)
- 9: HVXC (Harmonic Vector eXcitation Coding)
- 10: Reserved
- 11: Reserved
- 12: TTSI (Text-To-Speech Interface)
- 13: Main Synthesis
- 14: Wavetable Synthesis
- 15: General MIDI
- 16: Algorithmic Synthesis and Audio Effects
- 17: ER (Error Resilient) AAC LC
- 18: Reserved
- 19: ER AAC LTP
- 20: ER AAC Scalable
- 21: ER TwinVQ
- 22: ER BSAC (Bit-Sliced Arithmetic Coding)
- 23: ER AAC LD (Low Delay)
- 24: ER CELP
- 25: ER HVXC
- 26: ER HILN (Harmonic and Individual Lines plus Noise)
- 27: ER Parametric
- 28: SSC (SinuSoidal Coding)
- 29: PS (Parametric Stereo)
- 30: MPEG Surround
- 31: (Escape value)
- 32: Layer-1
- 33: Layer-2
- 34: Layer-3
- 35: DST (Direct Stream Transfer)
- 36: ALS (Audio Lossless)
- 37: SLS (Scalable LosslesS)
- 38: SLS non-core
- 39: ER AAC ELD (Enhanced Low Delay)
- 40: SMR (Symbolic Music Representation) Simple
- 41: SMR Main
- 42: USAC (Unified Speech and Audio Coding) (no SBR)
- 43: SAOC (Spatial Audio Object Coding)
- 44: LD MPEG Surround
- 45: USAC
Sampling Frequencies
There are 13 supported frequencies:
- 0: 96000 Hz
- 1: 88200 Hz
- 2: 64000 Hz
- 3: 48000 Hz
- 4: 44100 Hz
- 5: 32000 Hz
- 6: 24000 Hz
- 7: 22050 Hz
- 8: 16000 Hz
- 9: 12000 Hz
- 10: 11025 Hz
- 11: 8000 Hz
- 12: 7350 Hz
- 13: Reserved
- 14: Reserved
- 15: frequency is written explicitly
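The table above maps directly onto an array lookup. A minimal sketch, with the hypothetical class name SamplingFrequencies; returning 15 for a rate that is not in the table is a design choice of this sketch, meaning the caller would have to write the frequency explicitly.

```java
final class SamplingFrequencies {
    // Indexes 0..12 from the table; 13 and 14 are reserved, 15 means explicit.
    private static final int[] TABLE = {
        96000, 88200, 64000, 48000, 44100, 32000, 24000,
        22050, 16000, 12000, 11025, 8000, 7350
    };

    /** Returns the frequency in Hz for a table index (0..12). */
    static int frequencyForIndex(int index) {
        if (index < 0 || index >= TABLE.length) {
            throw new IllegalArgumentException("no table entry for index " + index);
        }
        return TABLE[index];
    }

    /** Returns the table index for a frequency, or 15 if it must be written explicitly. */
    static int indexForFrequency(int hz) {
        for (int i = 0; i < TABLE.length; i++) {
            if (TABLE[i] == hz) {
                return i;
            }
        }
        return 15;
    }
}
```

For example, frequencyForIndex(3) is 48000 and indexForFrequency(44100) is 4.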