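Lab requirements: understand the overall program framework; grasp the design ideas behind perceptual audio coding; understand psychoacoustics; understand bit-rate allocation.
Basic framework of the program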
/************************************************************************
*
* main
*
* PURPOSE: MPEG II Encoder with
* psychoacoustic models 1 (MUSICAM) and 2 (AT&T)
*
* SEMANTICS: One overlapping frame of audio of up to 2 channels are
* processed at a time in the following order:
* (associated routines are in parentheses)
*
* 1. Filter sliding window of data to get 32 subband
* samples per channel.
* (window_subband,filter_subband)
*
* 2. If joint stereo mode, combine left and right channels
* for subbands above #jsbound#.
* (combine_LR)
*
* 3. Calculate scalefactors for the frame, and
* also calculate scalefactor select information.
* (*_scale_factor_calc)
*
* 4. Calculate psychoacoustic masking levels using selected
* psychoacoustic model.
* (psycho_i, psycho_ii)
*
* 5. Perform iterative bit allocation for subbands with low
* mask_to_noise ratios using masking levels from step 4.
* (*_main_bit_allocation)
*
* 6. If error protection flag is active, add redundancy for
* error protection.
* (*_CRC_calc)
*
* 7. Pack bit allocation, scalefactors, and scalefactor select
* information onto bitstream.
* (*_encode_bit_alloc,*_encode_scale,transmission_pattern)
*
* 8. Quantize subbands and pack them into bitstream
* (*_subband_quantization, *_sample_encoding)
*
************************************************************************/
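1. First path: pass the signal through the filter bank to obtain 32 subbands.
2. Calculate the scalefactors for the frame, and the scalefactor select information.
3. Choose which psychoacoustic model to use (model 1 or model 2) and calculate the masking levels.
4. Masking and bit allocation: allocate bits iteratively using the masking levels.
5. Quantize the subband samples and encode them into the bitstream.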
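Design ideas behind audio coding
The encoder block diagram has two paths. The upper path (the core) uses a filter bank to split the signal into multiple subbands; the lower path (the harder part) handles the psychoacoustic model, the scalefactors, and so on.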
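The time-frequency analysis conflict: the equal-bandwidth filter bank does not match the critical bands of the human auditory system. In the low-frequency region a single subband covers several critical bands, so one quantizer bit count cannot suit every critical band inside that subband.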
Psychoacoustic model: computes the part of the signal that the ear cannot perceive.
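How the model's output feeds bit allocation can be sketched as a greedy loop: the subband with the lowest mask-to-noise ratio (MNR = SNR - SMR) receives the next bit until the budget is spent. This is a simplified illustration, not the encoder's actual routine. It assumes about 6.02 dB of SNR per bit instead of the encoder's SNR table, and it counts the budget in single-bit steps rather than in bits per frame; all names except `SBLIMIT` are hypothetical.

```c
#define SBLIMIT 32  /* number of subbands */

/* Assumption: each additional bit adds about 6.02 dB of SNR. */
static double approx_snr_db(int bits)
{
    return 6.02 * bits;
}

/* Give one bit at a time to the subband with the lowest MNR.
 * smr_db:   signal-to-mask ratio per subband, from the psychoacoustic model
 * bits:     output, number of bits assigned to each subband
 * budget:   total number of single-bit steps available
 * max_bits: upper limit of bits per subband */
static void greedy_allocate(const double smr_db[SBLIMIT], int bits[SBLIMIT],
                            int budget, int max_bits)
{
    for (int sb = 0; sb < SBLIMIT; sb++)
        bits[sb] = 0;

    while (budget > 0) {
        int worst = -1;
        double worst_mnr = 0.0;
        for (int sb = 0; sb < SBLIMIT; sb++) {
            if (bits[sb] >= max_bits)
                continue;  /* this subband cannot take more bits */
            double mnr = approx_snr_db(bits[sb]) - smr_db[sb];
            if (worst < 0 || mnr < worst_mnr) {
                worst = sb;
                worst_mnr = mnr;
            }
        }
        if (worst < 0)
            break;  /* every subband is at its limit */
        bits[worst]++;
        budget--;
    }
}
```

Subbands whose signal lies far below the masking threshold (a strongly negative SMR) start with a high MNR, so they receive bits only after the audible subbands have been served.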
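Critical band: when a pure tone is masked by continuous noise that is centered on the tone's frequency and has a certain bandwidth, and the power of the tone at the point where it is just audible equals the noise power within that band, this bandwidth is the critical bandwidth.
Approach to calculating the masking values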
void psycho_0(double SMR[2][SBLIMIT], int nch, unsigned int scalar[2][3][SBLIMIT], FLOAT sfreq) {
int ch, sb, gr;
int minscaleindex[2][SBLIMIT];