I. Experimental Principles
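The auditory system has a hearing threshold level: a sound signal below this level cannot be heard, and the threshold varies with the frequency of the sound. Whether a person hears a sound therefore depends on its frequency and on whether its amplitude exceeds the hearing threshold at that frequency. A second property is auditory masking: the hearing threshold is adaptive and shifts according to the sounds of other frequencies that are being heard. The width of the masking threshold of a tonal masker varies with frequency, and the masking curve is asymmetric: its slope is shallower on the high-frequency side, so a low-frequency tone readily masks a high-frequency tone.
When a complex signal with several frequency components is present, the overall masking threshold of the spectrum as a function of frequency depends on the level and frequency of each masker and on the spacing between them.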
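The human auditory system is roughly equivalent to a filter bank of 25 overlapping band-pass filters covering 0 Hz to 20 kHz. The ear cannot distinguish different sounds that occur simultaneously within the same band; these bands are called critical bands. Below 500 Hz each critical band is about 100 Hz wide, and from 500 Hz upward the bandwidth increases approximately linearly.
The critical bandwidth is defined as follows: a pure tone is masked by continuous noise of a certain bandwidth centered on the tone's frequency; if the power of the tone when it is just audible equals the noise power within that band, that bandwidth is the critical bandwidth.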
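Within a certain range the masking effect does not change as the bandwidth increases; it changes only once the bandwidth exceeds a certain value.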
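(1) MPEG-1 psychoacoustic model: computes the parts of the signal that cannot be perceived by the ear
1) The subband analysis filter bank gives the signal high time resolution, so that the coded sound keeps sufficiently high quality even for short transient signals;
2) An FFT gives the signal high frequency resolution, which is needed because the masking threshold is derived from the power spectral density;
3) In the low-frequency subbands, preserving the structure of pitch and formants requires a smaller quantization step and more quantization levels, i.e. more bits per sample value. Fricatives and noise-like sounds in speech usually fall in the high-frequency subbands, which are given fewer bits;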
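(2) MPEG-1 polyphase filter bank model: transforms PCM samples into frequency-domain signals in 32 subbands
Drawbacks:
1) The equal-width filter bank does not correspond to the critical bands of the human auditory system;
2) At low frequencies a single subband covers several critical bands, so the number of quantization bits cannot be tuned to each critical band individually;
3) The filter bank and its inverse are not lossless, but the error the filter bank introduces is very small and inaudible;
4) Adjacent subbands overlap in frequency after filtering (aliasing), so a signal in one subband can affect the output of the neighboring subbands.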
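The mismatch between equal-width subbands and critical bands can be made concrete with a small calculation. The sketch below assumes only that the 32 subbands split the range from 0 to half the sampling rate evenly; the function names are hypothetical.

```c
#include <assert.h>

#define NUM_SUBBANDS 32

/* Width in Hz of each of the 32 equal subbands for sampling rate fs_hz. */
double subband_width_hz(double fs_hz)
{
    assert(fs_hz > 0.0);
    return fs_hz / 2.0 / NUM_SUBBANDS;
}

/* Index (0..31) of the subband that contains frequency f_hz. */
int subband_of(double f_hz, double fs_hz)
{
    assert(f_hz >= 0.0 && f_hz <= fs_hz / 2.0);
    int idx = (int)(f_hz / subband_width_hz(fs_hz));
    return idx < NUM_SUBBANDS ? idx : NUM_SUBBANDS - 1;   /* clamp the Nyquist edge */
}
```

At 48 kHz each subband is 750 Hz wide, so tones at 100 Hz and 700 Hz both land in subband 0 even though they are several 100 Hz critical bands apart.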
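(3) Bit allocation model: assigns a number of bits to each subband signal according to the results of the psychoacoustic model
The goal is to minimize the total noise-to-mask ratio over the whole frame and over each subband. This is an iterative process: each iteration raises by one step the quantization level of the subband that benefits most, and the total number of bits used must not exceed the maximum that one frame can provide.
Before adjusting to a fixed bit rate, the encoder first determines the number of bits available for coding sample values. This number depends on the bits required for the scale factors, the scale factor selection information, the bit allocation information, and the ancillary data. The allocation procedure computes for each subband the mask-to-noise ratio (MNR), which is the signal-to-noise ratio (SNR) minus the signal-to-mask ratio (SMR): MNR = SNR - SMR.
Algorithm 1: minimize the total noise-to-mask ratio of the whole frame and of each subband by computing the mask-to-noise ratio MNR = SNR - SMR (dB), where SNR is given by the MPEG-1 standard as a function of the quantization level. MNR expresses the gap between the waveform error and the perceptual measure, and it indicates how far the subband signal can be compressed.
Algorithm 2: loop until no bits remain. Compute MNR = SNR - SMR (dB), allocate bits to the subband with the lowest MNR (raising the quantization level of the subband that benefits most by one step), and recompute the MNR of the subband that received more bits.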
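The greedy loop of Algorithm 2 can be sketched in a few lines of C. This is a simplified illustration, not the encoder's implementation: the SNR model of about 6.02 dB per bit is a stand-in for the SNR table of the standard, every step is assumed to cost the same number of bits, and the bits spent on scale factors and side information are ignored. The names `snr_db` and `greedy_allocate` are hypothetical.

```c
#include <assert.h>

/* Assumed SNR model: roughly 6.02 dB per quantization bit.
 * The standard specifies a lookup table instead. */
static double snr_db(int bits)
{
    return 6.02 * bits;
}

/* Greedy allocation: repeatedly give one more bit to the subband whose
 * MNR = SNR - SMR is lowest, until the budget is exhausted.
 * Returns the number of bits left over. */
int greedy_allocate(const double *smr_db, int *bits, int n,
                    int budget, int cost_per_step, int max_bits)
{
    assert(n > 0 && cost_per_step > 0 && max_bits > 0);
    for (int i = 0; i < n; i++)
        bits[i] = 0;

    while (budget >= cost_per_step) {
        int worst = -1;
        double worst_mnr = 0.0;
        for (int i = 0; i < n; i++) {
            if (bits[i] >= max_bits)
                continue;                          /* subband already at its limit */
            double mnr = snr_db(bits[i]) - smr_db[i];
            if (worst < 0 || mnr < worst_mnr) {
                worst = i;
                worst_mnr = mnr;
            }
        }
        if (worst < 0)
            break;                                 /* every subband is saturated */
        bits[worst]++;                             /* MNR is recomputed next pass */
        budget -= cost_per_step;
    }
    return budget;
}
```

For SMR values of 20, 5, 12 and -3 dB and a budget of six one-bit steps, this sketch ends with 3, 1, 2 and 0 bits: the subbands whose signal rises furthest above the masking threshold receive the most bits.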
(4) Framing: produces an MPEG-1 compliant bitstream
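II. MPEG Audio Encoder Block Diagram
Encoder description: the input audio signal passes through a polyphase filter bank that splits it into multiple subbands. In parallel, the psychoacoustic model computes the noise masking threshold as a function of frequency. The quantization and coding stage uses the signal-to-mask ratio (SMR) to decide how many quantization bits to assign to each subband signal, so that the quantization noise stays below the masking threshold. Finally, frame packing assembles the quantized subband samples and the other data into a bitstream in the prescribed frame format.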
III. Main Experimental Code
adb = available_bits (&header, &glopts); /* number of bits available for this frame */
if (frameNum == 20)
{
    printf("bits available per frame = %d\n", adb);
}
lg_frame = adb / 8;
scale_factor_calc (*sb_sample, scalar, nch, frame.sblimit); /* compute the scale factors */
if (frameNum == 20)
{
    int a, b;
    for (a = 0; a < nch; a++)
    {
        printf("channel[%d] = \n", a + 1);
        for (b = 0; b < frame.sblimit; b++)
        {
            /* three scale factor indices per subband, one per group of 12 samples */
            printf("scalar[%d][0..2][%d] = %d %d %d\n", a, b, scalar[a][0][b], scalar[a][1][b], scalar[a][2][b]);
        }
    }
}
main_bit_allocation (smr, scfsi, bit_alloc, &adb, &frame, &glopts); /* bit allocation */
sample_encoding (*subband, bit_alloc, &frame, &bs);
if (frameNum == 20)
{
    int a, b;
    printf("sample rate=%.1f kHz\n", s_freq[header.version][header.sampling_frequency]);
    printf("target rate=%d\n", bitrate[header.version][header.bitrate_index]);
    /* print only the channels that are actually encoded */
    for (a = 0; a < nch; a++)
    {
        for (b = 0; b < frame.sblimit; b++)
        {
            printf("bit_alloc[%d][%d] =%u\n", a, b, bit_alloc[a][b]);
        }
    }
}
void main_bit_allocation (double perm_smr[2][SBLIMIT],
unsigned int scfsi[2][SBLIMIT],
unsigned int bit_alloc[2][SBLIMIT], int *adb,
frame_info * frame, options * glopts) /* bit allocation routine, called from the main function */
{
int noisy_sbs;
int mode, mode_ext, lay;
int rq_db; /* av_db = *adb; Not Used MFC Nov 99 */
int vbrlimits[2][3][2] = {
/* MONO */
{ /* 44 */ {6, 10},
/* 48 */ {3, 10},
/* 32 */ {6, 10}},
/* STEREO */
{ /* 44 */ {10, 14},
/* 48 */ {7, 14},
/* 32 */ {10, 14}}
};
static int init = 0;
static int lower = 10, upper = 10;
static int bitrateindextobits[15] =
{ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 };
int guessindex = 0;
if (init == 0) {
int nch = 1;
int sfreq;
frame_header *header = frame->header;
init++;
if (header->version == 0) {
/* LSF: so can use any bitrate index from 1->14 */
lower = 1;
upper = 14;
} else {
if (frame->actual_mode == MPG_MD_MONO)
nch = 0;
sfreq = header->sampling_frequency;
lower = vbrlimits[nch][sfreq][0];
upper = vbrlimits[nch][sfreq][1];
}
if (glopts->verbosity > 2)
fprintf (stdout, "VBR bitrate index limits [%i -> %i]\n", lower, upper);
{
int brindex;
frame_header *header = frame->header;
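for (brindex = lower; brindex <= upper; brindex++) {
/* bits per frame = 1152 samples / s_freq * bitrate; note that the (int) cast
binds only to the quotient, which is truncated before the multiplication */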
bitrateindextobits[brindex] =
(int) (1152.0 / s_freq[header->version][header->sampling_frequency]) *
((double) bitrate[header->version][brindex]);
}
}
}
if ((mode = frame->actual_mode) == MPG_MD_JOINT_STEREO) {
frame->header->mode = MPG_MD_STEREO;
frame->header->mode_ext = 0;
frame->jsbound = frame->sblimit;
if ((rq_db = bits_for_nonoise (perm_smr, scfsi, frame)) > *adb) {
frame->header->mode = MPG_MD_JOINT_STEREO;
mode_ext = 4; /* 3 is least severe reduction */
lay = frame->header->lay;
do {
--mode_ext;
frame->jsbound = js_bound (mode_ext);
rq_db = bits_for_nonoise (perm_smr, scfsi, frame);
}
while ((rq_db > *adb) && (mode_ext > 0));
frame->header->mode_ext = mode_ext;
} /* well we either eliminated noisy sbs or mode_ext == 0 */
}
/* decide on which bit allocation method to use */
if (glopts->vbr == FALSE) {
/* Just do the old bit allocation method */
noisy_sbs = a_bit_allocation (perm_smr, scfsi, bit_alloc, adb, frame);
} else {
/* do the VBR bit allocation method */
frame->header->bitrate_index = lower;
*adb = available_bits (frame->header, glopts);
{
int brindex;
int found = FALSE;
/* Work out how many bits are needed for there to be no noise (ie all MNR > 0.0 + VBRLEVEL) */
int req =
VBR_bits_for_nonoise (perm_smr, scfsi, frame, glopts->vbrlevel);
/* Look up this value in the bitrateindextobits table to find what bitrate we should use for
this frame */
for (brindex = lower; brindex <= upper; brindex++) {
if (bitrateindextobits[brindex] > req) {
/* this method always *overestimates* the bits that are needed
i.e. it will usually guess right but
when it's wrong it'll guess a higher bitrate than actually required.
e.g. on "messages from earth" track 6, the guess was
wrong on 75/36341 frames. each time it guessed higher.
MFC Feb 2003 */
guessindex = brindex;
found = TRUE;
break;
}
}
/* Just for sanity */
if (found == FALSE)
guessindex = upper;
}
frame->header->bitrate_index = guessindex;
*adb = available_bits (frame->header, glopts);
/* update the statistics */
vbrstats[frame->header->bitrate_index]++;
if (glopts->verbosity > 2) {
/* print out the VBR stats every 1000th frame */
static int count = 0;
int i;
if ((count++ % 1000) == 0) {
for (i = 1; i < 15; i++)
fprintf (stdout, "%4i ", vbrstats[i]);
fprintf (stdout, "\n");
}
/* Print out *every* frames bitrateindex, bits required, and bits available at this bitrate */
if (glopts->verbosity > 5)
fprintf (stdout,
"> bitrate index %2i has %i bits available to encode the %i bits\n",
frame->header->bitrate_index, *adb,
VBR_bits_for_nonoise (perm_smr, scfsi, frame,
glopts->vbrlevel));
}
noisy_sbs =
VBR_bit_allocation (perm_smr, scfsi, bit_alloc, adb, frame, glopts);
}
}
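The table `bitrateindextobits` filled in by the listing holds the number of bits in one frame at each bit rate index. The sketch below reproduces that arithmetic in isolation, assuming the sampling rate is expressed in kHz and the bit rate in kbit/s (as the printed "kHz" label suggests). The function names are hypothetical, and the second function mirrors the truncating cast of the listing's expression only for comparison.

```c
#include <assert.h>

#define SAMPLES_PER_FRAME 1152   /* Layer II: 32 subbands x 36 samples */

/* Exact number of bits in one frame.
 * Assumption: fs_khz is in kHz and bitrate_kbps is in kbit/s. */
double bits_per_frame(double fs_khz, int bitrate_kbps)
{
    assert(fs_khz > 0.0 && bitrate_kbps >= 0);
    return SAMPLES_PER_FRAME / fs_khz * bitrate_kbps;
}

/* Same quantity with the frame duration truncated to whole milliseconds
 * first, which is what the expression in the listing computes. */
int bits_per_frame_truncated(double fs_khz, int bitrate_kbps)
{
    assert(fs_khz > 0.0 && bitrate_kbps >= 0);
    return (int)((int)(SAMPLES_PER_FRAME / fs_khz) * (double)bitrate_kbps);
}
```

At 48 kHz and 192 kbit/s both versions give 4608 bits per frame. At 44.1 kHz the exact value is about 5015.5 bits, while the truncated version gives 4992, so the table slightly underestimates the frame capacity at that sampling rate.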
IV. Experimental Results
(1) Output of the available bits and the scale factors
(2) Bit allocation results