A WAV file directly records the amplitude of a sound at each instant. For example, if we sample a waveform once every 0.1 seconds, the values stored in the WAV file might be 0, 1, 1, -1, 0, 1. So if we simply add the data of several WAV files together sample by sample, what you hear is all of the sounds at once; this is the basic principle of a mixer.
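As a quick illustration of this direct-addition idea, here is a minimal sketch, assuming two equally long mono streams of 16-bit samples (the function name mix_direct and the clamping are my own additions, not from the original post):

#include <stdint.h>
#include <stddef.h>

/* Naive mixing: add the two streams sample by sample.
 * The sum of two 16-bit samples can exceed the 16-bit range,
 * so the result is clamped here; the methods discussed below
 * are different ways of dealing with exactly this overflow. */
void mix_direct(const int16_t *a, const int16_t *b, int16_t *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        int32_t sum = (int32_t)a[i] + b[i];   /* 32-bit sum, cannot overflow */
        if (sum > 32767)  sum = 32767;        /* clamp to the int16_t range  */
        if (sum < -32768) sum = -32768;
        out[i] = (int16_t)sum;
    }
}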
Step 1: Get the raw data of the two files (for example, with 8-bit samples at 8 kHz, one sample is 8 bits).
Step 2: Let the two audio signals be A and B respectively, with values in the range 0 to 255, where A and B are the sample values (each piece of raw data); store the result in Y.
If both sample values are positive: Y = A + B - A * B / 255
Here Y is the resultant signal containing both signals A and B; merging two audio streams into a single stream by this method solves the problem of overflow and information loss to an extent.
If the range of 8-bit sampling is -127 to 128:
If both A and B are negative: Y = A + B - (A * B / (-127))
Else: Y = A + B - (A * B / 128)
Similarly, for an n-bit sampled audio signal (e.g. 16-bit data):
If both A and B are negative: Y = A + B - (A * B / (-(2^(n-1) - 1)))
Else: Y = A + B - (A * B / 2^(n-1))
Step 3: Add the header to the resultant (mixed) data and play it back.
If something is unclear or ambiguous, let me know.
Regards
Ranjeet Gupta.
There is also a simple C demo program; it is only a sketch, but it contains the core algorithm:

#include <stdio.h>
#include <string.h>

int main(int argc, char *argv[]) {
    char mixname[255];
    FILE *pcm1, *pcm2, *mix;
    int c1, c2;

    if (argc < 3) {
        fprintf(stderr, "usage: %s file1 file2\n", argv[0]);
        return 1;
    }
    pcm1 = fopen(argv[1], "rb");
    pcm2 = fopen(argv[2], "rb");
    if (pcm1 == NULL || pcm2 == NULL) {
        fprintf(stderr, "cannot open input files\n");
        return 1;
    }
    strcpy(mixname, argv[1]);
    strcat(mixname, "_temp.wav");
    mix = fopen(mixname, "wb");

    /* Mix sample by sample until either input runs out. The samples are
       treated as 8-bit signed PCM, so the divisor is -(2^7 - 1) = -127
       in the both-negative case and 2^7 = 128 otherwise. */
    while ((c1 = fgetc(pcm1)) != EOF && (c2 = fgetc(pcm2)) != EOF) {
        signed char sample1 = (signed char)c1;
        signed char sample2 = (signed char)c2;
        int value;
        if (sample1 < 0 && sample2 < 0)
            value = sample1 + sample2 - (sample1 * sample2 / -127);
        else
            value = sample1 + sample2 - (sample1 * sample2 / 128);
        fputc(value, mix);   /* output is still raw PCM; the WAV header
                                must be added afterwards (Step 3) */
    }

    fclose(pcm1);
    fclose(pcm2);
    fclose(mix);
    return 0;
}
/* The same formula applied to 16-bit signed samples: mix the current frame
   of frame_audioMicOut into frame_audioMicIn in place. */
for (int i = 0; i < oAcc->frame_size * 2; i += 2)
{
    uint8_t *pMicOut = frame_audioMicOut->extended_data[0] + i;
    uint8_t *pMicIn  = frame_audioMicIn->extended_data[0] + i;
    short tempMicOut = *(short *)pMicOut;
    short tempMicIn  = *(short *)pMicIn;
    int tempOut;
    if (tempMicOut < 0 && tempMicIn < 0)
        tempOut = tempMicOut + tempMicIn - tempMicOut * tempMicIn / -32767;  /* -(2^15 - 1) */
    else if (tempMicOut > 0 && tempMicIn > 0)
        tempOut = tempMicOut + tempMicIn - tempMicOut * tempMicIn / 32768;   /*   2^15      */
    else
        tempOut = tempMicOut + tempMicIn;      /* opposite signs cannot overflow */
    if (tempOut > 32767)  tempOut = 32767;     /* guard against edge-case overflow */
    if (tempOut < -32768) tempOut = -32768;
    *(short *)pMicIn = (short)tempOut;         /* store the mixed sample back into the input frame */
}
Linear addition followed by averaging
Advantage: no overflow is produced and the noise is relatively low;
Disadvantage: the attenuation is too large, which hurts call quality;
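A minimal sketch of this method, assuming two mono streams of 16-bit samples (the function name mix_average and its parameters are my own choices):

#include <stdint.h>
#include <stddef.h>

/* Linear addition followed by averaging: sum the samples and divide by
 * the number of streams. The sum of two int16_t values always fits in
 * int32_t, and dividing by 2 brings it back into range, so no clamping
 * is needed; the price is that each stream is attenuated by half. */
void mix_average(const int16_t *a, const int16_t *b, int16_t *out, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = (int16_t)(((int32_t)a[i] + b[i]) / 2);
}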
Normalized mixing (adaptive weighted mixing algorithm)
Idea:
Use more bits (32 bits) to represent each sample of audio data, and after mixing, reduce the amplitude so that the result still fits within the range that 16 bits can represent. This approach is called normalization.
Method:
To avoid overflow, a variable attenuation factor is applied to the audio. This attenuation factor acts as the weight of the audio, and it changes as the audio data changes, which is why this is called adaptive weighted mixing. When an overflow occurs, the attenuation factor is made small enough that the overflowing data, once attenuated, stays within the limit; when there is no overflow, the attenuation factor is slowly increased again so that the data changes smoothly.
Code:
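A minimal sketch of this adaptive-weighting scheme, assuming several mono 16-bit streams summed into a 32-bit accumulator (the function name mix_adaptive, the recovery rate, and the parameter layout are my own assumptions; real implementations tune these differently):

#include <stdint.h>
#include <stddef.h>

/* Adaptive weighted mixing: sum the samples of all channels in 32 bits,
 * scale the sum by an attenuation factor f, and adjust f on the fly.
 * When the scaled sum would clip, f is reduced just enough to bring it
 * back to the 16-bit limit; otherwise f recovers slowly toward 1.0 so
 * that the signal changes smoothly. */
void mix_adaptive(const int16_t **in, int channels, size_t samples, int16_t *out)
{
    double f = 1.0;                        /* adaptive attenuation factor      */
    const double recover = 1.0 / 32.0;     /* assumed per-sample recovery rate */

    for (size_t i = 0; i < samples; i++) {
        int32_t sum = 0;
        for (int c = 0; c < channels; c++)
            sum += in[c][i];               /* 32-bit accumulation, no overflow */

        double v = (double)sum * f;
        if (v > 32767.0) {                 /* would clip high: shrink f        */
            f = 32767.0 / (double)sum;
            v = 32767.0;
        } else if (v < -32768.0) {         /* would clip low: shrink f         */
            f = -32768.0 / (double)sum;
            v = -32768.0;
        }
        out[i] = (int16_t)v;

        if (f < 1.0)                       /* let f creep back toward 1.0      */
            f += (1.0 - f) * recover;
    }
}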
Below is a mixing implementation for PCM (pulse-code modulated) audio that I found on newlc, which contains a key mixing algorithm.
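As a hedged sketch, assuming the key algorithm referred to is the saturating formula described earlier (the function name mix_pcm16 and the final clamp are my own additions), the per-sample operation for 16-bit signed PCM looks like this:

#include <stdint.h>

/* Mix two 16-bit PCM samples with the saturating formula described above:
 * both negative -> Y = A + B - A*B / -(2^15 - 1), otherwise Y = A + B - A*B / 2^15. */
static int16_t mix_pcm16(int16_t a, int16_t b)
{
    int32_t y;
    if (a < 0 && b < 0)
        y = (int32_t)a + b - ((int32_t)a * b) / -32767;
    else
        y = (int32_t)a + b - ((int32_t)a * b) / 32768;

    if (y > 32767)  y = 32767;    /* guard against edge-case rounding */
    if (y < -32768) y = -32768;
    return (int16_t)y;
}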
Time-slice splitting and resampling
The sound from each channel can be interleaved together, so that the sample rate is multiplied. If you raise the playback frequency accordingly, the audio plays back normally and the sounds are effectively superimposed. If you do not want to change the playback output frequency, you can resample the combined audio to whatever output frequency you want.
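A minimal sketch of this idea for two mono 16-bit streams (the function names are my own): interleaving doubles the sample rate, and a crude way to bring it back down to the original rate is to average each interleaved pair, which is not a proper filtered resampler but shows the relationship to the averaging method above.

#include <stdint.h>
#include <stddef.h>

/* Interleave two mono streams sample by sample. The result has twice as
 * many samples, i.e. the sample rate is doubled; played back at 2x the
 * original rate it sounds like both streams at once. */
void interleave2(const int16_t *a, const int16_t *b, int16_t *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        out[2 * i]     = a[i];
        out[2 * i + 1] = b[i];
    }
}

/* Crude 2:1 down-resampling of the interleaved stream back to the original
 * rate: average each pair, which here amounts to averaging the two channels. */
void downsample2(const int16_t *in, int16_t *out, size_t n_out)
{
    for (size_t i = 0; i < n_out; i++)
        out[i] = (int16_t)(((int32_t)in[2 * i] + in[2 * i + 1]) / 2);
}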