Getting microphone volume in Unity: checking whether the microphone is silent

Latest recommended article published 2024-01-17 18:56:34

李首良


Article link: https://blog.csdn.net/weixin_35838394/article/details/113030912


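In Unity, audio is recorded with Microphone.Start and played back in order to check whether the user has stopped speaking. Spectrum data is read from the AudioSource, and the average of that data is used to decide whether someone is talking. A threshold filters out noise: when the average falls below it, the user is considered to have stopped speaking. This approach allows near real-time detection, and the user can be kept from hearing their own voice by setting the volume to -80 on an AudioMixer.

Summary generated by CSDN using automated tooling.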

We use the standard method of recording audio in Unity:

_sendingClip = Microphone.Start(_device, true, 10, 16000);

where _sendingClip is the AudioClip and _device is the device name.

I'd like to know when the user stops speaking, which can happen after 2 seconds, or even 10.

I've looked at different sources, but could not find an answer.

The idea is that when a user stops talking, the audio is sent to a speech recognition server without a delay, and without the audio getting cut off while the user is still speaking.

Solutions don't need to be in code format. A general direction of where to look would be nice.

Solution

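You can assign the recording AudioClip to an AudioSource and play it:

// Record into a looping 60-second clip at 16 kHz and keep playback looping with it.
audioSource.clip = Microphone.Start(_device, true, 60, 16000);
audioSource.loop = true;
// Block until the same device has delivered its first samples.
while (Microphone.GetPosition(_device) <= 0) { }
audioSource.Play();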

While it is playing, you can read spectrum data from the AudioSource. When the user is speaking, the spectrum data shows more peaks, so you can check its average to determine whether someone is speaking. You should set a minimum level, because the recording will probably contain some noise. If the average of the spectrum data is above that level, someone is speaking; if it is below, the user has stopped speaking.

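// Requires "using System.Linq;" at the top of the file for Average().
float[] clipSampleData = new float[1024]; // length must be a power of two
float minimumLevel; // noise threshold; set it from a silent recording before use
bool isSpeaking = false;

void Update()
{
    // Read the spectrum of channel 0 for the audio that is currently playing.
    audioSource.GetSpectrumData(clipSampleData, 0, FFTWindow.Rectangular);
    float currentAverageVolume = clipSampleData.Average();

    if (currentAverageVolume > minimumLevel)
    {
        isSpeaking = true;
    }
    else if (isSpeaking)
    {
        isSpeaking = false;
        // Volume is below the level, but the user was speaking before, so the user has stopped speaking.
    }
}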

You can put that check in the Update method. The spectrum data is that of the last frame, so the detection is close to real time.
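Since the question mentions audio getting cut off while the user is still speaking, one possible refinement is to report the end of speech only after the level has stayed below the threshold for a short hold time. This is not part of the original answer; the class name SpeechEndDetector and the holdSeconds parameter are hypothetical, and the sketch is plain C# with no Unity dependency.

```csharp
// Hypothetical helper, not from the original answer: reports the end of speech
// only after the level has stayed below the threshold for holdSeconds.
public class SpeechEndDetector
{
    private readonly float _minimumLevel; // noise threshold (assumed to be calibrated elsewhere)
    private readonly float _holdSeconds;  // assumed pause length that counts as "stopped"
    private float _silentFor;             // time spent below the threshold so far

    public bool IsSpeaking { get; private set; }

    public SpeechEndDetector(float minimumLevel, float holdSeconds)
    {
        _minimumLevel = minimumLevel;
        _holdSeconds = holdSeconds;
    }

    // Call once per frame with the current average level and the frame time.
    // Returns true only on the call where speech is judged to have ended.
    public bool Step(float averageLevel, float deltaTime)
    {
        if (averageLevel > _minimumLevel)
        {
            // Loud enough: the user is (still) speaking, so reset the silence timer.
            IsSpeaking = true;
            _silentFor = 0f;
            return false;
        }

        if (!IsSpeaking)
        {
            return false; // silent, and nobody was speaking before
        }

        _silentFor += deltaTime;
        if (_silentFor < _holdSeconds)
        {
            return false; // a short pause; keep treating the user as speaking
        }

        IsSpeaking = false;
        _silentFor = 0f;
        return true;
    }
}
```

In Update, this could be called as detector.Step(clipSampleData.Average(), Time.deltaTime), sending the audio to the server when it returns true. The hold time trades latency against the risk of cutting the user off mid-sentence.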

The minimum level can be determined by recording silence. You can do that just before the user needs to speak, or once during a set-up step.
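A minimal sketch of such a calibration step, assuming the AudioSource is already playing the microphone clip and the user stays silent while it runs. The class name NoiseFloorCalibrator, the two-second duration, and the safety margin are assumptions for illustration, not values from the original answer.

```csharp
using System.Collections;
using System.Linq;
using UnityEngine;

// Hypothetical helper: averages the spectrum level over a short silent period
// and derives a threshold from it.
public class NoiseFloorCalibrator : MonoBehaviour
{
    public AudioSource audioSource;        // must already be playing the microphone clip
    public float calibrationSeconds = 2f;  // assumed duration of the silent recording
    public float safetyMargin = 1.5f;      // assumed multiplier above the measured noise

    public float MinimumLevel { get; private set; }

    private readonly float[] _spectrum = new float[1024];

    // Run with StartCoroutine(Calibrate()) while the user is silent.
    public IEnumerator Calibrate()
    {
        float sum = 0f;
        int frames = 0;
        float endTime = Time.time + calibrationSeconds;

        while (Time.time < endTime)
        {
            // Same measurement as the speaking check: average of the current spectrum.
            audioSource.GetSpectrumData(_spectrum, 0, FFTWindow.Rectangular);
            sum += _spectrum.Average();
            frames++;
            yield return null; // wait for the next frame
        }

        // Threshold = mean noise level times the margin (0 if no frame was sampled).
        MinimumLevel = frames > 0 ? (sum / frames) * safetyMargin : 0f;
    }
}
```

Once the coroutine finishes, MinimumLevel can be copied into minimumLevel for the speaking check.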

With this solution the user will hear themselves speak. To avoid that, route the output of the AudioSource to an AudioMixer group and set that group's volume to -80 dB. The spectrum data is still available, but no sound is output to the user. Setting the volume to 0 on the AudioSource itself yields spectrum data of 0, so use the AudioMixer instead.
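The routing could be sketched as follows. It assumes an AudioMixerGroup assigned in the Inspector whose volume has been exposed to scripting; the class name SilentMicRouting and the exposed parameter name "MicVolume" are hypothetical.

```csharp
using UnityEngine;
using UnityEngine.Audio;

// Hypothetical helper: sends the microphone AudioSource through a mixer group
// and attenuates that group so the user does not hear their own voice.
public class SilentMicRouting : MonoBehaviour
{
    public AudioSource audioSource;              // the source playing the microphone clip
    public AudioMixerGroup micGroup;             // assigned in the Inspector
    public string volumeParameter = "MicVolume"; // assumed name of the exposed volume parameter

    void Start()
    {
        // Route the source through the mixer group instead of straight to the listener.
        audioSource.outputAudioMixerGroup = micGroup;

        // -80 dB is the bottom of the mixer's volume range, which is effectively silent.
        // SetFloat returns false when the parameter has not been exposed in the mixer.
        if (!micGroup.audioMixer.SetFloat(volumeParameter, -80f))
        {
            Debug.LogWarning("Exposed mixer parameter not found: " + volumeParameter);
        }
    }
}
```

The AudioSource volume itself stays at its normal value, so GetSpectrumData keeps returning usable data.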