How Voice Activity Detection (VAD) Works

残诗

Published 2025-01-23 23:16:49

Tags: speech recognition, artificial intelligence, robotics, microcontrollers, embedded hardware

Article link: https://blog.csdn.net/cnbloger/article/details/145327779

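In robotics research, voice interaction between robot and human is an important capability, and voice activity detection is essential to it. Whether on a phone or on an ESP32 chip, you need a simple, fast way to detect speech locally and filter out stray sounds and noise.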

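Once started, the robot keeps running in the background and captures ambient sound. When a voice is detected locally, the audio is sent to a large model for speech recognition. Once the speech has been recognized correctly, the recognized text is passed to a large model for analysis and a reply, and the reply is spoken through ultra-humanlike speech synthesis. With this setup the robot can accept voice commands at any time while it works, chat with people, and interact with them.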
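
The flow described above can be sketched as a small frame-driven loop. This is only an illustration under stated assumptions: the class `VoiceLoop`, its method `onFrame`, and the three pluggable functions are hypothetical names, not part of any real SDK, and ending an utterance on the first non-voice frame is a simplification chosen to keep the sketch short.

```java
import java.io.ByteArrayOutputStream;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical sketch of the capture -> VAD -> recognition -> reply flow.
// All names here are illustrative assumptions, not a real SDK API.
class VoiceLoop {
    private final Predicate<byte[]> isVoice;          // e.g. an energy-based VAD check
    private final Function<byte[], String> recognize; // speech audio -> recognized text
    private final Function<String, String> chat;      // recognized text -> reply text
    private final ByteArrayOutputStream utterance = new ByteArrayOutputStream();

    VoiceLoop(Predicate<byte[]> isVoice,
              Function<byte[], String> recognize,
              Function<String, String> chat) {
        this.isVoice = isVoice;
        this.recognize = recognize;
        this.chat = chat;
    }

    // Feed one captured frame. Returns the reply text when an utterance
    // has just ended, otherwise null.
    String onFrame(byte[] frame) {
        if (isVoice.test(frame)) {
            // Voice frame: keep buffering the current utterance
            utterance.write(frame, 0, frame.length);
            return null;
        }
        if (utterance.size() == 0) {
            return null; // background noise with nothing buffered: drop it
        }
        // First non-voice frame after speech: treat the utterance as finished
        byte[] audio = utterance.toByteArray();
        utterance.reset();
        String text = recognize.apply(audio);
        return chat.apply(text); // the caller would pass this to speech synthesis
    }
}
```

A caller would invoke `onFrame` for every frame read from the microphone and send any non-null result to the speech synthesizer. Only frames that pass the local check ever reach the recognizer.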

package com.example.sparkchaindemo.llm.online_llm.bm;

import android.util.Log;

public class VAD {
    private int sampleRate;
    private int frameSize;
    private double energyThreshold; // energy threshold

    public VAD(int sampleRate, int frameSize) {
        this.sampleRate = sampleRate;
        this.frameSize = frameSize;
        this.energyThreshold = 0.01; // tune for the actual microphone and environment
    }

    // Check whether an audio frame contains voice (frame energy above the threshold)
    public boolean detectVoice(byte[] audioFrame) {
        double energy = calculateEnergy(audioFrame);
        Log.i("jiaAAA", "energy="+energy);
        return energy > energyThreshold;
    }

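    // Compute the mean energy of a frame of 16-bit little-endian PCM samples
    private double calculateEnergy(byte[] audioFrame) {
        int sampleCount = audioFrame.length / 2; // two bytes per sample
        if (sampleCount == 0) {
            return 0;
        }
        double sum = 0;
        // Stop before a trailing odd byte so audioFrame[i + 1] is always in range
        for (int i = 0; i + 1 < audioFrame.length; i += 2) {
            // Low byte first (masked to avoid sign extension); the high byte carries the sign
            short sampleShort = (short) ((audioFrame[i] & 0xff) | (audioFrame[i + 1] << 8));
            double sample = sampleShort / 32768.0; // normalize to [-1, 1)
            sum += sample * sample;
        }
        // Average over the number of samples, not the number of bytes
        return sum / sampleCount;
    }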
}
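
To get a feel for the numbers, here is a minimal standalone sketch that runs on a plain JVM without Android classes. It assumes 16-bit little-endian PCM and a per-sample mean-square energy. The class `EnergyDemo`, the helper `sineFrame`, and the synthetic test tones are illustrative assumptions, not part of the original code.

```java
// Standalone sketch: mean-square frame energy on synthetic 16-bit PCM frames.
// EnergyDemo and sineFrame are hypothetical names used only for illustration.
class EnergyDemo {
    // Mean of squared samples, with samples normalized to [-1, 1)
    static double energy(byte[] frame) {
        int sampleCount = frame.length / 2;
        if (sampleCount == 0) {
            return 0;
        }
        double sum = 0;
        for (int i = 0; i + 1 < frame.length; i += 2) {
            short s = (short) ((frame[i] & 0xff) | (frame[i + 1] << 8));
            double x = s / 32768.0;
            sum += x * x;
        }
        return sum / sampleCount;
    }

    // Build one frame of a sine tone; amplitude is a fraction of full scale (0..1)
    static byte[] sineFrame(int sampleRate, int samples, double freqHz, double amplitude) {
        byte[] frame = new byte[samples * 2];
        for (int n = 0; n < samples; n++) {
            double v = amplitude * 32767 * Math.sin(2 * Math.PI * freqHz * n / sampleRate);
            short s = (short) Math.round(v);
            frame[2 * n] = (byte) (s & 0xff);            // low byte
            frame[2 * n + 1] = (byte) ((s >> 8) & 0xff); // high byte
        }
        return frame;
    }

    public static void main(String[] args) {
        double threshold = 0.01; // same default value as in the VAD class
        byte[] loud = sineFrame(16000, 320, 400, 0.5);   // 20 ms at 16 kHz, half scale
        byte[] quiet = sineFrame(16000, 320, 400, 0.01); // same tone, 1% of full scale
        System.out.println("loud above threshold:  " + (energy(loud) > threshold));
        System.out.println("quiet above threshold: " + (energy(quiet) > threshold));
    }
}
```

The half-scale tone has an energy of roughly 0.125 and passes the threshold, while the 1% tone stays far below it. A pure tone passes the check just as speech would: an energy threshold measures loudness, not whether the sound is actually a voice, so the threshold has to be tuned against the background noise of the real environment.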

Demo:

[iFLYTEK robot dog talks to the Doubao large model - Bilibili] https://b23.tv/EolJbEq