How to use sherpa-onnx speech recognition in your own project

smile 

Last modified 2024-05-09 19:15:28


Tags: speech recognition, artificial intelligence, Qt, C++, integration testing

First published 2024-05-09 19:14:19

Article link: https://blog.csdn.net/smileac/article/details/138625968


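These notes record how to use a prebuilt executable and a pretrained model from within your own project. Everything below uses the offline speech recognition model as the example, and the executable is sherpa-onnx-microphone-offline.exe.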

Required files

Executable: sherpa-onnx-microphone-offline.exe
Pretrained model: model.int8.onnx
Token table: tokens.txt

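The previous article explains how to obtain these files, so that is not repeated here.
Put the three files into one folder and place that folder anywhere inside your project.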

Code

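Only launching the external program is described here; the business logic is not covered.
The external program is launched with ShellExecuteEx: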

// Command line being reproduced:
// sherpa-onnx-microphone-offline --tokens=./tokens.txt --paraformer=./model.int8.onnx
// For reference: ShellExecute(parent window (usually NULL), verb, file or program to run,
//                parameters, working directory (usually NULL), show command)
// Requires <windows.h> and <shellapi.h>.
QString cmdPath = QCoreApplication::applicationDirPath() + "/../src/LyIndividualCombatSystem/LySherpaOnnx";
cmdPath = cmdPath.replace("/", "\\");
QString exePath = cmdPath + QString("\\sherpa-onnx-microphone-offline.exe");
QString tokensPath = cmdPath + QString("\\tokens.txt");
QString modelPath = cmdPath + QString("\\model.int8.onnx");
// Quote both paths so that directories containing spaces are passed as one argument.
QString strParam = QString("--tokens=\"%1\" --paraformer=\"%2\"").arg(tokensPath, modelPath);
const wchar_t* wcParam = reinterpret_cast<const wchar_t*>(strParam.utf16());
const wchar_t* wcExePath = reinterpret_cast<const wchar_t*>(exePath.utf16());
// The strings are UTF-16, so use the explicit wide-character (W) variants.
SHELLEXECUTEINFOW shexecInfo = { 0 };
shexecInfo.cbSize = sizeof(SHELLEXECUTEINFOW);
shexecInfo.fMask = SEE_MASK_NOCLOSEPROCESS;  // keeps hProcess valid after the call
shexecInfo.hwnd = NULL;
shexecInfo.lpVerb = L"open";
shexecInfo.lpFile = wcExePath;
shexecInfo.lpParameters = wcParam;
shexecInfo.lpDirectory = NULL;
shexecInfo.nShow = SW_HIDE;
shexecInfo.hInstApp = NULL;
// Alternative that does not yield a process handle:
//auto proc = ShellExecute(0, (LPCTSTR)L"open", (LPCTSTR)wcExePath, (LPCTSTR)wcParam, (LPCTSTR)L"", SW_HIDE);
if (ShellExecuteExW(&shexecInfo) && shexecInfo.hProcess != NULL) {
    // Hand the process handle to the top-level window so it can clean up on exit.
    emit sig_speech_process(shexecInfo.hProcess);
}

// Later, when speech recognition should stop:
TerminateProcess(shexecInfo.hProcess, 0);
CloseHandle(shexecInfo.hProcess);  // release the handle once the process is no longer needed
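
The path handling and parameter string used when launching the program can also be expressed in plain standard C++, which makes the result easy to check in isolation. The sketch below is an illustration under stated assumptions: `toNativePath` and `buildSherpaArgs` are hypothetical helper names, and the only flags used are the `--tokens` and `--paraformer` options from the command line shown in the code comment. Wrapping each path in double quotes is an extra precaution for folders whose names contain spaces.

```cpp
#include <algorithm>
#include <string>

// Converts forward slashes to backslashes, the separator used for the
// paths that are handed to the Windows shell call.
std::wstring toNativePath(std::wstring path)
{
    std::replace(path.begin(), path.end(), L'/', L'\\');
    return path;
}

// Builds the parameter string for sherpa-onnx-microphone-offline.exe.
// Each path is wrapped in double quotes so that a folder name containing
// spaces is still passed as a single argument.
std::wstring buildSherpaArgs(const std::wstring& modelDir)
{
    const std::wstring dir = toNativePath(modelDir);
    return L"--tokens=\"" + dir + L"\\tokens.txt\""
         + L" --paraformer=\"" + dir + L"\\model.int8.onnx\"";
}
```

The resulting std::wstring can be passed to a wide-character Windows API through `c_str()`.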

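A few points to note:

Build the paths the way shown above. Relative paths may not resolve correctly, most likely because the working directory at launch is not what you expect, so the files are not found. I ran into exactly this. Also pay attention to forward slashes versus backslashes.
Leaving lpDirectory in SHELLEXECUTEINFO as NULL worked for me, but others report that it must not be NULL, so if the launch fails, try setting it. The other members are well documented online.
Finally, call TerminateProcess() to end the process (a process, not a thread). If the program was started with ShellExecute(), termination fails, because ShellExecute() returns an HINSTANCE rather than a process handle. If it was started with ShellExecuteEx() as shown above, with SEE_MASK_NOCLOSEPROCESS set, hProcess is a real process handle and termination succeeds.
I also hit a problem where the external program kept running in the background after the whole application had exited. The final version does not seem to have this problem; I will test it again later.
At the end of the code I emit a signal carrying the process handle to the top-level window. This covers the case where speech recognition is never switched off manually: the top-level window's destructor then calls TerminateProcess().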
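
The last points boil down to one rule: whoever owns the child process must stop it exactly once, either when the user switches recognition off or, failing that, when the owner is destroyed. Below is a minimal sketch of that rule in plain C++. `ProcessGuard` is a hypothetical class, and the stop callback is a stand-in for whatever actually ends the process (on Windows, presumably a lambda that calls TerminateProcess() and CloseHandle() on the stored handle).

```cpp
#include <functional>
#include <utility>

// Runs a "stop" action at most once: either explicitly via stop(),
// or automatically from the destructor if stop() was never called.
class ProcessGuard
{
public:
    explicit ProcessGuard(std::function<void()> stopAction)
        : m_stopAction(std::move(stopAction)) {}

    // Non-copyable: two guards must never stop the same process.
    ProcessGuard(const ProcessGuard&) = delete;
    ProcessGuard& operator=(const ProcessGuard&) = delete;

    ~ProcessGuard() { stop(); }

    // Safe to call repeatedly; only the first call has an effect.
    void stop()
    {
        if (m_stopAction) {
            auto action = std::move(m_stopAction);
            m_stopAction = nullptr;
            action();
        }
    }

private:
    std::function<void()> m_stopAction;
};
```

Holding such a guard as a member of the top-level window would cover the case where the application exits without recognition having been switched off.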

Receiving UDP data

Because this is a Qt project, the data is received with Qt's QUdpSocket. See the Qt documentation for the details of its usage.

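// Listen on 127.0.0.1:8080 for incoming datagrams (requires <QUdpSocket> and <QDebug>).
m_udpSocketSpeech = new QUdpSocket(this);
const QHostAddress address = QHostAddress("127.0.0.1");
bool result = m_udpSocketSpeech->bind(address, 8080,
	QUdpSocket::ShareAddress | QUdpSocket::ReuseAddressHint);
if (!result) {
	qWarning() << "UDP bind failed:" << m_udpSocketSpeech->errorString();
}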

connect(m_udpSocketSpeech, &QUdpSocket::readyRead, this, &TaskManagement::dataRecvSpeechRecognition);

void TaskManagement::dataRecvSpeechRecognition()
{
	// Drain every waiting datagram; one readyRead signal may cover several.
	while (m_udpSocketSpeech->hasPendingDatagrams()) {
		QByteArray datagram;
		datagram.resize(static_cast<int>(m_udpSocketSpeech->pendingDatagramSize()));
		m_udpSocketSpeech->readDatagram(datagram.data(), datagram.size());

		// Decode the whole payload, not just the bytes up to the first NUL.
		const QString recv_text = QString::fromUtf8(datagram);
		// A message starting with '5' closes this window if it is currently shown.
		if (!recv_text.isEmpty() && recv_text.at(0) == QChar('5') && !this->isHidden()) {
			this->close();
		}
	}
}
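
The decision made inside the read loop can be pulled out into a small pure function, which makes it testable without a socket. This is a sketch, assuming (as the handler above suggests) that a message whose first character is '5' means "close the window". The name `isCloseCommand` is hypothetical:

```cpp
#include <string_view>

// Returns true when the received UTF-8 payload starts with '5',
// the marker the handler above treats as a request to close the window.
bool isCloseCommand(std::string_view payload)
{
    return !payload.empty() && payload.front() == '5';
}
```

Inside the Qt slot, the datagram bytes could be passed in as `std::string_view(datagram.constData(), datagram.size())`.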
