Developing on Windows with the iFlytek (科大讯飞) Voice Cloud

Latest recommended article published 2024-05-08 21:30:45

如梦如幻2015

Category: Speech Recognition. Tags: iFlytek development example

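Preface:

A while ago things were quiet at work, and on a whim I decided to build a speech recognition system. It looked simple enough, but building it turned up one problem after another, which was hard going for an electrical engineering graduate like me. Google is no longer reachable for us, so thanks go to Baidu and to the experts on CSDN: the result more or less works. It is still not pleasant to use, but the basic functions are there.

The software was developed in VS2008 with C++/CLR (C++/CLI). iFlytek (科大讯飞) provides a C API, which caused all kinds of incompatibilities: CLR code runs on the managed heap, while this API is unmanaged and uses pointers everywhere. I originally planned to use C#, but as a beginner I could not get C# and C pointers to work together cleanly. I really miss the days of writing microcontroller code. Recording also requires DirectX support. Software download: http://download.csdn.net/detail/liucheng5037/7509003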

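The running interface is shown in the figure below:

The left side converts text to speech: type text into the text box, set the voice, volume, and speed options as needed, and click Play. The software first fetches the audio through the iFlytek API and then plays it with the chosen settings.

The right side converts speech to text: hold down the button while speaking; when it is released, the software sends the audio to the voice cloud through the iFlytek API, and the returned text is shown in the text box.

Something that looks this simple cost me a lot of time!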

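System composition

The system has four parts: the iFlytek cloud, audio recording, audio playback, and system control. Playback and the iFlytek cloud calls are wrapped in one class, XunFeiSDK. Recording uses SoundRecord, a DirectX-based class written in C# that I found online. I had wanted to put recording into the iFlytek class as well, but the compiler complained that an unmanaged class cannot have a member that lives on the managed heap, so recording stays separate as its own DLL. System control lives in the form class.

iFlytek Voice Cloud:

(First, a passage copied from the official description)

The iFlytek mobile voice platform is a voice application development platform built on iFlytek's existing ISP and IMS products and aimed at mobile internet users. It provides speech synthesis, speech dictation, speech recognition, voiceprint recognition, and other services, and offers voice application developers easy-to-use interfaces on which they can build many kinds of voice applications. Its main functions are:

1) A voice application server based on the HTTP protocol, which integrates iFlytek's latest speech engines and supports speech synthesis, speech dictation, speech recognition, voiceprint recognition, and other services;

2) A voice client subsystem for mobile platforms and PCs, which integrates audio processing and audio codec modules and provides complete APIs for speech synthesis, speech dictation, speech recognition, and voiceprint recognition.

(End of copied passage)

Since this was only a hobby project and I had little time, I copied the official C demo and turned it into a class. The SDK ships one DLL, one LIB file, and a set of header files. The code that actually runs is inside the DLL, which we cannot see; we link against the LIB file to call the speech functions indirectly. The LIB is pulled in like this: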

 
  
 #ifdef _WIN64  
 #pragma comment(lib,"../lib/msc_x64.lib")//x64  
 #else  
 #pragma comment(lib,"../lib/msc.lib")//x86  
 #endif  

Then include the following header files:

 
  
 #include "../include/qisr.h"  
 #include "../include/qtts.h"  
 #include "../include/msp_cmn.h"  
 #include "../include/msp_errors.h"  

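The class XunFeiSDK cannot be declared with ref; otherwise the compiler again reports errors about the managed and unmanaged heaps not being interoperable. One round trip of an iFlytek speech conversion looks like this:

Detailed documentation for iFlytek speech can be downloaded here: http://open.voicecloud.cn/index.php/services/voicebase?type=tts&tab_index=1

Choose the Windows SDK package. It contains some simple demos and documentation, but you need to register before you can download it.

One thing to note: the audio that comes back is raw PCM. PCM is close to WAV (the source listing later in this post makes it playable by writing a WAV header in front of the samples), and players that support WAV can generally play the result. Different voices, such as the Mandarin female voice and the Northeastern dialect, use different speech engines; see the class SoundType for details.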
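To make the PCM/WAV relationship concrete, here is a minimal sketch in standard C++ (not C++/CLI) that puts the canonical 44-byte WAV header in front of raw PCM bytes. The helper names put_le, put_tag, and wrap_pcm_as_wav are made up for this illustration, and the defaults (16 kHz, mono, 16-bit) are taken from the default header in the source listing; this is not how the iFlytek SDK itself works internally.

```cpp
#include <cstdint>
#include <vector>

// Append `value` to `out` as a little-endian integer occupying `bytes` bytes.
static void put_le(std::vector<unsigned char>& out, std::uint32_t value, int bytes) {
    for (int i = 0; i < bytes; ++i) {
        out.push_back(static_cast<unsigned char>((value >> (8 * i)) & 0xFFu));
    }
}

// Append a four-character chunk tag such as "RIFF".
static void put_tag(std::vector<unsigned char>& out, const char (&tag)[5]) {
    out.insert(out.end(), tag, tag + 4);
}

// Return a copy of `pcm` with a canonical 44-byte WAV header in front of it.
// Hypothetical helper; the defaults mirror the 16 kHz mono 16-bit audio used in this post.
std::vector<unsigned char> wrap_pcm_as_wav(const std::vector<unsigned char>& pcm,
                                           std::uint32_t sample_rate = 16000,
                                           std::uint16_t channels = 1,
                                           std::uint16_t bits_per_sample = 16) {
    const std::uint32_t block_align = static_cast<std::uint32_t>(channels) * bits_per_sample / 8u;
    const std::uint32_t byte_rate = sample_rate * block_align;
    const std::uint32_t data_size = static_cast<std::uint32_t>(pcm.size());

    std::vector<unsigned char> wav;
    wav.reserve(44 + pcm.size());
    put_tag(wav, "RIFF");
    put_le(wav, 36u + data_size, 4);   // RIFF chunk size = file size - 8
    put_tag(wav, "WAVE");
    put_tag(wav, "fmt ");
    put_le(wav, 16, 4);                // size of the format chunk
    put_le(wav, 1, 2);                 // format tag 1 = uncompressed PCM
    put_le(wav, channels, 2);
    put_le(wav, sample_rate, 4);
    put_le(wav, byte_rate, 4);
    put_le(wav, block_align, 2);
    put_le(wav, bits_per_sample, 2);
    put_tag(wav, "data");
    put_le(wav, data_size, 4);         // number of audio bytes that follow
    wav.insert(wav.end(), pcm.begin(), pcm.end());
    return wav;
}
```

Calling wrap_pcm_as_wav on a buffer of samples returns bytes that can be written to a .wav file as-is. Building the header after the data size is known avoids the seek-and-patch step that the streaming version in the listing needs.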

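Login can run once when the program starts and logout once when it closes. Each conversion in between must start its own session, because every conversion gets a different session ID.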
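The one-session-per-conversion rule is easy to get wrong on early-return paths. The sketch below, in standard C++11, shows one way to guarantee that a session is always closed. ScopeExit, fake_session_begin, fake_session_end, and convert_once are hypothetical names; the two fake functions only record the call order and stand in for the real SDK calls.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Runs the stored cleanup function exactly once, when the guard leaves scope.
class ScopeExit {
public:
    explicit ScopeExit(std::function<void()> fn) : fn_(std::move(fn)) {}
    ~ScopeExit() { if (fn_) fn_(); }
    ScopeExit(const ScopeExit&) = delete;
    ScopeExit& operator=(const ScopeExit&) = delete;
private:
    std::function<void()> fn_;
};

// Hypothetical stand-ins for the SDK's session calls; they only log the call order.
std::vector<std::string> g_calls;

std::string fake_session_begin() {
    g_calls.push_back("begin");
    return "session-" + std::to_string(g_calls.size());
}

void fake_session_end(const std::string& session_id) {
    g_calls.push_back("end " + session_id);
}

// One conversion opens one session and closes it on every return path.
bool convert_once(bool fail_midway) {
    const std::string session_id = fake_session_begin();
    ScopeExit close_session([session_id] { fake_session_end(session_id); });
    if (fail_midway) {
        return false;          // early return: the guard still ends the session
    }
    g_calls.push_back("work");
    return true;
}
```

In the real program the guard would wrap whatever call ends the session, so a failed upload or a missing file cannot leave a session open on the server.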

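Audio recording:

This part took the longest. At first I knew nothing about recording at all: I had no idea which API to call or which control to use, so I searched Baidu and tried all sorts of approaches. In the end CSDN delivered, and I found this article: Recording with DirectSound in C# (C#中使用DirectSound录音).

The class is nicely wrapped and has only three functions.

SetFileName(): sets the location and name of the recording file

RecStart(): starts recording

RecStop(): stops recording

The whole recording process runs on its own thread and does not block the main program.

Using the C# DLL from C++:

1. Reference the assembly with #using: #using "SoundRecord.dll"

2. Add the namespace: using namespace VoiceRecord;

3. Declare an object: SoundRecord^ recorder; Note that the class name must not be the same as the namespace name. Building the DLL succeeds in that case, but compiling the C++ project that uses it fails.

Audio playback

This part is simple. It uses the SoundPlayer class from the System::Media namespace: gcnew an object, call Load(), then Play(). Load() is optional. Play() handles the audio file produced here, which is PCM data behind a WAV header; other formats were not tested.

System control

This part declares a static pointer to the XunFeiSDK class, a managed recorder object, the conversion thread, and so on.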

 
  
 static XunFeiSDK* xunfei;  
 static SoundRecord^ recorder;  
 Thread^ xunfei_thread;  

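Speech-to-text runs on a separate thread. Because a worker thread may not touch the form controls owned by the main thread, I added a timer and a flag to check whether the worker has finished. Online advice says controls can be accessed through a delegate, but I really could not figure delegates out and gave up, so this part is written very much in microcontroller style.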
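The flag-and-timer pattern described above can be sketched in standard C++11 as follows. This is an illustration rather than the program's actual code: ConversionJob and its members are hypothetical names, the "slow call" is replaced by a string concatenation, and std::atomic is used for the flag so that the hand-off between threads is well defined.

```cpp
#include <atomic>
#include <chrono>
#include <string>
#include <thread>

// A worker thread produces a result; the UI side polls a flag instead of being called back.
// Call start() once per job.
struct ConversionJob {
    std::atomic<bool> done{false};
    std::string result;        // written by the worker before `done` is set
    std::thread worker;

    void start(const std::string& input) {
        worker = std::thread([this, input] {
            result = "recognized:" + input;               // stand-in for the slow recognition call
            done.store(true, std::memory_order_release);  // publish the result
        });
    }

    // What a timer tick would do: returns true and copies the result once the worker is finished.
    bool poll(std::string& out) {
        if (!done.load(std::memory_order_acquire)) {
            return false;
        }
        out = result;
        return true;
    }

    ~ConversionJob() {
        if (worker.joinable()) {
            worker.join();
        }
    }
};
```

A caller starts the job, then calls poll() from its timer until it returns true. The release/acquire pair on the flag is what makes reading result from the polling thread safe; a plain int flag gives no such guarantee in standard C++.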

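Text-to-speech does not use a thread, so the UI freezes for a moment while it runs.

Key points

1. Because the program uses iFlytek's C library together with a C++/CLR form, the gap between the managed and unmanaged heaps is a constant nuisance.

The program uses the conversion functions provided by Microsoft. The required includes are: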

 
  
 #include <windows.h>  
 #include <string>  
 #include <iostream>  
 #include <sstream>  
 #include <msclr\marshal.h>

a. std::string to const char*

const char *strp = str.c_str();

b. System::String^ to std::string and const char* (the buffer returned by StringToHGlobalAnsi should be released with Marshal::FreeHGlobal once it has been copied)

std::string std_str = (const char*)(Marshal::StringToHGlobalAnsi(Sys_str)).ToPointer();

c. char* to System::String^

Sys_str = Marshal::PtrToStringAnsi((IntPtr)char_str);

d. int to std::string

ostringstream oss1;

oss1 << int_num;

std_str = oss1.str();
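The int-to-string step is the only one of these conversions that is plain standard C++, so it can be shown in isolation. The sketch below wraps it in a helper and uses it to assemble a parameter string in the same "vcn/spd/vol/ent" layout that the source listing builds; int_to_string and build_tts_params are hypothetical helper names, not part of the SDK.

```cpp
#include <sstream>
#include <string>

// int -> std::string through a string stream; this also works on
// pre-C++11 compilers such as VS2008, which lack std::to_string.
std::string int_to_string(int value) {
    std::ostringstream oss;
    oss << value;
    return oss.str();
}

// Assemble the text-to-speech parameter string from its parts.
std::string build_tts_params(const std::string& voice, int speed, int volume,
                             const std::string& engine) {
    return "vcn=" + voice +
           ", spd = " + int_to_string(speed) +
           ", vol = " + int_to_string(volume) +
           ", ent = " + engine;
}
```

Keeping the numbers as int until the string is built means the class would not need to store volume and speed as strings at all.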

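2. Within one file, if a class needs to use another class that is defined later, that class must be forward-declared first, much like a C function prototype.

E.g.: ref class SoundType;

3. An unmanaged class cannot directly hold a managed class as a member (the usual workaround is the gcroot<T> wrapper from <vcclr.h>). Managed objects are created with gcnew; unmanaged objects are created with new.

4. List-like containers accept any object, including the lists behind form controls such as ComboBox. What is displayed is the result of the object's ToString method.

Overriding ToString: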

 
 virtual System::String^ ToString() override // override ToString so list controls show the voice name
     {
         return voice;
     }

5. switch/case does not accept string values.
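Since switch cannot branch on strings, a lookup table is a common alternative to a long if/else chain. The sketch below is standard C++11, not the author's code: VoiceConfig and lookup_voice are hypothetical names, and the English keys are made up for the example (the program itself keys on the Chinese display names). The engine and voice values are the ones that appear in the SoundType class in the listing.

```cpp
#include <map>
#include <string>

struct VoiceConfig {
    std::string engine;   // value for the "ent" parameter
    std::string voice;    // value for the "vcn" parameter
};

// Look up the engine/voice pair for a voice name; unknown names get the default voice.
VoiceConfig lookup_voice(const std::string& name) {
    static const std::map<std::string, VoiceConfig> table = {
        {"mandarin-female", {"intp65", "xiaoyan"}},
        {"mandarin-male",   {"intp65", "xiaoyu"}},
        {"english-female",  {"intp65_en", "Catherine"}},
        {"english-male",    {"intp65_en", "henry"}},
        {"cantonese",       {"vivi21", "vixm"}},
        {"taiwanese",       {"vivi21", "vixl"}},
        {"sichuanese",      {"vivi21", "vixr"}},
        {"northeastern",    {"vivi21", "vixyun"}},
    };
    const std::map<std::string, VoiceConfig>::const_iterator it = table.find(name);
    if (it == table.end()) {
        return table.at("mandarin-female");   // same default as the else branch in SoundType
    }
    return it->second;
}
```

Adding a voice then means adding one table row instead of another else-if branch.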

Part of the source code follows:

 // class XunFeiSDK
 /*
  string str("hello");
  const char *strp=str.c_str();    converts string to const char*
 */
 //#using "Microsoft.DirectX.DirectSound.dll"  
 //#using "Microsoft.DirectX.dll"  
   
 #include "../SoundTest/stdafx.h"  
 //#include "stdafx.h"  
 #include "stdlib.h"  
 #include "stdio.h"  
 #include <windows.h>  
 #include <conio.h>  
 #include <errno.h>  
 #include <iostream>  
 #include <sstream>  
 #include <fstream>  
 #include <time.h>  
 #include <string>  
 #include <msclr\marshal.h>  
   
 using namespace std;  
   
 #include "../include/qisr.h"  
 #include "../include/qtts.h"  
 #include "../include/msp_cmn.h"  
 #include "../include/msp_errors.h"  
   
   
   
 #ifdef _WIN64  
 #pragma comment(lib,"../lib/msc_x64.lib")//x64  
 #else  
 #pragma comment(lib,"../lib/msc.lib")//x86  
 #endif  
   
 #define DebugPrint(str_x,msg_y) fprintf(out_file,(str_x),(msg_y))  
   
 typedef int SR_DWORD;  
 typedef short int SR_WORD ;  
   
 // WAV/PCM file header layout
 struct wave_pcm_hdr
 {
     char            riff[4];                // = "RIFF"
     SR_DWORD        size_8;                 // = FileSize - 8
     char            wave[4];                // = "WAVE"
     char            fmt[4];                 // = "fmt "
     SR_DWORD        dwFmtSize;              // = size of the format chunk that follows: 16

     SR_WORD         format_tag;             // = PCM: 1
     SR_WORD         channels;               // = number of channels: 1
     SR_DWORD        samples_per_sec;        // = sample rate: 8000 | 6000 | 11025 | 16000
     SR_DWORD        avg_bytes_per_sec;      // = bytes per second: samples_per_sec * block_align
     SR_WORD         block_align;            // = bytes per sample frame: channels * bits_per_sample / 8
     SR_WORD         bits_per_sample;        // = bits per sample: 8 | 16

     char            data[4];                // = "data"
     SR_DWORD        data_size;              // = length of the raw audio data: FileSize - 44
 } ;

 // default header values (16 kHz, mono, 16-bit)
 const struct wave_pcm_hdr default_pcmwavhdr =   
 {  
     { 'R', 'I', 'F', 'F' },  
     0,  
     {'W', 'A', 'V', 'E'},  
     {'f', 'm', 't', ' '},  
     16,  
     1,  
     1,  
     16000,  
     32000,  
     2,  
     16,  
     {'d', 'a', 't', 'a'},  
     0    
 };  
   
 namespace SoundTest {  
     using namespace System;  
     using namespace System::Runtime::InteropServices;  
     using namespace System::Media;  
     using namespace msclr::interop;  
     ref class SoundType;  
   
     public class XunFeiSDK{  
       
     private: FILE* out_file; // log file
              string appid;
              int ret;
              string pcm_path; // name of the file that stores the audio
              string user;
              string password;
              string voice_type; // voice
              string volunm; // volume, 0-10
              string engin; // engine
              string voice_speed; // speed, 0-10

     public: XunFeiSDK()
             {
                 ret = 0; // status() reports -1 only when the log file cannot be opened
                 out_file = NULL;
                 // copy the timestamp into a std::string, then release the unmanaged buffer
                 IntPtr time_ptr = Marshal::StringToHGlobalAnsi(DateTime::Now.ToString("yyyy-MM-dd HH:mm:ss"));
                 string nowTimes = (const char*)time_ptr.ToPointer();
                 Marshal::FreeHGlobal(time_ptr);
                 fopen_s(&out_file,"log.txt","at+");
                 if(out_file == NULL)
                 {
                     ret = -1;
                     return;
                 }
                 fseek(out_file, 0, SEEK_END);
                 fprintf(out_file,"begin Time:%s \n",nowTimes.c_str());

                 appid = "";
                 user = "";
                 password = "53954218"; // register on the official site to get your own ID
                 pcm_path = "PCM_SPEED.pcm";
                 voice_type = "xiaoyan";
                 volunm = "7";
                 voice_speed = "5";
                 engin = "intp65";
             }
              ~XunFeiSDK()
             {
                 if (out_file == NULL) return; // the constructor failed, so there is nothing to close
                 IntPtr time_ptr = Marshal::StringToHGlobalAnsi(DateTime::Now.ToString("yyyy-MM-dd HH:mm:ss"));
                 string nowTimes = (const char*)time_ptr.ToPointer();
                 Marshal::FreeHGlobal(time_ptr);
                 fprintf(out_file,"Time:%s end\n",nowTimes.c_str());
                 fclose(out_file);
             }
   
     public: int status()  
             {  
                 return ret;  
             }  
   
             bool Login() // log in
             {  
                 string logins = "appid = " + appid + ",work_dir =   .  ";  
                 ret = MSPLogin(user.c_str(), password.c_str(), logins.c_str());  
                 if ( ret != MSP_SUCCESS )  
                 {  
                     fprintf(out_file,"MSPLogin failed , Error code %d.\n",ret);  
                     return false;  
                 }  
                 return true;  
             }  
   
             void Logout()  
             {  
                 MSPLogout(); // log out
             }  
   
             int TextToSpeed(System::String^ Ssrc_text) // text to audio; the audio is stored in PCM_SPEED.pcm
             {
                 #pragma region text to audio
                 struct wave_pcm_hdr pcmwavhdr = default_pcmwavhdr;
                 const char* sess_id = NULL;
                 unsigned int text_len = 0;
                 unsigned int audio_len = 0;
                 int synth_status = MSP_TTS_FLAG_STILL_HAVE_DATA;
                 int synth_ret = MSP_SUCCESS;
                 FILE* fp = NULL;
                 ret = -1; // assume failure
                 if (Ssrc_text == nullptr)
                 {
                     fprintf(out_file,"params is null!\n");
                     return ret;
                 }
                 // session parameters; see the SDK's parameter list for the allowed values
                 string params = "vcn=" + voice_type + ", spd = " + voice_speed + ", vol = " + volunm + ", ent = "+engin;
                 // copy the text into a std::string, then release the unmanaged buffer
                 IntPtr text_ptr = Marshal::StringToHGlobalAnsi(Ssrc_text);
                 string text_copy = (const char*)text_ptr.ToPointer();
                 Marshal::FreeHGlobal(text_ptr);
                 const char* src_text = text_copy.c_str();

                 pcm_path = "PCM_SPEED.pcm";

                 fprintf(out_file,"begin to synth source = %s\n",src_text);
                 text_len = (unsigned int)strlen(src_text); // length of the text in bytes

                 fopen_s(&fp,pcm_path.c_str(),"wb"); // open the output file
                 if (NULL == fp)
                 {
                     fprintf(out_file,"open PCM file %s error\n",pcm_path.c_str());
                     return ret;
                 }

                 sess_id = QTTSSessionBegin(params.c_str(), &ret); // begin a session
                 if ( ret != MSP_SUCCESS )
                 {
                     fprintf(out_file,"QTTSSessionBegin: qtts begin session failed Error code %d.\n",ret);
                     fclose(fp);
                     return ret;
                 }
                 fprintf(out_file,"sess_id = %s\n",sess_id);
                 ret = QTTSTextPut(sess_id, src_text, text_len, NULL ); // send the text
                 if ( ret != MSP_SUCCESS )
                 {
                     fprintf(out_file,"QTTSTextPut: qtts put text failed Error code %d.\n",ret);
                     QTTSSessionEnd(sess_id, "TextPutError"); // abnormal end
                     fclose(fp);
                     return ret;
                 }
                 fwrite(&pcmwavhdr, sizeof(pcmwavhdr) ,1, fp); // write a placeholder header at the start of the file

                 while (1) // fetch audio chunks and append them to the file
                 {
                     const void *data = QTTSAudioGet(sess_id, &audio_len, &synth_status, &ret);
                     if (NULL != data)
                     {
                         fwrite(data, audio_len, 1, fp);
                         pcmwavhdr.data_size += audio_len; // running size of the audio data
                     }
                     if (synth_status == MSP_TTS_FLAG_DATA_END || ret != 0)
                         break;
                 } // see the SDK documentation for the values of synth_status
                 synth_ret = ret; // keep the synthesis result; QTTSSessionEnd overwrites ret below

                 // fix up the RIFF size field
                 pcmwavhdr.size_8 += pcmwavhdr.data_size + 36;

                 // write the corrected sizes back into the header
                 fseek(fp, 4, SEEK_SET);
                 fwrite(&pcmwavhdr.size_8,sizeof(pcmwavhdr.size_8), 1, fp);
                 fseek(fp, 40, SEEK_SET);
                 fwrite(&pcmwavhdr.data_size,sizeof(pcmwavhdr.data_size), 1, fp);
                 fclose(fp);

                 ret = QTTSSessionEnd(sess_id, NULL);
                 if ( ret != MSP_SUCCESS )
                 {
                     fprintf(out_file,"QTTSSessionEnd: qtts end failed Error code %d.\n",ret);
                 }
                 if ( synth_ret != MSP_SUCCESS )
                     ret = synth_ret; // report a synthesis failure even if the session closed cleanly
                 fprintf(out_file,"program end");
                 return ret;
                 #pragma endregion
             }
   
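             System::String^ GetPcmName() // path of the generated audio file
             {
                 return gcnew String(pcm_path.c_str());
             }

             int Play(System::String^ text) // play an audio file
             {
                 if(String::IsNullOrEmpty(text)) return -1;
                 SoundPlayer^ player = (gcnew SoundPlayer(text)); // audio player
                 player->SoundLocation = text;
                 player->Load();
                 player->Play();
                 return 0;
             }

             int StartRecord() // start recording (placeholder; recording is done by SoundRecord)
             {
                 return 0;
             }

             int EndRecord() // stop recording (placeholder; recording is done by SoundRecord)
             {
                 return 0;
             }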
   
             System::String^ SpeedToText(System::String^ text) // speech to text: takes an audio file name, returns the recognized text
             {
                 System::String^ Sys_value = "No data return";
                 if (String::IsNullOrEmpty(text)) return Sys_value;
                 // copy the file name into a std::string, then release the unmanaged buffer
                 IntPtr name_ptr = Marshal::StringToHGlobalAnsi(text);
                 string name_copy = (const char*)name_ptr.ToPointer();
                 Marshal::FreeHGlobal(name_ptr);
                 const char* src_wav_filename = name_copy.c_str();
                 char rec_result[1024] = {0}; // recognition result
                 const char *sessionID = NULL;
                 FILE *f_pcm = NULL;
                 char *pPCM = NULL; // buffer holding the audio file
                 int lastAudio = 0 ;
                 int audStat = MSP_AUDIO_SAMPLE_CONTINUE ;
                 int epStatus = MSP_EP_LOOKING_FOR_SPEECH;
                 int recStatus = MSP_REC_STATUS_SUCCESS ;
                 long pcmCount = 0;
                 long pcmSize = 0; // size of the audio file
                 int errCode = 10 ;
                 string param = "sub=iat,auf=audio/L16;rate=16000,aue=speex-wb,ent=sms16k,rst=plain,rse=gb2312";

                 fprintf(out_file,"Start iat...\n");
                 // read the whole audio file into pPCM before opening a session,
                 // so that a missing file does not leave a session open
                 fopen_s(&f_pcm,src_wav_filename, "rb");
                 if (NULL == f_pcm)
                 {
                     fprintf(out_file,"media %s not found\n",src_wav_filename);
                     return Sys_value;
                 }
                 fseek(f_pcm, 0, SEEK_END);
                 pcmSize = ftell(f_pcm); // audio size in bytes
                 fseek(f_pcm, 0, SEEK_SET);
                 pPCM = (char *)malloc(pcmSize); // allocate the buffer
                 if (NULL != pPCM)
                     fread((void *)pPCM, pcmSize, 1, f_pcm);
                 fclose(f_pcm);
                 f_pcm = NULL;
                 if (NULL == pPCM)
                 {
                     fprintf(out_file,"cannot allocate audio buffer\n");
                     return Sys_value;
                 }

                 sessionID = QISRSessionBegin(NULL, param.c_str(), &errCode); // begin a session
                 if (MSP_SUCCESS != errCode)
                 {
                     fprintf(out_file,"QISRSessionBegin failed, Error code %d.\n",errCode);
                     free(pPCM);
                     return Sys_value;
                 }

                 while (1) { // upload the audio to the server in chunks
                     unsigned int len = 6400;
                     int ret = 0;
                     if (pcmSize < 12800) {
                         len = (unsigned int)pcmSize;
                         lastAudio = 1; // fewer than two chunks left: send the remainder in one write
                     }
                     audStat = MSP_AUDIO_SAMPLE_CONTINUE; // more audio follows
                     if (pcmCount == 0)
                         audStat = MSP_AUDIO_SAMPLE_FIRST;
                     if (len == 0)
                     {
                         break;
                     }
                     fprintf(out_file,"csid=%s,count=%d,aus=%d,",sessionID,(int)(pcmCount/len),audStat);
                     ret = QISRAudioWrite(sessionID, (const void *)&pPCM[pcmCount], len, audStat, &epStatus, &recStatus); // write audio
                     fprintf(out_file,"eps=%d,rss=%d,ret=%d\n",epStatus,recStatus,ret);
                     if (ret != 0)
                         break;
                     pcmCount += (long)len;
                     pcmSize -= (long)len;
                     if (recStatus == MSP_REC_STATUS_SUCCESS) {
                         const char *rslt = QISRGetResult(sessionID, &recStatus, 0, &errCode); // a partial result is available on the server
                         fprintf(out_file,"csid=%s,rss=%d,ret=%d\n",sessionID,recStatus,errCode);
                         if (NULL != rslt)
                             strcat_s(rec_result,rslt);
                     }
                     if (epStatus == MSP_EP_AFTER_SPEECH)
                         break;
                     Sleep(150); // simulate the pace of live speech
                 }
                 QISRAudioWrite(sessionID, (const void *)NULL, 0, MSP_AUDIO_SAMPLE_LAST, &epStatus, &recStatus); // signal end of audio
                 free(pPCM);
                 pPCM = NULL;
                 while (recStatus != MSP_REC_STATUS_COMPLETE && 0 == errCode) {
                     const char *rslt = QISRGetResult(sessionID, &recStatus, 0, &errCode); // fetch the remaining result
                     fprintf(out_file,"csid=%s,rss=%d,ret=%d\n",sessionID,recStatus,errCode);
                     if (NULL != rslt)
                     {
                         strcat_s(rec_result,rslt);
                     }
                     Sleep(150);
                 }
                 QISRSessionEnd(sessionID, NULL);
                 fprintf(out_file,"The result is: %s\n",rec_result);
                 if(rec_result[0] != '\0') // return the text only when something was recognized
                     Sys_value = Marshal::PtrToStringAnsi((IntPtr)rec_result); // convert to a managed string

                 return Sys_value;
             }
   
             void set_tts_params(System::String^ e_voice_type , System::String^ e_engin , int e_volunm , int e_speed)
             {
                 // copy each managed string, then release the unmanaged buffer
                 IntPtr str_ptr = Marshal::StringToHGlobalAnsi(e_voice_type);
                 voice_type = (const char*)str_ptr.ToPointer();
                 Marshal::FreeHGlobal(str_ptr);
                 str_ptr = Marshal::StringToHGlobalAnsi(e_engin);
                 engin = (const char*)str_ptr.ToPointer();
                 Marshal::FreeHGlobal(str_ptr);
                 ostringstream oss1;
                 ostringstream oss2;
                 oss1<<e_volunm;
                 volunm = oss1.str(); // volume
                 oss2<<e_speed;
                 voice_speed = oss2.str(); // speed
             }
               
     };  
   
     public ref class SoundType{
     public: System::String^ engin; // speech engine
             System::String^ voice_type; // voice name
             System::String^ voice; // display text

             SoundType(System::String^ e_voice) // if/else chain, because switch/case does not accept strings
             {
                 voice = e_voice;

                 if (e_voice == "普通话女声") {engin = "intp65";voice_type = "xiaoyan";} // Mandarin, female
                 else if(e_voice == "普通话男声") {engin = "intp65";voice_type = "xiaoyu";} // Mandarin, male
                 else if(e_voice == "英文女声") {engin = "intp65_en";voice_type = "Catherine";} // English, female
                 else if(e_voice == "英文男声") {engin = "intp65_en";voice_type = "henry";} // English, male
                 else if(e_voice == "粤语") {engin = "vivi21";voice_type = "vixm";} // Cantonese
                 else if(e_voice == "台湾话") {engin = "vivi21";voice_type = "vixl";} // Taiwanese
                 else if(e_voice == "四川话") {engin = "vivi21";voice_type = "vixr";} // Sichuanese
                 else if(e_voice == "东北话") {engin = "vivi21";voice_type = "vixyun";} // Northeastern dialect
                 else {engin = "intp65";voice_type = "xiaoyan";voice = "普通话女声";} // default: Mandarin, female

             }
             SoundType()
             {
                 engin = "intp65";voice_type = "xiaoyan";voice = "普通话女声"; // default: Mandarin, female
             }

             virtual System::String^ ToString() override // override ToString
             {
                 return voice;
             }

     };
   
 }  
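The upload loop in SpeedToText sends the audio in 6400-byte pieces and, once fewer than 12800 bytes remain, sends the rest in a single write. The chunking rule can be sketched on its own in standard C++11 without any SDK calls. ChunkStatus, Chunk, and plan_chunks are hypothetical names for this illustration; the statuses mirror the first/continue distinction the listing passes to the write call.

```cpp
#include <cstddef>
#include <vector>

enum class ChunkStatus { First, Continue };

struct Chunk {
    std::size_t offset;    // where the chunk starts in the audio buffer
    std::size_t length;    // number of bytes in the chunk
    ChunkStatus status;    // First for the opening chunk, Continue afterwards
};

// Split `total_bytes` of audio into upload chunks.
std::vector<Chunk> plan_chunks(std::size_t total_bytes, std::size_t chunk_bytes = 6400) {
    std::vector<Chunk> plan;
    if (chunk_bytes == 0) {
        return plan;       // avoid an endless loop on a zero chunk size
    }
    std::size_t offset = 0;
    while (offset < total_bytes) {
        const std::size_t remaining = total_bytes - offset;
        // Same rule as the listing: with fewer than two chunks left, send the remainder at once.
        const std::size_t len = remaining < 2 * chunk_bytes ? remaining : chunk_bytes;
        plan.push_back({offset, len, offset == 0 ? ChunkStatus::First : ChunkStatus::Continue});
        offset += len;
    }
    return plan;
}
```

For a 20000-byte recording this yields chunks of 6400, 6400, and 7200 bytes. Separating the plan from the network calls makes the boundary cases (empty file, file smaller than one chunk) easy to check.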

Form class:

Member variables:

 
  
 private: static XunFeiSDK* xunfei;  
 private: Thread^ xunfei_thread;  
          static int end_flag;  
          static String^ end_result;  
          ArrayList^ voice_types;  
 private: static SoundRecord^ recorder;  

 
  
 #pragma region control event handlers
     private: System::Void Form1_Load(System::Object^  sender, System::EventArgs^  e) {
                  xunfei = (new XunFeiSDK());
                  end_flag = 0;
                  if(-1 == xunfei->status())
               {
                   MessageBox::Show("初始化失败"); // "Initialization failed"
                   this->Close(); // close the form
                   return;
               }
                  if(!(xunfei->Login()))
               {
                   MessageBox::Show("登录失败"); // "Login failed"
                   this->Close(); // close the form
                   return;
               }
                  volunm_lab->Text = "音量 " + volunm_bar->Value; // "Volume"
                  speed_lab->Text = "速度 " + speed_bar->Value; // "Speed"
              }
     private: System::Void Form1_FormClosing(System::Object^  sender, System::Windows::Forms::FormClosingEventArgs^  e) {
                  xunfei->Logout(); // log out
                  delete xunfei; // must be deleted explicitly so that the destructor runs
                  delete recorder;
              }

     private: System::Void play_tts_btn_Click(System::Object^  sender, System::EventArgs^  e) {
                 // tts_status_lab->Text = "先转换，再播放语音"; ("convert first, then play")
                  set_xunfei_param(); // apply the parameters
                  if(0 != xunfei->TextToSpeed(txt_speak->Text)) // any non-zero code is a failure
                  {
                      MessageBox::Show("转换失败"); // "Conversion failed"
                  }
                  else
                  {
                      xunfei->Play(xunfei->GetPcmName());
                  }
              }

     private: System::Void speak_btn_MouseDown(System::Object^  sender, System::Windows::Forms::MouseEventArgs^  e) {
                  StartRecord(); // start the recording thread
                  status_lab->Text = "录音中....."; // "Recording..."
              }

     private: System::Void speak_btn_MouseUp(System::Object^  sender, System::Windows::Forms::MouseEventArgs^  e) {
                  status_lab->Text = "结束录音，转换中..."; // "Recording finished, converting..."
                  xunfei_thread = (gcnew Thread(gcnew ThreadStart(EndRecord)));
                  xunfei_thread->Start();
              }
     private: System::Void timer1_Tick(System::Object^  sender, System::EventArgs^  e) {
                  if(1 == end_flag) // the worker thread has finished
                  {
                      end_flag = 0;
                      result_box->Text = end_result;
                      status_lab->Text = "转换结束"; // "Conversion finished"
                  }
              }
     private: System::Void volunm_bar_Scroll(System::Object^  sender, System::EventArgs^  e) {
                  volunm_lab->Text = "音量 " + volunm_bar->Value; // "Volume"
              }
     private: System::Void speed_bar_Scroll(System::Object^  sender, System::EventArgs^  e) {
                  speed_lab->Text = "速度 " + speed_bar->Value; // "Speed"
              }
 #pragma endregion

 #pragma region custom functions
     private: void set_xunfei_param() // set the iFlytek speech parameters
              {
                  SoundType^ sound_type;

                  sound_type = (SoundType^)(voice_type->SelectedItem); // the selected voice object
                  xunfei->set_tts_params(sound_type->voice_type , sound_type->engin , volunm_bar->Value , speed_bar->Value);
              }
     private: static void StartRecord()
              {
                  recorder = (gcnew SoundRecord());
                  recorder->SetFileName("record.wav");
                  recorder->RecStart();   // start recording
              }

     private:static void EndRecord()
             {
                 if (recorder == nullptr) return; // nothing was being recorded
                 recorder->RecStop();
                 delete recorder;
                 recorder = nullptr; // avoid a second delete in Form1_FormClosing
                 end_result = xunfei->SpeedToText("record.wav"); // recording finished; run recognition
                 end_flag = 1; // tell timer1_Tick that the result is ready
             }
 #pragma endregion
