[Repost] Speech Synthesis and Recognition in C#

Latest recommended article published on 2015-02-15 11:16:00

aosharwygg96827

Tags: artificial intelligence, C#

Original link: http://www.cnblogs.com/Jrong/archive/2009/02/04/1383564.html

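To implement Chinese speech output or Chinese speech recognition, you first need to install Microsoft's Speech SDK. The latest version at the time of writing was SAPI 5.1, which can recognize Chinese, Japanese, and English. You can download it here: http://www.microsoft.com/speech/download/sdk51/. Two packages need to be installed: Speech SDK 5.1 and the 5.1 Language Pack. The Language Pack installer lets you choose which languages to install. Once both are installed, we can start developing speech programs.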

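Let's design a class that can read mixed Chinese and English text aloud.

We implement the class as a singleton. The code is below, followed by a detailed explanation: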

using System;
using SpeechLib; // COM interop for the Microsoft Speech Object Library (SAPI 5.1)

public class Speach
{
    private static Speach _Instance = null;
    private SpeechLib.SpVoiceClass voice = null;

    private Speach()
    {
        BuildSpeach();
    }

    // Returns the single shared instance (created on first use, not thread-safe).
    public static Speach instance()
    {
        if (_Instance == null)
            _Instance = new Speach();
        return _Instance;
    }

    // Index 0 is assumed to be the Chinese voice; the order depends on the installed voices.
    private void SetChinaVoice()
    {
        voice.Voice = voice.GetVoices(string.Empty, string.Empty).Item(0);
    }

    // Index 1 is assumed to be an English voice.
    private void SetEnglishVoice()
    {
        voice.Voice = voice.GetVoices(string.Empty, string.Empty).Item(1);
    }

    private void SpeakChina(string strSpeak)
    {
        SetChinaVoice();
        Speak(strSpeak);
    }

    private void SpeakEnglishi(string strSpeak)
    {
        SetEnglishVoice();
        Speak(strSpeak);
    }

    // True only for the ASCII letters A-Z and a-z.
    private static bool IsEnglishChar(char chr)
    {
        return (chr >= 'A' && chr <= 'Z') || (chr >= 'a' && chr <= 'z');
    }

    // Splits the text into runs of English letters and runs of everything else,
    // then speaks each run with the matching voice.
    public void AnalyseSpeak(string strSpeak)
    {
        if (string.IsNullOrEmpty(strSpeak))
            return;

        int iCbeg = 0;        // start of the current Chinese run
        int iEbeg = 0;        // start of the current English run
        bool IsChina = true;

        for (int i = 0; i < strSpeak.Length; i++)
        {
            char chr = strSpeak[i];
            if (IsChina)
            {
                if (IsEnglishChar(chr))
                {
                    // A Chinese run ends here; speak it and start an English run.
                    SpeakChina(strSpeak.Substring(iCbeg, i - iCbeg));
                    iEbeg = i;
                    IsChina = false;
                }
            }
            else
            {
                if (!IsEnglishChar(chr))
                {
                    // An English run ends here; speak it and start a Chinese run.
                    SpeakEnglishi(strSpeak.Substring(iEbeg, i - iEbeg));
                    iCbeg = i;
                    IsChina = true;
                }
            }
        } // end for

        // Speak the run that is still open at the end of the string.
        if (IsChina)
            SpeakChina(strSpeak.Substring(iCbeg));
        else
            SpeakEnglishi(strSpeak.Substring(iEbeg));
    }

    private void BuildSpeach()
    {
        if (voice == null)
            voice = new SpVoiceClass();
    }

    public int Volume
    {
        get { return voice.Volume; }
        set { voice.Volume = value; }
    }

    public int Rate
    {
        get { return voice.Rate; }
        set { voice.Rate = value; }
    }

    private void Speak(string strSpeack)
    {
        // Nothing to say for an empty run (for example when the text starts with English).
        if (string.IsNullOrEmpty(strSpeack))
            return;
        try
        {
            voice.Speak(strSpeack, SpeechVoiceSpeakFlags.SVSFlagsAsync);
        }
        catch (Exception err)
        {
            // Keep the original exception as InnerException.
            throw new Exception("An error occurred: " + err.Message, err);
        }
    }

    // Purges everything queued on the voice, which stops the current speech.
    public void Stop()
    {
        voice.Speak(string.Empty, SpeechLib.SpeechVoiceSpeakFlags.SVSFPurgeBeforeSpeak);
    }

    public void Pause()
    {
        voice.Pause();
    }

    public void Continue()
    {
        voice.Resume();
    }
} // end class

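With private SpeechLib.SpVoiceClass voice = null; we declare the object that produces speech. The first time the class is used, the BuildSpeach method initializes it.

We also define two properties, Volume and Rate, which set the volume and the speaking rate.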
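The instance() method in the listing creates the object on first use but does no locking, so two threads calling it at the same moment could each construct an instance. As a minimal sketch, assuming .NET 4 or later, Lazy<T> gives the same first-use initialization with thread safety. The names LazySpeechHolder, ConstructedCount, and LazySingletonDemo are hypothetical stand-ins and are not part of the original code.

```csharp
using System;

// Hypothetical stand-in for the Speach class; it holds no SAPI objects.
public sealed class LazySpeechHolder
{
    // Lazy<T> runs the factory at most once, even under concurrent access.
    private static readonly Lazy<LazySpeechHolder> _instance =
        new Lazy<LazySpeechHolder>(() => new LazySpeechHolder());

    // Counts constructor calls so the demo can show there was only one.
    public static int ConstructedCount;

    private LazySpeechHolder()
    {
        ConstructedCount++;
    }

    public static LazySpeechHolder Instance
    {
        get { return _instance.Value; }
    }
}

public static class LazySingletonDemo
{
    public static void Main()
    {
        LazySpeechHolder a = LazySpeechHolder.Instance;
        LazySpeechHolder b = LazySpeechHolder.Instance;
        Console.WriteLine(ReferenceEquals(a, b));             // True
        Console.WriteLine(LazySpeechHolder.ConstructedCount); // 1
    }
}
```

In a real class, the private constructor would be the place to create the voice object, which is the job BuildSpeach does in the listing.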

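As we know, SpVoiceClass has a Speak method. To produce speech we simply pass it a string and it reads that string aloud, as shown below.

private void Speak(string strSpeack)
{
    // Nothing to say for an empty string.
    if (string.IsNullOrEmpty(strSpeack))
        return;
    try
    {
        voice.Speak(strSpeack, SpeechVoiceSpeakFlags.SVSFlagsAsync);
    }
    catch (Exception err)
    {
        // Keep the original exception as InnerException.
        throw new Exception("An error occurred: " + err.Message, err);
    }
}

Here SpeechVoiceSpeakFlags.SVSFlagsAsync means the text is spoken asynchronously: the call returns immediately while the audio plays in the background.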
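To make the asynchronous behavior concrete, here is a small sketch. It assumes a Windows machine with SAPI installed and a project reference to the Microsoft Speech Object Library (the SpeechLib interop assembly). It uses the SpVoice interface rather than SpVoiceClass, since SpVoice also compiles when interop types are embedded. The class name AsyncSpeakSketch is made up for illustration.

```csharp
using System;
using System.Threading;
using SpeechLib; // assumed: COM reference to the Microsoft Speech Object Library

public static class AsyncSpeakSketch
{
    [STAThread]
    public static void Main()
    {
        SpVoice voice = new SpVoice();

        // Both calls return right away; SAPI queues the text and plays it in order.
        voice.Speak("Hello", SpeechVoiceSpeakFlags.SVSFlagsAsync);
        voice.Speak("world", SpeechVoiceSpeakFlags.SVSFlagsAsync);

        Console.WriteLine("Speak returned; audio may still be playing.");

        // Block until the queue is empty so the process does not exit mid-sentence.
        // A timeout of -1 (Timeout.Infinite) waits without a time limit.
        voice.WaitUntilDone(Timeout.Infinite);
    }
}
```

A console program needs the final wait because nothing else keeps the process alive; a Windows Forms application that keeps running can usually skip it.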

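However, this method does not know what language the string is in, so we have to tell it which voice to read the string with. The Voice property of SpVoiceClass sets the voice. We can get the list of all installed voices with the GetVoices method of SpVoiceClass and then pick the one we want by index. For example, switching to a Chinese voice looks like this:

private void SetChinaVoice()
{
    voice.Voice = voice.GetVoices(string.Empty, string.Empty).Item(0);
}

On the author's setup, index 0 is the Chinese voice and indices 1 through 4 are all English voices that differ only in accent. The order depends on which voices are installed, so check it on your own machine.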
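Because the index order depends on the installed voices, it can be safer to ask SAPI for a voice by language. The sketch below is one way to do that, assuming the SpeechLib interop assembly is referenced and that the installed voice tokens carry the usual Language attribute (the LCID in hexadecimal, for example 804 for Simplified Chinese and 409 for US English). The helper names LanguageFilter and TrySelectVoice are hypothetical.

```csharp
using System;
using SpeechLib; // assumed: COM reference to the Microsoft Speech Object Library

public static class VoiceByLanguageSketch
{
    // Builds a SAPI attribute filter such as "Language=804" from an LCID.
    public static string LanguageFilter(int lcid)
    {
        return "Language=" + lcid.ToString("X");
    }

    // Selects the first installed voice for the given LCID; returns false if there is none.
    public static bool TrySelectVoice(SpVoice voice, int lcid)
    {
        ISpeechObjectTokens tokens = voice.GetVoices(LanguageFilter(lcid), string.Empty);
        if (tokens.Count == 0)
            return false;
        voice.Voice = tokens.Item(0);
        return true;
    }

    [STAThread]
    public static void Main()
    {
        SpVoice voice = new SpVoice();

        // List every installed voice with its index, to see the order on this machine.
        ISpeechObjectTokens all = voice.GetVoices(string.Empty, string.Empty);
        for (int i = 0; i < all.Count; i++)
            Console.WriteLine(i + ": " + all.Item(i).GetDescription(0));

        if (TrySelectVoice(voice, 0x804))
            voice.Speak("你好", SpeechVoiceSpeakFlags.SVSFDefault);
        else
            Console.WriteLine("No Chinese voice is installed.");
    }
}
```

The listing printed at the start also shows which index corresponds to which voice, which is what the Item(0) and Item(1) calls in the article rely on.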

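With the voice set, we can combine it with the Speak method to get a method that speaks only Chinese:

private void SpeakChina(string strSpeak)
{
    SetChinaVoice();
    Speak(strSpeak);
}

The English-only method is analogous and appears in the listing above.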

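To read a passage of mixed Chinese and English, the program splits the text into Chinese and English parts, calls SpeakChina for the Chinese parts, and calls SpeakEnglishi for the English parts. To decide whether a character is English or Chinese, I check its ASCII code: ASCII letters count as English and everything else is read with the Chinese voice. The logic is implemented in the AnalyseSpeak method.

So for any mixed Chinese and English text, we only need to pass it to AnalyseSpeak, which takes care of the mixed pronunciation.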
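The splitting step can be looked at on its own, without any speech objects. The following sketch separates a string into runs of ASCII letters and runs of everything else, which is the same classification AnalyseSpeak is built on. TextRun, MixedTextSplitter, and SplitDemo are illustrative names and do not appear in the original post.

```csharp
using System;
using System.Collections.Generic;

// One contiguous piece of text plus a flag saying which voice should read it.
public sealed class TextRun
{
    public readonly bool IsEnglish;
    public readonly string Text;

    public TextRun(bool isEnglish, string text)
    {
        IsEnglish = isEnglish;
        Text = text;
    }
}

public static class MixedTextSplitter
{
    // True only for the ASCII letters A-Z and a-z.
    private static bool IsAsciiLetter(char c)
    {
        return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
    }

    // Splits text into alternating English and non-English runs, in order.
    public static List<TextRun> Split(string text)
    {
        List<TextRun> runs = new List<TextRun>();
        if (string.IsNullOrEmpty(text))
            return runs;

        int start = 0;
        bool english = IsAsciiLetter(text[0]);
        for (int i = 1; i < text.Length; i++)
        {
            bool e = IsAsciiLetter(text[i]);
            if (e != english)
            {
                // The character class changed, so the current run ends before i.
                runs.Add(new TextRun(english, text.Substring(start, i - start)));
                start = i;
                english = e;
            }
        }
        runs.Add(new TextRun(english, text.Substring(start)));
        return runs;
    }
}

public static class SplitDemo
{
    public static void Main()
    {
        foreach (TextRun run in MixedTextSplitter.Split("你好world再见"))
            Console.WriteLine((run.IsEnglish ? "EN: " : "ZH: ") + run.Text);
    }
}
```

For the sample string this yields three runs: 你好, world, and 再见. Note that spaces, digits, and punctuation are not ASCII letters, so with this rule "hello world" becomes three runs and the space is handed to the Chinese voice.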

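The listing above also provides simple methods to pause, resume, and stop speech. They are thin wrappers and easy to follow.

Next, a brief introduction to Chinese speech recognition.

Here is the source code of the speech recognition class, followed by an explanation: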

using System.Windows.Forms;
using SpeechLib; // COM interop for the Microsoft Speech Object Library (SAPI 5.1)

public class SpRecognition
{
    private static SpRecognition _Instance = null;
    private SpeechLib.ISpeechRecoGrammar isrg;
    private SpeechLib.SpSharedRecoContextClass ssrContex = null;
    private System.Windows.Forms.Control cDisplay;

    private SpRecognition()
    {
        ssrContex = new SpSharedRecoContextClass();
        // Create a grammar (ID 1) on the shared recognition context.
        isrg = ssrContex.CreateGrammar(1);
        // Subscribe to the Recognition event, raised after each recognized phrase.
        SpeechLib._ISpeechRecoContextEvents_RecognitionEventHandler recHandle =
            new _ISpeechRecoContextEvents_RecognitionEventHandler(ContexRecognition);
        ssrContex.Recognition += recHandle;
    }

    // Returns the single shared instance (created on first use, not thread-safe).
    public static SpRecognition instance()
    {
        if (_Instance == null)
            _Instance = new SpRecognition();
        return _Instance;
    }

    // Starts dictation; recognized text is appended to tbResult.
    public void BeginRec(Control tbResult)
    {
        // Set the target first so the handler never sees a null control.
        cDisplay = tbResult;
        isrg.DictationSetState(SpeechRuleState.SGDSActive);
    }

    // Stops dictation.
    public void CloseRec()
    {
        isrg.DictationSetState(SpeechRuleState.SGDSInactive);
    }

    private void ContexRecognition(int iIndex, object obj, SpeechLib.SpeechRecognitionType type, SpeechLib.ISpeechRecoResult result)
    {
        if (cDisplay != null)
            cDisplay.Text += result.PhraseInfo.GetText(0, -1, true);
    }
}

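We define ssrContex and isrg as the recognition context and the grammar. Calling the DictationSetState method of isrg starts or stops recognition; in the listing above this is wrapped by the BeginRec and CloseRec methods. cDisplay is where the recognition result is written. So that most controls can display the result, I declared it as the Control type. Each time a phrase is recognized, the context raises its Recognition event, whose delegate type is _ISpeechRecoContextEvents_RecognitionEventHandler. We define the ContexRecognition method to handle that event and output the recognized text there.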
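The handler writes to the control directly. If the handler could ever run on a thread other than the one that owns the control (an assumption here, not something the original post discusses), the usual Windows Forms pattern is to marshal the update to the UI thread. A minimal sketch, where RecognitionDisplaySketch and AppendText are hypothetical names:

```csharp
using System;
using System.Windows.Forms;

public static class RecognitionDisplaySketch
{
    // Appends text to a control, hopping to the control's own thread when needed.
    public static void AppendText(Control target, string text)
    {
        if (target == null || target.IsDisposed)
            return;

        if (target.InvokeRequired)
        {
            // Called from another thread: queue the same call on the UI thread.
            target.BeginInvoke(new Action<Control, string>(AppendText), target, text);
        }
        else
        {
            target.Text += text;
        }
    }
}
```

Inside ContexRecognition, the assignment to cDisplay.Text could then be replaced by a call such as RecognitionDisplaySketch.AppendText(cDisplay, result.PhraseInfo.GetText(0, -1, true)).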

Posted on 2009-02-04 08:59 by Jrong

Reposted from: https://www.cnblogs.com/Jrong/archive/2009/02/04/1383564.html