CMUSphinx Learn - Generating a dictionary

IT_FISH629

Generating a dictionary

There are various tools to help you extend an existing dictionary with new words or build a new dictionary from scratch. If your language already has a dictionary, it is recommended to use it, since it has been carefully tuned for best performance. If you are starting a new language, you need to account for various reduction and coarticulation effects, which make it very hard to create accurate rules for converting text to sounds. However, practice shows that even a naive conversion can produce good results for speech recognition. For example, many developers have successfully built ASR systems with a simple grapheme-based approach in which each letter is mapped to itself rather than to a corresponding phone.
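As a rough illustration of the grapheme-based approach described above, the sketch below writes dictionary entries in which every letter of a word stands in for a phone. The function names and the one-entry-per-line output layout (word followed by space-separated units) are assumptions made for this example, not part of any CMUSphinx tool.

```python
def grapheme_pronunciation(word):
    """Return the letters of a word, each used as its own pseudo-phone."""
    return [ch for ch in word.lower() if ch.isalpha()]


def build_grapheme_dictionary(words):
    """Build sorted dictionary lines of the form 'word l e t t e r s'."""
    lines = []
    for word in sorted(set(w.lower() for w in words)):
        phones = grapheme_pronunciation(word)
        if phones:  # skip tokens that contain no letters at all
            lines.append(f"{word} {' '.join(phones)}")
    return lines


if __name__ == "__main__":
    for line in build_grapheme_dictionary(["hello", "world"]):
        print(line)
```

A dictionary like this only makes sense if the acoustic model is trained on the same letter-based units, so the unit inventory here is simply the alphabet of the language.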

For most languages you need specialized grapheme-to-phoneme (g2p) code that performs the conversion using machine learning methods trained on a small existing database. Currently the most accurate g2p tools are Phonetisaurus:

http://code.google.com/p/phonetisaurus

And sequitur-g2p:

http://www-i6.informatik.rwth-aachen.de/web/Software/g2p.html

Also note that almost every TTS package includes G2P code. For example, you can use the g2p code from FreeTTS, which is written in Java:

http://cmusphinx.sourceforge.net/projects/freetts

See also the FreeTTS example in Sphinx4.

OpenMary, a TTS system written in Java:

http://mary.dfki.de/

or espeak, written in C:

http://espeak.sourceforge.net

Please note that if you use TTS you often need to do phoneset conversion, because TTS phonesets are usually more extensive than ASR requires. However, TTS tools have a great advantage: they usually provide more of the required functionality than simple G2P. For example, they handle tokenization, converting numbers and abbreviations to their spoken form.
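The phoneset conversion mentioned above can be sketched as a simple table lookup. The mapping below is purely illustrative: the TTS-side symbols, the ASR-side symbols, and the choice to drop trailing stress digits are all assumptions for this example, and a real conversion table has to be written against the actual phonesets of the TTS engine and the acoustic model.

```python
# Hypothetical mapping from a richer TTS phoneset to a smaller ASR phoneset.
# Phones not listed here are assumed to differ only in letter case.
TTS_TO_ASR = {
    "ax": "AH",  # reduced vowel merged with a full vowel
    "dx": "T",   # flap merged with the plain stop
}


def convert_phones(tts_phones, mapping=TTS_TO_ASR):
    """Convert a list of TTS phones to ASR phones using a lookup table."""
    converted = []
    for phone in tts_phones:
        base = phone.rstrip("012")  # drop trailing stress digits, if any
        converted.append(mapping.get(base, base.upper()))
    return converted


if __name__ == "__main__":
    print(" ".join(convert_phones(["hh", "ax", "l", "ow1"])))
```

Keeping the table as plain data makes it easy to review against both phoneset definitions and to extend when the recognizer rejects an unknown phone.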

For English, there is a simpler option, an online web service:

http://www.speech.cs.cmu.edu/tools/lmtool.html

The online LM Tool produces a dictionary that matches its language model. It uses the latest CMU dictionary as a base and is programmed to guess the pronunciations of words that are not in the existing dictionary. You can look at the log file to find out which words were guessed and make your own corrections if necessary. With the advanced option, LM Tool can use a hand-made dictionary that you specify for your specialized vocabulary or for your own pronunciations as corrections. The hand dictionary must be in the same format as the main dictionary.
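To make the idea of a hand dictionary overriding a base dictionary concrete, here is a minimal sketch. It is not how LM Tool is implemented. It assumes the plain CMU dictionary layout of one entry per line, with the word followed by whitespace-separated phones and `;;;` marking comment lines. All function names are hypothetical.

```python
def parse_dictionary(lines):
    """Parse 'WORD PH1 PH2 ...' lines into a {word: [phones]} mapping."""
    entries = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith(";;;"):  # skip blanks and comments
            continue
        word, *phones = line.split()
        entries[word] = phones
    return entries


def merge_dictionaries(base, hand):
    """Return a new mapping in which hand-made entries override base ones."""
    merged = dict(base)
    merged.update(hand)
    return merged


def missing_words(vocabulary, dictionary):
    """List vocabulary words with no entry, i.e. candidates for guessing."""
    return sorted(w for w in set(vocabulary) if w not in dictionary)


if __name__ == "__main__":
    base = parse_dictionary(["hello HH AH L OW", "world W ER L D"])
    hand = parse_dictionary(["sphinx S F IH NG K S"])
    merged = merge_dictionaries(base, hand)
    print(missing_words(["hello", "sphinx", "cmu"], merged))
```

The words returned by `missing_words` are the ones whose pronunciations would have to be guessed by a g2p tool or added by hand.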

If you want to run lmtool offline, you can check it out from Subversion:

http://cmusphinx.svn.sourceforge.net/viewvc/cmusphinx/trunk/logios

The pronunciation generation code currently supports only US English.