Pitfalls I ran into while using iFlytek speech recognition

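It has been a long time since my last blog post. I admire the experts who keep writing post after post; their persistence and patience are something I should learn from.

This time I am writing about a well-known speech recognition SDK, the one from iFlytek (科大讯飞). It is not all that complicated, but because I was using Swift I made a lot of mistakes along the way and spent a long time sorting them out, so I am recording them here.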

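Steps (most of them are already covered elsewhere online, so I only outline the main ones; the point is to describe the problems I ran into myself, for your reference):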

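1. First, create an application on the iFlytek website and download the SDK. None of this is complicated, and a quick search turns up plenty of guides.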


2. Once it is downloaded, add iflyMSC.framework to the project. I recommend copying it into the project folder first and then adding it to the project from there.


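3. A very important step: add all of the system libraries the SDK depends on. If any are missing you get baffling errors with no obvious lead, and I foolishly lost a lot of time here. The official guide pasted at the end of this post warns in particular not to leave out libz.dylib and CoreTelephony.framework.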


4. After adding the files, because the project is written in Swift you also need to create a bridging header and include the SDK headers in it:

#import "iflyMSC/IFlySpeechRecognizerDelegate.h"
#import "iflyMSC/IFlySpeechRecognizer.h"
#import "iflyMSC/IFlyRecognizerViewDelegate.h"
#import "iflyMSC/IFlyRecognizerView.h"
#import "iflyMSC/IFlyContact.h"
#import "iflyMSC/IFlyUserWords.h"

#import "iflyMSC/IFlyDataUploader.h"

#import "iflyMSC/IFlySpeechSynthesizerDelegate.h"
#import "iflyMSC/IFlySpeechSynthesizer.h"

#import "iflyMSC/IFlySpeechUtility.h"
#import "iflyMSC/IFlySpeechConstant.h"
#import "iflyMSC/IFlySpeechError.h"

#import "iflyMSC/IFlySpeechUnderstander.h"
#import "iflyMSC/IFlyTextUnderstander.h"

#import "iflyMSC/IFlySetting.h"


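5. With that done, you can use the SDK in your project. It comes in two flavors: recognition without the built-in UI, and recognition with the built-in recognition view. The two are used in almost the same way.

Without UI: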

 

import UIKit

class ViewController: UIViewController, IFlySpeechRecognizerDelegate, UITextViewDelegate {

    var iflySpeechRecognizer: IFlySpeechRecognizer!
    var resultText = ""
    var textView = UITextView()

    override func viewDidLoad() {
        super.viewDidLoad()

        // Initialize the SDK with the appid of your own application
        let initString = "appid=575fb8bb"
        IFlySpeechUtility.createUtility(initString)

        // Configure the recognizer (no built-in UI)
        self.iflySpeechRecognizer = IFlySpeechRecognizer.sharedInstance() as IFlySpeechRecognizer
        self.iflySpeechRecognizer.delegate = self
        self.iflySpeechRecognizer.setParameter("iat", forKey: IFlySpeechConstant.IFLY_DOMAIN())
        self.iflySpeechRecognizer.setParameter("16000", forKey: IFlySpeechConstant.SAMPLE_RATE())
        self.iflySpeechRecognizer.setParameter("plain", forKey: IFlySpeechConstant.RESULT_TYPE())
        self.iflySpeechRecognizer.setParameter("8000", forKey: IFlySpeechConstant.VAD_EOS())
        self.iflySpeechRecognizer.setParameter("8000", forKey: IFlySpeechConstant.VAD_BOS())
        self.iflySpeechRecognizer.setParameter("500000", forKey: IFlySpeechConstant.SPEECH_TIMEOUT())
        self.iflySpeechRecognizer.setParameter("50000", forKey: IFlySpeechConstant.NET_TIMEOUT())

        // Text view that shows the recognized text
        textView.frame = CGRectMake(50, 200, 200, 100)
        textView.backgroundColor = UIColor.grayColor()
        textView.textColor = UIColor.whiteColor()
        textView.text = "lllllll"
        self.view.addSubview(textView)

        // Button that starts recognition
        let btn: UIButton = UIButton(frame: CGRectMake(100, 100, 100, 100))
        btn.backgroundColor = UIColor.redColor()
        btn.setTitle("Speech recognition", forState: UIControlState.Normal)
        btn.addTarget(self, action: #selector(startVoiceBtn), forControlEvents: UIControlEvents.TouchUpInside)
        self.view.addSubview(btn)
    }

    func startVoiceBtn() {
        iflySpeechRecognizer.startListening()
    }

    // Delegate callback: delivers recognition results
    func onResults(results: [AnyObject]!, isLast: Bool) {
        var resultStr = ""
        // The recognized text is carried in the keys of the first dictionary
        if let resultDic = results?.first as? [String: String] {
            for key in resultDic.keys {
                resultStr += key
            }
        }

        // Separate consecutive segments with a comma
        if resultText != "" && !resultText.hasSuffix(",") {
            resultText += ","
        }
        resultText += resultStr
        textView.text = resultText
    }

    // Delegate callback: reports the error code of the session
    func onError(errorCode: IFlySpeechError!) {
        print("Recognition error: \(errorCode.errorCode)")
        if errorCode.errorCode == 0 {
            // Code 0: start listening again so recognition continues
            iflySpeechRecognizer.startListening()
        }
    }
}

 

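There is not much to watch out for: adopt the delegate protocol and implement its two delegate methods. When you test on a real device, the recognized text arrives in onResults, and you can then process the string however your app needs. Three things matter in particular: the device must be online, you must test on a real device, and you must actually speak the text in Mandarin. If you only make noise, it keeps reporting a recognition error.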
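The string handling inside onResults can be pulled out into a small pure function, which makes it easy to test without a device. This is only a minimal sketch, written in current Swift syntax rather than the Swift 2-era syntax of the listing above. The name `appendSegment` is my own and not part of the SDK, and unlike the listing it skips empty segments so that a silent result does not leave a dangling comma.

```swift
import Foundation

/// Appends one recognized segment to the running text,
/// inserting the separator only when one is needed.
/// (Hypothetical helper, not an SDK call.)
func appendSegment(_ segment: String, to text: String, separator: String = ",") -> String {
    // Nothing recognized: leave the running text untouched
    guard !segment.isEmpty else { return text }
    // First segment, or the text already ends with the separator
    if text.isEmpty || text.hasSuffix(separator) {
        return text + segment
    }
    return text + separator + segment
}

var running = ""
running = appendSegment("你好", to: running)
running = appendSegment("世界", to: running)
print(running) // prints "你好,世界"
```

In the delegate callback you would then call something like `resultText = appendSegment(resultStr, to: resultText)` and assign the result to the text view.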

With UI:

 

import UIKit

class voiceViewController: UIViewController, IFlyRecognizerViewDelegate {

    var iflyRecognizerView: IFlyRecognizerView!

    // true while recognition should restart after each result
    var isRecongnizer = false
    var resultText = ""
    var textView = UITextView()

    override func viewDidLoad() {
        super.viewDidLoad()

        // Initialize the SDK with the appid of your own application
        let initString = "appid=575fb8bb"
        IFlySpeechUtility.createUtility(initString)

        // Create the built-in recognition view, centered on the screen
        self.iflyRecognizerView = IFlyRecognizerView(center: self.view.center)
        self.iflyRecognizerView.delegate = self
        self.iflyRecognizerView.setParameter("iat", forKey: IFlySpeechConstant.IFLY_DOMAIN())
        self.iflyRecognizerView.setParameter("16000", forKey: IFlySpeechConstant.SAMPLE_RATE())
        // result_type: data format of the returned result; only "plain" is supported here
        self.iflyRecognizerView.setParameter("plain", forKey: IFlySpeechConstant.RESULT_TYPE())

        // Text view that shows the recognized text
        textView.frame = CGRectMake(50, 200, 200, 100)
        textView.backgroundColor = UIColor.grayColor()
        textView.textColor = UIColor.whiteColor()
        textView.text = "lllllll"
        self.view.addSubview(textView)

        // Button that starts recognition
        let btn: UIButton = UIButton(frame: CGRectMake(100, 100, 100, 100))
        btn.backgroundColor = UIColor.redColor()
        btn.setTitle("Speech recognition", forState: UIControlState.Normal)
        btn.addTarget(self, action: #selector(startVoiceBtn), forControlEvents: UIControlEvents.TouchUpInside)
        self.view.addSubview(btn)
    }

    func startVoiceBtn() {
        print("Start recognition")
        iflyRecognizerView.start()
    }

    // Delegate callback: delivers recognition results
    func onResult(results: [AnyObject]!, isLast: Bool) {
        var resultStr = ""
        // The recognized text is carried in the keys of the first dictionary
        if let resultDic = results?.first as? [String: String] {
            for key in resultDic.keys {
                resultStr += key
            }
        }

        // Separate consecutive segments with a comma
        if resultText != "" && !resultText.hasSuffix(",") {
            resultText += ","
        }

        resultText += resultStr
        textView.text = resultText

        if isRecongnizer {
            // Keep recognizing
            iflyRecognizerView.start()
        } else {
            iflyRecognizerView.cancel()
            if resultText != "" {
                // Replace a trailing comma, if there is one, with a full stop
                if resultText.hasSuffix(",") {
                    resultText = String(resultText.characters.dropLast())
                }
                resultText += "。"
                textView.text = resultText
            }
        }
    }

    // Delegate callback: reports the error code of the session
    func onError(error: IFlySpeechError!) {
        print("Recognition error: \(error.errorCode)")
    }
}

 

 

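6. The two approaches differ only in the delegate protocol you adopt and the class you use. Neither is really complicated; it was just confusing at first, and it gets easy once you start using it.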
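One detail from the with-UI listing that is worth isolating is the final clean-up, where the trailing comma is turned into a full stop once recognition stops. Below is a minimal sketch of that step as a pure function, in current Swift syntax. `finalizeTranscript` is a hypothetical name of my own, and the choice to leave an already terminated or empty text alone is my assumption, not something the SDK prescribes.

```swift
import Foundation

/// Strips trailing separators and ends the text with a single terminator.
/// (Hypothetical helper, not an SDK call.)
func finalizeTranscript(_ text: String,
                        separator: Character = ",",
                        terminator: Character = "。") -> String {
    var result = text
    // Drop any separators left at the end by the last appended segment
    while result.last == separator {
        result.removeLast()
    }
    // Nothing was recognized: do not produce a lone full stop
    guard !result.isEmpty else { return result }
    // Avoid doubling the terminator if the text already ends with one
    if result.last != terminator {
        result.append(terminator)
    }
    return result
}

print(finalizeTranscript("你好,世界,")) // prints "你好,世界。"
```

Keeping this separate from the delegate method means the last character of the recognized text is never cut off by accident, which is easy to get wrong with index arithmetic.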

 

2 Preparation

2.1 Create an iOS project

Create your project in Xcode, or open an existing one.

2.2 Add the static library

Add iflyMSC.framework from the lib directory of the SDK package to the project. (Figures 1 to 3 of the original guide are not reproduced here.)

2.3 Add frameworks

Add the iOS libraries that the SDK needs. (Figure 4 of the original guide is not reproduced here.) Take care not to leave out libz.dylib and CoreTelephony.framework.

Note: if you use offline recognition, you also need to add libc++.dylib.

2.4 Check the SDK path

Make sure the search path highlighted in red in Figure 5 (not reproduced here) can locate iflyMSC.framework. To support team development, it is recommended to double-click that entry and change it to a relative path, as in Figure 6 (not reproduced here).

Note: delete any unnecessary paths. For example, if you updated the SDK and the new SDK is not in the same location as the old one, delete the old path so that the old library is not referenced. If the build fails after integrating the SDK and reports that a header file cannot be found, check this path first.

2.5 Import the headers

Import the relevant headers in the files that use the MSC service, for example:

//Speech recognition control with UI
#import "iflyMSC/IFlyRecognizerViewDelegate.h"
#import "iflyMSC/IFlyRecognizerView.h"

//Speech recognition control without UI
#import "iflyMSC/IFlySpeechRecognizerDelegate.h"
#import "iflyMSC/IFlySpeechRecognizer.h"

//Speech synthesis control without UI
#import "iflyMSC/IFlySpeechSynthesizerDelegate.h"
#import "iflyMSC/IFlySpeechSynthesizer.h"

2.6 Integrate the help documentation into Xcode

Open a terminal (Terminal or iTerm), cd to the doc directory of the package, and run the following command.

Note: the docset path may differ between Xcode versions; adjust it to the actual path.

cp -R -f -a com.iflytek.documentation.IFlyMSC.docset ~/Library/Developer/Shared/Documentation/DocSets/

Then run:

open ~/Library/Developer/Shared/Documentation/DocSets/

Check that the documentation version is the one you just downloaded (Figure 7, not reproduced here). Open the Xcode help documentation and you can see the integrated docs (Figure 8, not reproduced here).

2.7 Initialization

The speech services can only be used after initialization. Initialization is asynchronous, and it is recommended to call it at the program entry point.

The Appid is the identity of the application and is unique; it must be passed in at initialization. You can look it up in APPID_VALUE in the demo's Definition.h. Demo and SDK application address: http://xfyun.cn

//Replace "12345678" with the APPID you applied for.
NSString *initString = [[NSString alloc] initWithFormat:@"appid=%@", @"12345678"];
[IFlySpeechUtility createUtility:initString];

3 Speech dictation

Usage example:

//Header declaration
//Implement IFlyRecognizerViewDelegate, the service delegate of the recognition session
@interface RecognizerViewController : UIViewController<IFlyRecognizerViewDelegate>
{
    IFlyRecognizerView *_iflyRecognizerView;
}
@end

//Initialize the speech recognition control
_iflyRecognizerView = [[IFlyRecognizerView alloc] initWithCenter:self.view.center];
_iflyRecognizerView.delegate = self;
[_iflyRecognizerView setParameter:@"iat" forKey:[IFlySpeechConstant IFLY_DOMAIN]];
//asr_audio_path is the file name for the saved recording; if it is no longer needed, set the value to nil to cancel; the default directory is documents
[_iflyRecognizerView setParameter:@"asrview.pcm" forKey:[IFlySpeechConstant ASR_AUDIO_PATH]];
//Start the recognition service
[_iflyRecognizerView start];

/* Recognition result callback
 @param resultArray recognition results
 @param isLast whether this is the last result
 */
- (void)onResult:(NSArray *)resultArray isLast:(BOOL)isLast
{
}

/* Recognition session error callback
 @param error error code
 */
- (void)onError:(IFlySpeechError *)error
{
}

4 Speech recognition

4.1 Online speech recognition

Uploading contacts, usage example:

//Create the upload object
_uploader = [[IFlyDataUploader alloc] init];
//Get the contact list
IFlyContact *iFlyContact = [[IFlyContact alloc] init];
NSString *contactList = [iFlyContact contact];
//Set parameters
[_uploader setParameter:@"uup" forKey:@"subject"];
[_uploader setParameter:@"contact" forKey:@"dtt"];
//Start the upload
[_uploader uploadDataWithCompletionHandler:^(NSString *grammerID, IFlySpeechError *error) {
    //Receive the returned grammerID and error
    [self onUploadFinished:grammerID error:error];
} name:@"contact" data:contactList];

Uploading a user word list, usage example:

//Create the upload object
_uploader = [[IFlyDataUploader alloc] init];
//Build the user word list object
//User word list
#define USERWORDS @"{\"userword\":[{\"name\":\"iflytek\",\"words\":[\"德国盐猪手\",\"1912酒吧街\",\"清蒸鲈鱼\",\"挪威三文鱼\",\"黄埔军校\",\"横沙牌坊\",\"科大讯飞\"]}]}"
IFlyUserWords *iFlyUserWords = [[IFlyUserWords alloc] initWithJson:USERWORDS];
#define NAME @"userwords"
//Set parameters
[_uploader setParameter:@"iat" forKey:@"sub"];
[_uploader setParameter:@"userword" forKey:@"dtt"];
//Upload the word list
[_uploader uploadDataWithCompletionHandler:^(NSString *grammerID, IFlySpeechError *error) {
    //Receive the returned grammerID and error
    [self onUploadFinished:grammerID error:error];
} name:NAME data:[iFlyUserWords toString]];

Uploading an ABNF grammar, example:

//ABNF grammar example; it lets you say "北京到上海"
#define ABNFPARAM @"sub=asr,dtt=abnf"
#define ABNFDATA @"#ABNF 1.0 gb2312; language zh-CN; mode voice; root $main; $main = $place1 到$place2 ; $place1 = 北京 | 武汉 | 南京 | 天津 | 天京 | 东京; $place2 = 上海 | 合肥;"
//Create the upload object
_uploader = [[IFlyDataUploader alloc] init];
//Set parameters
[_uploader setParameter:@"asr" forKey:@"sub"];
[_uploader setParameter:@"abnf" forKey:@"dtt"];
//Upload the ABNF grammar (ABNFNAME is not defined in this excerpt)
[_uploader uploadDataWithCompletionHandler:^(NSString *grammerID, IFlySpeechError *error) {
    //Receive the returned grammerID and error
    [self setGrammerId:grammerID];
} name:ABNFNAME data:ABNFDATA];

4.2 Offline speech recognition

1) Create the recognizer object (note: if you use offline recognition, you also need to add libc++.dylib)

//This method is a wrapper from the demo; see the demo for the implementation.
self.iFlySpeechRecognizer = [RecognizerFactory CreateRecognizer:self Domain:@"asr"];

2) Set parameters

//Enable candidate results
[_iflySpeechRecognizer setParameter:@"1" forKey:@"asr_wbest"];
//Set the engine type, cloud or local
[_iflySpeechRecognizer setParameter:@"local" forKey:[IFlySpeechConstant ENGINE_TYPE]];
//Set the character encoding to utf-8
[_iflySpeechRecognizer setParameter:@"utf-8" forKey:[IFlySpeechConstant TEXT_ENCODING]];
//Grammar type: bnf for local recognition, abnf for online recognition
[_iflySpeechRecognizer setParameter:@"bnf" forKey:[IFlyResourceUtil GRAMMARTYPE]];
//Start the asr recognition engine
[[IFlySpeechUtility getUtility] setParameter:@"asr" forKey:[IFlyResourceUtil ENGINE_START]];
//Set the service type to asr recognition
[_iflySpeechRecognizer setParameter:@"asr" forKey:[IFlySpeechConstant IFLY_DOMAIN]];
//Set the grammar build path; it is a directory inside the sandbox, so make sure the directory exists
[_iflySpeechRecognizer setParameter:_grammBuildPath forKey:[IFlyResourceUtil GRM_BUILD_PATH]];
//Set the engine resource file path, e.g. common.mp3 in the demo's aitalkResource
[_iflySpeechRecognizer setParameter:_aitalkResourcePath forKey:[IFlyResourceUtil ASR_RES_PATH]];

3) Compile the grammar text
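The quoted guide stops at this point. Going back for a moment to the user word list in section 4.1: the hand-escaped JSON in the USERWORDS macro is easy to get wrong. Assuming the structure shown there (a top-level "userword" array whose entries have a "name" and a "words" list), the same string could be built in Swift with Codable. This is only a sketch; the type and function names are hypothetical and not part of the SDK.

```swift
import Foundation

// Mirrors one entry of the "userword" array (hypothetical type)
struct UserWordGroup: Codable, Equatable {
    let name: String
    let words: [String]
}

// Mirrors the top-level JSON object (hypothetical type)
struct UserWordList: Codable, Equatable {
    let userword: [UserWordGroup]
}

/// Encodes the word list as a JSON string, or returns nil if encoding fails.
func makeUserWordsJSON(_ list: UserWordList) -> String? {
    let encoder = JSONEncoder()
    // Sorted keys keep the output stable between runs
    encoder.outputFormatting = [.sortedKeys]
    guard let data = try? encoder.encode(list) else { return nil }
    return String(data: data, encoding: .utf8)
}

let list = UserWordList(userword: [
    UserWordGroup(name: "iflytek", words: ["清蒸鲈鱼", "科大讯飞"])
])
let userWordsJSON = makeUserWordsJSON(list)
```

The resulting string is the kind of value the guide passes to `initWithJson:` when it creates the IFlyUserWords object.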