iOS: Integrating iFlytek Voice Wakeup + Speech Recognition for Voice Control

Preface

A recent project needed voice-based control. My first idea was Apple's native Speech framework, but after a long search I could not find a single example of implementing voice wakeup with it. There are plenty of Baidu and iFlytek examples for voice wakeup and for speech recognition, but they all implement the two features separately; none combines them. Hence this post: voice wakeup + speech recognition, combined into a voice-control workflow.

1. Project Environment Configuration

Register on the iFlytek Open Platform, create an application to get an APPID, download the SDK, and configure the project.

iFlytek Open Platform: https://www.xfyun.cn/

Open the link above and sign up for an account.


After registering, go to the console and create an application to get the appid.
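The code later in this post reads the APPID from a Definition.h header that is not shown here. A minimal hypothetical version is just one define (replace the placeholder with the appid of your own application):

// Definition.h (hypothetical sketch; use the appid from your own console application)
#define APPID_VALUE @"xxxxxxxx"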


Open the application, select Voice Wakeup, and set the wakeup words; you can set several, separated by commas.


Once that is set, go straight to the SDK download center.


Download the Voice Wakeup and Offline Command Word Recognition iOS SDK.


Copy the lib folder from the offline command word recognition demo into your own project directory.

Open the project and add the following libraries:

| Library | Scope | Purpose |
| --- | --- | --- |
| iflyMSC.framework | Required | iFlytek Open Platform static library. |
| libz.tbd | Required | Compression and encryption algorithms. |
| AVFoundation.framework | Required | System audio recording and playback. |
| SystemConfiguration.framework | System library | System settings. |
| Foundation.framework | Required | Base library. |
| CoreTelephony.framework | Required | Telephony-related operations. |
| AudioToolbox.framework | Required | System audio recording and playback. |
| UIKit.framework | Required | UI display. |
| CoreLocation.framework | Required | Location. |
| Contacts.framework | Required | Contacts. |
| AddressBook.framework | Required | Contacts. |
| QuartzCore.framework | Required | UI display. |
| CoreGraphics.framework | Required | UI display. |
| libc++.tbd | Required | C++ support. |

In Info.plist, add the Privacy - Microphone Usage Description key with a description of why the app needs the microphone.
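For reference, this is the same setting as a raw Info.plist entry (the description string is only an example; write your own):

<key>NSMicrophoneUsageDescription</key>
<string>The app uses the microphone for voice wakeup and voice control.</string>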


Make sure every framework above is linked; otherwise you will hit undefined-symbol linker errors at build time.


Also set the Enable Bitcode build setting to NO; without this the build fails with a bitcode-related link error.

If the deployment target is too low, compilation fails as well; setting it to iOS 10.0 resolves the error.

With that, the basic configuration is done; a bit long-winded, but worth walking through.

2. Combining Voice Wakeup + Speech Recognition for Voice Control

SpeechControlManagers.h

//
//  SpeechControlManagers.h
//  SpeechControlDemo
//
//  Created by FF on 2020/11/10.
//  Copyright © 2020 www.ff.com. All rights reserved.
//

#import <Foundation/Foundation.h>
#import "IFlyMSC/IFlyMSC.h"

NS_ASSUME_NONNULL_BEGIN

@interface SpeechControlManagers : NSObject<IFlyVoiceWakeuperDelegate,IFlySpeechRecognizerDelegate>
/** Speech recognizer */
@property (nonatomic, strong) IFlySpeechRecognizer * iflySpeechRecognizer;
/** Voice wakeup engine */
@property (nonatomic, strong) IFlyVoiceWakeuper    * iflyVoiceWakeuper;
/** Result of the current session */
@property (nonatomic, strong) NSMutableString      * curResult;
/** Cancelled */
@property (nonatomic)         BOOL                  isCanceled;
/** Whether the control loop should keep running */
@property (nonatomic)         BOOL                  m_isCanRun;
/** Wakeup state: YES = wakeup listening started, NO = not started */
@property (nonatomic)         BOOL                  isVoiceWakeuperState;
/** Recognition state: YES = recognition started, NO = not started */
@property (nonatomic)         BOOL                  isSpeechRecognizerState;
/** Plays the bundled prompt audio file */
@property (nonatomic, assign) SystemSoundID soundId;

/** Recognition engine type */
@property (nonatomic, strong)         NSString             * engineType;
/** Grammar type */
@property (nonatomic, strong)         NSString             * grammarType;
/** Unique ID for cloud recognition; returned only after the grammar is uploaded */
@property (nonatomic, strong)         NSString             * cloudGrammerid;
/** Unique ID for local recognition; returned only after the grammar is built */
@property (nonatomic, strong)         NSString             * localgrammerId;
/** grm build path */
@property (nonatomic, strong)         NSString             * grammBuildPath;
/** Path to the common.jet resource file */
@property (nonatomic, strong)         NSString             * aitalkResourcePath;
/** Path to the local grammar file */
@property (nonatomic, strong)         NSString             * bnfFilePath;
/** Path to the cloud grammar file */
@property (nonatomic, strong)         NSString             * abnfFilePath;
/** Path to the wakeup-word resource file */
@property (nonatomic, strong)         NSString             * wakupEnginPath;

@property (nonatomic, strong) NSArray              * engineTypesArray;
/** Wakeup words */
@property (nonatomic, strong) NSDictionary         * wakeupWordsDictionary;

/** Singleton */
+(instancetype) shareInstance;
/** Initialize the iFlytek SDK */
- (void) initIFly;
/** Initialize the voice-control objects */
- (void) initVoiceControlObject;
/** Initialize the audio player */
- (void) initAudioPlayer;
/** Build and upload the grammar */
- (void) buildGrammar;
/** Configure the voice wakeup parameters */
- (void) configureVoiceWakeuperParam;
/** Start voice control */
- (void) startVoiceControl;
/** Stop voice control */
- (void) stopVoiceControl;
/** Play the prompt audio */
-(void) startAudioPlayer;
@end

NS_ASSUME_NONNULL_END

SpeechControlManagers.m

//
//  SpeechControlManagers.m
//  SpeechControlDemo
//
//  Created by FF on 2020/11/10.
//  Copyright © 2020 www.ff.com. All rights reserved.
//

#import "SpeechControlManagers.h"
#import "Definition.h"
#import "RecognizerFactory.h"
#import "SpeechDataHelper.h"
#import "SpeechControlDataHelper.h"
#import <AudioToolbox/AudioToolbox.h>
#define kOFFSET_FOR_KEYBOARD 110.0

#define GRAMMAR_TYPE_BNF    @"bnf"
#define GRAMMAR_TYPE_ABNF    @"abnf"
#define GRAMMAR_DICRECTORY  @"/grm"

static SpeechControlManagers* _instance = nil;

@implementation SpeechControlManagers
+(instancetype) shareInstance
{
    
    static dispatch_once_t onceToken;
    dispatch_once(&onceToken, ^{
        _instance = [[self alloc]init];
    });
    return _instance;

}

- (void) initIFly
{
    //Set the log level
    [IFlySetting setLogFile:LVL_NONE];
    
    //Whether to print logs to the console
    [IFlySetting showLogcat:YES];

    //Set the log file path
    NSArray *paths = NSSearchPathForDirectoriesInDomains(NSCachesDirectory, NSUserDomainMask, YES);
    NSString *cachePath = [paths objectAtIndex:0];
    [IFlySetting setLogFilePath:cachePath];
    
    //Set the APPID
    NSString *initString = [[NSString alloc] initWithFormat:@"appid=%@",APPID_VALUE];
    
    [IFlySpeechUtility createUtility:initString];
}

#pragma mark - Initialization

-(void) initParam
{
    //Initialize the resource paths
    NSString *documentsPath = nil;
    NSArray *appArray = NSSearchPathForDirectoriesInDomains(NSDocumentDirectory, NSUserDomainMask, YES);
    if ([appArray count] > 0) {
        documentsPath = [appArray objectAtIndex:0];
    }
    NSString *appPath = [[NSBundle mainBundle] resourcePath];
    
    //Directory where the local grammar build output is stored
    _grammBuildPath = [documentsPath stringByAppendingString:GRAMMAR_DICRECTORY];
    
    //Local recognition resource (the fo| prefix is required by the SDK)
    _aitalkResourcePath = [[NSString alloc] initWithFormat:@"fo|%@/aitalk/common.jet",appPath] ;
    
    //Local grammar file
    _bnfFilePath = [[NSString alloc] initWithFormat:@"%@/bnf/call.bnf",appPath];
    
    //Cloud grammar file
    _abnfFilePath = [[NSString alloc] initWithFormat:@"%@/oneshotbnf/oneshot_cloud.abnf",appPath];
    
    //Wakeup resource downloaded with your appid
    _wakupEnginPath = [[NSString alloc] initWithFormat:@"%@/ivw/%@.jet",appPath,APPID_VALUE];
}

- (void) initAudioPlayer
{
    //URL of the prompt audio file operating_face_rec.wav
    NSURL *audioPath = [[NSURL alloc] initFileURLWithPath:[[NSBundle mainBundle] pathForResource:@"operating_face_rec" ofType:@"wav"]];
    
    //C API call: register the sound with the system
    AudioServicesCreateSystemSoundID((__bridge CFURLRef)audioPath, &_soundId);
    
    //Add a completion callback (none needed here)
    AudioServicesAddSystemSoundCompletion(_soundId,NULL, NULL, NULL,NULL);
}

- (void) initVoiceControlObject
{
    
    [self initParam];//Initialize the resource paths
    self.wakeupWordsDictionary = [[NSDictionary alloc] initWithObjectsAndKeys:@"小扣小扣",@"0",nil];
    
    [self initAudioPlayer];//Initialize the audio player
    
    //Default to local recognition
    self.engineType = [IFlySpeechConstant TYPE_LOCAL];
    self.grammarType = GRAMMAR_TYPE_BNF;
    self.isCanceled = NO;
    self.isVoiceWakeuperState = NO;
    self.isSpeechRecognizerState = NO;
    self.m_isCanRun = YES;
    self.localgrammerId = nil;
    self.cloudGrammerid = nil;

    //Create the voice wakeup engine
    self.iflyVoiceWakeuper = [IFlyVoiceWakeuper sharedInstance];
    self.iflyVoiceWakeuper.delegate = self;
    [_iflyVoiceWakeuper setParameter:@"" forKey:[IFlySpeechConstant PARAMS]];
    
    //Create the speech recognizer
    self.iflySpeechRecognizer = [RecognizerFactory CreateRecognizer:self Domain:@"asr"];
    [_iflySpeechRecognizer setParameter:@"" forKey:[IFlySpeechConstant PARAMS]];
    [self.iflySpeechRecognizer setParameter:@"1" forKey:@"asr_wbest"];
    [self createDirec:GRAMMAR_DICRECTORY];//Create the grammar build directory
    
    //Build and upload the grammar
    [self buildGrammar];
    
    //Configure the voice wakeup parameters
    [self configureVoiceWakeuperParam];
    
    __weak typeof(self) weakSelf = self;
    /* Run the control loop on a concurrent queue */
    dispatch_async(dispatch_queue_create("net.bujige.testQueue", DISPATCH_QUEUE_CONCURRENT), ^{
        //Start voice control (this blocks the queue while the loop runs)
        [weakSelf startVoiceControl];
    });
}



-(void) startAudioPlayer
{
    //Play the prompt sound
    AudioServicesPlayAlertSound(_soundId);
}
//-(void)sound{
//
//    SystemSoundID soundID;
//    //Get the audio file path from NSBundle
//    NSString *soundFile = [[NSBundle mainBundle] pathForResource:@"operating_face_rec" ofType:@"wav"];
//    //Create the SystemSoundID; pass its address (&). The first parameter is a CFURLRef, so bridge the file URL built from the path above.
//    AudioServicesCreateSystemSoundID((__bridge CFURLRef)[NSURL fileURLWithPath:soundFile], &soundID);
//    //Play the alert sound (with vibration)
//    AudioServicesPlayAlertSound(soundID);
//    //Play the system sound
//    AudioServicesPlaySystemSound(soundID);
//}

- (void)startVoiceControl
{
    while (_m_isCanRun) {
        //Start wakeup listening if it is not already running
        if (!self.isVoiceWakeuperState) {
            self.isVoiceWakeuperState = [_iflyVoiceWakeuper startListening];
            if(self.isVoiceWakeuperState)
            {
                NSLog(@"==>wakeup listening started");
            }
            else
            {
//                NSLog(@"Failed to start the wakeup service, please retry later!");
            }
        }
        //Start recognition only after the grammar upload succeeded and a wakeup fired
        if ( [self isCommitted] && self.isSpeechRecognizerState) {
            [self startAudioPlayer];

            //Sleep for two seconds so the prompt audio can finish playing
            [NSThread sleepForTimeInterval:2];
            
            BOOL ret = [_iflySpeechRecognizer startListening];
            if (ret) {
                NSLog(@"speech recognition started");
                self.isSpeechRecognizerState = NO;
            }
            else
            {
//                NSLog(@"failed to start speech recognition");
            }
        }
        usleep(10);
    }
}

/** Stop voice control */
- (void) stopVoiceControl
{
     [_iflyVoiceWakeuper stopListening];
}

#pragma mark - upload grammar

-(void) buildGrammar
{
    NSString *grammarContent = nil;

    //Set the engine type: cloud or local

    [[IFlySpeechUtility getUtility] setParameter:@"asr" forKey:[IFlyResourceUtil ENGINE_START]];

    [_iflySpeechRecognizer setParameter:@"" forKey:[IFlySpeechConstant PARAMS]];
    [_iflySpeechRecognizer setParameter:@"utf-8" forKey:[IFlySpeechConstant TEXT_ENCODING]];
    [_iflySpeechRecognizer setParameter:self.engineType forKey:[IFlySpeechConstant ENGINE_TYPE]];
    [_iflySpeechRecognizer setParameter:self.grammarType forKey:[IFlyResourceUtil GRAMMARTYPE]];
    [_iflySpeechRecognizer setParameter:@"asr" forKey:[IFlySpeechConstant IFLY_DOMAIN]];
    
    if([self.engineType isEqualToString: [IFlySpeechConstant TYPE_LOCAL]])
    {//Local recognition
        grammarContent = [self readFile:_bnfFilePath];

        [_iflySpeechRecognizer setParameter:_grammBuildPath forKey:[IFlyResourceUtil GRM_BUILD_PATH]];
        [_iflySpeechRecognizer setParameter:_aitalkResourcePath forKey:[IFlyResourceUtil ASR_RES_PATH]];
        [_iflySpeechRecognizer setParameter: @"utf-8" forKey:@"result_encoding"];
        [_iflySpeechRecognizer setParameter:@"json" forKey:[IFlySpeechConstant RESULT_TYPE]];
    }
    else
    {//Cloud recognition
        grammarContent = [self readFile:_abnfFilePath];
    }

    //Start uploading the grammar
    __weak typeof(self) weakSelf = self;
    [_iflySpeechRecognizer buildGrammarCompletionHandler:^(NSString * grammerID, IFlySpeechError *error){

        dispatch_async(dispatch_get_main_queue(), ^{
            
            if (![error errorCode]) {
                NSLog(@"=====>grammar upload succeeded, errorCode=%d",[error errorCode]);
            }
            else {
                NSLog(@"=====>grammar upload failed, error code: %d",error.errorCode);
            }
            if ([weakSelf.engineType isEqualToString: [IFlySpeechConstant TYPE_LOCAL]]) {
                weakSelf.localgrammerId = grammerID;
                [weakSelf.iflySpeechRecognizer setParameter:weakSelf.localgrammerId  forKey:[IFlySpeechConstant LOCAL_GRAMMAR]];
            }
            else{
                weakSelf.cloudGrammerid = grammerID;
                [weakSelf.iflySpeechRecognizer setParameter:weakSelf.cloudGrammerid forKey:[IFlySpeechConstant CLOUD_GRAMMAR]];
            }
        });

    }grammarType:self.grammarType grammarContent:grammarContent];

}
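/*
 The call.bnf grammar read above ships with the demo and is not listed in this post.
 As a rough illustration of the iFlytek BNF format, a minimal "make a call" grammar
 might look like this (contents are illustrative only, not the demo's actual file):

 #BNF+IAT 1.0 UTF-8;
 !grammar call;
 !slot <contact>;
 !slot <callPre>;
 !start <callStart>;
 <callStart>:<callPre><contact>;
 <callPre>:打电话给|呼叫;
 <contact>:张三|李四;
 */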

- (void)configureVoiceWakeuperParam
{
    //Wakeup threshold for wakeup word 0 (lower = easier to trigger, but more false wakeups)
    [_iflyVoiceWakeuper setParameter:@"0:1450" forKey:[IFlySpeechConstant IVW_THRESHOLD]];
    //Wakeup resource path
    NSString *ivwResourcePath = [IFlyResourceUtil generateResourcePath:_wakupEnginPath];
    [self.iflyVoiceWakeuper setParameter:ivwResourcePath forKey:@"ivw_res_path"];
    [_iflyVoiceWakeuper setParameter:self.engineType forKey:[IFlySpeechConstant ENGINE_TYPE]];
    [_iflyVoiceWakeuper setParameter:@"utf8" forKey:[IFlySpeechConstant RESULT_ENCODING]];
    //End-of-speech silence timeout, in milliseconds
    [self.iflyVoiceWakeuper setParameter:@"2000" forKey:[IFlySpeechConstant VAD_EOS]];
    //Pure wakeup mode (as opposed to oneshot)
    [self.iflyVoiceWakeuper setParameter:@"wakeup" forKey:[IFlySpeechConstant IVW_SST]];
    //End the session after each wakeup instead of keeping it alive
    [self.iflyVoiceWakeuper setParameter:@"0" forKey:[IFlySpeechConstant KEEP_ALIVE]];
    [self.iflyVoiceWakeuper setParameter:@"json" forKey:[IFlySpeechConstant RESULT_TYPE]];
    [_iflyVoiceWakeuper setParameter:@"asr" forKey:[IFlySpeechConstant IFLY_DOMAIN]];
}


-(BOOL) createDirec:(NSString *) direcName
{
    NSArray *paths = NSSearchPathForDirectoriesInDomains(NSDocumentDirectory, NSUserDomainMask, YES);
    NSString *documentsDirectory = [paths objectAtIndex:0];
    
    NSFileManager *fileManager = [NSFileManager defaultManager];
    NSString *subDirectory = [documentsDirectory stringByAppendingPathComponent:direcName];
    
    BOOL ret = YES;
    if(![fileManager fileExistsAtPath:subDirectory])
    {
        ret = [fileManager createDirectoryAtPath:subDirectory withIntermediateDirectories:YES attributes:nil error:nil];
    }
    
    return ret;
}

/*
 Read the grammar file for cloud or local recognition
 */
-(NSString *)readFile:(NSString *)filePath
{
    NSData *reader = [NSData dataWithContentsOfFile:filePath];
    return [[NSString alloc] initWithData:reader
                                 encoding:NSUTF8StringEncoding];
}


-(BOOL) isCommitted
{
    if ([self.engineType isEqualToString:[IFlySpeechConstant TYPE_LOCAL]]) {
        if(_localgrammerId == nil || _localgrammerId.length ==0)
            return NO;
    }
    else{
        if (_cloudGrammerid == nil || _cloudGrammerid.length == 0) {
            return NO;
        }
    }
    return YES;
}


#pragma mark - IFlyVoiceWakeuperDelegate
/** Volume changed */
- (void) onVolumeChanged: (int)volume
{
//    NSString * vol = [NSString stringWithFormat:@"%@:%d", NSLocalizedString(@"T_RecVol", nil),volume];
}

/** Recording started */
- (void) onBeginOfSpeech
{
    NSLog(@"onBeginOfSpeech");
}

/** Recording stopped */
- (void) onEndOfSpeech
{
    
    NSLog(@"onEndOfSpeech");
}

/**
 Session completed
 */
- (void) onCompleted:(IFlySpeechError *) error
{
    NSLog(@"error=%d",[error errorCode]);
}

/**
 Called after a successful wakeup
 */
-(void) onResult:(NSMutableDictionary *)resultDic
{
    
    NSLog(@"I'm here, go ahead");
    self.isSpeechRecognizerState = YES;
    
}
/**
 Called with recognition results
 */
- (void) onResults:(NSArray *) results isLast:(BOOL)isLast
{
    NSMutableString *result = [[NSMutableString alloc] init];
    NSDictionary *dic = results[0];
    for (NSString *key in dic) {
        [result appendFormat:@"%@",key];
        NSLog(@"----%@",key);
        if([self.engineType isEqualToString:[IFlySpeechConstant TYPE_LOCAL]])
        {
            [[SpeechControlDataHelper shareInstance]analysisLocalSpeechControlData:result];
        }
        else
        {
            [[SpeechControlDataHelper shareInstance]analysisCloudSpeechControlData:result];
        }
    }
    
    //Allow the control loop to restart wakeup listening
    self.isVoiceWakeuperState = NO;
}
@end
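RecognizerFactory and SpeechControlDataHelper come from the demo project and are not listed in this post. To give a rough idea of what they do, here are two hedged sketches; the method names match the calls above, but the bodies are reconstructions, not the demo's actual code:

// RecognizerFactory (hypothetical reconstruction)
@implementation RecognizerFactory
+ (IFlySpeechRecognizer *)CreateRecognizer:(id<IFlySpeechRecognizerDelegate>)delegate Domain:(NSString *)domain
{
    IFlySpeechRecognizer *recognizer = [IFlySpeechRecognizer sharedInstance];
    recognizer.delegate = delegate;
    [recognizer setParameter:domain forKey:[IFlySpeechConstant IFLY_DOMAIN]];
    return recognizer;
}
@end

// SpeechControlDataHelper (sketch of local-result parsing only).
// A local grammar result is typically JSON like
// {"sc":66,"ws":[{"cw":[{"w":"打电话给","sc":66}]}, ...]}; the keys below assume that format.
- (void)analysisLocalSpeechControlData:(NSString *)result
{
    NSData *data = [result dataUsingEncoding:NSUTF8StringEncoding];
    NSDictionary *dic = [NSJSONSerialization JSONObjectWithData:data options:0 error:nil];
    NSMutableString *text = [NSMutableString string];
    for (NSDictionary *ws in dic[@"ws"]) {
        NSDictionary *cw = [ws[@"cw"] firstObject];   // best-scoring candidate
        if (cw[@"w"]) [text appendString:cw[@"w"]];
    }
    NSLog(@"recognized command: %@", text);
    // ...map the recognized text to your own control actions here
}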

Initialize the iFlytek SDK in AppDelegate.m:

- (BOOL)application:(UIApplication *)application didFinishLaunchingWithOptions:(NSDictionary *)launchOptions {
    // Override point for customization after application launch.
    [[SpeechControlManagers shareInstance]initIFly];//Initialize the iFlytek SDK
    return YES;
}

Then call initVoiceControlObject to start voice control:

- (void)viewDidLoad {
    [super viewDidLoad];
    // Do any additional setup after loading the view.
    [[SpeechControlManagers shareInstance]initVoiceControlObject];//Initialize the voice-control objects
}
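When leaving the page you will also want to stop listening. A minimal sketch (note that stopVoiceControl above only stops the wakeuper; setting m_isCanRun to NO is what actually exits the control loop):

- (void)viewWillDisappear:(BOOL)animated {
    [super viewWillDisappear:animated];
    SpeechControlManagers *manager = [SpeechControlManagers shareInstance];
    manager.m_isCanRun = NO;        // let the while loop in startVoiceControl exit
    [manager stopVoiceControl];     // stop wakeup listening
}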

The key point of this post is combining voice wakeup and speech recognition into a single flow, with a spoken response played after wakeup. The iFlytek SDK does ship a "oneshot" mode, which is exactly wakeup + recognition, but oneshot does not interrupt recording after the wake word, so there is no window in which to play an audio response. That is why the two features are re-combined manually here. The result-parsing part of the code above references classes that are missing from this post; you can comment those calls out, or download the demo if you need them.
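For completeness: if you do not need the audio prompt, switching to the SDK's built-in oneshot mode appears to be mainly a matter of changing the IVW_SST parameter set in configureVoiceWakeuperParam (additional recognition parameters may also be required; check the SDK's oneshot demo):

//Hedged sketch: oneshot = wakeup followed immediately by recognition, with no gap for a prompt
[self.iflyVoiceWakeuper setParameter:@"oneshot" forKey:[IFlySpeechConstant IVW_SST]];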

Finally, whenever the iFlytek framework reports an error code, look it up with the link below; it makes locating the problem much faster:
SDK & API error code lookup: https://www.xfyun.cn/document/error-code?code=23108
