讯飞星火大模型+讯飞TTS语音听写 Android Studio开发初步实现

程序效果演示

本程序只是单纯的把API摆出来裸跑,没做任何线程优化。纯用来学习的

程序运行机制

程序API概览

程序源码链接

https://github.com/VIII-Cygnusx/Android_Developing/tree/1_iflytek_STT_LLMicon-default.png?t=N7T8https://github.com/VIII-Cygnusx/Android_Developing/tree/1_iflytek_STT_LLM

申请创建平台API调用

进入科大讯飞开放平台讯飞开放平台-以语音交互为核心的人工智能开放平台 (xfyun.cn)

创建一个API 没有号的先注册哈

(普通用户星火大模型每个版本只能买一次,我之前买了这个版本在我另外的项目)

下载SDK

然后下载语音听写的SDK 与 星火大模型的SDK(注意你下载的这些SDK包是与你的APPID深度绑定的,直接在相同的项目换APPID是会直接导致API关闭的)

创建一个Android Studio项目

导入语音听写&星火API接口SDK包

左边文件目录为你的Android studio工程目录,右边目录为你下载的两个SDK的目录

(Android Studio 导入包的方式有很多,这里演示的导包方法不是很正规,如果想看正规的导包方法请去其他地方请教)

导入后就像这样

然后找到app目录下面的build.gradle.kts文件添加如下内容

android {
    .....................
    .....................

    sourceSets {
        getByName("main") {
            jniLibs.srcDirs("libs")
        }
    }


}

dependencies {
    .....................
    .....................    

    implementation(files("libs\\SparkChain.aar"))
    implementation(files("libs\\Msc.jar"))

}

 点击导航栏File->Sync Project with Gradle Files后如果生成如下文件则证明导入成功

 接入星火API & 语音听写API

activity_main.xml(直接复制?)

<?xml version="1.0" encoding="utf-8"?>
<androidx.constraintlayout.widget.ConstraintLayout xmlns:android="http://schemas.android.com/apk/res/android"
    xmlns:app="http://schemas.android.com/apk/res-auto"
    xmlns:tools="http://schemas.android.com/tools"
    android:layout_width="match_parent"
    android:layout_height="match_parent"
    tools:context=".MainActivity">

    <TextView
        android:id="@+id/TX"
        android:layout_width="188dp"
        android:layout_height="94dp"
        android:text="Show your text"
        app:layout_constraintBottom_toBottomOf="parent"
        app:layout_constraintEnd_toEndOf="parent"
        app:layout_constraintHorizontal_bias="0.174"
        app:layout_constraintStart_toStartOf="parent"
        app:layout_constraintTop_toTopOf="parent"
        app:layout_constraintVertical_bias="0.974" />

    <Button
        android:id="@+id/B_1"
        android:layout_width="145dp"
        android:layout_height="50dp"
        android:onClick="B_1_C"
        android:text="Send"
        app:layout_constraintBottom_toBottomOf="parent"
        app:layout_constraintEnd_toEndOf="parent"
        app:layout_constraintHorizontal_bias="0.853"
        app:layout_constraintStart_toStartOf="parent"
        app:layout_constraintTop_toTopOf="parent"
        app:layout_constraintVertical_bias="0.911" />

    <TextView
        android:id="@+id/RX"
        android:layout_width="359dp"
        android:layout_height="470dp"
        android:text="Result"
        app:layout_constraintBottom_toTopOf="@+id/TX"
        app:layout_constraintEnd_toEndOf="parent"
        app:layout_constraintStart_toStartOf="parent"
        app:layout_constraintTop_toTopOf="parent"
        app:layout_constraintVertical_bias="0.006" />

    <EditText
        android:id="@+id/ETX"
        android:layout_width="357dp"
        android:layout_height="62dp"
        android:ems="10"
        android:gravity="start|top"
        android:hint="write some......"
        android:inputType="textMultiLine"
        app:layout_constraintBottom_toTopOf="@+id/B_1"
        app:layout_constraintEnd_toEndOf="parent"
        app:layout_constraintHorizontal_bias="0.703"
        app:layout_constraintStart_toStartOf="parent"
        app:layout_constraintTop_toTopOf="parent"
        app:layout_constraintVertical_bias="1.0" />

    <Button
        android:id="@+id/B_2"
        android:layout_width="177dp"
        android:layout_height="52dp"
        android:onClick="B_2_C"
        android:text="STT_Send"
        app:layout_constraintBottom_toBottomOf="parent"
        app:layout_constraintEnd_toEndOf="parent"
        app:layout_constraintHorizontal_bias="0.97"
        app:layout_constraintStart_toStartOf="parent"
        app:layout_constraintTop_toBottomOf="@+id/B_1"
        app:layout_constraintVertical_bias="0.0" />

    <TextView
        android:id="@+id/STT_RES"
        android:layout_width="346dp"
        android:layout_height="53dp"
        android:text="STT_RES"
        app:layout_constraintBottom_toBottomOf="parent"
        app:layout_constraintEnd_toEndOf="parent"
        app:layout_constraintHorizontal_bias="0.492"
        app:layout_constraintStart_toStartOf="parent"
        app:layout_constraintTop_toTopOf="@+id/RX"
        app:layout_constraintVertical_bias="0.711" />

</androidx.constraintlayout.wid

AndroidManifest.xml(复制uses-permission标签)

<?xml version="1.0" encoding="utf-8"?>
<manifest xmlns:android="http://schemas.android.com/apk/res/android"
    xmlns:tools="http://schemas.android.com/tools">

    <!-- 连接网络权限,用于执行云端语音能力 -->
    <uses-permission android:name="android.permission.INTERNET" /> <!-- 获取手机录音机使用权限,听写、识别、语义理解需要用到此权限 -->
    <uses-permission android:name="android.permission.RECORD_AUDIO" /> <!-- 读取网络信息状态 -->
    <uses-permission android:name="android.permission.ACCESS_NETWORK_STATE" /> <!-- 获取当前wifi状态 -->
    <uses-permission android:name="android.permission.ACCESS_WIFI_STATE" />
    <uses-permission android:name="android.permission.READ_PHONE_STATE" tools:node="remove" />

    <application ...............>
        ....................
        ....................
    </application>

</manifest>

MainActivity.java(直接复制?)

package com.example.myapplication;

import androidx.appcompat.app.AppCompatActivity;
import androidx.core.app.ActivityCompat;
import androidx.core.content.ContextCompat;

import android.Manifest;
import android.content.pm.PackageManager;
import android.os.Bundle;
import android.os.Environment;
import android.view.View;
import android.widget.Button;
import android.widget.EditText;
import android.widget.ImageView;
import android.widget.TextView;


//Spark-v3.5
import com.iflytek.sparkchain.core.LLM;
import com.iflytek.sparkchain.core.LLMConfig;
import com.iflytek.sparkchain.core.LLMOutput;
import com.iflytek.sparkchain.core.SparkChain;
import com.iflytek.sparkchain.core.SparkChainConfig;
import com.iflytek.sparkchain.utils.AESUtil;

import java.security.PrivilegedAction;
import java.util.ArrayList;


//TTS
import com.iflytek.cloud.ErrorCode;
import com.iflytek.cloud.InitListener;
import com.iflytek.cloud.RecognizerListener;
import com.iflytek.cloud.RecognizerResult;
import com.iflytek.cloud.SpeechConstant;
import com.iflytek.cloud.SpeechError;
import com.iflytek.cloud.SpeechRecognizer;
import com.iflytek.cloud.SpeechUtility;
import com.iflytek.cloud.ui.RecognizerDialog;
import com.iflytek.cloud.ui.RecognizerDialogListener;

import org.json.JSONException;
import org.json.JSONObject;

import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;

import kotlinx.coroutines.sync.Mutex;


public class MainActivity extends AppCompatActivity {

    private SpeechRecognizer                STT;                //
    private RecognizerDialog                STT_Dialog;         //
    private HashMap<String, String>         STT_Results;        //

    public SparkChainConfig                 SC_USER_cfg;        //
    public LLMConfig                        SC_LLM_cfg;         //
    public LLMOutput                        SC_LLM_Result;      //
    public LLM                              SC_LLM;             //

    private TextView                        TX;
    private TextView                        RX;
    private EditText                        ETX;
    private TextView                        STT_RES;

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_main);

        TX  = findViewById(R.id.TX);
        RX  = findViewById(R.id.RX);
        ETX = findViewById(R.id.ETX);
        STT_RES = findViewById(R.id.STT_RES);

        Android_Permission_init();
        SC_cfg_init();
        STT_cfg_init();
    }

    //Spark part
    //Spark LLM cfg init
    public void SC_cfg_init(){
        SC_USER_cfg = SparkChainConfig.builder()
                .appID("d925daa0")
                .apiKey("89740f71074865073383d3efceb798fd")
                .apiSecret("OGU5MDFmYzM1N2YxNDQ1ZjVkNzcxYzQ4");
        int res = SparkChain.getInst().init(this, SC_USER_cfg);       //Spark SDK init
        if(res == 0) TX.setText("Engine by Spark-v3.5\npowered by AIUI_IFLYTEK\nversion:1.1.4.5.1.4\n");

        SC_LLM_cfg = LLMConfig.builder()
                .domain("general")
                .url("wss://spark-api.xf-yun.com/v3.1/chat");

        SC_LLM = new LLM(SC_LLM_cfg);                              //Spark LLM param init
    }

    //Start Spark demo
    public void B_1_C(View view){
        RX.append("\n");
        String text = ETX.getText().toString();
        SC_LLM_Result = SC_LLM.run(text);
        if(SC_LLM_Result.getErrCode()==0){
            RX.append(SC_LLM_Result.getContent());
        }else{
            RX.setText("ERROR");
        }
    }

    //Android permission ask for
    private void Android_Permission_init(){
        ArrayList<String> NOPER_List = new ArrayList<String>();
        String              tempList[];
        String              Perm[]     = {android.Manifest.permission.RECORD_AUDIO,              //录音权限
                android.Manifest.permission.ACCESS_NETWORK_STATE,      //络连接信息权限
                android.Manifest.permission.INTERNET,                  //连网权限
                Manifest.permission.WRITE_EXTERNAL_STORAGE};   //应用写入设备的外部存储

        //is permission granted ? otherwise add to 'List'
        for(String P : Perm)
            if(PackageManager.PERMISSION_GRANTED!= ContextCompat.checkSelfPermission(this,P))
                NOPER_List.add(P);

        tempList = new String[NOPER_List.size()];
        if(!NOPER_List.isEmpty())
            ActivityCompat.requestPermissions(this,NOPER_List.toArray(tempList),123);

    }


    //TTS part
    //STT cfg init
    public void STT_cfg_init(){
        SpeechUtility.createUtility(this, SpeechConstant.APPID +"=92d5c0e8");                       //Create STT server,where the SDK package is deeply bound to this APPID parameter
        STT = SpeechRecognizer.createRecognizer(MainActivity.this, mInitListener);                  //Initialize STT server
        STT_params_init();                                                                          //Initialize STT params
        STT_Results = new LinkedHashMap<String, String>();                                          //Initialize Hashmap
        STT_Dialog = new RecognizerDialog(MainActivity.this, mInitListener);
    }

    //STT start button
    public void B_2_C(View v) {
        STT_Results.clear();                                    //Clean dat
        STT_Dialog.setListener(mRecognizerDialogListener);      //set Dialog event Listener
        STT_Dialog.show();                                      //Show [IFLYTEK] API Interactive animations
    }

    //Initialize listener(only show ERROR messages)
    private InitListener mInitListener = new InitListener() {
        @Override
        public void onInit(int code) {
            if (code != ErrorCode.SUCCESS)
                RX.setText("ERROR");
        }
    };

    //output the remote server result(iflytek API interface function)
    private RecognizerDialogListener mRecognizerDialogListener = new RecognizerDialogListener() {

        public void onResult(RecognizerResult results, boolean isLast) {

            //RX.append(results.getResultString());     //Print Original json data
            printResult(results);                     //show after analyze data

        }

        public void onError(SpeechError error) {
        }
    };

    //JSON data analyze
    private void printResult(RecognizerResult results) {
        RX.append("\n");
        String text = JsonParser.parseIatResult(results.getResultString());
        String sn = null;
        // 读取json结果中的sn字段
        try {
            JSONObject resultJson = new JSONObject(results.getResultString());
            sn = resultJson.optString("sn");
        } catch (JSONException e){
            e.printStackTrace();
        }

        STT_Results.put(sn, text);

        StringBuffer resultBuffer = new StringBuffer();
        for (String key : STT_Results.keySet())
            resultBuffer.append(STT_Results.get(key));

        String res  = resultBuffer.toString();
        STT_RES.setText(res);//听写结果显示
        SC_LLM_Result = SC_LLM.run(res);
        if(SC_LLM_Result.getErrCode()==0){
            RX.append(SC_LLM_Result.getContent());
        }else{
            RX.setText("ERROR");
        }
    }

    //STT Param set
    public void STT_params_init() {
        STT.setParameter(SpeechConstant.CLOUD_GRAMMAR,  null );
        STT.setParameter(SpeechConstant.SUBJECT,        null );
        STT.setParameter(SpeechConstant.PARAMS,         null);
        STT.setParameter(SpeechConstant.ENGINE_TYPE,    "cloud");
        STT.setParameter(SpeechConstant.RESULT_TYPE,    "json");
        STT.setParameter(SpeechConstant.LANGUAGE,       "zh_cn");
        STT.setParameter(SpeechConstant.ACCENT,         "mandarin");
        STT.setParameter(SpeechConstant.VAD_BOS,        "4000");
        STT.setParameter(SpeechConstant.VAD_EOS,        "1000");
        STT.setParameter(SpeechConstant.ASR_PTT,        "1");
        STT.setParameter(SpeechConstant.AUDIO_FORMAT,   "wav");
        STT.setParameter(SpeechConstant.ASR_AUDIO_PATH, Environment.getExternalStorageDirectory() + "/msc/iat.wav");
    }

}

JsonParser.java(直接复制?)

添加此行代码跟你的MainActivity.java同目录

package com.example.myapplication;

import org.json.JSONArray;
import org.json.JSONObject;
import org.json.JSONTokener;

/**
 * Json结果解析类
 */
public class JsonParser {

	public static String parseIatResult(String json) {
		StringBuffer ret = new StringBuffer();
		try {
			JSONTokener tokener = new JSONTokener(json);
			JSONObject joResult = new JSONObject(tokener);

			JSONArray words = joResult.getJSONArray("ws");
			for (int i = 0; i < words.length(); i++) {
				// 转写结果词,默认使用第一个结果
				JSONArray items = words.getJSONObject(i).getJSONArray("cw");
				JSONObject obj = items.getJSONObject(0);
				ret.append(obj.getString("w"));
//				如果需要多候选结果,解析数组其他字段
//				for(int j = 0; j < items.length(); j++)
//				{
//					JSONObject obj = items.getJSONObject(j);
//					ret.append(obj.getString("w"));
//				}
			}
		} catch (Exception e) {
			e.printStackTrace();
		} 
		return ret.toString();
	}
	
	public static String parseGrammarResult(String json) {
		StringBuffer ret = new StringBuffer();
		try {
			JSONTokener tokener = new JSONTokener(json);
			JSONObject joResult = new JSONObject(tokener);

			JSONArray words = joResult.getJSONArray("ws");
			for (int i = 0; i < words.length(); i++) {
				JSONArray items = words.getJSONObject(i).getJSONArray("cw");
				for(int j = 0; j < items.length(); j++)
				{
					JSONObject obj = items.getJSONObject(j);
					if(obj.getString("w").contains("nomatch"))
					{
						ret.append("没有匹配结果.");
						return ret.toString();
					}
					ret.append("【结果】" + obj.getString("w"));
					ret.append("【置信度】" + obj.getInt("sc"));
					ret.append("\n");
				}
			}
		} catch (Exception e) {
			e.printStackTrace();
			ret.append("没有匹配结果.");
		} 
		return ret.toString();
	}
	
	public static String parseLocalGrammarResult(String json) {
		StringBuffer ret = new StringBuffer();
		try {
			JSONTokener tokener = new JSONTokener(json);
			JSONObject joResult = new JSONObject(tokener);

			JSONArray words = joResult.getJSONArray("ws");
			for (int i = 0; i < words.length(); i++) {
				JSONArray items = words.getJSONObject(i).getJSONArray("cw");
				for(int j = 0; j < items.length(); j++)
				{
					JSONObject obj = items.getJSONObject(j);
					if(obj.getString("w").contains("nomatch"))
					{
						ret.append("没有匹配结果.");
						return ret.toString();
					}
					ret.append("【结果】" + obj.getString("w"));
					ret.append("\n");
				}
			}
			ret.append("【置信度】" + joResult.optInt("sc"));

		} catch (Exception e) {
			e.printStackTrace();
			ret.append("没有匹配结果.");
		} 
		return ret.toString();
	}

	public static String parseTransResult(String json,String key) {
		StringBuffer ret = new StringBuffer();
		try {
			JSONTokener tokener = new JSONTokener(json);
			JSONObject joResult = new JSONObject(tokener);
			String errorCode = joResult.optString("ret");
			if(!errorCode.equals("0")) {
				return joResult.optString("errmsg");
			}
			JSONObject transResult = joResult.optJSONObject("trans_result");
			ret.append(transResult.optString(key));
			/*JSONArray words = joResult.getJSONArray("results");
			for (int i = 0; i < words.length(); i++) {
				JSONObject obj = words.getJSONObject(i);
				ret.append(obj.getString(key));
			}*/
		} catch (Exception e) {
			e.printStackTrace();
		}
		return ret.toString();
	}
}

到此所有工作准备完毕,你就可以正常的与星火大模型进行语音对话或者文字输入对话了

目前源码已更新至github

参考链接

Android 科大讯飞语音识别(详细步骤+源码)_android讯飞实时语音识别-CSDN博客

语音听写 Android SDK 文档 | 讯飞开放平台文档中心 (xfyun.cn)

Spark Android SDK接入文档 | 讯飞开放平台文档中心 (xfyun.cn)


评论 4
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值