This week I worked on speech emotion recognition.
Since our human-computer interaction teacher had recommended APIs in this area, I chose Emokit. The company specializes in emotion recognition, covering affective computing based on speech, facial expressions, heart rate, and more.
1. First, apply for an ID. This is straightforward and was approved quickly.
Then download the corresponding SDK; it includes a Manual and demos that helped me write the code.
2. Preparation
(1) Create a new project and import the SDK. Copy emokitsdk.jar from the libs directory of the development kit into the Android project's libs directory (create the libs directory if the project does not have one), as shown in the figure below:
(2) Add permissions: add the following permissions to the project's AndroidManifest.xml (since I only use speech emotion recognition, the camera permissions are not needed):
<uses-permission android:name="android.permission.WAKE_LOCK"/>
<!-- Record audio -->
<uses-permission android:name="android.permission.RECORD_AUDIO" />
<!-- Network state -->
<uses-permission android:name="android.permission.INTERNET" />
<uses-permission android:name="android.permission.ACCESS_NETWORK_STATE" />
<uses-permission android:name="android.permission.ACCESS_WIFI_STATE" />
<uses-permission android:name="android.permission.CHANGE_NETWORK_STATE" />
<uses-permission android:name="android.permission.READ_PHONE_STATE" />
<!-- Write to the SD card -->
<uses-permission android:name="android.permission.WRITE_EXTERNAL_STORAGE" />
(3) Add the AID and KEY: initialization requires the AID and KEY obtained from the Emokit developer center. Configure them in AndroidManifest.xml as follows:
<meta-data
android:name="EMOKIT_AID"
android:value="your AID" />
<meta-data
android:name="EMOKIT_KEY"
android:value="your KEY" />
<meta-data
android:name="EMOKIT_RecordTaskAnimation"
android:value="1" />
EMOKIT_RecordTaskAnimation controls whether an interaction dialog is shown when the voice button is invoked; the default value 1 enables it, and 0 disables it.
(4) Initialize the SDK:
Add the SDK initialization code.
In the onCreate method of the entry Activity, call SDKAppInit.createInstance(this); to initialize the SDK. Example:
@Override
protected void onCreate(Bundle savedInstanceState) {
super.onCreate(savedInstanceState);
SDKAppInit.createInstance(this);
}
Enable debug mode:
To see log output while integrating and debugging, call SDKAppInit.setDebugMode(true) in the onCreate method of the entry Activity; the default is false.
3. Writing the Activity
First, create a SpeechEmotionDetect object in the onCreate method:
speechEmotionDetect = SpeechEmotionDetect.createRecognizer(this, mInitListener);
Then bind a click listener to the recognition button; when it is clicked, the speechEmotionDetect object starts listening with mSpeechEmotionListener:
stplayer.setOnClickListener(new OnClickListener() {
    @Override
    public void onClick(View v) {
        mIatResults = new ArrayList<String>();
        emoResults = new ArrayList<String>();
        emoResulText.setText("");
        speechResultText.setText("");
        speechEmotionDetect.startListening(mSpeechEmotionListener, SDKConstant.RC_TYPE_5, false, "52c8cef6");
    }
});
The speech emotion listener mSpeechEmotionListener is an interface: onEmotionResult returns the emotion type (calm, sad, angry, happy, afraid), and onSpeechResult returns the recognized speech content. However, speech content recognition never succeeded for me.
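The if/else chain in printEmotionResult (section 6) maps the single-letter rc_main codes to emotion labels. As a standalone sketch, the same mapping can be written as a lookup table; the class name EmotionCodes and the English labels are my own choices, while the code letters come from the demo:

```java
import java.util.Map;

public class EmotionCodes {
    // Single-letter codes from the "rc_main" field of the emotion result,
    // as observed in the Emokit demo; the English labels are my translation.
    private static final Map<String, String> CODE_TO_EMOTION = Map.of(
            "K", "calm",
            "C", "sad",
            "Y", "angry",
            "M", "happy",
            "W", "afraid");

    public static String label(String code) {
        return CODE_TO_EMOTION.getOrDefault(code, "unknown");
    }
}
```

Unlike the if/else version, this variant makes the fallback explicit: any code outside the five known letters is reported as "unknown" instead of being echoed raw.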
4. Application demo
5. Evaluation
Not only did speech content recognition fail, but the emotion recognition was also quite inaccurate: recognizing several recordings of the same sentence produced different results each time. The teacher did warn that the results might not be accurate. My guess is that its emotion taxonomy is too fine-grained; after all, our application does not need such detailed emotion categories. So I plan to try Tencent Cloud's SDK, which classifies emotion into only three kinds: positive, negative, and neutral. I suspect its results may be more accurate.
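If a three-class scheme really is enough for our application, Emokit's five labels could also be collapsed client-side before display. The grouping below is a hypothetical sketch of my own, not part of either SDK:

```java
import java.util.Map;

public class CoarseEmotion {
    // Hypothetical regrouping of the five Emokit labels into three coarse
    // categories (positive / negative / neutral); my own assumption.
    private static final Map<String, String> FINE_TO_COARSE = Map.of(
            "happy", "positive",
            "calm", "neutral",
            "sad", "negative",
            "angry", "negative",
            "afraid", "negative");

    public static String coarsen(String fineLabel) {
        return FINE_TO_COARSE.getOrDefault(fineLabel, "neutral");
    }
}
```

Comparing this collapsed Emokit output against Tencent's three classes on the same recordings would show whether the inaccuracy really comes from the fine-grained taxonomy.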
6. Complete code
MainActivity.java
import java.util.ArrayList;
import java.util.List;
import org.json.JSONException;
import org.json.JSONObject;
import com.emokit.sdk.InitListener;
import com.emokit.sdk.basicinfo.AdvancedInformation;
import com.emokit.sdk.record.SpeechEmotionListener;
import com.emokit.sdk.record.SpeechEmotionDetect;
import com.emokit.sdk.util.JsonParser;
import com.emokit.sdk.util.SDKAppInit;
import com.emokit.sdk.util.SDKConstant;
import android.os.Bundle;
import android.os.Handler;
import android.os.Message;
import android.app.Activity;
import android.util.Log;
import android.view.View;
import android.view.View.OnClickListener;
import android.widget.Button;
import android.widget.EditText;
import android.content.Context;
public class MainActivity extends Activity {
private Button stplayer;
private SpeechEmotionDetect speechEmotionDetect;
protected Context mcontext;
EditText emoResulText, speechResultText;
String resultt2 = "";
boolean showt2 = false;
// Lists that accumulate the speech and emotion recognition results
private List<String> mIatResults = new ArrayList<String>();
private List<String> emoResults = new ArrayList<String>();
@Override
protected void onCreate(Bundle savedInstanceState) {
super.onCreate(savedInstanceState);
setContentView(R.layout.activity_main);
mcontext = this;
emoResulText = (EditText) findViewById(R.id.voiceshow);
speechResultText = (EditText) findViewById(R.id.faceshow);
SDKAppInit.createInstance(this);
speechEmotionDetect = SpeechEmotionDetect.createRecognizer(this, mInitListener);
stplayer = (Button) findViewById(R.id.isr_recognize);
if (showt2) {
speechResultText.setText(resultt2);
}
stplayer.setOnClickListener(new OnClickListener() {
@Override
public void onClick(View v) {
mIatResults = new ArrayList<String>();
emoResults = new ArrayList<String>();
emoResulText.setText("");
speechResultText.setText("");
speechEmotionDetect.startListening(mSpeechEmotionListener, SDKConstant.RC_TYPE_5, false, "52c8cef6");
}
});
}
/**
 * Initialization listener.
 */
private InitListener mInitListener = new InitListener() {
@Override
public void onInit(int code) {
// Get the device ID
AdvancedInformation pp = AdvancedInformation.getSingleton(mcontext);
SDKAppInit.registerforuid("AndroidSDKDemo",
pp.getp().getSimSerial(), "123456");
}
};
private void printEmotionResult(String results) {
// Log.e("printResult", results);
JSONObject jsonObject;
try {
jsonObject = new JSONObject(results);
int resultcode = jsonObject.getInt("resultcode");
if (resultcode == 200) {
String emoCode = jsonObject.getString("rc_main");
if(emoCode.equals("K")) {
emoCode = "平静";
} else if(emoCode.equals("C")) {
emoCode = "伤感";
} else if(emoCode.equals("Y")) {
emoCode = "生气";
} else if(emoCode.equals("M")) {
emoCode = "开心";
} else if(emoCode.equals("W")) {
emoCode = "害怕";
}
emoResults.add(emoCode);
StringBuffer resultBuffer = new StringBuffer();
for (String iterable_element : emoResults) {
resultBuffer.append(iterable_element).append(", ");
}
Message msg = new Message();
msg.what = 0;
msg.obj = resultBuffer.toString();
mainhandler.sendMessage(msg);
// Log.e("XUNFEI", resultBuffer.toString());
}
} catch (JSONException e) {
e.printStackTrace();
}
}
private void printSpeechResult(String results) {
Log.e("printResult", results);
String text = JsonParser.parseIatResult(results);
String sn = null;
// Read the sn field from the JSON result
try {
JSONObject resultJson = new JSONObject(results);
sn = resultJson.optString("sn");
} catch (JSONException e) {
e.printStackTrace();
}
mIatResults.add(text);
StringBuffer resultBuffer = new StringBuffer();
for (String iterable_element : mIatResults) {
resultBuffer.append(iterable_element);
}
Message msg = new Message();
msg.what = 1;
msg.obj = resultBuffer.toString();
mainhandler.sendMessage(msg);
// Log.e("XUNFEI", resultBuffer.toString());
}
Handler mainhandler = new Handler() {
@Override
public void handleMessage(Message msg) {
switch (msg.what) {
case 0:
emoResulText.setText((String) msg.obj);
break;
case 1:
speechResultText.setText((String) msg.obj);
speechResultText.setVisibility(View.VISIBLE);
break;
case 2:
emoResulText.setText((String) msg.obj);
speechResultText.setVisibility(View.INVISIBLE);
break;
default:
break;
}
};
};
/**
 * Speech emotion listener.
 */
private SpeechEmotionListener mSpeechEmotionListener = new SpeechEmotionListener() {
@Override
public void onVolumeChanged(int volume) {
Log.e("tag", volume + "");
}
@Override
public void onEndOfSpeech() {
Log.e("EmotionVoiceListener", "end of speech");
}
@Override
public void onBeginOfSpeech() {
Log.e("EmotionVoiceListener", "begin of speech");
}
@Override
public void onEmotionResult(String result) {
printEmotionResult(result);
Log.e("emotion",result);
}
@Override
public void onSpeechResult(String result) {
printSpeechResult(result);
}
};
}
activity_main.xml
<?xml version="1.0" encoding="utf-8"?>
<LinearLayout xmlns:android="http://schemas.android.com/apk/res/android"
android:layout_width="match_parent"
android:layout_height="match_parent"
android:orientation="vertical" >
<EditText
android:id="@+id/voiceshow"
android:layout_width="match_parent"
android:layout_height="200dp" />
<EditText
android:id="@+id/faceshow"
android:layout_width="match_parent"
android:layout_height="200dp" />
<LinearLayout
android:layout_width="match_parent"
android:layout_height="wrap_content">
<Button
android:id="@+id/isr_recognize"
android:layout_width="wrap_content"
android:layout_height="wrap_content"
android:text="recognize"/>
</LinearLayout>
</LinearLayout>
AndroidManifest.xml
<?xml version="1.0" encoding="utf-8"?>
<manifest xmlns:android="http://schemas.android.com/apk/res/android"
package="com.example.liche.emotionrecog">
<application
android:allowBackup="true"
android:icon="@mipmap/ic_launcher"
android:label="@string/app_name"
android:roundIcon="@mipmap/ic_launcher_round"
android:supportsRtl="true"
android:theme="@style/AppTheme">
<activity android:name=".MainActivity">
<intent-filter>
<action android:name="android.intent.action.MAIN" />
<category android:name="android.intent.category.LAUNCHER" />
</intent-filter>
</activity>
<meta-data
android:name="EMOKIT_AID"
android:value="101375" />
<meta-data
android:name="EMOKIT_KEY"
android:value="2669e99e1b76ef4eff762d5b8acd1725" />
<meta-data
android:name="EMOKIT_RecordTaskAnimation"
android:value="1" />
</application>
<!-- Record audio -->
<uses-permission android:name="android.permission.RECORD_AUDIO" />
<!-- Network state -->
<uses-permission android:name="android.permission.INTERNET" />
<uses-permission android:name="android.permission.ACCESS_NETWORK_STATE" />
<uses-permission android:name="android.permission.ACCESS_WIFI_STATE" />
<uses-permission android:name="android.permission.CHANGE_NETWORK_STATE" />
<uses-permission android:name="android.permission.READ_PHONE_STATE" />
<!-- Write to the SD card -->
<uses-permission android:name="android.permission.WRITE_EXTERNAL_STORAGE" />
</manifest>