使用阿里智能语音实现录音文件识别

1.需求场景

有一个电话录音文件转换成文字的需求,经过研究决定使用阿里OSS(对象存储)和智能语音交互实现功能。

2.名词解释

OSS:阿里云对象存储服务(Object Storage Service,简称 OSS),是阿里云提供的海量、安全、低成本、高可靠的云存储服务。
Bucket:存储空间。存储空间是用于存储对象(Object)的容器,所有的对象都必须隶属于某个存储空间。存储空间具有各种配置属性,包括地域、访问权限、存储类型等。可以根据实际需求,创建不同类型的存储空间来存储不同的数据。
Endpoint:Endpoint 表示 OSS 对外服务的访问域名。
AccessKey:访问密钥。AccessKey(简称 AK)指的是访问身份验证中用到的 AccessKeyId 和 AccessKeySecret。
智能语音交互:(Intelligent Speech Interaction),是基于语音识别、语音合成、自然语言理解等技术,为企业在多种实际应用场景下,赋予产品“能听、会说、懂你”式的智能人机交互体验。

3.技术架构

springboot+thymeleaf+layui+mysql

4.项目开发

4.1 搭建springboot项目,配置pom.xml引入相关依赖
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <version>2.2.4.RELEASE</version>
        <relativePath/> <!-- lookup parent from repository -->
    </parent>
    <groupId>com.cxb</groupId>
    <artifactId>voice</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <name>voice</name>
    <description>a speech recognition project</description>

    <properties>
        <java.version>1.8</java.version>
        <druid.springboot.version>1.1.21</druid.springboot.version>
    </properties>

    <dependencies>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-thymeleaf</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
        </dependency>

        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-devtools</artifactId>
            <scope>runtime</scope>
            <optional>true</optional>
        </dependency>
        <!-- 加了该配置文件 会自动启动数据库 -->
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-data-jpa</artifactId>
        </dependency>
        <dependency>
            <groupId>mysql</groupId>
            <artifactId>mysql-connector-java</artifactId>
            <scope>runtime</scope>
        </dependency>
        <dependency>
            <groupId>com.alibaba</groupId>
            <artifactId>druid-spring-boot-starter</artifactId>
            <version>${druid.springboot.version}</version>
        </dependency>
        <dependency>
            <groupId>org.projectlombok</groupId>
            <artifactId>lombok</artifactId>
            <optional>true</optional>
        </dependency>
        <dependency>
            <groupId>org.apache.httpcomponents</groupId>
            <artifactId>httpclient</artifactId>
            <version>4.5.12</version>
        </dependency>
        <dependency>
            <groupId>org.jdom</groupId>
            <artifactId>jdom</artifactId>
            <version>2.0.2</version>
        </dependency>
        <dependency>
            <groupId>org.codehaus.jettison</groupId>
            <artifactId>jettison</artifactId>
            <version>1.4.0</version>
        </dependency>
        <dependency>
            <groupId>com.aliyun</groupId>
            <artifactId>aliyun-java-sdk-core</artifactId>
            <version>3.7.1</version>
        </dependency>
        <dependency>
            <groupId>com.alibaba</groupId>
            <artifactId>fastjson</artifactId>
            <version>1.2.66</version>
        </dependency>
        <dependency>
            <groupId>com.aliyun</groupId>
            <artifactId>aliyun-java-sdk-ram</artifactId>
            <version>3.0.0</version>
        </dependency>
        <dependency>
            <groupId>com.aliyun</groupId>
            <artifactId>aliyun-java-sdk-sts</artifactId>
            <version>3.0.0</version>
        </dependency>
        <dependency>
            <groupId>com.aliyun</groupId>
            <artifactId>aliyun-java-sdk-ecs</artifactId>
            <version>4.2.0</version>
        </dependency>
        <dependency>
            <groupId>com.aliyun.oss</groupId>
            <artifactId>aliyun-sdk-oss</artifactId>
            <version>3.8.0</version>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-test</artifactId>
            <scope>test</scope>
            <exclusions>
                <exclusion>
                    <groupId>org.junit.vintage</groupId>
                    <artifactId>junit-vintage-engine</artifactId>
                </exclusion>
            </exclusions>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-maven-plugin</artifactId>
            </plugin>
        </plugins>
    </build>

</project>
4.2创建提交文件和查询结果的工具类
import com.alibaba.fastjson.JSONObject;
import com.aliyuncs.CommonRequest;
import com.aliyuncs.CommonResponse;
import com.aliyuncs.DefaultAcsClient;
import com.aliyuncs.IAcsClient;
import com.aliyuncs.exceptions.ClientException;
import com.aliyuncs.http.MethodType;
import com.aliyuncs.profile.DefaultProfile;

public class FileTransUtil {

    // 地域ID,常量内容,请勿改变
    public static final String REGIONID = "cn-shanghai";
    public static final String ENDPOINTNAME = "cn-shanghai";
    public static final String PRODUCT = "nls-filetrans";
    public static final String DOMAIN = "filetrans.cn-shanghai.aliyuncs.com";
    public static final String API_VERSION = "2018-08-17";
    public static final String POST_REQUEST_ACTION = "SubmitTask";
    public static final String GET_REQUEST_ACTION = "GetTaskResult";

    // 请求参数key
    public static final String KEY_APP_KEY = "appkey";
    public static final String KEY_FILE_LINK = "file_link";
    public static final String KEY_VERSION = "version";
    public static final String KEY_ENABLE_WORDS = "enable_words";
    // 响应参数key
    public static final String KEY_TASK = "Task";
    public static final String KEY_TASK_ID = "TaskId";
    public static final String KEY_STATUS_TEXT = "StatusText";
    public static final String KEY_RESULT = "Result";
    // 状态值
    public static final String STATUS_SUCCESS = "SUCCESS";
    private static final String STATUS_RUNNING = "RUNNING";
    private static final String STATUS_QUEUEING = "QUEUEING";

    // 阿里云鉴权client
    IAcsClient client;

    public FileTransUtil(String accessKeyId, String accessKeySecret){
        // 设置endpoint
        try {
            DefaultProfile.addEndpoint(ENDPOINTNAME, REGIONID, PRODUCT, DOMAIN);
        } catch (ClientException e) {
            e.printStackTrace();
        }
        // 创建DefaultAcsClient实例并初始化
        DefaultProfile profile = DefaultProfile.getProfile(REGIONID, accessKeyId, accessKeySecret);
        this.client = new DefaultAcsClient(profile);
    }

    public String submitFileTransRequest(String appKey, String fileLink) {
        /**
         * 1. 创建CommonRequest 设置请求参数
         */
        CommonRequest postRequest = new CommonRequest();
        // 设置域名
        postRequest.setDomain(DOMAIN);
        // 设置API的版本号,格式为YYYY-MM-DD
        postRequest.setVersion(API_VERSION);
        // 设置action
        postRequest.setAction(POST_REQUEST_ACTION);
        // 设置产品名称
        postRequest.setProduct(PRODUCT);
        /**
         * 2. 设置录音文件识别请求参数,以JSON字符串的格式设置到请求的Body中
         */
        JSONObject taskObject = new JSONObject();
        // 设置appkey
        taskObject.put(KEY_APP_KEY, appKey);
        // 设置音频文件访问链接
        taskObject.put(KEY_FILE_LINK, fileLink);
        // 新接入请使用4.0版本,已接入(默认2.0)如需维持现状,请注释掉该参数设置
        taskObject.put(KEY_VERSION, "4.0");
        // 设置是否输出词信息,默认为false,开启时需要设置version为4.0及以上
        taskObject.put(KEY_ENABLE_WORDS, true);
        String task = taskObject.toJSONString();
        System.out.println(task);
        // 设置以上JSON字符串为Body参数
        postRequest.putBodyParameter(KEY_TASK, task);
        // 设置为POST方式的请求
        postRequest.setMethod(MethodType.POST);
        /**
         * 3. 提交录音文件识别请求,获取录音文件识别请求任务的ID,以供识别结果查询使用
         */
        String taskId = null;
        try {
            CommonResponse postResponse = client.getCommonResponse(postRequest);
            System.err.println("提交录音文件识别请求的响应:" + postResponse.getData());
            if (postResponse.getHttpStatus() == 200) {
                JSONObject result = JSONObject.parseObject(postResponse.getData());
                String statusText = result.getString(KEY_STATUS_TEXT);
                if (STATUS_SUCCESS.equals(statusText)) {
                    taskId = result.getString(KEY_TASK_ID);
                }
            }
        } catch (ClientException e) {
            e.printStackTrace();
        }
        return taskId;
    }

    public String getFileTransResult(String taskId) {
        /**
         * 1. 创建CommonRequest 设置任务ID
         */
        CommonRequest getRequest = new CommonRequest();
        // 设置域名
        getRequest.setDomain(DOMAIN);
        // 设置API版本
        getRequest.setVersion(API_VERSION);
        // 设置action
        getRequest.setAction(GET_REQUEST_ACTION);
        // 设置产品名称
        getRequest.setProduct(PRODUCT);
        // 设置任务ID为查询参数
        getRequest.putQueryParameter(KEY_TASK_ID, taskId);
        // 设置为GET方式的请求
        getRequest.setMethod(MethodType.GET);
        /**
         * 2. 提交录音文件识别结果查询请求
         * 以轮询的方式进行识别结果的查询,直到服务端返回的状态描述为“SUCCESS”,或者为错误描述,则结束轮询。
         */
        String result = null;
        while (true) {
            try {
                CommonResponse getResponse = client.getCommonResponse(getRequest);
                System.err.println("识别查询结果:" + getResponse.getData());
                if (getResponse.getHttpStatus() != 200) {
                    break;
                }
                JSONObject rootObj = JSONObject.parseObject(getResponse.getData());
                String statusText = rootObj.getString(KEY_STATUS_TEXT);
                if (STATUS_RUNNING.equals(statusText) || STATUS_QUEUEING.equals(statusText)) {
                    // 继续轮询,注意设置轮询时间间隔
                    Thread.sleep(3000);
                }
                else {
                    // 状态信息为成功,返回识别结果;状态信息为异常,返回空
                    if (STATUS_SUCCESS.equals(statusText)) {
                        result = rootObj.getString(KEY_RESULT);
                        // 状态信息为成功,但没有识别结果,则可能是由于文件里全是静音、噪音等导致识别为空
                        if(result == null) {
                            result = "";
                        }
                    }
                    break;
                }
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
        return result;
    }
}
4.3 创建controller,完成上传文件和查询结果接口。
import com.alibaba.fastjson.JSONArray;
import com.alibaba.fastjson.JSONObject;
import com.aliyun.oss.OSS;
import com.aliyun.oss.OSSClientBuilder;
import com.aliyun.oss.model.PutObjectRequest;
import com.aliyun.oss.model.PutObjectResult;
import com.cxb.voice.enums.Status;
import com.cxb.voice.model.TransResult;
import com.cxb.voice.model.Voice;
import com.cxb.voice.service.VoiceService;
import com.cxb.voice.util.FileTransUtil;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Controller;
import org.springframework.util.StringUtils;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.ResponseBody;
import org.springframework.web.multipart.MultipartFile;
import org.springframework.web.multipart.MultipartHttpServletRequest;

import javax.servlet.http.HttpServletRequest;
import java.io.File;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URL;
import java.util.ArrayList;
import java.util.Date;
import java.util.List;

/**
 * @author baixiaozheng
 */
@Controller
public class FileTransController {

    @Autowired
    private VoiceService voiceService;

    // Endpoint以北京为例,其它Region请按实际情况填写。
    String endpoint = "http://oss-cn-beijing.aliyuncs.com";
    // 阿里云主账号AccessKey拥有所有API的访问权限,风险很高。强烈建议您创建并使用RAM账号进行API访问或日常运维,请登录 https://ram.console.aliyun.com 创建RAM账号。
    String accessKeyId = "your accessKeyId";
    String accessKeySecret = "your accessKeySecret";
    String bucketName = "your bucketName";
    String appKey = "your appKey";

    @RequestMapping("toIndex")
    public String toIndex(){
        return "index";
    }

    /**
     * 上传文件
     * @param request
     * @return
     * @throws Exception
     */
    @PostMapping("upload")
    @ResponseBody
    public JSONObject upload(HttpServletRequest request) throws Exception {

        JSONObject json = new JSONObject();
        MultipartHttpServletRequest multipartRequest = (MultipartHttpServletRequest) request;

        // 获取上传的文件
        MultipartFile multiFile = multipartRequest.getFile("file");
        //获得原始文件名;
        String fileRealName = multiFile.getOriginalFilename();

        // 创建OSSClient实例。
        OSS ossClient = new OSSClientBuilder().build(endpoint, accessKeyId, accessKeySecret);

        //把上传的MultipartFile转换成File
        File file = multipartFileToFile(multiFile);

        // 创建PutObjectRequest对象。
        PutObjectRequest putObjectRequest = new PutObjectRequest(bucketName, fileRealName, file);

        // 上传文件。
        PutObjectResult result = ossClient.putObject(putObjectRequest);

        String url = getUrl(fileRealName);

        // 关闭OSSClient。
        ossClient.shutdown();

        json.put("url", url);

        FileTransUtil fileTransUtil = new FileTransUtil(accessKeyId, accessKeySecret);
        String taskId = fileTransUtil.submitFileTransRequest(appKey, url);

        deleteTempFile(file);

        Voice voice = new Voice();
        voice.setFileName(fileRealName);
        voice.setTaskId(taskId);
        voice.setUrl(url);
        voice.setStatus(Status.WAITING);
        voiceService.save(voice);
        return json;
    }

    @GetMapping("result")
    @ResponseBody
    public JSONObject getResult(String taskId){

        JSONObject json = new JSONObject();
        String result = "";
        Voice voice = voiceService.getByTaskId(taskId);
        if(StringUtils.isEmpty(voice.getResult())){
            if(Status.PROGRESS.equals(voice.getStatus())){
                result = "转换中,请稍后再试";
                List<TransResult> list = new ArrayList<>();
                TransResult r = new TransResult();
                r.setResult(result);
                list.add(r);
                json.put("date",list);
                json.put("code",0);
                return json;
            }
            else {
                try {
                    voice.setStatus(Status.PROGRESS);
                    voiceService.save(voice);

                    FileTransUtil fileTransUtil = new FileTransUtil(accessKeyId, accessKeySecret);
                    result = fileTransUtil.getFileTransResult(taskId);
                    System.out.println(result);
                    voice.setStatus(Status.COMPLETE);
                    voice.setResult(result);
                    voiceService.save(voice);
                } catch (Exception e) {
                    voice.setStatus(Status.WAITING);
                    voiceService.save(voice);
                }

            }
        } else {
            result = voice.getResult();
        }

        JSONObject jsonObject = JSONObject.parseObject(result);
        JSONArray sentences = jsonObject.getJSONArray("Sentences");
        List<TransResult> list = new ArrayList<>();
        for(int i=0;i<sentences.size();i++){
            JSONObject s = sentences.getJSONObject(i);
            TransResult r = new TransResult();
            r.setStartTime(timeStamp2Date(s.getLong("BeginTime")));
            r.setEndTime(timeStamp2Date(s.getLong("EndTime")));
            r.setResult(s.getString("Text"));
            list.add(r);
        }
        json.put("data",list);
        json.put("code",0);
        return json;

    }

    @RequestMapping("toList")
    public String toList(){
        return "list";
    }

    @GetMapping("list")
    @ResponseBody
    public JSONObject list(){
        List<Voice> list = voiceService.findAll();
        JSONObject json = new JSONObject();
        json.put("code",0);
        json.put("data",list);
        return json;
    }

     /**
     * @param timestamp
     * @return
     */
    public String timeStamp2Date(Long timestamp){
        String date = new java.text.SimpleDateFormat("HH:mm:ss").format(new java.util.Date(timestamp));
        return date;
    }

    /**
     * MultipartFile 转 File
     *
     * @param file
     * @throws Exception
     */
    public static File multipartFileToFile(MultipartFile file) throws Exception {

        File toFile = null;
        if (file.equals("") || file.getSize() <= 0) {
            file = null;
        } else {
            InputStream ins = null;
            ins = file.getInputStream();
            toFile = new File(file.getOriginalFilename());
            inputStreamToFile(ins, toFile);
            ins.close();
        }
        return toFile;
    }

    /**
     * 获取流文件
     *
     * @param ins
     * @param file
     */
    private static void inputStreamToFile(InputStream ins, File file) {
        try {
            OutputStream os = new FileOutputStream(file);
            int bytesRead = 0;
            byte[] buffer = new byte[8192];
            while ((bytesRead = ins.read(buffer, 0, 8192)) != -1) {
                os.write(buffer, 0, bytesRead);
            }
            os.close();
            ins.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    /**
     * 删除本地临时文件
     *
     * @param file
     */
    public static void deleteTempFile(File file) {
        if (file != null) {
            File del = new File(file.toURI());
            del.delete();
        }

    }

    /**
     * 获得url链接
     *
     * @param key
     * @return
     */
    public String getUrl(String key) {
        // 创建OSSClient实例。
        OSS ossClient = new OSSClientBuilder().build(endpoint, accessKeyId, accessKeySecret);

        // 设置URL过期时间为10年  3600l* 1000*24*365*10
        Date expiration = new Date(System.currentTimeMillis() + 3600L * 1000 * 24 * 365);
        // 生成URL
        URL url = ossClient.generatePresignedUrl(bucketName, key, expiration);
        if (url != null) {
            return url.toString();
        }
        return null;
    }
}

其他基础配置不再详述,项目地址:https://github.com/baixiaozheng/voice

评论 1
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包

打赏作者

白效正

你的鼓励将是我创作的最大动力

¥1 ¥2 ¥4 ¥6 ¥10 ¥20
扫码支付:¥1
获取中
扫码支付

您的余额不足,请更换扫码支付或充值

打赏作者

实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值