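iFLYTEK Speech Transcription API
I. What Is Speech Transcription?
Speech transcription (Long Form ASR) uses a deep full-sequence convolutional neural network to convert long audio (up to 5 hours) into text, providing a basis for information processing and data mining. It works on pre-recorded audio, not real-time streams: after a file is uploaded successfully it enters a waiting queue, and the result becomes available once transcription completes. How long the result takes depends on the audio duration and the number of queued tasks. If transcription takes longer than usual, the service is most likely at peak load for that time period, so simply wait.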
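Because results are delivered asynchronously, a client has to keep checking until the job finishes. The following is a minimal sketch of such a polling loop with exponential backoff and a timeout. The name `poll_until_done` and all of its parameters are illustrative assumptions, not part of any iFLYTEK SDK; the caller supplies the function that actually queries the service.

```python
import time


def poll_until_done(fetch_status, is_done, interval=1.0, max_interval=30.0,
                    timeout=600.0, sleep=time.sleep):
    """Call fetch_status() until is_done(result) is true or the timeout elapses.

    fetch_status: hypothetical callable that queries the service once.
    is_done:      hypothetical predicate deciding whether the job has finished.
    The wait between attempts doubles each time, capped at max_interval.
    """
    waited = 0.0
    while True:
        result = fetch_status()
        if is_done(result):
            return result
        if waited >= timeout:
            raise TimeoutError("transcription did not finish within %.0f s" % timeout)
        sleep(interval)
        waited += interval
        interval = min(interval * 2, max_interval)
```

Backing off between attempts keeps the number of queries low during peak periods, when jobs may sit in the queue for a long time.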
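II. Usage Steps
1. Register and verify your account
Register an account on the iFLYTEK website: https://xinghuo.xfyun.cn/sparkapi?ch=gji
Go to Service Management and complete personal identity verification.
2. Create an application
Click My Applications in the upper-right corner.
Fill in the application details.
In the left-hand menu, select "Speech Recognition" > "Speech Transcription".
Record the credentials (token information) shown on the right; the APPID and SecretKey are needed later.
The free transcription quota is 5 hours.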
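One way to keep the recorded credentials out of source code is to read them from environment variables. This is only a sketch: the variable names `XFYUN_APPID` and `XFYUN_SECRET_KEY` and the helper `load_credentials` are assumptions made for illustration, not names defined by iFLYTEK.

```python
import os


def load_credentials(env=os.environ):
    """Return (appid, secret_key) read from the given mapping.

    XFYUN_APPID and XFYUN_SECRET_KEY are hypothetical variable names
    chosen for this example.
    """
    appid = env.get("XFYUN_APPID")
    secret_key = env.get("XFYUN_SECRET_KEY")
    missing = [name for name, value in
               (("XFYUN_APPID", appid), ("XFYUN_SECRET_KEY", secret_key))
               if not value]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return appid, secret_key
```

The returned pair could then be passed to whatever code signs the requests, instead of hard-coding the values.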
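3. Build the API
https://www.xfyun.cn/doc/asr/ifasr_new/API.html
The official demos are written in Python 3 and Java. This tutorial builds an API by calling Python from PHP.
index.php fetches the audio to be recognized, saves it locally so the Python script can read it, and deletes it once recognition completes.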
<?php
header('Access-Control-Allow-Origin:*');
header('Content-type: application/json');
$url = isset($_GET['url']) ? $_GET['url'] : null;
if (empty($url)) { die("Please pass the audio URL parameter"); }
// Treat the text after the last dot as the file extension
preg_match('/[^.]+$/', $url, $matches);
$extension = strtolower($matches[0]);
$array = ["mp3","wav","pcm","aac","opus","flac","ogg","m4a","amr","speex","lyb","ac3","ape","m4r","mp4","acc","wma"];
if (!in_array($extension, $array)) { die("Unsupported audio format"); }
$file_path = dirname(__FILE__) . '/cache/' . time() . '.' . $extension;
$file_data = file_get_contents($url); // Fetch the file data from the URL
if ($file_data === false) { die("Failed to download the audio file"); }
file_put_contents($file_path, $file_data); // Save the file data locally
// Escape the path before handing it to the shell
$str = exec("python3 lfasr-new.py " . escapeshellarg($file_path));
// Decode the JSON string printed by the Python script into a PHP array
$data = json_decode($str, true);
$response_msg = $data["code"] ?? null;
// Compare as a string: the bare literal 000000 is just the integer 0
if ((string)$response_msg === "000000") { $code = "200"; }
else { $code = "202"; }
$res = $data["content"]["orderResult"] ?? "";
$res = json_decode($res, true);
$res = $res["lattice2"] ?? [];
$words = ""; // Accumulates the recognized text
$num = count($res);
for ($i = 0; $i < $num; $i++)
{
    $slice = $res[$i]["json_1best"]["st"]["rt"][0]["ws"];
    $totalItems = count($slice);
    for ($j = 0; $j < $totalItems; $j++)
    {
        // Take the first candidate word of each word slot
        $words .= $slice[$j]["cw"][0]["w"];
    }
}
$json_return = array(
    "code" => $code,
    "src" => $url,
    "dst" => $words
);
echo json_encode($json_return, JSON_UNESCAPED_UNICODE);
unlink($file_path);
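The nested loops at the end of index.php walk the `orderResult` payload along the path `lattice2` > `json_1best` > `st` > `rt` > `ws` > `cw` > `w`. As a rough illustration, the same traversal can be sketched in Python. The function name `extract_text` and the `SAMPLE` payload are hand-made assumptions that only mirror the keys index.php reads; they are not real API output.

```python
import json


def extract_text(order_result):
    """Join the first candidate word of every word slot into one string.

    order_result is the JSON string found in content.orderResult,
    following the same key path that index.php traverses.
    """
    parsed = json.loads(order_result)
    words = []
    for segment in parsed.get("lattice2", []):
        for slot in segment["json_1best"]["st"]["rt"][0]["ws"]:
            words.append(slot["cw"][0]["w"])
    return "".join(words)


# Hand-written payload for illustration only; not captured from the service.
SAMPLE = json.dumps({
    "lattice2": [
        {"json_1best": {"st": {"rt": [{"ws": [
            {"cw": [{"w": "hello"}]},
            {"cw": [{"w": " world"}]},
        ]}]}}}
    ]
})

print(extract_text(SAMPLE))  # prints "hello world"
```

Returning an empty string when `lattice2` is absent avoids a crash on failed jobs, at the cost of hiding the failure, so a caller would still want to check the response code first.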
Put the following into lfasr-new.py; it sends the request to the iFLYTEK transcription endpoint.
# -*- coding: utf-8 -*-
import base64
import hashlib
import hmac
import json
import os
import sys
import time
import urllib.parse

import requests

lfasr_host = 'https://raasr.xfyun.cn/v2/api'
# Names of the endpoints to call
api_upload = '/upload'
api_get_result = '/getResult'


class RequestApi(object):
    def __init__(self, appid, secret_key, upload_file_path):
        self.appid = appid
        self.secret_key = secret_key
        self.upload_file_path = upload_file_path
        self.ts = str(int(time.time()))
        self.signa = self.get_signa()

    def get_signa(self):
        appid = self.appid
        secret_key = self.secret_key
        m2 = hashlib.md5()
        m2.update((appid + self.ts).encode('utf-8'))
        md5 = m2.hexdigest()
        md5 = bytes(md5, encoding='utf-8')
        # HMAC-SHA1 with secret_key as the key and the MD5 hex digest above
        # as the message; the base64-encoded result is signa
        signa = hmac.new(secret_key.encode('utf-8'), md5, hashlib.sha1).digest()
        signa = base64.b64encode(signa)
        signa = str(signa, 'utf-8')
        return signa

    def upload(self):
        # print("Upload step:")
        upload_file_path = self.upload_file_path
        file_len = os.path.getsize(upload_file_path)
        file_name = os.path.basename(upload_file_path)
        param_dict = {}
        param_dict['appId'] = self.appid
        param_dict['signa'] = self.signa
        param_dict['ts'] = self.ts
        param_dict["fileSize"] = file_len
        param_dict["fileName"] = file_name
        param_dict["duration"] = "200"
        # print("upload params:", param_dict)
        # Read the whole file and close the handle afterwards
        with open(upload_file_path, 'rb') as f:
            data = f.read(file_len)
        response = requests.post(url=lfasr_host + api_upload + "?" + urllib.parse.urlencode(param_dict),
                                 headers={"Content-type": "application/json"}, data=data)
        # print("upload_url:", response.request.url)
        result = json.loads(response.text)
        # print("upload resp:", result)
        return result

    def get_result(self):
        uploadresp = self.upload()
        orderId = uploadresp['content']['orderId']
        param_dict = {}
        param_dict['appId'] = self.appid
        param_dict['signa'] = self.signa
        param_dict['ts'] = self.ts
        param_dict['orderId'] = orderId
        param_dict['resultType'] = "transfer,predict"
        # print("")
        # print("Query step:")
        # print("get result params:", param_dict)
        status = 3
        # Fetching results via callback is recommended, because the query
        # endpoint is rate limited
        while status == 3:
            response = requests.post(url=lfasr_host + api_get_result + "?" + urllib.parse.urlencode(param_dict),
                                     headers={"Content-type": "application/json"})
            # print("get_result_url:", response.request.url)
            result = json.loads(response.text)
            # print(result)
            status = result['content']['orderInfo']['status']
            # print("status=", status)
            if status == 4:
                break
            time.sleep(1)
        # index.php reads this single line of JSON from standard output
        print(json.dumps(result))
        return result


# Fill in the appid and secret_key from the iFLYTEK open platform;
# the path of the file to transcribe comes from the command line
if __name__ == '__main__':
    api = RequestApi(appid="xxxxxx",
                     secret_key="xxxxxx",
                     upload_file_path=sys.argv[1])
    api.get_result()
Before running: fill in your Appid and SecretKey, and create a cache folder named cache next to index.php.
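The signing step in `get_signa` is the part most likely to go wrong when the credentials are filled in, so it can help to isolate it as a pure function and check it on its own. The sketch below follows the same steps as the script (MD5 of appid plus timestamp, then HMAC-SHA1 keyed with the secret, then base64); the function name `make_signa` and the placeholder inputs are assumptions for illustration.

```python
import base64
import hashlib
import hmac


def make_signa(appid, secret_key, ts):
    """Compute the request signature the same way get_signa does.

    appid, secret_key and ts are plain strings; ts is the Unix
    timestamp in seconds, as sent in the ts query parameter.
    """
    md5_hex = hashlib.md5((appid + ts).encode("utf-8")).hexdigest()
    digest = hmac.new(secret_key.encode("utf-8"),
                      md5_hex.encode("utf-8"),
                      hashlib.sha1).digest()
    return base64.b64encode(digest).decode("utf-8")


# Placeholder values, not real credentials.
signa = make_signa("my-appid", "my-secret", "1700000000")
print(len(signa))  # prints 28: a 20-byte SHA-1 digest encoded as base64
```

Since the signature depends on `ts`, the same `ts` value has to be sent alongside it in the request parameters, as the script does.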
4. Example call
Request URL:
https://<your API address>/?url=<URL of the audio file to transcribe>
Response:
{
"code": "200",
"src": "URL of the audio file to transcribe",
"dst": "recognized text"
}
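On the client side, the audio URL should be URL-encoded before it is placed in the `url` query parameter, otherwise characters such as `&` or spaces in the audio URL would be misread. The helpers below are a minimal sketch of building the request URL and reading the response fields shown above; `build_request_url`, `parse_response`, and the example.com addresses are hypothetical.

```python
import json
from urllib.parse import urlencode


def build_request_url(api_base, audio_url):
    """Return api_base with the audio URL encoded into the url parameter."""
    return api_base + "?" + urlencode({"url": audio_url})


def parse_response(body):
    """Return the recognized text from a JSON response body.

    Raises RuntimeError when the wrapper reports a code other than "200".
    """
    data = json.loads(body)
    if data.get("code") != "200":
        raise RuntimeError("transcription failed with code %r" % data.get("code"))
    return data["dst"]


# Hypothetical addresses used only to show the encoding.
print(build_request_url("https://api.example.com/", "https://example.com/a b.mp3"))
# prints https://api.example.com/?url=https%3A%2F%2Fexample.com%2Fa+b.mp3
```

The resulting URL can be fetched with any HTTP client, and the response body passed to `parse_response`.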
Demo: https://api.szfx.top/api/lfasr.html
Summary
That concludes this tutorial on building a speech recognition API by calling Python from PHP. Feel free to try it out.