Google Voice Search Recognition API [Repost]

Latest recommended article published on 2024-09-14 07:32:26

wxxgreat


I. Voice input only shows up in Chrome

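The voice search feature only shows up in Google Chrome; in my tests it did not appear in IE or Firefox. The official documentation says the feature currently supports only WebKit-based browsers, so go download Google Chrome and try it out!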

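If you don't have Chrome installed and just want to try speech recognition, there is a very easy way. While browsing at home over the Spring Festival, I happened to notice that the QQ chat panel now offers speech recognition under its multi-function input assistant. I tried it a few times and the accuracy was fairly high, so give it a try!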

Figure: QQ voice input

II. How to implement voice search recognition on your own site

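Since so many personal blogs have this feature, it must be available through some third-party API. I originally assumed it would be hard and did not expect it to be this simple. The whole thing boils down to a single attribute, "x-webkit-speech", which you add to your <input> tag, for example: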

 
<FORM method="post" action="">
  Title: <INPUT type="text" name="title" x-webkit-speech lang="zh-CN">
  <INPUT type="submit" value="Submit">
</FORM>

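It really is that simple. If you don't believe it, save this code to any HTML file, for example a new index.html, and open it in Chrome! x-webkit-speech can be combined with several other attributes, such as lang="zh-CN" in the code above (this attribute is optional).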

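There is also x-webkit-grammar="builtin:search", which steers the recognized text toward search queries and drops filler characters such as "的" and "啦".

onwebkitspeechchange fires when the recognized speech changes. You can use it to submit the search automatically as soon as the user stops speaking, for example (this snippet assumes jQuery is loaded on the page):

<input type="text" x-webkit-speech onwebkitspeechchange="$(this).closest('form').submit()"/>

Leave me a comment if you want more details.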
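As for what happens when no language is supplied, the Chromium source quoted later in this post suggests a fallback: take the first entry of the browser's Accept-Language list and, if that is empty too, use "en-US". Below is a minimal standalone sketch of that fallback in plain C++. The function name PickRecognitionLanguage and its two string parameters are made up for illustration and are not part of Chromium.

```cpp
#include <string>

// Hypothetical helper (not a Chromium function): mirrors the language
// fallback seen in GoogleOneShotRemoteEngine::StartRecognition().
//   configured_lang : the language requested by the page, may be empty
//   accept_languages: an Accept-Language style list such as "es,en-GB;q=0.8"
std::string PickRecognitionLanguage(const std::string& configured_lang,
                                    const std::string& accept_languages) {
  std::string lang = configured_lang;
  if (lang.empty()) {
    // Keep everything before the first ',' or ';'. When neither is present,
    // find_first_of returns npos and substr returns the whole string.
    std::string::size_type separator = accept_languages.find_first_of(",;");
    lang = accept_languages.substr(0, separator);
  }
  if (lang.empty())
    lang = "en-US";  // final default, as in the quoted source
  return lang;
}
```

With an empty configured language and the list "es,en-GB;q=0.8", this sketch picks "es"; with both inputs empty it falls back to "en-US".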

III. Digging into Google voice search recognition

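I (Daxiong) always like to get to the bottom of things, and this simple HTML5 feature piqued my interest, so the tinkering began. At first I guessed that this was a feature built into WebKit-based browsers, because the Chrome Developer Tools did not capture any packets for it. But once I disconnected from the network the feature stopped working, so it must be calling a remote API. Moreover, the code that calls the API must live inside Chrome itself; otherwise the requests would show up in the Chrome Developer Tools.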

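First, a word on how to capture packets: open a website, press F12 -> select Network in the panel that appears -> refresh the page -> and you will see the captured requests. Click any request to view its details. For example, the screenshot below shows what I captured when using Google Translate to translate "我是谁":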

Figure: Chrome Developer Tools

We can now be sure of two things:

1. The voice search recognition feature calls a remote API

2. The code that calls the API is built into Chrome itself

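With these two points settled, my approach became much clearer. Chrome is built on the open-source Chromium project, so most of its source code is public. Without much trouble, at

http://src.chromium.org/viewvc/chrome/trunk/src/content/browser/

I found the speech module!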

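So I started analyzing it, although "analyzing" is an overstatement: I simply guessed my way along from the file names of the C++ source. I could not find a link for downloading the whole Chrome source tree, so I had to open the pages one at a time; if you have a download link, please share it. Finally, in the google_one_shot_remote_engine.cc file under the speech folder, I found the URL I was looking for!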

http://src.chromium.org/viewvc/chrome/trunk/src/content/browser/speech/google_one_shot_remote_engine.cc?revision=170920&view=markup

Here are the main parts:

 
const char* const kDefaultSpeechRecognitionUrl =
    "https://www.google.com/speech-api/v1/recognize?xjerr=1&client=chromium&";
const char* const kStatusString = "status";  // Status code returned by the API; 0 means success (the codes are explained separately later)
const char* const kHypothesesString = "hypotheses";  // The candidate results, as a JSON structure
const char* const kUtteranceString = "utterance";  // The recognized text
const char* const kConfidenceString = "confidence";  // A call can return several results; this is the confidence value of each
const int kWebServiceStatusNoError = 0;
const int kWebServiceStatusNoSpeech = 4;
const int kWebServiceStatusNoMatch = 5;

const AudioEncoder::Codec kDefaultAudioCodec = AudioEncoder::CODEC_FLAC;  // The audio is sent in FLAC format
 
 
 
// Calls Google's remote recognition engine
void GoogleOneShotRemoteEngine::StartRecognition() {
  DCHECK(delegate());
  DCHECK(!url_fetcher_.get());
  std::string lang_param = config_.language;

  if (lang_param.empty() && url_context_) {
    // If no language is provided then we use the first from the accepted
    // language list. If this list is empty then it defaults to "en-US".
    // Example of the contents of this list: "es,en-GB;q=0.8", ""
    net::URLRequestContext* request_context =
        url_context_->GetURLRequestContext();
    DCHECK(request_context);
    // TODO(pauljensen): GoogleOneShotRemoteEngine should be constructed with
    // a reference to the HttpUserAgentSettings rather than accessing the
    // accept language through the URLRequestContext.
    std::string accepted_language_list = request_context->GetAcceptLanguage();
    size_t separator = accepted_language_list.find_first_of(",;");
    lang_param = accepted_language_list.substr(0, separator);
  }

  if (lang_param.empty())
    lang_param = "en-US";
 
 
 
  /*************************************************************
   * Important note, from the 编程学习博客 (programming study blog),
   * URL: http://php.oil58.com/
   *
   * The parts vector is very important: it holds the API parameters.
   * From the code below we can see the following parameters, each of
   * which is explained separately later:
   *   "lang="
   *   "lm="
   *   "xhw="
   *   "maxresults="
   *   "pfilter="
   *   "key="
   *************************************************************/
  std::vector<std::string> parts;
  parts.push_back("lang=" + net::EscapeQueryParamValue(lang_param, true));
  /* "lang=" is an important parameter that selects the language. The default
     is "en-US" (American English); Chinese is zh-CN. For other language
     codes see:
     http://msdn.microsoft.com/en-us/library/ms533052(v=vs.85).aspx */
 
 
 
  if (!config_.grammars.empty()) {
    DCHECK_EQ(config_.grammars.size(), 1U);
    parts.push_back("lm=" + net::EscapeQueryParamValue(config_.grammars[0].url,
                                                       true));
  }

  if (!config_.hardware_info.empty())
    parts.push_back("xhw=" + net::EscapeQueryParamValue(config_.hardware_info,
                                                        true));
  parts.push_back("maxresults=" + base::UintToString(config_.max_hypotheses));
  parts.push_back(config_.filter_profanities ? "pfilter=2" : "pfilter=0");

  std::string api_key = google_apis::GetAPIKey();
  parts.push_back("key=" + net::EscapeQueryParamValue(api_key, true));

  // ... (the excerpt ends here; the rest of StartRecognition() is not quoted)

}  // namespace content
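The excerpt stops before the parts vector is actually used. A natural reading is that its entries are joined with "&" and appended to kDefaultSpeechRecognitionUrl, which already ends in "&". The sketch below illustrates that assumption using only the standard library. BuildRecognizeUrl, EscapeQueryValue, and the sample values are hypothetical stand-ins, and the escaping is a simplified approximation of net::EscapeQueryParamValue rather than a copy of it.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Base URL copied from the excerpt above.
const char* const kDefaultSpeechRecognitionUrl =
    "https://www.google.com/speech-api/v1/recognize?xjerr=1&client=chromium&";

// Simplified percent-encoding (assumption: stands in for
// net::EscapeQueryParamValue). Letters, digits, '-', '_', '.' and '~' are
// kept; every other byte becomes %XX.
std::string EscapeQueryValue(const std::string& value) {
  static const char kHex[] = "0123456789ABCDEF";
  std::string out;
  for (unsigned char c : value) {
    if (std::isalnum(c) || c == '-' || c == '_' || c == '.' || c == '~') {
      out += static_cast<char>(c);
    } else {
      out += '%';
      out += kHex[c >> 4];
      out += kHex[c & 0x0F];
    }
  }
  return out;
}

// Hypothetical helper: collects the same parameters as the excerpt and joins
// them with '&'. The "lm" and "xhw" parameters are skipped when empty.
std::string BuildRecognizeUrl(const std::string& lang,
                              const std::string& grammar_url,
                              const std::string& hardware_info,
                              unsigned max_hypotheses,
                              bool filter_profanities,
                              const std::string& api_key) {
  std::vector<std::string> parts;
  parts.push_back("lang=" + EscapeQueryValue(lang));
  if (!grammar_url.empty())
    parts.push_back("lm=" + EscapeQueryValue(grammar_url));
  if (!hardware_info.empty())
    parts.push_back("xhw=" + EscapeQueryValue(hardware_info));
  parts.push_back("maxresults=" + std::to_string(max_hypotheses));
  parts.push_back(filter_profanities ? "pfilter=2" : "pfilter=0");
  parts.push_back("key=" + EscapeQueryValue(api_key));

  // The base URL already ends with '&', so only the parts need separators.
  std::string url = kDefaultSpeechRecognitionUrl;
  for (std::size_t i = 0; i < parts.size(); ++i) {
    if (i > 0) url += '&';
    url += parts[i];
  }
  return url;
}
```

Calling BuildRecognizeUrl("zh-CN", "", "", 5, true, "MY_KEY") yields the base URL followed by lang=zh-CN&maxresults=5&pfilter=2&key=MY_KEY. The key is a placeholder; the real value comes from google_apis::GetAPIKey() in the quoted source.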