The Web Speech API is a browser built-in API for speech recognition and speech synthesis. It lets developers add voice input and voice output to web pages, enabling a more natural and intuitive style of user interaction. The Web Speech API has two main parts:
- Speech Recognition API: converts the user's spoken input into text.
- Speech Synthesis API: converts text into spoken output.
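Browser support for the two halves differs (recognition is still vendor-prefixed in Chromium-based browsers), so it can help to probe for support once up front. A minimal sketch; the helper name and the mock object are illustrative, not part of the API, and in a real page you would pass `window`:

```javascript
// Report which halves of the Web Speech API a window-like object exposes.
// Recognition may live under the webkitSpeechRecognition prefix.
function detectSpeechSupport(win) {
  return {
    recognition: Boolean(win.SpeechRecognition || win.webkitSpeechRecognition),
    synthesis: Boolean(win.speechSynthesis),
  };
}

// Example with a mock window object (in a real page: detectSpeechSupport(window)):
const mockWindow = { webkitSpeechRecognition: function () {}, speechSynthesis: {} };
// detectSpeechSupport(mockWindow) → { recognition: true, synthesis: true }
```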
Use Cases
- Voice control and navigation: drive page navigation and actions with voice commands, for example controlling devices from a smart-home dashboard.
- Assistive technology: help users with visual or motor impairments operate web pages by voice.
- Voice input: speed up data entry in forms and chat applications.
- Spoken feedback: provide audio feedback in education and training applications to enrich the user experience.
1. Speech Recognition API
The Speech Recognition API converts the user's spoken input into text. Here is a basic example:
```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <meta name="viewport" content="width=device-width, initial-scale=1.0">
  <title>Speech Recognition Example</title>
</head>
<body>
  <h1>Speech Recognition Example</h1>
  <button id="start-recognition">Start Recognition</button>
  <p id="result"></p>
  <script>
    // Check whether the browser supports the SpeechRecognition API
    const SpeechRecognition = window.SpeechRecognition || window.webkitSpeechRecognition;
    if (!SpeechRecognition) {
      alert('Your browser does not support Speech Recognition API');
    } else {
      const recognition = new SpeechRecognition();
      recognition.lang = 'en-US';          // Recognition language
      recognition.interimResults = false;  // Whether to return interim results
      recognition.maxAlternatives = 1;     // Maximum number of alternative results

      const startButton = document.getElementById('start-recognition');
      const resultParagraph = document.getElementById('result');

      startButton.addEventListener('click', () => {
        recognition.start();
      });

      recognition.addEventListener('result', (event) => {
        const transcript = event.results[0][0].transcript;
        resultParagraph.textContent = `You said: ${transcript}`;
      });

      recognition.addEventListener('speechend', () => {
        recognition.stop();
      });

      recognition.addEventListener('error', (event) => {
        resultParagraph.textContent = `Error occurred in recognition: ${event.error}`;
      });
    }
  </script>
</body>
</html>
```
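The `result` event delivers a nested structure: a list of SpeechRecognitionResult objects, each holding SpeechRecognitionAlternative objects with `transcript` and `confidence` fields. The line `event.results[0][0].transcript` above can be factored into a small pure function so the extraction logic is testable without a microphone. A sketch with an assumed helper name and a mocked results structure:

```javascript
// Pull the top transcript out of a SpeechRecognition results list.
// Takes the newest result's first (highest-confidence) alternative.
function bestTranscript(results) {
  if (!results || results.length === 0) return '';
  return results[results.length - 1][0].transcript;
}

// Mock of the nested structure the browser delivers:
const mockResults = [[{ transcript: 'hello world', confidence: 0.92 }]];
// bestTranscript(mockResults) → 'hello world'
```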
2. Speech Synthesis API
The Speech Synthesis API converts text into spoken output. Here is a basic example:
```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <meta name="viewport" content="width=device-width, initial-scale=1.0">
  <title>Speech Synthesis Example</title>
</head>
<body>
  <h1>Speech Synthesis Example</h1>
  <textarea id="text-to-speak" rows="4" cols="50">Hello, how are you?</textarea>
  <button id="speak-button">Speak</button>
  <script>
    const speakButton = document.getElementById('speak-button');
    const textToSpeak = document.getElementById('text-to-speak');

    speakButton.addEventListener('click', () => {
      const utterance = new SpeechSynthesisUtterance(textToSpeak.value);
      utterance.lang = 'en-US';  // Speech language
      utterance.pitch = 1;       // Pitch (0 to 2, default 1)
      utterance.rate = 1;        // Speaking rate (0.1 to 10, default 1)
      window.speechSynthesis.speak(utterance);
    });
  </script>
</body>
</html>
```
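Beyond `lang`, `pitch`, and `rate`, an utterance can be given a specific voice from `window.speechSynthesis.getVoices()`. The selection logic can be sketched as a pure function; the helper name and the mock voice list below are assumptions for illustration:

```javascript
// Pick a voice whose BCP-47 tag matches the requested language exactly,
// falling back to a language-prefix match (e.g. 'en' also matches 'en-GB').
function pickVoice(voices, lang) {
  return (
    voices.find((v) => v.lang === lang) ||
    voices.find((v) => v.lang.startsWith(lang.split('-')[0])) ||
    null
  );
}

// Mock of getVoices() output; in a page: pickVoice(speechSynthesis.getVoices(), 'en-US')
const voiceList = [
  { name: 'Daniel', lang: 'en-GB' },
  { name: 'Samantha', lang: 'en-US' },
];
const chosen = pickVoice(voiceList, 'en-US');
// then: utterance.voice = chosen;
```

Note that `getVoices()` may return an empty list until the browser has loaded its voices; listening for the `voiceschanged` event on `speechSynthesis` before picking is the usual workaround.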
Combined Example
Combining the Speech Recognition API and the Speech Synthesis API yields a simple voice assistant:
```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <meta name="viewport" content="width=device-width, initial-scale=1.0">
  <title>Speech Assistant Example</title>
</head>
<body>
  <h1>Speech Assistant Example</h1>
  <button id="start-assistant">Start Assistant</button>
  <p id="assistant-result"></p>
  <script>
    const SpeechRecognition = window.SpeechRecognition || window.webkitSpeechRecognition;
    if (!SpeechRecognition) {
      alert('Your browser does not support Speech Recognition API');
    } else {
      const recognition = new SpeechRecognition();
      recognition.lang = 'en-US';
      recognition.interimResults = false;
      recognition.maxAlternatives = 1;

      const startButton = document.getElementById('start-assistant');
      const resultParagraph = document.getElementById('assistant-result');

      startButton.addEventListener('click', () => {
        recognition.start();
      });

      recognition.addEventListener('result', (event) => {
        const transcript = event.results[0][0].transcript;
        resultParagraph.textContent = `You said: ${transcript}`;
        respondToSpeech(transcript);
      });

      recognition.addEventListener('speechend', () => {
        recognition.stop();
      });

      recognition.addEventListener('error', (event) => {
        resultParagraph.textContent = `Error occurred in recognition: ${event.error}`;
      });

      // Build a spoken response from the recognized text
      function respondToSpeech(transcript) {
        let response = '';
        if (transcript.toLowerCase().includes('hello')) {
          response = 'Hello! How can I help you today?';
        } else if (transcript.toLowerCase().includes('time')) {
          response = `The current time is ${new Date().toLocaleTimeString()}`;
        } else {
          response = 'Sorry, I did not understand that.';
        }
        const utterance = new SpeechSynthesisUtterance(response);
        utterance.lang = 'en-US';
        window.speechSynthesis.speak(utterance);
      }
    }
  </script>
</body>
</html>
```
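The keyword-matching branch inside `respondToSpeech` can be pulled out as a pure function so it is unit-testable without a microphone or speakers. In this sketch the current time is injected as a parameter rather than read from the clock, which is an assumption made purely for testability:

```javascript
// Map a recognized transcript to a response string.
// `now` is injected so tests can pass a fixed Date.
function respondTo(transcript, now = new Date()) {
  const text = transcript.toLowerCase();
  if (text.includes('hello')) {
    return 'Hello! How can I help you today?';
  }
  if (text.includes('time')) {
    return `The current time is ${now.toLocaleTimeString()}`;
  }
  return 'Sorry, I did not understand that.';
}
```

In the page, the returned string would then be handed to a `SpeechSynthesisUtterance` and spoken, exactly as `respondToSpeech` does above.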
Code Walkthrough
- Check browser support: first verify that the browser supports the `SpeechRecognition` and `SpeechSynthesis` APIs.
- Initialize the recognition and synthesis objects: create `SpeechRecognition` and `SpeechSynthesisUtterance` instances.
- Event listeners: attach a click listener to the button to start recognition, and attach `result`, `speechend`, and `error` listeners to the `recognition` object to handle results and failures.
- Respond to speech input: build a response string from the recognized text and speak it with the `SpeechSynthesis` API.
With the examples above, a web page can offer basic speech recognition and speech synthesis, letting users interact by voice for a more natural and intuitive experience.