HarmonyOS App Development: Text-to-Speech

Latest recommended article published on 2025-03-05 11:56:25

光圈Science

Article link: https://blog.csdn.net/m0_46364942/article/details/144276981

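HarmonyOS provides a text-to-speech service that converts text into speech and plays it back. This makes it easier for users to interact with the device, enabling real-time voice interaction and text readout.

Compared with features such as speech recognition and audio recording, text-to-speech is relatively simple to implement.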

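Creating the engine

The createEngine method of the textToSpeech module creates a text-to-speech engine. It takes a configuration object, although only a few options are supported at present.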

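// Create the text-to-speech engine
let tts = await textToSpeech.createEngine({
  language: 'zh-CN', // only Chinese is currently supported
  person: 0, // 0 is the 聆小珊 female voice, currently the only voice supported
  online: 1   // mode; 1 = offline, currently the only mode supported
})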

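Configuring event callbacks

Text-to-speech raises several events during synthesis and playback. Registering callbacks lets you run your own logic when each event occurs. For simple use cases, the listener from the official documentation can be copied as-is with no further changes.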

// Register the logic to run for each event
tts.setListener({
  // Called when playback starts
  onStart(requestId: string, response: textToSpeech.StartResponse) {
    console.info(`onStart, requestId: ${requestId} response: ${JSON.stringify(response)}`);
  },
  // Called on synthesis completion and on playback completion
  onComplete(requestId: string, response: textToSpeech.CompleteResponse) {
    console.info(`onComplete, requestId: ${requestId} response: ${JSON.stringify(response)}`);
  },
  // Called when playback is stopped
  onStop(requestId: string, response: textToSpeech.StopResponse) {
    console.info(`onStop, requestId: ${requestId} response: ${JSON.stringify(response)}`);
  },
  // Delivers the audio stream (an ArrayBuffer serializes to "{}", so log its size instead)
  onData(requestId: string, audio: ArrayBuffer, response: textToSpeech.SynthesisResponse) {
    console.info(`onData, requestId: ${requestId} response: ${JSON.stringify(response)} audio bytes: ${audio.byteLength}`);
  },
  // Called on error
  onError(requestId: string, errorCode: number, errorMessage: string) {
    console.error(`onError, requestId: ${requestId} errorCode: ${errorCode} errorMessage: ${errorMessage}`);
  }
})
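
As one illustration of running your own logic in these callbacks, the sketch below records which callbacks fired and in what order. The names createRecordingListener and TtsEvent are hypothetical, and the response payloads are typed as unknown so the block runs without '@kit.CoreSpeechKit'; in an app you would use the textToSpeech response types shown above.

```typescript
// Sketch only: a listener-shaped object that records which callbacks fired.
// TtsEvent and createRecordingListener are hypothetical names, not part of the Kit.
interface TtsEvent {
  name: 'start' | 'complete' | 'stop' | 'data' | 'error';
  requestId: string;
  detail?: string;
}

function createRecordingListener() {
  const events: TtsEvent[] = [];
  return {
    events,
    onStart(requestId: string, _response?: unknown): void {
      events.push({ name: 'start', requestId: requestId });
    },
    onComplete(requestId: string, _response?: unknown): void {
      events.push({ name: 'complete', requestId: requestId });
    },
    onStop(requestId: string, _response?: unknown): void {
      events.push({ name: 'stop', requestId: requestId });
    },
    onData(requestId: string, audio: ArrayBuffer, _response?: unknown): void {
      // Record only the chunk size; the audio itself is not kept here.
      events.push({ name: 'data', requestId: requestId, detail: `${audio.byteLength} bytes` });
    },
    onError(requestId: string, errorCode: number, errorMessage: string): void {
      events.push({ name: 'error', requestId: requestId, detail: `${errorCode}: ${errorMessage}` });
    }
  };
}
```

The recorded events array could then drive UI state, for example enabling the stop button only after a start event has been seen for the current request.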

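Starting and stopping playback

Here the engine is stored in a component member (this.tts). To start playback, call the engine's speak method with the text to read and a requestId (the synthesis/playback ID, which must be globally unique).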

this.tts?.speak(this.message, { requestId: this.requestId })

Stopping playback is even simpler:

// Stop text-to-speech
this.tts?.stop()
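
Because the requestId must not repeat, a fixed string only works for a single call. The following is a minimal sketch of one way to produce a fresh ID per request, combining a timestamp with a counter. createRequestIdGenerator and the 'tts' prefix are hypothetical names chosen for illustration, not part of the Kit.

```typescript
// Hypothetical helper: returns a function that yields a new ID on every call.
// The timestamp separates app sessions; the counter separates calls that
// happen within the same millisecond.
function createRequestIdGenerator(prefix: string = 'tts'): () => string {
  let counter = 0;
  return () => {
    counter += 1;
    return `${prefix}-${Date.now()}-${counter}`;
  };
}

const nextRequestId = createRequestIdGenerator();
// In the component this could be used as:
//   this.tts?.speak(this.message, { requestId: nextRequestId() })
```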

Minimal working example

The complete code is as follows:

import { textToSpeech } from '@kit.CoreSpeechKit';

@Entry
@Component
struct TextToSpeech {
  @State message: string = 'Hello World, 你好 世界';
  tts: textToSpeech.TextToSpeechEngine | undefined // text-to-speech engine
  requestId: string = '123' // synthesis/playback ID; must be globally unique

  async aboutToAppear(): Promise<void> {
    // Create the text-to-speech engine
    let tts = await textToSpeech.createEngine({
      language: 'zh-CN', // only Chinese is currently supported
      person: 0, // 0 is the 聆小珊 female voice, currently the only voice supported
      online: 1   // mode; 1 = offline, currently the only mode supported
    })
    // Register the logic to run for each event
    tts.setListener({
      // Called when playback starts
      onStart(requestId: string, response: textToSpeech.StartResponse) {
        console.info(`onStart, requestId: ${requestId} response: ${JSON.stringify(response)}`);
      },
      // Called on synthesis completion and on playback completion
      onComplete(requestId: string, response: textToSpeech.CompleteResponse) {
        console.info(`onComplete, requestId: ${requestId} response: ${JSON.stringify(response)}`);
      },
      // Called when playback is stopped
      onStop(requestId: string, response: textToSpeech.StopResponse) {
        console.info(`onStop, requestId: ${requestId} response: ${JSON.stringify(response)}`);
      },
      // Delivers the audio stream (an ArrayBuffer serializes to "{}", so log its size instead)
      onData(requestId: string, audio: ArrayBuffer, response: textToSpeech.SynthesisResponse) {
        console.info(`onData, requestId: ${requestId} response: ${JSON.stringify(response)} audio bytes: ${audio.byteLength}`);
      },
      // Called on error
      onError(requestId: string, errorCode: number, errorMessage: string) {
        console.error(`onError, requestId: ${requestId} errorCode: ${errorCode} errorMessage: ${errorMessage}`);
      }
    })
    this.tts = tts
  }

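  /**
   * Start text-to-speech
   */
  async start() {
    // Request IDs must not repeat, so append a timestamp to get a fresh ID per call
    const requestId = `${this.requestId}-${Date.now()}`
    this.tts?.speak(this.message, { requestId: requestId })
  }

  /**
   * Stop text-to-speech
   */
  stop() {
    // Stop text-to-speech
    this.tts?.stop()
  }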

  build() {
    Column() {
      Text(this.message)
      Button('Speak')
        .onClick(() => {
          this.start()
        })
      Button('Stop')
        .onClick(() => {
          this.stop()
        })
    }
    .height('100%')
    .width('100%')
  }
}

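Going further

Beyond basic text-to-speech, further customization is possible, for example:

In the onStart callback: you can obtain the request ID and playback parameters such as channel count, sample rate, and sample bit depth.
In the onData callback: you can obtain the audio stream along with additional audio information such as format and duration. Use this callback if you need to return the audio stream or generate an audio file.

In addition, special markup in the text produces different playback effects: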
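
A minimal sketch of how the chunks delivered to onData might be gathered into one buffer is shown below. AudioCollector and pcmDurationMs are hypothetical names. The duration formula assumes the stream is uncompressed PCM, which should be checked against the format the callbacks report, and the chunks are copied on the assumption that the engine may reuse its buffers.

```typescript
// Hypothetical helper: accumulates onData chunks per requestId.
class AudioCollector {
  private chunks = new Map<string, Uint8Array[]>();

  // Call from onData: store a copy of the chunk under its requestId.
  append(requestId: string, audio: ArrayBuffer): void {
    const list = this.chunks.get(requestId) ?? [];
    list.push(new Uint8Array(audio.slice(0)));
    this.chunks.set(requestId, list);
  }

  // Call once synthesis has finished: join all chunks and forget them.
  take(requestId: string): Uint8Array {
    const list = this.chunks.get(requestId) ?? [];
    const total = list.reduce((sum, chunk) => sum + chunk.byteLength, 0);
    const joined = new Uint8Array(total);
    let offset = 0;
    for (const chunk of list) {
      joined.set(chunk, offset);
      offset += chunk.byteLength;
    }
    this.chunks.delete(requestId);
    return joined;
  }
}

// Duration of uncompressed PCM audio in milliseconds (assumption: PCM data).
// bytes per second = sampleRate * channels * (bitsPerSample / 8)
function pcmDurationMs(byteLength: number, sampleRate: number,
                       channels: number, bitsPerSample: number): number {
  const bytesPerSecond = sampleRate * channels * (bitsPerSample / 8);
  return (byteLength / bytesPerSecond) * 1000;
}
```

The sample rate, channel count, and bit depth passed to pcmDurationMs would come from the parameters reported in onStart.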

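Word reading mode: text format [hN]
- 0: choose the reading mode automatically (default)
- 1: spell the word out letter by letter
- 2: read it as a word
- Example: "hello[h1] world"
Number reading mode: text format [nN]
- 0: choose automatically (default)
- 1: read digit by digit, as a number string
- 2: read as a numeric value
Insert a silent pause: text format [pN]
- N is an unsigned integer in ms; playback pauses for N milliseconds at this point
Specify the pronunciation of a Chinese character: format [=MN]
- M is the pinyin of the character and N is the tone
- N ranges from 1 to 5
- 1: first tone (阴平)
- 2: second tone (阳平)
- 3: third tone (上声)
- 4: fourth tone (去声)
- 5: neutral tone (轻声)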
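
The markup can be assembled with small helper functions rather than typed by hand. The sketch below is one possible approach; pause, spellOut, and pronounce are hypothetical names. Placing the mark directly after the word follows the "hello[h1] world" example, and placing the pinyin mark directly after the character is an assumption made by analogy with it.

```typescript
type Tone = 1 | 2 | 3 | 4 | 5;

// [pN]: silent pause of N milliseconds; N must be a non-negative integer.
function pause(ms: number): string {
  if (!Number.isInteger(ms) || ms < 0) {
    throw new RangeError(`pause must be a non-negative integer, got ${ms}`);
  }
  return `[p${ms}]`;
}

// [h1]: spell the preceding word letter by letter.
function spellOut(word: string): string {
  return `${word}[h1]`;
}

// [=MN]: pinyin M and tone N for a Chinese character.
// Assumption: the mark is placed right after the character it applies to.
function pronounce(char: string, pinyin: string, tone: Tone): string {
  return `${char}[=${pinyin}${tone}]`;
}

// Build a text that spells out "hello", then pauses for half a second.
const markedText = `${spellOut('hello')} world${pause(500)}你好`;
```

The resulting string would be passed to speak in place of the plain message.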