Design of a Rule-Based Chinese Text Frontend for Speech Synthesis

小湉湉

Last modified: 2022-10-25 15:54:57

Categories: Speech Synthesis, MachineLearning. Tags: deep learning, artificial intelligence, speech recognition

First published: 2022-10-25 15:52:07

Article link: https://blog.csdn.net/qq_21275321/article/details/127515024

(The following content is reposted from PaddleSpeech.)

Chinese Rule-Based Text Frontend

A TTS system consists of three main modules: Text Frontend, Acoustic Model, and Vocoder. PaddleSpeech TTS provides a complete Chinese text frontend module; see the examples in examples/other/tn and examples/other/g2p.

A text frontend module mainly includes:

Text Segmentation
Text Normalization (TN)
Word Segmentation (mainly in Chinese)
Part-of-Speech
Prosody
G2P (Grapheme-to-Phoneme, including Polyphone and Tone Sandhi, etc.)
Linguistic Features/Characters/Phonemes

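For example, several of these stages applied to one sentence (roughly, "The post-90s generation prepared a grand gift for the 70th anniversary of the founding of the People's Republic of China"):

• Text: 90 后为中华人民共和国成立 70 周年准备了大礼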
• Text Normalization: 九零后为中华人民共和国成立七十周年准备了大礼
• Word Segmentation: 九零后/为/中华人民/共和国/成立/七十/周年/准备/了/大礼
• G2P:
    jiu3 ling2 hou4 wei4 zhong1 hua2 ren2 min2 gong4 he2 guo2 ...
• Prosody (prosodic words #1, prosodic phrases #2, intonation phrases #3, sentence #4):
    九零后#1为中华人民#1共和国#2成立七十周年#3准备了大礼#4
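
The prosody annotation above can be read back into (chunk, break level) pairs. The following is a minimal sketch, assuming the `#1` to `#4` markers always directly follow the chunk they close; the function name `parse_prosody` is hypothetical and is not part of PaddleSpeech.

```python
import re

def parse_prosody(annotated: str):
    """Split a '#N'-annotated string into (chunk, break_level) pairs.

    Assumes every chunk is immediately followed by '#1'..'#4'.
    """
    pairs = re.findall(r"([^#]+)#([1-4])", annotated)
    return [(chunk, int(level)) for chunk, level in pairs]

labels = parse_prosody("九零后#1为中华人民#1共和国#2成立七十周年#3准备了大礼#4")
for chunk, level in labels:
    print(level, chunk)
```

Here a higher level marks a stronger boundary, so the single `#4` pair closes the sentence.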

Of these, Text Normalization and G2P are the most important modules, so they are the main focus here.
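
To make the two stages concrete, here is a toy sketch of rule-based Text Normalization followed by a lexicon-lookup G2P, reproducing the example sentence above. It is an illustration under stated assumptions, not PaddleSpeech's implementation: the function names, the two regex rules, and the four-entry lexicon are all hypothetical, numbers are limited to 0 through 99, and polyphone disambiguation and tone sandhi are not handled.

```python
import re

DIGITS = "零一二三四五六七八九"

def read_digits(num: str) -> str:
    """Read a number digit by digit, e.g. '90' -> '九零'."""
    return "".join(DIGITS[int(d)] for d in num)

def read_cardinal(num: str) -> str:
    """Read a number in 0..99 as a cardinal, e.g. '70' -> '七十' (toy range)."""
    n = int(num)
    if n < 10:
        return DIGITS[n]
    tens, ones = divmod(n, 10)
    head = "十" if tens == 1 else DIGITS[tens] + "十"
    return head + (DIGITS[ones] if ones else "")

def normalize(text: str) -> str:
    """Apply two illustrative TN rules (assumed for this sketch)."""
    # Rule 1: a number before '后' (a generation, as in '90 后') is read digit by digit.
    text = re.sub(r"\s*(\d+)\s*后", lambda m: read_digits(m.group(1)) + "后", text)
    # Rule 2: any remaining one- or two-digit number is read as a cardinal.
    text = re.sub(r"\s*(?<!\d)(\d{1,2})(?!\d)\s*",
                  lambda m: read_cardinal(m.group(1)), text)
    return text

# Hypothetical mini-lexicon covering only the words whose pinyin is shown above.
LEXICON = {
    "九零后": "jiu3 ling2 hou4",
    "为": "wei4",
    "中华人民": "zhong1 hua2 ren2 min2",
    "共和国": "gong4 he2 guo2",
}

def g2p(words):
    """Look up each segmented word and concatenate its pinyin syllables."""
    phones = []
    for word in words:
        phones.extend(LEXICON[word].split())
    return phones

normalized = normalize("90 后为中华人民共和国成立 70 周年准备了大礼")
print(normalized)  # 九零后为中华人民共和国成立七十周年准备了大礼
print(" ".join(g2p(["九零后", "为", "中华人民", "共和国"])))
```

The same surface form can need different readings ("90" is read digit by digit, "70" as a cardinal), which is why TN rules depend on context. Likewise, a pure lookup cannot choose between readings of a polyphonic character such as 为; a real frontend needs dedicated handling for that.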