Montreal Forced Aligner (MFA) Basic Usage Tutorial

1. Acoustic model training
https://montreal-forced-aligner.readthedocs.io/en/latest/aligning.html#trained-alignment
Latest 2.0 version:
https://montreal-forced-aligner.readthedocs.io/en/latest/user_guide/workflows/train_acoustic_model.html?highlight=mfa%20train

usage: mfa train [-h] [--config_path CONFIG_PATH] [-o OUTPUT_MODEL_PATH]
                 [-s SPEAKER_CHARACTERS] [-a AUDIO_DIRECTORY]
                 [--phone_set {AUTO,IPA,ARPA,PINYIN}]
                 [--output_format {short_textgrid,long_textgrid,json}]
                 [--include_original_text] [--train_g2p]
                 [-t TEMPORARY_DIRECTORY] [--disable_mp] [-j NUM_JOBS] [-v]
                 [-q] [--clean] [--overwrite] [--debug]
                 [--disable_textgrid_cleanup]
                 corpus_directory dictionary_path output_paths
                 [output_paths ...]

mfa train corpus_directory dictionary_path output_directory

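The remaining parameters are straightforward. The two worth setting explicitly are temp_directory (-t) and num_jobs (-j). On a multi-core machine with a large training corpus, num_jobs can speed training up roughly in proportion to the number of jobs. Pointing temp_directory at a disk with enough free space keeps the home directory from filling up, which would otherwise cause errors.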
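
As a rough sketch of that recommendation, the snippet below assembles an mfa train call with both options set. The paths are hypothetical placeholders in the style of the ones used elsewhere in this post, and deriving the job count from nproc is an assumption of this example, not something MFA requires.

```shell
# Hypothetical placeholder paths; replace them with your own.
CORPUS_DIR=/data/xxxx/prepared_for_mfa/
DICT_PATH=/data/xxxx/lexicon.txt
OUTPUT_PATH=/data/xxxx/output/
TEMP_DIR=/data/xxxx/temp_files/   # ideally on a disk with plenty of free space

# Assumption of this sketch: one job per CPU core, falling back to 3
# when nproc is not available on the system.
NUM_JOBS=$(nproc 2>/dev/null || echo 3)

# Positional arguments first, then -t (temporary directory) and -j (number of jobs).
TRAIN_CMD="mfa train $CORPUS_DIR $DICT_PATH $OUTPUT_PATH -t $TEMP_DIR -j $NUM_JOBS"

# Print the command for review before running it.
echo "$TRAIN_CMD"
```

Once the paths point at real data, the printed command can be run as-is.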

2. Other notes (to be extended)

mfa align /data/xxxx/prepared_for_mfa/ /data/xxxx/lexicon.txt english /data/xxxx/output/ -t /data/xxxx/temp_files/ -j 20 --clean

corpus_directory
Full path to the directory to align

dictionary_path
Full path to pronunciation dictionary, or saved dictionary name (you can use mfa model download dictionary to get MFA dictionaries)

acoustic_model_path
Full path to pre-trained acoustic model, or saved model name (you can use mfa model download acoustic to get pretrained MFA models)

output_directory
Full path to output directory, will be created if it doesn’t exist
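
The two download commands mentioned in the descriptions above can be spelled out as follows. This is a sketch that only prints the commands. The model name english is taken from the align example earlier in this post and is an assumption here, since the names that are actually available depend on the MFA version.

```shell
# Assumption: "english" is the saved-model name used in the align example;
# check which names your MFA version actually offers before downloading.
MODEL_NAME=english

# Print one download command per model type named in the descriptions above.
print_download_cmds() {
    for model_type in dictionary acoustic; do
        echo "mfa model download $model_type $1"
    done
}

print_download_cmds "$MODEL_NAME"
```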

-h, --help
show this help message and exit

--config_path <config_path>
Path to config file to use for alignment

-s <speaker_characters>, --speaker_characters <speaker_characters>
Number of characters of file names to use for determining speaker, default is to use directory names

-a <audio_directory>, --audio_directory <audio_directory>
Audio directory root to use for finding audio files

--reference_directory <reference_directory>
Directory containing gold standard alignments to evaluate

--custom_mapping_path <custom_mapping_path>
YAML file for mapping phones across phone sets in evaluations

-t <temporary_directory>, --temp_directory <temporary_directory>, --temporary_directory <temporary_directory>
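Temporary directory root to store MFA created files, default is Documents/MFA under the current user's home directory (the online documentation shows /home/docs/Documents/MFA because it was built under a user named docs)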

--disable_mp
Disable any multiprocessing during alignment (not recommended), default is False

-j <num_jobs>, --num_jobs <num_jobs>
Number of data splits (and cores to use if multiprocessing is enabled), default is 3

-v, --verbose
Output debug messages, default is False

--clean
Remove files from previous runs, default is False

--overwrite
Overwrite output files when they exist, default is False

--debug
Run extra steps for debugging issues, default is False

--disable_textgrid_cleanup
Disable extra clean up steps on TextGrid output, default is False
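
To make the -s option from the list above more concrete, here is a small illustration of what taking the first N characters of a file name as the speaker means. It mimics the idea only and is not MFA's own code. The function name and the file names are hypothetical.

```shell
# Illustration only: derive a speaker label from the first N characters
# of a file name, which is the idea behind -s / --speaker_characters.
speaker_from_filename() {
    num_chars=$1
    file_name=$(basename "$2")
    printf '%s\n' "$file_name" | cut -c "1-$num_chars"
}

# Hypothetical corpus files named <speaker>_<utterance>.wav
speaker_from_filename 3 /data/xxxx/prepared_for_mfa/s01_utt001.wav   # prints s01
speaker_from_filename 3 /data/xxxx/prepared_for_mfa/s02_utt001.wav   # prints s02
```

With file names like these, passing -s 3 would group the utterances by the s01 and s02 prefixes. Without -s, the directory names are used, as the option description states.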

--config_path: pass a config.yaml file, for example:

beam: 10
retry_beam: 40

features:
  type: "mfcc"
  use_energy: false
  use_pitch: true
  frame_shift: 10

training:
  - monophone:
      subset: 10000
      num_iterations: 50
      max_gaussians: 2000
      boost_silence: 1.25

  - triphone:
      subset: 20000
      num_iterations: 50
      num_leaves: 2000
      max_gaussians: 10000
      cluster_threshold: -1
      boost_silence: 1.25
      power: 0.25

  - lda:
      subset: 20000
      num_leaves: 4000
      max_gaussians: 15000
      num_iterations: 40

  - sat:
      subset: 50000
      num_leaves: 4200
      max_gaussians: 40000
      power: 0.2
      silence_weight: 0.2
      fmllr_update_type: "full"

  - pronunciation_probabilities:
      subset: 50000
      silence_probabilities: true

  - sat:
      subset: 150000
      num_leaves: 5000
      max_gaussians: 100000
      power: 0.2
      silence_weight: 0.20
      fmllr_update_type: "full"

  - pronunciation_probabilities:
      subset: 150000
      silence_probabilities: true
      optional: true # Skipped if the corpus is smaller than the subset

  - sat:
      subset: 0
      quick: true # Performs fewer fMLLR estimations
      num_iterations: 20
      num_leaves: 7000
      max_gaussians: 150000
      power: 0.2
      silence_weight: 0.2
      fmllr_update_type: "full"
      optional: true # Skipped if the corpus is smaller than the previous subset
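
For alignment, a much shorter config should be enough when only the beam values need changing. The sketch below writes such a file and assembles an mfa align call that uses it. It assumes that keys left out of the file fall back to MFA's defaults. The temporary location of config.yaml and the /data/xxxx paths are placeholders.

```shell
# Write a config.yaml that only sets the two beam values from the config above.
# Assumption: keys that are left out keep MFA's default values.
CONFIG_DIR=$(mktemp -d)
CONFIG_PATH="$CONFIG_DIR/config.yaml"
cat > "$CONFIG_PATH" <<'EOF'
beam: 10
retry_beam: 40
EOF

# Same placeholder paths as the align example earlier in this post.
ALIGN_CMD="mfa align /data/xxxx/prepared_for_mfa/ /data/xxxx/lexicon.txt english /data/xxxx/output/ --config_path $CONFIG_PATH"

# Print the command for review before running it.
echo "$ALIGN_CMD"
```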
