kaldi教程_KALDI工具箱运行TIMIT语料库库实例教程

TIMIT数据库介绍:

TIMIT数据库由630个话者组成,每个人讲10句,美式英语的8种主要方言。

TIMIT S5实例:

首先,将TIMIT.ISO中的TIMIT复制到主文件夹。

1.进入对应的目录,进行如下操作:

zhangju@ubuntu :~$ cd kaldi-trunk/egs/timit/s5/

zhangju@ubuntu :~/kaldi-trunk/egs/timit/s5$

sudo local/timit_data_prep.sh /home/zhangju/TIMIT

会看到如下显示:

Creating coretest set.

MDAB0  MWBT0  FELC0  MTAS1  MWEW0  FPAS0  MJMP0  MLNT0  FPKT0  MLLL0  MTLS0  FJLM0  MBPM0  MKLT0  FNLP0  MCMJ0  MJDH0  FMGD0  MGRT0  MNJM0  FDHC0  MJLN0  MPAM0  FMLD0

# of utterances in coretest set = 192

Creating dev set.

FAKS0  FDAC1  FJEM0  MGWT0  MJAR0  MMDB1  MMDM2  MPDF0  FCMH0  FKMS0  MBDG0  MBWM0  MCSH0  FADG0  FDMS0  FEDW0  MGJF0  MGLB0  MRTK0  MTAA0  MTDT0  MTHC0  MWJG0  FNMR0  FREW0  FSEM0  MBNS0  MMJR0  MDLS0  MDLF0  MDVC0  MERS0  FMAH0  FDRW0  MRCS0  MRJM4  FCAL1  MMWH0  FJSJ0  MAJC0  MJSW0  MREB0  FGJD0  FJMG0  MROA0  MTEB0  MJFC0  MRJR0  FMML0  MRWS1

# of utterances in dev set = 400

Finalizing test

Finalizing dev

timit_data_prep succeeded.

于是在/home/zhangju/kaldi-trunk/egs/timit/s5文件夹下新生成data文件夹,其内包含local文件夹以及相关内容。

2 在终端输入:

local/timit_train_lms.sh data/local(下载、计算文本,用以建立语言模型)

local/timit_format_data.sh(处理与fst有关的东西)

3创建train的mfcc:

sudo steps/make_mfcc.sh data/train exp/make_mfcc/train mfccs 4

(要对train,dev,test创建)

会看到:

Succeeded creating MFCC features for train

sudo steps/make_mfcc.sh data/test exp/make_mfcc/test mfccs 4

会看到:

Succeeded creating MFCC features for test

sudo steps/make_mfcc.sh data/dev exp/make_mfcc/dev mfccs 4

会看到:

Succeeded creating MFCC features for dev

4训练单音素系统(monophone systom)

sudo steps/train_mono.sh data/train data/lang exp/mono

会显示:

Computing cepstral mean and variance statistics

Initializing monophone system.

Compiling training graphs

Pass 0

Pass 1

Aligning data

Pass 2

Aligning data

Pass 3

Aligning data

Pass 4

Aligning data

Pass 5

Aligning data

Pass 6

Aligning data

Pass 7

Aligning data

Pass 8

Aligning data

Pass 9

Aligning data

Pass 10

Aligning data

Pass 11

Pass 12

Aligning data

Pass 13

Pass 14

Pass 15

Aligning data

Pass 16

Pass 17

Pass 18

Pass 19

Pass 20

Aligning data

Pass 21

Pass 22

Pass 23

Pass 24

Pass 25

Aligning data

Pass 26

Pass 27

Pass 28

Pass 29

于是,新建了exp/mono文件夹

scripts/mkgraph.sh --mono data/lang exp/mono exp/mono/graph(制图)

会显示:

fsttablecompose data/lang/L.fst data/lang/G.fst

fstdeterminizestar --use-log=true

fstminimizeencoded

fstisstochastic data/lang/tmp/LG.fst

-0.000244359 -0.0912761

warning: LG not stochastic.

fstcomposecontext --context-size=1 --central-position=0 --read-disambig-syms=data/lang/tmp/disambig_phones.list --write-disambig-syms=data/lang/tmp/disambig_ilabels_1_0.list data/lang/tmp/ilabels_1_0

fstisstochastic data/lang/tmp/CLG_1_0.fst

-0.000244359 -0.0912761

warning: CLG not stochastic.

make-h-transducer --disambig-syms-out=exp/mono/graph/disambig_tid.list --transition-scale=1.0 data/lang/tmp/ilabels_1_0 exp/mono/tree exp/mono/final.mdl

fstminimizeencoded

fstdeterminizestar --use-log=true

fsttablecompose exp/mono/graph/Ha.fst data/lang/tmp/CLG_1_0.fst

fstrmsymbols exp/mono/graph/disambig_tid.list

fstrmepslocal

fstisstochastic exp/mono/graph/HCLGa.fst

0.000331581 -0.091291

HCLGa is not stochastic

add-self-loops --self-loop-scale=0.1 --reorder=true exp/mono/final.mdl

5.for test in dev test ; do

steps/decode_deltas.sh exp/mono data/$test data/lang exp/mono/decode_$test &

done(解码test数据集(test是*/s5/data中dev、test文件夹中的test文件夹))

终端输出结果是:[1] 2307

[2] 2308

6.scripts/average_wer.sh exp/mono/decode_*/wer > exp/mono/wer

会显示:

[1]-  完成steps/decode_deltas.sh exp/mono data/$test data/lang exp/mono/decode_$test

[2]+  完成steps/decode_deltas.sh exp/mono data/$test data/lang exp/mono/decode_$test

7从单音素系统中获得alignments:(分别从mono文件夹中的train,dev,test中获得)(用以训练其他系统)

steps/align_deltas.sh data/train data/lang exp/mono exp/mono_ali_train

会显示:

Computing cepstral mean and variance statistics

Aligning all training data

Done.

方法二:修改run.sh中的timit路径,但后直接运行run.sh

TIMIT S3实例

1 数据准备,输入:

local/timit_data_prep.sh  /home/zhangju/TIMIT

终端显示:

Creating coretest set.

MDAB0  MWBT0  FELC0  MTAS1  MWEW0  FPAS0  MJMP0  MLNT0  FPKT0  MLLL0  MTLS0  FJLM0  MBPM0  MKLT0  FNLP0  MCMJ0  MJDH

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值