Speech Recognition: Getting Started with HTK (5): HCompV Summary

 

Updated October 14, 2020

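The previous posts explained in detail how the HCompV tool initializes a global set of HMM models. Newcomers may find it all rather confusing, and so did I. So here is a summary, an attempt to get the whole picture clear in my mind. The focus is on the key data structures and functions involved, which is also the main way of thinking in a procedural language like C.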

First, I think the most important data structure is struct _HMMSet:

/* ---------------------- HMM Sets ----------------------------- */

typedef struct _HMMSet{
   MemHeap *hmem;          /* memory heap for this HMM Set */   
   Boolean *firstElem;     /* first element added to hmem during MakeHMMSet*/
   char *hmmSetId;         /* identifier for the hmm set */
   MILink mmfNames;        /* List of external file names */
   int numLogHMM;          /* Num of logical HMM's */
   int numPhyHMM;          /* Num of distinct physical HMM's */
   int numFiles;           /* total number of ext files */
   int numMacros;          /* num macros used in this set */
   MLink * mtab;           /* Array[0..MACHASHSIZE-1]OF MLink */
   PtrMap ** pmap;         /* Array[0..PTRHASHSIZE-1]OF PtrMap* */
   Boolean allowTMods;     /* true if HMMs can have Tee Models */
   Boolean optSet;         /* true if global options have been set */
   short vecSize;          /* dimension of observation vectors */
   short swidth[SMAX];     /* [0]=num streams,[i]=width of stream i */
   ParmKind pkind;         /* kind of obs vector components */
   DurKind dkind;          /* kind of duration model (model or state) */
   CovKind ckind;          /* cov kind - only global in V1.X */
   HSetKind hsKind;        /* kind of HMM set */
   TMixRec tmRecs[SMAX];   /* array[1..S]of tied mixture record */
   int numStates;          /* Number of states in HMMSet */
   int numSharedStates;    /* Number of shared states in HMMSet */
   int numMix;             /* Number of mixture components in HMMSet */
   int numSharedMix;       /* Number of shared mixtures in HMMSet */
   int numTransP;          /* Number of distinct transition matrices */
   int ckUsage[NUMCKIND];  /* Number of components using given ckind */
   InputXForm *xf;         /* Input transform of HMMSet */
   AdaptXForm *semiTied;   /* SemiTied transform associated with model set */
   short projSize;         /* dimension of vector to update */

   /* Adaptation information accumulates */
   Boolean attRegAccs;   /* have the set of accumulates been attached */
   Boolean attXFormInfo; /* have the set of adapt info been attached */
   Boolean attMInfo;     /* have the set of adapt info been attached */
   AdaptXForm *curXForm;
   AdaptXForm *parentXForm;
   
   /* Added to support LogWgts */
   Boolean logWt;       /* Component weights are stored as Logs */

   /* Added to support delayed loading of the semi-tied transform */
   char *semiTiedMacro;  /* macroname of semi-tied transform */

} HMMSet;

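It serves as a map of the whole HMM model set: how many logical HMMs and physical HMMs the set contains, how many states they have, what the dimension of the observation vectors is, and so on. It also holds two hash tables, mtab and pmap, which store the HMMDef and macro information. One confusing aspect of this structure is the notion of a data stream: the macro #define SMAX 5 is described as the number of data streams, but so far I have not figured out what that really means. Debugging the program shows that there is currently only one data stream, so wherever an array is sized by SMAX, only the entry for the first stream carries meaning. Going by the comments in the structure, that entry sits at index 1, because swidth[0] stores the number of streams and the other arrays are indexed 1..S.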
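To make the indexing convention concrete, here is a minimal sketch based only on the comment attached to swidth ([0]=num streams, [i]=width of stream i). The helper names total_width and set_single_stream are hypothetical and are not HTK functions, and the bound check against SMAX is my own reading of how the arrays are sized.

```c
#include <assert.h>

#define SMAX 5   /* same value as the macro quoted in the text */

/* Hypothetical helper: add up the widths of all streams.
   swidth[0] holds the stream count S, swidth[1..S] hold the widths. */
static int total_width(const short swidth[SMAX])
{
   int s, total = 0;

   /* assumption: valid stream indices are 1..SMAX-1 */
   assert(swidth[0] >= 1 && swidth[0] < SMAX);
   for (s = 1; s <= swidth[0]; s++)
      total += swidth[s];
   return total;
}

/* Hypothetical helper: fill swidth for the single-stream case */
static void set_single_stream(short swidth[SMAX], short vecSize)
{
   int s;

   swidth[0] = 1;          /* exactly one data stream */
   swidth[1] = vecSize;    /* it spans the whole observation vector */
   for (s = 2; s < SMAX; s++)
      swidth[s] = 0;       /* remaining slots are unused */
}
```

With a single stream, the loop `for (s=1; s<=swidth[0]; s++)` runs exactly once, which matches what the debugger shows: only slot 1 of each stream-indexed array is ever touched.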

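For example, HMMSet has an important member, TMixRec tmRecs[SMAX];   /* array[1..S]of tied mixture record */. It is an array of length SMAX whose elements are TMixRec objects, but only the entry for the first stream is actually meaningful.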

typedef struct {        /* A Tied Mixture "Codebook" */
   LabId mixId;          /* id of macro base name */
   short nMix;           /* num mixtures M in set */
   short topM;           /* num TMProbs actually used */
   MixPDF ** mixes;      /* array[1..M] of MixPDF */
   LogFloat maxP;        /* max log mixture prob */
   TMProb *probs;        /* array[1..M] of TMProb */
} TMixRec;

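When an HMM models each state's emission probability with a mixture of Gaussians, nMix is the number of Gaussians and mixes is an array of pointers to the Gaussian probability densities.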

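The main job of the CreateHMMSet function is to initialize the HMMSet object hset. Looking at its body, it is really very simple: it just assigns initial values to the members of hset.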

/* EXPORT->CreateHMMSet: create the basic HMMSet structure */
void CreateHMMSet(HMMSet *hset, MemHeap *heap, Boolean allowTMods)
{
   int s;

   /* set default values in hset structure */
   hset->hmem = heap;
   hset->hmmSetId = NULL;
   hset->mmfNames = NULL; hset->numFiles = 0;
   hset->allowTMods = allowTMods;  hset->optSet = FALSE;
   hset->vecSize = 0; hset->swidth[0] = 0;
   hset->dkind = NULLD; hset->ckind = NULLC; hset->pkind = 0;
   hset->numPhyHMM = hset->numLogHMM = hset->numMacros = 0;
   hset->xf = NULL; hset->logWt = FALSE;
   for (s=1; s<SMAX; s++) {
      hset->tmRecs[s].nMix = 0; hset->tmRecs[s].mixId = NULL;
      hset->tmRecs[s].probs = NULL; hset->tmRecs[s].mixes = NULL;
   }
   /* initialise the hash tables */
   hset->mtab = (MLink *)MakeHashTab(hset,MACHASHSIZE);
   hset->pmap = NULL;
   /* initialise adaptation information */
   hset->attRegAccs = FALSE;
   hset->attXFormInfo = FALSE;
   hset->attMInfo = FALSE;
   hset->curXForm = NULL;
   hset->parentXForm = NULL;
   hset->semiTiedMacro = NULL;
   hset->semiTied = NULL;
   hset->projSize = 0;
}

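Continuing down the main function of HCompV, what follows is command-line argument handling: interpreting options such as -f and -m, setting global switches, and so on.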

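After the HMMSet has been initialized it is only an empty shell; its actual parameters must be configured from the model description file. Reading that file and assigning the parameters is the job of the following function.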

/* Initialise: load HMMs and create accumulators */
void Initialise(void)
{
   int s,V;
   Boolean eSep;
   char base[MAXSTRLEN];
   char path[MAXSTRLEN];
   char ext[MAXSTRLEN];

   /* Load HMM defs */     
   if(MakeOneHMM(&hset,BaseOf(hmmfn,base))<SUCCESS)
      HError(2028,"Initialise: MakeOneHMM failed");
   if(LoadHMMSet(&hset,PathOf(hmmfn,path),ExtnOf(hmmfn,ext))<SUCCESS)
      HError(2028,"Initialise: LoadHMMSet failed");
   SetParmHMMSet(&hset);
   if (hset.hsKind==DISCRETEHS || hset.hsKind==TIEDHS)
      HError(2030,"Initialise: HCompV only uses continuous models");

   /* Create a heap to store the input data */
   CreateHeap(&iStack,"InBuf", MSTAK, 1, 0.5, 100000, LONG_MAX);
   
   /* Get a pointer to the physical HMM */
   hmmId = GetLabId(base,FALSE);
   macroLink = FindMacroName(&hset,'h',hmmId);
   if (macroLink==NULL)
      HError(2020,"Initialise: cannot find hmm %s in hset",hmmfn);
   hmmLink = (HLink)macroLink->structure;

   /* Find out for which streams full covariance is needed */
   CheckVarianceKind( );

   /* Create accumulators for the mean and variance */
   for (s=1;s<=hset.swidth[0]; s++){
      V = hset.swidth[s];
      accs[s].meanSum=CreateVector(&gstack,V);
      ZeroVector(accs[s].meanSum);
      if (fullcNeeded[s]) {
         accs[s].squareSum.inv=CreateSTriMat(&gstack,V);
         accs[s].fixed.inv=CreateSTriMat(&gstack,V);
         ZeroTriMat(accs[s].squareSum.inv);
      }
      else {
         accs[s].squareSum.var=CreateSVector(&gstack,V);
         accs[s].fixed.var=CreateSVector(&gstack,V);
         ZeroVector(accs[s].squareSum.var);
      }
   }

   /* Create an object to hold the input parameters */
   SetStreamWidths(hset.pkind,hset.vecSize,hset.swidth,&eSep);
   obs=MakeObservation(&gstack,hset.swidth,hset.pkind,FALSE,eSep);
   if(segLab != NULL) {
      segId = GetLabId(segLab,TRUE);
   }

   if (trace&T_TOP) {
      printf("Calculating Fixed Variance\n");
      printf("  HMM Prototype: %s\n",hmmfn);
      printf("  Segment Label: %s\n",(segLab==NULL)?"None":segLab);
      printf("  Num Streams  : %d\n",hset.swidth[0]);
      printf("  UpdatingMeans: %s\n",(meanUpdate)?"Yes":"No");
      printf("  Target Direct: %s\n",(outDir==NULL)?"Current":outDir);     
   }
}
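The meanSum and squareSum accumulators created in Initialise suggest the usual running-sums approach to a global mean and variance. The sketch below illustrates that idea for the diagonal-covariance case only. The DiagAcc type and the acc_* functions are hypothetical, and the final formula (biased variance from the sums-of-squares identity) is an assumption on my part, not code quoted from HCompV.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical accumulator, loosely modelled on accs[s] in Initialise */
typedef struct {
   size_t dim;          /* vector dimension V */
   long   count;        /* number of frames accumulated so far */
   double *meanSum;     /* per-dimension sum of x[i] */
   double *squareSum;   /* per-dimension sum of x[i]*x[i] */
} DiagAcc;

/* Zero the sums, as ZeroVector does for the real accumulators */
static void acc_reset(DiagAcc *a)
{
   size_t i;

   a->count = 0;
   for (i = 0; i < a->dim; i++)
      a->meanSum[i] = a->squareSum[i] = 0.0;
}

/* Add one observation vector x of length a->dim */
static void acc_add(DiagAcc *a, const double *x)
{
   size_t i;

   for (i = 0; i < a->dim; i++) {
      a->meanSum[i]   += x[i];
      a->squareSum[i] += x[i] * x[i];
   }
   a->count++;
}

/* Turn the sums into mean and variance: var = E[x^2] - (E[x])^2 */
static void acc_finish(const DiagAcc *a, double *mean, double *var)
{
   size_t i;

   assert(a->count > 0);
   for (i = 0; i < a->dim; i++) {
      mean[i] = a->meanSum[i] / a->count;
      var[i]  = a->squareSum[i] / a->count - mean[i] * mean[i];
   }
}
```

The caller owns all buffers here, so the sketch needs no heap allocation; in HTK the corresponding vectors come from gstack via CreateVector and CreateSVector, as the listing above shows.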

In the call MakeOneHMM(&hset,BaseOf(hmmfn,base)), the argument hmmfn is the model description file name obtained earlier from command-line processing. Here its value is "proto".

Now let's look at how the proto model gets added to hset. In fact, MakeOneHMM simply calls the InitHMMSet function.

/* InitHMMSet: Init a HMM set by reading the HMM list in fname.
               If isSingle, then fname is the name of a single HMM */
static ReturnStatus InitHMMSet(HMMSet *hset, char *fname, Boolean isSingle)
{
   Source src;
   char buf[MAXSTRLEN];
   LabId lId, pId;
    
   /* sets first element on heap to allow disposing of memory */
   hset->firstElem = (Boolean *) New(hset->hmem, sizeof(Boolean));

   if (isSingle){
      /* fname is a single HMM file to load as a singleton set */
      lId = GetLabId(fname,TRUE);
      pId = lId;
      if(CreateHMM(hset,lId,pId)<SUCCESS){
         HRError(7060,"InitHMMSet: Error in CreateHMM", hset->numLogHMM);
         return(FAIL);   
      }
   } 
   else {
      /* read the HMM list file and build the logical list */
      ......
      ......
   }
   return(SUCCESS);
}

Since isSingle is TRUE, the code that actually runs is very short; in essence it just calls CreateHMM.

/* CreateHMM: create logical macro. If pId is unknown, create macro for
              it too, along with HMMDef.  */
static ReturnStatus CreateHMM(HMMSet *hset, LabId lId, LabId pId)
{
   MLink m;
   HLink hmm;
   Boolean newMacro=FALSE; /* for memory clear up*/

   m = FindMacroName(hset,'l',lId);
   if (m != NULL){
      HRError(7036,"CreateHMM: multiple use of logical HMM name %s",lId->name);
      return(FAIL);
   }

   m = FindMacroName(hset,'h',pId);
   if (m == NULL) {  /* need to create new phys macro and HMMDef */
      hmm = (HLink)New(hset->hmem,sizeof(HMMDef));
      hmm->owner = hset; hmm->nUse = 1; hmm->hook = NULL;
      hmm->numStates = 0;  /* indicates HMMDef not yet defined */
      if((m=NewMacro(hset,0,'h',pId,hmm))==NULL){
         HRError(7091,"CreateHMM: NewMacro (Physical) failed"); /*will never happen*/
         return(FAIL);
      } 
      newMacro=TRUE;
      if (pId != lId) ++hmm->nUse;
   } 
   else {
      if (m->type != 'h'){
         HRError(7091,"CreateHMM: %s is not a physical HMM",pId->name);
         return(FAIL);
      }
      hmm = (HLink)m->structure;
      ++hmm->nUse;
   }
   if(NewMacro(hset,0,'l',lId,hmm)==NULL){ 
      HRError(7091,"CreateHMM: NewMacro (Logical) failed"); /*will never happen*/
      return(FAIL);
   }    
   return(SUCCESS);
}
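The relationship between logical names, physical names and the nUse counter in CreateHMM can be sketched with a toy table. Everything below (ToySet, ToyHMM, ToyMacro, the toy_* functions, the fixed capacity and the linear search) is hypothetical and only mirrors the branching of CreateHMM; it compares names with strcmp, whereas the real code compares LabId pointers.

```c
#include <assert.h>
#include <string.h>

#define MAX_ENTRIES 16   /* illustrative fixed capacity */

typedef struct { int nUse; } ToyHMM;   /* stands in for HMMDef */
typedef struct { char type; const char *name; ToyHMM *hmm; } ToyMacro;

typedef struct {
   ToyMacro macros[MAX_ENTRIES];
   ToyHMM   hmms[MAX_ENTRIES];
   int numMacros, numHMMs;
} ToySet;

/* Linear search for a macro with the given type and name */
static ToyMacro *toy_find(ToySet *set, char type, const char *name)
{
   int i;

   for (i = 0; i < set->numMacros; i++)
      if (set->macros[i].type == type &&
          strcmp(set->macros[i].name, name) == 0)
         return &set->macros[i];
   return NULL;
}

/* Append a macro entry; the caller has already checked for duplicates */
static ToyMacro *toy_add(ToySet *set, char type, const char *name, ToyHMM *hmm)
{
   ToyMacro *m;

   assert(set->numMacros < MAX_ENTRIES);
   m = &set->macros[set->numMacros++];
   m->type = type; m->name = name; m->hmm = hmm;
   return m;
}

/* Same branching as CreateHMM: returns 0 on success,
   -1 if the logical name is already in use */
static int toy_create_hmm(ToySet *set, const char *lName, const char *pName)
{
   ToyMacro *m;
   ToyHMM *hmm;

   if (toy_find(set, 'l', lName) != NULL)
      return -1;
   m = toy_find(set, 'h', pName);
   if (m == NULL) {               /* unknown physical name: create it */
      assert(set->numHMMs < MAX_ENTRIES);
      hmm = &set->hmms[set->numHMMs++];
      hmm->nUse = 1;
      toy_add(set, 'h', pName, hmm);
      if (strcmp(lName, pName) != 0) ++hmm->nUse;
   } else {                       /* known physical name: share it */
      hmm = m->hmm;
      ++hmm->nUse;
   }
   toy_add(set, 'l', lName, hmm); /* the logical entry points at the same HMM */
   return 0;
}
```

For the single-model case discussed here, calling toy_create_hmm(&set, "proto", "proto") on an empty set creates one 'h' entry and one 'l' entry that share a single HMM record with nUse equal to 1.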

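The calls FindMacroName(hset,'l',lId) and FindMacroName(hset,'h',pId) look up the hash table in the HMMSet structure mentioned above to check whether the model already exists. Because hset has only been initialized so far and no model has been added, m == NULL and the following code runs: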

   m = FindMacroName(hset,'h',pId);
   if (m == NULL) {  /* need to create new phys macro and HMMDef */
      hmm = (HLink)New(hset->hmem,sizeof(HMMDef));
      hmm->owner = hset; hmm->nUse = 1; hmm->hook = NULL;
      hmm->numStates = 0;  /* indicates HMMDef not yet defined */
      if((m=NewMacro(hset,0,'h',pId,hmm))==NULL){
         HRError(7091,"CreateHMM: NewMacro (Physical) failed"); /*will never happen*/
         return(FAIL);
      } 
      newMacro=TRUE;
      if (pId != lId) ++hmm->nUse;
   } 

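pId is of type LabId, which is a pointer to a NameCell. A NameCell mainly holds the name, and each entry in the mtab of HMMSet refers to one through its id field (see m->id = id in NewMacro below). Here the names of both pId and lId are "proto". This part of the code adds a model of type 'h' (physical HMM) to hset: it first allocates and initializes an HMMDef object, then adds it to the mtab hash table of hset (this step is done inside the NewMacro function).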

/* EXPORT-> NewMacro: append a macro with given values to list */
MLink NewMacro(HMMSet *hset, short fidx, char type, LabId id, Ptr structure)
{
   unsigned int hashval;
   MLink m;
   PtrMap *p;

   m = FindMacroName(hset,type,id);
   if (m != NULL){
     /* 
	Create special exception for allowing multiple definitions of
	the same macro. These are due to flexibility in xform code.
	Exceptional macros are:
	'a' adaptation transform definition
	'b' baseclass definition
	'r' regression class tree definition
	All these macros have an associated fname field that states
	the filename that they were loaded from. Note that the memory
	from the structure is not freed.
     */
     switch (type) { 
     case 'a':
       if (!(strcmp(((AdaptXForm *)m->structure)->fname,((AdaptXForm *)structure)->fname))) {
	 HRError(7036,"Duplicate copy of ~a macro %s loaded from %s",id->name,((AdaptXForm *)m->structure)->fname);
	 return m;
       }
       break;
     case 'b':
       if (!(strcmp(((BaseClass *)m->structure)->fname,((BaseClass *)structure)->fname))) {
	 HRError(7036,"Duplicate copy of ~b macro %s loaded from %s",id->name,((BaseClass *)m->structure)->fname);
	 return m;
       } else {
	 HRError(7036,"WARNING: Duplicate copies of ~b macro %s loaded from %s and %s",
                 id->name,((BaseClass *)m->structure)->fname,((BaseClass *)m->structure)->fname);
	 return m;
       }
       break;
     case 'r':
       if (!(strcmp(((RegTree *)m->structure)->fname,((RegTree *)structure)->fname))) {
	 HRError(7036,"Duplicate copy of ~r macro %s loaded from %s",id->name,((RegTree *)m->structure)->fname);
	 return m;
       }
       break;
     default:
       break;
     }
     HError(7036,"NewMacro: macro or model name %s already exists",id->name);
   }

   m = (MLink)New(hset->hmem,sizeof(MacroDef));
   if (type == 'h')
      ++hset->numPhyHMM;
   else if (type == 'l'){
      ++hset->numLogHMM;
   }
   else
      ++hset->numMacros;
   hashval = Hash(id->name);
   m->type = type; m->fidx = fidx;
   m->id = id;
   m->structure = structure;
   m->next = hset->mtab[hashval]; hset->mtab[hashval] = m;
   if (hset->pmap != NULL) {
      hashval = (unsigned int)m->structure % PTRHASHSIZE;
      p = (PtrMap *)New(hset->hmem,sizeof(PtrMap));
      p->ptr = m->structure; p->m = m; 
      p->next = hset->pmap[hashval]; hset->pmap[hashval] = p;
   }
   return m;
}
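The core of NewMacro, once the duplicate check has passed, is a head insertion into a chained hash table: m->next = hset->mtab[hashval]; hset->mtab[hashval] = m;. The sketch below shows that pattern in isolation. Node, Table, toy_hash, table_find and table_insert are hypothetical names, TAB_SIZE is an arbitrary small value rather than MACHASHSIZE, and the hash function is a simple stand-in because HTK's Hash() is not shown in this post.

```c
#include <assert.h>
#include <string.h>

#define TAB_SIZE 7   /* small illustrative size, not HTK's MACHASHSIZE */

typedef struct Node {
   char type;            /* macro type, e.g. 'h' or 'l' */
   const char *name;     /* macro name */
   void *structure;      /* payload the macro refers to */
   struct Node *next;    /* next node in the same bucket */
} Node;

typedef struct { Node *tab[TAB_SIZE]; int count; } Table;

/* Stand-in string hash; only the name is hashed, as in NewMacro */
static unsigned int toy_hash(const char *name)
{
   unsigned int h = 0;

   while (*name != '\0')
      h = h * 31u + (unsigned char)*name++;
   return h % TAB_SIZE;
}

/* Walk one bucket looking for a node with matching type and name */
static Node *table_find(const Table *t, char type, const char *name)
{
   Node *n;

   for (n = t->tab[toy_hash(name)]; n != NULL; n = n->next)
      if (n->type == type && strcmp(n->name, name) == 0)
         return n;
   return NULL;
}

/* Link a caller-owned node at the head of its bucket.
   Returns NULL if a node with the same type and name already exists. */
static Node *table_insert(Table *t, Node *n)
{
   unsigned int h;

   if (table_find(t, n->type, n->name) != NULL)
      return NULL;
   h = toy_hash(n->name);
   n->next = t->tab[h];   /* the new node points at the old head */
   t->tab[h] = n;         /* and becomes the new head */
   t->count++;
   return n;
}
```

Because the key is the pair (type, name) while the bucket is chosen from the name alone, the 'h' and 'l' macros for "proto" land in the same bucket and coexist, with the most recently inserted one at the head of the chain.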

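By now the overall logic should be fairly clear.

1) The main function calls CreateHMMSet to initialize hset, an HMM set object.

2) An hmm object is initialized and added to hset; the model's parameters are then loaded, mainly in the Initialise function.

     Initialise--->MakeOneHMM--->InitHMMSet--->CreateHMM--->(build the hmm, NewMacro) add the model.

3) Then the other hmm models are loaded.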

      

HCompV

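Analyzing the HCompV source code gave me a chance to briefly sort out HTK's main data structures and its core functions for model initialization and loading. Because HMMs are inherently complex, and because the developers made HTK's data structures rather intricate for the sake of efficiency, gaining a complete yet detailed understanding of HTK will take time, and there will surely be many more summaries along the way. The goal of this first stage is to train an experimental phone-dialing system: once it can recognize the predefined spoken sentences, the task is done. So I will not get bogged down in code details, otherwise I might not even finish this initial task.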
