Hidden Markov Models


1. Definition:

The Hidden Markov Model is a finite set of states, each of which is associated with a (generally multidimensional) probability distribution. Transitions among the states are governed by a set of probabilities called transition probabilities. In a particular state, an outcome or observation can be generated according to the associated probability distribution. Only the outcome, not the state, is visible to an external observer; the states are therefore "hidden" to the outside, hence the name Hidden Markov Model. In order to define an HMM completely, the following elements are needed.

• The number of states of the model, N. We denote the individual states as $S = \{s_1, s_2, \ldots, s_N\}$ and the state at time t as $q_t$.
• The number of observation symbols in the alphabet, M. If the observations are continuous then M is infinite. We denote the alphabet as $V = \{v_1, v_2, \ldots, v_M\}$.
• A set of state transition probabilities $A = \{a_{ij}\}$,

$a_{ij} = P(q_{t+1} = s_j \mid q_t = s_i), \quad 1 \le i, j \le N,$

where $q_t$ denotes the current state.
Transition probabilities should satisfy the normal stochastic constraints,

$a_{ij} \ge 0, \quad 1 \le i, j \le N,$

and

$\sum_{j=1}^{N} a_{ij} = 1, \quad 1 \le i \le N.$

• A probability distribution of the observation symbols in each of the states, $B = \{b_j(k)\}$,

$b_j(k) = P(o_t = v_k \mid q_t = s_j), \quad 1 \le j \le N,\ 1 \le k \le M,$

where $v_k$ denotes the $k$-th observation symbol in the alphabet, and $o_t$ the current observation (parameter vector).
The following stochastic constraints must be satisfied:

$b_j(k) \ge 0, \quad 1 \le j \le N,\ 1 \le k \le M,$

and

$\sum_{k=1}^{M} b_j(k) = 1, \quad 1 \le j \le N.$
If the observations are continuous then we will have to use a continuous probability density function instead of a set of discrete probabilities. In this case we specify the parameters of the probability density function. Usually the probability density is approximated by a weighted sum of M Gaussian distributions,

$b_j(o_t) = \sum_{m=1}^{M} c_{jm} \, \mathcal{N}(o_t;\ \mu_{jm}, \Sigma_{jm}),$

where $c_{jm}$ are the mixture weights, $\mu_{jm}$ the mean vectors and $\Sigma_{jm}$ the covariance matrices. The weights $c_{jm}$ should satisfy the stochastic constraints,

$c_{jm} \ge 0, \quad 1 \le j \le N,\ 1 \le m \le M,$

and

$\sum_{m=1}^{M} c_{jm} = 1, \quad 1 \le j \le N.$
• The initial state distribution, $\pi = \{\pi_i\}$,
where,

$\pi_i = P(q_1 = s_i), \quad 1 \le i \le N.$

Therefore we can use the compact notation

$\lambda = (A, B, \pi)$

to denote an HMM with discrete probability distributions, while

$\lambda = (A, c_{jm}, \mu_{jm}, \Sigma_{jm}, \pi)$

denotes one with continuous densities.
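The discrete elements above can be sketched directly in code. The following is a minimal sketch of $\lambda = (A, B, \pi)$; the 2-state, 3-symbol model and all probability values are invented purely for illustration.

```python
# A minimal sketch of a discrete HMM lambda = (A, B, pi).
# The 2-state, 3-symbol toy model below is an invented example.
N, M = 2, 3  # number of states, number of observation symbols

# A[i][j] = P(q_{t+1} = s_j | q_t = s_i): state transition probabilities
A = [[0.7, 0.3],
     [0.4, 0.6]]

# B[j][k] = P(o_t = v_k | q_t = s_j): observation symbol probabilities
B = [[0.5, 0.4, 0.1],
     [0.1, 0.3, 0.6]]

# pi[i] = P(q_1 = s_i): initial state distribution
pi = [0.6, 0.4]

# The stochastic constraints: every row is non-negative and sums to one.
for row in A + B + [pi]:
    assert all(p >= 0.0 for p in row)
    assert abs(sum(row) - 1.0) < 1e-9
print("lambda = (A, B, pi) satisfies the stochastic constraints")
```

This toy model is reused in the sketches of the forward, backward and Viterbi algorithms further below.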

2. Assumptions:

For the sake of mathematical and computational tractability, the following assumptions are made in the theory of HMMs.

(1) The Markov assumption (the current state depends only on the previous state)
As given in the definition of HMMs, transition probabilities are defined as

$a_{ij} = P(q_{t+1} = s_j \mid q_t = s_i).$

In other words, it is assumed that the next state depends only upon the current state. This is called the Markov assumption, and the resulting model is actually a first-order HMM.
In general, however, the next state may depend on the past k states, and it is possible to obtain such a model, called a k-th order HMM, by defining the transition probabilities as follows:

$a_{i_1 i_2 \cdots i_k j} = P(q_{t+1} = s_j \mid q_t = s_{i_1}, q_{t-1} = s_{i_2}, \ldots, q_{t-k+1} = s_{i_k}), \quad 1 \le i_1, i_2, \ldots, i_k, j \le N.$

But a higher-order HMM has a higher complexity. Even though first-order HMMs are the most common, some attempts have been made to use higher-order HMMs too.

(2) The stationarity assumption (state transitions are independent of time)
Here it is assumed that the state transition probabilities are independent of the actual time at which the transitions take place. Mathematically,

$P(q_{t_1+1} = s_j \mid q_{t_1} = s_i) = P(q_{t_2+1} = s_j \mid q_{t_2} = s_i),$

for any $t_1$ and $t_2$.

(3) The output independence assumption (the current observation is independent of previous observations, so the observation sequence factorizes into independent per-step terms)
This is the assumption that the current output (observation) is statistically independent of the previous outputs (observations). We can formulate this assumption mathematically by considering a sequence of observations

$O = o_1, o_2, \ldots, o_T.$

Then, according to the assumption, for an HMM $\lambda$,

$P(O \mid q_1, q_2, \ldots, q_T, \lambda) = \prod_{t=1}^{T} P(o_t \mid q_t, \lambda).$

However, unlike the other two, this assumption has only a very limited validity. In some cases it is not realistic and therefore becomes a severe weakness of HMMs.

3. The Three Problems:

Once we have an HMM, there are three problems of interest.

(1) The Evaluation Problem (compute the probability of a given observation sequence under the model)
Given an HMM $\lambda$ and a sequence of observations $O = o_1, o_2, \ldots, o_T$, what is the probability that the observations are generated by the model, $P(O \mid \lambda)$?
(2) The Decoding Problem (find the hidden state sequence most likely to have produced the observed sequence)
Given a model $\lambda$ and a sequence of observations $O$, what is the most likely state sequence in the model that produced the observations?
(3) The Learning Problem (adjust the model so as to maximize the probability of the observed sequence)
Given a model $\lambda$ and a sequence of observations $O$, how should we adjust the model parameters in order to maximize $P(O \mid \lambda)$?

The evaluation problem can be used for isolated (word) recognition. The decoding problem is related to continuous recognition as well as to segmentation. The learning problem must be solved if we want to train an HMM for subsequent use in recognition tasks.

4. The Evaluation Problem: estimating the probability of an observation sequence

We have a model $\lambda = (A, B, \pi)$ and a sequence of observations $O = o_1, o_2, \ldots, o_T$, and $P(O \mid \lambda)$ must be found. We can calculate this quantity using simple probabilistic arguments, but that calculation involves a number of operations on the order of $2T \cdot N^T$. This is very large even if the length of the sequence, T, is moderate. Therefore we have to look for another method for this calculation. Fortunately there exists one with considerably lower complexity, which makes use of an auxiliary variable called the forward variable $\alpha_t(i)$.

The forward variable $\alpha_t(i)$ is defined as the probability of the partial observation sequence $o_1, o_2, \ldots, o_t$ when it terminates at the state $s_i$. Mathematically,

$\alpha_t(i) = P(o_1, o_2, \ldots, o_t, q_t = s_i \mid \lambda).$

Then it is easy to see that the following recursive relationship holds:

$\alpha_{t+1}(j) = b_j(o_{t+1}) \sum_{i=1}^{N} \alpha_t(i)\, a_{ij}, \quad 1 \le j \le N,\ 1 \le t \le T-1,$

where,

$\alpha_1(j) = \pi_j\, b_j(o_1), \quad 1 \le j \le N.$

Using this recursion we can calculate

$\alpha_T(i), \quad 1 \le i \le N,$

and then the required probability is given by

$P(O \mid \lambda) = \sum_{i=1}^{N} \alpha_T(i).$
The complexity of this method, known as the forward algorithm, is proportional to $N^2 T$, which is linear with respect to T, whereas the direct calculation mentioned earlier had an exponential complexity.
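The forward recursion above can be sketched in a few lines of Python. The 2-state toy model and the observation sequence below are invented for illustration only.

```python
# Sketch of the forward algorithm for P(O | lambda) on an invented toy model.
A  = [[0.7, 0.3], [0.4, 0.6]]            # transition probabilities a_ij
B  = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]  # emission probabilities b_j(k)
pi = [0.6, 0.4]                          # initial state distribution
O  = [0, 2, 1]                           # observation sequence (symbol indices)

def forward(A, B, pi, O):
    N, T = len(pi), len(O)
    # Initialization: alpha_1(j) = pi_j * b_j(o_1)
    alpha = [[pi[j] * B[j][O[0]] for j in range(N)]]
    # Recursion: alpha_{t+1}(j) = b_j(o_{t+1}) * sum_i alpha_t(i) * a_ij
    for t in range(1, T):
        alpha.append([B[j][O[t]] * sum(alpha[t-1][i] * A[i][j] for i in range(N))
                      for j in range(N)])
    # Termination: P(O | lambda) = sum_i alpha_T(i)
    return sum(alpha[-1]), alpha

prob, alpha = forward(A, B, pi, O)
print(prob)  # about N^2 * T multiplications instead of ~2T * N^T
```

For long sequences the products underflow; practical implementations work in log space or rescale $\alpha_t$ at each step.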

In a similar way we can define the backward variable $\beta_t(i)$ as the probability of the partial observation sequence $o_{t+1}, o_{t+2}, \ldots, o_T$, given that the current state is $s_i$. Mathematically,

$\beta_t(i) = P(o_{t+1}, o_{t+2}, \ldots, o_T \mid q_t = s_i, \lambda).$

As in the case of $\alpha_t(i)$, there is a recursive relationship which can be used to calculate $\beta_t(i)$ efficiently:

$\beta_t(i) = \sum_{j=1}^{N} \beta_{t+1}(j)\, a_{ij}\, b_j(o_{t+1}), \quad 1 \le i \le N,\ 1 \le t \le T-1,$

where,

$\beta_T(i) = 1, \quad 1 \le i \le N.$

Further, we can see that

$P(O \mid \lambda) = \sum_{i=1}^{N} \alpha_t(i)\, \beta_t(i), \quad 1 \le t \le T. \qquad (1.7)$

Therefore this gives another way to calculate $P(O \mid \lambda)$, by using both forward and backward variables, as given in Eqn. 1.7.

Eqn. 1.7 is very useful, especially in deriving the formulas required for gradient-based training.
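The backward recursion, and the fact that combining it with the forward variable yields the same $P(O \mid \lambda)$ at every time step, can be checked numerically. The sketch below reuses the same invented toy model as before.

```python
# Sketch of the backward variable and a numerical check of Eqn. 1.7,
# on the same invented toy model used for the forward algorithm.
A  = [[0.7, 0.3], [0.4, 0.6]]
B  = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]
pi = [0.6, 0.4]
O  = [0, 2, 1]
N, T = len(pi), len(O)

# Forward pass: alpha_t(i) = P(o_1..o_t, q_t = s_i | lambda)
alpha = [[pi[j] * B[j][O[0]] for j in range(N)]]
for t in range(1, T):
    alpha.append([B[j][O[t]] * sum(alpha[t-1][i] * A[i][j] for i in range(N))
                  for j in range(N)])

# Backward pass: beta_T(i) = 1, then
# beta_t(i) = sum_j a_ij * b_j(o_{t+1}) * beta_{t+1}(j)
beta = [[1.0] * N]
for t in range(T - 2, -1, -1):
    beta.insert(0, [sum(A[i][j] * B[j][O[t+1]] * beta[0][j] for j in range(N))
                    for i in range(N)])

# Eqn. 1.7: P(O | lambda) = sum_i alpha_t(i) * beta_t(i), for every t.
probs = [sum(alpha[t][i] * beta[t][i] for i in range(N)) for t in range(T)]
print(probs)  # all T values agree
```

That all T values coincide is exactly the identity of Eqn. 1.7; it is also a handy sanity check when implementing both passes.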

5. The Decoding Problem:

In this case we want to find the most likely state sequence for a given sequence of observations $O = o_1, o_2, \ldots, o_T$ and a model $\lambda = (A, B, \pi)$.

The solution to this problem depends upon the way the "most likely state sequence" is defined. One approach is to find the most likely state $q_t$ at each time t and to concatenate all such $q_t$'s. But sometimes this method does not give a physically meaningful state sequence. Therefore we use another method which has no such problems.
In this method, commonly known as the Viterbi algorithm, the whole state sequence with the maximum likelihood is found. In order to facilitate the computation we define an auxiliary variable,

$\delta_t(i) = \max_{q_1, q_2, \ldots, q_{t-1}} P(q_1, q_2, \ldots, q_{t-1}, q_t = s_i, o_1, o_2, \ldots, o_t \mid \lambda),$

which gives the highest probability that the partial observation sequence and state sequence up to time t can have, when the current state is $s_i$.

It is easy to observe that the following recursive relationship holds:

$\delta_{t+1}(j) = b_j(o_{t+1}) \max_{1 \le i \le N} \left[ \delta_t(i)\, a_{ij} \right], \quad 1 \le j \le N,\ 1 \le t \le T-1, \qquad (1.8)$

where,

$\delta_1(j) = \pi_j\, b_j(o_1), \quad 1 \le j \le N.$

So the procedure for finding the most likely state sequence starts from the calculation of $\delta_T(j),\ 1 \le j \le N$, using the recursion in 1.8, while always keeping a pointer to the "winning state" in the maximum-finding operation. Finally the state $q_T^*$ is found, where

$q_T^* = \arg\max_{1 \le j \le N} \delta_T(j),$

and starting from this state, the sequence of states is back-tracked as the pointer in each state indicates. This gives the required set of states.
This whole algorithm can be interpreted as a search in a graph whose nodes are formed by the states of the HMM at each time instant $t,\ 1 \le t \le T$.
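The Viterbi procedure above can be sketched as follows, again on the invented toy model: `delta` tracks the best partial-path probability and `psi` stores the back-pointers to each "winning state".

```python
# Sketch of the Viterbi algorithm on an invented 2-state toy model.
A  = [[0.7, 0.3], [0.4, 0.6]]
B  = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]
pi = [0.6, 0.4]
O  = [0, 2, 1]

def viterbi(A, B, pi, O):
    N, T = len(pi), len(O)
    # Initialization: delta_1(j) = pi_j * b_j(o_1)
    delta = [pi[j] * B[j][O[0]] for j in range(N)]
    psi = []  # back-pointers to the winning state at each step
    # Recursion (Eqn. 1.8): delta_{t+1}(j) = b_j(o_{t+1}) * max_i delta_t(i) a_ij
    for t in range(1, T):
        new_delta, ptr = [], []
        for j in range(N):
            best_i = max(range(N), key=lambda i: delta[i] * A[i][j])
            new_delta.append(delta[best_i] * A[best_i][j] * B[j][O[t]])
            ptr.append(best_i)
        delta, psi = new_delta, psi + [ptr]
    # Termination: pick q_T* = argmax_j delta_T(j), then back-track
    # along the stored pointers to recover the whole state sequence.
    q = [max(range(N), key=lambda i: delta[i])]
    for ptr in reversed(psi):
        q.insert(0, ptr[q[0]])
    return q, max(delta)

path, p_star = viterbi(A, B, pi, O)
print(path, p_star)
```

The structure mirrors the forward algorithm, with the sum over predecessors replaced by a max plus a stored pointer; real implementations usually work with log probabilities to avoid underflow.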

6. The Learning Problem
