|
Real-world processes generally produce observable outputs which can be characterized as signals.
|
The signals can be discrete in nature (e.g., characters from a finite alphabet, quantized vectors from a codebook, etc.), or continuous in nature (e.g., speech samples, temperature measurements, music, etc.).
|
The signal source can be stationary (i.e., its statistical properties do not vary with time), or nonstationary (i.e., the signal properties vary over time).
|
The signals can be pure (i.e., coming strictly from a single source), or can be corrupted by other signal sources (e.g., noise) or by transmission distortions, reverberation, etc.
|
A problem of fundamental interest is characterizing such real-world signals in terms of signal models.
|
There are several reasons why one is interested in applying signal models.
|
First of all, a signal model can provide the basis for a theoretical description of a signal processing system which can be used to process the signal so as to provide a desired output.
|
For example, if we are interested in enhancing a speech signal corrupted by noise and transmission distortion, we can use the signal model to design a system which will optimally remove the noise and undo the transmission distortion.
|
A second reason why signal models are important is that they are potentially capable of letting us learn a great deal about the signal source (i.e., the real-world process which produced the signal) without having to have the source available.
|
This property is especially important when the cost of getting signals from the actual source is high.
|
In this case, with a good signal model, we can simulate the source and learn as much as possible via simulations.
|
Finally, the most important reason why signal models are important is that they often work extremely well in practice, and enable us to realize important practical systems (e.g., prediction systems, recognition systems, identification systems, etc.) in a very efficient manner.
|
There are several possible choices for what type of signal model is used for characterizing the properties of a given signal.
|
Broadly, one can dichotomize the types of signal models into the class of deterministic models, and the class of statistical models.
|
Deterministic models generally exploit some known specific properties of the signal, e.g., that the signal is a sine wave, or a sum of exponentials, etc.
|
In these cases, specification of the signal model is generally straightforward; all that is required is to determine (estimate) values of the parameters of the signal model (e.g., amplitude, frequency, and phase of a sine wave, amplitudes and rates of exponentials, etc.).
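To make this concrete, here is a minimal sketch (not from the paper; the sampling rate, signal values, and noise level are assumed for illustration) of estimating the parameters of such a deterministic model: the amplitude, frequency, and phase of a noisy sine wave, using a coarse FFT peak for the frequency and linear least squares for the amplitude and phase.

```python
import numpy as np

rng = np.random.default_rng(0)
fs = 1000.0                              # assumed sampling rate (Hz)
t = np.arange(0, 1.0, 1.0 / fs)
# Hypothetical observed signal: a 50 Hz sine plus additive noise.
x = 1.5 * np.sin(2 * np.pi * 50.0 * t + 0.7) + 0.3 * rng.standard_normal(t.size)

# Step 1: coarse frequency estimate from the location of the FFT magnitude peak.
spectrum = np.fft.rfft(x)
freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)
f_hat = freqs[np.argmax(np.abs(spectrum))]

# Step 2: with the frequency fixed, amplitude and phase follow from linear
# least squares on the basis {sin(2*pi*f*t), cos(2*pi*f*t)}, since
# A*sin(w*t + phi) = A*cos(phi)*sin(w*t) + A*sin(phi)*cos(w*t).
basis = np.column_stack([np.sin(2 * np.pi * f_hat * t),
                         np.cos(2 * np.pi * f_hat * t)])
(a, b), *_ = np.linalg.lstsq(basis, x, rcond=None)
amp_hat, phase_hat = np.hypot(a, b), np.arctan2(b, a)

print(f"frequency ~ {f_hat:.1f} Hz, amplitude ~ {amp_hat:.2f}, phase ~ {phase_hat:.2f} rad")
```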
|
|
The second broad class of signal models is the set of statistical models, in which one tries to characterize only the statistical properties of the signal.
|
Examples of such statistical models include Gaussian processes, Poisson processes, Markov processes, and hidden Markov processes, among others.
|
The underlying assumption of the statistical model is that the signal can be well characterized as a parametric random process, and that the parameters of the stochastic process can be determined (estimated) in a precise, well-defined manner.
|
For the applications of interest, namely speech processing, both deterministic and stochastic signal models have had good success.
|
In this paper we will concern ourselves strictly with one type of stochastic signal model, namely the hidden Markov model (HMM). These models are referred to as Markov sources or probabilistic functions of Markov chains in the communications literature.
|
We will first review the theory of Markov chains and then extend the ideas to the class of hidden Markov models using several simple examples.
|
We will then focus our attention on the three fundamental problems for HMM design, namely: the evaluation of the probability (or likelihood) of a sequence of observations given a specific HMM; the determination of a best sequence of model states; and the adjustment of model parameters so as to best account for the observed signal.
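As a preview of the first of these problems, here is a minimal sketch, under assumed notation and with a small hypothetical two-state model, of how the likelihood of an observation sequence can be evaluated with the forward recursion that the paper develops in its later sections.

```python
import numpy as np

def forward_likelihood(pi, A, B, obs):
    """P(O | model) for a discrete-observation HMM.

    pi  : (N,)   initial state probabilities
    A   : (N, N) state transition probabilities a_ij
    B   : (N, M) observation symbol probabilities b_j(k)
    obs : list of observation symbol indices
    """
    alpha = pi * B[:, obs[0]]            # initialization
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]    # induction step
    return alpha.sum()                   # termination

# Tiny hypothetical 2-state, 2-symbol model for illustration only.
pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3],
              [0.4, 0.6]])
B = np.array([[0.9, 0.1],
              [0.2, 0.8]])
print(forward_likelihood(pi, A, B, [0, 1, 0]))
```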
|
We will show that once these three fundamental problems are solved, we can apply HMMs to selected problems in speech recognition.
|
The second reason was that the original applications of the theory to speech processing did not provide sufficient tutorial material for most readers to understand the theory and to be able to apply it to their own research.
|
As a result, several tutorial papers were written which provided a sufficient level of detail for a number of research labs to begin work using HMMs in individual speech processing applications.
|
The paper combines results from a number of original sources and hopefully provides a single source for acquiring the background required to pursue further this fascinating area of research.
|
The organization of this paper is as follows.
|
In Section II we review the theory of discrete Markov chains and show how the concept of hidden states, where the observation is a probabilistic function of the state, can be used effectively.
|
We illustrate the theory with two simple examples, namely coin tossing, and the classic balls-in-urns system.
|
In Section IV we discuss the various types of HMMs that have been studied, including ergodic as well as left-right models.
|
We also discuss the state duration density and the optimization criterion for choosing optimal HMM parameter values.
|
In Section V we discuss the issues that arise in implementing HMMs, including the topics of scaling, initial parameter estimates, model size, model form, missing data, and multiple observation sequences.
|
In Section VI we describe an isolated word speech recognizer, implemented with HMM ideas, and show how it performs as compared to alternative implementations.
|
In Section VII we extend the ideas presented in Section VI to the problem of recognizing a string of spoken words based on concatenating individual HMMs of each word in the vocabulary.
|
In Section VIII we briefly outline how the ideas of HMMs have been applied to a large vocabulary speech recognizer, and in Section IX we summarize the ideas discussed throughout the paper.
|
Discrete Markov Processes
|
Consider a system which may be described at any time as being in one of a set of N distinct states, S_1, S_2, ..., S_N, as illustrated in Fig. 1 (where N = 5 for simplicity).
|
At regularly spaced discrete times, the system undergoes a change of state (possibly back to the same state) according to a set of probabilities associated with the state.
|
We denote the time instants associated with state changes as t = 1, 2, ..., and we denote the actual state at time t as q_t.
|
A full probabilistic description of the above system would, in general, require specification of the current state (at time t), as well as all the predecessor states.
Question: why is this expression used? (p. 2)
For the special case of a discrete, first order, Markov chain, this probabilistic description is truncated to just the current and the predecessor state, i.e.,

P[q_t = S_j | q_{t-1} = S_i, q_{t-2} = S_k, ...] = P[q_t = S_j | q_{t-1} = S_i]
Question: I did not understand this sentence. (p. 2)
Furthermore, we only consider those processes in which the right-hand side of this equation is independent of time, thereby leading to the set of state transition probabilities a_ij of the form

a_ij = P[q_t = S_j | q_{t-1} = S_i],   1 <= i, j <= N

with the state transition coefficients having the properties

a_ij >= 0,   sum_{j=1}^{N} a_ij = 1

since they obey standard stochastic constraints.
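As an illustration of these constraints, the following is a small sketch (with a hypothetical two-state transition matrix, not taken from the paper) that checks the stochastic constraints and generates a state sequence from such a chain.

```python
import numpy as np

# Hypothetical 2-state transition matrix; a_ij = P[q_t = S_j | q_{t-1} = S_i].
A = np.array([[0.9, 0.1],
              [0.5, 0.5]])

# Standard stochastic constraints: non-negative entries, each row summing to 1.
assert np.all(A >= 0) and np.allclose(A.sum(axis=1), 1.0)

def sample_states(A, q1, T, rng):
    """Generate a state sequence q_1, ..., q_T (0-indexed states) from the chain."""
    states = [q1]
    for _ in range(T - 1):
        states.append(int(rng.choice(A.shape[0], p=A[states[-1]])))
    return states

print(sample_states(A, q1=0, T=10, rng=np.random.default_rng(0)))
```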
|
The above stochastic process could be called an observable Markov model, since the output of the process is the set of states at each instant of time, where each state corresponds to a physical (observable) event.
|
|
To set ideas, consider a simple 3-state Markov model of the weather.
|
We assume that once a day (e.g., at noon), the weather is observed as being one of the following: State 1: rain or snow; State 2: cloudy; State 3: sunny.
|
We postulate that the weather on day t is characterized by a single one of the three states above, and that the matrix A of state transition probabilities is

A = {a_ij} = | 0.4  0.3  0.3 |
             | 0.2  0.6  0.2 |
             | 0.1  0.1  0.8 |
|
Question: why is the formula like this?
Given that the weather on day 1 (t = 1) is sunny (state 3), we can ask the question: what is the probability (according to the model) that the weather for the next 7 days will be "sun-sun-rain-rain-sun-cloudy-sun"? Stated more formally, we define the observation sequence O as

O = {S3, S3, S3, S1, S1, S3, S2, S3}

corresponding to t = 1, 2, ..., 8, and we wish to determine the probability of O, given the model. This probability can be expressed (and evaluated) as

P(O | Model) = P[S3, S3, S3, S1, S1, S3, S2, S3 | Model]
             = P[S3] . P[S3|S3] . P[S3|S3] . P[S1|S3] . P[S1|S1] . P[S3|S1] . P[S2|S3] . P[S3|S2]
             = pi_3 . a_33 . a_33 . a_31 . a_11 . a_13 . a_32 . a_23
             = 1 . (0.8)(0.8)(0.1)(0.4)(0.3)(0.1)(0.2)
             = 1.536 x 10^-4

where we use the notation pi_i = P[q_1 = S_i], 1 <= i <= N, to denote the initial state probabilities.
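The same computation, written as a short sketch using the transition matrix A given above (state indices are 0-based in the code):

```python
import numpy as np

A = np.array([[0.4, 0.3, 0.3],   # state 1: rain or snow
              [0.2, 0.6, 0.2],   # state 2: cloudy
              [0.1, 0.1, 0.8]])  # state 3: sunny

# O = {S3, S3, S3, S1, S1, S3, S2, S3}, written with 0-based indices.
O = [2, 2, 2, 0, 0, 2, 1, 2]

prob = 1.0                        # pi_3 = P[q_1 = S3] = 1, since day 1 is known to be sunny
for prev, curr in zip(O[:-1], O[1:]):
    prob *= A[prev, curr]         # multiply in each transition probability

print(prob)                       # approximately 1.536e-04
```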