【math】Hidden Markov Model: An Overview

Introduction to Hidden Markov Model

Introduction

  • Markov chains were first introduced in 1906 by Andrey Markov
  • HMM was developed by L. E. Baum and coworkers in the 1960s
  • An HMM is the simplest dynamic Bayesian network and a directed graphical model
  • Applications: speech recognition, PageRank (Google), DNA analysis, …

Markov chain

A Markov chain is “a stochastic model describing a sequence of possible events in which the probability of each event depends only on the state attained in the previous event”.

Both the state space and the time index can be either discrete ($X_t$: $t = 0, 1, 2, \dots$) or continuous ($X_t$: $t \ge 0$). We will focus on Markov chains in discrete space and time.

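Example of a Markov chain:

  • the transition matrix $Q$, whose entry $q_{ij}$ is the probability of moving from state $i$ to state $j$ in one step
  • the 5-step transition matrix is the fifth matrix power $Q^5$; in general, the $n$-step transition matrix is $Q^n$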
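As a minimal sketch of the $n$-step idea, the snippet below raises a transition matrix to the fifth power in plain Python. The two-state matrix `Q` and the helper names `mat_mul` and `n_step_matrix` are illustrative assumptions, not the values from the original example.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def n_step_matrix(q, n):
    """Return Q^n, the n-step transition matrix, by repeated multiplication."""
    size = len(q)
    # Start from the identity matrix (the 0-step transition matrix).
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = mat_mul(result, q)
    return result


# Hypothetical two-state chain; each row sums to 1.
Q = [[0.9, 0.1],
     [0.5, 0.5]]

Q5 = n_step_matrix(Q, 5)
for row in Q5:
    print([round(p, 4) for p in row])
```

Entry `Q5[i][j]` is the probability of being in state `j` five steps after starting in state `i`, so every row of `Q5` still sums to 1.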

Hidden Markov Model(HMM)

HMM is a statistical Markov model in which the system being modeled is assumed to be a Markov process with unobserved (i.e. hidden) states.

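State sequence: $x = (x_1, x_2, x_3)$;
Observation sequence: $y = (y_1, y_2, y_3)$;
Transition matrix: $A = (a_{ij})$, where $a_{ij}$ is the probability of moving from state $i$ to state $j$;
Emission matrix: $B = (b_{ij})$, which gives the probability of each observation in each state;
Initial state distribution: $\pi = (\pi_i)$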

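To make the roles of $\pi$, $A$, and $B$ concrete, here is a small sketch that stores a model as Python dictionaries and computes the joint probability of one hidden path and one observation sequence, $P(x, y \mid \lambda) = \pi_{x_1} b_{x_1}(y_1) \prod_{t \ge 2} a_{x_{t-1} x_t} b_{x_t}(y_t)$. The numbers are assumptions borrowed from the widely used Healthy/Fever illustration (its observations Normal, Cold, Dizzy also appear later in this post); they are not read from the original figures, and the function name `joint_probability` is hypothetical.

```python
# Assumed parameters (Healthy/Fever illustration), not taken from the original figures.
states = ("Healthy", "Fever")
start_p = {"Healthy": 0.6, "Fever": 0.4}  # pi
trans_p = {  # A: trans_p[i][j] = P(next state j | current state i)
    "Healthy": {"Healthy": 0.7, "Fever": 0.3},
    "Fever": {"Healthy": 0.4, "Fever": 0.6},
}
emit_p = {  # B: emit_p[i][o] = P(observation o | state i)
    "Healthy": {"Normal": 0.5, "Cold": 0.4, "Dizzy": 0.1},
    "Fever": {"Normal": 0.1, "Cold": 0.3, "Dizzy": 0.6},
}


def joint_probability(x, y):
    """P(x, y | lambda) for one hidden path x and one observation sequence y."""
    p = start_p[x[0]] * emit_p[x[0]][y[0]]
    for t in range(1, len(y)):
        p *= trans_p[x[t - 1]][x[t]] * emit_p[x[t]][y[t]]
    return p


y = ("Normal", "Cold", "Dizzy")
x = ("Healthy", "Healthy", "Fever")
print(joint_probability(x, y))
```

In a real HMM the path `x` is not observed, which is why the three questions below either sum over all paths, search for the best path, or fit the parameters.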

Three Questions

  • Given the model $\lambda = [A, B, \pi]$, how do we calculate the probability of producing the observation $\boldsymbol{y} = \{y_1, y_2, \dots, y_n \mid y_i \in O\}$? In other words, how do we evaluate how well the model matches the observation?
  • Given the model $\lambda = [A, B, \pi]$ and the observation $\boldsymbol{y} = \{y_1, y_2, \dots, y_n \mid y_i \in O\}$, how do we find the most probable state sequence $\boldsymbol{x} = \{x_1, x_2, \dots, x_n \mid x_i \in S\}$? In other words, how do we infer the hidden states from the observation?
  • Given the observation $\boldsymbol{y} = \{y_1, y_2, \dots, y_n \mid y_i \in O\}$, how do we adjust the model parameters $\lambda = [A, B, \pi]$ to maximize the probability $P(\boldsymbol{y} \mid \lambda)$? In other words, how do we train the model to describe the observation more accurately?

Q1: evaluation problem – Forward algorithm

Q1: How do we evaluate how well the model matches the observation? (forward algorithm)

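  • Initialization, the probability of observing $y_1$ while being in state $i$: $\alpha_1(i) = \pi_i \, b_i(y_1)$, where $b_i(y_t)$ is the probability that state $i$ emits $y_t$
  • Recursion, the probability of observing $y_1, \dots, y_{t+1}$ and being in state $j$ ($t \ge 1$): $\alpha_{t+1}(j) = b_j(y_{t+1}) \sum_i \alpha_t(i) \, a_{ij}$
  • Termination: $P(\boldsymbol{y} \mid \lambda) = \sum_i \alpha_n(i)$ (if the model has an explicit end state $0$, each term is additionally multiplied by $a_{i0}$)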
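The forward recursion can be sketched in a few lines of Python. The function name `forward` and the Healthy/Fever parameter values are assumptions used only for illustration, and the sketch assumes a model without an explicit end state.

```python
def forward(obs, states, start_p, trans_p, emit_p):
    """Return P(obs | model) using the forward recursion."""
    # Initialization: alpha_1(i) = pi_i * b_i(y_1)
    alpha = {s: start_p[s] * emit_p[s][obs[0]] for s in states}
    # Recursion: alpha_{t+1}(j) = b_j(y_{t+1}) * sum_i alpha_t(i) * a_ij
    for o in obs[1:]:
        alpha = {
            j: emit_p[j][o] * sum(alpha[i] * trans_p[i][j] for i in states)
            for j in states
        }
    # Termination: sum over the states at the final step
    return sum(alpha.values())


# Assumed illustration parameters, not taken from the original figures.
states = ("Healthy", "Fever")
start_p = {"Healthy": 0.6, "Fever": 0.4}
trans_p = {"Healthy": {"Healthy": 0.7, "Fever": 0.3},
           "Fever": {"Healthy": 0.4, "Fever": 0.6}}
emit_p = {"Healthy": {"Normal": 0.5, "Cold": 0.4, "Dizzy": 0.1},
          "Fever": {"Normal": 0.1, "Cold": 0.3, "Dizzy": 0.6}}

likelihood = forward(("Normal", "Cold", "Dizzy"), states, start_p, trans_p, emit_p)
print(likelihood)
```

With these assumed values the likelihood is about 0.0363. The recursion costs on the order of $n K^2$ operations for $K$ states, instead of summing over all $K^n$ hidden paths. For long sequences the products underflow, so practical implementations rescale `alpha` or work with logarithms.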

Q2: decoding problem – Viterbi algorithm

Q2: How do we infer the hidden states from the observation? (Viterbi algorithm)

Inputs: the observation $\boldsymbol{y} = (y_1, y_2, \dots, y_T)$, the initial probabilities $\boldsymbol{\pi} = (\pi_1, \pi_2, \dots, \pi_K)$, the transition matrix $A$, and the emission matrix $B$.
The Viterbi algorithm finds the optimal (most probable) hidden state path by dynamic programming with backtracking: at each step it keeps, for every state, the best path that ends in that state, and at the end it follows the stored choices backwards to recover the optimal path.

Example:

  • Given the observation $\boldsymbol{y} = (\text{Normal}, \text{Cold}, \text{Dizzy})$ and the model $\lambda = [A, B, \pi]$, what is the hidden state sequence $\boldsymbol{x} = (x_1, x_2, x_3)$?
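A minimal Viterbi sketch for this kind of example is shown below. The hidden states Healthy and Fever and all numeric parameters are assumptions taken from the common textbook illustration with these three observations; they may differ from the values in the original figures. The function name `viterbi` is likewise hypothetical.

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return (most probable state path, its joint probability with obs)."""
    # delta[s]: probability of the best path that ends in state s at the current step
    delta = {s: start_p[s] * emit_p[s][obs[0]] for s in states}
    backpointers = []
    for o in obs[1:]:
        prev, delta, step = delta, {}, {}
        for j in states:
            # Best predecessor of state j
            best_i = max(states, key=lambda i: prev[i] * trans_p[i][j])
            delta[j] = prev[best_i] * trans_p[best_i][j] * emit_p[j][o]
            step[j] = best_i
        backpointers.append(step)
    # Backtracking: start from the best final state and follow the stored choices
    last = max(states, key=lambda s: delta[s])
    path = [last]
    for step in reversed(backpointers):
        path.append(step[path[-1]])
    path.reverse()
    return path, delta[last]


# Assumed illustration parameters, not taken from the original figures.
states = ("Healthy", "Fever")
start_p = {"Healthy": 0.6, "Fever": 0.4}
trans_p = {"Healthy": {"Healthy": 0.7, "Fever": 0.3},
           "Fever": {"Healthy": 0.4, "Fever": 0.6}}
emit_p = {"Healthy": {"Normal": 0.5, "Cold": 0.4, "Dizzy": 0.1},
          "Fever": {"Normal": 0.1, "Cold": 0.3, "Dizzy": 0.6}}

path, prob = viterbi(("Normal", "Cold", "Dizzy"), states, start_p, trans_p, emit_p)
print(path, prob)
```

Under these assumed parameters the decoded path is Healthy, Healthy, Fever, with a joint probability of about 0.01512. The only change from the forward recursion is that the sum over predecessor states is replaced by a maximum, plus the bookkeeping needed for backtracking.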

Q3: learning problem – Baum-Welch algorithm

Q3: How do we train the model to describe the observation more accurately? (Baum-Welch algorithm)

The Baum–Welch algorithm uses the well-known EM algorithm to find the maximum likelihood estimate of the parameters of a hidden Markov model given a set of observed feature vectors.
It relies on the forward-backward procedure and approximates the optimal parameters iteratively: no iteration decreases the likelihood, but the result may be a local rather than a global optimum.


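  • (1) Complete-data log-likelihood, where $Y$ is the observation sequence and $I$ is the hidden state sequence:
    $\log P(Y, I \mid \lambda)$
  • (2) Expectation step of the EM algorithm, where $\hat{\lambda}$ is the current parameter estimate:
    $Q(\lambda, \hat{\lambda}) = \sum_I \log P(Y, I \mid \lambda) \, P(Y, I \mid \hat{\lambda})$
  • (3) Maximization step of the EM algorithm:
    $\max_{\lambda} Q(\lambda, \hat{\lambda})$

To solve the maximization, use the method of Lagrange multipliers: add the constraints that $\pi$ and every row of $A$ and $B$ sum to 1, then set the partial derivatives of the Lagrangian to zero.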
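Solving the Lagrangian conditions gives the standard re-estimation formulas: expected counts from the forward and backward variables, divided by their totals. The sketch below performs one such iteration for a single discrete observation sequence. The function names, the integer encoding of observations, the starting parameters, and the observation sequence are all assumptions for illustration; there is no scaling, so it is only suitable for short sequences.

```python
def forward_backward(obs, n, pi, A, B):
    """Return forward table alpha, backward table beta, and P(obs | model)."""
    T = len(obs)
    alpha = [[0.0] * n for _ in range(T)]
    beta = [[1.0] * n for _ in range(T)]
    for i in range(n):
        alpha[0][i] = pi[i] * B[i][obs[0]]
    for t in range(1, T):
        for j in range(n):
            alpha[t][j] = B[j][obs[t]] * sum(alpha[t - 1][i] * A[i][j] for i in range(n))
    for t in range(T - 2, -1, -1):
        for i in range(n):
            beta[t][i] = sum(A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] for j in range(n))
    return alpha, beta, sum(alpha[T - 1])


def baum_welch_step(obs, pi, A, B):
    """One EM iteration; returns updated (pi, A, B) and the likelihood before the update."""
    n, m, T = len(pi), len(B[0]), len(obs)
    alpha, beta, likelihood = forward_backward(obs, n, pi, A, B)
    # E-step: gamma[t][i] = P(state i at t | obs), xi[t][i][j] = P(i at t, j at t+1 | obs)
    gamma = [[alpha[t][i] * beta[t][i] / likelihood for i in range(n)] for t in range(T)]
    xi = [[[alpha[t][i] * A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] / likelihood
            for j in range(n)] for i in range(n)] for t in range(T - 1)]
    # M-step: expected counts divided by expected totals
    new_pi = gamma[0][:]
    new_A = [[sum(xi[t][i][j] for t in range(T - 1)) /
              sum(gamma[t][i] for t in range(T - 1))
              for j in range(n)] for i in range(n)]
    new_B = [[sum(gamma[t][i] for t in range(T) if obs[t] == k) /
              sum(gamma[t][i] for t in range(T))
              for k in range(m)] for i in range(n)]
    return new_pi, new_A, new_B, likelihood


# Hypothetical data: observations encoded as 0 = Normal, 1 = Cold, 2 = Dizzy.
obs = [0, 1, 2, 0, 0, 1, 2, 2]
pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]
for step in range(5):
    pi, A, B, likelihood = baum_welch_step(obs, pi, A, B)
    print(step, likelihood)  # likelihood of the parameters before this update
```

The printed likelihoods should not decrease from one iteration to the next, which is the usual sanity check for an EM implementation.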

Application

CpG island:

  • In the human genome, wherever the dinucleotide CG (written CpG) occurs, the C nucleotide is typically chemically modified by methylation.
  • Methylation is suppressed in short stretches of the genome, such as around the promoters or 'start' regions of many genes; these stretches, where CpG occurs more often than elsewhere, are called CpG islands.
  • A CpG island is typically a few hundred to a few thousand bases long.

Three questions about CpG islands:

  • Given a model that distinguishes CpG islands from the rest of the genome, how do we calculate the probability of an observation sequence?
  • Given a short stretch of genomic sequence, how would we decide whether it comes from a CpG island or not?
  • Given a long piece of sequence, how would we find the CpG islands in it?
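The third question is a decoding problem, so it can be sketched with Viterbi in log space. The toy model below is a deliberate simplification and an assumption of this sketch: it has only two hidden states, `I` (island) and `B` (background), that differ in base composition, whereas realistic CpG models track dinucleotide transitions with more states. All probabilities and the test sequence are made up for illustration.

```python
import math


def viterbi_log(seq, states, start_p, trans_p, emit_p):
    """Label each base with its most probable hidden state (log-space Viterbi)."""
    score = {s: math.log(start_p[s]) + math.log(emit_p[s][seq[0]]) for s in states}
    back = []
    for ch in seq[1:]:
        prev, score, step = score, {}, {}
        for j in states:
            best_i = max(states, key=lambda i: prev[i] + math.log(trans_p[i][j]))
            score[j] = prev[best_i] + math.log(trans_p[best_i][j]) + math.log(emit_p[j][ch])
            step[j] = best_i
        back.append(step)
    # Backtrack from the best final state
    last = max(states, key=lambda s: score[s])
    path = [last]
    for step in reversed(back):
        path.append(step[path[-1]])
    return "".join(reversed(path))


# Hypothetical toy parameters: islands are C/G-rich, background is A/T-rich.
states = ("I", "B")
start_p = {"I": 0.5, "B": 0.5}
trans_p = {"I": {"I": 0.9, "B": 0.1},
           "B": {"I": 0.1, "B": 0.9}}
emit_p = {"I": {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1},
          "B": {"A": 0.4, "C": 0.1, "G": 0.1, "T": 0.4}}

sequence = "AAAAACGCGCGAAAAA"
labels = viterbi_log(sequence, states, start_p, trans_p, emit_p)
print(sequence)
print(labels)
```

Working with sums of logarithms instead of products of probabilities avoids numerical underflow, which matters for genome-length sequences. Runs of `I` in the label string mark the candidate islands.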

To be continued…
