Week 7-3: HMM 1

Markov model

  • sequence of random variables that are not independent
    • weather report
    • text

Properties

  • limited horizon
    • $P(X_{t+1}=s_k \mid X_1,\dots,X_t) = P(X_{t+1}=s_k \mid X_t)$ (first order)
  • time invariant (stationary)
    • $P(X_{t+1}=s_k \mid X_t=s_j)$ is the same for every $t$, i.e. equal to $P(X_2=s_k \mid X_1=s_j)$

Visible MM

$P(X_1,\dots,X_T) = P(X_1)\,P(X_2 \mid X_1)\,P(X_3 \mid X_1,X_2)\cdots P(X_T \mid X_1,\dots,X_{T-1}) = P(X_1)\,P(X_2 \mid X_1)\,P(X_3 \mid X_2)\cdots P(X_T \mid X_{T-1})$, where the second equality uses the limited-horizon (Markov) property.
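
A quick numeric illustration of this factorization; the weather-style states and probabilities below are made-up assumptions for illustration, not values from the lecture:

```python
# First-order Markov chain: P(X1..XT) = P(X1) * prod_t P(X_t | X_{t-1}).
# The states and numbers here are illustrative assumptions.
initial = {"rain": 0.4, "sun": 0.6}             # P(X1)
transition = {                                  # P(X_t | X_{t-1})
    "rain": {"rain": 0.7, "sun": 0.3},
    "sun":  {"rain": 0.2, "sun": 0.8},
}

def sequence_prob(states):
    p = initial[states[0]]
    for prev, cur in zip(states, states[1:]):
        p *= transition[prev][cur]
    return p

print(sequence_prob(["rain", "rain", "sun"]))   # 0.4 * 0.7 * 0.3 = 0.084
```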

Hidden MM

  • Motivation
    • observing a sequence of symbols
    • the sequence of states that led to the generation of the symbols is hidden
  • Definition
    • Q = sequence of states
    • O = sequence of observations, drawn from a vocabulary
    • $q_0, q_f$ = special (start, final) states
    • A = state transition probabilities
    • B = symbol emission probabilities
    • Π = initial state probabilities
    • $\mu = (A, B, \Pi)$ = complete probabilistic model

HMMs are used to model state sequences and their associated observation sequences; a concrete representation is sketched below.
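
One way to store $\mu = (A, B, \Pi)$ in code is as plain dictionaries; this sketch uses the values from the worked example further down (emission entries for symbols not used in the example are omitted):

```python
# mu = (A, B, Pi) as dictionaries, using the two-state example below.
Pi = {"A": 1.0, "B": 0.0}                 # initial state probabilities
A = {"A": {"A": 0.8, "B": 0.2},           # transition probabilities P(next | current)
     "B": {"A": 0.6, "B": 0.4}}
B = {"A": {"y": 0.2, "z": 0.1},           # emission probabilities P(symbol | state)
     "B": {"y": 0.5, "z": 0.2}}           # (entries for other symbols omitted)
```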

Generative algorithm

  • pick a start state from Π
  • For t = 1…T:
    • move to another state based on A
    • emit an observation based on B (a sampling sketch follows this list)
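
A minimal sampling sketch of this procedure, reusing the Pi, A, B dictionaries above; note that, to match the worked example below, the state drawn from Π emits the first symbol before the first transition:

```python
import random

def weighted_choice(dist):
    """Draw one key from a {outcome: probability} dictionary."""
    outcomes = list(dist)
    return random.choices(outcomes, weights=[dist[o] for o in outcomes])[0]

def generate(Pi, A, B, T):
    """Sample T observations from the HMM mu = (A, B, Pi)."""
    state = weighted_choice(Pi)                         # pick a start state from Pi
    observations = []
    for _ in range(T):
        observations.append(weighted_choice(B[state]))  # emit based on B
        state = weighted_choice(A[state])               # move based on A
    return observations

print(generate(Pi, A, B, 5))  # e.g. ['y', 'z', 'y', 'y', 'z'] (only listed symbols can appear)
```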

Example

State transition probabilities

(figure: state transition diagram; the values are listed under “Initial” and “Transition” below)

Emission probabilities

(figure: emission probability table; the entries used in the example are P(y|A) = 0.2, P(z|A) = 0.1, P(y|B) = 0.5, P(z|B) = 0.2)

  • Initial

    P(A|start) = 1.0, P(B|start) = 0.0

  • Transition

    P(A|A) = 0.8, P(B|A) = 0.2, P(A|B) = 0.6, P(B|B) = 0.4

  • Emission: see previous table

Suppose we observe the sequence “yz”.

  • Possible sequences of states:
    • AA
    • AB
    • BA
    • BB

P(yz) = P(yz, AA) + P(yz, AB) + P(yz, BA) + P(yz, BB)
      = 1.0×0.2×0.8×0.1 + 1.0×0.2×0.2×0.2 + 0.0×0.5×0.6×0.1 + 0.0×0.5×0.4×0.2
      = 0.016 + 0.008 + 0 + 0
      = 0.024

(each term is initial × emission × transition × emission; the paths starting in B contribute nothing because P(B|start) = 0)

In this way we can compute the likelihood of any observation sequence by summing over all state paths, as in the sketch below.
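
The same sum, written as a brute-force enumeration over all state paths (reusing the Pi, A, B dictionaries from the sketch above):

```python
from itertools import product

def likelihood(obs):
    """P(obs | mu) by summing over every possible state path."""
    total = 0.0
    for path in product("AB", repeat=len(obs)):          # AA, AB, BA, BB for length 2
        p = Pi[path[0]] * B[path[0]][obs[0]]             # initial * first emission
        for prev, cur, o in zip(path, path[1:], obs[1:]):
            p *= A[prev][cur] * B[cur][o]                # transition * emission
        total += p
    return total

print(likelihood("yz"))   # 0.016 + 0.008 + 0 + 0 = 0.024
```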

States and transitions

  • states:

    • the states encode the most recent history
    • the transitions encode likely sequences of states
    • use MLE to estimate the transition probabilities
  • emissions:

    • the emission probabilities are estimated in the same way (see the sketch below)
    • standard smoothing and heuristic methods can be applied
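
A count-and-normalize sketch of these MLE estimates; `tagged_sentences` is a hypothetical corpus of (word, tag) pair lists, not data from the lecture, and edge effects at sentence boundaries are ignored:

```python
from collections import Counter

def mle_estimates(tagged_sentences):
    """MLE transition and emission probabilities from a tagged corpus."""
    tag_bigrams, tag_counts, emission_counts = Counter(), Counter(), Counter()
    for sent in tagged_sentences:
        tags = [t for _, t in sent]
        tag_counts.update(tags)
        tag_bigrams.update(zip(tags, tags[1:]))
        emission_counts.update((t, w) for w, t in sent)
    # P(t2 | t1) ~ count(t1 t2) / count(t1); P(w | t) = count(w, t) / count(t)
    A = {bg: c / tag_counts[bg[0]] for bg, c in tag_bigrams.items()}
    B = {tw: c / tag_counts[tw[0]] for tw, c in emission_counts.items()}
    return A, B

A_hat, B_hat = mle_estimates([[("the", "DET"), ("dog", "NOUN"), ("runs", "VERB")]])
```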

Sequence of observations

  • observers can only see the emitted symbols
  • observation likelihood
    • given the observation sequence S and the model μ, what is the probability P(S|μ) that the sequence was generated by that model (an efficient computation is sketched below)
  • viewed this way, an HMM can serve as a language model over the emitted symbols
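
The brute-force sum above grows exponentially in the sequence length; the standard dynamic-programming fix is the forward algorithm, sketched here with the Pi, A, B dictionaries from earlier:

```python
def forward_likelihood(Pi, A, B, obs):
    """P(obs | mu) via the forward algorithm; alpha[s] = P(o_1..o_t, X_t = s)."""
    alpha = {s: Pi[s] * B[s].get(obs[0], 0.0) for s in Pi}
    for o in obs[1:]:
        alpha = {s: sum(alpha[r] * A[r][s] for r in alpha) * B[s].get(o, 0.0)
                 for s in alpha}
    return sum(alpha.values())

print(forward_likelihood(Pi, A, B, "yz"))   # 0.024, matching the enumeration above
```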

Tasks with HMM

  • tasks
    • Given $\mu = (A, B, \Pi)$, compute the likelihood $P(O \mid \mu)$
    • Given $O$ and $\mu$, find the most likely state sequence $(X_1, X_2, \dots, X_T)$
    • Given $O$ and the space of all possible models $\mu$, find the $\mu$ that best explains $O$
  • decoding
    • tag each token with a label

Inference

  • find the most likely tag, given the word
    • $\hat{t} = \arg\max_t P(t \mid w)$
  • given the model $\mu$, we can find the best sequence of tags $\{t_i\}_{i=1}^n$ for a given sequence of words $\{w_i\}_{i=1}^n$
  • but there are exponentially many candidate tag sequences ($k^n$ for $k$ tags), so naive enumeration is infeasible

Viterbi algorithm

  • Find the best path up to observation $i$ that ends in state $s$ (a partial best path); extending these partial paths one observation at a time yields the best path for the whole sentence.

    • dynamic programming
    • memoization
    • backpointers
  • initial state: $\delta_1(s) = \Pi(s)\,P(o_1 \mid s)$

  • this gives the score of the best path to each state at $t = 1$, i.e. after the first observation

  • say we want to calculate $P(B, t=2)$ (calculating $P(A, t=2)$ is similar): $\delta_2(B) = \max_{s'} \delta_1(s')\,P(B \mid s')\,P(o_2 \mid B)$

  • recording a backpointer to the maximizing predecessor at each step lets us recover the best path and the best sequence of states

  • finally, following the backpointers from the best final state gives the best sequence of states for all observations (see the sketch below)
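
A sketch of the full procedure with explicit backpointers, in the same dictionary representation as the earlier examples:

```python
def viterbi(Pi, A, B, obs):
    """Most likely state sequence for obs; delta[s] = best path score ending in s."""
    delta = {s: Pi[s] * B[s].get(obs[0], 0.0) for s in Pi}
    backpointers = []
    for o in obs[1:]:
        new_delta, ptr = {}, {}
        for s in delta:
            best_prev = max(delta, key=lambda r: delta[r] * A[r][s])
            ptr[s] = best_prev
            new_delta[s] = delta[best_prev] * A[best_prev][s] * B[s].get(o, 0.0)
        delta = new_delta
        backpointers.append(ptr)
    best = max(delta, key=delta.get)                 # best final state
    path = [best]
    for ptr in reversed(backpointers):               # follow backpointers
        path.append(ptr[path[-1]])
    return list(reversed(path)), delta[best]

print(viterbi(Pi, A, B, "yz"))   # (['A', 'A'], 0.016): AA is the best path for "yz"
```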