Viterbi Algorithm

US_579

Category: python    Tags: machine learning

Article link: https://blog.csdn.net/US_579/article/details/89486462

The code is available at the link below:

https://github.com/US579/DataMining-Project

How to run

Dataset used in this example:

State_File ='./toy_example/State_File'
Symbol_File='./toy_example/Symbol_File'
Query_File ='./toy_example/Query_File'

To run:

python3 submission.py

HMM and Viterbi Algorithm Description

For the example below, the output hidden (implicit) state sequence is

[3, 2, 1, 2, 4, -9.397116279119517]

The last element, -9.397116279119517, is the log probability of this path, which is the largest among all candidate paths.
[Figure: HMM diagram for this example]
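To make the log probability above more concrete, here is a minimal sketch that converts it back to an ordinary probability and shows why working in log space is convenient: a product of probabilities becomes a sum of logs. The list `step_probs` is a made-up illustration and is not taken from the dataset.

```python
import math

# Log probability reported for the example path above.
log_prob = -9.397116279119517
prob = math.exp(log_prob)  # convert back to an ordinary probability

# Hypothetical per-step probabilities, used only to illustrate that
# log(a * b * c) == log(a) + log(b) + log(c).
step_probs = [0.5, 0.25, 0.1]
log_of_product = math.log(math.prod(step_probs))
sum_of_logs = sum(math.log(p) for p in step_probs)

print(prob)
print(math.isclose(log_of_product, sum_of_logs))
```

Summing logs avoids multiplying many small numbers together, which would otherwise underflow for long observation sequences.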

Parameter breakdown

0:S1       1:S2      2:S3     3:BEGIN      4:END

1. Initial Probabilities

The blue lines represent the initial probabilities (Pi), which are equivalent to the transition probabilities from the BEGIN state to each hidden state.

So we calculate them as

for s in states:
    # row len(states)-2 of the transition matrix is the BEGIN state
    transition_probability[len(states)-2][states.index(s)]
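As a small sketch of the lookup above, the snippet below reads the BEGIN row of a transition matrix and keeps only the entries for the ordinary states. The matrix values are hypothetical, and the name `initial_probability` is introduced here for illustration; only the state ordering (S1, S2, S3, BEGIN, END) comes from the parameter breakdown.

```python
states = ["S1", "S2", "S3", "BEGIN", "END"]

# Hypothetical transition matrix: row = from-state, column = to-state.
transition_probability = [
    [0.2, 0.3, 0.3, 0.0, 0.2],  # from S1
    [0.4, 0.2, 0.2, 0.0, 0.2],  # from S2
    [0.3, 0.3, 0.2, 0.0, 0.2],  # from S3
    [0.5, 0.3, 0.2, 0.0, 0.0],  # from BEGIN
    [0.0, 0.0, 0.0, 0.0, 0.0],  # from END (no outgoing transitions)
]

begin = len(states) - 2  # index of BEGIN, the second-to-last state

# Initial probability of each ordinary state = P(BEGIN -> state).
initial_probability = {
    s: transition_probability[begin][states.index(s)]
    for s in states[:begin]
}
print(initial_probability)
```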

2. Emission Probabilities

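The red lines represent the emission probabilities. After add-1 smoothing, the probability that state i emits symbol j is

$B[i,j]=\frac{n(i,j)+1}{n(i)+M+1}$

where n(i,j) is the number of times state i emits symbol j, n(i) is the total number of emissions from state i, and M is the number of known symbols.

If the symbol is unknown, its emission probability from state i after smoothing is

$B[i,j]=\frac{1}{n(i)+M+1}$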

for i in range(1,len(obs)):
    for cur in range(len(states)):
        # If state `cur` never emitted observation `sym.index(obs[i])` (its index in the symbol list),
        # add-1 smoothing keeps the probability from being 0
        emission_rate = emission_probability[str(cur)][str(sym.index(obs[i]))]

3. Transition Probabilities

The black lines represent the transition probabilities, that is, the probability of moving from one state to another.

for i in range(n1):
    for j in range(n1):
        # distance[i][j] counts the transitions from state i to state j
        transition_probability[i][j] = (float(distance[i][j])+1) / (sum(distance[i])+n1-1)

The number of transitions from state_i to state_j is divided by the total number of transitions from state_i to any state. Add-1 smoothing is also used here.

4. Viterbi Algorithm

For an HMM, the most useful task is finding the most likely hidden state sequence given the observations. In general, an HMM problem can be described by the following five elements:

observations: the observed sequence of phenomena
states: all possible hidden states
start_probability: the initial probability of each hidden state
transition_probability: the probability of moving from one hidden state to another
emission_probability: the probability that a hidden state emits a given observation

If you use brute force to enumerate all possible state sequences and compare their probabilities, the time complexity is O(n^m), where n is the number of hidden states and m is the length of the observation sequence. This is unacceptable for long sequences and large datasets. The Viterbi algorithm reduces the cost to O(m·n^2).

We can treat this problem as dynamic programming: last_pro holds, for each hidden state, the best log probability for the previous observation, and curr_pro holds the same for the current observation. Computing curr_pro depends only on last_pro; this is the core idea of the Viterbi algorithm.
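The dependence of the current step on the previous one can be sketched as a single update function. This is an illustration under assumptions, not the repository's code: `viterbi_step`, the list-based layout, and the two-state numbers are all hypothetical.

```python
import math


def viterbi_step(last_pro, trans, emit_col):
    """One dynamic-programming step in log space.

    last_pro[k] : best log probability of any path ending in state k
                  at the previous observation
    trans[k][j] : probability of moving from state k to state j
    emit_col[j] : probability that state j emits the current observation
    Returns (curr_pro, back), where back[j] is the best previous state for j.
    """
    n = len(last_pro)
    curr_pro, back = [], []
    for j in range(n):
        # Pick the predecessor k that maximises the path probability into j.
        best_k = max(range(n), key=lambda k: last_pro[k] + math.log(trans[k][j]))
        curr_pro.append(last_pro[best_k]
                        + math.log(trans[best_k][j])
                        + math.log(emit_col[j]))
        back.append(best_k)
    return curr_pro, back


# Hypothetical two-state example.
last_pro = [math.log(0.6), math.log(0.4)]
trans = [[0.7, 0.3], [0.4, 0.6]]
emit_col = [0.5, 0.1]
curr_pro, back = viterbi_step(last_pro, trans, emit_col)
print(back, [math.exp(p) for p in curr_pro])
```

Each step costs n * n comparisons, so m observations cost on the order of m * n^2 operations instead of the n^m paths that brute force would enumerate.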

MAIN ALGORITHM

import math

def Viterbi_Algorithm(states,obs,emission_probability,transition_probability):
    '''
    Calculate the maximum log probability of the paths and return the
    back-pointers as a dict.
    :argument
        states : list of states; the last two entries are BEGIN and END
        emission_probability : two-dimensional table of emission probabilities (a dict of dicts keyed by str index)
        transition_probability : two-dimensional array of transition probabilities
        obs : observation sequence
    :return
        A dict: path[s][i-1] is the best previous state when observation i is in state s
    Note: `sym` (the list of symbols) is expected to be defined at module level.
    '''
    curr_pro = {}
    path = {s: [] for s in states}
    for s in states:
        # initial state probability (BEGIN row) plus the first emission probability
        curr_pro[s] = math.log(transition_probability[len(states)-2][states.index(s)])+\
                      math.log(emission_probability[str(states.index(s))][str(sym.index(obs[0]))])
    # calculate the rest of the observation sequence
    for i in range(1,len(obs)):
        last_pro = curr_pro
        curr_pro = {}
        for cur in range(len(states)):
            emission_rate = emission_probability[str(cur)][str(sym.index(obs[i]))]
            (max_pr,last_state) = max([(last_pro[k]+math.log(transition_probability[states.index(k)][cur])+
                                        math.log(emission_rate), k) for k in states])
            curr_pro[states[cur]] = max_pr
            path[states[cur]].append(last_state)
    return path
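For comparison, here is a self-contained sketch of the same idea that also backtracks to recover the best path and checks the result against brute-force enumeration. It is a simplification under stated assumptions: it uses an explicit `start_p` vector instead of BEGIN and END states, plain lists instead of the dict layout above, and a well-known two-state toy model (Healthy/Fever) rather than the repository's dataset. The names `viterbi` and `brute_force` are introduced here for illustration.

```python
import math
from itertools import product


def viterbi(obs, start_p, trans_p, emit_p):
    """Return (best_path, best_log_prob) for an observation index sequence."""
    n = len(start_p)
    curr = [math.log(start_p[s]) + math.log(emit_p[s][obs[0]]) for s in range(n)]
    backptrs = []
    for o in obs[1:]:
        prev, curr, ptr = curr, [], []
        for j in range(n):
            best_k = max(range(n), key=lambda k: prev[k] + math.log(trans_p[k][j]))
            curr.append(prev[best_k] + math.log(trans_p[best_k][j])
                        + math.log(emit_p[j][o]))
            ptr.append(best_k)
        backptrs.append(ptr)
    # Backtrack from the best final state.
    last = max(range(n), key=lambda s: curr[s])
    path = [last]
    for ptr in reversed(backptrs):
        path.append(ptr[path[-1]])
    path.reverse()
    return path, curr[last]


def brute_force(obs, start_p, trans_p, emit_p):
    """Enumerate all n^m state sequences (only feasible for tiny inputs)."""
    best_path, best_lp = None, -math.inf
    for seq in product(range(len(start_p)), repeat=len(obs)):
        lp = math.log(start_p[seq[0]]) + math.log(emit_p[seq[0]][obs[0]])
        for t in range(1, len(obs)):
            lp += math.log(trans_p[seq[t - 1]][seq[t]]) + math.log(emit_p[seq[t]][obs[t]])
        if lp > best_lp:
            best_path, best_lp = list(seq), lp
    return best_path, best_lp


# Toy model: states 0=Healthy, 1=Fever; symbols 0=normal, 1=cold, 2=dizzy.
start_p = [0.6, 0.4]
trans_p = [[0.7, 0.3], [0.4, 0.6]]
emit_p = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]
obs = [0, 1, 2]

v_path, v_lp = viterbi(obs, start_p, trans_p, emit_p)
b_path, b_lp = brute_force(obs, start_p, trans_p, emit_p)
print(v_path, math.exp(v_lp))
```

Keeping one back-pointer list per time step, rather than one growing list per state, makes the backtracking a simple reverse walk.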
