Introduction
Consider the simple example of someone trying to deduce the weather from a piece of seaweed - folklore tells us that `soggy' seaweed means wet weather, while `dry' seaweed means sun. If it is in an intermediate state (`damp'), then we cannot be sure. However, the state of the weather is not determined exactly by the state of the seaweed, so on the basis of an examination we may only say that it is probably raining or sunny. A second useful clue would be the state of the weather on the preceding day (or, at least, its probable state) - by combining knowledge about what happened yesterday with the observed seaweed state, we might come to a better forecast for today. This is typical of the type of system we will consider in this tutorial.
Generating Patterns
Deterministic Patterns
Consider a set of traffic lights; the sequence of lights is red - red/amber - green - amber - red. The sequence can be pictured as a state machine, where the different states of the traffic lights follow each other.
Notice that each state is dependent solely on the previous state, so if the lights are green, an amber light will always follow - that is, the system is deterministic. Deterministic systems are relatively easy to understand and analyse, once the transitions are fully known.
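Because each state depends only on the previous one, the whole system can be sketched as a simple lookup table of transitions. A minimal Python sketch (the state names are taken from the light sequence above):

```python
# Deterministic state machine for the traffic lights:
# each state has exactly one successor.
TRANSITIONS = {
    "red": "red/amber",
    "red/amber": "green",
    "green": "amber",
    "amber": "red",
}

def run_lights(start, steps):
    """Return the sequence of `steps` transitions starting from `start`."""
    state = start
    sequence = [state]
    for _ in range(steps):
        state = TRANSITIONS[state]  # the next state is fully determined
        sequence.append(state)
    return sequence

print(run_lights("red", 4))
# -> ['red', 'red/amber', 'green', 'amber', 'red']
```

Since the transition table is a function (one successor per state), the future of the system is completely determined by its current state.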
Patterns generated by a hidden process
When a Markov process may not be powerful enough
In some cases the patterns that we wish to find are not described sufficiently by a Markov process. Returning to the weather example, a hermit may perhaps not have access to direct weather observations, but does have a piece of seaweed. Folklore tells us that the state of the seaweed is probabilistically related to the state of the weather - the weather and seaweed states are closely linked. In this case we have two sets of states: the observable states (the state of the seaweed) and the hidden states (the state of the weather). We wish to devise an algorithm for the hermit to forecast the weather from the seaweed and the Markov assumption, without ever actually seeing the weather.
A more realistic problem is that of recognising speech; the sound that we hear is the product of the vocal cords, size of throat, position of tongue and several other things. Each of these factors interacts to produce the sound of a word, and the sounds that a speech recognition system detects are the changing sounds generated by the internal physical changes in the person speaking.
In such cases the observed sequence of states is probabilistically related to the hidden process. We model such processes using a hidden Markov model, in which an underlying hidden Markov process changes over time and a set of observable states is related in some way to the hidden states.
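Such a process can be sketched generatively for the weather/seaweed example: the hidden weather follows a Markov chain, and at each step a seaweed observation is emitted according to the current (unseen) weather. All the probabilities below are invented purely for illustration:

```python
import random

random.seed(0)  # reproducible sampling for this sketch

HIDDEN = ["sunny", "cloudy", "rainy"]      # hidden states (weather)
OBSERVABLE = ["dry", "damp", "soggy"]      # observable states (seaweed)

# Illustrative numbers only: P(tomorrow's weather | today's weather).
TRANSITION = {
    "sunny":  {"sunny": 0.6, "cloudy": 0.3, "rainy": 0.1},
    "cloudy": {"sunny": 0.3, "cloudy": 0.4, "rainy": 0.3},
    "rainy":  {"sunny": 0.1, "cloudy": 0.4, "rainy": 0.5},
}
# Illustrative numbers only: P(seaweed state | weather) -
# this is what ties the observable states to the hidden ones.
EMISSION = {
    "sunny":  {"dry": 0.7, "damp": 0.2, "soggy": 0.1},
    "cloudy": {"dry": 0.3, "damp": 0.5, "soggy": 0.2},
    "rainy":  {"dry": 0.1, "damp": 0.3, "soggy": 0.6},
}

def sample(dist):
    """Draw one key from a {value: probability} distribution."""
    return random.choices(list(dist), weights=list(dist.values()))[0]

def generate(start, steps):
    """Run the hidden weather process; emit one seaweed observation per day."""
    weather, hidden, observed = start, [], []
    for _ in range(steps):
        hidden.append(weather)
        observed.append(sample(EMISSION[weather]))   # what the hermit sees
        weather = sample(TRANSITION[weather])        # what the hermit never sees
    return hidden, observed
```

The hermit only ever receives the `observed` list; the HMM algorithms later in the tutorial work backwards from it towards the `hidden` list.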
Hidden Markov Models
Definition of a hidden Markov model
A hidden Markov model (HMM) is a triple (Π, A, B), where Π is the vector of initial state probabilities, A is the state transition matrix, and B is the confusion matrix giving the probability of each observable state given each hidden state.
Each probability in the state transition matrix and in the confusion matrix is time independent - that is, the matrices do not change in time as the system evolves. In practice, this is one of the most unrealistic assumptions of Markov models about real processes.
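The triple can be written out directly for the weather/seaweed example. The numbers below are illustrative only; the checks at the end confirm that Π and every row of A and B are probability distributions, and time independence means exactly these same fixed rows are reused at every step:

```python
states = ["sunny", "cloudy", "rainy"]      # hidden states
observations = ["dry", "damp", "soggy"]    # observable states

# Pi: initial state probabilities (illustrative numbers).
pi = [0.5, 0.3, 0.2]

# A: state transition matrix, A[i][j] = P(state j follows state i).
A = [
    [0.6, 0.3, 0.1],
    [0.3, 0.4, 0.3],
    [0.1, 0.4, 0.5],
]

# B: confusion matrix, B[i][k] = P(observation k | hidden state i).
B = [
    [0.7, 0.2, 0.1],
    [0.3, 0.5, 0.2],
    [0.1, 0.3, 0.6],
]

# Pi and each row of A and B must be a probability distribution;
# under the time-independence assumption they never change.
assert abs(sum(pi) - 1.0) < 1e-9
assert all(abs(sum(row) - 1.0) < 1e-9 for row in A + B)
```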
HMMs - Summary
Summary
Frequently, patterns do not appear in isolation but as part of a series in time - this progression can sometimes be used to assist in their recognition. Assumptions are usually made about the time based process - a common assumption is that the process's state is dependent only on the preceding N states - then we have an order N Markov model. The simplest case is N = 1.
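Under the simplest (N = 1) assumption, the probability of a whole state sequence factorises into an initial probability times a chain of one-step transition probabilities. A sketch using the weather states, with invented numbers:

```python
# Illustrative first-order Markov model of the weather.
pi = {"sunny": 0.5, "cloudy": 0.3, "rainy": 0.2}   # initial probabilities
A = {                                               # P(next | current)
    "sunny":  {"sunny": 0.6, "cloudy": 0.3, "rainy": 0.1},
    "cloudy": {"sunny": 0.3, "cloudy": 0.4, "rainy": 0.3},
    "rainy":  {"sunny": 0.1, "cloudy": 0.4, "rainy": 0.5},
}

def sequence_probability(seq):
    """P(seq) = pi[s0] * A[s0][s1] * A[s1][s2] * ...  (order-1 assumption)"""
    p = pi[seq[0]]
    for prev, cur in zip(seq, seq[1:]):
        p *= A[prev][cur]   # each state depends only on its predecessor
    return p
```

For example, `sequence_probability(["sunny", "sunny", "rainy"])` is 0.5 * 0.6 * 0.1 = 0.03; an order-2 model would instead need transition probabilities conditioned on the two preceding states.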
Various examples exist where the process states (patterns) are not directly observable, but are indirectly, and probabilistically, observable as another set of patterns - we can then define a hidden Markov model. These models have proved to be of great value in many current areas of research, notably speech recognition.
Such models of real processes pose three problems that are amenable to immediate attack; these are:
- Evaluation: with what probability does a given model generate a given sequence of observations? The forward algorithm solves this problem efficiently.
- Decoding: what sequence of hidden (underlying) states most probably generated a given sequence of observations? The Viterbi algorithm solves this problem efficiently.
- Learning: what model most probably underlies a given sample of observation sequences - that is, what are the parameters of such a model? This problem may be solved by using the forward-backward algorithm.
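The first two problems can be sketched compactly for the weather/seaweed model. The functions below follow the standard forward and Viterbi recursions; all probabilities are invented for illustration:

```python
states = ["sunny", "cloudy", "rainy"]
pi = {"sunny": 0.5, "cloudy": 0.3, "rainy": 0.2}   # initial probabilities
A = {                                               # P(next weather | current)
    "sunny":  {"sunny": 0.6, "cloudy": 0.3, "rainy": 0.1},
    "cloudy": {"sunny": 0.3, "cloudy": 0.4, "rainy": 0.3},
    "rainy":  {"sunny": 0.1, "cloudy": 0.4, "rainy": 0.5},
}
B = {                                               # P(seaweed | weather)
    "sunny":  {"dry": 0.7, "damp": 0.2, "soggy": 0.1},
    "cloudy": {"dry": 0.3, "damp": 0.5, "soggy": 0.2},
    "rainy":  {"dry": 0.1, "damp": 0.3, "soggy": 0.6},
}

def forward(obs):
    """Evaluation: P(obs | model), summing over all hidden paths."""
    alpha = {s: pi[s] * B[s][obs[0]] for s in states}
    for o in obs[1:]:
        alpha = {s: sum(alpha[r] * A[r][s] for r in states) * B[s][o]
                 for s in states}
    return sum(alpha.values())

def viterbi(obs):
    """Decoding: the single most probable hidden state sequence for obs."""
    delta = {s: (pi[s] * B[s][obs[0]], [s]) for s in states}
    for o in obs[1:]:
        delta = {
            s: max(((p * A[r][s] * B[s][o], path + [s])
                    for r, (p, path) in delta.items()),
                   key=lambda t: t[0])
            for s in states
        }
    return max(delta.values(), key=lambda t: t[0])[1]
```

For instance, `forward(["dry"])` is 0.5*0.7 + 0.3*0.3 + 0.2*0.1 = 0.46, and `viterbi(["soggy", "soggy"])` picks rainy weather on both days. The learning problem (forward-backward / Baum-Welch) re-estimates pi, A and B from observation sequences and is not sketched here.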
HMMs have proved to be of great value in analysing real systems; their usual drawback is the over-simplification associated with the Markov assumption - that a state is dependent only on its predecessors, and that this dependence is time independent.
A full exposition on HMMs may be found in:
L R Rabiner and B H Juang, `An introduction to HMMs', IEEE ASSP Magazine, 3, 4-16.
http://www.comp.leeds.ac.uk/roger/HiddenMarkovModels/html_dev/main.html