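Hidden Markov Model
Definition
A hidden Markov model is determined by its initial state probability distribution, its state transition probability distribution, and its observation (emission) probability distribution.
Constructor
Header 1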
/**
* A class that represents a Hidden Markov Model with an arbitrary type of
* emission distribution. This HMM class supports training (supervised and
* unsupervised), prediction of state sequences via the Viterbi algorithm,
* estimation of state probabilities, generation of random sequences, and
* calculation of the log-likelihood of a given sequence.
*
* The template parameter, Distribution, specifies the distribution which the
* emissions follow. The class should implement the following functions:
*
* @code
* class Distribution
* {
* public:
* // The type of observation used by this distribution.
* typedef something DataType;
*
* // Return the probability of the given observation.
* double Probability(const DataType& observation) const;
*
* // Estimate the distribution based on the given observations.
* double Train(const std::vector<DataType>& observations);
*
* // Estimate the distribution based on the given observations, given also
* // the probability of each observation coming from this distribution.
* double Train(const std::vector<DataType>& observations,
* const std::vector<double>& probabilities);
* };
* @endcode
*
* See the mlpack::distribution::DiscreteDistribution class for an example. One
* would use the DiscreteDistribution class when the observations are
* non-negative integers. Other distributions could be Gaussians, a mixture of
* Gaussians (GMM), or any other probability distribution implementing the
* four Distribution functions.
*
* Usage of the HMM class generally involves either training an HMM or loading
* an already-known HMM and taking probability measurements of sequences.
* Example code for supervised training of a Gaussian HMM (that is, where the
* emission output distribution is a single Gaussian for each hidden state) is
* given below.
*
* @code
* extern arma::mat observations; // Each column is an observation.
* extern arma::Row<size_t> states; // Hidden states for each observation.
* // Create an untrained HMM with 5 hidden states and default (N(0, 1))
* // Gaussian distributions with the dimensionality of the dataset.
* HMM<GaussianDistribution> hmm(5, GaussianDistribution(observations.n_rows));
*
* // Train the HMM (the labels could be omitted to perform unsupervised
* // training).
* hmm.Train(observations, states);
* @endcode
*
* Once initialized, the HMM can evaluate the probability of a certain sequence
* (with LogLikelihood()), predict the most likely sequence of hidden states
* (with Predict()), generate a sequence (with Generate()), or estimate the
* probabilities of each state for a sequence of observations (with Train()).
*
* @tparam Distribution Type of emission distribution for this HMM.
*/
template<typename Distribution = distribution::DiscreteDistribution>
class HMM
{
public:
/**
* Create the Hidden Markov Model with the given number of hidden states and
* the given default distribution for emissions. The dimensionality of the
* observations is taken from the emissions variable, so it is important that
* the given default emission distribution is set with the correct
* dimensionality. Alternately, set the dimensionality with Dimensionality().
* Optionally, the tolerance for convergence of the Baum-Welch algorithm can
* be set.
*
* By default, the transition matrix and initial probability vector are set to
* contain equal probability for each state.
*
* @param states Number of states.
* @param emissions Default distribution for emissions.
* @param tolerance Tolerance for convergence of training algorithm
* (Baum-Welch).
*/
HMM(const size_t states = 0,
const Distribution emissions = Distribution(),
const double tolerance = 1e-5);
Implementation
/**
* Create the Hidden Markov Model with the given number of hidden states and the
* given number of emission states.
*/
template<typename Distribution>
HMM<Distribution>::HMM(const size_t states,
const Distribution emissions,
const double tolerance) :
emission(states, /* default distribution */ emissions),
transitionProxy(arma::randu<arma::mat>(states, states)),
initialProxy(arma::randu<arma::vec>(states) / (double) states),
dimensionality(emissions.Dimensionality()),
tolerance(tolerance),
recalculateInitial(false),
recalculateTransition(false)
{
// Normalize the transition probabilities and initial state probabilities.
initialProxy /= arma::accu(initialProxy);
for (size_t i = 0; i < transitionProxy.n_cols; ++i)
transitionProxy.col(i) /= arma::accu(transitionProxy.col(i));
logTransition = log(transitionProxy);
logInitial = log(initialProxy);
}
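The first parameter of this constructor, states, is the number of hidden states; emissions is the observation (emission) probability distribution; tolerance is the convergence tolerance of the Baum-Welch algorithm.
emission is not the same as the parameter emissions: it is a vector holding Distribution objects, initialized here with states copies of emissions.
transitionProxy is the state transition matrix. Its entries are drawn from the uniform distribution on $[0, 1]$, and each column is then normalized so that it forms a probability distribution.
initialProxy is the initial state probability vector. Its entries are drawn from the uniform distribution on $[0, \frac{1}{states}]$, and the vector is then normalized.
Note that the header comment describes equal probabilities for every state, whereas the implementation shown here starts from random values.
dimensionality is the dimensionality of the observations.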
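The initialization described above (random entries, then per-column normalization) can be sketched without Armadillo. This is only an illustration in plain standard C++: `RandomColumnStochastic` and its `seed` parameter are hypothetical names, not part of mlpack, and the random generator differs from `arma::randu`.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper (not part of mlpack): build a states x states matrix
// with entries drawn from U[0, 1), then rescale each column to sum to 1.
// matrix[i][j] plays the role of T(i, j): probability of moving to state i
// from state j.
std::vector<std::vector<double>> RandomColumnStochastic(std::size_t states,
                                                        unsigned seed)
{
  assert(states > 0);
  std::mt19937 gen(seed);
  std::uniform_real_distribution<double> uniform(0.0, 1.0);

  std::vector<std::vector<double>> matrix(states,
                                          std::vector<double>(states));
  for (auto& row : matrix)
    for (double& value : row)
      value = uniform(gen);

  // Normalize column by column, as the constructor body does.
  for (std::size_t j = 0; j < states; ++j)
  {
    double columnSum = 0.0;
    for (std::size_t i = 0; i < states; ++i)
      columnSum += matrix[i][j];
    for (std::size_t i = 0; i < states; ++i)
      matrix[i][j] /= columnSum;
  }
  return matrix;
}
```

Calling `RandomColumnStochastic(4, 42u)` yields a 4 x 4 matrix whose columns each sum to 1 up to rounding, which is the property the transition matrix needs.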
Header 2
/**
* Create the Hidden Markov Model with the given initial probability vector,
* the given transition matrix, and the given emission distributions. The
* dimensionality of the observations of the HMM are taken from the given
* emission distributions. Alternately, the dimensionality can be set with
* Dimensionality().
*
* The initial state probability vector should have length equal to the number
* of states, and each entry represents the probability of being in the given
* state at time T = 0 (the beginning of a sequence).
*
* The transition matrix should be such that T(i, j) is the probability of
* transition to state i from state j. The columns of the matrix should sum
* to 1.
*
* The emission matrix should be such that E(i, j) is the probability of
* emission i while in state j. The columns of the matrix should sum to 1.
*
* Optionally, the tolerance for convergence of the Baum-Welch algorithm can
* be set.
*
* @param initial Initial state probabilities.
* @param transition Transition matrix.
* @param emission Emission distributions.
* @param tolerance Tolerance for convergence of training algorithm
* (Baum-Welch).
*/
HMM(const arma::vec& initial,
const arma::mat& transition,
const std::vector<Distribution>& emission,
const double tolerance = 1e-5);
Implementation
/**
* Create the Hidden Markov Model with the given transition matrix and the given
* emission probability matrix.
*/
template<typename Distribution>
HMM<Distribution>::HMM(const arma::vec& initial,
const arma::mat& transition,
const std::vector<Distribution>& emission,
const double tolerance) :
emission(emission),
transitionProxy(transition),
logTransition(log(transition)),
initialProxy(initial),
logInitial(log(initial)),
tolerance(tolerance),
recalculateInitial(false),
recalculateTransition(false)
{
// Set the dimensionality, if we can.
if (emission.size() > 0)
dimensionality = emission[0].Dimensionality();
else
{
Log::Warn << "HMM::HMM(): no emission distributions given; assuming a "
<< "dimensionality of 0 and hoping it gets set right later."
<< std::endl;
dimensionality = 0;
}
}
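The first parameter of this constructor, initial, is the initial state probability vector. Its length equals the number of states, and each entry is the probability of being in that state at time 0.
transition is the state transition matrix. Entry $T_{ij}$ is the probability of moving from state $j$ to state $i$, so each column sums to 1.
emission is the vector of observation probability distributions. It has states elements, one per state; each element is a Distribution giving the observation probabilities in that state.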
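Example
This example borrows Example 10.1 (the box-and-ball model) from 《统计学习方法》 (Statistical Learning Methods), 2nd edition.
Suppose there are 4 boxes, each containing red and white balls. The number of red and white balls in each box is listed below:
| Box | 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| Red | 5 | 3 | 6 | 8 |
| White | 5 | 7 | 4 | 2 |
Balls are drawn as follows, producing an observation sequence of ball colours:
- To start, pick 1 of the 4 boxes with equal probability, draw 1 ball at random from that box, record its colour, and put it back.
- Then move from the current box to the next box according to these rules: if the current box is box 1, the next box is always box 2; if the current box is box 2 or 3, move to the box on the left with probability 0.4 or to the box on the right with probability 0.6; if the current box is box 4, stay in box 4 or move to box 3, each with probability 0.5.
- Once the next box is chosen, draw 1 ball at random from it, record its colour, and put it back.
- Continue in this way for 5 draws in total, which gives an observation sequence of ball colours:
$$O = (\text{red}, \text{red}, \text{white}, \text{white}, \text{red})$$
Throughout this process the observer sees only the sequence of ball colours, not which box each ball came from; that is, the sequence of boxes is hidden.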
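The drawing procedure above can be simulated directly. The sketch below uses only the standard library; `SampleColours`, the `seed` parameter, and the encoding 0 = red, 1 = white are choices made for this illustration, not anything defined by mlpack or the book.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical sketch: sample `length` ball colours (0 = red, 1 = white)
// following the box-and-ball rules. Boxes are numbered 0..3 here.
std::vector<int> SampleColours(std::size_t length, unsigned seed)
{
  // moveFrom[b][c] = probability of moving from box b to box c.
  const std::vector<std::vector<double>> moveFrom = {
      {0.0, 1.0, 0.0, 0.0},
      {0.4, 0.0, 0.6, 0.0},
      {0.0, 0.4, 0.0, 0.6},
      {0.0, 0.0, 0.5, 0.5}};
  // Fraction of red balls in each box (5/10, 3/10, 6/10, 8/10).
  const std::vector<double> redFraction = {0.5, 0.3, 0.6, 0.8};

  std::mt19937 gen(seed);
  std::uniform_int_distribution<std::size_t> firstBox(0, 3);
  std::size_t box = firstBox(gen);

  std::vector<int> colours;
  for (std::size_t t = 0; t < length; ++t)
  {
    if (t > 0)
    {
      // Choose the next box from the row of the current box.
      std::discrete_distribution<std::size_t> pick(moveFrom[box].begin(),
                                                   moveFrom[box].end());
      box = pick(gen);
    }
    // Draw one ball with replacement and record its colour.
    std::bernoulli_distribution isRed(redFraction[box]);
    colours.push_back(isRed(gen) ? 0 : 1);
  }
  return colours;
}
```

`SampleColours(5, seed)` returns five colour codes; only these codes, not the box indices, would be visible to the observer.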
We use the second constructor. The number of hidden states is 4, one for each box.
The initial state probability vector is:
$$initial = (0.25, 0.25, 0.25, 0.25)^{\mathsf{T}}$$
The state transition matrix is:
$$transition = \begin{bmatrix} 0 & 0.4 & 0 & 0 \\ 1 & 0 & 0.4 & 0 \\ 0 & 0.6 & 0 & 0.5 \\ 0 & 0 & 0.6 & 0.5 \end{bmatrix}$$
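Textbooks often write the transition matrix row-wise (row = current state, column = next state), while the constructor above expects columns that sum to 1. Assuming the row-wise form as the starting point, the conversion is a plain transpose. `Matrix` and `Transpose` below are hypothetical names for this sketch and are not mlpack functions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Hypothetical helper: turn a row-wise matrix, a[from][to], into the
// column convention used in the text, result[to][from].
Matrix Transpose(const Matrix& a)
{
  assert(!a.empty());
  Matrix result(a[0].size(), std::vector<double>(a.size()));
  for (std::size_t from = 0; from < a.size(); ++from)
  {
    assert(a[from].size() == a[0].size());
    for (std::size_t to = 0; to < a[from].size(); ++to)
      result[to][from] = a[from][to];
  }
  return result;
}
```

Applied to the row-wise rules of the four boxes, `{{0, 1, 0, 0}, {0.4, 0, 0.6, 0}, {0, 0.4, 0, 0.6}, {0, 0, 0.5, 0.5}}`, this gives the matrix with entry (1, 0) equal to 1 and entry (0, 1) equal to 0.4, matching the column-wise matrix used with the constructor.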
We use the default DiscreteDistribution.
Its implementation:
/**
* A discrete distribution where the only observations are discrete
* observations. This is useful (for example) with discrete Hidden Markov
* Models, where observations are non-negative integers representing specific
* emissions.
*
* No bounds checking is performed for observations, so if an invalid
* observation is passed (i.e. observation > numObservations), a crash will
* probably occur.
*
* This distribution only supports one-dimensional observations, so when
* passing an arma::vec as an observation, it should only have one dimension
* (vec.n_rows == 1). Any additional dimensions will simply be ignored.
*
* @note
* This class, like every other class in mlpack, uses arma::vec to represent
* observations. While a discrete distribution only has positive integers
* (size_t) as observations, these can be converted to doubles (which is what
* arma::vec holds). This distribution internally converts those doubles back
* into size_t before comparisons.
*/
class DiscreteDistribution
{
public:
/**
* Default constructor, which creates a distribution that has no
* observations.
*/
DiscreteDistribution() :
probabilities(std::vector<arma::vec>(1)){
/* Nothing to do. */ }
/**
* Define the discrete distribution as having numObservations possible
* observations. The probability in each state will be set to (1 /
* numObservations).
*
* @param numObservations Number of possible observations this distribution
* can have.
*/
DiscreteDistribution(const size_t numObservations) :
probabilities(std::vector<arma::vec>(1,
arma::ones<arma::vec>(numObservations) / numObservations))
{
/* Nothing to do. */ }
/**
* Define the multidimensional discrete distribution as having
* numObservations possible observations. The probability in each state will
* be set to (1 / numObservations of each dimension).
*
* @param numObservations Number of possible observations this distribution
* can have.
*/
DiscreteDistribution(const arma::Col<size_t>& numObservations)
{
for (size_t i = 0; i < numObservations.n_elem; ++i)
{
const size_t numObs = size_t(numObservations[i]);
if (numObs <= 0)
{
std::ostringstream oss;
oss << "number of observations for dimension " << i << " is 0, but "
<< "must be greater than 0";
throw std::invalid_argument(oss.str());
}
probabilities.push_back(arma::ones<arma::vec>(numObs) / numObs);
}
}
/**
* Define the multidimensional discrete distribution as having the given
* probabilities for each observation.
*
* @param probabilities Probabilities of each possible observation.
*/
DiscreteDistribution(const std::vector<arma::vec>& probabilities)
{
for (size_t i = 0; i < probabilities.size(); ++i)
{
arma::vec temp = probabilities[i];
double sum = accu(temp);
if (sum > 0)
this->probabilities.push_back(temp / sum);
else
{
this->probabilities.push_back(arma::ones<arma::vec>(temp.n_elem)
/ temp.n_elem);
}
}
}
/**
* Get the dimensionality of the distribution.
*/
size_t Dimensionality() const {
return probabilities.size(); }
/**
* Return the probability of the given observation. If the observation is
* greater than the number of possible observations, then a crash will
* probably occur -- bounds checking is not performed.
*
* @param observation Observation to return the probability of.
* @return Probability of the given observation.
*/
double Probability(const arma::vec& observation) const
{
double probability = 1.0;
// Ensure the observation has the same dimension with the probabilities.
if (observation.n_elem != probabilities.size())
{
Log::Fatal << "DiscreteDistribution::Probability(): observation has "
<< "incorrect dimension " << observation.n_elem << " but should have"
<< " dimension " << probabilities.size() << "!" << std::endl;
}
for (size_t dimension = 0; dimension < observation.n_elem; dimension++)
{
// Adding 0.5 helps ensure that we cast the floating point to a size_t
// correctly.
const size_t obs = size_t(observation(dimension) + 0.5);
// Ensure that the observation is within the bounds.
if (obs >= probabilities[dimension].n_elem)
{
Log::Fatal << "DiscreteDistribution::Probability(): received "
<< "observation " << obs << "; observation must be in [0, "
<< probabilities[dimension].n_elem << "] for this distribution."
<< std::endl;
}
probability *= probabilities[dimension][obs];
}
return probability;
}
/**
* Return the log probability of the given observation. If the observation
* is greater than the number of possible observations, then a crash will
* probably occur -- bounds checking is not performed.
*
* @param observation Observation to return the log probability of.
* @return Log probability of the given observation.
*/
double LogProbability(const arma::vec& observation) const
{
// TODO: consider storing log probabilities instead?
return log(Probability(observation));
}
/**
* Calculates the Discrete probability density function for each
* data point (column) in the given matrix.
*
* @param x List of observations.
* @param probabilities Output probabilities for each input observation.
*/
void Probability(const arma::mat& x, arma::vec& probabilities) const
{
probabilities.set_size(x.n_cols);
for (size_t i = 0; i < x.n_cols; ++i)
probabilities(i) = Probability(x.unsafe_col(i));
}
/**
* Returns the Log probability of the given matrix. These values are stored
* in logProbabilities.
*
* @param x List of observations.
* @param logProbabilities Output log-probabilities for each input
* observation.
*/
void LogProbability(const arma::mat& x, arma::vec& logProbabilities) const
{
logProbabilities.set_size(x.n_cols);
for (size_t i = 0; i < x.n_cols; ++i)
logProbabilities(i) = log(Probability(x.unsafe_col(i)));
}
/**
* Return a randomly generated observation (one-dimensional vector; one
* observation) according to the probability distribution defined by this
* object.
*
* @return Random observation.
*/
arma::vec Random() const;
/**
* Estimate the probability distribution directly from the given
* observations. If any of the observations is greater than numObservations,
* a crash is likely to occur.
*
* @param observations List of observations.
*/
void Train(const arma::mat& observations);
/**
* Estimate the probability distribution from the given observations, taking
* into account the probability of each observation actually being from this
* distribution.
*
* @param observations List of observations.
* @param probabilities List of probabilities that each observation is
* actually from this distribution.
*/
void Train(const arma::mat& observations,
const arma::vec& probabilities);
//! Return the vector of probabilities for the given dimension.
arma::vec& Probabilities(const size_t dim = 0) {
return probabilities[dim]; }
//! Modify the vector of probabilities for the given dimension.
const arma::vec& Probabilities(const size_t dim = 0) const
{
return probabilities[dim]; }
/**
* Serialize the distribution.
*/
template<typename Archive>
void serialize(Archive& ar, const unsigned int /* version */)
{
ar & BOOST_SERIALIZATION_NVP(probabilities);
}
private:
//! The probabilities for each dimension; each arma::vec represents the
//! probabilities for the observations in each dimension.
std::vector<arma::vec> probabilities;
};
The observation sequence consists of ball colours, so it has only one dimension.
Listing red first and white second, the probability distribution for state 1 (the first box) is $(0.5, 0.5)^{\mathsf{T}}$.
The distribution for state 2 is $(0.3, 0.7)^{\mathsf{T}}$.
The distribution for state 3 is $(0.6, 0.4)^{\mathsf{T}}$.
The distribution for state 4 is $(0.8, 0.2)^{\mathsf{T}}$.
In code:
#include <iostream>
#include <mlpack/core/dists/discrete_distribution.hpp>
#include <mlpack/methods/hmm/hmm.hpp>
using namespace std;
using namespace arma;
using namespace mlpack::distribution;
using namespace mlpack::hmm;
void hmm_test()
{
  // Initial state probabilities: each of the 4 boxes is equally likely.
  vec initial({0.25, 0.25, 0.25, 0.25});
  // Column j holds the probabilities of leaving box j + 1.
  mat transition("0, 0.4, 0, 0;"
                 "1, 0, 0.4, 0;"
                 "0, 0.6, 0, 0.5;"
                 "0, 0, 0.6, 0.5;");
  // One emission distribution per box, red first and white second.
  DiscreteDistribution box1(2); // Two equally likely outcomes: (0.5, 0.5).
  DiscreteDistribution box2(vector<vec>(1, vec({0.3, 0.7})));
  DiscreteDistribution box3(vector<vec>(1, vec({0.6, 0.4})));
  DiscreteDistribution box4(vector<vec>(1, vec({0.8, 0.2})));
  vector<DiscreteDistribution> emission({box1, box2, box3, box4});
  HMM<DiscreteDistribution> hmm(initial, transition, emission);
}
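Probability computation
Given the model and an observation sequence, we want to compute the probability of that observation sequence under the model.
Header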
/**
* Compute the log-likelihood of the given data sequence.
*
* @param dataSeq Data sequence to evaluate the likelihood of.
* @return Log-likelihood of the given sequence.
*/
double LogLikelihood(const arma::mat& dataSeq) const;
Implementation
/**
* Compute the log-likelihood of the given data sequence.
*/
template<typename Distribution>
double HMM<Distribution>::LogLikelihood(const arma::mat& dataSeq) const
{
arma::mat forwardLog;
arma::vec logScales;
Forward(dataSeq, logScales, forwardLog);
// The log-likelihood is the log of the scales for each time step.
return accu(logScales);
}
This uses the forward algorithm:
Forward
Header
/**
* The Forward algorithm (part of the Forward-Backward algorithm). Computes
* forward probabilities for each state for each observation in the given data
* sequence. The returned matrix has rows equal to the number of hidden
* states and columns equal to the number of observations.
*
* @param dataSeq Data sequence to compute probabilities for.
* @param logScales Vector in which scaling factors will be saved.
* @param forwardLogProb Matrix in which forward probabilities will be saved.
*/
void Forward(const arma::mat& dataSeq,
arma::vec& logScales,
arma::mat& forwardLogProb) const;
Implementation
/**
* The Forward procedure (part of the Forward-Backward algorithm).
*/
template<typename Distribution>
void HMM<Distribution>::Forward(const arma::mat& dataSeq,
arma::vec& logScales,
arma::mat& forwardLogProb) const
{
// Our goal is to calculate the forward probabilities:
// P(X_k | o_{1:k}) for all possible states X_k, for each time point k.
forwardLogProb.resize(logTransition.n_rows, dataSeq.n_cols);
forwardLogProb.fill(-std::numeric_limits<double>::infinity());
logScales.resize(dataSeq.n_cols);
logScales.fill(-std::numeric_limits<double>::infinity());
ConvertToLogSpace();
// The first entry in the forward algorithm uses the initial state
// probabilities. Note that MATLAB assumes that the starting state (at
// t = -1) is state 0; this is not our assumption here. To force that
// behavior, you could append a single starting state to every single data
// sequence and that should produce results in line with MATLAB.
for (size_t state = 0; state < logTransition.n_rows; state++)
{
forwardLogProb(state, 0) = logInitial(state) +
emission[state].LogProbability(dataSeq.unsafe_col(0));
}
// Then normalize the column.
logScales[0] = math::AccuLog(forwardLogProb.col(0));
if (std::isfinite(logScales[0]))
forwardLogProb.col(0) -= logScales[0];
// Now compute the probabilities for each successive observation.
for (size_t t = 1; t < dataSeq.n_cols; t++)
{
for (size_t j = 0; j < logTransition.n_rows; ++j)
{
// The forward probability of state j at time t is the sum over all states
// of the probability of the previous state transitioning to the current
// state and emitting the given observation.
arma::vec tmp = forwardLogProb.col(t - 1) + logTransition.row(j).t();
forwardLogProb(j, t) = math::AccuLog(tmp) +
emission[j].LogProbability(dataSeq.unsafe_col(t));
}
// Normalize probability.
logScales[t] = math::AccuLog(forwardLogProb.col(t));
if (std::isfinite(logScales[t]))
forwardLogProb.col(t) -= logScales[t];
}
}
For simplicity, assume the observations are one-dimensional, so dataSeq is a $1 \times T$ matrix.
The number of hidden states is N, so logInitial is a vector with N elements and logTransition is an $N \times N$ matrix.
emission has N elements, each the observation probability distribution of one state. Because the observations are one-dimensional, each of these distributions has dimensionality 1.
First, forwardLogProb is sized as an $N \times T$ matrix filled with negative infinity, and logScales as a vector of T elements filled with negative infinity.
Next comes the initial step, which fills the first column of forwardLogProb:
$$forwardLogProb(i, 0) = logInitial(i) + \log(emission(dataSeq(0), i)), \quad i = 0, 1, \cdots, N-1$$
(For ease of writing, emission is treated as a matrix here. Note that emission[j] corresponds to column j, so emission(i, j) is the probability of observing i while in state j.)
This is a log probability. In the original scale it is the probability of being in state i at time 0 multiplied by the probability of observing the first element of the sequence while in state i.
So, before the normalization that follows, forwardLogProb(i, 0) is the natural log of the joint probability of being in state i at time 0 and observing dataSeq(0).
Now look at the header of AccuLog:
/**
* Sum a vector of log values. (T should be an Armadillo type.)
*
* @param x vector of log values
* @return log(e^x0 + e^x1 + ...)
*/
template<typename T>
typename T::elem_type AccuLog(const T& x);
Therefore logScales[0] is the natural log of the probability of observing dataSeq(0) at time 0.
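The quantity documented for AccuLog, $\log(e^{x_0} + e^{x_1} + \cdots)$, is usually computed with the max-shift ("log-sum-exp") trick so that very negative log values do not underflow. The sketch below shows that common technique in standard C++; `LogSumExp` is a hypothetical name, and mlpack's own implementation may be written differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <limits>
#include <vector>

// Hypothetical stand-in for the documented behaviour of AccuLog:
// returns log(exp(x0) + exp(x1) + ...) for a vector of log values.
double LogSumExp(const std::vector<double>& logValues)
{
  assert(!logValues.empty());
  const double maxValue =
      *std::max_element(logValues.begin(), logValues.end());
  // If every entry is -infinity (all probabilities are zero), the sum of
  // probabilities is zero and its log is -infinity.
  if (!std::isfinite(maxValue))
    return maxValue;

  // Shift by the maximum so the largest term becomes exp(0) = 1.
  double sum = 0.0;
  for (const double value : logValues)
    sum += std::exp(value - maxValue);
  return maxValue + std::log(sum);
}
```

For example, `LogSumExp({std::log(0.1), std::log(0.2)})` agrees with `std::log(0.3)` up to rounding, which is the sense in which logScales[0] is the log of a sum of joint probabilities.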
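If logScales[0] is finite, it is subtracted from every entry of the first column, so that in probability space the column sums to 1. After this step forwardLogProb(i, 0) is the natural log of the probability of being in state i at time 0 given dataSeq(0). If logScales[0] is not finite (every entry of the column is negative infinity), the column is left unchanged.
Next comes the recursion. For time t and target state j:
tmp is the sum of two column vectors. Because column t-1 has already been normalized, tmp[i] is the natural log of the probability, given the observations up to time t-1, of being in state i at time t-1 and then moving to state j.
Before normalization, forwardLogProb(j, t) is the natural log of the probability, given the observations up to time t-1, of being in state j at time t and observing dataSeq(t).
Note the recursive structure, which is not visible in the initial step: forwardLogProb(j, t) carries the information from column t-1, which in turn carries the information from column t-2, and so on. Carried through the whole sequence, the columns hold the forward probabilities, each rescaled to sum to 1.
logScales[t] is AccuLog applied to column t of forwardLogProb. It is the natural log of the probability of observing dataSeq(t) given the observations up to time t-1. Column t is then normalized in the same way as column 0.
Back in LogLikelihood: after calling Forward, all elements of logScales are summed (adding logs corresponds to multiplying probabilities).
By the chain rule, $\sum_t \log P(o_t \mid o_0, \cdots, o_{t-1}) = \log P(o_0, \cdots, o_{T-1})$, so the result is the natural log of the probability of observing the whole sequence dataSeq.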
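To check the claim that the summed log scales equal the log probability of the whole sequence, the sketch below runs a scaled forward pass on the four-box model and compares it with a brute-force sum over all hidden paths. It works in probability space with plain standard C++ rather than in log space with Armadillo, and every name in it (`DiscreteHmm`, `ScaledForwardLogLikelihood`, `BruteForceLikelihood`, `FourBoxModel`) is hypothetical, not an mlpack API.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Hypothetical container using the conventions from the text:
// transition[i][j] = P(next state i | current state j),
// emission[o][j]   = P(observation o | state j).
struct DiscreteHmm
{
  std::vector<double> initial;
  Matrix transition;
  Matrix emission;
};

// Scaled forward pass; returns log P(observations).
double ScaledForwardLogLikelihood(const DiscreteHmm& hmm,
                                  const std::vector<std::size_t>& obs)
{
  const std::size_t n = hmm.initial.size();
  std::vector<double> filtered(n), joint(n);
  double logLikelihood = 0.0;
  for (std::size_t t = 0; t < obs.size(); ++t)
  {
    double scale = 0.0;
    for (std::size_t j = 0; j < n; ++j)
    {
      // Probability of state j at time t given the earlier observations.
      double predicted = hmm.initial[j];
      if (t > 0)
      {
        predicted = 0.0;
        for (std::size_t i = 0; i < n; ++i)
          predicted += hmm.transition[j][i] * filtered[i];
      }
      joint[j] = predicted * hmm.emission[obs[t]][j];
      scale += joint[j];
    }
    // scale plays the role of exp(logScales[t]).
    if (scale == 0.0)
      return -std::numeric_limits<double>::infinity();
    for (std::size_t j = 0; j < n; ++j)
      filtered[j] = joint[j] / scale;
    logLikelihood += std::log(scale);
  }
  return logLikelihood;
}

// Reference value: sum the joint probability over every hidden path.
double BruteForceLikelihood(const DiscreteHmm& hmm,
                            const std::vector<std::size_t>& obs)
{
  assert(!obs.empty());
  const std::size_t n = hmm.initial.size();
  std::size_t pathCount = 1;
  for (std::size_t t = 0; t < obs.size(); ++t)
    pathCount *= n;

  double total = 0.0;
  std::vector<std::size_t> path(obs.size());
  for (std::size_t code = 0; code < pathCount; ++code)
  {
    // Decode `code` as a base-n number: one digit per time step.
    std::size_t rest = code;
    for (std::size_t t = 0; t < obs.size(); ++t, rest /= n)
      path[t] = rest % n;

    double p = hmm.initial[path[0]] * hmm.emission[obs[0]][path[0]];
    for (std::size_t t = 1; t < obs.size(); ++t)
      p *= hmm.transition[path[t]][path[t - 1]] * hmm.emission[obs[t]][path[t]];
    total += p;
  }
  return total;
}

// The four-box model from the example (observation 0 = red, 1 = white).
DiscreteHmm FourBoxModel()
{
  return {{0.25, 0.25, 0.25, 0.25},
          {{0.0, 0.4, 0.0, 0.0},
           {1.0, 0.0, 0.4, 0.0},
           {0.0, 0.6, 0.0, 0.5},
           {0.0, 0.0, 0.6, 0.5}},
          {{0.5, 0.3, 0.6, 0.8},
           {0.5, 0.7, 0.4, 0.2}}};
}
```

With the observation sequence (red, red, white, white, red) encoded as `{0, 0, 1, 1, 0}`, exponentiating the result of `ScaledForwardLogLikelihood` should agree with `BruteForceLikelihood` up to rounding. The brute-force version visits $4^5 = 1024$ paths, which is why the forward recursion is preferred for longer sequences.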
Example
As before, this is Example 10.2 from the book:
Consider the box-and-ball model $\lambda = (A, B, \pi)$ with state set $Q = \{1, 2, 3\}$ and observation set $V = \{\text{red}, \text{white}\}$,
$$A = \begin{bmatrix} 0.5 & 0.2 & 0.3 \\ 0.3 & 0.5 & 0.2 \\ 0.2 & 0.3 & 0.5 \end{bmatrix} \quad B = \begin{bmatrix} 0.5 & 0.5 \\ 0.4 & 0.6 \\ 0.7 & 0.3 \end{bmatrix} \quad \pi = \begin{bmatrix} 0.2 \\ 0.4 \\ 0.4 \end{bmatrix}$$