HMM(mlpack)

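This article walks through the implementation of the hidden Markov model (HMM) in the mlpack library: the constructors, the probability computation methods (such as the forward and backward algorithms), the computation of single-state probabilities, the learning algorithms (supervised learning and the unsupervised Baum-Welch algorithm), and the prediction algorithm. Worked examples show how to use these methods, together with the corresponding code.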

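Definition

A hidden Markov model is determined by its initial state probability distribution, its state transition probability distribution, and its observation (emission) probability distribution.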

Constructor

Header 1

/**
 * A class that represents a Hidden Markov Model with an arbitrary type of
 * emission distribution.  This HMM class supports training (supervised and
 * unsupervised), prediction of state sequences via the Viterbi algorithm,
 * estimation of state probabilities, generation of random sequences, and
 * calculation of the log-likelihood of a given sequence.
 *
 * The template parameter, Distribution, specifies the distribution which the
 * emissions follow.  The class should implement the following functions:
 *
 * @code
 * class Distribution
 * {
 *  public:
 *   // The type of observation used by this distribution.
 *   typedef something DataType;
 *
 *   // Return the probability of the given observation.
 *   double Probability(const DataType& observation) const;
 *
 *   // Estimate the distribution based on the given observations.
 *   double Train(const std::vector<DataType>& observations);
 *
 *   // Estimate the distribution based on the given observations, given also
 *   // the probability of each observation coming from this distribution.
 *   double Train(const std::vector<DataType>& observations,
 *                const std::vector<double>& probabilities);
 * };
 * @endcode
 *
 * See the mlpack::distribution::DiscreteDistribution class for an example.  One
 * would use the DiscreteDistribution class when the observations are
 * non-negative integers.  Other distributions could be Gaussians, a mixture of
 * Gaussians (GMM), or any other probability distribution implementing the
 * four Distribution functions.
 *
 * Usage of the HMM class generally involves either training an HMM or loading
 * an already-known HMM and taking probability measurements of sequences.
 * Example code for supervised training of a Gaussian HMM (that is, where the
 * emission output distribution is a single Gaussian for each hidden state) is
 * given below.
 *
 * @code
 * extern arma::mat observations; // Each column is an observation.
 * extern arma::Row<size_t> states; // Hidden states for each observation.
 * // Create an untrained HMM with 5 hidden states and default (N(0, 1))
 * // Gaussian distributions with the dimensionality of the dataset.
 * HMM<GaussianDistribution> hmm(5, GaussianDistribution(observations.n_rows));
 *
 * // Train the HMM (the labels could be omitted to perform unsupervised
 * // training).
 * hmm.Train(observations, states);
 * @endcode
 *
 * Once initialized, the HMM can evaluate the probability of a certain sequence
 * (with LogLikelihood()), predict the most likely sequence of hidden states
 * (with Predict()), generate a sequence (with Generate()), or estimate the
 * probabilities of each state for a sequence of observations (with Train()).
 *
 * @tparam Distribution Type of emission distribution for this HMM.
 */
template<typename Distribution = distribution::DiscreteDistribution>
class HMM
{
 public:
  /**
   * Create the Hidden Markov Model with the given number of hidden states and
   * the given default distribution for emissions.  The dimensionality of the
   * observations is taken from the emissions variable, so it is important that
   * the given default emission distribution is set with the correct
   * dimensionality.  Alternately, set the dimensionality with Dimensionality().
   * Optionally, the tolerance for convergence of the Baum-Welch algorithm can
   * be set.
   *
   * By default, the transition matrix and initial probability vector are set to
   * contain equal probability for each state.
   *
   * @param states Number of states.
   * @param emissions Default distribution for emissions.
   * @param tolerance Tolerance for convergence of training algorithm
   *      (Baum-Welch).
   */
  HMM(const size_t states = 0,
      const Distribution emissions = Distribution(),
      const double tolerance = 1e-5);

Implementation

/**
 * Create the Hidden Markov Model with the given number of hidden states and the
 * given number of emission states.
 */
template<typename Distribution>
HMM<Distribution>::HMM(const size_t states,
                       const Distribution emissions,
                       const double tolerance) :
    emission(states, /* default distribution */ emissions),
    transitionProxy(arma::randu<arma::mat>(states, states)),
    initialProxy(arma::randu<arma::vec>(states) / (double) states),
    dimensionality(emissions.Dimensionality()),
    tolerance(tolerance),
    recalculateInitial(false),
    recalculateTransition(false)
{
  // Normalize the transition probabilities and initial state probabilities.
  initialProxy /= arma::accu(initialProxy);
  for (size_t i = 0; i < transitionProxy.n_cols; ++i)
    transitionProxy.col(i) /= arma::accu(transitionProxy.col(i));

  logTransition = log(transitionProxy);
  logInitial = log(initialProxy);
}

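The first parameter of this constructor, states, is the number of hidden states; emissions is the observation (emission) probability distribution; tolerance is the convergence tolerance of the Baum-Welch algorithm.

The member emission is not the same thing as the parameter emissions: it is a vector of Distribution objects, initialized here with states copies of emissions.
transitionProxy is the state transition matrix. Its entries are drawn from the uniform distribution on $[0, 1]$, and each column is then normalized so that it forms a probability distribution. (The header comment describes the defaults as equal probabilities for each state, but the implementation shown here initializes them randomly.)
initialProxy is the initial state probability vector. Its entries are drawn from the uniform distribution on $[0, \frac{1}{states}]$, and the vector is then normalized.
dimensionality is the dimensionality of the observed data.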
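The random initialization followed by column normalization can be sketched without mlpack or Armadillo. This is only an illustration under stated assumptions: the names RandomColumnStochastic and RandomColumnStochasticExample, the std::vector storage, and the std::mt19937 generator are choices made for this sketch, not part of mlpack.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper (not part of mlpack): build a states x states matrix
// with entries drawn uniformly from [0, 1), then normalize every column so
// that it sums to 1, mirroring what the constructor does to transitionProxy.
// m[i][j] is the probability of moving from state j to state i.
// A column of all zeros is practically impossible and is not handled here.
std::vector<std::vector<double>> RandomColumnStochastic(std::size_t states,
                                                        unsigned seed)
{
  std::mt19937 gen(seed);
  std::uniform_real_distribution<double> uniform(0.0, 1.0);

  std::vector<std::vector<double>> m(states, std::vector<double>(states));
  for (auto& row : m)
    for (double& value : row)
      value = uniform(gen);

  for (std::size_t j = 0; j < states; ++j)
  {
    double columnSum = 0.0;
    for (std::size_t i = 0; i < states; ++i)
      columnSum += m[i][j];
    for (std::size_t i = 0; i < states; ++i)
      m[i][j] /= columnSum;
  }
  return m;
}

// Usage check: every column of the result is a probability distribution.
void RandomColumnStochasticExample()
{
  const auto m = RandomColumnStochastic(4, 42u);
  for (std::size_t j = 0; j < 4; ++j)
  {
    double columnSum = 0.0;
    for (std::size_t i = 0; i < 4; ++i)
      columnSum += m[i][j];
    assert(std::fabs(columnSum - 1.0) < 1e-12);
  }
}
```

The same reasoning applies to initialProxy: dividing the random vector by states only rescales it, and the subsequent normalization cancels that factor.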

Header 2

  /**
   * Create the Hidden Markov Model with the given initial probability vector,
   * the given transition matrix, and the given emission distributions.  The
   * dimensionality of the observations of the HMM are taken from the given
   * emission distributions.  Alternately, the dimensionality can be set with
   * Dimensionality().
   *
   * The initial state probability vector should have length equal to the number
   * of states, and each entry represents the probability of being in the given
   * state at time T = 0 (the beginning of a sequence).
   *
   * The transition matrix should be such that T(i, j) is the probability of
   * transition to state i from state j.  The columns of the matrix should sum
   * to 1.
   *
   * The emission matrix should be such that E(i, j) is the probability of
   * emission i while in state j.  The columns of the matrix should sum to 1.
   *
   * Optionally, the tolerance for convergence of the Baum-Welch algorithm can
   * be set.
   *
   * @param initial Initial state probabilities.
   * @param transition Transition matrix.
   * @param emission Emission distributions.
   * @param tolerance Tolerance for convergence of training algorithm
   *      (Baum-Welch).
   */
  HMM(const arma::vec& initial,
      const arma::mat& transition,
      const std::vector<Distribution>& emission,
      const double tolerance = 1e-5);

Implementation

/**
 * Create the Hidden Markov Model with the given transition matrix and the given
 * emission probability matrix.
 */
template<typename Distribution>
HMM<Distribution>::HMM(const arma::vec& initial,
                       const arma::mat& transition,
                       const std::vector<Distribution>& emission,
                       const double tolerance) :
    emission(emission),
    transitionProxy(transition),
    logTransition(log(transition)),
    initialProxy(initial),
    logInitial(log(initial)),
    tolerance(tolerance),
    recalculateInitial(false),
    recalculateTransition(false)
{
  // Set the dimensionality, if we can.
  if (emission.size() > 0)
    dimensionality = emission[0].Dimensionality();
  else
  {
    Log::Warn << "HMM::HMM(): no emission distributions given; assuming a "
        << "dimensionality of 0 and hoping it gets set right later."
        << std::endl;
    dimensionality = 0;
  }
}

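The first parameter of this constructor, initial, is the initial state probability vector. Its length equals the number of states, and each element is the probability of being in the corresponding state at time 0.
transition is the state transition matrix. The element $T_{ij}$ is the probability of moving from state $j$ to state $i$, so each column sums to 1.
emission is the vector of observation probability distributions. It has states elements, one per state; each element is a Distribution that gives the observation probability distribution in that state.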
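Because the constructor stores initial and transition as given, it can help to check the column convention before building the model. The following is a minimal sketch of such a check in plain C++; IsValidHmmParameters, the Matrix alias, and the tolerance eps are hypothetical names introduced for this illustration and do not exist in mlpack.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;  // m[i][j]: row i, column j

// Hypothetical helper (not part of mlpack): check the conventions documented
// for the second constructor. initial must sum to 1, transition must be
// square with one column per state, and every column must sum to 1.
bool IsValidHmmParameters(const std::vector<double>& initial,
                          const Matrix& transition,
                          double eps = 1e-9)
{
  const std::size_t n = initial.size();
  if (n == 0 || transition.size() != n)
    return false;

  double initialSum = 0.0;
  for (double p : initial)
  {
    if (p < 0.0)
      return false;
    initialSum += p;
  }
  if (std::fabs(initialSum - 1.0) > eps)
    return false;

  for (const auto& row : transition)
    if (row.size() != n)
      return false;

  for (std::size_t j = 0; j < n; ++j)
  {
    double columnSum = 0.0;
    for (std::size_t i = 0; i < n; ++i)
    {
      if (transition[i][j] < 0.0)
        return false;
      columnSum += transition[i][j];
    }
    if (std::fabs(columnSum - 1.0) > eps)
      return false;
  }
  return true;
}

// Usage check on a two-state model.
void IsValidHmmParametersExample()
{
  const std::vector<double> initial = {0.5, 0.5};
  // Columns sum to 1: column j lists where state j goes next.
  const Matrix columnwise = {{0.9, 0.2}, {0.1, 0.8}};
  // The same model written row-wise (rows sum to 1) is rejected.
  const Matrix rowwise = {{0.9, 0.1}, {0.2, 0.8}};
  assert(IsValidHmmParameters(initial, columnwise));
  assert(!IsValidHmmParameters(initial, rowwise));
}
```

The second case in the usage check is the typical mistake: a matrix whose rows sum to 1 has to be transposed before it is passed in.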

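Example

We borrow Example 10.1 (the box-and-ball model) from the book 《统计学习方法》 (Statistical Learning Methods), 2nd edition.

Suppose there are 4 boxes, each containing red and white balls. The numbers of red and white balls in each box are listed in the table below: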

Box            1   2   3   4
Red balls      5   3   6   8
White balls    5   7   4   2

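Balls are drawn as follows, producing an observation sequence of ball colors:

• To start, pick 1 of the 4 boxes uniformly at random, draw 1 ball at random from that box, record its color, and put it back;
• Then move at random from the current box to the next box according to these rules: if the current box is box 1, the next box is always box 2; if the current box is box 2 or 3, move to the box on its left with probability 0.4 or to the box on its right with probability 0.6; if the current box is box 4, stay at box 4 or move to box 3, each with probability 0.5;
• Once the next box is chosen, draw 1 ball at random from it, record its color, and put it back;
• Continue in this way for 5 draws in total, which gives an observation sequence of ball colors:
$O = (\text{red}, \text{red}, \text{white}, \text{white}, \text{red})$
In this process the observer sees only the sequence of ball colors, not which box each ball was drawn from; that is, the sequence of boxes is not observed.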
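The drawing procedure above can be simulated directly, which makes the split between hidden states (boxes) and observations (colors) concrete. The following is a minimal sketch in plain C++; the names Draw, SimulateBoxes, and SimulateBoxesExample, the 0-based box indices, and the encoding 0 = red, 1 = white are assumptions made for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// One draw: which box (0-based) was used and which color came out.
// Encoding assumed for this sketch: 0 = red, 1 = white.
struct Draw { int box; int color; };

// Hypothetical simulator for the procedure above (not part of mlpack).
std::vector<Draw> SimulateBoxes(std::size_t length, unsigned seed)
{
  // Red and white ball counts for boxes 1 to 4, taken from the table.
  const int red[4] = {5, 3, 6, 8};
  const int white[4] = {5, 7, 4, 2};

  std::mt19937 gen(seed);
  std::uniform_real_distribution<double> unit(0.0, 1.0);
  std::uniform_int_distribution<int> anyBox(0, 3);

  std::vector<Draw> draws;
  int box = anyBox(gen);  // Start from a box chosen uniformly at random.
  for (std::size_t t = 0; t < length; ++t)
  {
    if (t > 0)
    {
      // Move to the next box according to the transition rules.
      const double r = unit(gen);
      if (box == 0)
        box = 1;                              // Box 1 always goes to box 2.
      else if (box == 3)
        box = (r < 0.5) ? 3 : 2;              // Box 4: stay, or go to box 3.
      else
        box = (r < 0.4) ? box - 1 : box + 1;  // Left 0.4, right 0.6.
    }
    // Draw with replacement: red with probability red / (red + white).
    const double pRed = static_cast<double>(red[box]) / (red[box] + white[box]);
    draws.push_back({box, unit(gen) < pRed ? 0 : 1});
  }
  return draws;
}

// Usage check: a draw from box 1 is always followed by a draw from box 2.
void SimulateBoxesExample()
{
  const std::vector<Draw> draws = SimulateBoxes(5, 2024u);
  assert(draws.size() == 5);
  for (std::size_t t = 1; t < draws.size(); ++t)
    if (draws[t - 1].box == 0)
      assert(draws[t].box == 1);
}
```

An observer of this process would only see the color field of each draw; the box field is the hidden state that the HMM has to infer.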

We use the second constructor. The number of hidden states is 4, one per box.
The initial state probability vector is:
$initial = (0.25, 0.25, 0.25, 0.25)^{\mathsf{T}}$
The state transition matrix is:
$transition = \begin{bmatrix} 0 & 0.4 & 0 & 0 \\ 1 & 0 & 0.4 & 0 \\ 0 & 0.6 & 0 & 0.5 \\ 0 & 0 & 0.6 & 0.5 \end{bmatrix}$
We use the default DiscreteDistribution.
Its implementation:

/**
 * A discrete distribution where the only observations are discrete
 * observations.  This is useful (for example) with discrete Hidden Markov
 * Models, where observations are non-negative integers representing specific
 * emissions.
 *
 * No bounds checking is performed for observations, so if an invalid
 * observation is passed (i.e. observation > numObservations), a crash will
 * probably occur.
 *
 * This distribution only supports one-dimensional observations, so when
 * passing an arma::vec as an observation, it should only have one dimension
 * (vec.n_rows == 1).  Any additional dimensions will simply be ignored.
 *
 * @note
 * This class, like every other class in mlpack, uses arma::vec to represent
 * observations.  While a discrete distribution only has positive integers
 * (size_t) as observations, these can be converted to doubles (which is what
 * arma::vec holds).  This distribution internally converts those doubles back
 * into size_t before comparisons.
 */
class DiscreteDistribution
{
 public:
  /**
   * Default constructor, which creates a distribution that has no
   * observations.
   */
  DiscreteDistribution() :
      probabilities(std::vector<arma::vec>(1)) { /* Nothing to do. */ }

  /**
   * Define the discrete distribution as having numObservations possible
   * observations.  The probability in each state will be set to (1 /
   * numObservations).
   *
   * @param numObservations Number of possible observations this distribution
   *    can have.
   */
  DiscreteDistribution(const size_t numObservations) :
      probabilities(std::vector<arma::vec>(1,
          arma::ones<arma::vec>(numObservations) / numObservations))
  { /* Nothing to do. */ }

  /**
   * Define the multidimensional discrete distribution as having
   * numObservations possible observations.  The probability in each state will
   * be set to (1 / numObservations of each dimension).
   *
   * @param numObservations Number of possible observations this distribution
   *    can have.
   */
  DiscreteDistribution(const arma::Col<size_t>& numObservations)
  {
    for (size_t i = 0; i < numObservations.n_elem; ++i)
    {
      const size_t numObs = size_t(numObservations[i]);
      if (numObs <= 0)
      {
        std::ostringstream oss;
        oss << "number of observations for dimension " << i << " is 0, but "
            << "must be greater than 0";
        throw std::invalid_argument(oss.str());
      }
      probabilities.push_back(arma::ones<arma::vec>(numObs) / numObs);
    }
  }

  /**
   * Define the multidimensional discrete distribution as having the given
   * probabilities for each observation.
   *
   * @param probabilities Probabilities of each possible observation.
   */
  DiscreteDistribution(const std::vector<arma::vec>& probabilities)
  {
    for (size_t i = 0; i < probabilities.size(); ++i)
    {
      arma::vec temp = probabilities[i];
      double sum = accu(temp);
      if (sum > 0)
        this->probabilities.push_back(temp / sum);
      else
      {
        this->probabilities.push_back(arma::ones<arma::vec>(temp.n_elem)
            / temp.n_elem);
      }
    }
  }

  /**
   * Get the dimensionality of the distribution.
   */
  size_t Dimensionality() const { return probabilities.size(); }

  /**
   * Return the probability of the given observation.  If the observation is
   * greater than the number of possible observations, then a crash will
   * probably occur -- bounds checking is not performed.
   *
   * @param observation Observation to return the probability of.
   * @return Probability of the given observation.
   */
  double Probability(const arma::vec& observation) const
  {
    double probability = 1.0;
    // Ensure the observation has the same dimension with the probabilities.
    if (observation.n_elem != probabilities.size())
    {
      Log::Fatal << "DiscreteDistribution::Probability(): observation has "
          << "incorrect dimension " << observation.n_elem << " but should have"
          << " dimension " << probabilities.size() << "!" << std::endl;
    }

    for (size_t dimension = 0; dimension < observation.n_elem; dimension++)
    {
      // Adding 0.5 helps ensure that we cast the floating point to a size_t
      // correctly.
      const size_t obs = size_t(observation(dimension) + 0.5);

      // Ensure that the observation is within the bounds.
      if (obs >= probabilities[dimension].n_elem)
      {
        Log::Fatal << "DiscreteDistribution::Probability(): received "
            << "observation " << obs << "; observation must be in [0, "
            << probabilities[dimension].n_elem << "] for this distribution."
            << std::endl;
      }
      probability *= probabilities[dimension][obs];
    }

    return probability;
  }

  /**
   * Return the log probability of the given observation.  If the observation
   * is greater than the number of possible observations, then a crash will
   * probably occur -- bounds checking is not performed.
   *
   * @param observation Observation to return the log probability of.
   * @return Log probability of the given observation.
   */
  double LogProbability(const arma::vec& observation) const
  {
    // TODO: consider storing log probabilities instead?
    return log(Probability(observation));
  }

  /**
   * Calculates the Discrete probability density function for each
   * data point (column) in the given matrix.
   *
   * @param x List of observations.
   * @param probabilities Output probabilities for each input observation.
   */
  void Probability(const arma::mat& x, arma::vec& probabilities) const
  {
    probabilities.set_size(x.n_cols);
    for (size_t i = 0; i < x.n_cols; ++i)
      probabilities(i) = Probability(x.unsafe_col(i));
  }

  /**
   * Returns the Log probability of the given matrix. These values are stored
   * in logProbabilities.
   *
   * @param x List of observations.
   * @param logProbabilities Output log-probabilities for each input
   *   observation.
   */
  void LogProbability(const arma::mat& x, arma::vec& logProbabilities) const
  {
    logProbabilities.set_size(x.n_cols);
    for (size_t i = 0; i < x.n_cols; ++i)
      logProbabilities(i) = log(Probability(x.unsafe_col(i)));
  }

  /**
   * Return a randomly generated observation (one-dimensional vector; one
   * observation) according to the probability distribution defined by this
   * object.
   *
   * @return Random observation.
   */
  arma::vec Random() const;

  /**
   * Estimate the probability distribution directly from the given
   * observations. If any of the observations is greater than numObservations,
   * a crash is likely to occur.
   *
   * @param observations List of observations.
   */
  void Train(const arma::mat& observations);

  /**
   * Estimate the probability distribution from the given observations, taking
   * into account the probability of each observation actually being from this
   * distribution.
   *
   * @param observations List of observations.
   * @param probabilities List of probabilities that each observation is
   *     actually from this distribution.
   */
  void Train(const arma::mat& observations,
             const arma::vec& probabilities);

  //! Return the vector of probabilities for the given dimension.
  arma::vec& Probabilities(const size_t dim = 0) { return probabilities[dim]; }
  //! Modify the vector of probabilities for the given dimension.
  const arma::vec& Probabilities(const size_t dim = 0) const
  { return probabilities[dim]; }

  /**
   * Serialize the distribution.
   */
  template<typename Archive>
  void serialize(Archive& ar, const unsigned int /* version */)
  {
    ar & BOOST_SERIALIZATION_NVP(probabilities);
  }

 private:
  //! The probabilities for each dimension; each arma::vec represents the
  //! probabilities for the observations in each dimension.
  std::vector<arma::vec> probabilities;
};

The observation sequence consists of ball colors, so it has only one dimension.
Writing each distribution as (probability of red, probability of white), the distribution for state 1 (the first box) is $(0.5, 0.5)^{\mathsf{T}}$
The distribution for state 2 is $(0.3, 0.7)^{\mathsf{T}}$
The distribution for state 3 is $(0.6, 0.4)^{\mathsf{T}}$
The distribution for state 4 is $(0.8, 0.2)^{\mathsf{T}}$
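These probabilities are simply the ball counts in the table divided by their sum, which is also what the DiscreteDistribution constructor taking a std::vector<arma::vec> does with its input. A minimal sketch of that normalization in plain C++ follows; NormalizeCounts and NormalizeCountsExample are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of mlpack): turn raw counts into
// probabilities. Like the DiscreteDistribution constructor shown earlier, it
// divides by the sum and falls back to a uniform distribution when the sum
// is not positive.
std::vector<double> NormalizeCounts(const std::vector<double>& counts)
{
  assert(!counts.empty());

  double sum = 0.0;
  for (double c : counts)
    sum += c;

  std::vector<double> probabilities(counts.size());
  for (std::size_t i = 0; i < counts.size(); ++i)
    probabilities[i] = (sum > 0.0) ? counts[i] / sum : 1.0 / counts.size();
  return probabilities;
}

// Usage check: box 2 holds 3 red and 7 white balls.
void NormalizeCountsExample()
{
  const std::vector<double> box2 = NormalizeCounts({3.0, 7.0});
  assert(std::fabs(box2[0] - 0.3) < 1e-12);
  assert(std::fabs(box2[1] - 0.7) < 1e-12);
}
```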

In code:

#include <iostream>
#include <vector>
#include <mlpack/core/dists/discrete_distribution.hpp>
#include <mlpack/methods/hmm/hmm.hpp>

using namespace std;
using namespace arma;
using namespace mlpack::distribution;
using namespace mlpack::hmm;

void hmm_test()
{
    // Initial state probabilities: each of the 4 boxes is equally likely.
    vec initial({0.25, 0.25, 0.25, 0.25});
    // Column j holds the probabilities of moving from box j to each box i.
    mat transition("0, 0.4, 0, 0;"
                   "1, 0, 0.4, 0;"
                   "0, 0.6, 0, 0.5;"
                   "0, 0, 0.6, 0.5");
    // Emission distributions, one per box: (P(red), P(white)).
    // box1 is uniform over 2 observations, that is (0.5, 0.5).
    DiscreteDistribution box1(2);
    DiscreteDistribution box2(vector<vec>(1, vec({0.3, 0.7})));
    DiscreteDistribution box3(vector<vec>(1, vec({0.6, 0.4})));
    DiscreteDistribution box4(vector<vec>(1, vec({0.8, 0.2})));
    vector<DiscreteDistribution> emission({box1, box2, box3, box4});
    HMM<DiscreteDistribution> hmm(initial, transition, emission);
}

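Probability Computation

Given a model and an observation sequence, we want to compute the probability of the observation sequence under that model.

Header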

  /**
   * Compute the log-likelihood of the given data sequence.
   *
   * @param dataSeq Data sequence to evaluate the likelihood of.
   * @return Log-likelihood of the given sequence.
   */
  double LogLikelihood(const arma::mat& dataSeq) const;

Implementation

/**
 * Compute the log-likelihood of the given data sequence.
 */
template<typename Distribution>
double HMM<Distribution>::LogLikelihood(const arma::mat& dataSeq) const
{
  arma::mat forwardLog;
  arma::vec logScales;

  Forward(dataSeq, logScales, forwardLog);

  // The log-likelihood is the log of the scales for each time step.
  return accu(logScales);
}

This uses the forward algorithm:

Forward

Header

/**
   * The Forward algorithm (part of the Forward-Backward algorithm).  Computes
   * forward probabilities for each state for each observation in the given data
   * sequence.  The returned matrix has rows equal to the number of hidden
   * states and columns equal to the number of observations.
   *
   * @param dataSeq Data sequence to compute probabilities for.
   * @param logScales Vector in which scaling factors will be saved.
   * @param forwardLogProb Matrix in which forward probabilities will be saved.
   */
  void Forward(const arma::mat& dataSeq,
               arma::vec& logScales,
               arma::mat& forwardLogProb) const;

Implementation

/**
 * The Forward procedure (part of the Forward-Backward algorithm).
 */
template<typename Distribution>
void HMM<Distribution>::Forward(const arma::mat& dataSeq,
                                arma::vec& logScales,
                                arma::mat& forwardLogProb) const
{
  // Our goal is to calculate the forward probabilities:
  //  P(X_k | o_{1:k}) for all possible states X_k, for each time point k.
  forwardLogProb.resize(logTransition.n_rows, dataSeq.n_cols);
  forwardLogProb.fill(-std::numeric_limits<double>::infinity());
  logScales.resize(dataSeq.n_cols);
  logScales.fill(-std::numeric_limits<double>::infinity());

  ConvertToLogSpace();

  // The first entry in the forward algorithm uses the initial state
  // probabilities.  Note that MATLAB assumes that the starting state (at
  // t = -1) is state 0; this is not our assumption here.  To force that
  // behavior, you could append a single starting state to every single data
  // sequence and that should produce results in line with MATLAB.
  for (size_t state = 0; state < logTransition.n_rows; state++)
  {
    forwardLogProb(state, 0) = logInitial(state) +
        emission[state].LogProbability(dataSeq.unsafe_col(0));
  }

  // Then normalize the column.
  logScales[0] = math::AccuLog(forwardLogProb.col(0));
  if (std::isfinite(logScales[0]))
    forwardLogProb.col(0) -= logScales[0];

  // Now compute the probabilities for each successive observation.
  for (size_t t = 1; t < dataSeq.n_cols; t++)
  {
    for (size_t j = 0; j < logTransition.n_rows; ++j)
    {
      // The forward probability of state j at time t is the sum over all states
      // of the probability of the previous state transitioning to the current
      // state and emitting the given observation.
      arma::vec tmp = forwardLogProb.col(t - 1) + logTransition.row(j).t();
      forwardLogProb(j, t) = math::AccuLog(tmp) +
          emission[j].LogProbability(dataSeq.unsafe_col(t));
    }

    // Normalize probability.
    logScales[t] = math::AccuLog(forwardLogProb.col(t));
    if (std::isfinite(logScales[t]))
        forwardLogProb.col(t) -= logScales[t];
  }
}

For simplicity, all of our observations are one-dimensional, so we may take dataSeq to be a ($1 \times T$) matrix.
The number of hidden states is N, so logInitial is a vector with N elements and logTransition is an ($N \times N$) matrix.
emission has N elements, each being the observation probability distribution in one state. Because the observations are one-dimensional, each of these distributions has dimensionality 1.

First, forwardLogProb is initialized as an ($N \times T$) matrix filled with negative infinity, and logScales as a vector of T elements filled with negative infinity.

Next come the initial values, the first column of forwardLogProb:
$forwardLogProb(i, 0) = logInitial(i) + \log(emission(dataSeq(0), i)), \quad i = 0, 1, \cdots, N-1$
(For ease of writing we treat emission as a matrix. Note that the vector emission[ j ] corresponds to column j, so emission(i, j) is the probability of observing i in state j.)

What is computed here is a log probability. In its original form it is the probability of being in state i at time 0 multiplied by the probability of observing the first element of the sequence in state i.
So forwardLogProb(i, 0) is the natural log of the joint probability of being in state i at time 0 and observing dataSeq(0).

Let us look at the header for AccuLog:

/**
 * Sum a vector of log values.  (T should be an Armadillo type.)
 *
 * @param x vector of log values
 * @return log(e^x0 + e^x1 + ...)
 */
template<typename T>
typename T::elem_type AccuLog(const T& x);

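Therefore logScales[0] is the natural log of the probability of observing dataSeq(0) at time 0.
If logScales[0] is finite, it is subtracted from the first column of forwardLogProb, so that after this normalization forwardLogProb(i, 0) is the natural log of the probability of being in state i at time 0 given dataSeq(0). If logScales[0] is not finite (every entry of the column is negative infinity), the column is left unchanged.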
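Only the declaration of AccuLog is shown above. A common way to compute log(e^x0 + e^x1 + ...) without underflow is the max-shift (log-sum-exp) trick, sketched below in plain C++. LogSumExp is a hypothetical stand-in written for this illustration; it is not mlpack's implementation and may differ from it in detail.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <limits>
#include <vector>

// Hypothetical stand-in for math::AccuLog (not mlpack's implementation):
// returns log(e^x0 + e^x1 + ...). Subtracting the maximum before calling
// exp keeps very negative log values from underflowing to zero.
double LogSumExp(const std::vector<double>& logValues)
{
  double maxValue = -std::numeric_limits<double>::infinity();
  for (double v : logValues)
    maxValue = std::max(maxValue, v);

  // Empty input or all entries -inf means the probabilities sum to 0, so the
  // result is -inf. An infinite maximum is returned as is.
  if (!std::isfinite(maxValue))
    return maxValue;

  double sum = 0.0;
  for (double v : logValues)
    sum += std::exp(v - maxValue);
  return maxValue + std::log(sum);
}

// Usage check: probabilities 0.1 and 0.2 add up to 0.3.
void LogSumExpExample()
{
  const double result = LogSumExp({std::log(0.1), std::log(0.2)});
  assert(std::fabs(result - std::log(0.3)) < 1e-12);
}
```

The all-negative-infinity case matters for Forward: it is exactly the situation in which logScales is not finite and the column is left unnormalized.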

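Next is the recursion. For time t and target state j:

tmp is the sum of two column vectors, so tmp[ i ] is the natural log of the joint probability of being in state i at the previous time (t-1) and moving from state i to the target state j, conditioned on the observations dataSeq(0), ..., dataSeq(t-1) (column t-1 of forwardLogProb has already been normalized at this point)

Before normalization, forwardLogProb(j, t) is the natural log of the joint probability of being in state j at time t and observing dataSeq(t), conditioned on dataSeq(0), ..., dataSeq(t-1)

Note how the recursion works, which was not yet visible when the initial values were set: although forwardProb(j, t) describes being in state j at time t and observing dataSeq(t), it implicitly carries the information in forwardProb from the previous time step. As this recursion continues, forwardProb is exactly the forward probability, rescaled at every time step

logScales[ t ] is the log of the sum of the exponentiated entries in column t of forwardLogProb. It is the natural log of the probability of observing dataSeq(t) at time t given dataSeq(0), ..., dataSeq(t-1). After it is subtracted, forwardLogProb(j, t) is the natural log of the probability of being in state j at time t given dataSeq(0), ..., dataSeq(t)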
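In symbols, the normalized recursion that Forward implements can be summarized as follows. The notation is introduced here for illustration only: $x_t$ is the hidden state and $o_t$ is dataSeq(t), $\pi_j$ and $b_j(o)$ are the initial and emission probabilities of state j, $T_{ji}$ is the probability of moving from state i to state j, $\hat\alpha_t(j)$ is the exponential of forwardLogProb(j, t) after normalization, and $c_t$ is the exponential of logScales[ t ], chosen so that $\sum_j \hat\alpha_t(j) = 1$.

$$
\begin{aligned}
c_0\,\hat\alpha_0(j) &= \pi_j\, b_j(o_0), \\
c_t\,\hat\alpha_t(j) &= b_j(o_t) \sum_{i=0}^{N-1} T_{ji}\,\hat\alpha_{t-1}(i), \qquad t \ge 1, \\
\hat\alpha_t(j) &= P(x_t = j \mid o_0, \dots, o_t), \qquad c_t = P(o_t \mid o_0, \dots, o_{t-1}), \\
\log P(o_0, \dots, o_t) &= \sum_{s=0}^{t} \log c_s .
\end{aligned}
$$

The last line is the chain rule of probability applied to the observations, and it holds as long as every $c_s$ is nonzero.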

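Back in LogLikelihood: after calling Forward, all elements of logScales are summed (adding logs corresponds to multiplying probabilities).
By the chain rule, the result is the natural log of the probability of observing the whole sequence dataSeq.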
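To see the whole computation in one place, here is a minimal sketch of a scaled forward pass in plain C++, applied to the box-and-ball model from Example 10.1. It is a simplification under stated assumptions: it works with probabilities and per-step scaling factors rather than in log space as mlpack does, and ForwardLogLikelihood, the Matrix alias, and the encoding 0 = red, 1 = white are hypothetical choices made for this illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

using Matrix = std::vector<std::vector<double>>;  // m[i][j]: row i, column j

// Hypothetical plain-C++ counterpart of LogLikelihood (not mlpack code).
// transition[i][j] is the probability of moving from state j to state i,
// emission[j][o] is the probability of observing symbol o in state j.
// Observations are assumed to be valid symbol indices; no bounds checking.
double ForwardLogLikelihood(const std::vector<double>& initial,
                            const Matrix& transition,
                            const Matrix& emission,
                            const std::vector<std::size_t>& observations)
{
  const std::size_t n = initial.size();
  std::vector<double> alpha(n), next(n);
  double logLikelihood = 0.0;

  for (std::size_t t = 0; t < observations.size(); ++t)
  {
    const std::size_t o = observations[t];
    double scale = 0.0;
    for (std::size_t j = 0; j < n; ++j)
    {
      // Probability of being in state j before seeing the observation: the
      // prior at t = 0, otherwise the sum over previous states i.
      double reach = initial[j];
      if (t > 0)
      {
        reach = 0.0;
        for (std::size_t i = 0; i < n; ++i)
          reach += alpha[i] * transition[j][i];
      }
      next[j] = reach * emission[j][o];
      scale += next[j];
    }
    if (scale == 0.0)  // The sequence is impossible under this model.
      return -std::numeric_limits<double>::infinity();

    // Normalize the column and accumulate the log of the scaling factor,
    // which plays the role of logScales[t].
    for (std::size_t j = 0; j < n; ++j)
      alpha[j] = next[j] / scale;
    logLikelihood += std::log(scale);
  }
  return logLikelihood;
}

// Usage check with O = (red, red, white, white, red).
void ForwardLogLikelihoodExample()
{
  const std::vector<double> initial = {0.25, 0.25, 0.25, 0.25};
  const Matrix transition = {{0.0, 0.4, 0.0, 0.0},
                             {1.0, 0.0, 0.4, 0.0},
                             {0.0, 0.6, 0.0, 0.5},
                             {0.0, 0.0, 0.6, 0.5}};
  const Matrix emission = {{0.5, 0.5}, {0.3, 0.7}, {0.6, 0.4}, {0.8, 0.2}};
  const double logLikelihood =
      ForwardLogLikelihood(initial, transition, emission, {0, 0, 1, 1, 0});
  // Working the recursion by hand gives P(O) = 0.026862016.
  assert(std::fabs(std::exp(logLikelihood) - 0.026862016) < 1e-9);
}
```

Scaling at every step keeps the intermediate values away from underflow on long sequences, which is the same reason Forward normalizes each column and stores the factors in logScales.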

Example

As before, this is Example 10.2 from the book:

Consider the box-and-ball model $\lambda = (A, B, \pi)$, with state set $Q = \{1, 2, 3\}$ and observation set $V = \{\text{red}, \text{white}\}$
$A = \begin{bmatrix} 0.5 & 0.2 & 0.3 \\ 0.3 & 0.5 & 0.2 \\ 0.2 & 0.3 & 0.5 \end{bmatrix} \quad B = \begin{bmatrix} 0.5 & 0.5 \\ 0.4 & 0.6 \\ 0.7 & 0.3 \end{bmatrix} \quad \pi = \begin{bmatrix} 0.2 \\ 0.4 \\ 0.4 \end{bmatrix}$
