Perceptron(mlpack)

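This article walks through the Perceptron implementation in the mlpack library, covering the ZeroInitialization weight initialization policy, the SimpleWeightUpdate rule used during training, and the classification function. It then tests the implementation on an example from 《统计学习方法》 (Statistical Learning Methods) to show how the perceptron works and what it produces.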

Perceptron

Source code

/**
 * This class implements a simple perceptron (i.e., a single layer neural
 * network).  It converges if the supplied training dataset is linearly
 * separable.
 *
 * @tparam LearnPolicy Options of SimpleWeightUpdate and GradientDescent.
 * @tparam WeightInitializationPolicy Option of ZeroInitialization and
 *      RandomInitialization.
 */
template<typename LearnPolicy = SimpleWeightUpdate,
         typename WeightInitializationPolicy = ZeroInitialization,
         typename MatType = arma::mat>
class Perceptron
{
 public:
  /**
   * Constructor: create the perceptron with the given number of classes and
   * initialize the weight matrix, but do not perform any training.  (Call the
   * Train() function to perform training.)
   *
   * @param numClasses Number of classes in the dataset.
   * @param dimensionality Dimensionality of the dataset.
   * @param maxIterations Maximum number of iterations for the perceptron
   *      learning algorithm.
   */
  Perceptron(const size_t numClasses = 0,
             const size_t dimensionality = 0,
             const size_t maxIterations = 1000);

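The class template first fixes the overall configuration of the perceptron: by default the learning policy is SimpleWeightUpdate and the weight initialization policy is ZeroInitialization.

The first constructor takes three numbers: the number of classes in the dataset, the dimensionality of the data, and the maximum number of iterations.

Here is its implementation: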

/**
 * Construct the perceptron with the given number of classes and maximum number
 * of iterations.
 */
template<
    typename LearnPolicy,
    typename WeightInitializationPolicy,
    typename MatType
>
Perceptron<LearnPolicy, WeightInitializationPolicy, MatType>::Perceptron(
    const size_t numClasses,
    const size_t dimensionality,
    const size_t maxIterations) :
    maxIterations(maxIterations)
{
  WeightInitializationPolicy wip;
  wip.Initialize(weights, biases, dimensionality, numClasses);
}

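As shown, it creates an instance of WeightInitializationPolicy and calls its Initialize method, passing in the weight matrix, the bias vector, the data dimensionality, and the number of classes.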
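The pattern at work here is policy-based design: the class only requires that its policy type offers an Initialize function with the right signature. A minimal standalone sketch of that idea, using only the standard library, might look like the following. The names Matrix, Vector, ZeroInit, ConstantInit, and TinyModel are made up for illustration and are not part of mlpack.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for arma::mat / arma::vec, used only in this sketch.
using Vector = std::vector<double>;
using Matrix = std::vector<Vector>;  // weights[feature][class]

// Policy 1: every parameter starts at zero (the idea behind a zero
// initialization policy).
struct ZeroInit
{
  static void Initialize(Matrix& weights, Vector& biases,
                         std::size_t numFeatures, std::size_t numClasses)
  {
    weights.assign(numFeatures, Vector(numClasses, 0.0));
    biases.assign(numClasses, 0.0);
  }
};

// Policy 2: a made-up alternative that starts every parameter at 0.5.
struct ConstantInit
{
  static void Initialize(Matrix& weights, Vector& biases,
                         std::size_t numFeatures, std::size_t numClasses)
  {
    weights.assign(numFeatures, Vector(numClasses, 0.5));
    biases.assign(numClasses, 0.5);
  }
};

// The model never names a concrete policy; it only relies on Initialize().
template<typename InitPolicy = ZeroInit>
class TinyModel
{
 public:
  TinyModel(std::size_t numClasses, std::size_t dimensionality)
  {
    assert(numClasses > 0 && dimensionality > 0);
    InitPolicy::Initialize(weights, biases, dimensionality, numClasses);
  }

  const Matrix& Weights() const { return weights; }
  const Vector& Biases() const { return biases; }

 private:
  Matrix weights;
  Vector biases;
};
```

Swapping the policy is then a matter of changing one template argument, for example TinyModel<ConstantInit> instead of TinyModel<>.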

According to the header, WeightInitializationPolicy defaults to ZeroInitialization, so let's look at how ZeroInitialization is implemented:

/**
 * This class is used to initialize the matrix weightVectors to zero.
 */
class ZeroInitialization
{
 public:
  ZeroInitialization() { }

  inline static void Initialize(arma::mat& weights,
                                arma::vec& biases,
                                const size_t numFeatures,
                                const size_t numClasses)
  {
    weights.zeros(numFeatures, numClasses);
    biases.zeros(numClasses);
  }
}; // class ZeroInitialization

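Following the Armadillo documentation, weights becomes a zero matrix of size (data dimensionality × number of classes), and biases becomes a zero column vector with one element per class. Why they have these shapes has to do with how SimpleWeightUpdate works in this implementation: each class gets its own column of weights and its own bias.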
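In symbols, writing $d$ for the data dimensionality and $K$ for the number of classes (notation introduced here, not taken from the mlpack source), the layout and the way the training and classification code in this article use it can be summarized roughly as:

```latex
W = \begin{bmatrix} w_0 & w_1 & \cdots & w_{K-1} \end{bmatrix} \in \mathbb{R}^{d \times K},
\qquad
b \in \mathbb{R}^{K},
\qquad
s(x) = W^{\mathsf{T}} x + b \in \mathbb{R}^{K},
\qquad
\hat{y} = \arg\max_{k} \, s_k(x)
```

Column $w_k$ is the weight vector of class $k$, $s_k(x)$ is that class's score for a point $x$, and the predicted class $\hat{y}$ is the one with the largest score.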

Now for the remaining two constructors:

/**
   * Constructor: constructs the perceptron by building the weights matrix,
   * which is later used in classification.  The number of classes should be
   * specified separately, and the labels vector should contain values in the
   * range [0, numClasses - 1].  The data::NormalizeLabels() function can be
   * used if the labels vector does not contain values in the required range.
   *
   * @param data Input, training data.
   * @param labels Labels of dataset.
   * @param numClasses Number of classes in the dataset.
   * @param maxIterations Maximum number of iterations for the perceptron
   *      learning algorithm.
   */
  Perceptron(const MatType& data,
             const arma::Row<size_t>& labels,
             const size_t numClasses,
             const size_t maxIterations = 1000);

Implementation:

/**
 * Constructor - constructs the perceptron. Or rather, builds the weights
 * matrix, which is later used in classification.  It adds a bias input vector
 * of 1 to the input data to take care of the bias weights.
 *
 * @param data Input, training data.
 * @param labels Labels of dataset.
 * @param maxIterations Maximum number of iterations for the perceptron learning
 *      algorithm.
 */
template<
    typename LearnPolicy,
    typename WeightInitializationPolicy,
    typename MatType
>
Perceptron<LearnPolicy, WeightInitializationPolicy, MatType>::Perceptron(
    const MatType& data,
    const arma::Row<size_t>& labels,
    const size_t numClasses,
    const size_t maxIterations) :
    maxIterations(maxIterations)
{
  // Start training.
  Train(data, labels, numClasses);
}

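This constructor takes the training data, the labels, and the number of classes directly, and immediately trains on them.

There is also an alternate constructor that copies parameters from an existing perceptron (it takes extra arguments, so it is not a copy constructor in the strict C++ sense):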

/**
   * Alternate constructor which copies parameters from an already initiated
   * perceptron.
   *
   * @param other The other initiated Perceptron object from which we copy the
   *       values from.
   * @param data The data on which to train this Perceptron object on.
   * @param labels The labels of data.
   * @param numClasses Number of classes in the data.
   * @param instanceWeights Weight vector to use while training. For boosting
   *      purposes.
   */
  Perceptron(const Perceptron& other,
             const MatType& data,
             const arma::Row<size_t>& labels,
             const size_t numClasses,
             const arma::rowvec& instanceWeights);

Implementation:

/**
 * Alternate constructor which copies parameters from an already initiated
 * perceptron.
 *
 * @param other The other initiated Perceptron object from which we copy the
 *      values from.
 * @param data The data on which to train this Perceptron object on.
 * @param instanceWeights Weight vector to use while training. For boosting
 *      purposes.
 * @param labels The labels of data.
 */
template<
    typename LearnPolicy,
    typename WeightInitializationPolicy,
    typename MatType
>
Perceptron<LearnPolicy, WeightInitializationPolicy, MatType>::Perceptron(
    const Perceptron& other,
    const MatType& data,
    const arma::Row<size_t>& labels,
    const size_t numClasses,
    const arma::rowvec& instanceWeights) :
    maxIterations(other.maxIterations)
{
  Train(data, labels, numClasses, instanceWeights);
}

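It works much like the previous one, except that it takes maxIterations from other and passes an additional instanceWeights to training. As the comment says, this vector is used during training, so let's look at the training function: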

/**
 * Training function.  It trains on trainData using the cost matrix
 * instanceWeights.
 *
 * @param data Data to train on.
 * @param labels Labels of data.
 * @param instanceWeights Cost matrix. Stores the cost of mispredicting
 *      instances.  This is useful for boosting.
 */
template<
    typename LearnPolicy,
    typename WeightInitializationPolicy,
    typename MatType
>
void Perceptron<LearnPolicy, WeightInitializationPolicy, MatType>::Train(
    const MatType& data,
    const arma::Row<size_t>& labels,
    const size_t numClasses,
    const arma::rowvec& instanceWeights)
{
  // Do we need to resize the weights?
  if (weights.n_elem != numClasses)
  {
    WeightInitializationPolicy wip;
    wip.Initialize(weights, biases, data.n_rows, numClasses);
  }

  size_t j, i = 0;
  bool converged = false;
  size_t tempLabel;
  arma::uword maxIndexRow = 0, maxIndexCol = 0;
  arma::mat tempLabelMat;

  LearnPolicy LP;

  const bool hasWeights = (instanceWeights.n_elem > 0);

  while ((i < maxIterations) && (!converged))
  {
    // This outer loop is for each iteration, and we use the 'converged'
    // variable for noting whether or not convergence has been reached.
    ++i;
    converged = true;

    // Now this inner loop is for going through the dataset in each iteration.
    for (j = 0; j < data.n_cols; ++j)
    {
      // Multiply for each variable and check whether the current weight vector
      // correctly classifies this.
      tempLabelMat = weights.t() * data.col(j) + biases;

      tempLabelMat.max(maxIndexRow, maxIndexCol);

      // Check whether prediction is correct.
      if (maxIndexRow != labels(0, j))
      {
        // Due to incorrect prediction, convergence set to false.
        converged = false;
        tempLabel = labels(0, j);

        // Send maxIndexRow for knowing which weight to update, send j to know
        // the value of the vector to update it with.  Send tempLabel to know
        // the correct class.
        if (hasWeights)
          LP.UpdateWeights(data.col(j), weights, biases, maxIndexRow, tempLabel,
              instanceWeights(j));
        else
          LP.UpdateWeights(data.col(j), weights, biases, maxIndexRow,
              tempLabel);
      }
    }
  }
}

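After some setup, each column of the data (note that mlpack's Load function automatically transposes the matrix it reads, so every column is one data point) is scored with the current weights and biases. If the point is misclassified, the point itself, the weights, the biases, the index of the wrongly predicted class, and the index of the correct class are passed to the weight update function.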
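To make the control flow easier to follow, the loop can be re-created without mlpack or Armadillo. The sketch below is a simplified standalone version under a few assumptions: it stores one weight vector per class instead of a matrix, always starts from zero weights, ignores instance weights, and breaks ties towards the lowest class index. LinearModel, Predict, and TrainPerceptron are hypothetical names used only here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Vector = std::vector<double>;

// Hypothetical container for this sketch: weights[k] is the weight vector of
// class k (one column of the weight matrix), biases[k] is its bias.
struct LinearModel
{
  std::vector<Vector> weights;
  Vector biases;
};

// Index of the class with the highest score w_k . x + b_k.
// Ties go to the lowest index.
std::size_t Predict(const LinearModel& m, const Vector& x)
{
  std::size_t best = 0;
  double bestScore = 0.0;
  for (std::size_t k = 0; k < m.weights.size(); ++k)
  {
    assert(m.weights[k].size() == x.size());
    double score = m.biases[k];
    for (std::size_t d = 0; d < x.size(); ++d)
      score += m.weights[k][d] * x[d];
    if (k == 0 || score > bestScore)
    {
      best = k;
      bestScore = score;
    }
  }
  return best;
}

// Repeats passes over the data until one full pass makes no mistakes or
// maxIterations passes were made.  Returns the number of passes performed.
std::size_t TrainPerceptron(LinearModel& m,
                            const std::vector<Vector>& points,
                            const std::vector<std::size_t>& labels,
                            std::size_t numClasses,
                            std::size_t maxIterations)
{
  assert(!points.empty() && points.size() == labels.size());
  m.weights.assign(numClasses, Vector(points[0].size(), 0.0));
  m.biases.assign(numClasses, 0.0);

  std::size_t iteration = 0;
  bool converged = false;
  while (iteration < maxIterations && !converged)
  {
    ++iteration;
    converged = true;
    for (std::size_t j = 0; j < points.size(); ++j)
    {
      assert(labels[j] < numClasses);
      const std::size_t predicted = Predict(m, points[j]);
      if (predicted == labels[j])
        continue;

      // A mistake in this pass means we have not converged yet.
      converged = false;
      // Simple update with a step of 1: push the wrongly predicted class away
      // from the point and pull the correct class towards it.
      for (std::size_t d = 0; d < points[j].size(); ++d)
      {
        m.weights[predicted][d] -= points[j][d];
        m.weights[labels[j]][d] += points[j][d];
      }
      m.biases[predicted] -= 1.0;
      m.biases[labels[j]] += 1.0;
    }
  }
  return iteration;
}
```

As in the mlpack loop, a pass counts as converged only if every point in it was classified correctly, so the final pass never changes the weights.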

Here that function belongs to SimpleWeightUpdate:

/**
 * This class is used to update the weightVectors matrix according to the simple
 * update rule as discussed by Rosenblatt:
 *
 *  if a vector x has been incorrectly classified by a weight w,
 *  then w = w - x
 *  and  w'= w'+ x
 *
 *  where w' is the weight vector which correctly classifies x.
 */
namespace mlpack {
namespace perceptron {

class SimpleWeightUpdate
{
 public:
  /**
   * This function is called to update the weightVectors matrix.  It decreases
   * the weights of the incorrectly classified class while increasing the weight
   * of the correct class it should have been classified to.
   *
   * @tparam Type of vector (should be an Armadillo vector like arma::vec or
   *      arma::sp_vec or something similar).
   * @param trainingPoint Point that was misclassified.
   * @param weights Matrix of weights.
   * @param biases Vector of biases.
   * @param incorrectClass Index of class that the point was incorrectly
   *      classified as.
   * @param correctClass Index of the true class of the point.
   * @param instanceWeight Weight to be given to this particular point during
   *      training (this is useful for boosting).
   */
  template<typename VecType>
  void UpdateWeights(const VecType& trainingPoint,
                     arma::mat& weights,
                     arma::vec& biases,
                     const size_t incorrectClass,
                     const size_t correctClass,
                     const double instanceWeight = 1.0)
  {
    weights.col(incorrectClass) -= instanceWeight * trainingPoint;
    biases(incorrectClass) -= instanceWeight;

    weights.col(correctClass) += instanceWeight * trainingPoint;
    biases(correctClass) += instanceWeight;
  }
};

} // namespace perceptron
} // namespace mlpack

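The update is exactly as described in the class comment: for a misclassified vector $x$, the weight vector of the wrongly predicted class becomes $w = w - x$, and the weight vector of the correct class becomes $w' = w' + x$.
The instanceWeights mentioned earlier acts as a per-sample learning rate in the update function; the default is 1.0.
Both the weight vectors and the biases are updated in proportion to this instanceWeight.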
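As a rough illustration of how instanceWeight scales a single update, here is a standalone sketch of the same rule using std::vector instead of Armadillo types. UpdateWeightsSketch is a hypothetical name, and weights[k] stands in for column k of the weight matrix.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Vector = std::vector<double>;

// Applies one simple-update step after `point` was assigned to
// `incorrectClass` although it belongs to `correctClass`.
void UpdateWeightsSketch(const Vector& point,
                         std::vector<Vector>& weights,
                         Vector& biases,
                         std::size_t incorrectClass,
                         std::size_t correctClass,
                         double instanceWeight = 1.0)
{
  assert(weights.size() == biases.size());
  assert(incorrectClass < weights.size() && correctClass < weights.size());
  assert(weights[incorrectClass].size() == point.size());
  assert(weights[correctClass].size() == point.size());

  for (std::size_t d = 0; d < point.size(); ++d)
  {
    // Move the wrongly predicted class away from the point...
    weights[incorrectClass][d] -= instanceWeight * point[d];
    // ...and the correct class towards it, by the same amount.
    weights[correctClass][d] += instanceWeight * point[d];
  }
  // The biases move by the instance weight itself.
  biases[incorrectClass] -= instanceWeight;
  biases[correctClass] += instanceWeight;
}
```

For instance, starting from zero weights, calling it for the point (3, 3) with incorrectClass = 0 and correctClass = 1 leaves class 0 with weights (-3, -3) and bias -1, and class 1 with weights (3, 3) and bias 1.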

Finally, there is the classification function:

/**
 * Classification function. After training, use the weights matrix to classify
 * test, and put the predicted classes in predictedLabels.
 *
 * @param test Testing data or data to classify.
 * @param predictedLabels Vector to store the predicted classes after
 *      classifying test.
 */
template<
    typename LearnPolicy,
    typename WeightInitializationPolicy,
    typename MatType
>
void Perceptron<LearnPolicy, WeightInitializationPolicy, MatType>::Classify(
    const MatType& test,
    arma::Row<size_t>& predictedLabels)
{
  arma::vec tempLabelMat;
  arma::uword maxIndex = 0;

  // Could probably be faster if done in batch.
  for (size_t i = 0; i < test.n_cols; ++i)
  {
    tempLabelMat = weights.t() * test.col(i) + biases;
    tempLabelMat.max(maxIndex);
    predictedLabels(0, i) = maxIndex;
  }
}

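This is much like training: the transpose of the weights is multiplied by each column of the test set (each data point), the biases are added, and the index of the largest element is taken as the predicted class.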
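The same scoring step can be sketched without Armadillo. The version below is only an approximation of the logic above: ClassifySketch is a hypothetical name, weights[k] stands in for column k of the weight matrix, and ties are resolved towards the lowest class index.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Vector = std::vector<double>;

// Returns, for every test point, the index of the class whose score
// w_k . x + b_k is largest.
std::vector<std::size_t> ClassifySketch(const std::vector<Vector>& weights,
                                        const Vector& biases,
                                        const std::vector<Vector>& testPoints)
{
  assert(!weights.empty() && weights.size() == biases.size());

  // Unlike the listing above, the result is sized here for the caller.
  std::vector<std::size_t> predicted(testPoints.size(), 0);
  for (std::size_t i = 0; i < testPoints.size(); ++i)
  {
    double bestScore = 0.0;
    for (std::size_t k = 0; k < weights.size(); ++k)
    {
      assert(weights[k].size() == testPoints[i].size());
      double score = biases[k];
      for (std::size_t d = 0; d < testPoints[i].size(); ++d)
        score += weights[k][d] * testPoints[i][d];
      // Keep the first class that reaches the highest score.
      if (k == 0 || score > bestScore)
      {
        bestScore = score;
        predicted[i] = k;
      }
    }
  }
  return predicted;
}
```

Note that the Classify listing above writes to predictedLabels(0, i) without resizing it, which is why the test program later in this article constructs pred_labels with one element before calling it.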

Test

We use Example 2.1 from 《统计学习方法》 (Statistical Learning Methods): the positive instances are $x_1=(3,3)^{\mathsf{T}}$ and $x_2=(4,3)^{\mathsf{T}}$, and the negative instance is $x_3=(1,1)^{\mathsf{T}}$.

The dataset is tiny, so about 10 iterations is enough:

#include <iostream>
#include <mlpack/core.hpp>
#include <mlpack/methods/perceptron/perceptron.hpp>

using namespace mlpack;
using namespace mlpack::perceptron;
using namespace arma;
using namespace std;

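int main()
{
    // Load() transposes the CSV, so each column of dataset is one sample.
    mat dataset;
    mlpack::data::Load("../ml_test/data/my_data.csv", dataset);

    // The last row holds the labels; split it off from the features.
    Row<size_t> labels;
    labels = conv_to<decltype (labels)>::from(dataset.row(dataset.n_rows-1));
    dataset.shed_row(dataset.n_rows-1);

    cout << "dataset:\n" << dataset << "labels:\n" << labels << endl;

    // 2 classes, 2 features, at most 10 passes over the data.
    // Perceptron<> spells out the default template arguments explicitly.
    Perceptron<> p(2, 2, 10);
    p.Train(dataset, labels, 2);
    cout << "weights:\n" << p.Weights() << endl;
    cout << "bias:\n" <<  p.Biases() << endl;
}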

Output:
dataset:
3.0000 4.0000 1.0000
3.0000 3.0000 1.0000
labels:
1 1 0

weights:
-1.0000 1.0000
-1.0000 1.0000

bias:
3.0000
-3.0000

Following the explanation above, for a point $(x_1, x_2)$ to be classified, the prediction is:
$$\begin{cases} 0, & x_1 + x_2 - 3 \leqslant 0 \\ 1, & \text{otherwise} \end{cases}$$
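This rule can be read off the printed weights and biases. Writing $s_0$ and $s_1$ for the scores of class 0 and class 1 and $\hat{y}$ for the predicted class (symbols introduced here for convenience), the columns of the weight matrix and the bias entries give:

```latex
\begin{aligned}
s_0(x) &= -x_1 - x_2 + 3, \\
s_1(x) &= x_1 + x_2 - 3, \\
\hat{y} = 1 &\iff s_1(x) > s_0(x)
           \iff 2\,(x_1 + x_2 - 3) > 0
           \iff x_1 + x_2 - 3 > 0 .
\end{aligned}
```

When the two scores are equal, the output above is consistent with the first index being returned, which is why the boundary case $x_1 + x_2 - 3 = 0$ falls to class 0. As a check, the training point $(1, 1)$ gives $s_0 = 1$ and $s_1 = -1$, so it is assigned to class 0, matching its label.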

For example, continuing the program above:

mat test;
test << 5 << endr << 2 << endr;
Row<size_t> pred_labels(1);
p.Classify(test, pred_labels);
cout << "predicted labels: " << pred_labels[0] << endl;

Output:
predicted labels: 1

References

perceptron
《统计学习方法》 (Statistical Learning Methods)
