[Hand-Written Machine Learning Algorithms] The AdaBoost Algorithm

hanss2

Article link: https://blog.csdn.net/hanss2/article/details/80365766


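AdaBoost is an iterative algorithm. Its core idea is to train a sequence of different classifiers (weak classifiers) on the same training set and then combine them into a single, stronger final classifier (the strong classifier).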

The AdaBoost algorithm we will write

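We assume the weak learner is the perceptron from the earlier post, so every weak learner is a linear perceptron. The procedure can be described as follows: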

[Figure: AdaBoost algorithm flowchart; image not available]

The steps in this flowchart are:

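Initialize the weight distribution $D$ over the training data (one weight per sample): with $m$ samples, every training sample starts with the same weight, $1/m$.
Train the weak classifiers. If a sample is classified correctly in round $t$, its weight $D_i^{(t+1)}$ in the next round's training set is decreased; if it is misclassified, its weight $D_i^{(t+1)}$ is increased. The same round also yields the voting weight $w_t$ of the $t$-th weak classifier. The reweighted sample set is then used to train the next classifier, and training continues iteratively in this way.
Combine the trained weak classifiers into the strong classifier $h_s(x)$. Once training is finished, a weak classifier with a low classification error rate gets a large voting weight and therefore more influence on the final classification function, while one with a high error rate gets a small voting weight and less influence.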
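
Written out, the three steps above correspond to the standard AdaBoost update rules. This is a reference sketch: the indicator notation $\mathbf{1}[\cdot]$, the error $e_t$ and the normalizer $Z_t$ are symbols assumed here rather than taken from the text, labels are assumed to be $y_i \in \{-1,+1\}$, and $h_t$ denotes the $t$-th weak classifier with output in $\{-1,+1\}$.

$$
\begin{aligned}
D_i^{(1)} &= \frac{1}{m} \\
e_t &= \sum_{i=1}^{m} D_i^{(t)} \, \mathbf{1}[h_t(x_i) \neq y_i] \\
w_t &= \frac{1}{2} \ln\frac{1 - e_t}{e_t} \\
D_i^{(t+1)} &= \frac{D_i^{(t)} \exp\big(-w_t \, y_i \, h_t(x_i)\big)}{Z_t}, \qquad Z_t = \sum_{j=1}^{m} D_j^{(t)} \exp\big(-w_t \, y_j \, h_t(x_j)\big) \\
h_s(x) &= \operatorname{sign}\Big(\sum_{t=1}^{T} w_t \, h_t(x)\Big)
\end{aligned}
$$

When $e_t < 1/2$ we have $w_t > 0$, so a correctly classified sample ($y_i h_t(x_i) = +1$) is scaled by $e^{-w_t} < 1$ and a misclassified one by $e^{w_t} > 1$, which is exactly the decrease and increase described in the second step.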

Variable declarations

static std::vector<double> D;    // weight distribution over the samples
static std::vector<VectorXd*> H; // weak learners
static std::vector<double> W;    // final voting weight of each learner
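
To make the roles of the sample weights and the learner weights concrete, here is a minimal sketch of one boosting round using only the standard library. The function names `weighted_error`, `learner_weight` and `reweight` are hypothetical helpers introduced for illustration, and the weak learner's output is reduced to a per-sample "correct or not" flag.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Weighted error e_t: total weight of the misclassified samples.
double weighted_error(const std::vector<double>& D, const std::vector<bool>& correct) {
    double e = 0.0;
    for (std::size_t i = 0; i < D.size(); ++i) {
        if (!correct[i]) e += D[i];
    }
    return e;
}

// Voting weight w_t = 0.5 * ln((1 - e) / e); only defined for 0 < e < 1.
double learner_weight(double e) {
    assert(e > 0.0 && e < 1.0);
    return 0.5 * std::log((1.0 - e) / e);
}

// Next distribution: shrink correct samples by exp(-w), grow wrong ones by
// exp(+w), then normalize so the weights sum to 1 again.
std::vector<double> reweight(const std::vector<double>& D,
                             const std::vector<bool>& correct, double w) {
    std::vector<double> next(D.size());
    double Z = 0.0;  // normalizer
    for (std::size_t i = 0; i < D.size(); ++i) {
        next[i] = D[i] * std::exp(correct[i] ? -w : w);
        Z += next[i];
    }
    for (double& d : next) d /= Z;
    return next;
}
```

For example, with four samples of weight 0.25 each and a single mistake, the error is 0.25 and the voting weight is $\frac{1}{2}\ln 3 \approx 0.549$. After reweighting, the missed sample carries weight 0.5 and the other three carry 1/6 each, so the next weak learner has to pay much more attention to the sample that was missed.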

The full program

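#include "csv.hpp"      // CSV helper header from the original post (not used below)
#include <Eigen/Dense>
#include <algorithm>
#include <cmath>
#include <iostream>
#include <vector>

// maximum number of boosting rounds
#define T 10

using namespace std;
using namespace Eigen;

static std::vector<double> D;    // weight distribution over the samples
static std::vector<VectorXd*> H; // weak learners (perceptron weight vectors)
static std::vector<double> W;    // voting weight of each weak learner

// give every sample the same initial weight 1/m
void init_D(std::vector<double>& D)
{
    if (D.size() == 0)
    {
        return;
    }
    for (size_t i = 0; i < D.size(); ++i)
    {
        D[i] = 1.0/D.size();
    }
}

// output of one weak learner: sign(<h,x>) as +1 or -1
int classify(const VectorXd* x, const VectorXd* h)
{
    return h->dot(*x) > 0 ? 1 : -1;
}

// returns 1 if sample (X,y) is classified correctly by h, 0 otherwise
template<typename DType>
int judge(const VectorXd* X, DType y, const VectorXd* h)
{
    if (classify(X,h)*y > 0)
    {
        return 1;
    }
    return 0;
}

// weighted error: total weight of the misclassified samples
template<typename DType>
double compute_error(const std::vector<VectorXd*>& X, const std::vector<DType>& y, const VectorXd* h, const std::vector<double>& D)
{
    double error = 0;
    for (size_t i = 0; i < X.size(); ++i)
    {
        error += D[i]*(1 - judge(X[i],y[i],h));
    }
    return error;
}

// normalizer of the weight update: sum_i D[i]*exp(-W[step]*y[i]*h(x_i))
template<typename DType>
double compute_exp_error(int step, const std::vector<VectorXd*>& X, const std::vector<DType>& y, const VectorXd* h, const std::vector<double>& D)
{
    double error = 0;
    for (size_t i = 0; i < X.size(); ++i)
    {
        error += D[i]*exp(-W[step]*y[i]*classify(X[i],h));
    }
    return error;
}

// reweight the samples: misclassified ones gain weight, correct ones lose it
template<typename DType>
void updateD(int step, std::vector<double>& D, const std::vector<double>& W, const std::vector<VectorXd*>& X, const std::vector<DType>& y, const VectorXd* h)
{
    // compute the normalizer once, before any weight is changed
    double Z = compute_exp_error<DType>(step,X,y,h,D);
    for (size_t i = 0; i < D.size(); ++i)
    {
        D[i] = D[i]*exp(-W[step]*y[i]*classify(X[i],h))/Z;
    }
}

//weak learner WL: a perceptron whose update step is scaled by the sample weight
template<typename DType>
void WL(VectorXd* h, const std::vector<VectorXd*>& X, const std::vector<DType>& y, const std::vector<double>& D)
{
    int step = 0;
    while(step < 100)
    {
        int flag = 0; //counts the samples that violate y<h,x> > 0 in this pass
        for (size_t t = 0; t < X.size(); ++t)
        {
            // "<=" so that the all-zero initial h still triggers updates
            if ( y[t]*( h->dot(*X[t]) ) <= 0 )
            {
                *h += D[t]*y[t]*(*X[t]);
                flag++;
            }
        }
        if (flag == 0)
        {
            return;
        }
        step++;
    }
}

// labels y are expected to be +1 or -1; learners in H are never freed here
template<typename DType>
void train(const std::vector<VectorXd*>& X, const std::vector<DType>& y)
{
    if (X.empty())
    {
        return;
    }
    D.assign(X.size(), 0.0);
    init_D(D);
    int step = 0;
    while(step < T)
    {
        // every round trains a fresh perceptron starting from zero
        VectorXd* h = new VectorXd(VectorXd::Zero(X[0]->size()));
        WL<DType>(h,X,y,D);
        double e = compute_error<DType>(X,y,h,D);
        if (e >= 0.5)
        {
            // no better than random guessing: discard it and stop
            delete h;
            break;
        }
        e = std::max(e, 1e-10); // avoid log(1/0) for a perfect learner
        H.push_back(h);
        W.push_back(log(1/e - 1)/2);
        updateD<DType>(step,D,W,X,y,h);
        step++;
    }
}

// weighted vote of all trained weak learners; returns +1 or -1
int predict(const VectorXd* x)
{
    double res = 0;
    for (size_t t = 0; t < H.size(); ++t)
    {
        res += W[t]*classify(x,H[t]);
    }
    return res > 0 ? 1 : -1;
}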
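
The program above depends on Eigen and on a perceptron weak learner, which makes it hard to trace by hand. As a complement, here is a self-contained sketch of the same boosting loop with a deliberately simpler weak learner swapped in: a one-dimensional decision stump instead of the perceptron. The `Stump` struct and the functions `stump_predict`, `best_stump`, `adaboost_train` and `adaboost_predict` are hypothetical names introduced for illustration, and candidate thresholds are placed 0.5 above each input, which assumes integer-spaced data.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Decision stump on 1-D inputs: predicts `polarity` when x < threshold and
// -polarity otherwise. It stands in for the perceptron weak learner.
struct Stump {
    double threshold;
    int polarity;   // +1 or -1
    double weight;  // voting weight w_t, filled in by adaboost_train
};

int stump_predict(const Stump& s, double x) {
    return x < s.threshold ? s.polarity : -s.polarity;
}

// Exhaustively pick the stump with the smallest weighted error under D.
// Ties keep the first candidate found.
Stump best_stump(const std::vector<double>& X, const std::vector<int>& y,
                 const std::vector<double>& D, double& best_error) {
    Stump best{0.0, 1, 0.0};
    best_error = 2.0;  // larger than any possible weighted error
    for (std::size_t k = 0; k < X.size(); ++k) {
        for (int polarity : {1, -1}) {
            Stump s{X[k] + 0.5, polarity, 0.0};  // assumes integer-spaced inputs
            double e = 0.0;
            for (std::size_t i = 0; i < X.size(); ++i) {
                if (stump_predict(s, X[i]) != y[i]) e += D[i];
            }
            if (e < best_error) {
                best_error = e;
                best = s;
            }
        }
    }
    return best;
}

// Run up to `rounds` boosting rounds; labels in y must be +1 or -1.
std::vector<Stump> adaboost_train(const std::vector<double>& X,
                                  const std::vector<int>& y, int rounds) {
    const std::size_t m = X.size();
    std::vector<double> D(m, 1.0 / m);  // uniform initial distribution
    std::vector<Stump> ensemble;
    for (int t = 0; t < rounds; ++t) {
        double e = 0.0;
        Stump s = best_stump(X, y, D, e);
        if (e >= 0.5) break;            // no better than chance: stop
        e = std::max(e, 1e-10);         // keep the logarithm finite
        s.weight = 0.5 * std::log((1.0 - e) / e);
        ensemble.push_back(s);
        double Z = 0.0;                 // normalizer of the new distribution
        for (std::size_t i = 0; i < m; ++i) {
            D[i] *= std::exp(-s.weight * y[i] * stump_predict(s, X[i]));
            Z += D[i];
        }
        for (double& d : D) d /= Z;
    }
    return ensemble;
}

// Weighted vote of all stumps; returns +1 or -1.
int adaboost_predict(const std::vector<Stump>& ensemble, double x) {
    double score = 0.0;
    for (const Stump& s : ensemble) score += s.weight * stump_predict(s, x);
    return score > 0 ? 1 : -1;
}
```

As a toy check, take the inputs 0 through 9 with labels (+1, +1, +1, -1, -1, -1, +1, +1, +1, -1). No single stump gets fewer than three of these ten points wrong, but calling `adaboost_train(X, y, 3)` should pick the thresholds 2.5, 8.5 and 5.5 in turn, and the weighted vote of those three stumps classifies all ten training points correctly. This is the behavior the third step describes: each stump is weak on its own, and the weights $w_t$ let the more accurate ones dominate the final decision.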