AdaBoost: Algorithm Principles and Implementation

Model Overview

AdaBoost belongs to the family of boosting models. The idea behind boosting is to start from a weak learning algorithm, learn repeatedly to obtain a series of weak classifiers (also called base classifiers), and then combine these weak classifiers into a strong classifier. Most boosting methods work by changing the probability distribution over the training data and learning a weak classifier for each different training-data distribution.

The idea of AdaBoost is to increase, in each round of training, the weights of the samples misclassified by the previous round's weak classifier, so that later classifiers focus on correcting those mistakes. After all classifiers are trained, AdaBoost combines them by weighted majority vote, giving larger weights to classifiers with smaller classification error so that they play a bigger role in the vote.

AdaBoost can be viewed as a special case of the additive model, of the form:

$$f(x) = \sum_{m=1}^{M} \alpha_m G_m(x)$$

where $G_m(x)$ is a base classifier and $\alpha_m$ is its coefficient.
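To make this concrete, here is a tiny sketch of evaluating such an additive model; the two threshold stumps and their coefficients are made-up values for illustration, not part of the derivation:

import numpy as np

x = np.array([1.0, 4.0, 7.0])

# two hypothetical weak classifiers (threshold stumps) with made-up coefficients
def G1(x): return np.where(x > 2.5, 1, -1)
def G2(x): return np.where(x > 5.5, -1, 1)
alphas = [0.42, 0.65]

# f(x) = sum_m alpha_m * G_m(x); the predicted class is sign(f(x))
f = alphas[0] * G1(x) + alphas[1] * G2(x)
print(f, np.sign(f))   # [ 0.23  1.07 -0.23] -> [ 1.  1. -1.]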

Model Strategy

Viewing AdaBoost as an additive model, a natural choice of loss function is the exponential loss:

$$L(y, f(x)) = \exp[-y f(x)]$$

Let $f_m(x)$ denote the combined learner obtained from the first $m$ weak learners. With $\alpha_m, G_m(x)$ the parameters of the $m$-th iteration, we have $f_m(x) = f_{m-1}(x) + \alpha_m G_m(x)$.

Substituting this into the loss function gives:

$$loss = \sum_{i=1}^{N} \exp[-y_i (f_{m-1}(x_i) + \alpha G(x_i))]$$

By the principle of empirical risk minimization, $\alpha_m, G_m(x)$ are:

$$(\alpha_m, G_m(x)) = \mathop{\arg\min}\limits_{\alpha, G} \sum_{i=1}^{N} \exp[-y_i (f_{m-1}(x_i) + \alpha G(x_i))]$$

For fixed $\alpha$, the $G_m^*(x)$ that minimizes the expression above is

$$G_m^*(x) = \mathop{\arg\min}\limits_{G} \sum_{i=1}^{N} \bar{w}_{mi} \, I(y_i \neq G(x_i))$$

where $\bar{w}_{mi} = \exp[-y_i f_{m-1}(x_i)]$, i.e. the weak classifier that minimizes the weighted classification error.
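In practice, when the base classifier is a decision stump, finding $G_m^*$ amounts to scanning candidate thresholds and keeping the one with the smallest weighted error. Below is a minimal sketch of that search on made-up one-dimensional data (the implementation later in this post performs the same scan over every feature):

import numpy as np

x = np.array([0., 1., 2., 3., 4., 5.])
y = np.array([1, 1, 1, -1, -1, 1])
w = np.full(6, 1 / 6)                                # current weights w_mi

best = (np.inf, None, None)                          # (error, threshold, direction)
for v in np.arange(x.min() + 0.5, x.max(), 0.5):     # candidate thresholds
    for direction in ('positive', 'negative'):       # x > v -> +1, or x > v -> -1
        if direction == 'positive':
            pred = np.where(x > v, 1, -1)
        else:
            pred = np.where(x > v, -1, 1)
        err = np.sum(w * (pred != y))                # weighted misclassification
        if err < best[0]:
            best = (err, v, direction)
print(best)                                          # smallest weighted error wins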

As for $\alpha_m^*$, expanding the loss function gives:

$$\sum_{i=1}^{N} \bar{w}_{mi} \exp[-y_i \alpha G(x_i)] = \sum_{y_i = G(x_i)} \bar{w}_{mi} e^{-\alpha} + \sum_{y_i \neq G(x_i)} \bar{w}_{mi} e^{\alpha} = (e^{\alpha} - e^{-\alpha}) \sum_{i=1}^{N} \bar{w}_{mi} \, I(y_i \neq G(x_i)) + e^{-\alpha} \sum_{i=1}^{N} \bar{w}_{mi}$$

Taking the derivative with respect to $\alpha$ and setting it to zero, $(e^{\alpha} + e^{-\alpha}) \sum_{i=1}^{N} \bar{w}_{mi} I(y_i \neq G(x_i)) = e^{-\alpha} \sum_{i=1}^{N} \bar{w}_{mi}$, which solves to

$$\alpha_m^* = \frac{1}{2} \log \frac{1 - e_m}{e_m}$$

where

$$e_m = \frac{\sum_{i=1}^{N} \bar{w}_{mi} \, I(y_i \neq G(x_i))}{\sum_{i=1}^{N} \bar{w}_{mi}} = \sum_{i=1}^{N} w_{mi} \, I(y_i \neq G(x_i))$$

with $w_{mi}$ the normalized weights. Finally, from $\bar{w}_{mi} = \exp[-y_i f_{m-1}(x_i)]$ and $f_m(x) = f_{m-1}(x) + \alpha_m G_m(x)$, the per-round weight update is:

$$\bar{w}_{m+1,i} = \bar{w}_{m,i} \exp[-y_i \alpha_m G_m(x_i)]$$
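As a quick numerical illustration of this update rule (labels, predictions, and weights below are toy values): after reweighting and dividing by the normalization factor, the misclassified point's weight grows while the others shrink.

import numpy as np

y    = np.array([1, 1, -1, -1])          # true labels (toy values)
pred = np.array([1, 1, -1, 1])           # G_m's predictions: the last point is wrong
w    = np.full(4, 0.25)                  # current normalized weights w_mi

e_m = np.sum(w * (pred != y))            # weighted error rate = 0.25
alpha_m = 0.5 * np.log((1 - e_m) / e_m)  # ~= 0.549
w = w * np.exp(-alpha_m * y * pred)      # exponential reweighting
w /= w.sum()                             # divide by Z_m so the weights sum to 1
print(w)                                 # ~= [0.167 0.167 0.167 0.5]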

Model Algorithm

Input: training set $T = \{(x_1, y_1), (x_2, y_2), \ldots, (x_N, y_N)\}$, where $x_i \in \mathcal{X} \subseteq \mathbf{R}^n$ and $y_i \in \mathcal{Y} = \{-1, +1\}$
Output: the final classifier $G(x)$
(1) Initialize the weight distribution over the training data:
$$D_1 = (w_{11}, \ldots, w_{1i}, \ldots, w_{1N}), \quad w_{1i} = \frac{1}{N}, \quad i = 1, 2, \ldots, N$$
(2) For m = 1, 2, …, M:
(a) Learn a base classifier from the training data weighted by $D_m$:
$$G_m(x): \mathcal{X} \to \{-1, +1\}$$
(b) Compute the classification error rate of $G_m(x)$ on the training set:
$$e_m = \sum_{i=1}^{N} w_{mi} \, I(y_i \neq G_m(x_i))$$
(c) Compute the coefficient of $G_m$:
$$\alpha_m = \frac{1}{2} \log \frac{1 - e_m}{e_m}$$
For example, $e_m = 0.3$ gives $\alpha_m = \frac{1}{2} \log \frac{0.7}{0.3} \approx 0.42$; note that $\alpha_m > 0$ whenever $e_m < \frac{1}{2}$, and it grows as $e_m$ decreases (see the worked sketch after the algorithm).
(d) Update the weight distribution over the training data:
$$D_{m+1} = (w_{m+1,1}, \ldots, w_{m+1,i}, \ldots, w_{m+1,N})$$
$$w_{m+1,i} = \frac{w_{m,i}}{Z_m} \exp(-\alpha_m y_i G_m(x_i))$$
where $Z_m = \sum_{i=1}^{N} w_{m,i} \exp(-\alpha_m y_i G_m(x_i))$ is the normalization factor.
(3) Build the final classifier:
$$G(x) = \mathrm{sign}\left(\sum_{m=1}^{M} \alpha_m G_m(x)\right)$$
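Putting steps (1)–(3) together, here is a compact NumPy sketch of the whole loop on a one-dimensional toy dataset (the ten points are the classic textbook example; stump base learners and M = 3 rounds are choices made for illustration):

import numpy as np

X = np.arange(10, dtype=float)
y = np.array([1, 1, 1, -1, -1, -1, 1, 1, 1, -1])

def best_stump(X, y, w):
    """Return the stump (error, threshold, direction) with minimal weighted error."""
    best = (np.inf, None, 1)
    for v in np.arange(X.min() + 0.5, X.max(), 1.0):
        for d in (1, -1):                             # d=1: x>v -> +1; d=-1: x>v -> -1
            pred = d * np.where(X > v, 1, -1)
            err = np.sum(w * (pred != y))
            if err < best[0]:
                best = (err, v, d)
    return best

w, classifiers = np.full(len(X), 0.1), []
for _ in range(3):                                    # M = 3 rounds
    e_m, v, d = best_stump(X, y, w)                   # steps (a), (b)
    alpha = 0.5 * np.log((1 - e_m) / e_m)             # step (c)
    pred = d * np.where(X > v, 1, -1)
    w = w * np.exp(-alpha * y * pred); w /= w.sum()   # step (d)
    classifiers.append((alpha, v, d))

# step (3): weighted majority vote; after three rounds all ten points are correct
F = sum(a * d * np.where(X > v, 1, -1) for a, v, d in classifiers)
print(np.sign(F) == y)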

Code Implementation

First, import the required packages:

import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
import pandas as pd
import matplotlib.pyplot as plt

Next, load the test data (the first two classes of the iris dataset, with labels mapped to ±1):

def create_data():
    iris = load_iris()
    df = pd.DataFrame(iris.data, columns=iris.feature_names)
    df['label'] = iris.target
    df.columns = ['sepal length', 'sepal width', 'petal length', 'petal width', 'label']
    # keep the first two classes (first 100 rows) and map labels to {-1, +1}
    data = df.iloc[:100, [0, 1, -1]].copy()
    data['label'] = data['label'].apply(lambda x: 1 if x == 1 else -1)
    data = np.array(data)
    return data[:, :2], data[:, -1]

The core of the algorithm is the part that generates $G(x)$ and $\alpha$:

class Adaboost:
	def __init__(self, n_estimators, learning_rate):
		self.n_estimators = n_estimators
		self.learning_rate = learning_rate  # step size for enumerating thresholds
		self.model = []

	def fit(self, X_train, y_train):
		"""Fit the training data."""
		self.m, self.n = X_train.shape
		# initialize the weight distribution
		self.weight = [np.ones(self.m) / self.m]
		for i in range(self.n_estimators):
			compare_array, position, threshold, error, axis = self._G(X_train, y_train, self.weight[i])
			alpha_i = self.calculate_alpha(error)
			Z_i = self.calculate_Z(alpha_i, self.weight[i], compare_array, y_train)
			# w_{m+1,i} = w_{m,i} * exp(-alpha_m * y_i * G_m(x_i)) / Z_m
			self.weight.append(self.weight[i] * np.exp(-alpha_i * y_train * compare_array) / Z_i)
			self.model.append((axis, alpha_i, position, threshold))

	def calculate_alpha(self, error):
		"""alpha_m = 1/2 * log((1 - e_m) / e_m)"""
		return 0.5 * np.log((1 - error) / error)

	def calculate_Z(self, alpha, weight, pre_y, y):
		"""Normalization factor Z_m = sum_i w_i * exp(-alpha * y_i * G(x_i))."""
		return np.dot(weight, np.exp(-alpha * pre_y * y))

	def calculate_err_rate(self, pre_y, y, weight):
		"""Weighted misclassification rate of the predictions pre_y."""
		return sum(weight[i] for i in range(self.m) if pre_y[i] != y[i])

	def G(self, threshold, x, position):
		"""Base classifier: a decision stump on one feature."""
		if position == 'positive':
			pre_y = np.array([1 if x[i] > threshold else -1 for i in range(len(x))])
		else:
			pre_y = np.array([-1 if x[i] > threshold else 1 for i in range(len(x))])
		return pre_y

		
	def _G(self, X_train, y_train, weight):
		"""Search every feature and threshold for the stump with minimal weighted error."""
		min_error = np.inf
		position = None
		threshold = None
		compare_array = None
		axis = None
		for i in range(self.n):
			feature = X_train[:, i]
			feature_max = max(feature)
			feature_min = min(feature)
			# enumerate candidate thresholds with step self.learning_rate
			iter_num = int((feature_max - feature_min) // self.learning_rate)
			for j in range(iter_num):
				vi = feature_min + j * self.learning_rate
				pre_y_positive = self.G(vi, feature, 'positive')
				err_positive = self.calculate_err_rate(pre_y_positive, y_train, weight)
				pre_y_negative = self.G(vi, feature, 'negative')
				err_negative = self.calculate_err_rate(pre_y_negative, y_train, weight)
				# keep whichever direction has the smaller weighted error
				if err_negative < err_positive:
					if err_negative < min_error:
						min_error = err_negative
						position = 'negative'
						compare_array = pre_y_negative
						threshold = vi
						axis = i
				else:
					if err_positive < min_error:
						min_error = err_positive
						position = 'positive'
						compare_array = pre_y_positive
						threshold = vi
						axis = i
		return compare_array, position, threshold, min_error, axis



	def predict(self, X_test):
		"""Predict labels for the test set by weighted majority vote."""
		result = np.zeros(X_test.shape[0])
		for axis, alpha_i, position, threshold in self.model:
			# f(x) += alpha_m * G_m(x), using the stored feature axis and direction
			result += alpha_i * self.G(threshold, X_test[:, axis], position)
		return np.array([1 if r > 0 else -1 for r in result])

	def score(self, X_test, y_test):
		"""Accuracy of the model on the test set."""
		pred = self.predict(X_test)
		return np.sum(pred == y_test) / X_test.shape[0]
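Finally, a minimal usage sketch that ties everything together (the hyperparameter values n_estimators=50 and learning_rate=0.2 are arbitrary choices for illustration):

X, y = create_data()
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

clf = Adaboost(n_estimators=50, learning_rate=0.2)
clf.fit(X_train, y_train)
print(clf.score(X_test, y_test))

For comparison, scikit-learn ships a production-ready version of the same idea in sklearn.ensemble.AdaBoostClassifier, with depth-1 decision trees as the default base learner.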