Machine Learning with Scikit-Learn and TensorFlow 7.8 AdaBoost

Book information
Hands-On Machine Learning with Scikit-Learn and TensorFlow
Publisher: O’Reilly Media, Inc, USA
Paperback: 566 pages
Language: English
ISBN: 1491962291
Barcode: 9781491962299
Product dimensions: 18 x 2.9 x 23.3 cm
ASIN: 1491962291

This series of posts is a translation of the book.
Code and data download: https://github.com/ageron/handson-ml

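Boosting is a family of methods that combine weak models into a strong one. The core idea is to train models sequentially, each one trying to correct the errors of its predecessors. There are many boosting methods; the best known are AdaBoost (Adaptive Boosting) and Gradient Boosting.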

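One way for a new model to correct its predecessors is to pay more attention to the training instances they got wrong. As a result, each new model focuses more and more on the hard training instances. This is the core idea of AdaBoost.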

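For example, when AdaBoost is used for classification, the first model (a decision tree, say) is trained and then used to predict the training set, and the weights of the misclassified training instances are increased. The second model is trained with the updated weights and again predicts the training set, the weights are updated once more, and so on.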

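The figure below shows the decision boundaries of the consecutively trained models (the base model is a highly regularized SVM). The first model misclassifies many training instances, so their weights are increased, and the second model therefore does better on them. The right-hand plot shows the result when the learning rate (the rate at which the weights of misclassified instances are increased) is halved.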

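# Note: X, y, X_train and y_train are assumed to come from the earlier
# sections of this series; they are not created in this snippet.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.svm import SVC
from matplotlib.colors import ListedColormap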

def plot_decision_boundary(clf, X, y, axes=[-1.5, 2.5, -1, 1.5], alpha=0.5):
    x1s = np.linspace(axes[0], axes[1], 100)
    x2s = np.linspace(axes[2], axes[3], 100)
    x1, x2 = np.meshgrid(x1s, x2s)
    X_new = np.c_[x1.ravel(), x2.ravel()]
    y_pred = clf.predict(X_new).reshape(x1.shape)
    custom_cmap = ListedColormap(['#fafab0','#9898ff'])
    plt.contourf(x1, x2, y_pred, alpha=0.3, cmap=custom_cmap)
    custom_cmap2 = ListedColormap(['#7d7d58','#4c4c7f'])
    plt.contour(x1, x2, y_pred, cmap=custom_cmap2, alpha=0.8)
    plt.plot(X[:, 0][y==0], X[:, 1][y==0], "yo", alpha=alpha)
    plt.plot(X[:, 0][y==1], X[:, 1][y==1], "bs", alpha=alpha)
    plt.axis(axes)
    plt.xlabel(r"$x_1$", fontsize=18)
    plt.ylabel(r"$x_2$", fontsize=18, rotation=0)

m = len(X_train)
plt.figure(figsize=(11, 4))
for subplot, learning_rate in ((121, 1), (122, 0.5)):
    sample_weights = np.ones(m)
    for i in range(4):
        plt.subplot(subplot)
        svm_clf = SVC(kernel="rbf", C=0.05)
        svm_clf.fit(X_train, y_train, sample_weight=sample_weights)
        y_pred = svm_clf.predict(X_train)
        sample_weights[y_pred != y_train] *= (1 + learning_rate)
        plot_decision_boundary(svm_clf, X, y, alpha=0.2)
        # Show the learning rate actually used in the weight update above
        plt.title("learning_rate = {}".format(learning_rate), fontsize=16)
plt.subplot(121)
plt.text(-0.7, -0.65, "1", fontsize=14)
plt.text(-0.6, -0.10, "2", fontsize=14)
plt.text(-0.5,  0.30, "3", fontsize=14)
plt.text(-0.3,  0.60, "4", fontsize=14)
plt.show()

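[Figure: decision boundaries of consecutive predictors, with learning_rate = 1 (left) and learning_rate = 0.5 (right)]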

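Note:
This example is for illustration only. In practice, SVMs are a poor choice of base model for AdaBoost, because they are slow to train and tend to be unstable when used this way.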

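AdaBoost's sequential training resembles gradient descent. The difference is that gradient descent minimizes a cost function by searching for the best parameters, whereas AdaBoost improves the ensemble by adding new models to it. Once all models are trained, AdaBoost combines their predictions much as bagging does, except that each model has its own weight, determined by how well it predicts the weighted training set.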

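Note:
A drawback of sequential training is that boosting cannot be parallelized: each model can only start training after the previous ones have finished.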

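Let us look at AdaBoost in detail. Initially, every training instance has weight $w_i = \frac{1}{m}$, where $m$ is the number of training instances. After the first model is trained, it predicts the training set. Its weighted error rate $r_j$ is the sum of the weights of the misclassified training instances divided by the sum of all the weights. The model's weight is then $\alpha_j = \eta \log \frac{1 - r_j}{r_j}$, where $\eta$ is the learning rate (1 by default). The more accurate the model, the higher its weight. If the model is close to random guessing, its weight is close to 0. If it is wrong more often than it is right, its weight is negative.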
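To make these two formulas concrete, here is a minimal NumPy sketch. The toy labels, the predictions, and the helper name `predictor_weight` are made up for illustration and do not come from the book; the sketch also assumes an error rate strictly between 0 and 1.

```python
import numpy as np

def predictor_weight(y_true, y_pred, sample_weights, learning_rate=1.0):
    """Return (r_j, alpha_j) for one predictor.

    Assumes 0 < r_j < 1; a perfect or always-wrong predictor would
    lead to a division by zero or log(0) and needs special handling.
    """
    wrong = y_pred != y_true
    # Weighted error rate: weight of misclassified instances / total weight
    r = sample_weights[wrong].sum() / sample_weights.sum()
    # Predictor weight: eta * log((1 - r) / r)
    alpha = learning_rate * np.log((1 - r) / r)
    return r, alpha

# Hypothetical toy data: 8 instances, 2 of them misclassified
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0])
w = np.full(len(y_true), 1 / len(y_true))  # initial weights 1/m

r, alpha = predictor_weight(y_true, y_pred, w)
print(r, alpha)  # r is 0.25, so alpha is log(3)

# The sign of alpha depends on which side of 0.5 the error rate falls
for rate in (0.1, 0.5, 0.9):
    print(rate, np.log((1 - rate) / rate))
```

With uniform weights the weighted error rate is simply the fraction of misclassified instances. The last loop shows the three regimes described above: a positive weight for an accurate predictor, zero for random guessing, and a negative weight for a predictor that is wrong most of the time.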

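Next, the training instance weights are updated. If an instance was predicted correctly, its weight $w_i$ stays the same; otherwise it becomes $w_i \exp(\alpha_j)$. Finally, all the weights are normalized, that is, divided by $\sum_{i=1}^{m} w_i$.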
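The update rule can be sketched in a few lines of NumPy. The data and the helper name `update_weights` are hypothetical; the value of `alpha` is the predictor weight that corresponds to a weighted error rate of 0.25 with a learning rate of 1.

```python
import numpy as np

def update_weights(sample_weights, y_true, y_pred, alpha):
    """Multiply the weights of misclassified instances by exp(alpha),
    then normalize so that all weights sum to 1."""
    new_weights = sample_weights.copy()
    new_weights[y_pred != y_true] *= np.exp(alpha)
    return new_weights / new_weights.sum()

# Hypothetical toy data: instances 2 and 5 are misclassified
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0])
w = np.full(8, 1 / 8)
alpha = np.log(3.0)  # predictor weight for an error rate of 0.25, eta = 1

new_w = update_weights(w, y_true, y_pred, alpha)
print(new_w)
```

In this example the two misclassified instances end up with weight 0.25 each and the six correctly classified ones with 1/12 each, so the next model sees the hard instances as carrying half of the total weight.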

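The second model is then trained with the updated weights, and the whole process repeats. It stops when the desired number of models is reached or a perfect model is found.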

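To make a prediction, AdaBoost computes the predictions of all the models and weights each vote by the model's weight. The final prediction is the class that receives the largest total weight.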
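The weighted vote can be illustrated with a small NumPy sketch. The predictions, the model weights, and the helper name `weighted_vote` are invented for this example; this is not scikit-learn's internal implementation.

```python
import numpy as np

def weighted_vote(predictions, alphas, classes):
    """Combine class predictions with a weighted vote.

    predictions: array of shape (n_predictors, n_samples) with class labels
    alphas:      one weight per predictor
    classes:     the possible class labels
    """
    predictions = np.asarray(predictions)
    alphas = np.asarray(alphas, dtype=float)
    # votes[k, i] = total weight of the predictors voting classes[k] for sample i
    votes = np.array([
        (alphas[:, None] * (predictions == c)).sum(axis=0) for c in classes
    ])
    return np.asarray(classes)[votes.argmax(axis=0)]

# Three hypothetical predictors, three samples
preds = [[1, 0, 1],
         [0, 0, 1],
         [0, 1, 1]]
alphas = [1.2, 0.4, 0.5]

result = weighted_vote(preds, alphas, classes=[0, 1])
print(result)  # [1 0 1]
```

For the first sample, two of the three predictors vote for class 0, but their combined weight (0.9) is smaller than the weight of the single predictor voting for class 1 (1.2), so the ensemble predicts class 1. This is where the weighted vote differs from a plain majority vote.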

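In fact, scikit-learn uses SAMME. When there are only two classes, SAMME is equivalent to AdaBoost. In addition, if the base model can estimate class probabilities, scikit-learn uses SAMME.R, which relies on the base models' class probabilities rather than their predictions to produce the final result and generally performs better.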

Translator's note:
According to the official documentation, AdaBoostClassifier uses AdaBoost-SAMME and AdaBoost-SAMME.R (selected with the algorithm parameter), and AdaBoostRegressor uses AdaBoost.R2.

The following is an AdaBoost example.

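# Note: X, y, X_train and y_train are assumed to come from the earlier
# sections of this series; they are not created in this snippet.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import AdaBoostClassifier
from matplotlib.colors import ListedColormap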

def plot_decision_boundary(clf, X, y, axes=[-1.5, 2.5, -1, 1.5], alpha=0.5):
    x1s = np.linspace(axes[0], axes[1], 100)
    x2s = np.linspace(axes[2], axes[3], 100)
    x1, x2 = np.meshgrid(x1s, x2s)
    X_new = np.c_[x1.ravel(), x2.ravel()]
    y_pred = clf.predict(X_new).reshape(x1.shape)
    custom_cmap = ListedColormap(['#fafab0','#9898ff'])
    # Filled contours only; the unused linewidth argument has been dropped
    plt.contourf(x1, x2, y_pred, alpha=0.3, cmap=custom_cmap)
    plt.plot(X[:, 0][y==0], X[:, 1][y==0], "yo", alpha=alpha)
    plt.plot(X[:, 0][y==1], X[:, 1][y==1], "bs", alpha=alpha)
    plt.axis(axes)
    plt.xlabel(r"$x_1$", fontsize=18)
    plt.ylabel(r"$x_2$", fontsize=18, rotation=0)

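# 200 shallow decision trees as base models, learning rate 0.5.
# Note: recent scikit-learn releases have deprecated algorithm="SAMME.R";
# if this call warns or fails, try dropping the algorithm argument.
ada_clf = AdaBoostClassifier(
        DecisionTreeClassifier(max_depth=2), n_estimators=200,
        algorithm="SAMME.R", learning_rate=0.5, random_state=42
    )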
ada_clf.fit(X_train, y_train)
plot_decision_boundary(ada_clf, X, y)

[Figure: decision boundary of the AdaBoost classifier]

Note:
If AdaBoost overfits the training set, consider reducing the number of models or regularizing the base model more strongly.
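One possible way to act on this advice is sketched below. It assumes a moons dataset created with `make_moons` and a held-out validation split (both are assumptions, since the data is not defined in this section), uses decision stumps (`max_depth=1`) as a more strongly regularized base model, and uses `staged_score` to see how validation accuracy changes as models are added.

```python
from sklearn.datasets import make_moons
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Assumed dataset and split, for illustration only
X, y = make_moons(n_samples=500, noise=0.30, random_state=42)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=42)

ada_clf = AdaBoostClassifier(
    DecisionTreeClassifier(max_depth=1),  # shallower tree: more regularized base model
    n_estimators=200, learning_rate=0.5, random_state=42)
ada_clf.fit(X_train, y_train)

# staged_score yields the validation accuracy after 1, 2, 3, ... models
val_scores = list(ada_clf.staged_score(X_val, y_val))
best_n = val_scores.index(max(val_scores)) + 1
print("best number of models:", best_n)
print("validation accuracy:", max(val_scores))
```

If the best validation accuracy is reached well before the last model, retraining with `n_estimators=best_n` gives a smaller ensemble that is less likely to overfit. Choosing the size on a validation set rather than on the training set is a design choice of this sketch, not something prescribed by the book.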
