An Introduction to Statistical Learning | Reading Notes 15 | Generalized Additive Models

ISLR 7.7 Generalized Additive Models

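Key points:
0. Introduction to generalized additive models
1. GAMs for regression problems
-- an extension of multiple linear regression
2. GAMs for classification problems
-- an extension of logistic regression
3. Advantages and limitations of GAMs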

0. Generalized Additive Models

Polynomials, step functions, and splines can be seen as extensions of simple linear regression:

  • they flexibly predict a response $Y$ on the basis of a single predictor $X$

「Generalized Additive Models (GAMs)」 provide a general framework for extending the standard multiple linear model

  • by allowing non-linear functions of each of the variables,
    while maintaining additivity

1. GAMs for Regression

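Recall multiple linear regression:

$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_p x_{ip} + \epsilon_i$$

GAMs replace each linear component $\beta_j x_{ij}$ with a non-linear (smooth) function $f_j(x_{ij})$:

$$y_i = \beta_0 + \sum_{j=1}^{p} f_j(x_{ij}) + \epsilon_i = \beta_0 + f_1(x_{i1}) + f_2(x_{i2}) + \cdots + f_p(x_{ip}) + \epsilon_i$$

  • It is called an 「additive model」 because we calculate a separate $f_j$ for each $X_j$,
  • and then add together all of their contributions.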
❝ The beauty of GAMs is that we can use different methods as building blocks for fitting an additive model

Example:

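To predict wage, we fit the model:

$$\text{wage} = \beta_0 + f_1(\text{year}) + f_2(\text{age}) + f_3(\text{education}) + \epsilon$$

  • year and age are quantitative, so $f_1$ and $f_2$ are fit using natural splines
  • education is qualitative with 5 levels (<HS, HS, <Coll, Coll, >Coll), so $f_3$ is fit as a step function, with a separate constant for each level via dummy variables.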

a80f027a84a71538ea361862c0cf8acd.png

Since natural splines can be constructed from a pre-chosen set of basis functions,

  • the entire model is just one big regression onto the spline basis variables and dummy variables, fit by least squares
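A minimal sketch of that "one big regression", on synthetic data rather than the Wage data set. Everything named here is an assumption made for illustration: the variables `age` and `edu`, the three-level factor, the knot locations, and the helpers `natural_spline_basis`, `dummy`, and `least_squares`. The basis uses the truncated-power construction of a natural cubic spline, which is one standard choice among several, and the code sticks to the standard library only so that it stays self-contained.

```python
import math
import random

def natural_spline_basis(x, knots):
    """Natural cubic spline basis without the constant: x plus K-2 curved terms."""
    last = knots[-1]
    def d(k):
        # Truncated cubic, scaled so the spline is linear beyond the boundary knots.
        num = max(x - knots[k], 0.0) ** 3 - max(x - last, 0.0) ** 3
        return num / (last - knots[k])
    K = len(knots)
    return [x] + [d(k) - d(K - 2) for k in range(K - 2)]

def dummy(level, levels):
    """Indicator columns for every level except the first (the baseline)."""
    return [1.0 if level == lv else 0.0 for lv in levels[1:]]

def least_squares(X, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    p = len(X[0])
    A = [[sum(row[i] * row[j] for row in X) for j in range(p)]
         + [sum(row[i] * t for row, t in zip(X, y))] for i in range(p)]
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))  # partial pivoting
        A[c], A[piv] = A[piv], A[c]
        for r in range(p):
            if r != c:
                factor = A[r][c] / A[c][c]
                A[r] = [a - factor * b for a, b in zip(A[r], A[c])]
    return [A[i][p] / A[i][i] for i in range(p)]

# Synthetic data (hypothetical): a smooth effect of age plus a step effect of education.
rng = random.Random(0)
levels = ["HS", "Coll", ">Coll"]
effect = {"HS": 0.0, "Coll": 1.0, ">Coll": 2.0}
n = 300
age = [i / (n - 1) for i in range(n)]            # rescaled to [0, 1]
edu = [levels[i % 3] for i in range(n)]
y = [1.0 + math.sin(3 * a) + effect[e] + rng.gauss(0, 0.1) for a, e in zip(age, edu)]

# Design matrix: intercept, spline basis columns for age, dummy columns for education.
knots = [0.1, 0.5, 0.9]
X = [[1.0] + natural_spline_basis(a, knots) + dummy(e, levels) for a, e in zip(age, edu)]
beta = least_squares(X, y)

fitted = [sum(b * v for b, v in zip(beta, row)) for row in X]
ybar = sum(y) / n
r2 = 1 - sum((t - f) ** 2 for t, f in zip(y, fitted)) / sum((t - ybar) ** 2 for t in y)
print(beta, r2)
```

The last two entries of `beta` are the step heights of the `Coll` and `>Coll` levels relative to the `HS` baseline, which is how the dummy-variable step function shows up in the fit.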

1741a5a482ba1866d3b3fc2c484dfb54.png

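Using smoothing splines as the building blocks gives fitted functions that look rather similar

  • but the model is fit via backfitting instead of least squares: each function is updated in turn on a partial residual while the others are held fixed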
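The backfitting idea can be sketched in a few lines. This is an illustration under stated assumptions, not the procedure used to produce the figures: the per-variable smoother is a Gaussian kernel (Nadaraya-Watson) smoother standing in for a smoothing spline, the data are synthetic, and `kernel_smooth`, `backfit`, the bandwidth, and the iteration count are all hypothetical choices.

```python
import math
import random

def kernel_smooth(x, r, bandwidth=0.1):
    """Nadaraya-Watson smoother: a stand-in for the smoothing spline in the text."""
    out = []
    for x0 in x:
        w = [math.exp(-0.5 * ((xi - x0) / bandwidth) ** 2) for xi in x]
        out.append(sum(wi * ri for wi, ri in zip(w, r)) / sum(w))
    return out

def backfit(columns, y, n_iter=10):
    """Fit y ~ beta0 + sum_j f_j(x_j) by cycling over the predictors."""
    n = len(y)
    beta0 = sum(y) / n
    f = [[0.0] * n for _ in columns]
    for _ in range(n_iter):
        for j, xj in enumerate(columns):
            # Partial residual: remove the intercept and every other fitted function.
            partial = [y[i] - beta0 - sum(f[k][i] for k in range(len(columns)) if k != j)
                       for i in range(n)]
            fj = kernel_smooth(xj, partial)
            mean_fj = sum(fj) / n
            f[j] = [v - mean_fj for v in fj]   # center so the intercept stays identifiable
    return beta0, f

# Synthetic data (hypothetical): two smooth, additive effects plus noise.
rng = random.Random(1)
n = 150
x1 = [rng.random() for _ in range(n)]
x2 = [rng.random() for _ in range(n)]
y = [2.0 + math.sin(2 * math.pi * a) + 4.0 * (b - 0.5) ** 2 + rng.gauss(0, 0.1)
     for a, b in zip(x1, x2)]

beta0, f = backfit([x1, x2], y)
fitted = [beta0 + f[0][i] + f[1][i] for i in range(n)]
rss = sum((t - v) ** 2 for t, v in zip(y, fitted))
tss = sum((t - beta0) ** 2 for t in y)
print(rss / tss)
```

Swapping `kernel_smooth` for any other one-dimensional smoother leaves `backfit` unchanged, which is the "building blocks" point made earlier in these notes.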

2. GAMs for Classification

Assume $Y$ takes on values in {0, 1}.

  • Let $p(X) = \Pr(Y = 1 \mid X)$

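Recall logistic regression:

$$\log\left(\frac{p(X)}{1 - p(X)}\right) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_p X_p$$

  • This logit is the log of the odds of $\Pr(Y = 1 \mid X)$ versus $\Pr(Y = 0 \mid X)$, which is represented as a linear function of the predictors.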
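A small worked example of the logit, probability, and odds relationship described above. The coefficients and predictor values are made up for illustration, and `logit` and `inv_logit` are hypothetical helper names.

```python
import math

def logit(p):
    """Log-odds of an event with probability p (0 < p < 1)."""
    return math.log(p / (1.0 - p))

def inv_logit(z):
    """Map a log-odds value back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients for a two-predictor logistic regression.
beta0, beta1, beta2 = -1.0, 0.8, -0.5
x1, x2 = 2.0, 1.0

log_odds = beta0 + beta1 * x1 + beta2 * x2   # linear in the predictors
p = inv_logit(log_odds)                      # Pr(Y = 1 | X)
odds = p / (1.0 - p)                         # Pr(Y = 1 | X) / Pr(Y = 0 | X)
print(log_odds, p, odds)
```

Applying `logit` to `p` returns `log_odds` again, so the model is linear on the log-odds scale even though `p` itself is a non-linear function of the predictors.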

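The logistic regression GAM extends this to allow non-linear relationships:

$$\log\left(\frac{p(X)}{1 - p(X)}\right) = \beta_0 + f_1(X_1) + f_2(X_2) + \cdots + f_p(X_p)$$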

Example:

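To predict the probability that an individual's income exceeds $250,000 per year, we fit the GAM:

$$\log\left(\frac{p(X)}{1 - p(X)}\right) = \beta_0 + \beta_1 \times \text{year} + f_2(\text{age}) + f_3(\text{education})$$

where $p(X) = \Pr(\text{wage} > 250 \mid \text{year}, \text{age}, \text{education})$ and
  • $f_2$ is fit via a smoothing spline with 5 degrees of freedom
  • $f_3$ is fit as a step function via dummy variables for the levels of education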

f0fa817b43fffffbfbc718cce915b42d.png
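  • Fact: no individual with less than a high school education (<HS) has wage > 250
    • so the confidence interval for the <HS level is very wide
    • we therefore refit the GAM excluding the <HS observations: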

bacbf8169732476cd7e44e4b831cc5a9.png

3. Pros and Cons of GAMs

Advantages:

  1. GAMs allow us to fit a non-linear $f_j$ to each $X_j$ automatically
    • so we can model non-linear relationships that standard linear regression will miss
  2. The non-linear fits can potentially make more accurate predictions for the response $Y$
  3. Because the model is additive, we can still examine the effect of each $X_j$ on $Y$ individually while holding all of the other variables fixed
  4. The smoothness of the function $f_j$ for the variable $X_j$ can be summarized via degrees of freedom

Limitations:

  1. The model is restricted to be additive
  2. With many variables, important interactions can be missed
    • interaction terms such as $X_j \times X_k$ have to be added manually
  3. Looking ahead: for fully general models we need even more flexible methods
    • e.g. random forests and boosting
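A tiny worked example of why the additivity restriction matters. It assumes a made-up balanced 2x2 design in which the response is a pure interaction, $y = x_1 x_2$; in such a balanced layout the least-squares additive fit is the grand mean plus the per-level mean deviations of each predictor, which is what the code computes.

```python
# Balanced 2x2 design with a pure interaction: y = x1 * x2 (hypothetical data).
points = [(-1, -1), (-1, 1), (1, -1), (1, 1)]
y = [a * b for a, b in points]

def mean(values):
    return sum(values) / len(values)

beta0 = mean(y)
# Additive components: mean response at each level of a predictor, minus the grand mean.
f1 = {v: mean([t for (a, _), t in zip(points, y) if a == v]) - beta0 for v in (-1, 1)}
f2 = {v: mean([t for (_, b), t in zip(points, y) if b == v]) - beta0 for v in (-1, 1)}
additive_fit = [beta0 + f1[a] + f2[b] for a, b in points]

# Manually adding the product x1 * x2 as a predictor (coefficient 1 here) fits exactly.
interaction_fit = [fit + 1.0 * a * b for fit, (a, b) in zip(additive_fit, points)]
print(additive_fit, interaction_fit)
```

The additive fit is zero everywhere, so it explains none of the variation, while the manually added product term recovers the response exactly.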

4. Reference

James, G., Witten, D., Hastie, T., Tibshirani, R.: An Introduction to Statistical Learning, with Applications in R (Springer, 2013)
