ISLR Chapter 5: Resampling Methods, Applied Exercises


ISLR; R; machine learning; linear regression

I only know some of the technical terms in English, so my wording for them may be nonstandard; please go easy on me.


5. Analysis of the Default data set

> library(ISLR)
> summary(Default)
 default    student       balance           income     
 No :9667   No :7056   Min.   :   0.0   Min.   :  772  
 Yes: 333   Yes:2944   1st Qu.: 481.7   1st Qu.:21340  
                       Median : 823.6   Median :34553  
                       Mean   : 835.4   Mean   :33517  
                       3rd Qu.:1166.3   3rd Qu.:43808  
                       Max.   :2654.3   Max.   :73554  
> attach(Default)

a)

> set.seed(1)
> glm.fit=glm(default~income+balance,data=Default,family=binomial)

b)

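> FiveB=function(){
+ # i. Randomly split the observations into a training half and a validation half
+ train=sample(dim(Default)[1],dim(Default)[1]/2)
+ # ii. Fit the logistic regression using only the training observations
+ glm.fit = glm(default ~ income + balance, data=Default, family = binomial,subset=train)
+ # iii. Predict "Yes" for each validation observation whose posterior probability exceeds 0.5
+ glm.pred = rep("No",dim(Default)[1]/2)
+ glm.probs=predict(glm.fit,Default[-train, ],type="response")
+ glm.pred[glm.probs > 0.5]="Yes"
+ # iv. Validation set error: fraction of validation observations misclassified
+ return(mean(glm.pred != Default[-train, ]$default))
+ }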
> FiveB()
[1] 0.0236

The validation set error rate is 2.36%.

c)

> FiveB()
[1] 0.028
> FiveB()
[1] 0.0268
> FiveB()
[1] 0.0252

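The error rate fluctuates around 2.6% (between 2.36% and 2.80% over the four runs so far), because each call draws a different random training/validation split.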
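To get a better feel for how much the validation set estimate depends on the split, the same procedure can be repeated more times and summarized. The following is only a sketch and is not part of the original solution: `validation_error` is a hypothetical helper name, and 20 repetitions is an arbitrary choice. It subsets the data with `data[train, ]` instead of the `subset` argument so that the helper also works when the formula is passed in from outside the function.

```r
library(ISLR)  # provides the Default data set

# Hypothetical helper: validation set error for one random half/half split.
validation_error <- function(formula, data) {
  n <- nrow(data)
  train <- sample(n, n %/% 2)
  # Fit on the training half only
  fit <- glm(formula, data = data[train, ], family = binomial)
  # Posterior probabilities of default on the held-out half
  probs <- predict(fit, newdata = data[-train, ], type = "response")
  pred <- ifelse(probs > 0.5, "Yes", "No")
  # Fraction of held-out observations that are misclassified
  mean(pred != data$default[-train])
}

set.seed(1)
# Assumption: 20 repetitions, chosen arbitrarily for illustration
errs <- replicate(20, validation_error(default ~ income + balance, Default))
summary(errs)  # centre and range of the estimates
sd(errs)       # spread caused purely by the random split
```

The spread reported by `sd(errs)` gives a rough idea of how far a single validation set estimate can move just because of the split.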

d)

> train=sample(dim(Default)[1],dim(Default)[1]/2)
> glm.fit = glm(default ~ income + balance + student, data=Default, family = binomial, subset = train)
> glm.pred = rep("No",dim(Default)[1]/2)
> glm.probs = predict(glm.fit, Default[-train,],type="response")
> glm.pred[glm.probs > 0.5] = "Yes"
> mean(glm.pred != Default[-train,]$default)
[1] 0.0246

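The error rate is 2.46%. Adding the student dummy variable does not appear to reduce the validation set error rate: the value lies within the range seen in (c) for the model without student, and it was computed on a different random split.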
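Because the single estimate above comes from its own random split, a fairer comparison evaluates both models on the same splits. The sketch below is an illustration under stated assumptions rather than part of the original answer: `split_error` is a hypothetical helper name and the 20 repetitions are arbitrary.

```r
library(ISLR)  # provides the Default data set

# Hypothetical helper: held-out error of a logistic model for a given training index.
split_error <- function(formula, train) {
  fit <- glm(formula, data = Default[train, ], family = binomial)
  probs <- predict(fit, newdata = Default[-train, ], type = "response")
  pred <- ifelse(probs > 0.5, "Yes", "No")
  mean(pred != Default$default[-train])
}

set.seed(1)
n <- nrow(Default)
# Assumption: 20 paired splits, chosen arbitrarily for illustration
diffs <- replicate(20, {
  train <- sample(n, n %/% 2)
  # Error with student minus error without student, on the same split
  split_error(default ~ income + balance + student, train) -
    split_error(default ~ income + balance, train)
})
summary(diffs)
```

Differences centred near zero would support the conclusion that student adds little; consistently negative differences would suggest it helps.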


6. The Default data set

> library(ISLR)
> summary(Default)
 default    student       balance           income     
 No :9667   No :7056   Min.   :   0.0   Min.   :  772  
 Yes: 333   Yes:2944   1st Qu.: 481.7   1st Qu.:21340  
                       Median : 823.6   Median :34553  
                       Mean   : 835.4   Mean   :33517  
                       3rd Qu.:1166.3   3rd Qu.:43808  
                       Max.   :2654.3   Max.   :73554  
> attach(Default)

a)

> set.seed(1)
> glm.fit = glm(default ~ income + balance, data = Default, family = binomial)
> summary(glm.fit)

Call:
glm(formula = default ~ income + balance, family = binomial, 
    data = Default)

Deviance Residuals: