Building a regression tree with the rpart package and pruning it with the prune function

The body fat data (bodyfat) are in the TH.data package.


library(TH.data)
library(rpart)
data("bodyfat", package = "TH.data")
help("bodyfat",package="TH.data")
## starting httpd help server ... done
# head(bodyfat)
Use the rpart package to "grow" a regression tree. The response variable and the covariates are specified with a model formula, in the same way as for lm(). We start by growing a large initial tree.


bodyfat_rpart<-rpart(DEXfat~age+waistcirc+hipcirc+elbowbreadth+kneebreadth,data=bodyfat,
                     # use the control argument to require at least 10 observations in a node before a binary split is attempted:
                     control=rpart.control(minsplit=10))
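
To get a feel for what the minsplit setting does, one can compare the tree above with one grown under rpart's default of minsplit = 20. The following is only a minimal sketch: the helper n_leaves and the object names are introduced here for illustration and are not part of rpart. Note that minbucket defaults to round(minsplit/3), so it changes along with minsplit.

```r
library(rpart)
data("bodyfat", package = "TH.data")

form <- DEXfat ~ age + waistcirc + hipcirc + elbowbreadth + kneebreadth

# Hypothetical helper: count the terminal nodes (leaves) of an rpart fit.
n_leaves <- function(fit) sum(fit$frame$var == "<leaf>")

# Same setting as in the text: nodes with at least 10 observations may be split.
fit_minsplit10 <- rpart(form, data = bodyfat,
                        control = rpart.control(minsplit = 10))

# rpart's default control: minsplit = 20.
fit_minsplit20 <- rpart(form, data = bodyfat)

c(minsplit10 = n_leaves(fit_minsplit10),
  minsplit20 = n_leaves(fit_minsplit20))
```

The tree structure itself does not depend on the random cross-validation folds, so these leaf counts should be the same on every run.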
Plot the tree graphically with partykit.


Observations that satisfy the condition shown at a node go to the left branch; those that do not go to the right.


library(partykit)
## Loading required package: grid
plot(as.party(bodyfat_rpart),
     tp_args=list(id=FALSE))




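The cptable element of the rpart object tells us whether the tree should be "pruned".


Look at the xerror column, the cross-validated error. In the run shown below, the smallest xerror (0.2799) belongs to the tree with 6 splits. Because xerror is estimated by cross-validation, the values and the best row can differ from run to run: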


print(bodyfat_rpart$cptable)
##        CP nsplit rel error xerror    xstd
## 1 0.66290      0    1.0000 1.0360 0.17147
## 2 0.09376      1    0.3371 0.4870 0.09825
## 3 0.07704      2    0.2433 0.4651 0.08414
## 4 0.04508      3    0.1663 0.4090 0.06790
## 5 0.01845      4    0.1212 0.3622 0.06585
## 6 0.01819      5    0.1028 0.3049 0.06312
## 7 0.01000      6    0.0846 0.2799 0.06086
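
Since the xerror column comes from cross-validation with randomly assigned folds, fixing the random seed before calling rpart makes the table repeatable. The sketch below illustrates this under assumptions: the seed value is arbitrary and the object names are made up for the example. printcp and plotcp are rpart functions that show the same table as text and as a plot.

```r
library(rpart)
data("bodyfat", package = "TH.data")

form <- DEXfat ~ age + waistcirc + hipcirc + elbowbreadth + kneebreadth
ctrl <- rpart.control(minsplit = 10)

set.seed(290875)  # arbitrary seed, chosen only for illustration
fit_a <- rpart(form, data = bodyfat, control = ctrl)

set.seed(290875)  # same seed again, so the same cross-validation folds
fit_b <- rpart(form, data = bodyfat, control = ctrl)

# With identical seeds the two xerror columns should agree exactly.
identical(fit_a$cptable[, "xerror"], fit_b$cptable[, "xerror"])

printcp(fit_a)  # formatted view of the cptable
plotcp(fit_a)   # xerror against cp, with xstd error bars
```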
We store the row index of the minimum xerror in opt:


opt<-which.min(bodyfat_rpart$cptable[,"xerror"])
Here we prune back the large initial tree, using the CP value from that row:


cp<-bodyfat_rpart$cptable[opt,"CP"]
bodyfat_prune<-prune(bodyfat_rpart,cp=cp)
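
The two steps above (find the row with the smallest xerror, then prune at its CP) can be wrapped in a small function. This is a sketch, not part of rpart: prune_min_xerror and n_leaves are hypothetical helper names, and the seed is arbitrary.

```r
library(rpart)
data("bodyfat", package = "TH.data")

# Hypothetical helper: prune an rpart fit at the cp with the smallest xerror.
prune_min_xerror <- function(fit) {
  tab  <- fit$cptable
  best <- which.min(tab[, "xerror"])
  prune(fit, cp = tab[best, "CP"])
}

# Hypothetical helper: count the terminal nodes (leaves) of an rpart fit.
n_leaves <- function(fit) sum(fit$frame$var == "<leaf>")

set.seed(1)  # arbitrary seed so the cross-validated xerror is repeatable
big_tree <- rpart(DEXfat ~ age + waistcirc + hipcirc + elbowbreadth + kneebreadth,
                  data = bodyfat,
                  control = rpart.control(minsplit = 10))
small_tree <- prune_min_xerror(big_tree)

# The pruned tree never has more leaves than the initial tree.
c(initial = n_leaves(big_tree), pruned = n_leaves(small_tree))
```

If the smallest xerror sits in the last row of the table, the two counts should be equal, because pruning at that CP removes nothing.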
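Then we plot the resulting pruned tree. (When the minimum xerror falls in the last row of the table, as in the output above, pruning at that CP should leave the tree unchanged.)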


plot(as.party(bodyfat_prune),
     tp_args=list(id=FALSE))




Based on this model, one can predict the (unknown) body fat content from covariate values. Here we do just that, using the data we already have:


DEXfat_pred<-predict(bodyfat_prune,
                     newdata=bodyfat)
xlim<-range(bodyfat$DEXfat)
plot(DEXfat_pred~bodyfat$DEXfat,
     data=bodyfat,xlab="Observed",
     ylab="Predicted",
     ylim=xlim,
     xlim=xlim)
abline(a=0,b=1)
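
The predictions in this plot line up in horizontal bands because a regression tree predicts a single value per leaf (the mean response of the training observations in that leaf). The sketch below checks this and adds a simple numeric summary, the root mean squared error. It is an illustration under assumptions: the object names and the seed are made up here, and because the error is computed on the training data it is likely to be optimistic.

```r
library(rpart)
data("bodyfat", package = "TH.data")

form <- DEXfat ~ age + waistcirc + hipcirc + elbowbreadth + kneebreadth

set.seed(1)  # arbitrary seed so the cross-validated choice of cp is repeatable
fit <- rpart(form, data = bodyfat, control = rpart.control(minsplit = 10))
opt <- which.min(fit$cptable[, "xerror"])
fit_pruned <- prune(fit, cp = fit$cptable[opt, "CP"])

pred <- predict(fit_pruned, newdata = bodyfat)

# Root mean squared error on the training data, in the units of DEXfat.
rmse <- sqrt(mean((bodyfat$DEXfat - pred)^2))

# Number of distinct predicted values versus number of leaves.
n_leaves <- sum(fit_pruned$frame$var == "<leaf>")
c(rmse = rmse,
  distinct_predictions = length(unique(pred)),
  leaves = n_leaves)
```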




Another approach to recursive partitioning


A different approach is implemented in the 'party' package.


At each node of these trees, we test for independence between each of the covariates and the response, and a split is made only when the p-value is small.


Advantage: we do not have to prune back a large initial tree, because growth is controlled by a statistically motivated stopping criterion.


The result is called a "Conditional Inference Tree".


We fit one to the body fat data:


library(party)
## Loading required package: zoo
## 
## Attaching package: 'zoo'
## 
## The following objects are masked from 'package:base':
## 
##     as.Date, as.Date.numeric
## 
## Loading required package: sandwich
## Loading required package: strucchange
## Loading required package: modeltools
## Loading required package: stats4
## 
## Attaching package: 'party'
## 
## The following objects are masked from 'package:partykit':
## 
##     ctree, ctree_control, edge_simple, mob, mob_control,
##     node_barplot, node_bivplot, node_boxplot, node_inner,
##     node_surv, node_terminal
bodyfat_ctree<-ctree(DEXfat~age+waistcirc+hipcirc+elbowbreadth+kneebreadth,
                     data=bodyfat)

plot(bodyfat_ctree)
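
How small the p-value has to be is set through ctree_control in party: its mincriterion argument is the value that 1 minus the p-value must exceed before a split is made, and it defaults to 0.95. The sketch below compares the default with a stricter threshold. The value 0.99, the object names, and the helper n_terminal are assumptions made for this illustration.

```r
library(party)
data("bodyfat", package = "TH.data")

form <- DEXfat ~ age + waistcirc + hipcirc + elbowbreadth + kneebreadth

# Default control: split only if 1 - p-value exceeds mincriterion = 0.95.
fit_95 <- party::ctree(form, data = bodyfat)

# Stricter threshold (assumed value for illustration): requires p-value < 0.01.
fit_99 <- party::ctree(form, data = bodyfat,
                       controls = party::ctree_control(mincriterion = 0.99))

# Hypothetical helper: count terminal nodes via the node ids of the
# training observations.
n_terminal <- function(fit) length(unique(predict(fit, type = "node")))

c(mincriterion_0.95 = n_terminal(fit_95),
  mincriterion_0.99 = n_terminal(fit_99))
```

A stricter threshold should give a tree that is no larger than the default one, since every split has to clear a higher bar.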

