CVP (Critical Value Pruning): an illustration with the principle explained in detail

Note:
CVP (Critical Value Pruning) is also called Chi-Square Pruning (the chi-square test for pruning) in many materials.

The following is a contingency table[1]:
[Figure: contingency table of observed counts, from [1]]

$$H_0:\frac{X_{ij}}{n}=\frac{N_{i\cdot}N_{\cdot j}}{n^2}$$
$$H_1:\frac{X_{ij}}{n}\neq\frac{N_{i\cdot}N_{\cdot j}}{n^2}$$
$$N_{ij}=X_{ij}$$
when
$$\sum_{i=1}^{r}\sum_{j=1}^{s}\frac{\left(N_{ij}-\frac{N_{i\cdot}N_{\cdot j}}{n}\right)^2}{\frac{N_{i\cdot}N_{\cdot j}}{n}}<\chi^2_{(r-1)(s-1),\,\alpha}=\text{critical value}$$
then $H_0$ is accepted and the decision tree is pruned at that node,
where $\alpha$ can be set to 0.05, etc.,
and $n$ is the total number of samples in the dataset.
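
To make the test concrete, here is a minimal sketch in Python (assuming NumPy and SciPy are available; the 2×2 counts below are made up for illustration) that computes the chi-square statistic of a branch-by-class table and compares it with the critical value $\chi^2_{(r-1)(s-1),\alpha}$:

```python
import numpy as np
from scipy.stats import chi2

def chi_square_statistic(counts):
    """Pearson chi-square statistic for an r x s table of observed counts N_ij."""
    counts = np.asarray(counts, dtype=float)
    n = counts.sum()
    # Expected counts under H0: N_i. * N_.j / n
    expected = counts.sum(axis=1, keepdims=True) * counts.sum(axis=0, keepdims=True) / n
    return ((counts - expected) ** 2 / expected).sum()

# Made-up 2 x 2 table: rows = branches, columns = classes.
table = [[30, 10],
         [25, 15]]
alpha = 0.05
r, s = 2, 2
statistic = chi_square_statistic(table)
critical_value = chi2.ppf(1 - alpha, df=(r - 1) * (s - 1))
print(statistic < critical_value)  # True -> accept H0 -> prune this split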
The relationship between the contingency table and the decision tree is shown in the following table:

| Split node with attribute $f$ | class $1$ | class $2$ | $\dots$ | class $s$ |
| --- | --- | --- | --- | --- |
| branch $1$ | $n_{L1}$ | $n_{L2}$ | $\dots$ | $n_{L}$ |
| branch $2$ | $n_{R1}$ | $n_{R2}$ | $\dots$ | $n_{R}$ |
| $\vdots$ | $\vdots$ | $\vdots$ | $\ddots$ | $\vdots$ |
| branch $r$ | $n_{1}$ | $n_{2}$ | $\dots$ | $n$ |
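
The table above can be built directly from the samples that reach each branch of the split node. The following sketch assumes a hypothetical representation where each branch is simply the list of class labels of the samples falling into it:

```python
from collections import Counter
import numpy as np

def branch_class_table(branch_labels, classes):
    """One row per branch of the split node, one column per class;
    entry (i, j) counts the samples that reach branch i and belong to class j."""
    table = np.zeros((len(branch_labels), len(classes)))
    for i, labels in enumerate(branch_labels):
        counts = Counter(labels)
        for j, c in enumerate(classes):
            table[i, j] = counts.get(c, 0)
    return table

# Hypothetical split with two branches over classes {0, 1}.
left_labels = [0] * 30 + [1] * 10    # samples sent to branch 1
right_labels = [0] * 25 + [1] * 15   # samples sent to branch 2
print(branch_class_table([left_labels, right_labels], classes=[0, 1]))
# [[30. 10.]
#  [25. 15.]]
```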

Now let’s use the above table to read the following lecture PPT [2].

In the above picture, some of the parameters are explained in the table given before it.

Let’s go on…
[Lecture slide from [2]]
In the above slide, note that CVP can be used in both the pre-pruning and the post-pruning stage.
In the growth (pre-pruning) stage, the check “statistic less than the critical value” happens before the current sub-tree is pruned.
So we can infer from the above lecture slide that post-pruning applies the same criterion.

How to understand the above pruning criterion?

------------------------------------------
In the above table,
different branches
= different values of the attribute tested at the current decision node of the decision tree
(also called the split node; one split node tests one attribute of the dataset).

when $H_0$ is accepted, then
$$\frac{X_{ij}}{N_{i\cdot}}\approx\frac{N_{\cdot j}}{n}$$
which means:

the probability of “item belongs to class $j$” within each $i$-th ($i\in[1,r]$) branch
= the probability of “item belongs to class $j$” in the whole dataset
=> merging (pruning) these branches into one leaf will not change the probability of “item belongs to class $j$” very much, which means that accuracy will not change much after pruning.
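
A tiny numeric check (with made-up counts for which $H_0$ holds exactly) illustrates why the merge is harmless:

```python
import numpy as np

# Made-up counts where both branches contain class 1 with probability 0.75,
# i.e. H0 (independence of branch and class) holds exactly.
table = np.array([[30.0, 10.0],    # branch 1
                  [45.0, 15.0]])   # branch 2

per_branch = table / table.sum(axis=1, keepdims=True)  # X_ij / N_i.
merged = table.sum(axis=0) / table.sum()               # N_.j / n, i.e. the merged leaf
print(per_branch)  # each row is [0.75, 0.25]
print(merged)      # [0.75, 0.25] -> the class distribution is unchanged by the merge
```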

In conclusion, when the chi-square statistic does not reach the critical value, the branches (the different values of the split attribute of the decision tree)
do not contribute much to increasing accuracy, so these branches can be pruned.

The above conclusion can be used directly when we implement our CVP (Critical Value Pruning) algorithm in Python.
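
As a starting point, a minimal sketch of such an implementation is given below; the `Node` class with `is_leaf`, `children`, and `class_counts` attributes is a hypothetical representation (not taken from the lecture PPT), and SciPy supplies the critical value:

```python
import numpy as np
from scipy.stats import chi2

def cvp_should_prune(branch_class_counts, alpha=0.05):
    """True when the chi-square statistic of the branch-by-class table does not
    reach the critical value, i.e. H0 is accepted and the split can be collapsed."""
    counts = np.asarray(branch_class_counts, dtype=float)
    n = counts.sum()
    # Expected counts under H0 (assumed non-zero for this sketch).
    expected = counts.sum(axis=1, keepdims=True) * counts.sum(axis=0, keepdims=True) / n
    statistic = ((counts - expected) ** 2 / expected).sum()
    r, s = counts.shape
    return statistic < chi2.ppf(1 - alpha, df=(r - 1) * (s - 1))

def cvp_post_prune(node, alpha=0.05):
    """Bottom-up post-pruning: prune the children first, then test this split."""
    if node.is_leaf:
        return
    for child in node.children:
        cvp_post_prune(child, alpha)
    # One row per branch; each child is assumed to store its class counts
    # over the same ordered list of classes.
    table = [child.class_counts for child in node.children]
    if all(child.is_leaf for child in node.children) and cvp_should_prune(table, alpha):
        node.children = []   # collapse the split into a single leaf
        node.is_leaf = True
```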

We can also see from the above analysis that CVP aims to simplify the decision tree while not losing too much accuracy.

References:
[1]http://www.maths.manchester.ac.uk/~saralees/pslect8.pdf
[2]https://www.docin.com/p1-2336928230.html
