Note:
CVP (Critical Value Pruning) is also called Chi-Square Pruning (or chi-square-test pruning) in many materials.
The test below is the standard chi-square test of independence on a contingency table[1]:
$$H_0:\frac{X_{ij}}{n}=\frac{N_iN_j}{n^2}$$
$$H_1:\frac{X_{ij}}{n}\neq\frac{N_iN_j}{n^2}$$
$$N_{ij}=X_{ij}$$
when
$$\sum_{i=1}^{r}\sum_{j=1}^{s}\frac{\left(N_{ij}-\frac{N_iN_j}{n}\right)^2}{\frac{N_iN_j}{n}}<\chi^2_{[(r-1)(s-1)],\alpha}=\text{critical value}$$
then $H_0$ is accepted and the decision tree is pruned.
Here $\alpha$ is the significance level and can be set to 0.05, for example, and $n$ is the total number of items in your dataset.
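As a concrete illustration of the criterion, here is a minimal Python sketch (the contingency table, $\alpha=0.05$, and the hard-coded critical values are made-up assumptions for the example) that computes the chi-square statistic for an $r\times s$ table and compares it with the critical value:

```python
# Made-up contingency table: rows = branches, columns = classes,
# entries = item counts N_ij.
table = [
    [20, 10],
    [15, 15],
]

r = len(table)          # number of branches (rows)
s = len(table[0])       # number of classes (columns)
n = sum(sum(row) for row in table)          # total number of items
row_sums = [sum(row) for row in table]      # N_i.
col_sums = [sum(table[i][j] for i in range(r)) for j in range(s)]  # N_.j

# Chi-square statistic: sum over cells of (observed - expected)^2 / expected,
# where expected = N_i. * N_.j / n.
stat = 0.0
for i in range(r):
    for j in range(s):
        expected = row_sums[i] * col_sums[j] / n
        stat += (table[i][j] - expected) ** 2 / expected

df = (r - 1) * (s - 1)  # degrees of freedom

# Critical values chi^2_{df, alpha=0.05}, hard-coded from a standard table
# to avoid a SciPy dependency (scipy.stats.chi2.ppf(0.95, df) gives the same).
critical_values = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}
prune = stat < critical_values[df]
print(stat, df, prune)  # statistic ~1.714 < 3.841, so this split is pruned
```

Here the statistic (about 1.714) stays below the critical value 3.841 for one degree of freedom, so $H_0$ is accepted and this split would be pruned.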
The relationship between the contingency table and the decision tree is shown in the following table:

| Split node with attribute $f$ | class $1$ | class $2$ | $\dots$ | class $s$ |
|---|---|---|---|---|
| branch $1$ | $n_{L1}$ | $n_{L2}$ | $\dots$ | $n_{Ls}$ |
| branch $2$ | $n_{R1}$ | $n_{R2}$ | $\dots$ | $n_{Rs}$ |
| $\vdots$ | $\vdots$ | $\vdots$ | $\ddots$ | $\vdots$ |
| branch $r$ | $n_{r1}$ | $n_{r2}$ | $\dots$ | $n_{rs}$ |
Now let's use the above table to follow the lecture slides[2].
The parameters that appear in the slides are explained by the table above.
Note that CVP can be used in both the pre-pruning and the post-pruning stage.
In the growing (pre-pruning) stage, "less than the critical value" is checked before the current sub-tree is pruned; from the lecture slides we can infer that post-pruning applies the same criterion.
How to understand the above pruning criterion?
------------------------------------------
In the above table, the different branches correspond to the different value levels of the current decision node of the decision tree (also called the split node; each split node tests one attribute of the dataset).
When $H_0$ is accepted, then
$$\frac{X_{ij}}{N_{i\cdot}}\approx\frac{N_{\cdot j}}{n}$$
which means:
the probability of "item belongs to class $j$" in each branch $i$ ($i\in[1,r]$)
= the probability of "item belongs to class $j$" in the whole dataset
=> merging (pruning) these branches into one leaf will not change the probability of "item belongs to class $j$" very much, which means the accuracy will not drop much after pruning.
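To make this concrete, here is a tiny numerical check in Python (all counts are made up for illustration): every branch in this example has the same 60/40 class split as the whole dataset, which is exactly what $H_0$ asserts.

```python
# Tiny numerical check of the H_0 interpretation (all counts are made up).
# X[i][j] = number of items of class j in branch i.
X = [[18, 12], [27, 18], [9, 6]]
r, s = len(X), len(X[0])
n = sum(map(sum, X))                                        # total items
row = [sum(b) for b in X]                                   # N_i. per branch
col = [sum(X[i][j] for i in range(r)) for j in range(s)]    # N_.j per class

for i in range(r):
    for j in range(s):
        branch_prob = X[i][j] / row[i]   # P(class j | branch i)
        overall_prob = col[j] / n        # P(class j) in the whole dataset
        print(f"branch {i}, class {j}: {branch_prob:.2f} vs {overall_prob:.2f}")
```

Every printed pair is equal (0.60 vs 0.60, 0.40 vs 0.40), so collapsing the three branches into one leaf leaves the class probabilities unchanged.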
In conclusion, when the chi-square statistic does not reach the critical value, the branches (the different value levels of the split attribute of the decision tree) do not contribute much to increasing the accuracy, so these branches can be pruned.
This conclusion can be used directly when we implement our CVP (Critical Value Pruning) algorithm in Python.
We can also learn from the above analysis that CVP aims to simplify your decision tree while not losing too much accuracy.
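Following that conclusion, a CVP decision for a single split node can be sketched as follows (the function name, the example counts, and the hard-coded critical value $\chi^2_{1,\,0.05}=3.841$ are my own assumptions; a real implementation would call this inside the tree-growing or post-pruning loop):

```python
# Sketch of a CVP decision for one split node.
# branch_counts[i][j] = number of training items in branch i with class j.
# Returns True when the chi-square statistic stays below the critical
# value, i.e. the split adds little and its branches can be merged.
def should_prune(branch_counts, critical_value):
    r = len(branch_counts)
    s = len(branch_counts[0])
    n = sum(map(sum, branch_counts))
    row = [sum(b) for b in branch_counts]                                 # N_i.
    col = [sum(branch_counts[i][j] for i in range(r)) for j in range(s)]  # N_.j
    stat = sum(
        (branch_counts[i][j] - row[i] * col[j] / n) ** 2 / (row[i] * col[j] / n)
        for i in range(r) for j in range(s)
    )
    return stat < critical_value

# A split whose branches have identical class proportions should be pruned
# (statistic is 0), while a cleanly separating split should be kept.
print(should_prune([[25, 25], [25, 25]], 3.841))  # -> True  (prune)
print(should_prune([[50, 0], [0, 50]], 3.841))    # -> False (keep)
```

Note this sketch assumes every row and column margin is non-zero; production code would guard against empty branches or classes before dividing.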
References:
[1] http://www.maths.manchester.ac.uk/~saralees/pslect8.pdf
[2] https://www.docin.com/p1-2336928230.html