Decision Tree Rules & Pruning

References:

T. Mitchell, 1997.

R. Myers, R. Walpole, "Tests of Hypotheses", in R. Myers, R. Walpole, Probability and Statistics for Engineers and Scientists, Second Edition, Macmillan Publishing Co., Inc., New York, NY, 1978, pp. 268 - 273.

P. Winston, 1992.

Rule Generation

Once a decision tree has been constructed, it is a simple matter to convert it into an equivalent set of rules.

Converting a decision tree to rules before pruning has three main advantages:

1. Converting to rules allows distinguishing among the different contexts in which a decision node is used.

   Since each distinct path through the decision tree node produces a distinct rule, the pruning decision regarding that attribute test can be made differently for each path.

   In contrast, if the tree itself were pruned, the only two choices would be:

   - Remove the decision node completely, or
   - Retain it in its original form.

2. Converting to rules removes the distinction between attribute tests that occur near the root of the tree and those that occur near the leaves.

   We thus avoid messy bookkeeping issues such as how to reorganize the tree if the root node is pruned while retaining part of the subtree below this test.

3. Converting to rules improves readability.

   Rules are often easier for people to understand.

To generate rules, trace each path in the decision tree, from root node to leaf node, recording the test outcomes as antecedents and the leaf-node classification as the consequent.
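The path-tracing step can be sketched in a few lines of Python. This is only an illustration under assumptions that are not in the original notes: the tree is represented as nested tuples and dictionaries, the function name `tree_to_rules` is hypothetical, and the weather-style example tree is invented for the demonstration.

```python
# A leaf is a class label (a string); an internal node is a pair
# (attribute, {outcome: subtree}). This representation is an assumption.

def tree_to_rules(tree, path=()):
    """Return one (antecedents, consequent) pair per root-to-leaf path."""
    if isinstance(tree, str):
        # Leaf reached: the tests recorded so far are the antecedents.
        return [(list(path), tree)]
    attribute, branches = tree
    rules = []
    for outcome, subtree in branches.items():
        # Record this test outcome and keep descending.
        rules.extend(tree_to_rules(subtree, path + ((attribute, outcome),)))
    return rules


# Hypothetical example tree, not taken from the text.
toy_tree = ("Outlook", {
    "Sunny": ("Humidity", {"High": "No", "Normal": "Yes"}),
    "Overcast": "Yes",
    "Rain": ("Wind", {"Strong": "No", "Weak": "Yes"}),
})

rules = tree_to_rules(toy_tree)
for antecedents, consequent in rules:
    conditions = " AND ".join(f"{a} = {v}" for a, v in antecedents)
    print(f"IF {conditions} THEN {consequent}")
```

Each leaf yields exactly one rule, so the rule set has as many rules as the tree has leaves.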

Rule Simplification Overview

Once a rule set has been devised:

Eliminate unnecessary rule antecedents to simplify the rules.

Construct contingency tables for each rule consisting of more than one antecedent.

Rules with only one antecedent cannot be further simplified, so we only consider those with two or more.

To simplify a rule, eliminate antecedents that have no effect on the conclusion reached by the rule.

A conclusion's independence from an antecedent is verified using a test for independence, which is:

a chi-square test if the expected cell frequencies are greater than 10;

a chi-square test with Yates' Correction for Continuity when the expected frequencies are between 5 and 10;

Fisher's Exact Test for expected frequencies less than 5.

Eliminate unnecessary rules to simplify the rule set.

Once individual rules have been simplified by eliminating redundant antecedents, simplify the entire set by eliminating unnecessary rules.

Attempt to replace those rules that share the most common consequent by a default rule that is triggered when no other rule is triggered.

In the event of a tie, use some heuristic tie breaker to choose a default rule.
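The default-rule step might look like the following sketch. The function name `add_default_rule`, the rule representation (a list of antecedent/consequent pairs), and the alphabetical tie breaker are all assumptions made for illustration; the text only asks for "some heuristic tie breaker".

```python
from collections import Counter


def add_default_rule(rules):
    """Replace the rules sharing the most common consequent by a default.

    rules: non-empty list of (antecedents, consequent) pairs.
    Returns (remaining_rules, default_consequent).
    """
    counts = Counter(consequent for _, consequent in rules)
    top = max(counts.values())
    # Tie breaker (an assumption): take the alphabetically first consequent.
    default = min(c for c, n in counts.items() if n == top)
    # Rules that conclude the default are dropped; the default fires
    # whenever none of the remaining rules is triggered.
    remaining = [(a, c) for a, c in rules if c != default]
    return remaining, default


# Hypothetical rule set, not taken from the text.
example_rules = [
    ([("Outlook", "Sunny"), ("Humidity", "High")], "No"),
    ([("Outlook", "Sunny"), ("Humidity", "Normal")], "Yes"),
    ([("Outlook", "Overcast")], "Yes"),
    ([("Outlook", "Rain"), ("Wind", "Strong")], "No"),
    ([("Outlook", "Rain"), ("Wind", "Weak")], "Yes"),
]

remaining, default = add_default_rule(example_rules)
print(f"{len(remaining)} explicit rules kept, default conclusion: {default}")
```

In this example three of the five rules conclude "Yes", so they collapse into the default and only the two "No" rules stay explicit.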

Contingency Tables

The following is a contingency table, a tabular representation of a rule.

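|               | C1              | C2              | Marginal Sums             |
|---------------|-----------------|-----------------|---------------------------|
| R1            | x11             | x12             | R1T = x11 + x12           |
| R2            | x21             | x22             | R2T = x21 + x22           |
| Marginal Sums | CT1 = x11 + x21 | CT2 = x12 + x22 | T = x11 + x12 + x21 + x22 |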

R1 and R2 represent the Boolean states of an antecedent for the conclusions C1 and C2 (C2 is the negation of C1).

x11, x12, x21 and x22 represent the frequencies of each antecedent-consequent pair.

R1T, R2T, CT1, CT2 are the marginal sums of the rows and columns, respectively.

The marginal sums and T, the total frequency of the table, are used to calculate expected cell values in step 3 of the test for independence.
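As a small illustration, the marginal sums and the total frequency can be computed directly from the cell frequencies. The function name `marginal_sums` and the observed counts below are hypothetical and chosen only for the example.

```python
def marginal_sums(table):
    """Return (row_totals, column_totals, total) for a contingency table.

    table: list of rows, each row a list of observed cell frequencies.
    """
    row_totals = [sum(row) for row in table]        # R1T, R2T, ...
    col_totals = [sum(col) for col in zip(*table)]  # CT1, CT2, ...
    total = sum(row_totals)                         # T
    return row_totals, col_totals, total


# Hypothetical 2 x 2 table: rows R1, R2; columns C1, C2.
observed = [[8, 2],
            [3, 7]]

row_totals, col_totals, total = marginal_sums(observed)
print("row totals:", row_totals)
print("column totals:", col_totals)
print("T:", total)
```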

Test for Independence

Given a contingency table of dimensions r by c (rows x columns):

Calculate and fix the sizes of the marginal sums.

Calculate the total frequency, T, using the marginal sums.

Calculate the expected frequencies for each cell.

The general formula for obtaining the expected frequency $e_{ij}$ of any cell $x_{ij}$, $1 \le i \le r$, $1 \le j \le c$, in a contingency table is given by:

$$e_{ij} = \frac{R_{iT} \times C_{Tj}}{T}$$

where $R_{iT}$ and $C_{Tj}$ are the row total for the ith row and the column total for the jth column.

Select the test to be used to calculate $\chi^2$ based on the highest expected frequency, m:

| if                | then use                        |
|-------------------|---------------------------------|
| $m > 10$          | Chi-Square Test                 |
| $5 \le m \le 10$  | Yates' Correction for Continuity |
| $m < 5$           | Fisher's Exact Test             |

Calculate $\chi^2$ using the chosen test.

Calculate the degrees of freedom.

df = (r - 1)(c - 1)

Use a chi-square table with $\chi^2$ and df to determine if the conclusions are independent of the antecedent at the selected level of significance, $\alpha$.

Assume $\alpha = 0.05$ unless otherwise stated.

If $\chi^2 > \chi^2_{\alpha}$: Reject the null hypothesis of independence and accept the alternate hypothesis of dependence.

We keep the antecedents because the conclusions are dependent upon them.

If $\chi^2 \le \chi^2_{\alpha}$: Accept the null hypothesis of independence.

We discard the antecedents because the conclusions are independent of them.
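The bookkeeping parts of this procedure (expected frequencies, test selection, degrees of freedom, and the final decision) can be sketched as below. All function names and the observed counts are hypothetical. The test selection follows the text in using the highest expected frequency m; many statistics texts apply such thresholds to the smallest expected frequency instead. The critical values are the standard chi-square table entries for a significance level of 0.05.

```python
def expected_frequencies(observed):
    """Expected frequency of each cell: row total * column total / T."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]


def choose_test(expected):
    """Pick the test from the highest expected frequency m, as in the text."""
    m = max(max(row) for row in expected)
    if m > 10:
        return "chi-square"
    if m >= 5:
        return "yates"
    return "fisher"


def degrees_of_freedom(observed):
    """df = (r - 1)(c - 1)."""
    return (len(observed) - 1) * (len(observed[0]) - 1)


# Chi-square critical values at alpha = 0.05, indexed by df.
CRITICAL_005 = {1: 3.841, 2: 5.991, 3: 7.815}


def keep_antecedent(chi2, df):
    """True if independence is rejected, i.e. the antecedent is kept."""
    return chi2 > CRITICAL_005[df]


observed = [[8, 2],
            [3, 7]]  # hypothetical counts
expected = expected_frequencies(observed)
print(expected, choose_test(expected), degrees_of_freedom(observed))
```

The lookup table only covers df from 1 to 3, which is enough for the 2 x 2 tables used for single antecedents; larger tables would need more entries.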

Chi-Square Formulae

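Chi-Square Test

$$\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(x_{ij} - e_{ij})^2}{e_{ij}}$$

Yates' Correction for Continuity

$$\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(|x_{ij} - e_{ij}| - 0.5)^2}{e_{ij}}$$

In both formulae, $x_{ij}$ is the observed frequency and $e_{ij}$ the expected frequency of the cell in row i and column j.

Fisher's Exact Test

See Winston, pp. 437-442 for an explanation of Fisher's exact test.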
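The chi-square statistic, with and without Yates' correction, can be computed as in the following sketch. The function name `chi_square_statistic` and the observed counts are hypothetical; the formulas are the standard Pearson statistic and its continuity-corrected form. Fisher's exact test is not covered here.

```python
def chi_square_statistic(observed, yates=False):
    """Sum of (|observed - expected| - correction)^2 / expected over cells.

    With yates=False the correction is 0 (plain chi-square test);
    with yates=True it is 0.5 (Yates' Correction for Continuity).
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    correction = 0.5 if yates else 0.0
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, x in enumerate(row):
            e = row_totals[i] * col_totals[j] / total  # expected frequency
            chi2 += (abs(x - e) - correction) ** 2 / e
    return chi2


observed = [[8, 2],
            [3, 7]]  # hypothetical counts

plain = chi_square_statistic(observed)
corrected = chi_square_statistic(observed, yates=True)
print(f"chi-square: {plain:.4f}, with Yates' correction: {corrected:.4f}")
```

For these counts the uncorrected statistic is about 5.05 and the corrected one about 3.23. With df = 1 and a critical value of 3.841 at the 0.05 level, the two versions lead to opposite decisions, which shows why the choice of test matters for small expected frequencies.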


Decision Lists

A decision list is a set of if-then statements.

It is searched sequentially for an appropriate if-then statement to be used as a rule.
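A sequential search over a decision list might be sketched as follows. The representation (each entry is a dictionary of required attribute values plus a conclusion), the function name `classify`, and the example data are assumptions for illustration only.

```python
def classify(decision_list, example, default=None):
    """Return the conclusion of the first rule whose conditions all hold."""
    for conditions, conclusion in decision_list:
        # A rule fires when every attribute test matches the example.
        if all(example.get(attr) == value for attr, value in conditions.items()):
            return conclusion
    # No rule fired: fall back to the default conclusion.
    return default


# Hypothetical decision list, searched from top to bottom.
decision_list = [
    ({"Outlook": "Overcast"}, "Yes"),
    ({"Outlook": "Sunny", "Humidity": "High"}, "No"),
    ({"Outlook": "Rain", "Wind": "Strong"}, "No"),
]

example = {"Outlook": "Sunny", "Humidity": "High", "Wind": "Weak"}
print(classify(decision_list, example, default="Yes"))
```

Because the list is searched in order, earlier entries take priority whenever more than one rule matches the same example.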

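Articles carrying this mark are original work by this blog's author, Caoer (草儿). If you index, bookmark, or repost them, please credit the source and the original author. Thank you.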
