Design Decision Tree Classifier
-Picking the root node
-Recursively branching
Picking the root node
-The goal is to have the resulting decision tree be as small as possible
-The main decision in the algorithm is the selection of the next attribute to condition on (starting from the root node).
- We want attributes that split the examples into sets that are relatively pure in one label; this brings us closer to a leaf node. When a node contains samples of only a single class, it becomes a leaf and is not split further.
-The most popular heuristic is based on information gain, which originated with Quinlan's ID3 system: at each split, choose the attribute that yields the largest information gain.
-Entropy(S) = -Σ_i p_i log2(p_i), where p_i is the proportion of examples in S belonging to class i
-Entropy measures the impurity of S
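As a sketch, entropy can be computed directly from the class-label counts. This is illustrative code, not from the original notes; the function name `entropy` and the example labels are our own:

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Entropy(S) = -sum_i p_i * log2(p_i), over the class proportions p_i."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

# A 50/50 split is maximally impure (1 bit); a pure set has entropy 0.
assert abs(entropy(["yes", "no"]) - 1.0) < 1e-9
assert entropy(["yes", "yes", "yes"]) == 0.0
```

A pure node (entropy 0) is exactly the leaf-node case where splitting stops.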
Information Gain
-Gain(S, A) = expected reduction in entropy due to sorting on A
-Values(A) is the set of all possible values for attribute A; Sv is the subset of S for which attribute A has value v; |S| and |Sv| denote the number of samples in set S and set Sv, respectively
-Gain(S, A) = Entropy(S) - Σ_{v ∈ Values(A)} (|Sv| / |S|) · Entropy(Sv)
-Gain(S, A) is the expected reduction in entropy caused by knowing the value of attribute A
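The definition above translates directly into code. The following is a minimal sketch; the dictionary-based example format, the function names, and the toy data are assumptions for illustration:

```python
from collections import Counter
from math import log2

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(examples, attribute, target):
    """Gain(S, A) = Entropy(S) - sum over v in Values(A) of |Sv|/|S| * Entropy(Sv)."""
    n = len(examples)
    gain = entropy([ex[target] for ex in examples])
    for v in {ex[attribute] for ex in examples}:      # Values(A)
        subset = [ex[target] for ex in examples if ex[attribute] == v]  # Sv
        gain -= len(subset) / n * entropy(subset)
    return gain

# Hypothetical toy data: attribute "a" predicts the label "y" perfectly,
# so knowing it removes all 1 bit of entropy, i.e. gain = 1.0.
toy = [{"a": "x", "y": "yes"}, {"a": "x", "y": "yes"},
       {"a": "z", "y": "no"},  {"a": "z", "y": "no"}]
assert abs(information_gain(toy, "a", "y") - 1.0) < 1e-9
```

An attribute that is independent of the label would give a gain near 0, so it would never be preferred over an informative one.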
Example: Play Tennis
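As a sketch of how the root is chosen, we can run the gain computation on the classic 14-day Play Tennis dataset from Quinlan's ID3 work (as popularized in Mitchell's textbook). The helper names and the dict-of-rows layout are our own:

```python
from collections import Counter
from math import log2

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(rows, attr):
    """Entropy of the Play label, minus the weighted entropy after splitting on attr."""
    n = len(rows)
    gain = entropy([r["Play"] for r in rows])
    for v in {r[attr] for r in rows}:
        subset = [r["Play"] for r in rows if r[attr] == v]
        gain -= len(subset) / n * entropy(subset)
    return gain

# The classic 14-example Play Tennis data.
COLS = ("Outlook", "Temperature", "Humidity", "Wind", "Play")
DATA = [dict(zip(COLS, row)) for row in [
    ("Sunny", "Hot", "High", "Weak", "No"),
    ("Sunny", "Hot", "High", "Strong", "No"),
    ("Overcast", "Hot", "High", "Weak", "Yes"),
    ("Rain", "Mild", "High", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Strong", "No"),
    ("Overcast", "Cool", "Normal", "Strong", "Yes"),
    ("Sunny", "Mild", "High", "Weak", "No"),
    ("Sunny", "Cool", "Normal", "Weak", "Yes"),
    ("Rain", "Mild", "Normal", "Weak", "Yes"),
    ("Sunny", "Mild", "Normal", "Strong", "Yes"),
    ("Overcast", "Mild", "High", "Strong", "Yes"),
    ("Overcast", "Hot", "Normal", "Weak", "Yes"),
    ("Rain", "Mild", "High", "Strong", "No"),
]]

gains = {a: information_gain(DATA, a) for a in COLS[:-1]}
root = max(gains, key=gains.get)
# Outlook has the largest gain (about 0.247 bits), so it becomes the root node.
```

Note that the Overcast branch is already pure (all Yes), so it immediately becomes a leaf; the Sunny and Rain branches are split recursively with the same procedure.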