1. Quality metric
The quality metric for a decision tree is the classification error:
error = (number of incorrect predictions) / (number of examples)
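As a quick sketch of this metric (the function name classification_error is mine, not from the notes):

    def classification_error(y_true, y_pred):
        # error = (number of incorrect predictions) / (number of examples)
        incorrect = sum(1 for t, p in zip(y_true, y_pred) if t != p)
        return incorrect / len(y_true)

    # 2 mistakes out of 5 examples -> 0.4
    print(classification_error([1, 0, 1, 1, 0], [1, 1, 1, 0, 0]))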
2. Greedy algorithm
Procedure
Step 1: Start with an empty tree
Step 2: Select a feature to split the data on
Explanation (a sketch follows this procedure):
For each feature, split the data on that feature
Calculate the classification error of the resulting decision stump
Choose the feature with the lowest error
For each split of the tree:
Step 3: If all data points in the node have the same y value,
or if we have already used up all the features, stop.
Step 4: Otherwise, go to step 2 and recurse on this split
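A minimal sketch of the feature selection in step 2, assuming categorical features stored as dicts; the helper names (majority_class, stump_error, best_split) are illustrative, not from the notes:

    from collections import Counter

    def majority_class(labels):
        # Most common label among the given data points.
        return Counter(labels).most_common(1)[0][0]

    def stump_error(data, labels, feature):
        # Split on `feature`, predict the majority class in each branch,
        # and return the resulting classification error.
        mistakes = 0
        for value in set(x[feature] for x in data):
            branch = [y for x, y in zip(data, labels) if x[feature] == value]
            mistakes += sum(1 for y in branch if y != majority_class(branch))
        return mistakes / len(labels)

    def best_split(data, labels, features):
        # Greedy choice: the feature whose decision stump has the lowest error.
        return min(features, key=lambda f: stump_error(data, labels, f))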
Algorithm
predict(tree_node, input):
    if tree_node is a leaf:
        return the majority class of the data points in the leaf
    else:
        next_node = child of tree_node whose feature value agrees with input
        return predict(next_node, input)
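The same recursion as runnable Python, using a nested-dict tree representation of my own choosing:

    # A leaf is {'leaf': True, 'prediction': ...}; an internal node is
    # {'leaf': False, 'feature': ..., 'children': {value: subtree, ...}}.
    def predict(tree_node, x):
        if tree_node['leaf']:
            return tree_node['prediction']   # the leaf stores its majority class
        next_node = tree_node['children'][x[tree_node['feature']]]
        return predict(next_node, x)

    tree = {'leaf': False, 'feature': 'credit',
            'children': {'good': {'leaf': True, 'prediction': 'safe'},
                         'bad':  {'leaf': True, 'prediction': 'risky'}}}
    print(predict(tree, {'credit': 'good'}))  # -> safe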
3. Threshold split
Threshold splits handle continuous inputs: we pick a threshold value for the continuous feature and split the data on it.
Procedure:
Step 1: Sort the values of the feature hj(x): {v1, v2, ..., vN}
Step 2: For i = 1, ..., N-1 (all adjacent pairs of sorted values):
consider the split ti = (vi + vi+1)/2
compute the classification error of the split
Step 3: Choose the ti with the lowest classification error
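A sketch of this search for a single continuous feature; the function names are mine, and the error computation follows section 1:

    from collections import Counter

    def majority_class(labels):
        return Counter(labels).most_common(1)[0][0]

    def best_threshold(values, labels):
        # Step 1: sort the (value, label) pairs by the feature value.
        pairs = sorted(zip(values, labels), key=lambda p: p[0])
        n = len(pairs)
        best_t, best_err = None, float('inf')
        # Step 2: candidate thresholds are midpoints of consecutive sorted values.
        for i in range(n - 1):
            t = (pairs[i][0] + pairs[i + 1][0]) / 2
            left = [y for v, y in pairs if v <= t]
            right = [y for v, y in pairs if v > t]
            # Classification error of the stump that predicts the majority on each side.
            err = (sum(1 for y in left if y != majority_class(left)) +
                   sum(1 for y in right if y != majority_class(right))) / n
            if err < best_err:
                best_t, best_err = t, err
        return best_t, best_err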
4. Overfitting
As the depth of the tree increases, overfitting can occur.
Remedies
1. Early Stopping
Stop the learning algorithm before the tree becomes too complex
For example:
- Limit the depth of the tree (it is difficult to choose a good depth value)
- Stop when the classification error stops improving (can be dangerous: in XOR-like data, no single split lowers the error even though deeper splits would)
- Stop if the number of data points in an intermediate node is too small
2. Pruning
Simplify the tree after the learning algorithm terminates
Consider a specific total cost:
Total cost = classification error + λ * (number of leaf nodes)
For example, with λ = 0.01, a tree with error 0.12 and 6 leaves has total cost 0.12 + 0.06 = 0.18.
Start at the bottom of tree T and traverse up, applying prune_split(T, M) to each decision node M
prune_split(T, M):
1. Compute the total cost of tree T using the formula above:
C(T) = Error(T) + λL(T)
2. Let Tsmaller be the tree after pruning the subtree below M
3. Compute the total cost of Tsmaller: C(Tsmaller) = Error(Tsmaller) + λL(Tsmaller)
4. If C(Tsmaller) < C(T), prune to Tsmaller
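A sketch of this bottom-up pass, reusing the dict-based nodes from the predict sketch and assuming each internal node also stores its own majority class under a 'majority' key (my convention, not from the notes):

    def predict(node, x):
        if node['leaf']:
            return node['prediction']
        return predict(node['children'][x[node['feature']]], x)

    def count_leaves(node):
        if node['leaf']:
            return 1
        return sum(count_leaves(c) for c in node['children'].values())

    def total_cost(tree, data, labels, lam):
        # C(T) = Error(T) + lambda * L(T)
        error = sum(1 for x, y in zip(data, labels) if predict(tree, x) != y) / len(labels)
        return error + lam * count_leaves(tree)

    def prune_split(tree, node, data, labels, lam):
        # Bottom-up: prune below the children first, then consider collapsing `node`.
        if node['leaf']:
            return
        for child in node['children'].values():
            prune_split(tree, child, data, labels, lam)
        cost_before = total_cost(tree, data, labels, lam)
        saved = dict(node)  # remember the split so we can undo the prune
        node.clear()
        node.update({'leaf': True, 'prediction': saved['majority']})  # assumes 'majority' is stored
        if total_cost(tree, data, labels, lam) >= cost_before:
            node.clear()    # C(Tsmaller) was not lower: restore the split
            node.update(saved)

Calling prune_split(tree, tree, data, labels, lam) applies the prune test to every decision node on the way back up.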
5. Missing data
1. Purification by skipping data points or skipping features
Cons:
1. Removing data points or features may remove important information from the data
2. Unclear when it is better to remove data points versus features
3. Does not help if data is missing at prediction time
2. Imputation
Fill in the missing values (see the sketch after the cons below):
1. Categorical feature:
Fill in the most common value of xi
2. Numerical feature:
Fill in the average or median value of xi
Cons:
May result in systematic error
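A sketch of both fill-in rules, assuming plain Python lists where None marks a missing entry (function names are mine):

    from collections import Counter
    from statistics import median

    def impute_categorical(column):
        # Fill missing entries with the most common observed value.
        observed = [v for v in column if v is not None]
        fill = Counter(observed).most_common(1)[0][0]
        return [fill if v is None else v for v in column]

    def impute_numerical(column, strategy='median'):
        # Fill missing entries with the median (or the mean) of observed values.
        observed = [v for v in column if v is not None]
        fill = median(observed) if strategy == 'median' else sum(observed) / len(observed)
        return [fill if v is None else v for v in column]

    print(impute_categorical(['a', None, 'a', 'b']))  # -> ['a', 'a', 'a', 'b']
    print(impute_numerical([1.0, None, 3.0]))         # -> [1.0, 2.0, 3.0]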
3. Adding a missing-value choice to every decision node
Use the classification error to decide which branch the unknowns go down (see the sketch below).
Cons:
Requires modifying the learning algorithm
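A sketch of how prediction changes under this approach, extending the dict-based nodes from the earlier sketches with a 'missing' branch (the key name is my own assumption):

    def predict(node, x):
        if node['leaf']:
            return node['prediction']
        value = x.get(node['feature'])            # None when the feature is missing
        branch = 'missing' if value is None else value
        return predict(node['children'][branch], x)

During training, the learning algorithm would route unknowns to whichever branch yields the lowest classification error, which is the modification the con above refers to.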