Practical Tips for Decision Trees

1. When samples are few and features are many, decision trees easily overfit. Dimensionality-reduction preprocessing (PCA, ICA, or feature selection) beforehand is therefore important.
2. Start with max_depth=3 as the initial tree depth and use the export functions to visualize the fitted tree. Once you have a feel for how the tree fits the data, increase the depth.
3. Use max_depth to control the size of the tree and prevent overfitting. (Remember that the number of samples required to populate the tree doubles for each additional level the tree grows to.)
4. Empirical good default values are max_features=n_features for regression problems and max_features=sqrt(n_features) for classification tasks (where n_features is the number of features in the data).
5. Good results are often achieved when setting max_depth=None in combination with min_samples_split=2.
6. The best parameter values should always be cross-validated.
7. For totally random trees embedding (RandomTreesEmbedding), the size of the coding is at most n_estimators * 2 ** max_depth, the maximum number of leaves in the forest.
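Tips 2 and 3 above can be sketched in a few lines: fit a shallow tree first, print it as readable rules, and check how the leaf count is bounded by the depth. This is a minimal sketch using scikit-learn's built-in iris dataset; the depth of 3 follows tip 2 and is otherwise an arbitrary starting point.

```python
# Start shallow (tip 2), then inspect before growing deeper (tip 3).
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

data = load_iris()
X, y = data.data, data.target

# max_depth=3 keeps the tree small enough to read in full.
clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

# export_text prints the fitted tree as indented if/else rules,
# a lightweight alternative to graphical export for a first look.
print(export_text(clf, feature_names=data.feature_names))

# Each extra level can double the number of leaves, which is why the
# samples needed to populate the tree also roughly double per level.
print(clf.get_n_leaves())  # bounded by 2 ** max_depth = 8
```

After checking that the shallow tree's splits make sense, re-fit with a larger max_depth and compare.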
https://scikit-learn.org/stable/modules/tree.html#tree
https://scikit-learn.org/stable/modules/ensemble.html#forest
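Tip 6 says parameter values should always be cross-validated. One common way to do that is a grid search; the sketch below cross-validates max_depth and min_samples_split on iris. The grid values are illustrative choices, not recommendations, though they include the defaults mentioned in tip 5.

```python
# Cross-validate tree parameters (tip 6) instead of trusting defaults.
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

param_grid = {
    "max_depth": [3, 5, None],        # None per tip 5; 3 per tip 2
    "min_samples_split": [2, 5, 10],  # 2 is the default, per tip 5
}
search = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_grid,
    cv=5,  # 5-fold cross-validation for each parameter combination
)
search.fit(X, y)
print(search.best_params_)
print(search.best_score_)
```

search.best_estimator_ then holds a tree refit on the full data with the winning parameters.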
