Bagging ensemble
Based on bootstrap sampling (sampling with replacement).
Each round, m samples are drawn from the training set.
A base learner is trained on each sampled set,
and these base learners are then combined.
Bagging mainly focuses on reducing variance.
from sklearn.ensemble import BaggingClassifier
(base_estimator=None, n_estimators=10,
max_samples=1.0, max_features=1.0, bootstrap=True, bootstrap_features=False,
oob_score=False, warm_start=False, n_jobs=None,
random_state=None, verbose=0)
from sklearn.ensemble import BaggingRegressor
(base_estimator=None, n_estimators=10,
max_samples=1.0, max_features=1.0, bootstrap=True, bootstrap_features=False,
oob_score=False, warm_start=False, n_jobs=None,
random_state=None, verbose=0)
- base_estimator
- the base estimator to fit on random subsets of the dataset; a decision tree by default
- n_estimators
- the number of base estimators
- 10, default
- int
- max_samples
- the number of samples drawn to train each base estimator
- int, absolute number
- float, proportion of the dataset
- max_features
- the number of features drawn to train each base estimator
- int, absolute number
- float, proportion of the features
- bootstrap
- whether samples are drawn with replacement
- bootstrap_features
- whether features are drawn with replacement
- oob_score
- whether to use out-of-bag samples to estimate the generalization error
- warm_start
- n_jobs
- whether to run fit/predict in parallel
- -1, use all processors
- random_state
- same as for decision trees.
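A minimal runnable sketch of the parameters above (the dataset and parameter values are only illustrative):

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# 50 decision trees, each trained on 80% of the samples drawn with
# replacement; out-of-bag samples estimate the generalization accuracy
clf = BaggingClassifier(DecisionTreeClassifier(), n_estimators=50,
                        max_samples=0.8, bootstrap=True,
                        oob_score=True, random_state=0)
clf.fit(X_train, y_train)
print(clf.oob_score_)            # out-of-bag accuracy estimate
print(clf.score(X_test, y_test))  # held-out accuracy
```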
Random forest
Builds a bagging ensemble with decision trees as the base learners.
In RF, at each node of a base decision tree, a random subset of k attributes is selected, and the best attribute within that subset is then chosen for the split.
The performance of each individual learner is often somewhat lower.
However, as the number of individual learners grows, a random forest usually converges to a lower generalization error.
Training efficiency is usually better than plain bagging, since each split only examines a subset of the attributes.
from sklearn.ensemble import RandomForestClassifier
from sklearn.ensemble import RandomForestRegressor
(n_estimators='warn', criterion='gini', max_depth=None,
min_samples_split=2, min_samples_leaf=1,
min_weight_fraction_leaf=0.0,
max_features='auto',
max_leaf_nodes=None,
min_impurity_decrease=0.0,
min_impurity_split=None, bootstrap=True,
oob_score=False, n_jobs=None,
random_state=None, verbose=0,
warm_start=False, class_weight=None)
- n_estimators
- 10, default; will change to 100 in a future release
- int
- criterion
- 'gini', default (classifier) / 'mse', default (regressor)
- 'entropy' / 'mae'
- max_depth
- min_samples_split
- min_samples_leaf
- min_weight_fraction_leaf
- max_features
- max_leaf_nodes
- min_impurity_decrease
- bootstrap
- True,default,bootstrp samples(有放回抽样)
- False, 使用所有的数据来构建每一颗树
- n_jobs
- random_state
- warm_start
- False, default, fit a whole new forest
- True, reuse the solution of the previous call and add more estimators to the ensemble
- class_weight
- The default values for the parameters controlling the size of the trees (e.g. max_depth,
min_samples_leaf, etc.) lead to fully grown and unpruned trees which can potentially be very large on
some data sets. To reduce memory consumption, the complexity and size of the trees should be controlled by
setting those parameter values.
The features are always randomly permuted at each split. Therefore, the best found split may vary, even with
the same training data, max_features=n_features and bootstrap=False, if the improvement of the
criterion is identical for several splits enumerated during the search of the best split. To obtain a deterministic
behaviour during fitting, random_state has to be fixed.
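The notes above can be sketched in a short runnable example (the dataset and parameter values are only illustrative; max_depth is set to control tree size, and random_state is fixed for deterministic fitting):

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# 100 trees; max_depth limits tree size (memory), random_state
# fixes the random feature permutation at each split
rf = RandomForestClassifier(n_estimators=100, max_depth=4,
                            bootstrap=True, random_state=0)
rf.fit(X_train, y_train)
print(rf.score(X_test, y_test))   # held-out accuracy
print(rf.feature_importances_)    # impurity-based feature importances
```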
Voting classifier
class sklearn.ensemble.VotingClassifier(estimators, voting='hard',
weights=None, n_jobs=None, flatten_transform=True)
Parameters
- estimators
- list of (string, estimator) tuples
- voting
- 'hard', default, majority rule
- 'soft', based on the averaged predicted probabilities; recommended for an ensemble of well-calibrated classifiers
- weights
- weights for the classifiers (hard) or for the predicted probabilities (soft)
- None, default
- shape (n_classifiers,)
- n_jobs
- flatten_transform
- affects the output shape of transform when voting='soft'
- True, default
METHODS
- fit(self, X, y[, sample_weight]) Fit the estimators.
- fit_transform(self, X[, y]) Fit to data, then transform it.
- get_params(self[, deep]) Get the parameters of the ensemble estimator
- predict(self, X) Predict class labels for X.
- score(self, X, y[, sample_weight]) Returns the mean accuracy on the given test data and labels.
- set_params(self, **params) Setting the parameters for the ensemble estimator
- transform(self, X) Return class labels or probabilities for X for each estimator.
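A minimal soft-voting sketch over three heterogeneous classifiers (the base models and weights are only illustrative; the weights emphasize the logistic regression over the other two):

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# soft voting averages each classifier's predict_proba output,
# weighted by the weights list, and picks the argmax class
vote = VotingClassifier(
    estimators=[('lr', LogisticRegression(max_iter=1000)),
                ('dt', DecisionTreeClassifier(random_state=0)),
                ('nb', GaussianNB())],
    voting='soft', weights=[2, 1, 1])
vote.fit(X, y)
print(vote.predict(X[:5]))  # class labels for the first five samples
```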
Voting regressor
Balances out the weaknesses of several regressors with similar performance; a clearly weaker model may drag the ensemble down.
from sklearn.ensemble import VotingRegressor
(estimators, weights=None, n_jobs=None)
- among the METHODS, score returns the R^2 coefficient.
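A minimal sketch averaging two regressors with a VotingRegressor (the dataset and base models are only illustrative):

```python
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor, VotingRegressor
from sklearn.linear_model import LinearRegression

X, y = load_diabetes(return_X_y=True)

# the ensemble prediction is the average of the base predictions
reg = VotingRegressor(
    estimators=[('lr', LinearRegression()),
                ('rf', RandomForestRegressor(n_estimators=50,
                                             random_state=0))])
reg.fit(X, y)
print(reg.score(X, y))  # score returns the R^2 coefficient
```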