Ensemble Learning: Advanced Code

Latest recommended article published on 2024-05-06 20:43:23

雪天枫


Category: python deep learning algorithms. Tags: Ensemble Learning: Advanced Code

Article link: https://blog.csdn.net/zhaodeming000/article/details/90612308


Random Forest (RF)

scikit-learn ships a ready-made random forest model:

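from sklearn.ensemble import RandomForestClassifier
# ....... (elided in the original; x_train and y_train are prepared here)
RF = RandomForestClassifier(n_estimators=50)
# n_estimators is the number of trees in the forest; the default was 10 in
# older scikit-learn releases and is 100 since version 0.22
RF.fit(x_train, y_train)
plot(RF)  # plot() is the author's own helper; its definition is not shown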

Init signature: RandomForestClassifier(n_estimators=10, criterion='gini', max_depth=None, min_samples_split=2, min_samples_leaf=1, min_weight_fraction_leaf=0.0, max_features='auto', max_leaf_nodes=None, min_impurity_decrease=0.0, min_impurity_split=None, bootstrap=True, oob_score=False, n_jobs=1, random_state=None, verbose=0, warm_start=False, class_weight=None)
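Different settings of these parameters can significantly change the final accuracy of the RF. (The signature above comes from an older scikit-learn release; several defaults have changed since, for example n_estimators is now 100.)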
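
To make the effect of the parameters concrete, here is a minimal sketch that compares a few settings with 3-fold cross-validation. The synthetic data from make_classification and the three parameter combinations are assumptions chosen for illustration; they are not taken from the article, whose x_train and y_train are not shown.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic stand-in data (assumption: the article's own data is not shown)
X, y = make_classification(n_samples=300, n_features=10,
                           n_informative=5, random_state=0)

# Hypothetical settings to compare; not from the article
settings = [
    {"n_estimators": 5, "max_depth": 2},
    {"n_estimators": 50, "max_depth": 2},
    {"n_estimators": 50, "max_depth": None},
]

results = {}
for params in settings:
    rf = RandomForestClassifier(random_state=0, **params)
    # Mean accuracy over 3 cross-validation folds
    scores = cross_val_score(rf, X, y, cv=3, scoring="accuracy")
    results[(params["n_estimators"], params["max_depth"])] = scores.mean()

for (n_trees, depth), acc in results.items():
    print("n_estimators=%s, max_depth=%s: %.3f" % (n_trees, depth, acc))
```

The exact numbers depend on the data, so the comparison is best rerun on your own training set.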

AdaBoost

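import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_gaussian_quantiles

# Simulate data
# Draw a 2-D normal distribution and split it into two classes by quantile: 500 samples, 2 features
x1, y1 = make_gaussian_quantiles(n_samples=500, n_features=2, n_classes=2)
# Same again, but with the mean of both features set to 3
x2, y2 = make_gaussian_quantiles(mean=(3, 3), n_samples=500, n_features=2, n_classes=2)
# Merge the two sets into one data set
x_data = np.concatenate((x1, x2))        # stack the samples
y_data = np.concatenate((y1, -y2 + 1))   # stack the labels, flipping the second set (1 -> 0, 0 -> 1)
plt.scatter(x_data[:, 0], x_data[:, 1], c=y_data)
plt.show()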

[Figure: scatter plot of the combined data set, colored by class label]

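from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

# AdaBoost model: decision trees of depth 3 as the weak learner, 10 boosting rounds
model = AdaBoostClassifier(DecisionTreeClassifier(max_depth=3), n_estimators=10)
# Train the model
model.fit(x_data, y_data)
plot(model)  # plot() is the author's own helper; its definition is not shown
# Model accuracy, measured on the same data used for training
model.score(x_data, y_data)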

[Figure: output of plot(model)]
Model accuracy (on the training data): 0.976
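
Because the score above is computed on the data the model was trained on, it can be optimistic. The following is a minimal sketch of scoring the same kind of model on a held-out split instead. The 70/30 split and the random_state values are assumptions added for reproducibility; they are not part of the original code.

```python
import numpy as np
from sklearn.datasets import make_gaussian_quantiles
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Same simulated data as in the article, with fixed seeds (assumption)
x1, y1 = make_gaussian_quantiles(n_samples=500, n_features=2,
                                 n_classes=2, random_state=0)
x2, y2 = make_gaussian_quantiles(mean=(3, 3), n_samples=500, n_features=2,
                                 n_classes=2, random_state=1)
x_data = np.concatenate((x1, x2))
y_data = np.concatenate((y1, 1 - y2))  # flip the labels of the second set

# Hold out 30% of the samples for testing (the ratio is an assumption)
x_train, x_test, y_train, y_test = train_test_split(
    x_data, y_data, test_size=0.3, random_state=0)

model = AdaBoostClassifier(DecisionTreeClassifier(max_depth=3),
                           n_estimators=10, random_state=0)
model.fit(x_train, y_train)

train_acc = model.score(x_train, y_train)  # accuracy on seen data
test_acc = model.score(x_test, y_test)     # accuracy on unseen data
print("train accuracy: %.3f, test accuracy: %.3f" % (train_acc, test_acc))
```

Comparing the two numbers gives a rough sense of how much the training accuracy overstates performance on new samples.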

Stacking

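Stacking uses several different classifiers to make predictions on the training set and feeds those predictions to a second-level (meta) classifier as its input. The output of the meta-classifier is the prediction of the whole model.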
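
The two levels described above can be sketched by hand, without a stacking class. This is an illustration under stated assumptions, not the article's implementation: the data is synthetic, stacked_predict is a hypothetical helper, and the first-level predictions are taken out-of-fold with cross_val_predict so that no base model is asked to predict rows it was fit on.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (assumption)
X, y = make_classification(n_samples=300, n_features=6, random_state=0)

base_models = [
    KNeighborsClassifier(n_neighbors=1),
    DecisionTreeClassifier(random_state=0),
    LogisticRegression(),
]

# Level 1: one column of predicted labels per base model.
# Each prediction comes from a model fit on the other two folds.
meta_features = np.column_stack(
    [cross_val_predict(m, X, y, cv=3) for m in base_models])

# Level 2: the meta-classifier learns from the base models' predictions
meta_model = LogisticRegression()
meta_model.fit(meta_features, y)

# For new samples, refit each base model on all training data first
for m in base_models:
    m.fit(X, y)

def stacked_predict(X_new):
    """Hypothetical helper: base predictions in, meta prediction out."""
    level1 = np.column_stack([m.predict(X_new) for m in base_models])
    return meta_model.predict(level1)

print(stacked_predict(X[:5]))
```

Library implementations differ in details such as whether the meta-classifier sees labels or class probabilities, so treat this only as a picture of the data flow.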

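from sklearn import model_selection
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
# Assumption: the classifiers= / meta_classifier= keywords match mlxtend's
# StackingClassifier; the original does not show its import
from mlxtend.classifier import StackingClassifier

# Define three different base classifiers
clf1 = KNeighborsClassifier(n_neighbors=1)    # KNN
clf2 = DecisionTreeClassifier()               # decision tree
clf3 = LogisticRegression()                   # logistic regression

# Define the second-level (meta) classifier
lr = LogisticRegression()
# Combine the three base classifiers under the meta-classifier
sclf = StackingClassifier(classifiers=[clf1, clf2, clf3],
                          meta_classifier=lr)

# Evaluate each model in turn
for clf, label in zip([clf1, clf2, clf3, sclf],
                      ['KNN', 'Decision Tree', 'LogisticRegression', 'StackingClassifier']):
    # 3-fold cross-validation: the data is split into 3 parts, and each part serves
    # once as the test set while the other two parts form the training set
    scores = model_selection.cross_val_score(clf, x_data, y_data, cv=3, scoring='accuracy')
    print("Accuracy: %0.2f [%s]" % (scores.mean(), label))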

Accuracy: 0.91 [KNN]
Accuracy: 0.91 [Decision Tree]
Accuracy: 0.91 [LogisticRegression]
Accuracy: 0.94 [StackingClassifier]
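
The accuracies above are means over 3 cross-validation folds. As a small illustration of how such a split works, the sketch below prints the train and test indices for six hypothetical samples. It uses plain KFold for readability; with an integer cv and a classifier, cross_val_score uses a stratified variant that also balances the class labels in each fold.

```python
import numpy as np
from sklearn.model_selection import KFold

X_demo = np.arange(6).reshape(6, 1)  # six hypothetical samples

folds = []
for train_idx, test_idx in KFold(n_splits=3).split(X_demo):
    # Each sample lands in the test set exactly once
    folds.append((train_idx.tolist(), test_idx.tolist()))
    print("train:", train_idx.tolist(), "test:", test_idx.tolist())

# train: [2, 3, 4, 5] test: [0, 1]
# train: [0, 1, 4, 5] test: [2, 3]
# train: [0, 1, 2, 3] test: [4, 5]
```

In every round two thirds of the data train the model and the remaining third tests it.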
