Preface
What is AI?
The theory and development of computer systems able to perform tasks normally requiring human intelligence. (–Oxford Dictionary)
Using data to solve problems. (–cy)
1 Difference between soft voting and hard voting
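As a minimal sketch of the distinction: hard voting takes the majority of the labels predicted by the individual classifiers, while soft voting averages their predicted class probabilities and picks the class with the highest average. The probabilities below are made up for illustration, and the helper names hard_vote and soft_vote are hypothetical, not part of scikit-learn.

```python
import numpy as np

# Hypothetical class probabilities from three classifiers for ONE sample
# and three classes (made-up numbers, for illustration only).
probas = np.array([
    [0.45, 0.55, 0.00],  # classifier A prefers class 1
    [0.40, 0.60, 0.00],  # classifier B prefers class 1
    [0.90, 0.10, 0.00],  # classifier C strongly prefers class 0
])

def hard_vote(probas):
    """Majority vote over each classifier's predicted label."""
    labels = probas.argmax(axis=1)
    return int(np.bincount(labels, minlength=probas.shape[1]).argmax())

def soft_vote(probas):
    """Argmax of the class probabilities averaged over classifiers."""
    return int(probas.mean(axis=0).argmax())

hard_result = hard_vote(probas)
soft_result = soft_vote(probas)
print("hard voting picks class", hard_result)
print("soft voting picks class", soft_result)
```

In this toy case the two schemes disagree: two of the three classifiers vote for class 1, but the third is confident enough in class 0 to win once probabilities are averaged.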
2 Iris classification example
2.1 Hard voting
from sklearn.datasets import load_iris
# Load the dataset
iris = load_iris()
x = iris.data
y = iris.target  # features and labels are now loaded
# Set up the three models to compare
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
model1 = LogisticRegression(C=0.1)
model2 = SVC(C=0.1, probability=True)
model3 = RandomForestClassifier(n_estimators=10, max_depth=2)
# Voting
from sklearn.ensemble import VotingClassifier
from sklearn.model_selection import cross_val_score  # for cross-validation
print("Hard voting:")  # note the parameter voting='hard' below
ensemble_model = VotingClassifier(estimators=[('LR', model1), ('SVC', model2), ('RF', model3)], voting='hard')
for model, label in zip([model1, model2, model3, ensemble_model], ['LR', 'SVC', 'RF', 'Voting']):
    scores = cross_val_score(model, x, y, cv=5, scoring='accuracy')  # 5-fold cross-validation
    print('{} mean accuracy: {}'.format(label, scores.mean()))
Output:
Hard voting:
LR mean accuracy: 0.9466666666666667
SVC mean accuracy: 0.9200000000000002
RF mean accuracy: 0.9399999999999998
Voting mean accuracy: 0.96
(In this run, the voting ensemble scores higher than each of the three individual models.)
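To see what the hard-voting ensemble does internally, the sketch below fits it on a train split and rebuilds its predictions by hand from the fitted base models. The train/test split, the random_state values, and the variable names are assumptions added for illustration and are not part of the original experiment.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

x, y = load_iris(return_X_y=True)
# Assumed hold-out split, used only to have unseen samples to predict on.
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.3, stratify=y, random_state=0)

ensemble = VotingClassifier(
    estimators=[
        ('LR', LogisticRegression(C=0.1)),
        ('SVC', SVC(C=0.1)),
        ('RF', RandomForestClassifier(n_estimators=10, max_depth=2, random_state=0)),
    ],
    voting='hard',
).fit(x_train, y_train)

# One row per fitted base model, one column per test sample.
base_preds = np.array([est.predict(x_test) for est in ensemble.estimators_])
# Majority label per sample; ties go to the smallest class label.
manual_vote = np.array([np.bincount(col, minlength=3).argmax() for col in base_preds.T])
ensemble_preds = ensemble.predict(x_test)

print("manual vote matches ensemble:", np.array_equal(manual_vote, ensemble_preds))
print("hold-out accuracy:", (ensemble_preds == y_test).mean())
```

The manual majority vote is expected to match the ensemble because the iris labels are already the integers 0, 1, 2, so the label encoding that VotingClassifier applies internally leaves them unchanged.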
2.2 Soft voting
from sklearn.datasets import load_iris
# Load the dataset
iris = load_iris()
x = iris.data
y = iris.target  # features and labels are now loaded
# Set up the three models to compare
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
model1 = LogisticRegression(C=0.1)
model2 = SVC(C=0.1, probability=True)  # soft voting needs predict_proba, hence probability=True
model3 = RandomForestClassifier(n_estimators=10, max_depth=2)
# Voting
from sklearn.ensemble import VotingClassifier
from sklearn.model_selection import cross_val_score  # for cross-validation
print("Soft voting:")  # note the parameter voting='soft' below
ensemble_model = VotingClassifier(estimators=[('LR', model1), ('SVC', model2), ('RF', model3)], voting='soft')
for model, label in zip([model1, model2, model3, ensemble_model], ['LR', 'SVC', 'RF', 'Voting']):
    scores = cross_val_score(model, x, y, cv=5, scoring='accuracy')  # 5-fold cross-validation
    print('{} mean accuracy: {}'.format(label, scores.mean()))
Output:
Soft voting:
LR mean accuracy: 0.9466666666666667
SVC mean accuracy: 0.9200000000000002
RF mean accuracy: 0.9533333333333334
Voting mean accuracy: 0.96
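(The results differ from run to run, so voting does not always improve the score. The RF score, for example, differs between the hard-voting run and this one, most likely because the random forest is created without a fixed random_state.)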
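One way to make the hard/soft comparison repeatable is to fix the random seeds of the randomized models. The sketch below assumes that setting random_state on RandomForestClassifier and SVC is enough for this setup; the helper mean_accuracy and the seed value 0 are hypothetical choices, and the scores it prints are not meant to reproduce the numbers above.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

x, y = load_iris(return_X_y=True)

def mean_accuracy(voting, seed=0):
    """5-fold mean accuracy of the voting ensemble with fixed seeds."""
    estimators = [
        ('LR', LogisticRegression(C=0.1)),
        # probability=True uses internal randomized calibration, so seed it too.
        ('SVC', SVC(C=0.1, probability=True, random_state=seed)),
        ('RF', RandomForestClassifier(n_estimators=10, max_depth=2, random_state=seed)),
    ]
    model = VotingClassifier(estimators=estimators, voting=voting)
    return cross_val_score(model, x, y, cv=5, scoring='accuracy').mean()

hard_first, hard_again = mean_accuracy('hard'), mean_accuracy('hard')
soft_first, soft_again = mean_accuracy('soft'), mean_accuracy('soft')
print('hard voting:', hard_first, '(repeatable:', hard_first == hard_again, ')')
print('soft voting:', soft_first, '(repeatable:', soft_first == soft_again, ')')
```

With the seeds fixed, repeated calls should return the same score, which makes it easier to tell a real difference between the two voting schemes from run-to-run noise.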
Summary
(If you find any mistakes in what I wrote, please point them out in the comments.)