蔚蓝祥和的天空
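VotingClassifier usage
I. Hard Voting vs. Soft Voting
1) Usage: voting = 'hard' selects a Hard Voting Classifier as the final decision rule; voting = 'soft' selects a Soft Voting Classifier.
2) Idea: a Hard Voting Classifier decides the final result by majority rule; a Soft Voting Classifier averages, across all models, the predicted probability that a sample belongs to each class, and the class with the highest average probability is the final prediction…
Original · 2020-10-09 20:08:31 · 5487 views · 0 comments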
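KFold, StratifiedKFold, and cross_val_score usage
I. KFold and StratifiedKFold
KFold (K-fold cross-validation sampling): splits the training/test dataset into n_splits mutually exclusive subsets. Each round uses exactly one subset as the test set and the remaining (n_splits-1) subsets as the training set, so n_splits experiments are run and n_splits results are obtained.
Note: when the dataset cannot be divided evenly, the first n_samples % n_splits subsets contain n_samples // n_splits + 1 samples each, and the remaining subsets contain n_samples // n_splits samples each. (Example: splitting 10 rows into 3 parts, only one part gets 4 rows and the others get 3 rows each…
Original · 2020-10-09 19:49:25 · 1738 views · 3 comments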
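The fold-size rule in the note can be checked with a few lines of plain Python. This is a minimal sketch of the arithmetic only, assuming contiguous, unshuffled folds; `fold_sizes` and `kfold_indices` are hypothetical helper names, not scikit-learn functions.

```python
def fold_sizes(n_samples, n_splits):
    """The first n_samples % n_splits folds get one extra sample."""
    base, extra = divmod(n_samples, n_splits)
    return [base + 1 if i < extra else base for i in range(n_splits)]


def kfold_indices(n_samples, n_splits):
    """Yield (train, test) index lists; each fold is the test set exactly once."""
    indices = list(range(n_samples))
    start = 0
    for size in fold_sizes(n_samples, n_splits):
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


sizes = fold_sizes(10, 3)             # the 10-rows-into-3-parts example
splits = list(kfold_indices(10, 3))
for train, test in splits:
    print(len(train), test)
```

For 10 rows and 3 folds, 10 % 3 is 1 and 10 // 3 is 3, so exactly one fold holds 4 rows and the other two hold 3 rows each, which matches the example in the note.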
Titanic prediction
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes impo…
Reposted · 2020-10-04 17:34:02 · 328 views · 0 comments
KNN handwritten digit recognition (notes)
import numpy as np
import os
from os import listdir
import operator
def classify0(inX, dataSet, labels, k):
    dataSetSize = dataSet.shape[0]
    diffMat = np.tile(inX, (dataSetSize, 1)) - dataSet
    sqDiffMat = diffMat**2
    sqDistances = sqDiffMat.sum(axis=…
Original · 2020-08-13 22:50:13 · 224 views · 0 comments
Machine learning--KNN dating
from numpy import *
import matplotlib.pyplot as plt
import os
from sklearn import preprocessing
le = preprocessing.LabelEncoder()
def file2matrix(filename):
    fr = open(filename)
    arrayOLines = fr.readlines()
    numberOfLines = len(arrayOLines)…
Original · 2020-08-10 23:42:21 · 271 views · 0 comments
Machine learning--KNN pseudocode
from numpy import *
import operator
import matplotlib.pyplot as plt
import os
from sklearn.neighbors import KNeighborsClassifier
# KNN pseudocode
def createDataSet():
    group = array([[1.0,1.1],[1.0,1.0],[0,0],[0,0.1]])
    labels = ["A","A","B","B"]
    return…
Original · 2020-08-09 23:22:13 · 1257 views · 0 comments
KNN iris--detailed notes
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from sklearn.datasets import load_iris
iris_dataset = load_iris()
from sklearn.model_selection import train_test_split
X_train,X_test,y_train,y_test = train_test_split(iris_dataset["d…
Original · 2020-08-16 11:37:48 · 320 views · 0 comments
Machine learning--Logistic regression with gradient ascent
Logistic regression with gradient ascent
import numpy as np
import matplotlib.pyplot as plt
def loadDataset():
    dataMat = []
    labelMat = []
    with open("testSet.txt") as fr:
        for line in fr.readlines():
            lineArr = line.strip().split()
            dataMat.…
Original · 2020-08-15 23:22:28 · 100 views · 0 comments