Case study: diabetes prediction with k-nearest neighbors

hantuo001

Category: Machine Learning Notes

Article link: https://blog.csdn.net/ZhanZhan1231/article/details/103645264

1. Loading the data

import pandas as pd
%matplotlib inline
data=pd.read_csv('diabetes.csv')
data.head(5)

	Pregnancies	Glucose	BloodPressure	SkinThickness	Insulin	BMI	DiabetesPedigreeFunction	Age	Outcome
0	6	148	72	35	0	33.6	0.627	50	1
1	1	85	66	29	0	26.6	0.351	31	0
2	8	183	64	0	0	23.3	0.672	32	1
3	1	89	66	23	94	28.1	0.167	21	0
4	0	137	40	35	168	43.1	2.288	33	1

data.groupby('Outcome').size()  # check how the samples are distributed across classes

Outcome
0    500
1    268
dtype: int64

X=data.iloc[:,0:8]  # select the 8 feature columns
Y=data.iloc[:,8]  # select the target column (Outcome)
print('shape of X:{},shape of Y:{}'.format(X.shape,Y.shape))

shape of X:(768, 8),shape of Y:(768,)

from sklearn.model_selection import train_test_split
X_train,X_test,Y_train,Y_test=train_test_split(X,Y,test_size=0.2)  # split into training and test sets
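
The class counts above are uneven (500 negatives against 268 positives), and the split above is unseeded, so both the class ratio in the test set and the resulting scores can shift from run to run. One possible variation, shown here as a sketch rather than as what the original post does, is to pass `stratify` and `random_state` to `train_test_split`. The data below is a synthetic stand-in with the same shape and class counts as `diabetes.csv`; the `*_demo` names and the seed value 42 are assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic stand-in for diabetes.csv (assumption): 768 rows, 8 features,
# 500 negative and 268 positive labels, matching the counts shown above.
rng = np.random.default_rng(0)
X_demo = pd.DataFrame(rng.normal(size=(768, 8)))
Y_demo = pd.Series([0] * 500 + [1] * 268, name="Outcome")

# stratify keeps the class ratio roughly equal in both parts;
# random_state (42 is an arbitrary choice) makes the split repeatable.
X_tr, X_te, Y_tr, Y_te = train_test_split(
    X_demo, Y_demo, test_size=0.2, stratify=Y_demo, random_state=42
)

print(X_tr.shape, X_te.shape)
print(Y_te.value_counts(normalize=True))
```

With a stratified split, the share of positive cases in the test set stays close to the overall share of 268/768.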

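2. Comparing models

Fit three models to the dataset and compute a score for each: plain k-nearest neighbors, distance-weighted k-nearest neighbors, and radius-based nearest neighbors. (All three are nearest-neighbor classifiers from sklearn.neighbors, not k-means clustering.)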

from sklearn.neighbors import KNeighborsClassifier,RadiusNeighborsClassifier
from sklearn.model_selection import KFold
# build the 3 models
models=[]
models.append(("KNN",KNeighborsClassifier(n_neighbors=2)))
models.append(("KNN with weights",KNeighborsClassifier(n_neighbors=2,weights='distance')))
models.append(("Radius neighbors",RadiusNeighborsClassifier(n_neighbors=
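
The listing breaks off at this point in the source. As a rough sketch of how a comparison of this kind could be run end to end, the block below scores three nearest-neighbor models with 10-fold cross-validation. Several things here are assumptions and not taken from the original post: the data is synthetic (generated with `make_classification`), the scoring uses `cross_val_score`, the fold count is 10, and the radius classifier is configured with `radius=5.0` and `outlier_label="most_frequent"`. Note that `RadiusNeighborsClassifier` is controlled by a `radius` parameter and does not accept `n_neighbors`.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import KFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier, RadiusNeighborsClassifier

# Synthetic stand-in for the diabetes features and labels (assumption).
X_demo, Y_demo = make_classification(
    n_samples=768, n_features=8, weights=[0.65, 0.35], random_state=0
)

demo_models = [
    ("KNN", KNeighborsClassifier(n_neighbors=2)),
    ("KNN with weights", KNeighborsClassifier(n_neighbors=2, weights="distance")),
    # radius=5.0 is a hypothetical value chosen for this synthetic data;
    # outlier_label handles test points with no neighbor inside the radius.
    ("Radius neighbors",
     RadiusNeighborsClassifier(radius=5.0, outlier_label="most_frequent")),
]

# 10-fold cross-validation; each model is scored by mean accuracy.
kfold = KFold(n_splits=10, shuffle=True, random_state=0)
results = {}
for name, model in demo_models:
    scores = cross_val_score(model, X_demo, Y_demo, cv=kfold)
    results[name] = scores.mean()
    print("name: {}; cross-validation mean accuracy: {:.4f}".format(name, results[name]))
```

On the real dataset the features have very different scales (compare Insulin with DiabetesPedigreeFunction), so a suitable radius would have to be chosen for that data and could differ greatly from the value used here.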
