Classifying Benign and Malignant Tumors with KNN

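This article walks through classifying tumor data with the K-nearest neighbors (KNN) algorithm. After preprocessing the data, splitting it into training and test sets, and standardizing the features, it trains and evaluates a model whose accuracy is about 83% on the training set and 84% on the test set. It also prints a classification report giving precision, recall, and F1 score for each class.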

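Given a set of tumor data, use the KNN method to determine whether each tumor is benign or malignant (benign tumors are labeled "0" and malignant tumors "1"). A sample of the data is shown below: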

radius  texture  perimeter  area  smoothness  compactness  symmetry  fractal_dimension  class
23      12       151        954   0.143       0.278        0.242     0.079              1
9       13       133        1326  0.143       0.079        0.181     0.057              0
21      27       130        1203  0.125       0.16         0.207     0.06               1
14      16       78         386   0.07        0.284        0.26      0.097              1
9       19       135        1297  0.141       0.133        0.181     0.059              1
25      25       83         477   0.128       0.17         0.209     0.076              0
16      26       120        1040  0.095       0.109        0.179     0.057              1
15      18       90         578   0.119       0.165        0.22      0.075              1
19      24       88         520   0.127       0.193        0.235     0.074              1
25      11       84         476   0.119       0.24         0.203     0.082              1
24      21       103        798   0.082       0.067        0.153     0.057              1
17      15       104        781   0.097       0.129        0.184     0.061              1
14      15       132        1123  0.097       0.246        0.24      0.078              0
12      22       104        783   0.084       0.1          0.185     0.053              1
12      13       94         578   0.113       0.229        0.207     0.077              1
22      19       97         659   0.114       0.16         0.23      0.071              1
10      16       95         685   0.099       0.072        0.159     0.059              1
15      14       108        799   0.117       0.202        0.216     0.074              1
20      14       130        1260  0.098       0.103        0.158     0.054              1
17      11       87         566   0.098       0.081        0.189     0.058              0
16      14       86         520   0.108       0.127        0.197     0.068              0
17      24       60         274   0.102       0.065        0.182     0.069              0
20      27       103        704   0.107       0.214        0.252     0.07               1

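
For intuition about what the classifier does with rows like these, here is a minimal sketch of the KNN decision rule: measure the Euclidean distance from a query point to every training point, keep the k closest, and take a majority vote over their labels. The function name `knn_predict` and the toy points are hypothetical and only for illustration; the post itself relies on scikit-learn's `KNeighborsClassifier`.

```python
from collections import Counter

import numpy as np


def knn_predict(X_train, y_train, query, k=5):
    """Predict the label of one query point by majority vote among its k nearest neighbors."""
    X_train = np.asarray(X_train, dtype=float)
    query = np.asarray(query, dtype=float)
    # Euclidean distance from the query to every training point
    distances = np.sqrt(((X_train - query) ** 2).sum(axis=1))
    # Indices of the k smallest distances
    nearest = np.argsort(distances)[:k]
    # Most common label among those neighbors
    votes = Counter(int(y_train[i]) for i in nearest)
    return votes.most_common(1)[0][0]


# Toy 2-D points (made up for illustration, not taken from the dataset)
X_toy = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
                  [1.0, 1.0], [0.9, 1.1], [1.1, 0.9]])
y_toy = np.array([0, 0, 0, 1, 1, 1])

print(knn_predict(X_toy, y_toy, [0.05, 0.1], k=3))  # → 0
print(knn_predict(X_toy, y_toy, [1.0, 0.95], k=3))  # → 1
```

Because the vote depends entirely on distances, features measured on very different scales (here, area in the hundreds versus smoothness around 0.1) should be brought to a common scale first.
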
import pandas as pd
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

# Load the dataset and preview the first rows (the preview displays in a notebook)
data1 = pd.read_csv('./data/cancer1.csv')
data1.head()

# Separate the features from the label column ('class': 0 = benign, 1 = malignant)
X = data1.drop('class', axis=1)
y = data1['class']

# Hold out 25% of the samples for testing; fix the seed so the split is reproducible
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=6)

# Standardize the features: fit the scaler on the training set only, then apply it to both sets
ss = StandardScaler()
X_train = ss.fit_transform(X_train)
X_test = ss.transform(X_test)

# Train a KNN classifier with scikit-learn's default settings (n_neighbors=5)
model = KNeighborsClassifier()
model.fit(X_train, y_train)

# Evaluate on the training set
print("Model evaluation metrics on the training set:")
model_score = model.score(X_train, y_train)
print()
print('The accuracy of train data', model_score)
print('--------------------------------------------------------------------------')
y_train_predict = model.predict(X_train)
model_report1 = classification_report(y_train, y_train_predict)
print(model_report1)
print('$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$')

# Evaluate on the held-out test set
print("Model evaluation metrics on the test set:")
model_score = model.score(X_test, y_test)
print()
print('The accuracy of test data is', model_score)
print('--------------------------------------------------------------------------')
y_predict = model.predict(X_test)
model_report = classification_report(y_test, y_predict)
print(model_report)
print('--------------------------------------------------------------------------')
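
The standardization step deserves a closer look: the scaler is fitted on the training data only, and the same statistics are then reused for the test data. Below is a minimal NumPy sketch of that idea, assuming the default `StandardScaler` behavior (subtract each column's mean, divide by its population standard deviation). The array values are made up for illustration.

```python
import numpy as np

# Hypothetical feature values, only for illustration
train = np.array([[10.0, 200.0], [20.0, 400.0], [30.0, 600.0]])
test = np.array([[20.0, 500.0]])

# Statistics come from the training set only
mean = train.mean(axis=0)
std = train.std(axis=0)  # population std (ddof=0), the StandardScaler default

train_scaled = (train - mean) / std
test_scaled = (test - mean) / std  # reuse the training statistics; do not refit on test data

print(train_scaled.mean(axis=0))  # each column is centered at (approximately) 0
print(train_scaled.std(axis=0))   # each column has unit standard deviation
print(test_scaled)
```

Fitting on the training set alone keeps information about the test samples out of the model, which is why the code calls `fit_transform` on `X_train` but only `transform` on `X_test`.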

Running the KNN script above prints:

Model evaluation metrics on the training set:

The accuracy of train data 0.8266666666666667
--------------------------------------------------------------------------
              precision    recall  f1-score   support

           0       0.80      0.71      0.75        28
           1       0.84      0.89      0.87        47

    accuracy                           0.83        75
   macro avg       0.82      0.80      0.81        75
weighted avg       0.83      0.83      0.82        75

$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
Model evaluation metrics on the test set:

The accuracy of test data is 0.84
--------------------------------------------------------------------------
              precision    recall  f1-score   support

           0       1.00      0.60      0.75        10
           1       0.79      1.00      0.88        15

    accuracy                           0.84        25
   macro avg       0.89      0.80      0.82        25
weighted avg       0.87      0.84      0.83        25
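
The test-set numbers can be checked by hand. The counts used below are inferred from the reported support, precision, and recall; the original script does not print them. Of the 10 benign samples, 6 are predicted benign and 4 are predicted malignant, and all 15 malignant samples are predicted malignant. The helper name `precision_recall_f1` is hypothetical.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1 for one class from raw counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Counts inferred from the test-set report (an assumption, not script output).
# Class 0 (benign): 6 correct, 4 predicted as malignant, no malignant predicted as benign.
p0, r0, f1_0 = precision_recall_f1(tp=6, fp=0, fn=4)
# Class 1 (malignant): 15 correct, 4 benign samples predicted as malignant, none missed.
p1, r1, f1_1 = precision_recall_f1(tp=15, fp=4, fn=0)

# Accuracy is the share of all 25 test samples that were classified correctly
accuracy = (6 + 15) / 25

print(f"class 0: precision={p0:.2f} recall={r0:.2f} f1={f1_0:.2f}")
print(f"class 1: precision={p1:.2f} recall={r1:.2f} f1={f1_1:.2f}")
print(f"accuracy={accuracy:.2f}")
```

These values match the report: 1.00 / 0.60 / 0.75 for class 0, 0.79 / 1.00 / 0.88 for class 1, and an accuracy of 0.84. A recall of 1.00 for class 1 means no malignant tumor in the test set was missed, at the cost of 4 benign tumors being flagged as malignant.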
 
