K-means Clustering Algorithm and an Improvement

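This post applies the K-means clustering algorithm to the iris dataset. It covers data preprocessing, feature selection with VarianceThreshold, and a custom weight matrix (a diagonal matrix of CRITIC-style feature weights) intended to improve the clustering. The weighted variant is compared with plain K-means using homogeneity, completeness, V-measure, the adjusted Rand index, adjusted mutual information, and the silhouette coefficient.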
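As a quick check on the feature-selection step, the sketch below shows what VarianceThreshold does with its default threshold of 0.0: it removes only constant columns. The toy array and all variable names here are illustrative assumptions and do not appear in the original script.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.feature_selection import VarianceThreshold

# Hypothetical toy data: the middle column is constant (zero variance).
toy = np.array([[1.0, 5.0, 0.2],
                [2.0, 5.0, 0.4],
                [3.0, 5.0, 0.9]])

toy_selector = VarianceThreshold()      # default threshold=0.0
toy_reduced = toy_selector.fit_transform(toy)
print(toy_selector.get_support())       # boolean mask of the kept columns
print(toy_reduced.shape)

# On iris every feature varies, so the selector keeps all four columns.
X_iris, _ = load_iris(return_X_y=True)
iris_selector = VarianceThreshold()
X_iris_reduced = iris_selector.fit_transform(X_iris)
print(X_iris.shape, X_iris_reduced.shape)
```

With the default threshold the step is therefore a no-op on iris; a positive `threshold` would be needed for it to drop any feature.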

# -*- coding: utf-8 -*-
"""
Created on Sun Oct  1 11:00:21 2023
@author: HP

Earlier file headers:
Created on Thu Sep  7 15:16:33 2023, @author: ZLM
Created on Mon Jul 10 13:51:39 2023, @author: Administrator
"""



import numpy as np
from sklearn import metrics
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.feature_selection import VarianceThreshold



# Load iris: X is the (150, 4) feature matrix, labels_true holds the species labels.
X, labels_true = load_iris(return_X_y=True)


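# Drop zero-variance features. With the default threshold of 0.0 only constant
# columns are removed, so all four iris features are kept.
sel = VarianceThreshold()
X = sel.fit_transform(X)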


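def critic_weight(data):
    """Return CRITIC-style feature weights (contrast x conflict) that sum to 1."""
    # Contrast: coefficient of variation (std / mean) of each feature.
    data_cv = np.std(data, axis=0) / np.mean(data, axis=0)
    # Conflict: for each feature, the sum of (1 - correlation) over all features.
    data_corr = np.corrcoef(data.T)
    conflict = np.sum(1 - data_corr, axis=0)
    # Information score per feature, normalized into weights.
    C = data_cv * conflict
    return C / np.sum(C)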


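# Diagonal weight matrix: right-multiplying X by W scales each feature by its weight.
W = np.diag(critic_weight(X))

# Use as many clusters as there are true classes.
n_clusters = len(np.unique(labels_true))


# Baseline: K-means on the unweighted features.
labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(X)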

# Evaluation metrics
print("Homogeneity: %0.3f" % metrics.homogeneity_score(labels_true, labels))
print("Completeness: %0.3f" % metrics.completeness_score(labels_true, labels))
print("V-measure: %0.3f" % metrics.v_measure_score(labels_true, labels))
print("Adjusted Rand Index: %0.3f" % metrics.adjusted_rand_score(labels_true, labels))
print(
    "Adjusted Mutual Information: %0.3f"
    % metrics.adjusted_mutual_info_score(labels_true, labels)
)
print(
    "Silhouette Coefficient: %0.3f"
    % metrics.silhouette_score(X, labels, metric="sqeuclidean")
)


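print('A new method')

# Weighted variant: scale each feature by its CRITIC weight, then cluster again.
# The silhouette coefficient below is computed in this rescaled feature space.
X = np.matmul(X, W)
labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(X)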


print("Homogeneity: %0.3f" % metrics.homogeneity_score(labels_true, labels))
print("Completeness: %0.3f" % metrics.completeness_score(labels_true, labels))
print("V-measure: %0.3f" % metrics.v_measure_score(labels_true, labels))
print("Adjusted Rand Index: %0.3f" % metrics.adjusted_rand_score(labels_true, labels))
print(
    "Adjusted Mutual Information: %0.3f"
    % metrics.adjusted_mutual_info_score(labels_true, labels)
)
print(
    "Silhouette Coefficient: %0.3f"
    % metrics.silhouette_score(X, labels, metric="sqeuclidean")
)
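The weighting step in the script above can be checked in isolation. The following is a minimal sketch, not the author's exact code: `critic_weight_sketch`, the `_demo` variable names, and the fixed `random_state=0` are assumptions added for illustration and reproducibility. It confirms that the weights sum to 1, that multiplying by the diagonal matrix is plain column scaling, and it scores both labelings in the original feature space so the two silhouette values share a scale.

```python
import numpy as np
from sklearn import metrics
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris


def critic_weight_sketch(data):
    """CRITIC-style weights as in the script: coefficient of variation x conflict."""
    cv = np.std(data, axis=0) / np.mean(data, axis=0)
    conflict = np.sum(1 - np.corrcoef(data.T), axis=0)
    score = cv * conflict
    return score / np.sum(score)


X_demo, y_demo = load_iris(return_X_y=True)
w = critic_weight_sketch(X_demo)
X_weighted = X_demo @ np.diag(w)

# Multiplying by diag(w) is the same as scaling each column by its weight.
print(np.allclose(X_weighted, X_demo * w))
print(w, w.sum())

# random_state is an added assumption so that reruns give the same labels.
k = len(np.unique(y_demo))
labels_plain = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X_demo)
labels_weighted = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X_weighted)

# Score both labelings in the original feature space so the values are comparable.
for name, lab in [("plain", labels_plain), ("weighted", labels_weighted)]:
    ari = metrics.adjusted_rand_score(y_demo, lab)
    sil = metrics.silhouette_score(X_demo, lab, metric="sqeuclidean")
    print(f"{name}: ARI={ari:.3f}, silhouette={sil:.3f}")
```

The label-based metrics (homogeneity, completeness, V-measure, ARI, AMI) do not depend on feature scale, so they can be compared directly between the two runs. The silhouette coefficient does depend on the space it is measured in, which is why this sketch evaluates it on the unweighted data for both labelings.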