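sklearn clustering methods: hierarchical clustering

Given a handful of points, cluster them with hierarchical clustering.

Generate 7 points and compute their Euclidean distance matrix: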

import numpy as np

# Point coordinates
A = [10, 20]
B = [13, 26]
C = [60, 65]
D = [81, 73]
E = [89, -20]
F = [73, -22]
G = [102, -25]

# Compute the Euclidean distance matrix
a = np.array([A, B, C, D, E, F, G])  # stack the points into a 7x2 array
n = a.shape[0]
# Flat list of the 49 pairwise distances
result = [np.sqrt((a[i, 0] - a[j, 0])**2 + (a[i, 1] - a[j, 1])**2)
          for i in range(n) for j in range(n)]
result = np.array(result).reshape(n, n)  # reshape into a 7x7 matrix
result = result.round(2)

The Euclidean distance matrix is:

array([[  0.  ,   6.71,  67.27,  88.6 ,  88.55,  75.72, 102.42],
       [  6.71,   0.  ,  61.07,  82.66,  88.84,  76.84, 102.58],
       [ 67.27,  61.07,   0.  ,  22.47,  89.81,  87.97,  99.32],
       [ 88.6 ,  82.66,  22.47,   0.  ,  93.34,  95.34, 100.22],
       [ 88.55,  88.84,  89.81,  93.34,   0.  ,  16.12,  13.93],
       [ 75.72,  76.84,  87.97,  95.34,  16.12,   0.  ,  29.15],
       [102.42, 102.58,  99.32, 100.22,  13.93,  29.15,   0.  ]])
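As a cross-check, the same matrix can be computed without the explicit double loop. The sketch below assumes SciPy is installed and uses `scipy.spatial.distance.cdist`. It takes G = [102, -25], the coordinate that reproduces the G entries of the matrix printed above (for example 13.93 to E). The names `points` and `dist` are illustrative choices, not part of the original code.

```python
import numpy as np
from scipy.spatial.distance import cdist

# Coordinates of the seven points A..G (G's y-coordinate is assumed to be -25,
# the value consistent with the matrix shown above)
points = np.array([[10, 20], [13, 26], [60, 65], [81, 73],
                   [89, -20], [73, -22], [102, -25]])

# cdist returns the full 7x7 matrix of pairwise Euclidean distances
dist = cdist(points, points, metric="euclidean").round(2)
print(dist)
```

For larger data sets, `scipy.spatial.distance.pdist` stores only the upper triangle, which is also the input format that SciPy's hierarchical clustering routines accept.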

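Step 1: Set the threshold to 10 and treat points that are less than 10 apart as a single point.
Only points 0 and 1 qualify (distance 6.71), so they are merged into one cluster: (A,B).
Step 2: Using single linkage, compute the distance from every other point to (A,B) and update the Euclidean distance matrix: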

         (A,B)       C       D       E       F       G
(A,B)        0   61.07   82.66   88.55   75.72  102.42
C        61.07       0   22.47   89.81   87.97   99.32
D        82.66   22.47       0   93.34   95.34  100.22
E        88.55   89.81   93.34       0   16.12   13.93
F        75.72   87.97   95.34   16.12       0   29.15
G       102.42   99.32  100.22   13.93   29.15       0
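The update in step 2 can be sketched in a few lines of NumPy. Under single linkage, the distance from the merged cluster (A,B) to any other point is the smaller of that point's distances to A and to B. The code starts from the rounded matrix above; `dist`, `ab_row`, and `merged` are names chosen here for illustration.

```python
import numpy as np

# Rounded distance matrix from above, rows and columns ordered A, B, C, D, E, F, G
dist = np.array([
    [  0.  ,   6.71,  67.27,  88.6 ,  88.55,  75.72, 102.42],
    [  6.71,   0.  ,  61.07,  82.66,  88.84,  76.84, 102.58],
    [ 67.27,  61.07,   0.  ,  22.47,  89.81,  87.97,  99.32],
    [ 88.6 ,  82.66,  22.47,   0.  ,  93.34,  95.34, 100.22],
    [ 88.55,  88.84,  89.81,  93.34,   0.  ,  16.12,  13.93],
    [ 75.72,  76.84,  87.97,  95.34,  16.12,   0.  ,  29.15],
    [102.42, 102.58,  99.32, 100.22,  13.93,  29.15,   0.  ],
])

# Single linkage: element-wise minimum of the A row and the B row (columns C..G)
ab_row = np.minimum(dist[0, 2:], dist[1, 2:])

# Assemble the 6x6 matrix ordered (A,B), C, D, E, F, G
merged = np.zeros((6, 6))
merged[0, 1:] = ab_row
merged[1:, 0] = ab_row
merged[1:, 1:] = dist[2:, 2:]  # distances among C..G are unchanged
print(merged)
```

Replacing `np.minimum` with `np.maximum` would give the complete-linkage update instead.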

If the threshold is now raised to 30, (C,D) can be merged, and so can (E,F,G).
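The two threshold cuts can also be reproduced with scikit-learn. This is a minimal sketch, assuming a scikit-learn version that supports the `distance_threshold` parameter of `AgglomerativeClustering` (0.21 or later) and taking G = [102, -25]; the helper `cluster_at` is a hypothetical name introduced only for this example.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering

# Points A..G (G's y-coordinate assumed to be -25, matching the distance matrix)
points = np.array([[10, 20], [13, 26], [60, 65], [81, 73],
                   [89, -20], [73, -22], [102, -25]])

def cluster_at(threshold):
    """Single-linkage clustering that stops merging at the given distance."""
    # n_clusters must be None when distance_threshold is set
    model = AgglomerativeClustering(n_clusters=None,
                                    distance_threshold=threshold,
                                    linkage="single")
    return model.fit_predict(points)

labels_10 = cluster_at(10)  # only A and B share a label
labels_30 = cluster_at(30)  # three groups: (A,B), (C,D), (E,F,G)
print(labels_10)
print(labels_30)
```

The label numbers themselves are arbitrary; only which points share a label is meaningful.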

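The single linkage used above is one way of measuring the distance between two sets of points.
Three common ways of measuring the distance between two sets of points are: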

Single Linkage

The distance between two clusters is defined as the shortest distance between a point in one cluster and a point in the other. For example, the distance between clusters “r” and “s” is the distance between their two closest points.

Complete Linkage

The distance between two clusters is defined as the longest distance between a point in one cluster and a point in the other. For example, the distance between clusters “r” and “s” is the distance between their two furthest points.

Average Linkage

The distance between two clusters is defined as the average distance from each point in one cluster to every point in the other cluster. For example, the distance between clusters “r” and “s” is the average of the distances over all pairs formed by one point from “r” and one point from “s”.
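The three definitions differ only in how they reduce the set of pairwise distances between the two clusters: minimum, maximum, or mean. A minimal NumPy sketch, using clusters (A,B) and (C,D) from the example as “r” and “s”; the function names are chosen here for illustration.

```python
import numpy as np

def pairwise(r, s):
    """All Euclidean distances between points of cluster r and points of cluster s."""
    r = np.asarray(r, dtype=float)
    s = np.asarray(s, dtype=float)
    # Broadcasting yields an array of shape (len(r), len(s))
    return np.linalg.norm(r[:, None, :] - s[None, :, :], axis=2)

def single_linkage(r, s):
    return pairwise(r, s).min()   # closest pair

def complete_linkage(r, s):
    return pairwise(r, s).max()   # furthest pair

def average_linkage(r, s):
    return pairwise(r, s).mean()  # mean over all pairs

r = [[10, 20], [13, 26]]  # cluster (A,B)
s = [[60, 65], [81, 73]]  # cluster (C,D)
print(single_linkage(r, s), average_linkage(r, s), complete_linkage(r, s))
```

For any two clusters the single-linkage distance is at most the average-linkage distance, which in turn is at most the complete-linkage distance.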
The clustering process can be represented visually as a tree structure:
It’s possible to visualize the tree representing the hierarchical merging of clusters as a dendrogram. Visual inspection can often be useful for understanding the structure of the data, though more so in the case of small sample sizes.
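A dendrogram for the seven example points can be built with SciPy. The sketch below assumes SciPy is installed and takes G = [102, -25]. It passes `no_plot=True` so that only the tree layout is computed; dropping that argument draws the figure, which additionally requires matplotlib.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram

# Points A..G (G's y-coordinate assumed to be -25, matching the distance matrix)
points = np.array([[10, 20], [13, 26], [60, 65], [81, 73],
                   [89, -20], [73, -22], [102, -25]])
labels = list("ABCDEFG")

# Each row of Z records one merge: the two cluster indices joined,
# the distance at which they merge, and the size of the new cluster
Z = linkage(points, method="single", metric="euclidean")

# Compute the dendrogram layout without drawing it
tree = dendrogram(Z, labels=labels, no_plot=True)
print(Z.round(2))
print(tree["ivl"])  # leaf labels in left-to-right order
```

The third column of `Z` holds the merge heights, so cutting the tree at a height of 10 or 30 corresponds to the two thresholds used in the worked example.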

