AI_Homework_ch9

AI_ch9 Unsupervised Learning: Clustering Algorithms

Use the k-means algorithm to partition the following eight points into three clusters. The coordinates of the eight points are $A1=(2,10),\ A2=(2,5),\ A3=(8,4),\ A4=(5,8),\ A5=(7,5),\ A6=(6,4),\ A7=(1,2),\ A8=(4,9)$. The Euclidean distances between the points are:

$$
\begin{array}{|c|c|c|c|c|c|c|c|c|}
\hline
 & A1 & A2 & A3 & A4 & A5 & A6 & A7 & A8 \\ \hline
A1 & 0 & \sqrt{25} & \sqrt{72} & \sqrt{13} & \sqrt{50} & \sqrt{52} & \sqrt{65} & \sqrt{5} \\ \hline
A2 & & 0 & \sqrt{37} & \sqrt{18} & \sqrt{25} & \sqrt{17} & \sqrt{10} & \sqrt{20} \\ \hline
A3 & & & 0 & \sqrt{25} & \sqrt{2} & \sqrt{4} & \sqrt{53} & \sqrt{41} \\ \hline
A4 & & & & 0 & \sqrt{13} & \sqrt{17} & \sqrt{52} & \sqrt{2} \\ \hline
A5 & & & & & 0 & \sqrt{2} & \sqrt{45} & \sqrt{25} \\ \hline
A6 & & & & & & 0 & \sqrt{29} & \sqrt{29} \\ \hline
A7 & & & & & & & 0 & \sqrt{58} \\ \hline
A8 & & & & & & & & 0 \\ \hline
\end{array}
$$
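The initial centers of the three clusters are A1, A4, and A7. After running one pass of the k-means algorithm, give the new cluster assignments and the new cluster centers. You may draw a sketch of the eight points.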
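The distance table can be checked numerically. The following is a minimal sketch, assuming only the coordinates given in the problem; the names `coords`, `squared_distance`, and `sq_dist` are illustrative and do not come from the original post. It works with squared distances, so each table entry is the square root of the printed value.

```python
from itertools import combinations

# Coordinates copied from the problem statement.
coords = {
    "A1": (2, 10), "A2": (2, 5), "A3": (8, 4), "A4": (5, 8),
    "A5": (7, 5), "A6": (6, 4), "A7": (1, 2), "A8": (4, 9),
}


def squared_distance(p, q):
    """Squared Euclidean distance; the table entry is its square root."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


# Upper triangle of the table, keyed by point pair.
sq_dist = {
    (a, b): squared_distance(coords[a], coords[b])
    for a, b in combinations(coords, 2)
}

for (a, b), d2 in sq_dist.items():
    print(f"d({a}, {b}) = sqrt({d2})")
```

Comparing squared distances is enough for k-means, because taking the square root does not change which center is nearest.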

  • The eight points, plotted with Python (the initial centers are shown in red):
import matplotlib.pyplot as plt
import numpy as np
points = np.array([
    [2, 10],  # A1
    [2, 5],   # A2
    [8, 4],   # A3
    [5, 8],   # A4
    [7, 5],   # A5
    [6, 4],   # A6
    [1, 2],   # A7
    [4, 9]    # A8
])
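labels = ['A1', 'A2', 'A3', 'A4', 'A5', 'A6', 'A7', 'A8']  # point labels
colors = ['blue'] * len(points)  # ordinary points are drawn in blue
centers = [0, 3, 6]  # indices of the initial centers A1, A4, A7
for x in centers:
    colors[x] = 'red'  # highlight the initial centers in red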
fig, ax = plt.subplots()
ax.scatter(points[:, 0], points[:, 1],c=colors) 
for i, label in enumerate(labels):
    ax.annotate(label, (points[i, 0], points[i, 1]), xytext=(5, -5), textcoords='offset points')

ax.set_xlim(0, 10)
ax.set_ylim(0, 12)
ax.xaxis.set_major_locator(plt.MultipleLocator(1))
ax.yaxis.set_major_locator(plt.MultipleLocator(1))
ax.grid(True)

ax.set_title("Points Visualization")
ax.set_xlabel("X")
ax.set_ylabel("Y")

plt.show()


  • Run k-means. In the first assignment step, each point goes to its nearest center according to the distance table: the points assigned to center $A1$ are $[A1]$, the points assigned to center $A4$ are $[A3, A4, A5, A6, A8]$, and the points assigned to center $A7$ are $[A2, A7]$.

  • The three cluster centers must now be recomputed. The cluster centered at $A1$ contains only one point, so its center stays at $A1$.

  • For the cluster centered at $A4$, the new center is: $\left(\frac{1}{5}\left(x_{A3}+x_{A4}+x_{A5}+x_{A6}+x_{A8}\right),\frac{1}{5}\left(y_{A3}+y_{A4}+y_{A5}+y_{A6}+y_{A8}\right)\right)=\left(\frac{1}{5}\left(8+5+7+6+4\right),\frac{1}{5}\left(4+8+5+4+9\right)\right)=\left(6,6\right)$

  • Likewise, for the cluster centered at $A7$, the new center is: $\left(\frac{3}{2},\frac{7}{2}\right)$
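The first pass described above can be reproduced in a few lines. This is a minimal sketch under stated assumptions, not the author's code: the helper names `assign` and `update` are hypothetical, ties are broken in favor of the lower-indexed center, and empty clusters are not handled (neither case arises here). `fractions.Fraction` is used so the centers come out as exact fractions.

```python
from fractions import Fraction

# Coordinates copied from the problem statement.
coords = {
    "A1": (2, 10), "A2": (2, 5), "A3": (8, 4), "A4": (5, 8),
    "A5": (7, 5), "A6": (6, 4), "A7": (1, 2), "A8": (4, 9),
}


def squared_distance(p, q):
    """Squared Euclidean distance between two 2-D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def assign(coords, centers):
    """Assignment step: map each center index to the names of its nearest points."""
    clusters = {i: [] for i in range(len(centers))}
    for name, p in coords.items():
        nearest = min(range(len(centers)),
                      key=lambda i: squared_distance(p, centers[i]))
        clusters[nearest].append(name)
    return clusters


def update(coords, clusters):
    """Update step: each new center is the mean of the points in its cluster."""
    new_centers = []
    for i in sorted(clusters):
        members = [coords[n] for n in clusters[i]]
        new_centers.append((
            Fraction(sum(x for x, _ in members), len(members)),
            Fraction(sum(y for _, y in members), len(members)),
        ))
    return new_centers


initial = [coords["A1"], coords["A4"], coords["A7"]]
first_clusters = assign(coords, initial)
first_centers = update(coords, first_clusters)
print(first_clusters)
print(first_centers)
```

The printed clusters and centers should match the assignment and the centers $(2,10)$, $(6,6)$, $\left(\frac{3}{2},\frac{7}{2}\right)$ derived by hand.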

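fig1, ax1 = plt.subplots()
ax1.scatter(points[:, 0], points[:, 1])  # all eight points in the default color
# Centers after the first update step
new_centers = np.array([[2, 10], [6, 6], [1.5, 3.5]])
center_labels = ['C1', 'C2', 'C3']
ax1.scatter(new_centers[:, 0], new_centers[:, 1], c='red')  # centers in red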

for i, label in enumerate(labels):
    ax1.annotate(label, (points[i, 0], points[i, 1]), xytext=(5, -5), textcoords='offset points')
    
for i, label in enumerate(center_labels):
    ax1.annotate(label, (new_centers[i, 0], new_centers[i, 1]), xytext=(-5, 5), textcoords='offset points')
ax1.set_xlim(0, 10)
ax1.set_ylim(0, 12)
ax1.xaxis.set_major_locator(plt.MultipleLocator(1))
ax1.yaxis.set_major_locator(plt.MultipleLocator(1))
ax1.grid(True)

ax1.set_title("Points Visualization")
ax1.set_xlabel("X")
ax1.set_ylabel("Y")

plt.show()


  • Reassign the points using the new centers. Following the same steps as above, the points nearest to $C1$ are $[A1, A8]$, the points nearest to $C2$ are $[A3, A4, A5, A6]$, and the points nearest to $C3$ are $[A2, A7]$.

  • Update the centers in the same way: $C1=\left(3,\frac{19}{2}\right),\quad C2=\left(\frac{13}{2},\frac{21}{4}\right),\quad C3=\left(\frac{3}{2},\frac{7}{2}\right)$
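As a worked check of this step, the only point that changes cluster is $A8$. The sketch below compares squared distances (written $d^2$ here, a notation not used in the original post) from $A8$ to the centers $C1=(2,10)$ and $C2=(6,6)$ obtained from the first update, then spells out the centroid arithmetic.

```latex
\begin{aligned}
d^2(A8, C1) &= (4-2)^2 + (9-10)^2 = 5, \\
d^2(A8, C2) &= (4-6)^2 + (9-6)^2 = 13, \\
C1 &= \left(\tfrac{2+4}{2},\ \tfrac{10+9}{2}\right) = \left(3,\ \tfrac{19}{2}\right), \\
C2 &= \left(\tfrac{8+5+7+6}{4},\ \tfrac{4+8+5+4}{4}\right) = \left(\tfrac{13}{2},\ \tfrac{21}{4}\right).
\end{aligned}
```

Since $5 < 13$, $A8$ moves from the second cluster to the first, while $C3$ is unchanged because its members are still $A2$ and $A7$.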

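fig2, ax2 = plt.subplots()
ax2.scatter(points[:, 0], points[:, 1])  # all eight points in the default color
# Centers after the second update step
new_centers2 = np.array([[3, 19/2], [13/2, 21/4], [1.5, 3.5]])
center_labels2 = ['C1', 'C2', 'C3']
ax2.scatter(new_centers2[:, 0], new_centers2[:, 1], c='red')  # centers in red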

for i, label in enumerate(labels):
    ax2.annotate(label, (points[i, 0], points[i, 1]), xytext=(5, -5), textcoords='offset points')
    
for i, label in enumerate(center_labels2):
    ax2.annotate(label, (new_centers2[i, 0], new_centers2[i, 1]), xytext=(-5, 5), textcoords='offset points')
ax2.set_xlim(0, 10)
ax2.set_ylim(0, 12)
ax2.xaxis.set_major_locator(plt.MultipleLocator(1))
ax2.yaxis.set_major_locator(plt.MultipleLocator(1))
ax2.grid(True)

ax2.set_title("Points Visualization")
ax2.set_xlabel("X")
ax2.set_ylabel("Y")

plt.show()


  • In the same way, the points assigned to $C1$ are now $[A1, A4, A8]$, the points assigned to $C2$ are $[A3, A5, A6]$, and the points assigned to $C3$ are $[A2, A7]$.

  • Update the centers again to obtain $C1=\left(\frac{11}{3},9\right),\quad C2=\left(7,\frac{13}{3}\right),\quad C3=\left(\frac{3}{2},\frac{7}{2}\right)$
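As a worked check of this step, the only point that changes cluster is $A4$. The sketch below compares squared distances (written $d^2$ here, a notation not used in the original post) from $A4$ to the previous centers $C1=\left(3,\frac{19}{2}\right)$ and $C2=\left(\frac{13}{2},\frac{21}{4}\right)$, then spells out the centroid arithmetic.

```latex
\begin{aligned}
d^2(A4, C1) &= (5-3)^2 + \left(8-\tfrac{19}{2}\right)^2 = 4 + \tfrac{9}{4} = \tfrac{25}{4}, \\
d^2(A4, C2) &= \left(5-\tfrac{13}{2}\right)^2 + \left(8-\tfrac{21}{4}\right)^2 = \tfrac{9}{4} + \tfrac{121}{16} = \tfrac{157}{16}, \\
C1 &= \left(\tfrac{2+5+4}{3},\ \tfrac{10+8+9}{3}\right) = \left(\tfrac{11}{3},\ 9\right), \\
C2 &= \left(\tfrac{8+7+6}{3},\ \tfrac{4+5+4}{3}\right) = \left(7,\ \tfrac{13}{3}\right).
\end{aligned}
```

Since $\frac{25}{4} = \frac{100}{16} < \frac{157}{16}$, $A4$ moves from the second cluster to the first.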

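fig3, ax3 = plt.subplots()
ax3.scatter(points[:, 0], points[:, 1])  # all eight points in the default color
# Centers after the third update step
new_centers3 = np.array([[11/3, 9], [7, 13/3], [1.5, 3.5]])
center_labels3 = ['C1', 'C2', 'C3']
# Both coordinates must come from new_centers3 (the x values previously used new_centers2)
ax3.scatter(new_centers3[:, 0], new_centers3[:, 1], c='red')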

for i, label in enumerate(labels):
    ax3.annotate(label, (points[i, 0], points[i, 1]), xytext=(5, -5), textcoords='offset points')
    
for i, label in enumerate(center_labels3):
    ax3.annotate(label, (new_centers3[i, 0], new_centers3[i, 1]), xytext=(-5, 5), textcoords='offset points')
ax3.set_xlim(0, 10)
ax3.set_ylim(0, 12)
ax3.xaxis.set_major_locator(plt.MultipleLocator(1))
ax3.yaxis.set_major_locator(plt.MultipleLocator(1))
ax3.grid(True)

ax3.set_title("Points Visualization")
ax3.set_xlabel("X")
ax3.set_ylabel("Y")

plt.show()


  • At this step, the points assigned to $C1$ are $[A1, A4, A8]$, the points assigned to $C2$ are $[A3, A5, A6]$, and the points assigned to $C3$ are $[A2, A7]$. This is identical to the previous partition, so the algorithm has converged and the final result is:
  • $\mathrm{Cluster}_{C1}=\{A1,A4,A8\},\quad \mathrm{Cluster}_{C2}=\{A3,A5,A6\},\quad \mathrm{Cluster}_{C3}=\{A2,A7\}$
  • $C1=\left(\frac{11}{3},9\right),\quad C2=\left(7,\frac{13}{3}\right),\quad C3=\left(\frac{3}{2},\frac{7}{2}\right)$
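The whole hand computation can be cross-checked by iterating until the partition stops changing. This is a minimal sketch under stated assumptions rather than a general-purpose implementation: the function name `kmeans` and its return values are hypothetical, ties go to the lower-indexed center, and empty clusters are not handled (neither case arises for this data). Exact fractions are used so the result can be compared directly with the centers above.

```python
from fractions import Fraction

# Coordinates copied from the problem statement.
coords = {
    "A1": (2, 10), "A2": (2, 5), "A3": (8, 4), "A4": (5, 8),
    "A5": (7, 5), "A6": (6, 4), "A7": (1, 2), "A8": (4, 9),
}


def squared_distance(p, q):
    """Squared Euclidean distance between two 2-D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def kmeans(coords, centers):
    """Alternate assignment and update steps until the partition is stable.

    Returns the final clusters, the final centers, and the number of
    update steps performed.
    """
    previous = None
    updates = 0
    while True:
        # Assignment step: each point joins its nearest center.
        clusters = [[] for _ in centers]
        for name, p in coords.items():
            nearest = min(range(len(centers)),
                          key=lambda i: squared_distance(p, centers[i]))
            clusters[nearest].append(name)
        if clusters == previous:
            return clusters, centers, updates
        # Update step: each center becomes the mean of its cluster.
        centers = [
            (Fraction(sum(coords[n][0] for n in c), len(c)),
             Fraction(sum(coords[n][1] for n in c), len(c)))
            for c in clusters
        ]
        previous = clusters
        updates += 1


start = [coords["A1"], coords["A4"], coords["A7"]]
final_clusters, final_centers, n_updates = kmeans(coords, start)
print(final_clusters)
print(final_centers)
print(n_updates)
```

The loop stops on the fourth assignment step, which reproduces the third partition, so three update steps are performed, matching the walkthrough above.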