基于无监督学习算法的滑坡易发性评价的实施（k聚类、谱聚类、Hier聚类）_滑坡易发性机器学习一般用什么编程软件-CSDN博客

本文链接：https://blog.csdn.net/qq_36980284/article/details/138612666

基于无监督学习算法的滑坡易发性评价的实施

1. k均值聚类
2. 谱聚类
3. Hier聚类
4. 基于上述聚类方法的易发性实施

本研究中的数据集和代码可从以下链接下载：

1. k均值聚类

K-Means 聚类是一种矢量量化方法，最初来自信号处理，旨在将 N 个观测值划分为 K 个聚类，其中每个观测值都属于具有最接近均值的聚类（聚类中心或聚类质心），作为聚类的原型。这导致将数据空间划分为 Voronoi 单元。k-means 聚类最小化了聚类内方差（欧几里得距离的平方），但不能最小化正则欧几里得距离，这将是更困难的韦伯问题：均值优化平方误差，而只有几何中位数最小化欧几里得距离。例如，使用 k 中位数和 k-中位数可以找到更好的欧几里得解。
在这里插入图片描述

2. 谱聚类

在多元统计中，谱聚类技术利用数据相似性矩阵的谱（特征值）在聚类到较少的维度之前执行降维。相似性矩阵作为输入提供，包括对数据集中每对点的相对相似度的定量评估。

在图像分割的应用中，光谱聚类被称为基于分割的对象分类。

聚类前
在这里插入图片描述
聚类后

3. Hier聚类

在数据挖掘和统计中，分层聚类（也称为分层聚类分析或 HCA）是一种聚类分析方法，旨在构建聚类层次结构。分层聚类的策略通常分为两类：

聚集：这是一种“自下而上”的方法：每个观测点从其自己的聚类开始，随着聚类的向上移动，成对的聚类会合并。
除法：这是一种“自上而下”的方法：所有观察值都从一个聚类开始，当一个聚类向下移动时，以递归方式执行拆分。
一般来说，合并和拆分是以贪婪的方式确定的。分层聚类[1]的结果通常以树状图的形式呈现。

分层聚类具有明显的优势，即可以使用任何有效的距离度量。事实上，观测本身并不是必需的：所使用的只是一个距离矩阵。另一方面，除了单联动距离的特殊情况外，其他算法可以保证找到最佳解决方案。

聚类前
在这里插入图片描述
聚类后

4. 基于上述聚类方法的易发性实施

数据集见链接：埕岛海域数据集

通过gis数据处理得到同一投影坐标系下的海底滑坡相关的因子

在这里插入代码片

因子——水深、坡度、坡向、沉积物类型、波致剪应力
在这里插入图片描述
2. 对数据进行重分类，采用Arcgis空间分析中的重分类工具

#数据探索
import rasterio
rs1=rasterio.open(r'Sea_mct_560_evaluate\Reclass_560\Rebathy.tif')
result1=rs1.read(masked=True)
rs2=rasterio.open(r'Sea_mct_560_evaluate\Reclass_560\Reslope.tif')
result2=rs2.read(masked=True)
rs3=rasterio.open(r'Sea_mct_560_evaluate\Reclass_560\Reaspect.tif')
result3=rs3.read(masked=True)
rs4=rasterio.open(r'Sea_mct_560_evaluate\Reclass_560\Resediment.tif')
result4=rs4.read(masked=True)
rs5=rasterio.open(r'Sea_mct_560_evaluate\Reclass_560\Reshear.tif')
result5=rs5.read(masked=True)
import matplotlib.pyplot as plt
from rasterio.plot import show
from matplotlib import colors,cm
fig,ax=plt.subplots(2,3,figsize=(10,6))
all_data=[result1,result2,result3,result4,result5]
k=0
for i in range(1,3):
    for j in range(1,4):
        if k<5:
            show(all_data[k],transform=rs1.transform,ax=ax[i-1,j-1],cmap='Accent')
            fig.colorbar(cm.ScalarMappable(norm=colors.Normalize(vmin=np.nanmin(all_data[k]), vmax=np.nanmax(all_data[k])),cmap='Accent')
                         , ax=ax[i-1,j-1],extend='both',fraction=0.05)
            k+=1

        else:
            ax[i-1,j-1].axis('off')
plt.tight_layout()
plt.show()

重分类因子
在这里插入图片描述
3. 形成聚类分析数据集

#将上述结果转成列，拼接形成矩阵
bathy=result1.compressed()
slope=result2.compressed()
aspect=result3.compressed()
sediment=result4.compressed()
shear=result5.compressed()
print(bathy.shape,slope.shape,aspect.shape,sediment.shape,shear.shape)
#形成Dataframe
import pandas as pd
ddf=pd.concat([pd.DataFrame({'bathy':bathy}), pd.DataFrame({'slope':slope}), pd.DataFrame({'aspect':aspect})
          ,pd.DataFrame({'sediment':sediment})
          ,pd.DataFrame({'shear':shear})], axis=1)
print(ddf)

聚类分析

import matplotlib.pyplot as plt
from sklearn.cluster import SpectralClustering,KMeans,AgglomerativeClustering
#谱聚类
pred_spectral_label=SpectralClustering(n_clusters=4,gamma=1,random_state=0).fit(ddf)
spectral_label=pred_spectral_label.labels_
#K聚类
pred_k_label=KMeans(n_clusters=4,random_state=0).fit(ddf)
k_label=pred_k_label.labels_
#Hierarchical clustering
pred_hier_label=AgglomerativeClustering(n_clusters=4).fit(ddf)
hier_label=pred_hier_label.labels_

保存聚类结果

#保存谱聚类结果
# np.savetxt(r'spectral_cluster.txt',Susceptibility_spectral)
np.savez("spectral_cluster.npz",spectral=Susceptibility_spectral.data,mask=Susceptibility_spectral.mask)
np.savez("spectral_k.npz",spectral=Susceptibility_k.data,mask=Susceptibility_k.mask)
np.savez("spectral_hier.npz",spectral=Susceptibility_hier.data,mask=Susceptibility_hier.mask)

得到的聚类结果是不同的类别数字，至于具体的易发性等级需要通过专家经验进行判断
聚类结果可视化与易发性等级（与AHP模型做了对比）

from rasterio.plot import show
import matplotlib.pyplot as plt
plt.rcParams['font.family']='times new roman'
from matplotlib import colors, cm
from palettable.cartocolors.sequential import PinkYl_4
fig,ax=plt.subplots(2,2,figsize=(6,6))

fig.subplots_adjust(hspace=-0.1)

#创造数组
spectral_cluster=np.load("spectral_cluster.npz")
spectral_k=np.load("spectral_k.npz")
spectral_hier=np.load("spectral_hier.npz")

Susceptibility_spectral=np.ma.masked_array(spectral_cluster['spectral'],mask=spectral_cluster['mask'])
Susceptibility_k=np.ma.masked_array(spectral_k['spectral'],mask=spectral_k['mask'])
Susceptibility_hier=np.ma.masked_array(spectral_hier['spectral'],mask=spectral_hier['mask'])

#陆地shp
import geopandas as gpd
gdf=gpd.read_file(r'shp数据\land_sea_boundary.shp')
dd=gdf.to_crs(rs1.crs)
for i in range(2):
    for j in range(2):
        dd.plot(ax=ax[i,j],color='lightgrey',edgecolor='black')
        ax[i,j].text(0.4,0.4,r'Chengdao',transform=ax[i,j].transAxes)

show(Susceptibility_spectral,transform=rs1.transform,cmap=PinkYl_4.mpl_colormap,ax=ax[0,0])
show(Susceptibility_k,transform=rs1.transform,cmap=PinkYl_4.mpl_colormap,ax=ax[0,1])
show(Susceptibility_hier,transform=rs1.transform,cmap=PinkYl_4.mpl_colormap,ax=ax[1,0])

#与已出版文献中的AHP文献对比
#AHP易发性评价结果
rs_ahp=rasterio.open(r'易发性评价结果\ReSuscetpibility_AHP2.tif')
result_ahp=rs_ahp.read(masked=True)
show(result_ahp,transform=rs_ahp.transform,cmap=PinkYl_4.mpl_colormap,ax=ax[1,1])

plt.tight_layout()
plt.savefig(r'Susceptibility.jpg',dpi=300,bbox_inches='tight')
plt.show()