A Python Implementation of Hierarchical Clustering

A Worked Example of the Hierarchical Clustering Algorithm

Dataset: Travel details dataset

Source: https://www.kaggle.com/code/rkiattisak/starter-for-traveler-trip-dataset

Field descriptions
Trip ID: unique identifier for the trip
Destination: travel destination
Start date: trip start date
End date: trip end date
Duration (days): trip length in days
Traveler name: traveler's name
Traveler age: traveler's age
Traveler gender: traveler's gender
Traveler nationality: traveler's nationality
Accommodation type: type of accommodation
Accommodation cost: cost of accommodation
Transportation type: mode of transportation
Transportation cost: cost of transportation

1. Data acquisition and preprocessing

(1) Loading the dataset

import pandas as pd

# Read the file
file_path = 'D:/MachineLearningDesign/TotalDataset/Travel_detailsDataset/Travel details dataset.xlsx'
travel_data = pd.read_excel(file_path)

# Display the dataset
travel_data
(row) | Trip ID | Destination | Start date | End date | Duration (days) | Traveler name | Traveler age | Traveler gender | Traveler nationality | Accommodation type | Accommodation cost | Transportation type | Transportation cost
0 | 1 | London, UK | 5/1/2023 | 5/8/2023 | 7 | John Smith | 35 | Male | American | Hotel | 1200 | Flight | 600.0
1 | 2 | Phuket, Thailand | 6/15/2023 | 6/20/2023 | 5 | Jane Doe | 28 | Female | Canadian | Resort | 800 | Flight | 500.0
2 | 3 | Bali, Indonesia | 7/1/2023 | 7/8/2023 | 7 | David Lee | 45 | Male | Korean | Villa | 1000 | Flight | 700.0
3 | 4 | New York, USA | 8/15/2023 | 8/29/2023 | 14 | Sarah Johnson | 29 | Female | British | Hotel | 2000 | Flight | 1000.0
4 | 5 | Tokyo, Japan | 9/10/2023 | 9/17/2023 | 7 | Kim Nguyen | 26 | Female | Vietnamese | Airbnb | 700 | Train | 200.0
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ...
132 | 135 | Rio de Janeiro, Brazil | 8/1/2023 | 8/10/2023 | 9 | Jose Perez | 37 | Male | Brazilian | Hostel | 2500 | Car | 2000.0
133 | 136 | Vancouver, Canada | 8/15/2023 | 8/21/2023 | 6 | Emma Wilson | 29 | Female | Canadian | Hotel | 5000 | Airplane | 3000.0
134 | 137 | Bangkok, Thailand | 9/1/2023 | 9/8/2023 | 7 | Ryan Chen | 34 | Male | Chinese | Hostel | 2000 | Train | 1000.0
135 | 138 | Barcelona, Spain | 9/15/2023 | 9/22/2023 | 7 | Sofia Rodriguez | 25 | Female | Spanish | Airbnb | 6000 | Airplane | 2500.0
136 | 139 | Auckland, New Zealand | 10/1/2023 | 10/8/2023 | 7 | William Brown | 39 | Male | New Zealander | Hotel | 7000 | Train | 2500.0

137 rows × 13 columns

(2) Converting the date features to numeric form

# This dataset has two date features, Start date and End date,
# i.e. the start and end dates of each trip.
# Convert the date features to datetime format
travel_data['Start date'] = pd.to_datetime(travel_data['Start date'])
travel_data['End date'] = pd.to_datetime(travel_data['End date'])

# Approach: date decomposition
# Split Start date and End date into year, month, and day.
# The new numeric features are appended as columns to travel_data
travel_data['sy'] = travel_data['Start date'].dt.year   # Start year
travel_data['sm'] = travel_data['Start date'].dt.month  # Start month
travel_data['sd'] = travel_data['Start date'].dt.day    # Start day

travel_data['ey'] = travel_data['End date'].dt.year   # End year
travel_data['em'] = travel_data['End date'].dt.month  # End month
travel_data['ed'] = travel_data['End date'].dt.day    # End day

# Drop the 'Start date' and 'End date' columns
travel_data.drop(['Start date', 'End date'], axis=1, inplace=True)

travel_data
(row) | Trip ID | Destination | Duration (days) | Traveler name | Traveler age | Traveler gender | Traveler nationality | Accommodation type | Accommodation cost | Transportation type | Transportation cost | sy | sm | sd | ey | em | ed
0 | 1 | London, UK | 7 | John Smith | 35 | Male | American | Hotel | 1200 | Flight | 600.0 | 2023 | 5 | 1 | 2023 | 5 | 8
1 | 2 | Phuket, Thailand | 5 | Jane Doe | 28 | Female | Canadian | Resort | 800 | Flight | 500.0 | 2023 | 6 | 15 | 2023 | 6 | 20
2 | 3 | Bali, Indonesia | 7 | David Lee | 45 | Male | Korean | Villa | 1000 | Flight | 700.0 | 2023 | 7 | 1 | 2023 | 7 | 8
3 | 4 | New York, USA | 14 | Sarah Johnson | 29 | Female | British | Hotel | 2000 | Flight | 1000.0 | 2023 | 8 | 15 | 2023 | 8 | 29
4 | 5 | Tokyo, Japan | 7 | Kim Nguyen | 26 | Female | Vietnamese | Airbnb | 700 | Train | 200.0 | 2023 | 9 | 10 | 2023 | 9 | 17
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ...
132 | 135 | Rio de Janeiro, Brazil | 9 | Jose Perez | 37 | Male | Brazilian | Hostel | 2500 | Car | 2000.0 | 2023 | 8 | 1 | 2023 | 8 | 10
133 | 136 | Vancouver, Canada | 6 | Emma Wilson | 29 | Female | Canadian | Hotel | 5000 | Airplane | 3000.0 | 2023 | 8 | 15 | 2023 | 8 | 21
134 | 137 | Bangkok, Thailand | 7 | Ryan Chen | 34 | Male | Chinese | Hostel | 2000 | Train | 1000.0 | 2023 | 9 | 1 | 2023 | 9 | 8
135 | 138 | Barcelona, Spain | 7 | Sofia Rodriguez | 25 | Female | Spanish | Airbnb | 6000 | Airplane | 2500.0 | 2023 | 9 | 15 | 2023 | 9 | 22
136 | 139 | Auckland, New Zealand | 7 | William Brown | 39 | Male | New Zealander | Hotel | 7000 | Train | 2500.0 | 2023 | 10 | 1 | 2023 | 10 | 8

137 rows × 17 columns
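As an aside, date decomposition is not the only way to make dates numeric. A minimal sketch of an alternative (not used in this walkthrough) converts each date to its day ordinal, so that subtracting two columns recovers the trip duration in days:

```python
import pandas as pd

# Toy frame with the same two date columns as the travel dataset
df = pd.DataFrame({
    'Start date': ['5/1/2023', '6/15/2023'],
    'End date':   ['5/8/2023', '6/20/2023'],
})
for col in ['Start date', 'End date']:
    df[col] = pd.to_datetime(df[col])

# Map each timestamp to its proleptic Gregorian day ordinal
df['start_ord'] = df['Start date'].map(pd.Timestamp.toordinal)
df['end_ord'] = df['End date'].map(pd.Timestamp.toordinal)

# The difference of ordinals recovers the duration in days
df['duration'] = df['end_ord'] - df['start_ord']
print(df[['start_ord', 'end_ord', 'duration']])
```

This keeps one column per date instead of three, at the cost of losing the explicit month/day seasonality that the decomposition above exposes.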

(3) Encoding non-numeric features and handling missing values

from sklearn.preprocessing import LabelEncoder  # label encoding: converts categorical data to numeric codes

# Work on the preprocessed data
data = travel_data

# Names of the non-numeric columns (features)
non_numeric_columns = ['Destination', 'Traveler gender', 'Traveler nationality',
                       'Accommodation type', 'Transportation type']

# Encode the non-numeric columns as numbers
label_encoders = {}  # stores the LabelEncoder for each feature

# For each non-numeric column in the dataset:
for column in non_numeric_columns:

    # Check whether the column's dtype is object (i.e. strings)
    if data[column].dtype == 'object':

        # If so, create a LabelEncoder for this column and store it
        # in the label_encoders dict for later use
        label_encoders[column] = LabelEncoder()

        # fit_transform() fits the encoder and converts the column's
        # string values to integer codes, written back into the dataset
        data[column] = label_encoders[column].fit_transform(data[column])

print(label_encoders)

# Reorder the columns: move the traveler-name feature to the front
# to simplify later processing
cols = list(data.columns)
cols.remove('Traveler name')  # remove 'Traveler name' from the column list
cols.insert(1, 'Traveler name')  # insert 'Traveler name' as the second column
# Reassign the reordered columns to the dataset
data = data[cols]

# Fill missing values in the numeric columns with the column mean
numeric_columns = data.select_dtypes(include=['number']).columns
data[numeric_columns] = data[numeric_columns].fillna(data[numeric_columns].mean())

data

{'Destination': LabelEncoder(), 'Traveler gender': LabelEncoder(), 'Traveler nationality': LabelEncoder(), 'Accommodation type': LabelEncoder(), 'Transportation type': LabelEncoder()}
(row) | Trip ID | Traveler name | Destination | Duration (days) | Traveler age | Traveler gender | Traveler nationality | Accommodation type | Accommodation cost | Transportation type | Transportation cost | sy | sm | sd | ey | em | ed
0 | 1 | John Smith | 30 | 7 | 35 | 1 | 0 | 3 | 1200 | 5 | 600.0 | 2023 | 5 | 1 | 2023 | 5 | 8
1 | 2 | Jane Doe | 42 | 5 | 28 | 0 | 7 | 4 | 800 | 5 | 500.0 | 2023 | 6 | 15 | 2023 | 6 | 20
2 | 3 | David Lee | 6 | 7 | 45 | 1 | 23 | 7 | 1000 | 5 | 700.0 | 2023 | 7 | 1 | 2023 | 7 | 8
3 | 4 | Sarah Johnson | 36 | 14 | 29 | 0 | 4 | 3 | 2000 | 5 | 1000.0 | 2023 | 8 | 15 | 2023 | 8 | 29
4 | 5 | Kim Nguyen | 57 | 7 | 26 | 0 | 40 | 0 | 700 | 8 | 200.0 | 2023 | 9 | 10 | 2023 | 9 | 17
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ...
132 | 135 | Jose Perez | 44 | 9 | 37 | 1 | 3 | 2 | 2500 | 2 | 2000.0 | 2023 | 8 | 1 | 2023 | 8 | 10
133 | 136 | Emma Wilson | 58 | 6 | 29 | 0 | 7 | 3 | 5000 | 0 | 3000.0 | 2023 | 8 | 15 | 2023 | 8 | 21
134 | 137 | Ryan Chen | 9 | 7 | 34 | 1 | 9 | 2 | 2000 | 8 | 1000.0 | 2023 | 9 | 1 | 2023 | 9 | 8
135 | 138 | Sofia Rodriguez | 11 | 7 | 25 | 0 | 33 | 0 | 6000 | 0 | 2500.0 | 2023 | 9 | 15 | 2023 | 9 | 22
136 | 139 | William Brown | 3 | 7 | 39 | 1 | 26 | 3 | 7000 | 8 | 2500.0 | 2023 | 10 | 1 | 2023 | 10 | 8

137 rows × 17 columns
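Because each fitted LabelEncoder is kept in label_encoders, the integer codes remain reversible. A small self-contained sketch (with toy values standing in for a column such as Transportation type) shows how classes_ defines the code mapping and how inverse_transform recovers the original strings:

```python
from sklearn.preprocessing import LabelEncoder

# Toy column standing in for 'Transportation type'
values = ['Flight', 'Train', 'Flight', 'Car', 'Airplane']

enc = LabelEncoder()
codes = enc.fit_transform(values)

# classes_ holds the sorted unique labels; code i maps to classes_[i]
print(dict(zip(enc.classes_, range(len(enc.classes_)))))

# inverse_transform recovers the original strings from the codes
restored = enc.inverse_transform(codes)
print(list(restored))
```

This is why the encoders are stored in a dict per column: after clustering, the numeric codes in any encoded column can be mapped back to readable category names.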

(4) Standardizing the data

from sklearn.preprocessing import StandardScaler
# Take the data from the third column onward and convert it to an array
array_data = data.iloc[:, 2:].values  # [:, 2:] — all rows; columns from the 3rd to the end

# array_data
transfer = StandardScaler()  # instantiate the transformer
scaler_data = transfer.fit_transform(array_data)
# Inspect the standardized data: number of sample rows + feature columns
print(scaler_data, '\n', scaler_data.shape)
[[-0.06124973 -0.37973645  0.25631927 ...  0.1651258  -0.52158185
  -1.30645256]
 [ 0.61004731 -1.63332426 -0.72692145 ...  0.1651258  -0.19379876
   0.35465186]
 [-1.4038438  -0.37973645  1.66094888 ...  0.1651258   0.13398433
  -1.30645256]
 ...
 [-1.23601954 -0.37973645  0.11585631 ...  0.1651258   0.78955051
  -1.30645256]
 [-1.1241367  -0.37973645 -1.14831034 ...  0.1651258   0.78955051
   0.63150259]
 [-1.57166806 -0.37973645  0.81817112 ...  0.1651258   1.1173336
  -1.30645256]] 
 (137, 15)
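As a quick sanity check on what StandardScaler does, the sketch below (on a toy matrix, not the travel data) verifies that it computes z = (x - mean) / std per column, using the population standard deviation (ddof = 0):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Small matrix standing in for the numeric travel features
X = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [3.0, 30.0]])

scaled = StandardScaler().fit_transform(X)

# Manual z-score with the population std (ddof=0), matching the scaler
manual = (X - X.mean(axis=0)) / X.std(axis=0)
print(np.allclose(scaled, manual))

# After scaling, every column has mean ~0 and std ~1
print(scaled.mean(axis=0), scaled.std(axis=0))
```

Standardization matters here because features such as Accommodation cost (hundreds to thousands) would otherwise dominate the Euclidean distances over features such as month or day.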

(5) Getting the sample names (traveler names) — the sample labels at clustering iteration 0

# Get the name (traveler name) of each sample
column_names_array = data.iloc[:, 1:2].values

column_names_array

array([['John Smith'],
       ['Jane Doe'],
       ['David Lee'],
       ['Sarah Johnson'],
       ['Kim Nguyen'],
       ['Michael Brown'],
       ['Emily Davis'],
       ['Lucas Santos'],
       ['Laura Janssen'],
       ['Mohammed Ali'],
       ['Ana Hernandez'],
       ['Carlos Garcia'],
       ['Lily Wong'],
       ['Hans Mueller'],
       ['Fatima Khouri'],
       ['James MacKenzie'],
       ['Sarah Johnson'],
       ['Michael Chang'],
       ['Olivia Rodriguez'],
       ['Kenji Nakamura'],
       ['Emily Lee'],
       ['James Wilson'],
       ['Sofia Russo'],
       ['Raj Patel'],
       ['Lily Nguyen'],
       ['David Kim'],
       ['Maria Garcia'],
       ['Alice Smith'],
       ['Bob Johnson'],
       ['Charlie Lee'],
       ['Emma Davis'],
       ['Olivia Martin'],
       ['Harry Wilson'],
       ['Sophia Lee'],
       ['James Brown'],
       ['Mia Johnson'],
       ['William Davis'],
       ['Amelia Brown'],
       ['Mia Johnson'],
       ['Adam Lee'],
       ['Sarah Wong'],
       ['John Smith'],
       ['Maria Silva'],
       ['Peter Brown'],
       ['Emma Garcia'],
       ['Michael Davis'],
       ['Nina Patel'],
       ['Kevin Kim'],
       ['Laura van den Berg'],
       ['Jennifer Nguyen'],
       ['David Kim'],
       ['Rachel Lee'],
       ['Jessica Wong'],
       ['Felipe Almeida'],
       ['Nisa Patel'],
       ['Ben Smith'],
       ['Laura Gomez'],
       ['Park Min Woo'],
       ['Michael Chen'],
       ['Sofia Rossi'],
       ['Rachel Sanders'],
       ['Kenji Nakamura'],
       ['Emily Watson'],
       ['David Lee'],
       ['Ana Rodriguez'],
       ['Tom Wilson'],
       ['Olivia Green'],
       ['James Chen'],
       ['Lila Patel'],
       ['Marco Rossi'],
       ['Sarah Brown'],
       ['Sarah Lee'],
       ['Alex Kim'],
       ['Maria Hernandez'],
       ['John Smith'],
       ['Mark Johnson'],
       ['Amanda Chen'],
       ['David Lee'],
       ['Nana Kwon'],
       ['Tom Hanks'],
       ['Emma Watson'],
       ['James Kim'],
       ['John Smith'],
       ['Sarah Lee'],
       ['Maria Garcia'],
       ['David Lee'],
       ['Emily Davis'],
       ['James Wilson'],
       ['Fatima Ahmed'],
       ['Liam Nguyen'],
       ['Giulia Rossi'],
       ['Putra Wijaya'],
       ['Kim Min-ji'],
       ['John Smith'],
       ['Emily Johnson'],
       ['David Lee'],
       ['Sarah Brown'],
       ['Michael Wong'],
       ['Jessica Chen'],
       ['Ken Tanaka'],
       ['Maria Garcia'],
       ['Rodrigo Oliveira'],
       ['Olivia Kim'],
       ['Robert Mueller'],
       ['John Smith'],
       ['Sarah Lee'],
       ['Michael Wong'],
       ['Lisa Chen'],
       ['David Kim'],
       ['Emily Wong'],
       ['Mark Tan'],
       ['Emma Lee'],
       ['George Chen'],
       ['Sophia Kim'],
       ['Alex Ng'],
       ['Alice Smith'],
       ['Bob Johnson'],
       ['Cindy Chen'],
       ['David Lee'],
       ['Emily Kim'],
       ['Frank Li'],
       ['Gina Lee'],
       ['Henry Kim'],
       ['Isabella Chen'],
       ['Jack Smith'],
       ['Katie Johnson'],
       ['John Doe'],
       ['Jane Smith'],
       ['Michael Johnson'],
       ['Sarah Lee'],
       ['David Kim'],
       ['Emily Davis'],
       ['Jose Perez'],
       ['Emma Wilson'],
       ['Ryan Chen'],
       ['Sofia Rodriguez'],
       ['William Brown']], dtype=object)
# Check that the number of sample names matches the number of data rows
column_names_array.shape, len(scaler_data)
((137, 1), 137)

2. Plot dendrograms with different similarity measures and choose the better method from the dendrograms


– Using the first 51 samples of the dataset –

– Distances are computed with the Euclidean metric (euclidean) –

(1) Importing the hierarchical-clustering module and the plotting tools

from scipy.cluster import hierarchy as sch  # hierarchical clustering, performed row-wise (per sample)
import matplotlib.pyplot as plt
import matplotlib; matplotlib.rc('font', family='Microsoft YaHei')  # font capable of rendering CJK labels

(2) Computing cluster similarity with the minimum-distance method (single linkage)

# Build dendrogram 1
Z1 = sch.linkage(scaler_data[:51], metric='euclidean', method='single')

# Draw the dendrogram with the sample names as leaf labels
plt.figure(figsize=(10, 5))
dendrogram = sch.dendrogram(Z1, labels=column_names_array[:51])

# Rotate the leaf labels
plt.xticks(rotation=90, fontsize=10)  # rotate 90 degrees

plt.ylabel('Euclidean distance + single linkage')
plt.title('Hierarchical clustering dendrogram 1')

plt.show()

[Figure: hierarchical clustering dendrogram 1 (Euclidean distance, single linkage)]

(3) Computing cluster similarity with the maximum-distance method (complete linkage)

# Build dendrogram 2
Z2 = sch.linkage(scaler_data[:51], metric='euclidean', method='complete')

plt.figure(figsize=(10, 5))
dendrogram = sch.dendrogram(Z2, labels=column_names_array[:51])

plt.xticks(rotation=90, fontsize=10)

plt.ylabel('Euclidean distance + complete linkage')
plt.title('Hierarchical clustering dendrogram 2')

plt.show()

[Figure: hierarchical clustering dendrogram 2 (Euclidean distance, complete linkage)]

*Note:
The statement Z2 = sch.linkage(scaler_data[:51], metric='euclidean', method='complete')
returns the linkage matrix of the hierarchical clustering tree, a two-dimensional array holding information about the clustering. Each row of the linkage matrix represents one cluster merge and contains:
1. The first two columns: the indices (labels) of the two clusters being merged.
2. The third column: the distance (dissimilarity) between those two clusters.
3. The fourth column: the number of data points in the newly merged cluster.

Note: per point 2, the third column of the linkage matrix gives the height of the tree at each iteration of the model.

This linkage matrix is produced by the hierarchical clustering algorithm as it merges data points into progressively larger clusters; the matrix records the details of every merge operation along the way.
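The row layout described above is easy to verify on toy data. In the sketch below, four 1-D points form two tight pairs: the first two merges happen at distance 1, and the final row merges the two new clusters (indexed 4 and 5, since new clusters are numbered from n = 4 onward) at the complete-linkage distance 11, producing a cluster of 4 points:

```python
import numpy as np
from scipy.cluster import hierarchy as sch

# Four 1-D points: two tight pairs far apart from each other
pts = np.array([[0.0], [1.0], [10.0], [11.0]])

Z = sch.linkage(pts, metric='euclidean', method='complete')

# Each row: [cluster_i, cluster_j, merge distance, size of the new cluster]
print(Z)
```

The third column of the printed matrix is exactly the sequence of tree heights that section 3 below collects into a list.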

(4) Computing cluster similarity with the average-distance method (average linkage)

# Build dendrogram 3
Z3 = sch.linkage(scaler_data[:51], metric='euclidean', method='average')

plt.figure(figsize=(10, 5))
dendrogram = sch.dendrogram(Z3, labels=column_names_array[:51])  # plot Z3 (average linkage), not Z2

plt.xticks(rotation=90, fontsize=10)

plt.ylabel('Euclidean distance + average linkage')
plt.title('Hierarchical clustering dendrogram 3')

plt.show()

[Figure: hierarchical clustering dendrogram 3 (Euclidean distance, average linkage)]

  From the three dendrograms above, computing cluster similarity with complete linkage or average linkage gives better merge distances at each iteration: the levels of the hierarchy separate clearly. Single linkage, by contrast, separates the distance levels less cleanly than the other two. For the cluster computations below, we therefore work with the complete-linkage result (Z2).
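The visual comparison can be backed up numerically with the cophenetic correlation coefficient, which measures how faithfully a dendrogram preserves the original pairwise distances (values closer to 1 are better). A sketch on synthetic two-blob data, not the travel dataset:

```python
import numpy as np
from scipy.cluster import hierarchy as sch
from scipy.spatial.distance import pdist

rng = np.random.default_rng(0)
# Two well-separated Gaussian blobs as toy data
X = np.vstack([rng.normal(0, 1, (20, 3)), rng.normal(8, 1, (20, 3))])

d = pdist(X, metric='euclidean')  # condensed pairwise distances
for method in ('single', 'complete', 'average'):
    Z = sch.linkage(X, metric='euclidean', method=method)
    # Correlation between original distances and dendrogram (cophenetic) distances
    c, _ = sch.cophenet(Z, d)
    print(method, round(c, 3))
```

Running the same comparison on scaler_data[:51] would give a quantitative complement to eyeballing the three dendrograms.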

3. Computing the clustering results

(1) Clustering result for a given threshold: how many clusters there are, and which samples fall in each

# Clustering result for a given threshold
yuzhi = float(input('Enter the threshold: '))
# yuzhi = 7.4  # the cut height, which determines how many clusters the tree is cut into

label = []  # stores the cluster label of each data point

# Outer loop: cut the hierarchical tree at the given height yuzhi;
# cut_tree returns each data point's cluster label, e.g. [0], [1], ...
for i in sch.cut_tree(Z2, height=yuzhi):

    # Inner loop: iterate over the current point's labels and append each to the label list
    for j in i: label.append(j)  # store each sample's label value in the list

labelCount = set(label)  # the set of labels — how many distinct clusters there are

print('Threshold = ' + str(yuzhi) + ' \nNumber of clusters = ' + str(len(list(labelCount))))

# Build the list of traveler names
guest_names = [str(name) for name in column_names_array[:51]]

print('-'*15)
print('Cluster \t Sample index && traveler name')
unique_labels = list(set(label))
if unique_labels:  # check that label is not empty
    for i in unique_labels:  # iterate over the distinct labels
        print(i, '   : ', end='  ')
        for j in range(len(label)):  # iterate over the label list
            if i == label[j]:        # does this sample carry the current label?
                print(j, guest_names[j], end='\t\t')  # if so, print its index and name
            else:
                pass                 # otherwise move on to the next sample
        print()  # newline before the next cluster
else:
    print("No clustering result")  # if label is empty, there is no clustering result

Threshold = 6.5 
Number of clusters = 5
---------------
Cluster 	 Sample index && traveler name
0    :   0 ['John Smith']		2 ['David Lee']		5 ['Michael Brown']		13 ['Hans Mueller']		19 ['Kenji Nakamura']		21 ['James Wilson']		25 ['David Kim']		41 ['John Smith']		
1    :   1 ['Jane Doe']		11 ['Carlos Garcia']		14 ['Fatima Khouri']		15 ['James MacKenzie']		17 ['Michael Chang']		18 ['Olivia Rodriguez']		23 ['Raj Patel']		24 ['Lily Nguyen']		26 ['Maria Garcia']		27 ['Alice Smith']		28 ['Bob Johnson']		36 ['William Davis']		38 ['Mia Johnson']		44 ['Emma Garcia']		45 ['Michael Davis']		48 ['Laura van den Berg']		49 ['Jennifer Nguyen']		
2    :   3 ['Sarah Johnson']		30 ['Emma Davis']		
3    :   4 ['Kim Nguyen']		6 ['Emily Davis']		16 ['Sarah Johnson']		20 ['Emily Lee']		22 ['Sofia Russo']		29 ['Charlie Lee']		31 ['Olivia Martin']		32 ['Harry Wilson']		33 ['Sophia Lee']		37 ['Amelia Brown']		40 ['Sarah Wong']		42 ['Maria Silva']		47 ['Kevin Kim']		50 ['David Kim']		
4    :   7 ['Lucas Santos']		8 ['Laura Janssen']		9 ['Mohammed Ali']		10 ['Ana Hernandez']		12 ['Lily Wong']		34 ['James Brown']		35 ['Mia Johnson']		39 ['Adam Lee']		43 ['Peter Brown']		46 ['Nina Patel']		
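scipy also provides fcluster, which returns flat cluster labels directly (a 1-D array) instead of cut_tree's column vector; with criterion='distance' it cuts the tree at a given height, much like the threshold-based loop above. A toy sketch:

```python
import numpy as np
from scipy.cluster import hierarchy as sch

# Toy data: two tight pairs, merged only at a large distance
pts = np.array([[0.0], [1.0], [10.0], [11.0]])
Z = sch.linkage(pts, metric='euclidean', method='complete')

# fcluster with criterion='distance' keeps every merge whose height
# is at most t, analogous to cut_tree(Z, height=t)
labels = sch.fcluster(Z, t=5.0, criterion='distance')
print(labels, len(set(labels)))
```

This removes the need for the nested loops above: fcluster already yields one label per sample.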

(2) Clustering result for a given number of clusters: which samples fall in each cluster

# Clustering result for a given number of clusters
n = int(input('Enter the number of clusters: '))
# n = 9
label = []
for i in sch.cut_tree(Z2, n_clusters=n):
    for j in i:
        label.append(j)
print('Number of clusters = ' + str(n))

# Build the list of traveler names
guest_names = [str(name) for name in column_names_array[:51]]

print('-'*15)
print('Cluster \t Sample index && traveler name')
unique_labels = list(set(label))
if unique_labels:  # check that label is not empty
    for i in unique_labels:  # iterate over the distinct labels
        print(i, '   : ', end='  ')
        # iterate over label and guest_names together with zip()
        for idx, name in zip(range(len(label)), guest_names):
            if i == label[idx]:  # does this sample carry the current label?
                print(idx, name, end='\t\t')  # if so, print its index and name
        print()  # newline before the next cluster
else:
    print("No clustering result")  # if label is empty, there is no clustering result

Number of clusters = 5
---------------
Cluster 	 Sample index && traveler name
0    :   0 ['John Smith']		2 ['David Lee']		5 ['Michael Brown']		13 ['Hans Mueller']		19 ['Kenji Nakamura']		21 ['James Wilson']		25 ['David Kim']		41 ['John Smith']		
1    :   1 ['Jane Doe']		11 ['Carlos Garcia']		14 ['Fatima Khouri']		15 ['James MacKenzie']		17 ['Michael Chang']		18 ['Olivia Rodriguez']		23 ['Raj Patel']		24 ['Lily Nguyen']		26 ['Maria Garcia']		27 ['Alice Smith']		28 ['Bob Johnson']		36 ['William Davis']		38 ['Mia Johnson']		44 ['Emma Garcia']		45 ['Michael Davis']		48 ['Laura van den Berg']		49 ['Jennifer Nguyen']		
2    :   3 ['Sarah Johnson']		30 ['Emma Davis']		
3    :   4 ['Kim Nguyen']		6 ['Emily Davis']		16 ['Sarah Johnson']		20 ['Emily Lee']		22 ['Sofia Russo']		29 ['Charlie Lee']		31 ['Olivia Martin']		32 ['Harry Wilson']		33 ['Sophia Lee']		37 ['Amelia Brown']		40 ['Sarah Wong']		42 ['Maria Silva']		47 ['Kevin Kim']		50 ['David Kim']		
4    :   7 ['Lucas Santos']		8 ['Laura Janssen']		9 ['Mohammed Ali']		10 ['Ana Hernandez']		12 ['Lily Wong']		34 ['James Brown']		35 ['Mia Johnson']		39 ['Adam Lee']		43 ['Peter Brown']		46 ['Nina Patel']		
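For a fixed number of clusters, scikit-learn's AgglomerativeClustering offers a one-step alternative to cut_tree(Z2, n_clusters=n); linkage='complete' mirrors method='complete' used above. A toy sketch:

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering

# Five 1-D points: two tight pairs plus one outlier
pts = np.array([[0.0], [1.0], [10.0], [11.0], [20.0]])

# linkage='complete' corresponds to scipy's method='complete'
model = AgglomerativeClustering(n_clusters=3, linkage='complete')
labels = model.fit_predict(pts)
print(labels)
```

The label numbering is arbitrary, but the partition matches the dendrogram cut: the two pairs form two clusters and the outlier its own.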

(3) The range of thresholds that yields a given number of clusters

# Collect the merge height of every iteration of the hierarchical clustering model.
# Store the heights in the list TreehighList, with element 0 set to 0 (no merges yet);
# the merge heights are appended starting at element 1,
# so the length of the list equals the number of samples
TreehighList = []
TreehighList.append(0)
for i in range(len(Z2)):
    TreehighList.append(Z2[i, 2])

TreehighList, len(TreehighList)
    
([0,
  1.3207220250706178,
  1.713192288277125,
  1.948618219530062,
  2.1938156142412155,
  2.344051009400656,
  2.385236125346607,
  2.5227496951700554,
  2.5847786704830993,
  2.6609018439884236,
  2.664531638770863,
  2.7194643386552477,
  2.7231357411789694,
  2.838154747820297,
  2.959983314668614,
  3.014549824867137,
  3.0870393522464195,
  3.21498864327589,
  3.228770291629527,
  3.242982467896726,
  3.293801779310498,
  3.346960826020779,
  3.396800112724491,
  3.4031309441625845,
  3.4379960453193914,
  3.7387064608965064,
  3.7774664570067644,
  3.7988233271953695,
  3.843327398012852,
  3.9861651879658195,
  4.2197659666669365,
  4.291448853880782,
  4.463605246207876,
  4.6155892525418025,
  4.782801321694507,
  4.7839476461738535,
  4.793592690309439,
  4.851137858942306,
  4.945266643885942,
  4.994082102721412,
  5.214885156903247,
  5.29249457737719,
  5.356639595890159,
  5.835802656747644,
  5.967735673534881,
  6.1319341491966295,
  6.340049875642774,
  6.650802835756102,
  6.85452516718054,
  7.432114871942092,
  8.118708906268715],
 51)
# Get the height of the top of the tree
top = TreehighList[-1]  # the last element of the list (negative indexing)
top
8.118708906268715
# Number of merges: number of samples = number of merges + 1 = rows of the linkage matrix + 1
linkCounts = len(Z2)
linkCounts
50
# Function that solves for the threshold range
def ThresholdRange(n):
    # Requested clusters > number of samples: invalid input
    if n > len(TreehighList):
        return "No threshold yields {} clusters!".format(n)

    # Requested clusters == 1: any threshold >= the top height works
    elif n == 1:
        y = top
        return "Threshold range for {} cluster(s): [{} , INF).".format(n, y)


    else:
        # list index of the lower bound
        low = (len(TreehighList)) - n

        # list index of the upper bound
        high = low + 1

    # closed on the left, open on the right
    return "Threshold range for {} cluster(s): [{} , {}).".format(n, TreehighList[low], TreehighList[high])
n = int(input("Enter the number of clusters: "))
# n=46
ThresholdRange(n)
'Threshold range for 9 cluster(s): [5.356639595890159 , 5.835802656747644).'
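The interval ThresholdRange reports can be sanity-checked with cut_tree on toy data: cutting anywhere inside [low, high) yields exactly n clusters (a sketch assuming the merge heights around the cut are distinct):

```python
import numpy as np
from scipy.cluster import hierarchy as sch

# Toy data with merge heights 1, 1, 11, 25 under complete linkage
pts = np.array([[0.0], [1.0], [10.0], [11.0], [25.0]])
Z = sch.linkage(pts, metric='euclidean', method='complete')

heights = [0.0] + list(Z[:, 2])   # merge heights; index 0 = no merges yet
n = 3                             # desired number of clusters
low = heights[len(heights) - n]   # lower bound of the threshold range
high = heights[len(heights) - n + 1]  # upper bound

# Any cut height strictly between the two merge heights gives n clusters
mid = (low + high) / 2
labels = sch.cut_tree(Z, height=mid).ravel()
print(len(set(labels)))
```

Here low = 1.0 and high = 11.0, and cutting at their midpoint leaves the two low-distance pairs merged but the distant point separate, i.e. exactly 3 clusters.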