T-linkage and J-linkage

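J-linkage is a multi-structure clustering algorithm that automatically selects, for each structure in the data, a model with suitable parameters, as illustrated in the figure below:
[Figure]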
Definitions
consensus set (CS) of a model: the set of points whose distance from the model is less than a threshold ε
preference set (PS) of a point: the set of models that the point prefers, i.e., the models whose consensus set contains the point
PS of a cluster: the intersection of the preference sets of its points
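
To make the three definitions concrete, here is a minimal sketch on a toy residual table. The function name `preference_matrix`, the residual values, and the threshold `eps=0.5` are all made up for illustration; the only assumption is that the distance from every point to every model has already been computed.

```python
import numpy as np

def preference_matrix(residuals, eps):
    """residuals[i, j] is the distance of point i from model j.
    Returns a boolean matrix P with P[i, j] True when point i
    lies in the consensus set of model j (distance < eps)."""
    return np.asarray(residuals, dtype=float) < eps

# Hypothetical distances of 3 points from 3 models.
residuals = np.array([
    [0.1, 2.0,  0.3],
    [0.2, 1.5,  3.0],
    [4.0, 0.05, 0.1],
])
P = preference_matrix(residuals, eps=0.5)

# CS of model 0: the points (rows) that are True in column 0.
cs_model_0 = set(np.flatnonzero(P[:, 0]).tolist())

# PS of point 0: the models (columns) that are True in row 0.
ps_point_0 = set(np.flatnonzero(P[0]).tolist())

# PS of the cluster {0, 1}: models preferred by every point in it.
ps_cluster_01 = set(np.flatnonzero(P[[0, 1]].all(axis=0)).tolist())

print(cs_model_0, ps_point_0, ps_cluster_01)
```

Columns of `P` describe consensus sets and rows describe preference sets, so both notions are read off the same matrix.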

Jaccard distance between two sets A and B:
$$d_J(A, B) = \frac{|A \cup B| - |A \cap B|}{|A \cup B|}$$
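
As a small illustration, the Jaccard distance between two preference sets can be sketched with plain Python sets. The function name is illustrative, and the handling of two empty sets (returning 1 so that points with no preferred model are never merged) is an assumption of this sketch, not something stated above.

```python
def jaccard_distance(a, b):
    """Jaccard distance (|A ∪ B| - |A ∩ B|) / |A ∪ B| between two sets."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Assumption: two empty preference sets are treated as
        # maximally distant rather than identical.
        return 1.0
    return (len(union) - len(a & b)) / len(union)

print(jaccard_distance({1, 2, 3}, {2, 3, 4}))  # union has 4 items, intersection 2
print(jaccard_distance({1, 2}, {1, 2}))        # identical sets
print(jaccard_distance({1, 2}, {3, 4}))        # disjoint sets
```

The distance ranges from 0 (identical sets) to 1 (disjoint sets).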
Method

  1. M model hypotheses are generated by drawing M minimal sample sets, each containing the minimum number of data points needed to estimate the model.
  2. The consensus set (CS) of each model is computed by measuring the distance of each of the N points from each of the M models. This yields an N × M matrix whose entry (i, j) is 1 if point i belongs to the CS of model j and 0 otherwise. Each column of the matrix is the characteristic function of the CS of a model hypothesis. Each row indicates which models a point has given consensus to, i.e., which models it prefers. From this matrix, the PS of each cluster (the intersection of the preference sets of its points) can be computed.
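  3. Each cluster is represented by its PS, and the distance between two clusters is the Jaccard distance between their PSs.
  4. The points are grouped by agglomerative clustering, starting with every point in its own cluster:
    (1) Among all current clusters, pick the two clusters with the smallest Jaccard distance between their respective PSs.
    (2) Replace these two clusters with their union.
    Repeat these two steps until the smallest Jaccard distance is 1, i.e., until no two clusters have overlapping PSs.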
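
The four steps can be sketched end to end as follows. This is a simplified illustration, not a reference implementation: the model is assumed to be a 2D line (minimal sample set of two points), hypotheses are sampled uniformly at random, and the names `line_preference_matrix` and `j_linkage`, the threshold, and the toy data are all hypothetical. The merge loop is a naive search over all cluster pairs, chosen for readability rather than speed.

```python
import numpy as np

def line_preference_matrix(points, n_hypotheses, eps, rng):
    """Steps 1-2: sample 2-point minimal sets, fit a line through each,
    and mark which points fall within eps of each line."""
    points = np.asarray(points, dtype=float)
    n = len(points)
    P = np.zeros((n, n_hypotheses), dtype=bool)
    for j in range(n_hypotheses):
        a, b = points[rng.choice(n, size=2, replace=False)]
        direction = b - a
        normal = np.array([-direction[1], direction[0]])
        norm = np.linalg.norm(normal)
        if norm == 0:
            continue  # coincident sample points: leave the column empty
        residuals = np.abs((points - a) @ normal) / norm
        P[:, j] = residuals < eps
    return P

def jaccard_distance(a, b):
    """Jaccard distance between two boolean preference vectors."""
    union = np.count_nonzero(a | b)
    if union == 0:
        return 1.0  # assumption: empty preference sets never merge
    return (union - np.count_nonzero(a & b)) / union

def j_linkage(P):
    """Steps 3-4: agglomerative clustering on preference sets."""
    P = np.asarray(P, dtype=bool)
    clusters = [[i] for i in range(len(P))]  # every point starts alone
    prefs = [row.copy() for row in P]        # PS of each cluster
    while len(clusters) > 1:
        best, best_pair = 1.0, None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = jaccard_distance(prefs[i], prefs[j])
                if d < best:
                    best, best_pair = d, (i, j)
        if best_pair is None:
            break  # smallest distance is 1: no PSs overlap
        i, j = best_pair
        clusters[i] = clusters[i] + clusters[j]  # union of the clusters
        prefs[i] = prefs[i] & prefs[j]           # PS of the union
        del clusters[j], prefs[j]
    return clusters

# Hand-written preference matrix: 6 points, 4 model hypotheses.
P_toy = np.array([
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
], dtype=bool)
toy_clusters = j_linkage(P_toy)
print(toy_clusters)

# Two noise-free lines, y = 0 and y = x + 5, with random hypotheses.
pts = [(x, 0.0) for x in range(5)] + [(x, x + 5.0) for x in range(5)]
P_lines = line_preference_matrix(pts, 50, 0.1, np.random.default_rng(0))
line_clusters = j_linkage(P_lines)
print(line_clusters)
```

In the hand-written example, the point whose row is all zeros prefers no model and therefore stays in a cluster of its own, which is how the procedure isolates outliers. The result on the random-hypothesis example depends on which minimal sets happen to be drawn.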


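T-linkage replaces the PS with a preference function (PF), a soft preference: in a PS each preference is either 0 or 1, whereas in a PF each preference is a value between 0 and 1. Similarity is correspondingly measured with the Tanimoto distance between two preference vectors p and q:
$$d_T(p, q) = 1 - \frac{\langle p, q \rangle}{\|p\|^2 + \|q\|^2 - \langle p, q \rangle}$$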
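
A minimal sketch of the Tanimoto distance on preference vectors, using the standard form 1 - <p, q> / (||p||^2 + ||q||^2 - <p, q>). The function name and the convention for two all-zero vectors are assumptions made here for illustration. On 0/1 vectors the value coincides with the Jaccard distance, which is why T-linkage can be seen as a continuous relaxation of J-linkage.

```python
import numpy as np

def tanimoto_distance(p, q):
    """Tanimoto distance between two preference vectors in [0, 1]^M."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    dot = p @ q
    denom = p @ p + q @ q - dot
    if denom == 0:
        # Assumption: two all-zero vectors are treated as maximally distant.
        return 1.0
    return 1.0 - dot / denom

# Binary vectors: same value as the Jaccard distance of {0, 1} and {1, 2}.
d_binary = tanimoto_distance([1, 1, 0], [0, 1, 1])

# Soft preferences between 0 and 1.
d_soft = tanimoto_distance([1.0, 0.5], [0.5, 1.0])

print(d_binary, d_soft)
```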
