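While using hierarchical clustering recently, I took a quick look at the linkage methods used when merging clusters, and this post is a brief summary of them.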
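In 1963, J. H. Ward proposed using the error sum of squares (ESS), which can also be read as the amount of information lost, as the objective function for deciding how small clusters should be merged, step by step, into a larger cluster. In his paper he points out that the error sum of squares after a merge should be as small as possible; in other words, the best merge is the one that loses the least information.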
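To make the criterion concrete, here is a minimal sketch, assuming points are stored as tuples of floats and distances are Euclidean. The names `ess` and `best_merge` are illustrative and do not come from Ward's paper; the brute-force search over all pairs is only meant to show the idea, not to be efficient.

```python
from itertools import combinations

def ess(cluster):
    """Error sum of squares: squared deviations of the points from the cluster mean."""
    n = len(cluster)
    dim = len(cluster[0])
    mean = [sum(p[d] for p in cluster) / n for d in range(dim)]
    return sum((p[d] - mean[d]) ** 2 for p in cluster for d in range(dim))

def best_merge(clusters):
    """Return (i, j, increase) for the pair whose union raises the total ESS the least."""
    best = None
    for i, j in combinations(range(len(clusters)), 2):
        # Increase in the objective caused by merging clusters i and j.
        increase = ess(clusters[i] + clusters[j]) - ess(clusters[i]) - ess(clusters[j])
        if best is None or increase < best[2]:
            best = (i, j, increase)
    return best

# Three singleton clusters; the two nearby points should be merged first.
clusters = [[(0.0, 0.0)], [(0.0, 1.0)], [(5.0, 5.0)]]
i, j, increase = best_merge(clusters)
print(i, j, increase)  # prints: 0 1 0.5
```

Repeating this selection on the reduced set of clusters until one cluster remains gives the full agglomerative procedure. In practice a library routine such as SciPy's `scipy.cluster.hierarchy.linkage` with `method='ward'` would normally be used instead.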
(1) The first step in grouping is to select two of these n subsets which, when united, will reduce by one the number of subsets while producing the least impairment of the optimal value of the objective function.
(2) The n-1 resulting subsets then are examined to determine if a third member should be united with the first pair or another pairing made in order to secure the optimal value of the objective function for n-2 groups.
(3) This procedure can be continued until all