FCM: Detailed Formula Derivation and Code

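FCM (fuzzy C-means) is a fuzzy clustering method: instead of assigning each sample to exactly one cluster, it gives every sample $x_i$ a probability-like membership degree $\mu_j(x_i)$ in each cluster $j$. With $N$ samples, $C$ clusters, cluster centers $m_j$ and a fuzziness exponent $b>1$, its criterion function is:
$$J=\sum_{j=1}^{C}\sum_{i=1}^{N}[\mu_j(x_i)]^b\,\|x_i-m_j\|^2 \quad \text{s.t.} \quad \sum_{j=1}^{C}\mu_j(x_i)=1$$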
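As a small numerical illustration of the criterion, the sketch below evaluates $J$ for given memberships and centers. The function name `fcm_objective` and the array layout (`U[i, j]` holds $\mu_j(x_i)$) are assumptions made for this example and are not part of the original code.

```python
import numpy as np

def fcm_objective(X, M, U, b=2.0):
    """Evaluate J = sum_j sum_i U[i, j]**b * ||X[i] - M[j]||**2.

    X: (N, d) samples, M: (C, d) centers, U: (N, C) memberships whose rows sum to 1.
    """
    X, M, U = np.asarray(X, float), np.asarray(M, float), np.asarray(U, float)
    # squared Euclidean distance from every sample to every center, shape (N, C)
    d2 = np.sum((X[:, None, :] - M[None, :, :]) ** 2, axis=2)
    return float(np.sum(U ** b * d2))

# two 1-D samples sitting exactly on the two centers, with slightly fuzzy memberships
X = np.array([[0.0], [1.0]])
M = np.array([[0.0], [1.0]])
U = np.array([[0.8, 0.2], [0.2, 0.8]])
J = fcm_objective(X, M, U, b=2.0)  # 0.2**2 * 1 + 0.2**2 * 1 = 0.08
```

Only the off-diagonal terms contribute here, because each sample has zero distance to its own center.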
Next, the constraint is folded into $J$ using Lagrange multipliers $\lambda_i$ (one per sample), which gives the loss function:
$$J=\sum_{j=1}^{C}\sum_{i=1}^{N}[\mu_j(x_i)]^b\,\|x_i-m_j\|^2+\sum_{i=1}^{N}\lambda_i\left(\sum_{j=1}^{C}\mu_j(x_i)-1\right)$$

With this in place, we take the partial derivatives of $J$ with respect to $m_j$ and $\mu_j(x_i)$ and set them to zero. For $m_j$:
$$\frac{\partial J}{\partial m_j}=\sum_{i=1}^{N}\mu_j(x_i)^b\,\bigl(-2(x_i-m_j)\bigr)=0$$
Solving for $m_j$ (the sums run over the $N$ samples) gives
$$m_j=\frac{\sum_{i=1}^{N}\mu_j(x_i)^b\,x_i}{\sum_{i=1}^{N}\mu_j(x_i)^b}$$
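The center update is a weighted mean of the samples, with weights $\mu_j(x_i)^b$. A minimal NumPy sketch follows; the name `update_centers` and the `(N, C)` layout of `U` are assumptions chosen for illustration.

```python
import numpy as np

def update_centers(X, U, b=2.0):
    """m_j = sum_i U[i, j]**b * X[i] / sum_i U[i, j]**b, for every cluster j."""
    X, U = np.asarray(X, float), np.asarray(U, float)
    W = U ** b                                   # (N, C) weights
    return (W.T @ X) / W.sum(axis=0)[:, None]    # (C, d) centers

# with hard (0/1) memberships the update reduces to the ordinary cluster mean
X = np.array([[0.0, 0.0], [2.0, 0.0], [10.0, 10.0]])
U = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
centers = update_centers(X, U)  # [[1, 0], [10, 10]]
```

With fuzzy memberships every sample contributes to every center, but samples with a small membership are down-weighted further by the exponent $b$.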
Next, the partial derivative with respect to $\mu_j(x_i)$:
$$\frac{\partial J}{\partial \mu_j(x_i)}=b\,\mu_j(x_i)^{b-1}\|x_i-m_j\|^2+\lambda_i=0$$
Solving for $\mu_j(x_i)$ gives
$$\mu_j(x_i)=\left(\frac{-\lambda_i}{b}\right)^{\frac{1}{b-1}}\|x_i-m_j\|^{-\frac{2}{b-1}}$$
$\lambda_i$ is unknown, but we do know that $\sum_{j=1}^{C}\mu_j(x_i)=1$. Dividing $\mu_j(x_i)$ by this sum (which equals 1) cancels the factor containing $\lambda_i$, so the membership can still be solved for:
$$\mu_j(x_i)=\frac{\left(\frac{-\lambda_i}{b}\right)^{\frac{1}{b-1}}\|x_i-m_j\|^{-\frac{2}{b-1}}}{\sum_{k=1}^{C}\left(\frac{-\lambda_i}{b}\right)^{\frac{1}{b-1}}\|x_i-m_k\|^{-\frac{2}{b-1}}}=\frac{\|x_i-m_j\|^{-\frac{2}{b-1}}}{\sum_{k=1}^{C}\|x_i-m_k\|^{-\frac{2}{b-1}}}$$
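To make the membership formula concrete, here is a small sketch that evaluates it for one sample and two centers. The name `update_memberships` is hypothetical, and the code assumes that no sample coincides exactly with a center (otherwise the distance is zero and the power is undefined).

```python
import numpy as np

def update_memberships(X, M, b=2.0):
    """U[i, j] = ||x_i - m_j||^(-2/(b-1)) / sum_k ||x_i - m_k||^(-2/(b-1))."""
    X, M = np.asarray(X, float), np.asarray(M, float)
    # squared distances ||x_i - m_j||^2, shape (N, C)
    d2 = np.sum((X[:, None, :] - M[None, :, :]) ** 2, axis=2)
    w = d2 ** (-1.0 / (b - 1.0))                 # equals ||x_i - m_j||^(-2/(b-1))
    return w / w.sum(axis=1, keepdims=True)      # each row sums to 1

# one sample at 0, centers at 1 and 2: squared distances 1 and 4
X = np.array([[0.0]])
M = np.array([[1.0], [2.0]])
U = update_memberships(X, M, b=2.0)  # weights 1 and 0.25, so memberships 0.8 and 0.2
```

With $b=2$ the membership is inversely proportional to the squared distance, so the center that is twice as far away receives a quarter of the weight.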
With these preparations done, the complete FCM workflow is:

  1. Initialize the parameters $b$ and $m_j$.
  2. Use $m_j$ to update $\mu_j(x_i)$.
  3. Use the new $\mu_j(x_i)$ to update $m_j$.
  4. Check whether the new $m_j$ and the old $m_j$ are approximately equal; if not, return to step 2, otherwise go to step 5.
  5. Assign each sample to the cluster in which its membership $\mu_j(x_i)$ is largest.
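The five steps above can be sketched as one compact function before turning to the full script. This is an illustrative sketch under stated assumptions, not the script's own code: the name `fcm`, the `max_iter` cap and the small clamp on zero distances are additions for this example, and the initial centers are simply passed in by the caller.

```python
import numpy as np

def fcm(X, M0, b=2.0, tol=1e-5, max_iter=300):
    """Run fuzzy C-means from initial centers M0; returns (centers, memberships, labels)."""
    X = np.asarray(X, float)
    M = np.asarray(M0, float)                    # step 1: initial centers (b is an argument)
    for _ in range(max_iter):
        # step 2: update memberships from the current centers
        d2 = np.sum((X[:, None, :] - M[None, :, :]) ** 2, axis=2)
        d2 = np.maximum(d2, 1e-12)               # assumption: clamp to avoid dividing by zero
        w = d2 ** (-1.0 / (b - 1.0))
        U = w / w.sum(axis=1, keepdims=True)
        # step 3: update centers from the new memberships
        W = U ** b
        new_M = (W.T @ X) / W.sum(axis=0)[:, None]
        # step 4: stop once the centers barely move
        shift = np.sum((new_M - M) ** 2)
        M = new_M
        if shift <= tol:
            break
    # step 5: assign each sample to its highest-membership cluster
    return M, U, U.argmax(axis=1)

# two well-separated groups of four points; start from one sample of each group
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1],
              [10, 10], [10, 11], [11, 10], [11, 11]], dtype=float)
M, U, labels = fcm(X, M0=X[[0, 7]])
```

As in the script below, the returned memberships are the ones computed from the centers before the final update. Because the loop stops only when the centers have nearly stopped moving, the difference is small.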
# -*- coding: utf-8 -*-
"""
FCM

@author: ASUS
"""
from sklearn import datasets
from sklearn.decomposition import PCA
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

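def FCM(K,X,B):
    """Return the (N, C) membership matrix mu_j(x_i) for centers K, samples X and exponent B."""
    # squared Euclidean distance from every sample to every center, shape (N, C)
    dist2 = np.sum((X[:,None,:] - K[None,:,:])**2, axis = 2)
    # unnormalized memberships ||x_i - m_j||^(-2/(B-1)); undefined if a sample coincides with a center
    weight = dist2**(-1/(B-1))
    # normalize so that the memberships of each sample sum to 1
    return weight/np.sum(weight, axis = 1, keepdims = True)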

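def precision(Y,Y_predict):
    # fraction of samples whose predicted cluster index equals the true label (i.e. accuracy);
    # this only makes sense when cluster j corresponds to class j
    return len(np.where(Y == Y_predict)[0])/len(Y)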


iris_datas = datasets.load_iris()
Y = iris_datas.target
X_re = iris_datas.data
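X = (X_re -np.min(X_re,axis = 0))/ (np.max(X_re,axis = 0)-np.min(X_re,axis = 0))     # min-max normalization to [0, 1]
# choose the initial centers: the mean of each block of 50 rows
# (iris rows are ordered by class, so these are the class means and cluster j matches class j)
K = np.array([np.average(X[50*i:50*(i+1)],axis = 0) for i in range(len(np.unique(Y)))])
Y_predict = np.zeros((len(Y),),dtype = np.uint32)                    # placeholder for the predicted labels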

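# run FCM
B = 2                                          # fuzziness exponent b
probility = np.ones((len(Y),len(K))) * 0.5     # placeholder, overwritten in the first iteration
theta = 1                                      # squared shift of the centers between iterations
count = 0                                      # iteration counter
while theta > 0.00001:
    probility = FCM(K,X,B)                     # step 2: update memberships from the centers
    weight = probility**B
    new_K = weight.T.dot(X)/np.sum(weight,axis = 0).reshape(-1,1)   # step 3: update centers
    theta = np.sum((K-new_K)**2)               # step 4: convergence check
    K = new_K
    count += 1

# step 5: assign each sample to the cluster with the largest membership
Y_predict = np.argmax(probility,axis = 1)
print('FCM accuracy on the iris dataset: '+str(round(precision(Y,Y_predict)*100,2))+'%')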



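# sonar dataset
sonar_datas = pd.read_csv('d:/microsoft/sonar.csv',header= None)   # local path to the sonar CSV (no header row)
sonar_datas[61] = 0                                                # numeric label column, 0 by default

sonar_datas.loc[np.where(sonar_datas[60]=='M')[0],61] = 1          # 1 where the class in column 60 is 'M'

Y = np.array(sonar_datas[61])
X = np.array(sonar_datas.iloc[:,:60])                              # the 60 feature columns
X = (X -np.min(X,axis = 0))/ (np.max(X,axis = 0)-np.min(X,axis = 0))   # min-max normalization to [0, 1]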


# pca = PCA(0.95)
# X = pca.fit_transform(X)
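# choose the initial centers: the mean of the first 97 rows and the mean of the remaining rows
# (these are the class means only if the file lists one class in rows 0-96 and the other afterwards)
K = np.array([np.average(X[:97],axis =0),np.average(X[97:],axis =0)])
# choose the initial centers, using the cluster centers obtained in FCM
Y_predict = np.zeros((len(Y),),dtype = np.uint32)                    # placeholder for the predicted labels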
B = 2
probility = np.ones((len(Y),len(K))) * 0.5

theta = 1
count = 0
while theta > 0.00001:
    probility = FCM(K,X,B)                     # step 2: update memberships from the centers
    weight = probility**B
    new_K = weight.T.dot(X)/np.sum(weight,axis = 0).reshape(-1,1)   # step 3: update centers
    theta = np.sum((K-new_K)**2)               # step 4: convergence check
    K = new_K
    count += 1

# step 5: assign each sample to the cluster with the largest membership
Y_predict = np.argmax(probility,axis = 1)
print('FCM accuracy on the sonar dataset: '+str(round(precision(Y,Y_predict)*100,2))+'%')
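The `precision` function compares cluster indices with class labels directly, which works in the script because the initial centers are built from the labelled class blocks. If the centers were initialized without labels, the cluster numbering would be arbitrary. One possible workaround, sketched below, is to score the best one-to-one relabeling. The helper `best_match_accuracy` is hypothetical; it assumes there are no more clusters than classes and tries every permutation, so it only suits a handful of clusters.

```python
from itertools import permutations
import numpy as np

def best_match_accuracy(y_true, y_pred):
    """Accuracy under the best one-to-one mapping from cluster indices to class labels."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    classes = np.unique(y_true)
    clusters = np.unique(y_pred)
    best = 0.0
    # try every assignment of distinct class labels to the observed clusters
    for perm in permutations(classes, len(clusters)):
        mapping = dict(zip(clusters, perm))
        mapped = np.array([mapping[c] for c in y_pred])
        best = max(best, float(np.mean(mapped == y_true)))
    return best

# clusters 0 and 1 are swapped relative to the labels, but the grouping is perfect
acc_swapped = best_match_accuracy([0, 0, 1, 1, 2, 2], [1, 1, 0, 0, 2, 2])
# one sample of the first class ends up in the wrong group
acc_partial = best_match_accuracy([0, 0, 1, 1], [1, 1, 1, 0])
```

In the first call a direct comparison would count only the last two samples as correct, while the relabeled score is 1.0. In the second call the best mapping recovers three of the four samples.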