Classic Recommendation Algorithm Implementations: BPMF (Python + MovieLens)

The sampling issue in my previous post (https://blog.csdn.net/fjssharpsword/article/details/97000479) remained unresolved. I then found the BPMF code below on GitHub; it uses a Wishart prior and its performance matches pymc3.

Reference: https://github.com/LoryPack/BPMF
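For context, this is the generative model the script implements (from Salakhutdinov and Mnih's BPMF paper; the notation matches the variable names in the code below). Each rating is Gaussian around the inner product of a user factor and a movie factor, and the factor priors get Normal-Wishart hyperpriors:

$$ R_{ij} \mid U_i, V_j \sim \mathcal{N}(U_i^{\top} V_j,\ \alpha^{-1}), \qquad U_i \sim \mathcal{N}(\mu_U, \Lambda_U^{-1}), \qquad V_j \sim \mathcal{N}(\mu_V, \Lambda_V^{-1}) $$

$$ (\mu_U, \Lambda_U) \sim \mathcal{NW}(\mu_0, \beta_0, W_0, \nu_0), \qquad (\mu_V, \Lambda_V) \sim \mathcal{NW}(\mu_0, \beta_0, W_0, \nu_0) $$

Gibbs sampling alternates between drawing the hyperparameters given the current factors and drawing each factor given the hyperparameters; the conditional for user $i$ (mirrored for movies) is

$$ U_i \sim \mathcal{N}\big(\mu_i^{*}, (\Lambda_i^{*})^{-1}\big), \quad \Lambda_i^{*} = \Lambda_U + \alpha \sum_{j:\, R_{ij} \neq 0} V_j V_j^{\top}, \quad \mu_i^{*} = (\Lambda_i^{*})^{-1}\Big(\alpha \sum_{j:\, R_{ij} \neq 0} R_{ij} V_j + \Lambda_U \mu_U\Big) $$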

# coding:utf-8  
'''
@author: Jason.F
@date: 2019.08.01
@function: baseline BPMF (Bayesian Probabilistic Matrix Factorization)
           Dataset: MovieLens-1m: https://grouplens.org/datasets/movielens/
           Evaluation: RMSE
'''

import numpy as np
import random
import pandas as pd
from numpy.random import multivariate_normal
from scipy.stats import wishart

class DataSet:
    def __init__(self):
        self.trainset, self.testset, self.maxu, self.maxi, self.maxr = self._getDataset_as_list()
        
    def _getDataset_as_list(self):
        #trainset
        filePath = "/data/fjsdata/BMF/ml-1m.train.rating" 
        data = pd.read_csv(filePath, sep='\t', header=None, names=['user', 'item', 'rating'], \
                                 usecols=[0, 1, 2], dtype={0: np.int32, 1: np.int32, 2: np.float64})
        maxu, maxi, maxr = data['user'].max()+1, data['item'].max()+1, data['rating'].max()
        print('Dataset Statistics: Interaction = %d, User = %d, Item = %d, Sparsity = %.4f' % \
                  (data.shape[0], maxu, maxi, data.shape[0]/(maxu*maxi)))
        trainset = data.values.tolist()
        #testset
        filePath = "/data/fjsdata/BMF/ml-1m.test.rating" 
        data = pd.read_csv(filePath, sep='\t', header=None, names=['user', 'item', 'rating'], \
                                 usecols=[0, 1, 2], dtype={0: np.int32, 1: np.int32, 2: np.float64})
        testset = data.values.tolist()
        return trainset, testset, maxu, maxi, maxr 
    
    def list_to_matrix(self, dataset, maxu, maxi):
        dataMat = np.zeros([maxu, maxi], dtype=np.float32)
        for u, i, r in dataset:
            dataMat[int(u)][int(i)] = float(r)
        return dataMat
    
def Normal_Wishart(mu_0, lamb, W, nu, seed=None):
    """Draw a sample (mu, Lambda) from a Normal-Wishart distribution."""
    # first draw the precision matrix Lambda from a Wishart distribution:
    Lambda = wishart(df=nu, scale=W, seed=seed).rvs()  # NB: Lambda is a DxD matrix.
    # then draw mu from a multivariate Gaussian with mean mu_0 and covariance (lamb*Lambda)^{-1}.
    cov = np.linalg.inv(lamb * Lambda)  # this inversion is the bottleneck
    mu = multivariate_normal(mu_0, cov)
    return mu, Lambda, cov

def BPMF(R, R_test, U_in, V_in, T, D, initial_cutoff, lowest_rating, highest_rating,
         mu_0=None, Beta_0=None, W_0=None, nu_0=None):
    """
    R is the ranking matrix (NxM, N=#users, M=#movies); we are assuming that R[i,j]=0 means that user i has not ranked movie j
    R_test is the ranking matrix that contains test values. Same assumption as above. 
    U_in, V_in are the initial values for the MCMC procedure. 
    T is the number of steps. 
    D is the number of hidden features that are assumed in the model.    
    
    mu_0 is the average vector used in sampling the multivariate normal variable
    Beta_0 is a coefficient (?)
    W_0 is the DxD scale matrix in the Wishart sampling 
    nu_0 is the number of degrees of freedom used in the Wishart sampling. 
    
    U matrices are DxN, while V matrices are DxM.
    
    """

    def ranked(i, j):  # True if user i rated movie j in the train dataset.
        return R[i, j] != 0

    def ranked_test(i, j):  # True if user i rated movie j in the test dataset.
        return R_test[i, j] != 0

    N = R.shape[0]
    M = R.shape[1]

    R_predict = np.zeros((N, M))
    U_old = np.array(U_in)
    V_old = np.array(V_in)

    train_err_list = []
    test_err_list = []
    train_epoch_list = []

    # initialize now the hierarchical priors:
    alpha = 2  # observation precision; fixed to 2 as in the paper
    mu_u = np.zeros((D, 1))
    mu_v = np.zeros((D, 1))
    Lambda_U = np.eye(D)
    Lambda_V = np.eye(D)

    # COUNT HOW MANY PAIRS ARE IN THE TRAIN AND TEST SETS:
    pairs_train = np.count_nonzero(R)
    pairs_test = np.count_nonzero(R_test)

    # SET THE DEFAULT VALUES for Wishart distribution
    # we assume that parameters for both U and V are the same.

    if mu_0 is None:
        mu_0 = np.zeros(D)
    if nu_0 is None:
        nu_0 = D
    if Beta_0 is None:
        Beta_0 = 2
    if W_0 is None:
        W_0 = np.eye(D)
        
    for t in range(T):
        # FIRST SAMPLE THE HYPERPARAMETERS, conditioned on the current user and movie feature matrices U_t and V_t:

        W_0_inv = np.linalg.inv(W_0)  # compute the inverse once and for all

        # movie hyperparameters (the posterior counts on this side use M, the number of movies):
        Beta_0_star_V = Beta_0 + M
        nu_0_star_V = nu_0 + M
        V_average = np.sum(V_old, axis=1) / M  # 1d array of length D
        S_bar_V = np.dot(V_old, np.transpose(V_old)) / M  # DxD second-moment matrix
        mu_0_star_V = (Beta_0 * mu_0 + M * V_average) / (Beta_0 + M)
        W_0_star_V_inv = W_0_inv + M * S_bar_V + Beta_0 * M / (Beta_0 + M) * np.dot(
            np.transpose(np.array(mu_0 - V_average, ndmin=2)), np.array((mu_0 - V_average), ndmin=2))
        W_0_star_V = np.linalg.inv(W_0_star_V_inv)
        mu_V, Lambda_V, cov_V = Normal_Wishart(mu_0_star_V, Beta_0_star_V, W_0_star_V, nu_0_star_V, seed=None)

        # user hyperparameters (the posterior counts on this side use N, the number of users):
        Beta_0_star_U = Beta_0 + N
        nu_0_star_U = nu_0 + N
        U_average = np.sum(U_old, axis=1) / N  # 1d array of length D
        S_bar_U = np.dot(U_old, np.transpose(U_old)) / N  # DxD second-moment matrix
        mu_0_star_U = (Beta_0 * mu_0 + N * U_average) / (Beta_0 + N)
        W_0_star_U_inv = W_0_inv + N * S_bar_U + Beta_0 * N / (Beta_0 + N) * np.dot(
            np.transpose(np.array(mu_0 - U_average, ndmin=2)), np.array((mu_0 - U_average), ndmin=2))
        W_0_star_U = np.linalg.inv(W_0_star_U_inv)
        mu_U, Lambda_U, cov_U = Normal_Wishart(mu_0_star_U, Beta_0_star_U, W_0_star_U, nu_0_star_U, seed=None)

        """SAMPLE THEN USER FEATURES (possibly in parallel):"""

        U_new = np.array([])  # define the new stuff.
        V_new = np.array([])

        for i in range(N):  # loop over the users
            # first compute the parameters of the distribution
            Lambda_U_2 = np.zeros((D, D))  # second term in the construction of Lambda_U
            mu_i_star_1 = np.zeros(D)  # first piece of mu_i_star
            for j in range(M):  # loop over the movies
                if ranked(i, j):  # only if movie j has been ranked by user i!
                    Lambda_U_2 = Lambda_U_2 + np.dot(np.transpose(np.array(V_old[:, j], ndmin=2)),
                                                     np.array((V_old[:, j]), ndmin=2))  # CHECK
                    mu_i_star_1 = V_old[:, j] * R[i, j] + mu_i_star_1  # CHECK DIMENSIONALITY!!!!!!!!!!!!
                    # coeff=np.transpose(np.array(V_old[j]*R[i,j], ndmin=2))+coeff  #CHECK DIMENSIONALITY!!!!!!!!!!!!

            Lambda_i_star_U = Lambda_U + alpha * Lambda_U_2
            Lambda_i_star_U_inv = np.linalg.inv(Lambda_i_star_U)

            mu_i_star_part = alpha * mu_i_star_1 + np.dot(Lambda_U,
                                                          mu_U)  ###CAREFUL!! Multiplication matrix times a row vector!! It should give as an output a row vector as for how it works
            mu_i_star = np.dot(Lambda_i_star_U_inv, mu_i_star_part)
            # extract now the U values!
            U_new = np.append(U_new, multivariate_normal(mu_i_star, Lambda_i_star_U_inv))

        # you need to reshape U_new and transpose it!!
        U_new = np.transpose(np.reshape(U_new, (N, D)))
        # print (U_new.shape)

        """SAMPLE THEN MOVIE FEATURES (possibly in parallel):"""

        for j in range(M):
            Lambda_V_2 = np.zeros((D, D))  # second term in the construction of Lambda_U
            mu_i_star_1 = np.zeros(D)  # first piece of mu_i_star
            for i in range(N):  # loop over the movies
                if ranked(i, j):
                    Lambda_V_2 = Lambda_V_2 + np.dot(np.transpose(np.array(U_new[:, i], ndmin=2)),
                                                     np.array((U_new[:, i]), ndmin=2))
                    mu_i_star_1 = U_new[:, i] * R[i, j] + mu_i_star_1  # CHECK DIMENSIONALITY!!!!!!!!!!!!
                    # coeff=np.transpose(np.array(V_old[j]*R[i,j], ndmin=2))+coeff  #CHECK DIMENSIONALITY!!!!!!!!!!!!

            Lambda_j_star_V = Lambda_V + alpha * Lambda_V_2
            Lambda_j_star_V_inv = np.linalg.inv(Lambda_j_star_V)

            mu_i_star_part = alpha * mu_i_star_1 + np.dot(Lambda_V, mu_V)
            mu_j_star = np.dot(Lambda_j_star_V_inv, mu_i_star_part)
            V_new = np.append(V_new, multivariate_normal(mu_j_star, Lambda_j_star_V_inv))

        # you need to reshape U_new and transpose it!!
        V_new = np.transpose(np.reshape(V_new, (M, D)))

        # save U_new and V_new in U_old and V_old for next iteration:         
        U_old = np.array(U_new)
        V_old = np.array(V_new)

        if t > initial_cutoff:  # initial_cutoff is needed to discard the initial transient
            R_step = np.dot(np.transpose(U_new), V_new)
            for i in range(N):  # reduce all the predictions to the correct ratings range.
                for j in range(M):
                    if R_step[i, j] > highest_rating:
                        R_step[i, j] = highest_rating
                    elif R_step[i, j] < lowest_rating:
                        R_step[i, j] = lowest_rating

            R_predict = (R_predict * (t - initial_cutoff - 1) + R_step) / (t - initial_cutoff)  # running average of the post-burn-in samples (Monte Carlo estimate)
            
            train_err = 0  # initialize the errors.
            test_err = 0
            # compute now the RMSE on the train dataset:
            for i in range(N):
                for j in range(M):
                    if ranked(i, j):
                        train_err = train_err + (R_predict[i, j] - R[i, j]) ** 2
            train_err_list.append(np.sqrt(train_err / pairs_train))
            print("Training RMSE at iteration ", t - initial_cutoff, " :   ", "{:.4}".format(train_err_list[-1]))
            # compute now the RMSE on the test dataset:
            for i in range(N):
                for j in range(M):
                    if ranked_test(i, j):
                        test_err = test_err + (R_predict[i, j] - R_test[i, j]) ** 2
            test_err_list.append(np.sqrt(test_err / pairs_test))
            print("Test RMSE at iteration ", t - initial_cutoff, " :   ", "{:.4}".format(test_err_list[-1]))

    return R_predict

if __name__ == "__main__":
    ds = DataSet()  # load dataset
    R = ds.list_to_matrix(ds.trainset, ds.maxu, ds.maxi)  # train rating matrix
    R_test = ds.list_to_matrix(ds.testset, ds.maxu, ds.maxi)  # test rating matrix
    for K in [8, 16, 32, 64]:
        U_in = np.zeros((K, ds.maxu))
        V_in = np.zeros((K, ds.maxi))
        R_pred = BPMF(R, R_test, U_in, V_in, T=100, D=K, initial_cutoff=0, lowest_rating=0, highest_rating=ds.maxr)
        squaredError = []
        for u, i, r in ds.testset:
            error = r - R_pred[int(u)][int(i)]
            squaredError.append(error * error)
        rmse = np.sqrt(sum(squaredError) / len(squaredError))
        print("RMSE@{}:{}".format(K, rmse))

 
