User-Based
- User similarity, where $N(u)$ is the set of items user $u$ has interacted with. Jaccard form:
$$w_{uv} = \frac{|N(u) \cap N(v)|}{|N(u) \cup N(v)|}$$
Cosine form:
$$w_{uv} = \frac{|N(u) \cap N(v)|}{\sqrt{|N(u)| |N(v)|}}$$
- Build the item-to-user inverted table and convert it into the user similarity matrix:
import math
from collections import defaultdict

def UserSimilarity(train):
    # build the inverted table: item -> set of users who interacted with it
    item_users = dict()
    for u, items in train.items():
        for i in items.keys():
            if i not in item_users:
                item_users[i] = set()
            item_users[i].add(u)
    # count co-rated items between users
    C = defaultdict(lambda: defaultdict(int))  # C[u][v] = |N(u) ∩ N(v)|
    N = defaultdict(int)                       # N[u] = |N(u)|
    for i, users in item_users.items():
        for u in users:
            N[u] += 1
            for v in users:
                if u == v:
                    continue
                C[u][v] += 1
    # calculate the final similarity matrix W
    W = defaultdict(dict)
    for u, related_users in C.items():
        for v, cuv in related_users.items():
            W[u][v] = cuv / math.sqrt(N[u] * N[v])
    return W
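As a cross-check on the two similarity formulas, they can also be evaluated directly on a pair of item sets. This is a minimal sketch, not the notes' own implementation: the helper names `jaccard_similarity` and `cosine_similarity` and the toy item sets are hypothetical.

```python
import math

def jaccard_similarity(items_u, items_v):
    """|N(u) ∩ N(v)| / |N(u) ∪ N(v)| for two item sets."""
    union = items_u | items_v
    if not union:
        return 0.0
    return len(items_u & items_v) / len(union)

def cosine_similarity(items_u, items_v):
    """|N(u) ∩ N(v)| / sqrt(|N(u)| * |N(v)|) for two item sets."""
    if not items_u or not items_v:
        return 0.0
    return len(items_u & items_v) / math.sqrt(len(items_u) * len(items_v))

# Hypothetical item sets for two users
n_u = {"a", "b", "d"}
n_v = {"a", "c"}

jac_uv = jaccard_similarity(n_u, n_v)  # 1 shared item out of 4 distinct items
cos_uv = cosine_similarity(n_u, n_v)   # 1 / sqrt(3 * 2)
```

Calling these helpers for every pair of users costs time quadratic in the number of users, and most pairs share no items at all. The inverted table avoids that wasted work by visiting only pairs of users that co-occur on at least one item.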
- Under UserCF, the interest of user $u$ in item $i$ is computed as follows, where $S(u,K)$ is the set of the $K$ users most similar to $u$, $N(i)$ is the set of users who have interacted with item $i$, $w_{uv}$ is the similarity between users $u$ and $v$, and $r_{vi}$ is user $v$'s interest in item $i$:
$$p(u, i) = \sum_{v \in S(u,K) \cap N(i)} w_{uv} r_{vi}$$
- Code implementation:
from operator import itemgetter

def Recommend(user, train, W, K):
    rank = dict()
    interacted_items = train[user]
    # iterate over the K users most similar to `user`
    for v, wuv in sorted(W[user].items(), key=itemgetter(1),
                         reverse=True)[0:K]:
        for i, rvi in train[v].items():
            if i in interacted_items:
                # filter out items the user has already interacted with
                continue
            rank[i] = rank.get(i, 0.0) + wuv * rvi
    return rank
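To see the similarity step and the recommendation step working together, here is a minimal end-to-end sketch on toy data. The function names `user_similarity` and `recommend`, the `train` dictionary, and the choice `K=3` are assumptions made for illustration. Every rating is set to 1, as is typical for implicit feedback.

```python
import math
from collections import defaultdict
from operator import itemgetter

def user_similarity(train):
    # Inverted table: item -> set of users who interacted with it
    item_users = defaultdict(set)
    for u, items in train.items():
        for i in items:
            item_users[i].add(u)
    # C[u][v] counts shared items; N[u] counts the items of user u
    C = defaultdict(lambda: defaultdict(int))
    N = defaultdict(int)
    for users in item_users.values():
        for u in users:
            N[u] += 1
            for v in users:
                if u != v:
                    C[u][v] += 1
    # Cosine similarity: w_uv = C[u][v] / sqrt(N[u] * N[v])
    return {u: {v: cuv / math.sqrt(N[u] * N[v]) for v, cuv in related.items()}
            for u, related in C.items()}

def recommend(user, train, W, K):
    rank = defaultdict(float)
    interacted = train[user]
    # The K most similar users; a user with no neighbours gets an empty result
    neighbors = sorted(W.get(user, {}).items(), key=itemgetter(1), reverse=True)[:K]
    for v, wuv in neighbors:
        for i, rvi in train[v].items():
            if i not in interacted:  # skip items the user already interacted with
                rank[i] += wuv * rvi
    return dict(rank)

# Hypothetical toy data: user -> {item: rating}
train = {
    "A": {"a": 1, "b": 1, "d": 1},
    "B": {"a": 1, "c": 1},
    "C": {"b": 1, "e": 1},
    "D": {"c": 1, "d": 1, "e": 1},
}
W = user_similarity(train)
rank = recommend("A", train, W, K=3)
```

In this toy data, user A shares exactly one item with each of B, C, and D, so only the unseen items `c` and `e` receive scores. Sorting `rank` by score in descending order and truncating it would give a top-N recommendation list.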
- Improved user similarity formula:
$$w_{uv} = \frac{|N(u) \cap N(v)|}{\sqrt{|N(u)| |N(v)|}}$$