instance: $(u, i, r_{ui}, t_{ui})$ describes that user $u$ applied item $i$ at time $t_{ui}$ with score $r_{ui}$.
formula:

$$
S_{ij} = \frac{1}{k_j^\lambda} \sum_u \frac{r_{ui}\, r_{uj}}{k_u^\rho}\, g(t_{ui}, t_{uj})
$$

$$
g(t_1, t_2) = \exp\left[-\frac{(t_1 - t_2)^2}{2\tau^2}\right]
$$
parameters: $\lambda, \rho, \tau$
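As a reference point, the formula can be evaluated directly in memory on a small data set. The sketch below is only illustrative. It assumes that $k_j$ is the sum of all scores given to item $j$ and $k_u$ is the sum of all scores given by user $u$ (this matches the `calc` lines in the MapReduce steps), that all scores are positive, and that the pair $i = j$ is skipped. The names `time_kernel` and `similarity` are hypothetical.

```python
import math
from collections import defaultdict


def time_kernel(t1, t2, tau):
    """Gaussian time-decay kernel g(t1, t2) = exp(-(t1 - t2)^2 / (2 tau^2))."""
    return math.exp(-((t1 - t2) ** 2) / (2.0 * tau ** 2))


def similarity(records, lam, rho, tau):
    """Return {(i, j): S_ij} for ordered item pairs (i != j) co-rated by a user.

    records: iterable of (u, i, r, t) tuples.
    Assumption: k_j = sum of scores on item j, k_u = sum of scores by user u.
    """
    k_item = defaultdict(float)
    k_user = defaultdict(float)
    by_user = defaultdict(list)
    for u, i, r, t in records:
        k_item[i] += r
        k_user[u] += r
        by_user[u].append((i, r, t))

    sums = defaultdict(float)
    for u, rated in by_user.items():
        user_norm = k_user[u] ** rho
        for i, r_i, t_i in rated:
            for j, r_j, t_j in rated:
                if i == j:  # assumption: self-similarity is not needed
                    continue
                sums[(i, j)] += r_i * r_j / user_norm * time_kernel(t_i, t_j, tau)

    # Apply the item-side normalisation 1 / k_j^lambda.
    return {(i, j): s / (k_item[j] ** lam) for (i, j), s in sums.items()}
```

A larger $\tau$ makes the kernel flatter, so interactions that are far apart in time still contribute; $\lambda$ and $\rho$ control how strongly popular items and very active users are down-weighted.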
source data: $(u, p, r, t)$
1. MAP: $p: (u, r, t)$

   REDUCE: $p: [(u, r, t), \ldots]$

   calc: $k_p = (\sum r)^\lambda$ (the exponent $\lambda$ is already applied here), output $u: (p, r, t, k_p)$
2. REDUCE: $u: [(p, r, t, k_p), \ldots]$
3. MAP: calc: $k_u = (\sum r)^\rho$, output $p_0: \{(p_i, s_i), \cdots\}$

   with: $s_i \to \frac{r_0 r_i}{k_u k_{p_i}}\, g(t_0, t_i)$

   REDUCE: $p_0: [(p_i, s_i), \ldots]$

   with: $s_i \to \sum_{j = i} s_j$, i.e. the scores of all entries that refer to the same item $p_i$ are summed.
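The three jobs above can be sketched as a single-process Python simulation. This is not the original Hadoop code, only a minimal illustration of the data flow. The function names `group_by_key`, `job1`, `job2`, and `job3` are hypothetical, `group_by_key` stands in for the shuffle phase, scores are assumed positive, and pairs with $p_i = p_0$ are skipped by assumption.

```python
import math
from collections import defaultdict


def group_by_key(pairs):
    """Stand-in for the shuffle phase: collect all values per key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def job1(source, lam):
    """Map records to item key p, reduce to compute k_p, re-emit keyed by user."""
    mapped = ((p, (u, r, t)) for u, p, r, t in source)
    out = []
    for p, values in group_by_key(mapped).items():
        k_p = sum(r for _, r, _ in values) ** lam  # exponent already applied
        out.extend((u, (p, r, t, k_p)) for u, r, t in values)
    return out


def job2(job1_out):
    """Reduce only: gather every user's (p, r, t, k_p) records."""
    return list(group_by_key(job1_out).items())


def job3(job2_out, rho, tau):
    """Map: emit per-user pair scores; Reduce: sum scores per item pair."""
    mapped = []
    for u, values in job2_out:
        k_u = sum(r for _, r, _, _ in values) ** rho
        for p0, r0, t0, _ in values:
            for pi, ri, ti, k_pi in values:
                if pi == p0:  # assumption: skip self-pairs
                    continue
                g = math.exp(-((t0 - ti) ** 2) / (2.0 * tau ** 2))
                mapped.append((p0, (pi, r0 * ri / (k_u * k_pi) * g)))

    result = {}
    for p0, pairs in group_by_key(mapped).items():
        totals = defaultdict(float)
        for pi, s in pairs:
            totals[pi] += s  # sum contributions from different users
        result[p0] = dict(totals)
    return result
```

For example, `job3(job2(job1(source, lam)), rho, tau)` returns a dictionary that maps each item $p_0$ to its scored neighbours. The map phase of job 3 emits one value per ordered item pair per user, so its output grows quadratically with the number of items a user has rated.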
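Note
Map (3) can be merged into Reduce (2), but doing so greatly increases the size of job 2's output files and slightly increases the total running time.
Timing comparison:
# before merging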
job1 time: 115s 1.9m
job2 time: 110s 1.83m
job3 time: 4551s 75.8m
total time: 4776s 79.6m
# after merging
job1 time: 110s 1.83m
job2 time: 1217s 20.3m
job3 time: 3656s 60.9m
total time: 4983s 83m