1. User classification
New users, retained users, active users, effective users, churned users, zombie users.
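2. LTV
Lifetime Value (LTV): the revenue a new user is expected to generate over their lifetime. LT (lifetime) is the average number of active days of a new user within a given period. ARPU is the average revenue per active user. A common estimate is LTV = LT × ARPU, with ARPU measured per active day so that the units match.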
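As a rough illustration of the LTV estimate, here is a minimal sketch in base R. It assumes LTV = LT × ARPU with ARPU measured per active user-day. The `activity` table and its column names are made up for the example and are not part of the original notes.

```r
# Hypothetical activity log: one row per (user, active day) with that day's revenue.
activity <- data.frame(
  user_id = c("u1", "u1", "u1", "u2", "u2", "u3"),
  day     = c(1, 2, 3, 1, 2, 1),
  revenue = c(0, 6, 0, 3, 0, 0)
)

new_users   <- length(unique(activity$user_id))  # size of the new-user cohort
active_days <- nrow(activity)                    # total active user-days in the period

lt   <- active_days / new_users                  # average active days per new user
arpu <- sum(activity$revenue) / active_days      # average revenue per active user-day
ltv  <- lt * arpu                                # estimated revenue per new user

print(c(LT = lt, ARPU = arpu, LTV = ltv))
```

Under these assumptions LTV reduces to total revenue divided by the number of new users, which makes a quick sanity check on real data.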
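3. Item purchase association analysis
Apriori algorithm. For a rule X → Y over N transactions, where X ∩ Y denotes the transactions that contain both X and Y:
Support: count(X ∩ Y) / N, the share of all transactions that contain both X and Y.
Confidence: support(X ∩ Y) / support(X), the share of transactions containing X that also contain Y.
Lift: confidence(X → Y) / support(Y) = support(X ∩ Y) / (support(X) × support(Y)). A lift above 1 means X and Y are bought together more often than would be expected if they were independent.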
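The three measures can be checked by hand on a small example. The sketch below uses base R and a made-up list of five transactions. It takes lift to be the standard confidence(X → Y) / support(Y). The helper `has()` is hypothetical and not part of arules.

```r
# Toy transactions (hypothetical data): each element is one player's basket.
transactions <- list(
  c("A", "B"),
  c("A", "B", "C"),
  c("A", "C"),
  c("B", "C"),
  c("A", "B")
)

# TRUE for each transaction that contains every item in `items`.
has <- function(items) {
  vapply(transactions, function(tr) all(items %in% tr), logical(1))
}

support_x  <- mean(has("A"))           # share of baskets containing A
support_y  <- mean(has("B"))           # share of baskets containing B
support_xy <- mean(has(c("A", "B")))   # share of baskets containing both A and B

confidence <- support_xy / support_x   # P(B | A) for the rule A -> B
lift       <- confidence / support_y   # above 1 would indicate positive association

print(c(support = support_xy, confidence = confidence, lift = lift))
```

In this toy data the lift of A → B comes out slightly below 1, so buying A does not make B more likely even though the two often appear together.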
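# Load the player purchase records (columns used below: player_id, product_name, qty)
data1 <- read.csv("玩家购物数据.csv")
library(reshape)
# Pivot to one row per player and one column per product, filled with qty
data1_matrix <- cast(data1, player_id ~ product_name, value = "qty")
# Turn quantities into a 0/1 purchase indicator (NA means the player never bought the item)
data1_matrix_new <- apply(data1_matrix[, -1], 2, function(x) ifelse(is.na(x), 0, 1))
# Rebuild the matrix with player ids as row names and product names as column names
data1_matrix_new <- matrix(data1_matrix_new,
                           nrow = dim(data1_matrix_new)[1],
                           ncol = dim(data1_matrix_new)[2],
                           dimnames = list(data1_matrix[, 1], colnames(data1_matrix)[-1]))
library(arules)
# Convert the 0/1 matrix into an arules transactions object
data1_class <- as(data1_matrix_new, "transactions")
inspect(data1_class[1:6])   # look at the first six transactions
summary(data1_class)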
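In the summary(data1_class) output, the second part lists the most frequent items and their counts.
The third part gives the number of transactions for each transaction size (number of items per transaction).
itemFrequency(data1_class)   # support of each individual item
itemFrequencyPlot(data1_class, support = 0.05, main = "Item frequency: items with support above 5%")
itemFrequencyPlot(data1_class, topN = 20, main = "Item frequency: top 20 items")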
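To make the link between the 0/1 player-by-item matrix and item frequency concrete, here is a minimal base R sketch. It assumes a long-format purchase table with the same column names as above, but the rows are invented for illustration. The relative frequency of an item is simply the mean of its 0/1 column.

```r
# Hypothetical purchase records in long format (one row per player and product).
purchases <- data.frame(
  player_id    = c("p1", "p1", "p2", "p3", "p3", "p3"),
  product_name = c("sword", "potion", "potion", "sword", "shield", "potion"),
  qty          = c(1, 3, 2, 1, 1, 5)
)

# Count rows per (player, product), then reduce to a 0/1 incidence matrix.
counts <- table(purchases$player_id, purchases$product_name)
incidence <- matrix(as.integer(counts > 0),
                    nrow = nrow(counts),
                    dimnames = dimnames(counts))

# Support of each single item = share of players who bought it at least once.
item_support <- colMeans(incidence)
print(sort(item_support, decreasing = TRUE))
```

This is only an alternative way to look at the same quantities. The notes themselves rely on `cast()` and `itemFrequency()`.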
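# Mine rules with at least 0.5% support, 10% confidence and at least 2 items
rules <- apriori(data1_class,
                 parameter = list(support = 0.005, confidence = 0.1,
                                  target = "rules", minlen = 2))
summary(rules)
Rule length distribution: 46 rules of length 2 and 6 rules of length 3.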
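To filter the rules that meet given conditions, use subset(rules, subset):
# %in%: lhs contains at least one of the listed items (exact name match)
inspect(subset(rules, subset = lhs %in% c("超值大礼包", "新手礼包")))
# %pin%: partial match on item names
inspect(subset(rules, subset = lhs %pin% c("超值大礼包", "新手礼包")))
# %ain%: lhs contains all of the listed items
inspect(subset(rules, subset = lhs %ain% c("超值大礼包", "新手礼包")))
# Keep only rules with lift above 2
rules_lift <- subset(rules, subset = lift > 2)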
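library(arulesViz)
# Graph view of the high-lift rules
plot(rules_lift, method = "graph",
     control = list(nodeCol = grey.colors(10), edgeCol = grey(.7), alpha = 1))
Each circle is a rule: the larger the circle, the higher the support. The circle's color shows the lift, going from grey to black as lift increases. Arrows point from lhs to rhs.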
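library(colorspace)
# Grouped matrix view, colored with a sequential HCL palette
plot(rules_lift, method = "grouped", control = list(col = sequential_hcl(50), alpha = 1.2))
# Export the rules to a pipe-delimited text file
write(rules, file = "rules.txt", sep = "|", row.names = FALSE)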