推荐系统
定义:通过人工或非人工方法向用户建议购买/浏览物品的有规律行为
推荐引擎模块
推荐引擎模块由三部分组成:接收请求、处理请求、返回结果
制造日志
一般推荐日志log:字段1&字段2&字段3&字段4&……
制造一个简单的日志:
日志格式:cookie&uid&user_agent&ip&video_id&topic&order_id&log_type
日志种类log_type:点击1、播放2、点赞3、收藏4、付费观看5、站外分享6、评论7
def produce():
print("start")
for num in range(0,100):
print(num)
cookie = "oneone"
uid = "first_blood"
user_agent = "Macintosh Chrome Safari"
ip = "1.1.1.1"
video_id = "898923"
topic = "苹果发布会"
order_id = "0"
log_type = "1"
final = cookie+"&"+uid+"&"+user_agent+"&"+ip+"&"+video_id+"&"+topic+"&"+order_id+"&"+log_type
print(final)
produce()
制造一个较复杂的日志,并写入文件:
import random
user_list = ["one","two","three","four","five"]
albet_num = ["a","b","c","d","e","f","g","h",'I','G','K','L','M','1','2','3','4','5']
log_type_array = ["1","2","3","4","5","6","7"]
def produce():
print("start")
file_object = open('./logfile.txt','w') #写到文件中
for num in range(0,2000):
cookie = ''.join(random.sample(albet_num, 6)) #''.join(random.sample(albet, 5))
uid = ''.join(random.sample(user_list, 1))
user_agent = "Macintosh Chrome Safari"
ip = "1.1.1.1"
video_id = "898923"
topic = "苹果发布会"
order_id = "0"
log_type = ''.join(random.sample(log_type_array, 1))
final = cookie+"&"+uid+"&"+user_agent+"&"+ip+"&"+video_id+"&"+topic+"&"+order_id+"&"+log_type
file_object.write(final+'\n')
file_object.close()
produce()
retargeting营销:电商重定向、电影重定向、文章重定向
重定向:对用户发生过行为(如浏览,购买)的商品进行二次推荐,最大化用户对其进行消费概率
处理日志
处理日志种类为点击的日志:
click_action = {} #字典 key uid value video_ids
file = open('./logfile.txt')
for line in file.readlines():
#print(line)
line = line.strip() #去掉空格
ls = line.split("&")
if ls[7] != "1":
continue
if ls[1] not in click_action.keys():
click_action[ls[1]] = []
click_action[ls[1]].append(ls[4])
print(ls[1]+'\t'+ls[4])
file.close()
# 显示
for k,v in click_action.items():
print(k+ "\t" + str(len(v)) + "\t" + str(v))