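I. Models
1. Create the Django project first.
2. Create the database tables (user table, book table, hit table), then run python manage.py makemigrations and python manage.py migrate.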
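II. Recommendation
1. Recommendation algorithms
- a. Content-based (association) algorithm: recommends items similar to what the user has already seen, so the results offer little novelty.
- b. Collaborative filtering: recommends items liked by users with similar tastes. It is smarter, but it needs a large amount of data.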
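To make the collaborative filtering idea concrete, here is a minimal user-based sketch in plain Python. It is only an illustration, not the method used later in these notes (which relies on Spark's ALS); the `hits` data and the function names `cosine` and `recommend` are made up for this example.

```python
from math import sqrt

# Hypothetical toy data: user -> {book: hit count}
hits = {
    "u1": {"A": 3, "B": 1},
    "u2": {"A": 2, "B": 2, "C": 4},
    "u3": {"B": 1, "C": 5},
}

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def recommend(user, data, top_n=2):
    """Score books the user has not seen by the similarity-weighted hits of other users."""
    scores = {}
    for other, books in data.items():
        if other == user:
            continue
        sim = cosine(data[user], books)
        for book, n in books.items():
            if book not in data[user]:
                scores[book] = scores.get(book, 0.0) + sim * n
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

print(recommend("u1", hits))
```

With so few users the similarities are unreliable, which is the "needs a large amount of data" drawback noted above.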
2. Import the data
3. Train the model
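- a. Definitions: explicit feedback (ratings) and implicit feedback (views, clicks)
- b. Functions: explicit (train) and implicit (trainImplicit)
- c. Approach: similar to movie recommendation; only the function differs, because movie recommendation is trained on ratings
- The data are the click records of front-end users
- Train the model
- Save the model
- Load (call) the model: use redis to cache the results, using the logged-in user's session state when calling the model; the recommendations change over time and as users click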
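The difference between explicit and implicit feedback can be sketched as follows. The Spark MLlib documentation says its implicit ALS follows the formulation of Hu, Koren and Volinsky, in which a raw count is treated as a binary preference plus a confidence weight rather than as a rating. The function name `implicit_feedback` and the `alpha` value are assumptions for illustration only.

```python
def implicit_feedback(hitnum, alpha=0.01):
    """Map a raw hit count to (preference, confidence).

    preference: 1 if the user interacted with the book at all, else 0
    confidence: 1 + alpha * hitnum, so more hits mean more trust in the preference
    """
    preference = 1 if hitnum > 0 else 0
    confidence = 1.0 + alpha * hitnum
    return preference, confidence

# A book clicked 5 times versus a book never clicked (alpha chosen arbitrarily)
print(implicit_feedback(5, alpha=2.0))
print(implicit_feedback(0, alpha=2.0))
```

This is why click counts go to trainImplicit: a book with zero hits is weak evidence of dislike, not a rating of 0.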
III. Front end
1. Create the table:
models.py
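from django.db import models

class hits(models.Model):
    # One row per click record: which user clicked which book, and how many times
    userid = models.IntegerField(default=0)
    bookid = models.IntegerField(default=0)
    hitnum = models.IntegerField(default=0)

    def __str__(self):
        return str(self.userid)

    class Meta:
        # Display name in the Django admin ("点击量" means "hit count")
        verbose_name = "点击量"
        verbose_name_plural = "点击量"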
2. Record the hit count
views.py
Still learning…
IV. Code
Training code
import os
import shutil

from pyspark import SparkContext
from pyspark.mllib.recommendation import ALS
from pyspark.sql import Row, SparkSession

sc = SparkContext()
sc.setLogLevel("WARN")

# Each whitespace-separated record in hit.txt is "userid,bookid,hitnum"
txt = sc.textFile('hit.txt')
ratingsRDD = txt.flatMap(lambda x: x.split()).map(lambda x: x.split(','))

spark = SparkSession.builder.getOrCreate()
user_row = ratingsRDD.map(lambda x: Row(
    userid=int(x[0]), bookid=int(x[1]), hitnum=int(x[2])
))
user_df = spark.createDataFrame(user_row)
user_df.createOrReplaceTempView('test')

# Sum the hits for each (user, book) pair
datatable = spark.sql("select userid, bookid, sum(hitnum) as hitnum from test group by userid, bookid")
bookrdd = datatable.rdd.map(lambda x: (x.userid, x.bookid, x.hitnum))

# Implicit-feedback ALS with rank=10, iterations=10, lambda_=0.01
model = ALS.trainImplicit(bookrdd, 10, 10, 0.01)

# Remove any previously saved model before saving the new one
if os.path.exists('recommendModel'):
    shutil.rmtree('recommendModel')
model.save(sc, 'recommendModel')
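Before running the Spark job, the parsing and group-by step can be checked on a small sample without a cluster. The sketch below assumes the same record format the training code reads (whitespace-separated records of "userid,bookid,hitnum"); the function name `aggregate_hits` and the sample string are made up for illustration.

```python
from collections import defaultdict

def aggregate_hits(text):
    """Sum hitnum per (userid, bookid) from whitespace-separated records."""
    totals = defaultdict(int)
    for record in text.split():
        userid, bookid, hitnum = (int(v) for v in record.split(","))
        totals[(userid, bookid)] += hitnum
    return dict(totals)

# Hypothetical sample: user 1 clicked book 10 on two separate occasions
sample = "1,10,2\n1,10,3\n2,11,1"
print(aggregate_hits(sample))
```

The result should match what the SQL query with sum(hitnum) and group by produces for the same input.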
Loading the model and recommending 4 items
from pyspark import SparkContext
from pyspark.mllib.recommendation import MatrixFactorizationModel
import redis

pool = redis.ConnectionPool(host='192.168.43.50', port=6379)
redis_client = redis.Redis(connection_pool=pool)
sc = SparkContext()
sc.setLogLevel("WARN")

def redisOp():
    # Quick connectivity check: write a key and read it back
    redis_client.set(1, 'b')
    print(redis_client.get(1))

def getRecommendByUserId(userid, rec_num):
    try:
        model = MatrixFactorizationModel.load(sc, 'recommendModel')
        result = model.recommendProducts(userid, rec_num)
        # Serialize each Rating as "user,product,rating|"
        temp = ''
        for r in result:
            temp += str(r[0]) + ',' + str(r[1]) + ',' + str(r[2]) + '|'
        # Cache the recommendations in redis under the user id
        redis_client.set(userid, temp)
        print('load model success !')
    except Exception as e:
        print('load model failed!' + str(e))
    # The SparkContext is stopped here, so this function can run only once per process
    sc.stop()

if __name__ == '__main__':
    getRecommendByUserId(7, 4)
    # Read back the key that was just written for user 7
    print(redis_client.get(7))
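On the reading side (for example in a Django view), the cached string has to be split back into book ids and scores. A minimal sketch, assuming the "user,product,rating|" format written by the code above; the helper name `parse_recommendations` is hypothetical.

```python
def parse_recommendations(cached):
    """Turn 'user,product,rating|...' into a list of (bookid, score) tuples."""
    if isinstance(cached, bytes):  # redis-py returns bytes unless decode_responses=True
        cached = cached.decode("utf-8")
    recs = []
    for item in cached.split("|"):
        if not item:  # skip the empty piece after the trailing '|'
            continue
        _user, bookid, score = item.split(",")
        recs.append((int(bookid), float(score)))
    return recs

# Hypothetical cached value for user 7
print(parse_recommendations(b"7,12,0.9|7,3,0.5|"))
```

Storing the list as JSON would avoid hand-written parsing, but the sketch keeps the delimiter format used in these notes.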
V. Recommendation data
The overall approach is now roughly clear, but I feel it still has many shortcomings.