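Technology Selection and Architecture Design
The back end uses Spring Boot to expose a RESTful API, with MySQL as the data store. Recommendations are computed with user-based collaborative filtering (UserCF) or item-based collaborative filtering (ItemCF). The front end can be built with Vue.js or React.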
Database Table Design
-- User table
CREATE TABLE `user` (
  `user_id` int NOT NULL AUTO_INCREMENT,
  `username` varchar(50) NOT NULL,
  PRIMARY KEY (`user_id`)
);
-- Movie table
CREATE TABLE `movie` (
  `movie_id` int NOT NULL AUTO_INCREMENT,
  `title` varchar(100) NOT NULL,
  `genres` varchar(100) DEFAULT NULL,
  PRIMARY KEY (`movie_id`)
);
-- Rating table
CREATE TABLE `rating` (
  `rating_id` int NOT NULL AUTO_INCREMENT,
  `user_id` int NOT NULL,
  `movie_id` int NOT NULL,
  `score` float NOT NULL,
  `timestamp` bigint DEFAULT NULL,
  PRIMARY KEY (`rating_id`),
  KEY `idx_user_movie` (`user_id`,`movie_id`)
);
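The similarity code later in this post works on a nested map of the form userId -> (movieId -> score). As a minimal sketch, rows from the `rating` table could be folded into that structure as shown here. `RatingRow` and `RatingMatrixBuilder` are hypothetical names, reading the rows from MySQL is left out, and the sketch assumes rows arrive ordered by `timestamp` so that a later rating for the same pair replaces an earlier one (the schema does not enforce one rating per user and movie).

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical holder for one row of the `rating` table.
class RatingRow {
    final int userId;
    final int movieId;
    final double score;

    RatingRow(int userId, int movieId, double score) {
        this.userId = userId;
        this.movieId = movieId;
        this.score = score;
    }
}

class RatingMatrixBuilder {
    // Folds rating rows into userId -> (movieId -> score).
    // Assumption: rows are ordered by timestamp, so a later row for the
    // same (user, movie) pair overwrites the earlier one.
    static Map<Integer, Map<Integer, Double>> build(List<RatingRow> rows) {
        Map<Integer, Map<Integer, Double>> matrix = new HashMap<>();
        for (RatingRow row : rows) {
            matrix.computeIfAbsent(row.userId, k -> new HashMap<>())
                  .put(row.movieId, row.score);
        }
        return matrix;
    }
}
```

The nested-map layout keeps the matrix sparse: only movies a user has actually rated take up space.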
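Core Collaborative Filtering Algorithm
User similarity (cosine similarity)
$$ similarity(u,v) = \frac{\sum_{i \in I_{uv}} r_{ui} \cdot r_{vi}}{\sqrt{\sum_{i \in I_u} r_{ui}^2} \cdot \sqrt{\sum_{i \in I_v} r_{vi}^2}} $$
Here $r_{ui}$ is the rating user $u$ gave to item $i$, $I_u$ is the set of items rated by $u$, and $I_{uv} = I_u \cap I_v$ is the set of items rated by both users.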
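As a quick check of the formula, take two hypothetical users: $u$ rated items 1 and 2 with scores 5 and 3, and $v$ rated items 1 and 3 with scores 4 and 2. Only item 1 is shared, so it alone contributes to the numerator, while each norm runs over everything that user rated:

$$ similarity(u,v) = \frac{5 \cdot 4}{\sqrt{5^2 + 3^2} \cdot \sqrt{4^2 + 2^2}} = \frac{20}{\sqrt{34} \cdot \sqrt{20}} \approx 0.767 $$

Items rated by only one of the two users lower the score, because they enlarge the denominator without adding to the numerator.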
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class UserCF {
    // Compute the symmetric user-user similarity matrix
    public Map<Integer, Map<Integer, Double>> calculateUserSimilarity(
            Map<Integer, Map<Integer, Double>> userItemMatrix) {
        Map<Integer, Map<Integer, Double>> similarityMatrix = new HashMap<>();
        List<Integer> users = new ArrayList<>(userItemMatrix.keySet());
        for (int i = 0; i < users.size(); i++) {
            int u = users.get(i);
            // Visit each unordered pair once and store the value in both directions
            for (int j = i + 1; j < users.size(); j++) {
                int v = users.get(j);
                double similarity = cosineSimilarity(
                        userItemMatrix.get(u),
                        userItemMatrix.get(v)
                );
                similarityMatrix.computeIfAbsent(u, k -> new HashMap<>())
                        .put(v, similarity);
                similarityMatrix.computeIfAbsent(v, k -> new HashMap<>())
                        .put(u, similarity);
            }
        }
        return similarityMatrix;
    }

    // Cosine similarity as in the formula above: the dot product runs over
    // co-rated items, each norm over all items rated by that user.
    private double cosineSimilarity(
            Map<Integer, Double> user1Ratings,
            Map<Integer, Double> user2Ratings) {
        double dotProduct = 0.0;
        double norm1 = 0.0;
        double norm2 = 0.0;
        for (Map.Entry<Integer, Double> entry : user1Ratings.entrySet()) {
            double r1 = entry.getValue();
            norm1 += r1 * r1;
            Double r2 = user2Ratings.get(entry.getKey());
            if (r2 != null) {
                dotProduct += r1 * r2;
            }
        }
        for (double r2 : user2Ratings.values()) {
            norm2 += r2 * r2;
        }
        // A user with no ratings (or only zero ratings) has no defined direction
        return norm1 == 0 || norm2 == 0 ? 0.0 :
                dotProduct / (Math.sqrt(norm1) * Math.sqrt(norm2));
    }
}
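The technology selection names ItemCF as an alternative to UserCF, but the walkthrough implements only the latter. As a rough sketch of how the same cosine measure could be applied to items instead of users, the rating matrix can be transposed so that each item becomes a sparse vector of user scores. The class `ItemCFSketch` and its method names are assumptions made for illustration, not part of the original design.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ItemCFSketch {
    // Transposes userId -> (itemId -> score) into itemId -> (userId -> score).
    static Map<Integer, Map<Integer, Double>> invert(
            Map<Integer, Map<Integer, Double>> userItemMatrix) {
        Map<Integer, Map<Integer, Double>> itemUserMatrix = new HashMap<>();
        for (Map.Entry<Integer, Map<Integer, Double>> userEntry : userItemMatrix.entrySet()) {
            for (Map.Entry<Integer, Double> rating : userEntry.getValue().entrySet()) {
                itemUserMatrix.computeIfAbsent(rating.getKey(), k -> new HashMap<>())
                              .put(userEntry.getKey(), rating.getValue());
            }
        }
        return itemUserMatrix;
    }

    // Cosine similarity between two sparse rating vectors.
    static double cosine(Map<Integer, Double> a, Map<Integer, Double> b) {
        double dot = 0.0;
        for (Map.Entry<Integer, Double> e : a.entrySet()) {
            Double other = b.get(e.getKey());
            if (other != null) {
                dot += e.getValue() * other;
            }
        }
        double normA = 0.0;
        for (double r : a.values()) {
            normA += r * r;
        }
        double normB = 0.0;
        for (double r : b.values()) {
            normB += r * r;
        }
        return (normA == 0 || normB == 0) ? 0.0 : dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Pairwise item-item similarities, stored symmetrically.
    static Map<Integer, Map<Integer, Double>> itemSimilarity(
            Map<Integer, Map<Integer, Double>> userItemMatrix) {
        Map<Integer, Map<Integer, Double>> itemUserMatrix = invert(userItemMatrix);
        List<Integer> items = new ArrayList<>(itemUserMatrix.keySet());
        Map<Integer, Map<Integer, Double>> similarity = new HashMap<>();
        for (int i = 0; i < items.size(); i++) {
            for (int j = i + 1; j < items.size(); j++) {
                int a = items.get(i);
                int b = items.get(j);
                double s = cosine(itemUserMatrix.get(a), itemUserMatrix.get(b));
                similarity.computeIfAbsent(a, k -> new HashMap<>()).put(b, s);
                similarity.computeIfAbsent(b, k -> new HashMap<>()).put(a, s);
            }
        }
        return similarity;
    }
}
```

Item-based similarity tends to be attractive when there are far fewer movies than users, since the pairwise loop then runs over the smaller dimension.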
Recommendation Generation Logic
// Member of UserCF; needs java.util.* and java.util.stream.Collectors.
// Recommendation is assumed to be a simple (itemId, predictedScore) value class.
public List<Recommendation> generateRecommendations(
        int targetUserId,
        Map<Integer, Map<Integer, Double>> userItemMatrix,
        Map<Integer, Map<Integer, Double>> similarityMatrix,
        int k) {
    // Select the K nearest neighbors by similarity (none for an unknown user)
    List<Map.Entry<Integer, Double>> neighbors = similarityMatrix
            .getOrDefault(targetUserId, Collections.emptyMap())
            .entrySet().stream()
            .sorted(Map.Entry.<Integer, Double>comparingByValue().reversed())
            .limit(k)
            .collect(Collectors.toList());
    // Accumulate similarity-weighted ratings for items the target user has not rated
    Set<Integer> targetUserItems = userItemMatrix
            .getOrDefault(targetUserId, Collections.emptyMap()).keySet();
    Map<Integer, Double> weightedSums = new HashMap<>();
    Map<Integer, Double> similaritySums = new HashMap<>();
    for (Map.Entry<Integer, Double> neighbor : neighbors) {
        double similarity = neighbor.getValue();
        Map<Integer, Double> neighborRatings = userItemMatrix.get(neighbor.getKey());
        for (Map.Entry<Integer, Double> rating : neighborRatings.entrySet()) {
            int itemId = rating.getKey();
            if (!targetUserItems.contains(itemId)) {
                weightedSums.merge(itemId, similarity * rating.getValue(), Double::sum);
                similaritySums.merge(itemId, similarity, Double::sum);
            }
        }
    }
    // Predicted score = weighted sum / similarity sum
    Map<Integer, Double> predictions = new HashMap<>();
    for (Map.Entry<Integer, Double> entry : similaritySums.entrySet()) {
        if (entry.getValue() > 0) {
            predictions.put(entry.getKey(),
                    weightedSums.get(entry.getKey()) / entry.getValue());
        }
    }
    // Return recommendations sorted by predicted score, highest first
    return predictions.entrySet().stream()
            .sorted(Map.Entry.<Integer, Double>comparingByValue().reversed())
            .map(e -> new Recommendation(e.getKey(), e.getValue()))
            .collect(Collectors.toList());
}
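Written as a formula, the score that the code above assigns to an unseen item $i$ for target user $u$ is a similarity-weighted average over the nearest neighbors that rated $i$. The symbol $N_k(u)$ for the set of the $k$ most similar users is introduced here for illustration and does not appear in the original; $r_{vi}$, $I_v$ and $similarity(u,v)$ are used as in the similarity formula:

$$ \hat{r}_{ui} = \frac{\sum_{v \in N_k(u),\, i \in I_v} similarity(u,v) \cdot r_{vi}}{\sum_{v \in N_k(u),\, i \in I_v} similarity(u,v)} $$

Assuming ratings are non-negative (so the cosine similarities are too), the prediction always lies between the lowest and highest score that the contributing neighbors gave to item $i$.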
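Performance Optimization Strategies
Cache the user similarity matrix so that it is not recomputed on every request, and cache popular recommendation results in Redis. For large datasets, replace the in-memory pairwise computation with a matrix factorization technique such as SVD++.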
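// Requires Spring's caching support (@EnableCaching) with a Redis-backed CacheManager;
// userItemMatrix and similarityMatrix are fields of the enclosing service.
@Cacheable(value = "userRecommendations", key = "#userId")
public List<Recommendation> getCachedRecommendations(int userId) {
    return generateRecommendations(userId, userItemMatrix, similarityMatrix, 20);
}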
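To make the caching idea concrete without a Redis dependency, here is a minimal in-process sketch: a bounded least-recently-used cache built on `java.util.LinkedHashMap`. The class name `LruCache` and its capacity parameter are illustrative assumptions. It is a stand-in for the idea rather than a replacement for the Redis-backed `@Cacheable` setup above, and it is not thread-safe as written.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative bounded LRU cache; not thread-safe.
class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    LruCache(int capacity) {
        // accessOrder = true makes iteration order follow recency of access.
        super(16, 0.75f, true);
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Evict the least recently used entry once the cache exceeds capacity.
        return size() > capacity;
    }
}
```

Keyed by user ID, such a cache would hold recommendation lists for recently active users and quietly drop the rest, which is roughly the behavior one would configure in Redis with a memory limit and an LRU eviction policy.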
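System Deployment
Deploy with Docker containers, configure MySQL with read/write splitting, and run the Spring Boot application as a cluster. The recommendation computation module can be split out as an independent microservice that handles offline computation tasks through a message queue.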
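The offline-computation idea can be illustrated inside a single process, with `java.util.concurrent.BlockingQueue` standing in for a real message broker. Everything named here (`OfflineWorkerSketch`, the `STOP` sentinel, the placeholder result string) is hypothetical; a deployed service would use a broker client and run the actual similarity and recommendation job instead.

```java
import java.util.Map;
import java.util.concurrent.BlockingQueue;

class OfflineWorkerSketch {
    // Sentinel message telling the worker to shut down.
    static final int STOP = -1;

    // Starts a worker that consumes user IDs from the queue until STOP arrives
    // and records one result per user. The stored string is a placeholder for
    // the real recommendation computation.
    static Thread startWorker(BlockingQueue<Integer> queue, Map<Integer, String> results) {
        Thread worker = new Thread(() -> {
            try {
                while (true) {
                    int userId = queue.take(); // blocks until a message is available
                    if (userId == STOP) {
                        return;
                    }
                    results.put(userId, "recommendations-for-" + userId);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status and exit
            }
        });
        worker.start();
        return worker;
    }
}
```

A caller would pass in a `LinkedBlockingQueue` and a `ConcurrentHashMap`, enqueue the IDs of users whose ratings changed, and let the API layer read finished results without waiting for the computation.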