Using a timer to run scheduled clustering in a web project

Latest recommended article published on 2020-04-27 17:47:25

weixin_38437243

Category: spring MVC

Article link: https://blog.csdn.net/weixin_38437243/article/details/78553380

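A recent project requirement: the web project was already built, and one of its modules clusters individual data records and then makes recommendations to users.

The problem now: to improve recommendation accuracy, the boss wants the clustering to run every day at 2 a.m., with the result written to a local file. Writing the timer became the top priority.

A search online turns up many kinds of timers. To summarize, they fall into three groups: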

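1. The Timer that ships in the java.util package. It is very simple and can handle some scheduled tasks.

So I pulled Timer into the project and gave it a quick try. The result was not ideal: it only managed simple jobs such as printing output on a schedule, and I could not get it to call the clustering function on schedule the way I wanted. Reluctantly, I dropped it.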
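The post does not show the Timer attempt, so the following is not a reconstruction of it. It is only a minimal sketch of how a daily 02:00 run is commonly set up with java.util.Timer; the class name DailyTimerSketch and the helpers nextRun and scheduleDailyAtTwo are made up for illustration, and the job is assumed to be any Runnable.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.util.Timer;
import java.util.TimerTask;

public class DailyTimerSketch {

    // Next occurrence of `at` strictly after `now`: today if still ahead, otherwise tomorrow.
    static LocalDateTime nextRun(LocalDateTime now, LocalTime at) {
        LocalDateTime candidate = now.toLocalDate().atTime(at);
        return candidate.isAfter(now) ? candidate : candidate.plusDays(1);
    }

    // Schedules `job` for the next 02:00 and then every 24 hours after that.
    static Timer scheduleDailyAtTwo(final Runnable job) {
        LocalDateTime now = LocalDateTime.now();
        long delayMs = Duration.between(now, nextRun(now, LocalTime.of(2, 0))).toMillis();
        long periodMs = Duration.ofDays(1).toMillis();
        Timer timer = new Timer("daily-cluster", true); // daemon thread, hypothetical name
        timer.scheduleAtFixedRate(new TimerTask() {
            @Override
            public void run() {
                job.run();
            }
        }, delayMs, periodMs);
        return timer;
    }

    public static void main(String[] args) {
        // Only the delay calculation is exercised here; nothing is scheduled.
        LocalDateTime afternoon = LocalDateTime.of(2017, 11, 16, 15, 30);
        System.out.println(nextRun(afternoon, LocalTime.of(2, 0))); // prints 2017-11-17T02:00
    }
}
```

A fixed 24-hour period ignores daylight-saving shifts, and a Timer created by hand knows nothing about the Spring container, so any Spring-managed service the job needs would have to be passed in explicitly.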

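2. Spring + Quartz

Plenty of people online recommend this one. Quartz integrates with Spring and handles scheduled jobs very well, and it really is good. The thing is, all I need is to call one clustering function on a schedule. Dropped.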

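3. Spring's built-in timer

Simple and easy to use, and it got my task done without trouble. There are three main steps:

Step 1: set up the listener in web.xml

Step 2: add the timer class

Step 3: configure the Spring configuration file:

The class containing the functions is as follows: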

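import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Date;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import javax.annotation.Resource;

import org.springframework.stereotype.Component;

// Problem, Record, ProblemService, RecordService, ComputeTFIDF, KmeansCluster and StringUtils
// are the project's own classes and must be imported from wherever they live in the project.
@Component
public class RecommendUtils {
	
	@Resource
	private ProblemService service;
	@Resource
	private RecordService recordService;
	// Holds the contents of the local cluster file: <problem id, cluster number>
	private static Map<Integer,Integer> clusterMap = null;
	
	// Number of problems to recommend
	private static final Integer RECOMMEND_SIZE = 10;
	// Weight coefficient
	private static final Double C = 0.02;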
	
	/**
	 * Cluster the problem contents stored in the database and write the result to local disk.
	 */
	public void getClusterResults(){
		// Read all problem records that have passed review
		List<Problem> list = service.selectAllProblemOfChecked();
		
		// Put the problems into a map of <problem id, problem content>
		Map<String,String> promap = new HashMap<String,String>();
		for(Problem pro : list){
			promap.put(String.valueOf(pro.getId()), pro.getContent());
		}
		
		ComputeTFIDF.map = promap;
		
		// 1. Compute the tf-idf of every problem
		// 2. Cluster
		// 3. Write the result to local disk
		try {
			HashMap<String, HashMap<String, Double>> tf = ComputeTFIDF.tfAllFiles(ComputeTFIDF.map);
			HashMap<String, Double> idf = ComputeTFIDF.idf(tf);
			Map<String, HashMap<String, Double>> tfIdf = ComputeTFIDF.tf_idf(tf, idf); // tf-idf of every problem
			KmeansCluster kc = new KmeansCluster();
			Map<String, Integer> cluster = kc.doProcess(tfIdf, 5); // clustering result
			kc.printClusterResult(cluster); // write to local disk
		} catch (IOException e) {
			e.printStackTrace();
		}
	}
	
	/**
	 * Build recommendations from the user's browsing records.
	 */
	public List<Problem> recommendList(int uid){
		// 1. Read the cluster file into a map. K is the problem id, V is the cluster number, e.g. <10,1>
		try {
			clusterMap = RecommendUtils.readClusterFile();
		} catch (IOException e) {
			e.printStackTrace();
		}
		// If the file has never been read successfully, fall back to the hot problems instead of failing with a NullPointerException
		if(clusterMap == null){
			return service.selectHotSpot();
		}
		// 2. Read the user's browsing records by user id as <K,V>: K is the problem id, V is the timestamp, e.g. <10,17865432>
		// 2-1 A sliding window applies when reading: only records from the last month are read
		// 2-2 If the user has no browsing records in the last month, recommend the current hot problems
		List<Record> recordList = recordService.selectRecord(uid);
		if(recordList.isEmpty()){
			return service.selectHotSpot();
		}
		Map<Integer,Long> recordMap = new HashMap<Integer,Long>();
		for(Record record : recordList){
			recordMap.put(record.getPid(), StringUtils.date2Long(record.getTime()));
		}
		// 3. Compute the weight of each cluster
		// For every browsing record in recordMap, look up its cluster number in clusterMap.
		// If the cluster number is not yet in weightMap, compute the weight and add the pair to the map;
		// if it is already there, compute the weight and add it to the weight stored for that cluster.
		long current = new Date().getTime();
		Map<Integer,Double> weightMap = new HashMap<Integer,Double>(); // <K: cluster number, V: weight>
		double tmpWeight = 1 - C;
		for(Integer record : recordMap.keySet()){
			if(clusterMap.containsKey(record)){
				Integer K = clusterMap.get(record);
				// Integer division: the exponent is the record's age in whole days
				double weight = Math.pow(tmpWeight, (current - recordMap.get(record)) / (24 * 60 * 60 * 1000));
				if(weightMap.containsKey(K)){
					weightMap.put(K, weightMap.get(K) + weight);
				}else{
					weightMap.put(K, weight);
				}
			}
		}
		
		// 4. Build the recommendation from the weights
		Map<Integer,Integer> sizeMap = new HashMap<Integer,Integer>(); // <K: cluster number, V: number of recommendations>
		double weightSum = 0.0;
		for(Integer cate : weightMap.keySet()){
			weightSum += weightMap.get(cate);
		}
		for(Integer cate : weightMap.keySet()){
			int size = (int)Math.round(RECOMMEND_SIZE * (weightMap.get(cate) / weightSum)); // round to the nearest count
			sizeMap.put(cate, size);
		}
		
		// For each cluster number, collect the matching problem ids from clusterMap into tmpList,
		// shuffle them, and copy as many as sizeMap allows into the final resultList
		List<Integer> resultList = new ArrayList<Integer>();
		List<Integer> tmpList = new ArrayList<Integer>();
		for(Integer cate : sizeMap.keySet()){
			for (Map.Entry<Integer, Integer> entry : clusterMap.entrySet()) {
				// Use equals(): == on Integer objects compares references, not values
				if (cate.equals(entry.getValue())){
					tmpList.add(entry.getKey());
				}
			}
			Collections.shuffle(tmpList);
			// A cluster may hold fewer problems than its quota
			int limit = Math.min(sizeMap.get(cate), tmpList.size());
			for(int i = 0; i < limit; i++){
				resultList.add(tmpList.get(i));
			}
			tmpList.clear();
		}

		// Fetch each problem in resultList from the database, add it to problemList, and return the list to the calling controller
		List<Problem> problemList = new ArrayList<Problem>();
		for(Integer pid : resultList){
			Problem pro = service.selectProblemDetails(pid);
			problemList.add(pro);
		}
		return problemList;
	}
	
	/**
	 * Read the local cluster file line by line and return it as a map.
	 * @return map of <problem id, cluster number>
	 * @throws IOException 
	 */
	public static Map<Integer,Integer> readClusterFile() throws IOException {
		Map<Integer,Integer> map = new HashMap<Integer,Integer>();
		// try-with-resources closes the reader even if a line fails to parse
		try (BufferedReader br = new BufferedReader(new FileReader("C:/cluster.txt"))) {
			String line;
			while ((line = br.readLine()) != null) {
				String[] arrs = line.split(" ");
				map.put(Integer.parseInt(arrs[0]), Integer.parseInt(arrs[1]));
			}
		}
		return map;
	}
}
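
The weighting and quota steps inside recommendList can be tried in isolation. Below is a minimal sketch, assuming the same C = 0.02 and RECOMMEND_SIZE = 10 as the class above; the class name WeightSketch, the helpers weight and quotas, and the browsing history are all made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class WeightSketch {

    static final double C = 0.02;          // same coefficient as in RecommendUtils
    static final int RECOMMEND_SIZE = 10;  // same total as in RecommendUtils
    static final long DAY_MS = 24L * 60 * 60 * 1000;

    // Weight of one browsing record: (1 - C) raised to its age in whole days.
    static double weight(long nowMs, long viewedMs) {
        long ageDays = (nowMs - viewedMs) / DAY_MS; // integer division drops partial days
        return Math.pow(1 - C, ageDays);
    }

    // Splits RECOMMEND_SIZE across clusters in proportion to their summed weights.
    static Map<Integer, Integer> quotas(Map<Integer, Double> weightByCluster) {
        double sum = 0.0;
        for (double w : weightByCluster.values()) {
            sum += w;
        }
        Map<Integer, Integer> result = new LinkedHashMap<>();
        for (Map.Entry<Integer, Double> e : weightByCluster.entrySet()) {
            result.put(e.getKey(), (int) Math.round(RECOMMEND_SIZE * (e.getValue() / sum)));
        }
        return result;
    }

    public static void main(String[] args) {
        long now = 100 * DAY_MS;
        Map<Integer, Double> weights = new LinkedHashMap<>();
        // Hypothetical history: cluster 1 viewed today and 10 days ago, cluster 3 viewed 30 days ago
        weights.put(1, weight(now, now) + weight(now, now - 10 * DAY_MS));
        weights.put(3, weight(now, now - 30 * DAY_MS));
        System.out.println(quotas(weights)); // prints {1=8, 3=2}
    }
}
```

Because each quota is rounded on its own, the quotas do not always add up to RECOMMEND_SIZE: three clusters with equal weight get 3 each, 9 in total.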

Result:

The clustering runs at 2 a.m.

The file format is as follows:
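
The sample file itself is not reproduced in this text. Judging only from readClusterFile above, each line holds two space-separated integers: the problem id first, then the cluster number. A small sketch of parsing such lines, where the class name ClusterFileSketch and the sample lines are made up and are not the author's data:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;

public class ClusterFileSketch {

    // Parses "problemId clusterNumber" lines into a map, skipping blank lines.
    static Map<Integer, Integer> parse(BufferedReader reader) throws IOException {
        Map<Integer, Integer> map = new LinkedHashMap<>();
        String line;
        while ((line = reader.readLine()) != null) {
            line = line.trim();
            if (line.isEmpty()) {
                continue;
            }
            String[] parts = line.split("\\s+");
            map.put(Integer.parseInt(parts[0]), Integer.parseInt(parts[1]));
        }
        return map;
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample content in the inferred format
        String sample = "10 1\n11 1\n12 4\n";
        try (BufferedReader reader = new BufferedReader(new StringReader(sample))) {
            System.out.println(parse(reader)); // prints {10=1, 11=1, 12=4}
        }
    }
}
```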