Finding the Most Frequently Visited Page Sequence

Assume a website has 8 pages. We want to find the most frequently visited page sequence of length 3 (e.g. 1->5->2). We are given a log file with several rows covering a particular time period. Each row contains the following information: time, user ID, page visited.

Suggest an optimal algorithm to find the most frequently visited page sequence of length 3.

Let's number every sequence by concatenating its page numbers, e.g. 1->3->8 = 138. Because there are only 8 pages, each page number is a single digit, so every sequence gets a unique number.
Now suppose the visits, ordered by time, are 1->3->6->3->5->4....
Each run of 3 consecutive pages is one sequence, so the sequences are 136, 363, 635, 354, ....
The sequence that occurs most often is the answer.
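The splitting step can be sketched as a sliding window of width 3 over the time-ordered visits. This is only an illustration: the function name `triples` is made up here, and it assumes the page numbers are single digits (1 to 8) so that concatenating them gives a unique number.

```python
def triples(pages):
    """Yield every run of 3 consecutive pages as a number, e.g. (1, 3, 6) -> 136."""
    # zip over three shifted views of the list gives each window of 3 pages.
    for a, b, c in zip(pages, pages[1:], pages[2:]):
        # Valid only because each page number is a single digit.
        yield a * 100 + b * 10 + c


print(list(triples([1, 3, 6, 3, 5, 4])))  # [136, 363, 635, 354]
```

A list of n visits produces n - 2 sequences, and a list with fewer than 3 visits produces none.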

You can use a hash map to store the frequency of each page sequence. That gives O(1) average-time lookup and update per sequence.
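Putting the hash map idea into code, a minimal sketch might look like the following. Several things here are assumptions not stated in the original: rows are taken to be `(time, user_id, page)` tuples already parsed from the log, sequences are counted per user (since each row carries a user ID), and the sequence is stored as a tuple key rather than as a concatenated number. The function name `most_frequent_sequence` is hypothetical.

```python
from collections import Counter, defaultdict


def most_frequent_sequence(rows, size=3):
    """rows: iterable of (time, user_id, page). Returns (sequence, count) or None."""
    # Assumption: group visits by user, in time order.
    visits = defaultdict(list)
    for _time, user_id, page in sorted(rows, key=lambda row: row[0]):
        visits[user_id].append(page)

    # Hash map from sequence to frequency; each update is O(1) on average.
    counts = Counter()
    for pages in visits.values():
        for i in range(len(pages) - size + 1):
            counts[tuple(pages[i:i + size])] += 1

    if not counts:
        return None  # no user visited at least `size` pages
    return counts.most_common(1)[0]


rows = [
    (1, "u1", 1), (2, "u1", 3), (3, "u2", 1), (4, "u1", 6),
    (5, "u2", 3), (6, "u2", 6), (7, "u1", 3),
]
print(most_frequent_sequence(rows))  # ((1, 3, 6), 2)
```

If the log is already sorted by time, the sort can be dropped and the whole count is a single pass over the rows. With 8 pages there are at most 8 * 8 * 8 = 512 distinct sequences, so the map stays small regardless of log size.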


