Reportedly a Baidu written-test question

I saw someone ask about this on a forum, found one of the questions rather interesting, and replied. I'm posting my reply here as well.

The question is as follows:

/  answer beginning  ///

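Consider an online friend system. The system maintains a friend list for each user; a list may hold at most 500 friends, and every friend must be another user of the system. Friendship is one-directional: user B being a friend of user A does not imply that A is a friend of B.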

Users are identified by ID. The friend-list data is given in text form as follows:
1     3,5,7,67,78,3332
2     567,890
31     1,66
14         567
78     10000

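Each row has two columns: the first is the user ID and the second is that user's friend IDs. Different IDs are separated by "," and listed in ascending order. The columns are separated by "\t".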


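Requirements:
Design a suitable index data structure to support the following query:
Given users A and B, determine whether B is a second-degree friend of A (a friend of a friend).
In the example above, 10000 is a second-degree friend of 1, because 78 is a friend of 1 and 10000 is a friend of 78.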

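Explain your approach in detail and describe the key points of your implementation. Give pseudocode for building the index and for answering the query, and state the space and time complexity.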

Constraints:
No more than 10 million users, with 50 friends each on average.

/  answer ending  ///

/  my reply beginning  ///

Construct a list of 10,000,000 nodes, where each node is defined as:

struct node {
    int cur_node;
    std::vector<node*> p_friend_vec;
    std::vector<node*> p_indirect_vec;
};


cur_node is the user's number, from 1 to 10,000,000. p_friend_vec holds pointers to the nodes of this user's friends, and p_indirect_vec holds pointers to the nodes of users who have the current node as a friend, e.g. node 10000 has a pointer to 78 to indicate that 10000 is a friend of 78.

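To construct this list we first initialize a list of 10,000,000 nodes with cur_node values from 1 to 10,000,000, then read the text file row by row to index the friend relationships: each pair (user, friend) adds one pointer to the user's p_friend_vec and one to the friend's p_indirect_vec. I think this should be simple. The construction is O(N), or more precisely O(50N), in time and O(N), or more precisely O(100N), in space.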
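As a rough sketch of the construction step, the function below reads rows in the question's format and fills both pointer vectors. The name `build_index`, the `max_id` parameter, and the choice to leave slot 0 unused are my own assumptions for illustration, not part of the question or the reply.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

struct node {
    int cur_node;
    std::vector<node*> p_friend_vec;    // users that this user lists as friends
    std::vector<node*> p_indirect_vec;  // users that list this user as a friend
};

// Builds the node array from rows of the form "<user>\t<id>,<id>,...".
// max_id is the largest user ID (assumed known up front); slot 0 stays
// unused because IDs start at 1. The returned vector must not be resized
// or copied afterwards, since the nodes refer to each other by address.
std::vector<node> build_index(const std::vector<std::string>& lines, int max_id) {
    std::vector<node> nodes(static_cast<std::size_t>(max_id) + 1);
    for (int id = 0; id <= max_id; ++id) nodes[id].cur_node = id;

    for (const std::string& line : lines) {
        std::istringstream in(line);
        int user = 0;
        std::string id_list;
        if (!(in >> user >> id_list)) continue;  // skip blank or friendless rows
        assert(user >= 1 && user <= max_id);

        std::istringstream ids(id_list);
        std::string tok;
        while (std::getline(ids, tok, ',')) {
            if (tok.empty()) continue;
            int f = std::stoi(tok);
            assert(f >= 1 && f <= max_id);
            nodes[user].p_friend_vec.push_back(&nodes[f]);    // forward edge
            nodes[f].p_indirect_vec.push_back(&nodes[user]);  // reverse edge
        }
    }
    return nodes;
}
```

With the five sample rows and `max_id = 10000`, node 78 ends up with one entry in `p_indirect_vec` (pointing at node 1) and node 10000 with one entry (pointing at node 78). On a 64-bit machine each pointer takes 8 bytes, so at full scale the two vectors together hold about 10,000,000 x 50 x 2 x 8 bytes = 8 GB; storing 32-bit IDs instead of pointers would halve that.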

To query the relationship between A and B, we need to iterate over the pointers in A and B and check whether some node is pointed to by both, i.e. appears in A's p_friend_vec and in B's p_indirect_vec. Typically this is a double loop:
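// a and b point to the nodes of users A and B.
bool found = false;
for (std::vector<node*>::iterator i = a->p_friend_vec.begin(); i != a->p_friend_vec.end(); ++i)
{
    for (std::vector<node*>::iterator j = b->p_indirect_vec.begin(); j != b->p_indirect_vec.end(); ++j)
    {
        if ((*i)->cur_node == (*j)->cur_node)
            found = true; // bingo! *i is a friend of A and has B as a friend
    }
}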


This search is O(50^2) or O(M^2), where M is the average number of friends.
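Because the question says the friend IDs in each row are sorted ascending, one possible alternative to the double loop is a linear merge over two sorted ID lists, which costs O(M) instead of O(M^2). This is a swapped-in technique rather than the method in the reply. It assumes both lists hold plain IDs in ascending order; the reverse list would need a sort after construction, since rows arrive in arbitrary user order. The name `has_common_id` is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Returns true if the two ascending ID lists share at least one ID.
// a_friends:   IDs of A's friends.
// b_followers: IDs of users who have B as a friend.
bool has_common_id(const std::vector<int>& a_friends,
                   const std::vector<int>& b_followers) {
    // Precondition (assumed, checked in debug builds): both lists are sorted.
    assert(std::is_sorted(a_friends.begin(), a_friends.end()));
    assert(std::is_sorted(b_followers.begin(), b_followers.end()));

    std::size_t i = 0, j = 0;
    while (i < a_friends.size() && j < b_followers.size()) {
        if (a_friends[i] == b_followers[j]) return true;  // shared middle user
        if (a_friends[i] < b_followers[j]) ++i;           // advance the smaller side
        else ++j;
    }
    return false;
}
```

For the sample data, user 1's friends are {3, 5, 7, 67, 78, 3332} and the only user who lists 10000 is 78, so `has_common_id({3, 5, 7, 67, 78, 3332}, {78})` finds 78 and reports that 10000 is a second-degree friend of 1.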

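The above method is typical, but there is another way that can answer the query more quickly. It is somewhat similar to idf. It requires a 10,000,000 x 10,000,000 matrix that records the friend information: e.g. user 1 has friends 3, 5, 7, ..., so the first row of the matrix is the vector [1, 0, 1, 0, 1, 0, 1, ...] (note that a user always counts as a friend of itself). Read top to bottom, column B then records which users have B as a friend.

To check whether A and B have the required relationship, we AND A's row with B's column (an OR would be nonzero whenever either vector contains a 1) and see whether any 1 exists in the result vector. If it contains one or more 1s, some user is both a friend of A and has B as a friend, so A and B have the required relationship. Note that with the diagonal set, a direct friend B of A also produces a 1, so the diagonal has to be left clear if only strict friend-of-a-friend matches should count.

If only the set bits are visited, i.e. the row and the column are kept as sorted ID lists, the check is O(50) or O(M), where M is the average number of friends. That's much faster than the above method, which is O(M^2). Scanning full dense bit vectors would instead cost O(N/w) word operations for a machine word of w bits. We only store 0 and 1 in this matrix, so each cell needs a single bit, but the dense matrix still has 10^14 bits, which is 10^14 / 8 = 1.25 x 10^13 bytes, or about 12.5 TB. So the space cost is low only when the matrix is stored sparsely.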
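To make the row-and-column check concrete, here is a small-scale sketch of the bit matrix that only makes sense for a few thousand users, given the size of the dense form. The class `FriendMatrix` and its method names are hypothetical. It combines row A and column B with AND, and unlike the description above it deliberately leaves the diagonal clear, so that a direct friend is not reported as a second-degree friend.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Dense n x n bit matrix: bit (a, b) is set when b is a friend of a.
class FriendMatrix {
public:
    explicit FriendMatrix(std::size_t n) : n_(n), bits_(n * n, false) {}

    void add_friend(std::size_t a, std::size_t b) {
        assert(a < n_ && b < n_);
        bits_[a * n_ + b] = true;
    }

    // AND row a with column b: true if some k has a -> k and k -> b.
    // This dense scan is O(n) per query, not O(M).
    bool is_second_degree(std::size_t a, std::size_t b) const {
        assert(a < n_ && b < n_);
        for (std::size_t k = 0; k < n_; ++k) {
            if (bits_[a * n_ + k] && bits_[k * n_ + b]) return true;
        }
        return false;
    }

private:
    std::size_t n_;
    std::vector<bool> bits_;  // one bit per cell, n * n bits in total
};
```

For example, after `add_friend(1, 3)` and `add_friend(3, 5)` on a `FriendMatrix m(8)`, `m.is_second_degree(1, 5)` is true while `m.is_second_degree(1, 3)` is false, because 3 is only a direct friend of 1.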

/  my reply ending  ///
 
