Reportedly a Baidu written-test question

magicblue

Category: Algorithm. Tags: Baidu, vector, pointers, matrix, iterator, construction

Article link: https://blog.csdn.net/magicblue/article/details/2065594

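I saw someone asking about these on a forum, found one of them interesting, and replied. I am posting my reply here as well.

The question is as follows: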

/ answer beginning ///

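Consider an online friend system. The system maintains a friend list for each user; a list can hold at most 500 friends, and every friend must be another user of the same system. Friendship is one-way: user B being a friend of user A does not imply that A is a friend of B.

Users are represented by IDs. The friend-list data is given in the following text form: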
1 3,5,7,67,78,3332
2 567,890
31 1,66
14 567
78 10000
…
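Each row has two columns: the first is a user ID and the second is that user's friend IDs, separated by "," and sorted in ascending order. The columns are separated by "\t".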

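Requirements:
Design a suitable index data structure to answer the following query:
Given users A and B, determine whether B is a second-degree friend of A (a friend of a friend).
In the example above, 10000 is a second-degree friend of 1, because 78 is a friend of 1 and 10000 is a friend of 78.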

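Explain your approach in detail and describe the key points of your implementation. Give pseudocode for building the index and for the query, and state the space and time complexity.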

Constraints:
At most 10 million users, with 50 friends each on average.

/ answer ending ///

/ my reply beginning ///

Construct a list of 10,000,000 nodes, where each node is defined as:

struct node{
    int cur_node;
    std::vector<node *> p_friend_vec;
    std::vector<node *> p_indirect_vec;
};

cur_node is the user's ID, from 1 to 10,000,000. p_friend_vec holds pointers to the nodes of this user's friends, and p_indirect_vec holds pointers to the nodes of users who have the current node as a friend; e.g. node 10000 has a pointer to 78 to indicate that 10000 is a friend of 78.

To construct this list, we first initialize a list of 10,000,000 nodes with cur_node values from 1 to 10,000,000, then read the text file row by row to index the friend relationships; I think this should be simple. With N users, this construction is O(N), or more precisely O(50N), in time and O(N), or O(100N), in space.
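The construction step could be sketched roughly as follows. This is only one possible reading of the description: the names friend_index, get_node and build_index are hypothetical, and the sketch assumes nodes are created on demand in a std::unordered_map rather than preallocated for all 10,000,000 IDs.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <unordered_map>
#include <vector>

struct node {
    int cur_node = 0;
    std::vector<node *> p_friend_vec;    // users this user lists as friends
    std::vector<node *> p_indirect_vec;  // users who list this user as a friend
};

// Hypothetical container: maps a user ID to its node. Pointers to elements of
// a std::unordered_map stay valid when further elements are inserted.
using friend_index = std::unordered_map<int, node>;

// Hypothetical helper: returns the node for `id`, creating it on first use.
node *get_node(friend_index &index, int id) {
    node &n = index[id];
    n.cur_node = id;
    return &n;
}

// Reads rows of the form "<user>\t<friend>,<friend>,..." and fills both
// pointer vectors of every node that appears.
void build_index(std::istream &in, friend_index &index) {
    std::string line;
    while (std::getline(in, line)) {
        std::size_t tab = line.find('\t');
        if (tab == std::string::npos) continue;  // skip rows without a tab
        int user_id = std::stoi(line.substr(0, tab));
        assert(user_id >= 1);  // IDs start at 1 in the problem statement
        node *user = get_node(index, user_id);
        std::istringstream friends(line.substr(tab + 1));
        std::string field;
        while (std::getline(friends, field, ',')) {
            node *f = get_node(index, std::stoi(field));
            user->p_friend_vec.push_back(f);     // forward edge: user -> f
            f->p_indirect_vec.push_back(user);   // reverse edge: f <- user
        }
    }
}
```

Each friend entry is touched once, which matches the O(50N) time estimate, and each relationship is stored twice (forward and reverse), which matches the O(100N) space estimate.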

To query the relationship between A and B, we iterate over A's p_friend_vec and B's p_indirect_vec and check whether some node is pointed to by both. Typically this is a double loop (here A and B are the two nodes):
for (std::vector<node *>::iterator i = A.p_friend_vec.begin(); i != A.p_friend_vec.end(); ++i)
{
    for (std::vector<node *>::iterator j = B.p_indirect_vec.begin(); j != B.p_indirect_vec.end(); ++j)
    {
        if ((*i)->cur_node == (*j)->cur_node)
        {
            // bingo! B is a friend of one of A's friends
        }
    }
}

This search is O(50²), or O(M²), where M is the average number of friends.
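Because the input lists friend IDs in ascending order, the same check can also be written as a single merge-style pass instead of a double loop. The sketch below is not part of the original reply; it assumes both lists are available as sorted ID vectors (the list of users who have B as a friend would need to be built in ascending user order or sorted once after construction), and the name has_common_id is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Returns true if the two ascending ID lists share at least one ID.
// `a_friends` stands for the IDs of A's friends and `b_owners` for the IDs of
// users who list B as a friend (both parameter names are hypothetical).
bool has_common_id(const std::vector<int> &a_friends,
                   const std::vector<int> &b_owners) {
    // Precondition assumed by this sketch: both lists are sorted ascending.
    assert(std::is_sorted(a_friends.begin(), a_friends.end()));
    assert(std::is_sorted(b_owners.begin(), b_owners.end()));
    std::size_t i = 0, j = 0;
    while (i < a_friends.size() && j < b_owners.size()) {
        if (a_friends[i] == b_owners[j]) return true;  // shared intermediate user
        if (a_friends[i] < b_owners[j]) ++i; else ++j; // advance the smaller side
    }
    return false;
}
```

Each step advances one of the two positions, so a query costs time proportional to the combined length of the two lists rather than their product.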

The method above is typical; however, there is another way to do this more quickly. This method is somewhat similar to idf. It requires a 10,000,000*10,000,000 matrix that records the friend information: e.g. 1 has friends 3, 5, 7, ..., so the first row of the matrix is the vector [1, 0, 1, 0, 1, 0, 1, ...] (notice that each user is always treated as a friend of itself, which means a direct friend of A also produces a match). The columns of the matrix record whose friend each user is. To see whether A and B have the required relationship, we 'AND' A's row with B's column and check whether any 1 exists in the result vector; if it contains one or more 1s, A and B have the required relationship. If only the nonzero entries of the row and the column are visited, the lookup is O(50), or O(M), where M is the average number of friends, which is much faster than the O(M²) method above. The matrix holds only 0s and 1s, so each entry needs one bit; stored densely, though, 10,000,000*10,000,000 bits is 10^14/8 = 1.25*10^13 bytes, about 12.5 TB, so the matrix has to be kept in a sparse form to keep the space low.
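The row-and-column idea can be illustrated on a toy scale with std::bitset. This is a sketch under stated assumptions, not the reply's actual implementation: kUsers is a hypothetical toy size, the type friend_matrix and its members are hypothetical names, the sketch combines the two vectors with a bitwise AND (a 1 in both is what indicates a shared intermediate user), and it leaves the diagonal unset so that only a genuine intermediate friend yields a match.

```cpp
#include <array>
#include <bitset>
#include <cassert>
#include <cstddef>

constexpr std::size_t kUsers = 16;  // toy size, assumed for illustration only

struct friend_matrix {
    // rows[a] has bit f set when a lists f as a friend;
    // cols[b] has bit f set when f lists b as a friend.
    std::array<std::bitset<kUsers>, kUsers> rows{};
    std::array<std::bitset<kUsers>, kUsers> cols{};

    // Records that user `a` has user `f` in its friend list.
    void add_friend(std::size_t a, std::size_t f) {
        assert(a < kUsers && f < kUsers);
        rows[a].set(f);
        cols[f].set(a);
    }

    // True when some user is both a friend of `a` and has `b` as a friend.
    bool is_second_degree(std::size_t a, std::size_t b) const {
        assert(a < kUsers && b < kUsers);
        return (rows[a] & cols[b]).any();
    }
};
```

For example, after add_friend(1, 7) and add_friend(7, 10), is_second_degree(1, 10) is true because bit 7 is set in both rows[1] and cols[10]. A dense bitset like this only works for small user counts; at the scale in the problem the rows and columns would need a sparse representation.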

/ my reply ending ///
