LeetCode Algorithm Study Log: 737 Sentence Similarity II

Zarlove

Article link: https://blog.csdn.net/Zarlove/article/details/78704215

LeetCode 737 Sentence Similarity II

Problem Statement

Given two sentences words1, words2 (each represented as an array of strings), and a list of similar word pairs pairs, determine if two sentences are similar.

For example, words1 = ["great", "acting", "skills"] and words2 = ["fine", "drama", "talent"] are similar, if the similar word pairs are pairs = [["great", "good"], ["fine", "good"], ["acting","drama"], ["skills","talent"]].

Note that the similarity relation is transitive. For example, if "great" and "good" are similar, and "fine" and "good" are similar, then "great" and "fine" are similar.

Similarity is also symmetric. For example, "great" and "fine" being similar is the same as "fine" and "great" being similar.

Also, a word is always similar with itself. For example, the sentences words1 = ["great"], words2 = ["great"], pairs = [] are similar, even though there are no specified similar word pairs.

Finally, sentences can only be similar if they have the same number of words. So a sentence like words1 = ["great"] can never be similar to words2 = ["doubleplus","good"].

Note:

The length of words1 and words2 will not exceed 1000.
The length of pairs will not exceed 2000.
The length of each pairs[i] will be 2.
The length of each words[i] and pairs[i][j] will be in the range [1, 20].

Problem Analysis

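pairs stores pairs of similar words, and the similarity relation is transitive and symmetric. For each pair of corresponding words in words1 and words2, check whether the relations in pairs imply that the two words are similar (identical words are always similar). If every such pair is similar, words1 and words2 are similar and the answer is true; otherwise it is false. Note that if words1 and words2 have different lengths, return false immediately.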

Solution Analysis

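For a pair of corresponding words in words1 and words2, deciding whether they are similar amounts to deciding whether a path connects them. Treating each word as a node, the problem becomes a graph question: are two given nodes connected? Two approaches are common. If only connectivity matters and no path has to be reported, use Union Find (the disjoint-set algorithm). If the path itself is required, DFS is the usual choice; BFS also works, but its queue can grow large and cost more memory, so it is mainly preferred when the shortest path is needed. This problem only asks whether two nodes are connected, so Union Find is used here. The algorithm is discussed briefly below.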
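For comparison, the DFS alternative mentioned above can be sketched as follows. This is only an illustrative sketch, not the approach this post adopts; the names Graph, buildGraph, and connected are assumptions introduced here.

```cpp
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical alias: adjacency list keyed by word.
using Graph = std::unordered_map<std::string, std::vector<std::string>>;

// Build an undirected graph: each similar pair becomes an edge in both directions.
Graph buildGraph(const std::vector<std::pair<std::string, std::string>>& pairs) {
    Graph g;
    for (const auto& p : pairs) {
        g[p.first].push_back(p.second);
        g[p.second].push_back(p.first);
    }
    return g;
}

// Iterative DFS: returns true if target is reachable from source.
bool connected(const Graph& g, const std::string& source, const std::string& target) {
    if (source == target) return true;  // a word is always similar to itself
    std::unordered_set<std::string> visited{source};
    std::vector<std::string> stack{source};
    while (!stack.empty()) {
        std::string cur = stack.back();
        stack.pop_back();
        auto it = g.find(cur);
        if (it == g.end()) continue;  // word does not appear in any pair
        for (const auto& next : it->second) {
            if (next == target) return true;
            if (visited.insert(next).second) stack.push_back(next);  // first visit
        }
    }
    return false;
}
```

Each call to connected may walk a whole component, so checking every word pair this way repeats work that Union Find does once while processing pairs.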

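The idea of Union Find is to use the given links between nodes to partition all nodes into connected components, where any two nodes in the same component are connected by a path; this step is called Union. To answer a query, compare the components of the two nodes: if they belong to the same component they are connected, otherwise they are not. Because a Union may merge every element of one component into another, the key question is how to represent the components. Usually each component (any two of its nodes are mutually reachable) is represented as a tree, and the tree is stored in an array or a map. This solution uses map<string,string> directParent to store the trees. For nodes a and b that are directly linked, that is, {a,b} appears in pairs, the root of either one can be made the parent of the other's root, which gradually builds a forest.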
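The forest representation described above can be illustrated with a minimal sketch. The helper names findRoot and linkWords are assumptions made for this example; only the idea of a map from a word to its direct parent comes from the post.

```cpp
#include <map>
#include <string>

// Follow parent links until reaching a word with no entry (or mapped to itself).
// Such a word is the root of its tree.
std::string findRoot(const std::map<std::string, std::string>& parent, std::string word) {
    auto it = parent.find(word);
    while (it != parent.end() && it->second != word) {
        word = it->second;
        it = parent.find(word);
    }
    return word;
}

// Union step for one similar pair: attach the root of a's tree under the root of b's tree.
void linkWords(std::map<std::string, std::string>& parent,
               const std::string& a, const std::string& b) {
    std::string rootA = findRoot(parent, a);
    std::string rootB = findRoot(parent, b);
    if (rootA != rootB) parent[rootA] = rootB;
}
```

For example, after linkWords(parent, "great", "good") and linkWords(parent, "fine", "good"), both "great" and "fine" have "good" as their root, so they fall in the same component, while a word that never appears in pairs remains its own root.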

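find(a) returns the root ancestor of node a (the root of the tree that contains it). To keep trees shallow, and so keep find cheap, union can always attach the lighter tree to the heavier one: a map records the number of nodes in the tree under each root and is updated after each union. The C++ code is as follows: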

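#include <map>
#include <string>
#include <utility>
#include <vector>
using namespace std;

class Solution {
public:
    map<string, string> directParent;  // maps a word to its direct parent; a root has no entry
    bool areSentencesSimilarTwo(vector<string>& words1, vector<string>& words2, vector<pair<string, string>> pairs) {
        if (words1.size() != words2.size())
            return false;
        unionString(pairs);
        for (size_t i = 0; i < words1.size(); i++) {
            if (find(words1[i]) != find(words2[i]))
                return false;
        }
        return true;
    }
    void unionString(const vector<pair<string, string>>& pairs) {
        for (const auto& p : pairs) {
            string rootA = find(p.first), rootB = find(p.second);
            if (rootA != rootB)
                directParent[rootA] = rootB;  // links the two trees arbitrarily, so a tree may become deep and unbalanced
        }
    }
    string find(const string& a) {
        // A word with no entry (or mapped to itself) is the root of its tree.
        auto it = directParent.find(a);
        if (it == directParent.end() || it->second == a)
            return a;
        return find(it->second);  // otherwise keep climbing toward the root
    }
};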

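find treats any word that has no entry in directParent as its own root, so no separate initialization step is needed. The code below adds node counting for each tree: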

#include <map>
#include <string>
#include <utility>
#include <vector>
using namespace std;

class Solution {
public:
    map<string, string> directParent;  // maps a word to its direct parent; a root has no entry
    map<string, int> countN;           // number of nodes in the tree under each root
    bool areSentencesSimilarTwo(vector<string>& words1, vector<string>& words2, vector<pair<string, string>> pairs) {
        if (words1.size() != words2.size())
            return false;
        unionString(pairs);
        for (size_t i = 0; i < words1.size(); i++) {
            if (find(words1[i]) != find(words2[i]))
                return false;
        }
        return true;
    }
    void unionString(const vector<pair<string, string>>& pairs) {
        for (const auto& p : pairs) {
            // Compute both roots once, before linking, so the counts are read from the correct trees.
            string rootA = find(p.first), rootB = find(p.second);
            if (rootA == rootB)
                continue;
            if (!countN.count(rootA))
                countN[rootA] = 1;
            if (!countN.count(rootB))
                countN[rootB] = 1;
            if (countN[rootA] >= countN[rootB]) {
                directParent[rootB] = rootA;  // attach the lighter tree under the heavier one
                countN[rootA] += countN[rootB];
            } else {
                directParent[rootA] = rootB;
                countN[rootB] += countN[rootA];
            }
        }
    }
    string find(const string& a) {
        // A word with no entry (or mapped to itself) is the root of its tree.
        auto it = directParent.find(a);
        if (it == directParent.end() || it->second == a)
            return a;
        return find(it->second);  // otherwise keep climbing toward the root
    }
};
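As a possible further refinement, find can also flatten the tree while it climbs, a technique usually called path compression. It is not part of the solutions above; the sketch below is a standalone illustration, and the names WordUnionFind, unite, and areSimilar are assumptions introduced here.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical helper: union-find over strings with union by size and path compression.
struct WordUnionFind {
    std::unordered_map<std::string, std::string> parent;
    std::unordered_map<std::string, int> size;

    std::string find(std::string w) {
        auto it = parent.find(w);
        if (it == parent.end()) {  // first time w is seen: it becomes its own root
            parent[w] = w;
            size[w] = 1;
            return w;
        }
        if (it->second == w) return w;
        std::string root = find(it->second);
        parent[w] = root;  // path compression: point w directly at the root
        return root;
    }

    void unite(const std::string& a, const std::string& b) {
        std::string ra = find(a), rb = find(b);
        if (ra == rb) return;
        if (size[ra] < size[rb]) std::swap(ra, rb);  // make ra the root of the larger tree
        parent[rb] = ra;
        size[ra] += size[rb];
    }
};

// Same check as areSentencesSimilarTwo, written against the helper above.
bool areSimilar(const std::vector<std::string>& words1,
                const std::vector<std::string>& words2,
                const std::vector<std::pair<std::string, std::string>>& pairs) {
    if (words1.size() != words2.size()) return false;
    WordUnionFind uf;
    for (const auto& p : pairs) uf.unite(p.first, p.second);
    for (std::size_t i = 0; i < words1.size(); i++) {
        if (uf.find(words1[i]) != uf.find(words2[i])) return false;
    }
    return true;
}
```

Called with the example from the problem statement, words1 = ["great", "acting", "skills"] and words2 = ["fine", "drama", "talent"], areSimilar returns true; with words1 = ["great"] and words2 = ["doubleplus", "good"] it returns false because the lengths differ.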