LeetCode 811. Subdomain Visit Count (a simple hash table exercise in C++ and Python)

Latest recommended article published 2022-10-05 09:31:45

重学CS

Categories: 刷题 (problem solving), Leetcode

Article link: https://blog.csdn.net/ha_ha_ha233/article/details/89396913


A website domain like “discuss.leetcode.com” consists of various subdomains. At the top level, we have “com”, at the next level, we have “leetcode.com”, and at the lowest level, “discuss.leetcode.com”. When we visit a domain like “discuss.leetcode.com”, we will also visit the parent domains “leetcode.com” and “com” implicitly.

Now, call a “count-paired domain” to be a count (representing the number of visits this domain received), followed by a space, followed by the address. An example of a count-paired domain might be “9001 discuss.leetcode.com”.

We are given a list cpdomains of count-paired domains. We would like a list of count-paired domains, (in the same format as the input, and in any order), that explicitly counts the number of visits to each subdomain.

Example 1:
Input:

["9001 discuss.leetcode.com"]

Output:

["9001 discuss.leetcode.com", "9001 leetcode.com", "9001 com"]

Explanation:
We only have one website domain: “discuss.leetcode.com”. As discussed above, the subdomain “leetcode.com” and “com” will also be visited. So they will all be visited 9001 times.

Example 2:
Input:

["900 google.mail.com", "50 yahoo.com", "1 intel.mail.com", "5 wiki.org"]

Output:

["901 mail.com","50 yahoo.com","900 google.mail.com","5 wiki.org","5 org","1 intel.mail.com","951 com"]

Explanation:
We will visit “google.mail.com” 900 times, “yahoo.com” 50 times, “intel.mail.com” once and “wiki.org” 5 times. For the subdomains, we will visit “mail.com” 900 + 1 = 901 times, “com” 900 + 50 + 1 = 951 times, and “org” 5 times.

Notes:
The length of cpdomains will not exceed 100.
The length of each domain name will not exceed 100.
Each address will have either 1 or 2 “.” characters.
The input count in any count-paired domain will not exceed 10000.
The answer output can be returned in any order.
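
To make the input format concrete, here is a minimal sketch of parsing one count-paired domain. The helper name `parse_cpdomain` is my own and not part of the problem; the sketch assumes the input is well formed, as the notes above guarantee.

```python
def parse_cpdomain(entry):
    """Split a count-paired domain such as "9001 discuss.leetcode.com"."""
    # The count and the address are separated by exactly one space.
    count_text, address = entry.split(" ", 1)
    return int(count_text), address


count, address = parse_cpdomain("9001 discuss.leetcode.com")
print(count, address)  # 9001 discuss.leetcode.com
```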

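Approach: this is a straightforward hash table exercise. Split the domain on "." and, for the full domain and each of its parent domains, store the name as the key and the visit count as the value in a hash table. If a key already exists, add the new count to its value instead of overwriting it.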
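
The idea can be sketched in a few lines of Python. The helper names `domain_suffixes` and `add_visits` are made up for illustration; this is only one way to express the approach, not a required structure.

```python
def domain_suffixes(address):
    """Return the address followed by each of its parent domains."""
    labels = address.split(".")
    return [".".join(labels[i:]) for i in range(len(labels))]


def add_visits(counts, address, visits):
    """Add `visits` to every suffix of `address` in the dict `counts`."""
    for name in domain_suffixes(address):
        # get() returns 0 for a new key, so an existing total is
        # added to rather than overwritten.
        counts[name] = counts.get(name, 0) + visits


counts = {}
add_visits(counts, "google.mail.com", 900)
add_visits(counts, "intel.mail.com", 1)
print(counts["mail.com"])  # 901
```

Both addresses share the parent domain "mail.com", so its total is 900 + 1, which matches the explanation of Example 2.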

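This mostly comes down to knowing the STL. I wrote my own split function here, and its logic is more involved than it needs to be, which makes the solution somewhat slow.
Also, unordered_map turned out to be faster than map here.
Reason: in the STL, map is typically implemented as a red-black tree, an approximately balanced binary search tree that keeps its keys in sorted order, so a lookup takes O(log N) time. unordered_map is a hash table, whose lookups take O(1) time on average, usually at the cost of noticeably more memory. So use unordered_map when fast lookups matter, and use map when memory is tight or the data has to stay ordered. (Reference)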

#include <string>
#include <unordered_map>
#include <vector>
using namespace std;

class Solution {
public:
    vector<string> subdomainVisits(vector<string>& cpdomains) {
        vector<string> ans;  // result list
        unordered_map<string, int> dic;  // domain name -> total visit count
        for (const string& domain : cpdomains) {
            vector<string> tmp = split(domain, " ");
            int count = stoi(tmp[0]);  // visit count

            vector<string> sub_domains = split(tmp[1], ".");
            for (size_t j = 0; j < sub_domains.size(); j++) {
                // Rebuild the suffix that starts at label j, e.g. "leetcode.com".
                string domain_name = sub_domains[j];
                for (size_t k = j + 1; k < sub_domains.size(); k++) {
                    domain_name += "." + sub_domains[k];
                }
                // operator[] value-initializes a missing key to 0,
                // so += handles both new and existing keys.
                dic[domain_name] += count;
            }
        }
        for (const auto& entry : dic) {
            ans.push_back(to_string(entry.second) + " " + entry.first);
        }
        return ans;
    }
    
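    // Split str on every occurrence of delim.
    vector<string> split(const string &str, const string &delim){
        vector<string> vec;
        size_t start = 0;
        size_t index;
        // find returns string::npos when delim no longer occurs.
        while ((index = str.find(delim, start)) != string::npos) {
            vec.push_back(str.substr(start, index - start));
            start = index + delim.size();  // skip the whole delimiter
        }
        vec.push_back(str.substr(start));
        return vec;
    }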
};

In Python a plain dict does the job; the rest is just string manipulation.

class Solution(object):
    def subdomainVisits(self, cpdomains):
        """
        :type cpdomains: List[str]
        :rtype: List[str]
        """
        dic = {}
        for domain in cpdomains:
            cnt, name = domain.split()
            cnt = int(cnt)
            list_name = name.split('.')
            for i in range(len(list_name)):
                sep = '.'
                tmp_name = sep.join(list_name[i:])
                if tmp_name not in dic:
                    dic[tmp_name] = cnt
                else:
                    dic[tmp_name] += cnt
        ans = ["{} {}".format(value, key) for key, value in dic.items()]
        return ans
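
As a possible variation, `collections.Counter` from the standard library removes the explicit "key already present" branch, because a missing key counts as 0. The standalone function name `subdomain_visits` is my own; the check at the end compares sorted lists since the answer may be returned in any order.

```python
from collections import Counter


def subdomain_visits(cpdomains):
    totals = Counter()
    for entry in cpdomains:
        cnt, name = entry.split()
        labels = name.split(".")
        for i in range(len(labels)):
            # A missing key in a Counter reads as 0, so += always works.
            totals[".".join(labels[i:])] += int(cnt)
    return ["{} {}".format(value, key) for key, value in totals.items()]


# Example 2 from the problem statement.
result = subdomain_visits(
    ["900 google.mail.com", "50 yahoo.com", "1 intel.mail.com", "5 wiki.org"]
)
expected = [
    "901 mail.com", "50 yahoo.com", "900 google.mail.com", "5 wiki.org",
    "5 org", "1 intel.mail.com", "951 com",
]
print(sorted(result) == sorted(expected))  # True
```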

The runtime seems acceptable.

Appendix: a well-written C++ solution from the LeetCode discussion forum

class Solution {
public:
    vector<string> subdomainVisits(vector<string>& cpdomains) {
        vector<string> res;
        unordered_map<string, int> m;
        for(auto &i : cpdomains){
            // stoi (C++11) parses the leading integer and stops at the
            // first character that cannot be part of it, here the space.
            int num = stoi(i);
            int lo = i.size() - 1;
            // Scan from the end of the string back to the space.
            while(i[lo] != ' '){
                if(i[lo] == '.')
                    m[i.substr(lo + 1)] += num;// found a parent domain
                lo--;
            }
            m[i.substr(lo + 1)] += num;// the full domain after the space
        }
        for(auto &i : m)
            res.push_back(to_string(i.second) + " " + i.first);
        return res;
    }
};
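
For comparison, the same right-to-left scan can be sketched in Python. This is my own port of the idea, not code from the discussion forum, and the function name `subdomain_visits_scan` is made up. It avoids splitting the address into labels: every "." (and finally the space) marks the start of one suffix.

```python
def subdomain_visits_scan(cpdomains):
    totals = {}
    for entry in cpdomains:
        space = entry.index(" ")
        visits = int(entry[:space])
        # Walk backward from the last character down to the space itself.
        for pos in range(len(entry) - 1, space - 1, -1):
            if entry[pos] in ". ":
                # Everything after this separator is one domain to count.
                name = entry[pos + 1:]
                totals[name] = totals.get(name, 0) + visits
    return ["{} {}".format(value, key) for key, value in totals.items()]


result = subdomain_visits_scan(["9001 discuss.leetcode.com"])
expected = ["9001 discuss.leetcode.com", "9001 leetcode.com", "9001 com"]
print(sorted(result) == sorted(expected))  # True
```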