49. Group Anagrams (Homework: Anagram detection, grouping strings built from the same letters into a dictionary)

微辣不香

Categories: Homework, LeetCode

Article link: https://blog.csdn.net/weixin_45659871/article/details/116465310

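This article shows how to find anagrams efficiently in a large body of text using two approaches, sorting and counting. In Python, presorting each word and grouping with a hash map quickly finds the sets of anagrams in a dictionary. It also walks through solutions to LeetCode problem 49, using the groupingBy collector of Java's Stream API and an encoding of a count array.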

Homework 6.1.11 Anagram detection

Design an efficient algorithm for finding all sets of anagrams in a large file such as a dictionary of English words [Ben00]. For example, eat, ate, and tea belong to one such set.

Python

def presort(testDict):
    pre_dict = []

    for cur_word in testDict:
        # Sort the letters alphabetically, then join them back into a string.
        cur_word = "".join(sorted(cur_word))
        pre_dict.append(cur_word)
    return pre_dict

def begin(testDict):
    pre = presort(testDict)
    print(pre)

    hashmap = {}  # hash table: sorted key -> words that share that key
    for i in range(len(pre)):
        aph = pre[i]
        if aph in hashmap:
            hashmap[aph].append(testDict[i])
        else:
            hashmap[aph] = [testDict[i]]
    print(hashmap)
    # To print the key of the largest anagram group:
    # print(max(hashmap, key=lambda k: len(hashmap[k])))

if __name__ == "__main__":
    testDict = ["eta","eat","tea","mane","mean","tik","kit","cup","upc","mary","army"]
    begin(testDict)
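
For the homework's actual setting, a large word file, the same presort-and-hash idea can be applied one line at a time. The following is a minimal sketch, not the original author's code: the function name `find_anagram_sets`, the lowercasing step, and the choice to drop words that have no anagram partner are all assumptions made for illustration.

```python
from collections import defaultdict


def find_anagram_sets(lines):
    """Group words by their sorted-letter signature.

    `lines` is any iterable of strings with one word per line
    (an open file works). Hypothetical helper, for illustration only.
    """
    groups = defaultdict(list)
    for line in lines:
        word = line.strip().lower()  # assumption: case is ignored
        if word:  # skip blank lines
            groups["".join(sorted(word))].append(word)
    # Assumption: a "set of anagrams" needs at least two words.
    return [group for group in groups.values() if len(group) > 1]


if __name__ == "__main__":
    sample = ["eta", "eat", "tea", "mane", "mean", "cup", "zebra"]
    print(find_anagram_sets(sample))
```

Sorting a word of length k costs O(k log k), so for n words the whole pass is O(n k log k) time plus the memory needed to hold the groups.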

LeetCode 49

Given an array of strings strs, group the anagrams together. You can return the answer in any order.

An Anagram is a word or phrase formed by rearranging the letters of a different word or phrase, typically using all the original letters exactly once.

Example 1:
Input: strs = ["eat","tea","tan","ate","nat","bat"]
Output: [["bat"],["nat","tan"],["ate","eat","tea"]]

Example 2:
Input: strs = [""]
Output: [[""]]

Example 3:
Input: strs = ["a"]
Output: [["a"]]

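Method 1: Sorting

Author: sweetiee
Link: https://leetcode-cn.com/problems/group-anagrams/solution/kan-wo-yi-ju-hua-ac-zi-mu-yi-wei-ci-fen-yrnis/

Strings that contain the same letters in a different order always become identical once sorted: each letter occurs the same number of times in both, so the sorted strings must be equal.

Here the stream groupingBy collector can be used to return the result directly:

Java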

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class Solution {
    public List<List<String>> groupAnagrams(String[] strs) {
        return new ArrayList<>(Arrays.stream(strs)
            .collect(Collectors.groupingBy(str -> {
                // Return str with its characters sorted.
                // Grouping by the sorted result works like GROUP BY in SQL.
                char[] array = str.toCharArray();
                Arrays.sort(array);
                return new String(array);
            })).values());
    }
}

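Note that groupingBy returns a Map<String, List<String>>: each key is a sorted string and each value is the list of original strings that share it. Only the values matter here, so the last step is new ArrayList<>(map.values()).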
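
To make the shape of that intermediate map concrete, here is a rough Python analogue of the grouping step. It is a sketch under assumptions, not a translation of the Java API: `group_by_sorted_key` is a hypothetical name, and a plain dict with `setdefault` stands in for the collector.

```python
def group_by_sorted_key(strs):
    """Map each sorted-letter key to the original strings that share it."""
    groups = {}  # sorted string -> original strings with that key
    for s in strs:
        key = "".join(sorted(s))
        groups.setdefault(key, []).append(s)
    return groups


if __name__ == "__main__":
    grouped = group_by_sorted_key(["eat", "tea", "tan", "ate", "nat", "bat"])
    print(grouped)                 # the full map, keys included
    print(list(grouped.values()))  # only the values, as in the Java answer
```

The keys exist only to bring anagrams together; once grouping is done they are discarded, which is why the Java solution wraps `values()` in a new list.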

class Solution {
    public List<List<String>> groupAnagrams(String[] strs) {
        // str -> IntStream -> sorted -> collected into a StringBuilder
        return new ArrayList<>(Arrays.stream(strs)
            .collect(Collectors.groupingBy(str -> str.chars()
                .sorted()
                .collect(StringBuilder::new, StringBuilder::appendCodePoint, StringBuilder::append)
                .toString()))
            .values());
    }
}

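Method 2: Counting

Count the letters of each string to get its count array; strings with identical count arrays are anagrams of one another.
Array types do not override hashCode() and equals(), so an array cannot be used directly as a HashMap key for grouping. Instead, encode the array by hand as a string.
For example, encode [b,a,a,a,b,c] as a3b2c1 and use the encoded string as the HashMap key for grouping.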

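import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class Solution {
    public List<List<String>> groupAnagrams(String[] strs) {
        return new ArrayList<>(Arrays.stream(strs)
            .collect(Collectors.groupingBy(str -> {
                // counter[i] holds how often the letter 'a' + i occurs.
                // This assumes str contains only lowercase letters a-z.
                int[] counter = new int[26];
                for (int i = 0; i < str.length(); i++) {
                    counter[str.charAt(i) - 'a']++;
                }
                StringBuilder sb = new StringBuilder();
                for (int i = 0; i < 26; i++) {
                    // This if could be omitted, but with it the generated sb is shorter, so the later groupingBy is faster.
                    if (counter[i] != 0) {
                        sb.append((char) ('a' + i));
                        sb.append(counter[i]);
                    }
                }
                return sb.toString();
            })).values());
    }
}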
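
The count-array encoding described above can also be checked in isolation. The sketch below is an illustration in Python with hypothetical names `count_key` and `group_by_count`; like the Java version, it assumes every string contains only the lowercase letters a to z.

```python
def count_key(s):
    """Encode the letter counts of s as a string, e.g. 'baaabc' -> 'a3b2c1'."""
    counter = [0] * 26
    for ch in s:
        counter[ord(ch) - ord("a")] += 1  # assumes 'a' <= ch <= 'z'
    # Skip letters that never occur so the key stays short.
    return "".join(
        f"{chr(ord('a') + i)}{n}" for i, n in enumerate(counter) if n
    )


def group_by_count(strs):
    """Group strings whose encoded letter counts are identical."""
    groups = {}
    for s in strs:
        groups.setdefault(count_key(s), []).append(s)
    return list(groups.values())


if __name__ == "__main__":
    print(count_key("baaabc"))
    print(group_by_count(["eat", "tea", "tan", "ate", "nat", "bat"]))
```

In Python a tuple is hashable, so `tuple(counter)` could serve as the dictionary key directly; the string encoding is kept here only to mirror the Java solution. Building the key takes one pass over the string plus a fixed 26 steps, compared with O(k log k) for sorting a string of length k.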