代码随想录 Algorithm Training Camp Day 6 | 242. Valid Anagram | 349. Intersection of Two Arrays | 202. Happy Number | 1. Two Sum

Kolbe_Huang

Article link: https://blog.csdn.net/Kolbe_Huang/article/details/131758148

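This article discusses how hash tables are applied to programming problems such as checking for anagrams, finding the intersection of two arrays, and detecting happy numbers. A hash table stores elements and looks them up quickly by combining a hash function with a collision-handling scheme (such as separate chaining or linear probing). The article gives guidelines for choosing a hash structure (array, set, dict, or Counter) in different situations and provides Python solutions to the specific problems.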

Hash Table

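A hash table is a data structure that is accessed directly by key. It is typically used to check quickly whether an element is in a collection: compute the index from the key, then access that slot directly.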

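Hash function: a function $f$: input $\rightarrow$ index
Hash collision: different elements map to the same index (unavoidable when, e.g., data size > table size), so the index alone no longer identifies the element
- Separate chaining: each index points to a linked list that stores all elements sharing that index
  - Choose the hash table size and the hash function carefully; otherwise many indices stay empty and waste memory, while a few chains grow long and slow down lookups
- Linear probing: the table size must be larger than the data size
  - On a collision, scan forward from the current position until the next empty slot, and store the colliding element there
  - Different probing strategies (linear, quadratic, ...) give different performance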
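
The two collision strategies above can be sketched as follows. This is a minimal illustration, not how Python's built-in set works: the class names `ChainedSet` and `ProbingSet` are hypothetical, the hash function is assumed to be `key % table_size`, and Python lists stand in for the linked lists.

```python
class ChainedSet:
    """Hash set with separate chaining: each bucket holds all keys sharing an index."""

    def __init__(self, size: int = 8) -> None:
        self.buckets = [[] for _ in range(size)]

    def _index(self, key: int) -> int:
        # Assumed hash function: key modulo table size.
        return key % len(self.buckets)

    def add(self, key: int) -> None:
        bucket = self.buckets[self._index(key)]
        if key not in bucket:
            bucket.append(key)

    def __contains__(self, key: int) -> bool:
        # Only the one bucket that the key hashes to is scanned.
        return key in self.buckets[self._index(key)]


class ProbingSet:
    """Hash set with linear probing: a collision moves on to the next free slot."""

    def __init__(self, size: int = 8) -> None:
        self.slots = [None] * size
        self.count = 0

    def add(self, key: int) -> None:
        if key in self:
            return
        # Keep at least one slot empty so that probing always terminates.
        if self.count >= len(self.slots) - 1:
            raise OverflowError("table size must stay larger than data size")
        i = key % len(self.slots)
        while self.slots[i] is not None:
            i = (i + 1) % len(self.slots)   # step forward, wrapping around
        self.slots[i] = key
        self.count += 1

    def __contains__(self, key: int) -> bool:
        i = key % len(self.slots)
        while self.slots[i] is not None:    # an empty slot ends the search
            if self.slots[i] == key:
                return True
            i = (i + 1) % len(self.slots)
        return False
```

With a table of size 4, the keys 1, 5 and 9 all hash to index 1: the chained version stores them in one bucket, while the probing version spreads them over slots 1, 2 and 3.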

When to use hashing

Whenever you need to check whether an element has appeared before, or whether an element is in a collection.

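How to choose a hash structure (Python)

Array:
- Use case: usable whenever the range of element values is bounded
- Drawback: if there are few values and they are scattered over a very large range, an array wastes a great deal of space.
set:
- Use case: consider it when the values are scattered over a large range
- Drawback: a set takes more space than an array and is also slower, because every value must be hashed to locate its key.
- set methods:
  - a.add(...)
  - a & b: returns the elements common to a and b
mapping:
- Use case: a clear key-to-value mapping; dict subclasses such as Counter make it convenient to implement
- Drawback: in most scenarios an array or a set does the job better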
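
As a rough illustration of the three choices above, the snippet below uses each structure on made-up sample data. The variable names and values are only for demonstration.

```python
from collections import Counter

# Array: the values are known to lie in a small fixed range (digits 0..9),
# so the value itself serves as the index.
digits = [3, 1, 3, 7]
seen = [0] * 10
for d in digits:
    seen[d] += 1

# set: the values are sparse and spread over a huge range,
# so an array indexed by value would waste space.
a = {5, 1_000_000_007, -42}
a.add(8)
b = {8, -42, 99}
common = a & b          # elements present in both a and b

# mapping: every key carries a value; Counter maps each element to its count.
counts = Counter("banana")
```

Here `seen[3]` ends up as 2, `common` holds 8 and -42, and `counts["a"]` is 3.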

242. Valid Anagram

Problem link | Solution approach

ord(): takes a string containing a single Unicode character and returns its integer Unicode code point.
Getting the Unicode value of the input is very useful when building a hash table!
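
A small sketch of how ord() turns a letter into a table index. The helper name `letter_index` is hypothetical, and the input is assumed to be a lowercase English letter.

```python
import string


def letter_index(ch: str) -> int:
    """Map a lowercase English letter to 0..25 (a -> 0, z -> 25)."""
    return ord(ch) - ord("a")


# ord(ch) % 26 also gives 26 distinct slots, because the code points of
# 'a'..'z' are consecutive; the slots just start at 19 instead of 0.
mod_slots = {ord(ch) % 26 for ch in string.ascii_lowercase}
```

Subtracting `ord("a")` keeps the slots in alphabetical order, which can make the table easier to read while debugging.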

Array - Hash Table

Time complexity: $O(n)$
Space complexity: $O(1)$

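class Solution:
    def isAnagram(self, s: str, t: str) -> bool:
        # One counter per lowercase English letter. The 26 letters have
        # consecutive code points, so ord(letter) % 26 gives each its own slot.
        table = [0] * 26
        for letter in s:
            table[ord(letter) % 26] += 1    # count the letters of s
        for letter in t:
            table[ord(letter) % 26] -= 1    # cancel them with the letters of t
        # Anagrams leave every counter at exactly zero.
        return all(count == 0 for count in table)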

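Dictionary - Hash Table

Time complexity of comparing two defaultdict objects for equality:

Internally, the comparison first checks whether the two dictionaries have the same size. If they do not, it returns False immediately, which takes $O(1)$.
If the sizes match, it iterates over one dictionary and checks whether each key-value pair also exists in the other, which takes $O(n)$.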
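
One caveat worth keeping in mind when relying on this comparison: reading a missing key from a defaultdict inserts it, which changes the size and therefore the result of `==`. The sketch below uses made-up sample strings to show the effect.

```python
from collections import defaultdict

s_dict = defaultdict(int)
t_dict = defaultdict(int)
for ch in "ab":
    s_dict[ch] += 1
for ch in "ba":
    t_dict[ch] += 1

equal_before = s_dict == t_dict     # insertion order does not matter

_ = s_dict["z"]                     # reading a missing key inserts "z": 0
equal_after = s_dict == t_dict      # sizes now differ (3 vs 2)

safe_lookup = t_dict.get("z", 0)    # .get() reads without inserting
```

Using `.get()` or `in` for lookups keeps the dictionaries unchanged between the counting step and the final comparison.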

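from collections import defaultdict

class Solution:
    def isAnagram(self, s: str, t: str) -> bool:
        # Missing keys default to 0, so += 1 needs no existence check.
        s_dict = defaultdict(int)
        t_dict = defaultdict(int)
        for x in s:
            s_dict[x] += 1
        for x in t:
            t_dict[x] += 1
        # Equal only if both hold the same letters with the same counts.
        return s_dict == t_dict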

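Counter - Hash Table

Counter is a subclass of dict that is mainly used for counting.
A Counter can be created from iterable objects such as a list, string, or tuple, and also from a mapping such as a dict.

Some properties of Counter:

dict keys are unique, so if a dict literal passed to Counter repeats a key, only the last key-value pair for that key is kept.
Looking up an element that is not in the Counter does not raise an exception; it returns 0, which matches the meaning of a count.
Counter allows the value in a key-value pair to be negative.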
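
The properties listed above can be checked with a few lines. The sample inputs here are made up for illustration.

```python
from collections import Counter

# The same counts built from a string, a list, and a mapping.
from_string = Counter("aab")
from_list = Counter(["a", "a", "b"])
from_mapping = Counter({"a": 2, "b": 1})

# A missing element counts as 0; no KeyError is raised and no key is added.
missing = from_string["z"]

# Counts may go negative, for example after subtract().
diff = Counter("ab")
diff.subtract("abb")    # a: 1 - 1, b: 1 - 2
```

After this runs, `diff["b"]` is -1, which a plain occurrence count could never produce but Counter accepts.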

Reference link for Counter

from collections import Counter

class Solution:
    def isAnagram(self, s: str, t: str) -> bool:
        # Counter builds the letter counts directly from each string.
        return Counter(s) == Counter(t)

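349. Intersection of Two Arrays

Problem link | Solution approach

Problem traits: unique elements, order irrelevant. Keywords like these should make set the first choice.

The selection criteria above also apply, because this problem originally placed no bound on the values.

Dictionary - Hash Table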

from typing import List

class Solution:
    def intersection(self, nums1: List[int], nums2: List[int]) -> List[int]:
        # Record every value that appears in nums1.
        records = {}
        for num in nums1:
            if num not in records:
                records[num] = 1

        # A set keeps each common value only once.
        results = set()
        for num in nums2:
            if num in records:
                results.add(num)

        return list(results)

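Array - Hash Table

The space complexity is very high: if the problem did not bound the value range, the array solution would very likely fail to pass.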

from typing import List

class Solution:
    def intersection(self, nums1: List[int], nums2: List[int]) -> List[int]:
        # Size the tables by the overall value range; both inputs are non-empty.
        max_value = max(max(nums1), max(nums2))
        min_value = min(min(nums1), min(nums2))
        records1 = [0] * (max_value - min_value + 1)
        records2 = [0] * (max_value - min_value + 1)
        results = []

        # Shift by min_value so the smallest value lands at index 0.
        for num in nums1:
            records1[num - min_value] += 1
        for num in nums2:
            records2[num - min_value] += 1
        # A value is in the intersection if both arrays contain it.
        for i in range(len(records1)):
            if records1[i] > 0 and records2[i] > 0:
                results.append(i + min_value)
        return results

set - Hash Table (best solution)

from typing import List

class Solution:
    def intersection(self, nums1: List[int], nums2: List[int]) -> List[int]:
        # Converting to sets removes duplicates; & keeps the common values.
        return list(set(nums1) & set(nums2))

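202. Happy Number

Problem link | Solution approach

The problem mentions a sum that "appears repeatedly", which means checking whether an element is already in a collection: use a hash!

Time complexity: $O(\log{n})$
Space complexity: $O(\log{n})$

A number $n$ has $\lfloor \log_{10}{n} \rfloor + 1$ digits, so the sum of the squares of its digits is at most $9^2 (\lfloor \log_{10}{n} \rfloor + 1)$. For a sufficiently large $n$, every later value in the sequence stays within this bound, so the hash table stores $O(\log{n})$ values.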
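
The bound can be checked numerically. The helper names `digit_square_sum` and `digit_bound` below are hypothetical, and the range tested is an arbitrary choice for illustration.

```python
def digit_square_sum(n: int) -> int:
    """Sum of the squares of the decimal digits of n."""
    return sum(int(d) ** 2 for d in str(n))


def digit_bound(n: int) -> int:
    """Upper bound 9^2 * (number of digits of n)."""
    return 81 * len(str(n))


# The bound is tight for all-nines inputs: 999 -> 81 + 81 + 81.
tight_case = digit_square_sum(999)

# Every n in a sample range respects the bound.
bound_holds = all(digit_square_sum(n) <= digit_bound(n) for n in range(1, 10000))

# Following the sequence from 19: 19 -> 82 -> 68 -> 100 -> 1.
trace = [19]
while trace[-1] != 1:
    trace.append(digit_square_sum(trace[-1]))
```

Since a 4-digit number maps to at most 324, the sequence drops below 1000 quickly and then stays in a small range, which is why the set of seen values remains small.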

set - Hash Table

class Solution:
    def digit_square_sum(self, n: int) -> int:
        total = 0               # avoids shadowing the built-in sum()
        while n != 0:
            total += (n % 10) ** 2
            n //= 10
        return total

    def isHappy(self, n: int) -> bool:
        records = set()         # values already seen in the sequence
        curr_num = n
        while True:
            if curr_num == 1:
                return True
            if curr_num in records:
                return False    # a repeated value means a cycle that never reaches 1
            records.add(curr_num)
            curr_num = self.digit_square_sum(curr_num)
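1. Two Sum

Problem link | Solution approach

An array has a limited size, and if there are few elements but the hash values are large, memory is wasted.
A set stores only keys. This problem needs not only to check whether y exists but also to record the index of y, because the indices of x and y must be returned. So a set cannot be used either.

Key points of this problem:

Why think of a hash table at all
Why the hash table should be a map
What the map stores in this problem
What the keys and the values of the map hold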
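
One way to answer these questions is to watch the map while the array is scanned. The sketch below assumes the map stores each visited value as the key and its index as the value; the function name `two_sum_trace` and the snapshot list are hypothetical additions for illustration only.

```python
from typing import Dict, List, Tuple


def two_sum_trace(nums: List[int], target: int) -> Tuple[List[int], List[Dict[int, int]]]:
    """Return the answer plus a copy of the map taken before each lookup."""
    seen: Dict[int, int] = {}       # key: visited value, value: its index
    snapshots: List[Dict[int, int]] = []
    for idx, value in enumerate(nums):
        snapshots.append(dict(seen))
        complement = target - value
        if complement in seen:      # the lookup is by value ...
            return [seen[complement], idx], snapshots   # ... and yields an index
        seen[value] = idx
    return [], snapshots


answer, history = two_sum_trace([2, 7, 11, 15], 9)
```

For this input the map is empty at the first step and holds `{2: 0}` at the second, where the complement 2 is found and the stored index 0 is returned together with the current index 1.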

Sorting + Two Pointers

Be sure to remember the handy use of lambda functions: a.sort(key=lambda x:x[1])
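
A short illustration of the key= idiom on made-up data: each pair holds the original index and the value, and the lambda tells sort() to order the pairs by the value.

```python
nums = [3, 2, 4]

# Pair every value with its original index: [[0, 3], [1, 2], [2, 4]].
pairs = [[i, v] for i, v in enumerate(nums)]

# Sort in place by the second field (the value); the indices travel along.
pairs.sort(key=lambda x: x[1])

# sorted() accepts the same key and returns a new list instead.
by_value_desc = sorted(pairs, key=lambda x: x[1], reverse=True)
```

Keeping the index inside each pair is what lets the two-pointer solution return positions from the original, unsorted array.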

from typing import List

class Solution:
    def twoSum(self, nums: List[int], target: int) -> List[int]:
        # Pair each value with its original index, then sort by value.
        records = [[i, num] for i, num in enumerate(nums)]
        records.sort(key=lambda x: x[1])

        head_idx = 0
        tail_idx = len(nums) - 1
        # Strict <, so the same element is never used twice.
        while head_idx < tail_idx:
            pair_sum = records[head_idx][1] + records[tail_idx][1]
            if pair_sum < target:
                head_idx += 1
            elif pair_sum > target:
                tail_idx -= 1
            else:
                return [records[head_idx][0], records[tail_idx][0]]
        return []   # no pair found (the problem guarantees one exists)

Dictionary (standard solution)

Concise: iterate over the array and check whether the required complement element has already been seen.

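from typing import List

class Solution:
    def twoSum(self, nums: List[int], target: int) -> List[int]:
        records = {}    # value -> index of the elements visited so far

        for idx, value in enumerate(nums):
            if target - value in records:
                return [records[target - value], idx]
            records[value] = idx
        return []       # no pair found (the problem guarantees one exists)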