Introduction to Algorithms (Table Doubling, Karp-Rabin)

长安一片月噢

How Large Should the Table Be?

We want m = Θ(n) at all times, where n is the number of stored items and m is the table size.

Idea

Start small (constant size) and grow (or shrink) as necessary.

Rehashing

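To grow or shrink the table, the hash function must change, because it depends on m.

So the hash table must be rebuilt from scratch: every item is reinserted into the new table.
This takes Θ(n + m) time, which is Θ(n) if m = Θ(n).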

How fast to grow

When n reaches m, say:

m += 1: rebuild at every step, so n inserts cost Θ(1 + 2 + ... + n) = Θ(n^2)
m *= 2: rebuild at insertion 2^i, so n inserts cost Θ(1 + 2 + 4 + ... + n) = Θ(n)
A few inserts cost linear time, but each insert costs Θ(1) “on average”.
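
The gap between the two growth rules can be checked by counting how many items the rebuilds copy. The following is a minimal sketch, not part of the original notes: the helper name `rebuild_cost` and the starting size m = 1 are assumptions chosen for illustration.

```python
def rebuild_cost(n, grow):
    """Total items copied by rebuilds over n inserts, starting from m = 1.

    `grow` maps the old table size m to the new one. A rebuild is
    triggered when the item count reaches m and copies every item.
    """
    m, copied = 1, 0
    for count in range(1, n + 1):   # count = number of items after this insert
        if count == m:              # table is full: rebuild into a larger one
            copied += count
            m = grow(m)
    return copied


additive = rebuild_cost(1024, lambda m: m + 1)   # m += 1: rebuild every step
doubling = rebuild_cost(1024, lambda m: m * 2)   # m *= 2: rebuild at powers of 2
print(additive, doubling)  # 524800 2047
```

With n = 1024, the additive rule copies 1 + 2 + ... + 1024 = 524800 items, while doubling copies 1 + 2 + 4 + ... + 1024 = 2047, fewer than 2n.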

Amortized Analysis

This is a common technique in data structures

An operation has amortized cost T(n) if any k operations cost ≤ k · T(n) in total.
“T(n) amortized” roughly means T(n) “on average”, but averaged over all operations in the sequence.
E.g., inserting into a hash table takes O(1) amortized time.

Back to hashing

Maintaining m = Θ(n) ⇒ load factor α = n/m = Θ(1) ⇒ search runs in O(1) expected time (assuming simple uniform or universal hashing).

Deletion

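Deletion also takes O(1) expected time as is.

However, space can get big with respect to n, e.g. after n inserts followed by n deletes the table still has Θ(n) slots but holds no items.
Solution: when n decreases to m/4, shrink the table to half its size ⇒ O(1) amortized cost for both insert and delete.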
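
To make the grow and shrink rules concrete, here is a minimal sketch of a chained hash table that doubles when n reaches m and halves when n drops to m/4. The class name `ResizingTable`, the minimum size of 8, and the use of Python's built-in `hash` in place of a universal hash function are all assumptions of this sketch.

```python
class ResizingTable:
    """Chained hash table (a set of keys) that keeps m = Θ(n)."""

    MIN_SIZE = 8  # assumed constant starting size; the table never shrinks below it

    def __init__(self):
        self.m = self.MIN_SIZE                     # number of slots
        self.n = 0                                 # number of stored keys
        self.buckets = [[] for _ in range(self.m)]

    def _rebuild(self, new_m):
        # The hash function depends on m, so every key is rehashed: Θ(n + m).
        old = self.buckets
        self.m = new_m
        self.buckets = [[] for _ in range(new_m)]
        for chain in old:
            for key in chain:
                self.buckets[hash(key) % new_m].append(key)

    def __contains__(self, key):
        return key in self.buckets[hash(key) % self.m]

    def insert(self, key):
        if key in self:
            return
        if self.n == self.m:                       # n reached m: double
            self._rebuild(2 * self.m)
        self.buckets[hash(key) % self.m].append(key)
        self.n += 1

    def delete(self, key):
        chain = self.buckets[hash(key) % self.m]
        if key not in chain:
            return
        chain.remove(key)
        self.n -= 1
        if self.m > self.MIN_SIZE and self.n <= self.m // 4:
            self._rebuild(self.m // 2)             # n fell to m/4: halve
```

Halving at m/4 rather than m/2 leaves the table half full after a shrink, so a sequence that alternates inserts and deletes near a boundary cannot trigger a rebuild on every operation.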

Resizable Arrays

The same trick gives resizable arrays such as Python's list: list.append and list.pop run in O(1) amortized time.

String Matching

Given two strings s and t, does s occur as a substring of t?

Simple Algorithm:

any(s == t[i : i + len(s)] for i in range(len(t) - len(s) + 1))

O(|s|) time for each substring comparison

⇒ O(|s| · (|t| − |s| + 1)) time = O(|s| · |t|), potentially quadratic
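
As a small illustration of the simple algorithm, the sketch below returns every match position instead of a single boolean. The function name `naive_find` is hypothetical. Note that the range must include the final offset len(t) - len(s), otherwise a match at the very end of t is missed.

```python
def naive_find(s, t):
    """Return every index i with t[i:i+len(s)] == s, trying each offset in turn."""
    # Each slice comparison costs O(|s|); there are |t| - |s| + 1 offsets.
    return [i for i in range(len(t) - len(s) + 1) if t[i:i + len(s)] == s]


print(naive_find("ana", "bananas"))  # [1, 3]
print(naive_find("nas", "bananas"))  # [4], a match at the last offset
```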

Karp-Rabin Algorithm

Rolling Hash ADT:

Maintain a string x subject to:

r(): return a reasonable hash h(x) of the string x
r.append(c): add letter c to the end of string x
r.skip(c): remove the front letter from string x, assuming it is c

Karp-Rabin Application:

# Setup: hash s and the first |s| letters of t. Cost: O(|s|)
for c in s:
    rs.append(c)
for c in t[:len(s)]:
    rt.append(c)
if rs() == rt(): ...

# Slide the window across t. Cost: O(|t|) + O(#matches * |s|)
for i in range(len(s), len(t)):
    rt.skip(t[i-len(s)])
    rt.append(t[i])
    if rs() == rt(): ...

Data Structure:

Treat the string x as a multi-digit number u in base a, where a denotes the alphabet size, e.g., 256.

r() = u mod p for an (ideally random) prime p ≈ |s| or |t| (division method)
r stores u mod p and |x| (really $a^{|x|}$), not u ⇒ smaller and faster to work with (u mod p fits in one machine word)
r.append(c): (u · a + ord(c)) mod p = [(u mod p) · a + ord(c)] mod p
r.skip(c): [u − ord(c) · ($a^{|x|-1}$ mod p)] mod p = [(u mod p) − ord(c) · ($a^{|x|-1}$ mod p)] mod p
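
The formulas above can be turned into a short working sketch. Several choices here are assumptions rather than part of the notes: the names `RollingHash` and `karp_rabin`, the fixed prime p = 1,000,000,007 (the notes suggest an ideally random prime), and the use of a modular inverse of a to step from $a^{|x|}$ down to $a^{|x|-1}$. Each hash match is confirmed by a direct comparison, since different strings can share a hash.

```python
class RollingHash:
    """Rolling hash of a string x: stores u mod p and a^|x| mod p."""

    def __init__(self, base=256, p=1_000_000_007):
        self.base = base                      # a, the alphabet size
        self.p = p                            # fixed prime (assumption of this sketch)
        self.u = 0                            # u mod p
        self.top = 1                          # a^|x| mod p
        self.inv = pow(base, p - 2, p)        # a^(-1) mod p by Fermat's little theorem

    def __call__(self):
        return self.u

    def append(self, c):
        # [(u mod p) * a + ord(c)] mod p
        self.u = (self.u * self.base + ord(c)) % self.p
        self.top = (self.top * self.base) % self.p

    def skip(self, c):
        # [(u mod p) - ord(c) * (a^(|x|-1) mod p)] mod p
        self.top = (self.top * self.inv) % self.p   # now a^(|x|-1) mod p
        self.u = (self.u - ord(c) * self.top) % self.p


def karp_rabin(s, t):
    """Return all indices where the non-empty string s occurs in t."""
    if not s or len(s) > len(t):              # empty pattern is not handled here
        return []
    rs, rt = RollingHash(), RollingHash()
    for c in s:
        rs.append(c)
    for c in t[:len(s)]:
        rt.append(c)
    matches = []
    if rs() == rt() and t[:len(s)] == s:      # verify to rule out a collision
        matches.append(0)
    for i in range(len(s), len(t)):
        rt.skip(t[i - len(s)])
        rt.append(t[i])
        start = i - len(s) + 1
        if rs() == rt() and t[start:i + 1] == s:
            matches.append(start)
    return matches


print(karp_rabin("ana", "bananas"))  # [1, 3]
```

Python's `%` always returns a non-negative result for a positive modulus, so the subtraction in `skip` needs no extra correction. In a language where the remainder can be negative, p would have to be added back before reducing.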
