Chapter 11 Exercises and Problems

Exercises
11.2-3 Professor Marley hypothesizes that substantial performance gains can be obtained if we modify the chaining scheme so that each list is kept in sorted order. How does the professor's modification affect the running time for successful searches, unsuccessful searches, insertions, and deletions?
In short: some constant factors may improve, but no asymptotic running time improves. Searches can be made asymptotically faster with array-backed slots, but that is of little practical use.
A successful search still takes Θ(1+n/m) expected time, by the same analysis as for the ordinary hash table.
If each hash slot is a linked list, an unsuccessful search can stop as soon as it reaches an element larger than the key. If we assume the key is equally likely to fall into any gap between two consecutive elements of the slot, the expected running time is about 50% of the original. That is still Θ(1+n/m).
Insertions and deletions are based on searches. An insertion that already searches for a duplicate keeps its Θ(1+n/m) cost; an insertion that used to put the element at the head of the list in O(1) time now needs Θ(1+n/m) expected time to find the right position. Deletion is unchanged: O(1) given a pointer to the element in a doubly linked list, otherwise the cost of a search.
If each hash slot is an array, the search within a slot can be done by binary search. The search time for a slot is then O(log n_i), where n_i is the number of elements in the slot. This looks like an improvement, but the array must be allocated before it is used, it must be resized when it overflows, and inserting into a sorted array still shifts O(n_i) elements. And although the theoretical search time is O(log n_i), binary search does not beat an O(n_i) linear search when n_i is very small, say n_i <= 10, which is the usual situation.
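The array-backed variant described above can be sketched as follows. This is only an illustration under stated assumptions: the class name SortedChainTable and its methods are hypothetical, slots are plain Python lists kept sorted, and the built-in hash() stands in for the hash function.

```python
from bisect import bisect_left


class SortedChainTable:
    """Chained hash table whose slots are sorted Python lists (illustrative sketch)."""

    def __init__(self, m):
        self.m = m
        self.slots = [[] for _ in range(m)]

    def _slot(self, key):
        # Assumption: Python's built-in hash() plays the role of the hash function.
        return self.slots[hash(key) % self.m]

    def search(self, key):
        # Binary search inside the slot: O(log n_i) comparisons.
        slot = self._slot(key)
        i = bisect_left(slot, key)
        return i < len(slot) and slot[i] == key

    def insert(self, key):
        # Finding the position is O(log n_i), but list.insert shifts O(n_i) elements.
        slot = self._slot(key)
        i = bisect_left(slot, key)
        if i == len(slot) or slot[i] != key:
            slot.insert(i, key)

    def delete(self, key):
        # Same cost profile as insert: cheap lookup, linear shift.
        slot = self._slot(key)
        i = bisect_left(slot, key)
        if i < len(slot) and slot[i] == key:
            slot.pop(i)
            return True
        return False
```

The sketch makes the trade-off visible: only the lookup inside a slot gets faster, while every update still pays a linear shift in the slot size.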

11.3-5 Define a family H of hash functions from a finite set U to a finite set B to be ε-universal if for all pairs of distinct elements k and l in U,
Pr{h(k) = h(l)} <= ε,
where the probability is taken over the drawing of hash function h at random from the family H. Show that an ε-universal family of hash functions must have ε >= 1/|B| - 1/|U|.
Our goal is to prove max(Pr{h(k) = h(l)}) >= 1/|B| - 1/|U|, where the maximum is taken over all pairs of distinct k and l.
We count the total number of collisions in two ways.
Let m = |B|, n = |U|.
First, a slot that holds x elements contributes C(x,2) = x*(x-1)/2 collisions.
Consider a function h that distributes the n elements into the m slots with slot sizes (d1,d2,...,dm), so that sigma(i=1~m,di) = n.
Since sigma(i=1~m,di^2) >= sigma(i=1~m,di)^2 / m = n^2/m (Cauchy-Schwarz), the number of collisions under h is sigma(i=1~m,C(di,2)) = sigma(i=1~m,(di^2 - di)/2) >= (n^2/m - n)/2 = n(n-m)/2m
Summing over all functions in H, the total number of collisions is at least |H|*n(n-m)/2m.
Second, |H|*Pr{h(k) = h(l)} is the number of functions in H under which the pair (k,l) collides.
If we sum over all pairs (k,l), we get the same total number of collisions, i.e. |H|*sigma(k,l,Pr{h(k) = h(l)}) = sigma(h∈H,sigma(i=1~m,C(di,2))) >= |H|*n(n-m)/2m, where the di are the slot sizes under h.
Since there are C(n,2) different pairs (k,l), and C(n,2)*max(Pr{h(k) = h(l)}) >= sigma(k,l,Pr{h(k) = h(l)}), we have:
|H|*C(n,2)*max(Pr{h(k) = h(l)}) >= |H|*n(n-m)/2m
Dividing both sides by |H|*C(n,2) = |H|*n(n-1)/2 gives:
max(Pr{h(k) = h(l)}) >= (n-m)/(n-1)m >= (n-m)/nm = 1/m - 1/n = 1/|B| - 1/|U|
(The second inequality uses n >= m. If n < m, then 1/|B| - 1/|U| is negative and the bound holds trivially.)
Since Pr{h(k) = h(l)} <= ε for every pair (k,l),
we get ε >= max(Pr{h(k) = h(l)}) >= 1/|B| - 1/|U|
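As a sanity check on the counting argument, the bound can be tested by brute force on small random families. The following is a minimal sketch, not part of the proof: the helper names max_collision_probability and random_family are made up for illustration, and each hash function is represented as a tuple mapping a key index to a slot index.

```python
from fractions import Fraction
from itertools import combinations
import random


def max_collision_probability(family, universe):
    """Largest Pr{h(k) = h(l)} over distinct pairs, with h drawn uniformly from family."""
    best = Fraction(0)
    for k, l in combinations(universe, 2):
        collisions = sum(1 for h in family if h[k] == h[l])
        best = max(best, Fraction(collisions, len(family)))
    return best


def random_family(n, m, size, rng):
    # Each hash function is a tuple: position = key, value = slot.
    return [tuple(rng.randrange(m) for _ in range(n)) for _ in range(size)]


rng = random.Random(0)
n, m = 12, 4  # |U| = 12, |B| = 4
for _ in range(20):
    H = random_family(n, m, size=7, rng=rng)
    eps = max_collision_probability(H, range(n))
    # Every family, however it is chosen, must satisfy the lower bound.
    assert eps >= Fraction(1, m) - Fraction(1, n)
```

Exact fractions are used so that the comparison with 1/|B| - 1/|U| is not affected by floating-point rounding.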

11.3-6 Let U be the set of n-tuples of values drawn from Z[p], and let B = Z[p], where p is prime. Define the hash function h[b]:U->B for b∈Z[p] on an input n-tuple <a0,a1,...,a(n-1)> from U as h[b](<a0,a1,...,a(n-1)>) = sigma(j=0~n-1,aj*b^j)
and let H = {h[b]:b∈Z[p]}. Argue that H is ((n-1)/p)-universal.
Fix two distinct n-tuples x = <x0,...,x(n-1)> and y = <y0,...,y(n-1)> in U. We must bound the number of b∈Z[p] for which h[b](x) = h[b](y) (mod p).
The difference h[b](x) - h[b](y) = sigma(j=0~n-1,(xj - yj)*b^j) is a polynomial in b of degree at most n-1, and it is not the zero polynomial because xj != yj for at least one j.
Since p is prime, Z[p] is a field, so a nonzero polynomial of degree at most n-1 has at most n-1 roots in Z[p].
Thus at most n-1 of the p choices of b make x and y collide, and since b is drawn uniformly from Z[p], the probability that x collides with y is at most (n-1)/p.
The argument parallels the analysis for Theorem 11.5. Since x and y were arbitrary distinct tuples, we can conclude that H is ((n-1)/p)-universal.
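The root-counting claim can be checked exhaustively for small parameters. This is a rough sketch under the assumption that arithmetic is done modulo p; the function names h and max_colliding_b are hypothetical and chosen only for this example.

```python
from itertools import product


def h(b, a, p):
    """Evaluate h_b(a) = (sum_j a[j] * b**j) mod p using Horner's rule."""
    value = 0
    for coeff in reversed(a):
        value = (value * b + coeff) % p
    return value


def max_colliding_b(n, p):
    """Max, over pairs of distinct n-tuples, of the number of b in Z_p that make them collide."""
    tuples = list(product(range(p), repeat=n))
    worst = 0
    for i, a in enumerate(tuples):
        for a2 in tuples[i + 1:]:
            count = sum(1 for b in range(p) if h(b, a, p) == h(b, a2, p))
            worst = max(worst, count)
    return worst


# For every pair of distinct tuples, at most n-1 values of b collide.
assert max_colliding_b(3, 5) <= 3 - 1
```

The check enumerates all p^n tuples, so it is only practical for very small n and p.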

11.5-1 Suppose that we insert n keys into a hash table of size m using open addressing and uniform hashing. Let p(n,m) be the probability that no collisions occur. Show that p(n,m) <= e^(-n(n-1)/2m). Argue that when n exceeds sqrt(m), the probability of avoiding collisions goes rapidly to zero.
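When n keys are placed into the hash table independently and uniformly at random, there are m^n equally likely cases. However, only P(m,n) = m*(m-1)*...*(m-n+1) of them have no collisions. Thus the probability is p(n,m) = m*(m-1)*...*(m-n+1)/m^n. (If n > m, then p(n,m) = 0 and the bound holds trivially, so assume n <= m.)
Since (m-i)*(m-n+i) <= (m-n/2)^2 for every real number i (two numbers with a fixed sum have the largest product when they are equal), pairing the factor m-i with the factor m-n+i for i = 1,...,n-1 gives (m-1)*...*(m-n+1) <= (m-n/2)^(n-1).
Thus p(n,m) <= m*(m-n/2)^(n-1) / m^n = (1-n/2m)^(n-1) <= (e^(-n/2m))^(n-1) = e^(-n(n-1)/2m), using 1+x <= e^x.
When n exceeds sqrt(m), the exponent n(n-1)/2m grows quadratically in n: for n = c*sqrt(m) the bound is roughly e^(-c^2/2), which goes rapidly to zero as c increases.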
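To see how tight the bound is, the exact probability can be compared with e^(-n(n-1)/2m) numerically. The sketch below uses hypothetical helper names (no_collision_probability, upper_bound) and computes the exact value with rational arithmetic.

```python
import math
from fractions import Fraction


def no_collision_probability(n, m):
    """Exact p(n, m) = m(m-1)...(m-n+1) / m^n."""
    p = Fraction(1)
    for i in range(n):
        p *= Fraction(m - i, m)
    return p


def upper_bound(n, m):
    """The bound e^(-n(n-1)/2m) from the exercise."""
    return math.exp(-n * (n - 1) / (2 * m))


m = 1000
for n in (10, 32, 64, 128):  # sqrt(1000) is about 31.6
    exact = float(no_collision_probability(n, m))
    print(n, exact, upper_bound(n, m))
```

With m = 1000, the printed values show the probability dropping quickly once n moves past sqrt(m), in line with the argument above.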

Problems
11-1 Longest-probe bound for hashing
The ultimate task is to prove an O(log n) bound on the expected length of the longest probe sequence for a hash table using open addressing. I won't analyze it in detail because it follows a model similar to that of Section 5.4.3, Streaks.

Reposted from: https://www.cnblogs.com/FancyMouse/articles/1069646.html
