Introduction to Algorithms, Lecture 6: Medians and Order Statistics

Order statistics

Problem: Given n elements in an array, find the kth smallest element (rank k).

 

The naive algorithm for this problem: sort the array A and return A[k]. With heap sort or merge sort, this takes Theta(n lg n) time.
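As a baseline, the sort-then-index approach can be sketched in Python. The function name `select_by_sorting` is made up for illustration, and because Python lists are 0-indexed, rank k maps to index k - 1.

```python
def select_by_sorting(values, k):
    """Return the k-th smallest element (rank k, 1-indexed) by sorting a copy.

    Sorting dominates the cost, so this takes O(n lg n) time.
    """
    if not 1 <= k <= len(values):
        raise ValueError("k must satisfy 1 <= k <= len(values)")
    return sorted(values)[k - 1]


if __name__ == "__main__":
    data = [7, 2, 9, 4, 1]
    print(select_by_sorting(data, 3))  # median of the five values
```

The input list is left unmodified because `sorted` returns a new list.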

 

We can do better than this: the problem can be solved in linear time.

 

Some special cases:

 

1) k = 1: the minimum.

2) k = n: the maximum. Both of these cases take Theta(n) time with a single scan.

3) k = floor((n+1)/2) or ceiling((n+1)/2): the median.

 

Idea: randomized divide and conquer. Use the randomized partition from Quicksort, then recurse only into the side of the split that contains the element we are looking for.

 

RANDOM-SELECT(A, p, q, i)
if p = q then return A[p]
r := RAND-PARTITION(A, p, q) //randomly select pivot, partition around it, return its index.
k := r - p + 1
if i = k then return A[r]
if i < k then return RANDOM-SELECT(A, p, r-1, i)
         else return RANDOM-SELECT(A, r+1, q, i-k)
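The pseudocode above can be sketched as runnable Python. This is a minimal sketch under a few assumptions: indices p and q are 0-based and inclusive, the rank i is 1-based within a[p..q], and `rand_partition` uses a Lomuto-style partition, which is one possible choice since the notes do not specify the partition scheme.

```python
import random


def rand_partition(a, p, q):
    """Partition a[p..q] (inclusive) around a random pivot.

    Returns the final index of the pivot. Lomuto-style partition,
    chosen here for illustration.
    """
    pivot_index = random.randint(p, q)  # randint includes both endpoints
    a[p], a[pivot_index] = a[pivot_index], a[p]
    pivot = a[p]
    i = p
    for j in range(p + 1, q + 1):
        if a[j] <= pivot:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[p], a[i] = a[i], a[p]  # move the pivot between the two sides
    return i


def random_select(a, p, q, i):
    """Return the i-th smallest element (1-indexed) of a[p..q].

    The list is reordered in place.
    """
    if p == q:
        return a[p]
    r = rand_partition(a, p, q)
    k = r - p + 1  # rank of the pivot within a[p..q]
    if i == k:
        return a[r]
    if i < k:
        return random_select(a, p, r - 1, i)
    return random_select(a, r + 1, q, i - k)


if __name__ == "__main__":
    data = [7, 2, 9, 4, 1, 8]
    print(random_select(list(data), 0, len(data) - 1, 3))
```

Passing a copy (`list(data)`) keeps the caller's list intact, since the partition step rearranges elements.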

 

Intuition for analysis:

 

Lucky case - 1/10 : 9/10 splits:

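T(n) <= T(9n/10) + Theta(n)   (we recurse into only one side, at worst the larger one)

T(n) = Theta(n), since the work shrinks geometrically from level to level;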

 

Unlucky case - 0 : n-1 splits:

T(n) = T(n-1) + Theta(n)   (arithmetic series)

      = Theta(n^2)

 

So RANDOM-SELECT takes Theta(n^2) time in the worst case.

 

Expected running time of RANDOM-SELECT: Let T(n) be the random variable for the running time of RANDOM-SELECT on an input of size n, assuming the random numbers are independent. Define indicator random variables X_k for k = 0, 1, 2, ..., n-1:

X_k = 1 if RAND-PARTITION generates k:n-k-1 split,

       = 0 otherwise.

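Then, assuming pessimistically that the element we want always falls in the larger side of the split,

T(n) <= sum_{k=0}^{n-1} X_k T(max{k, n-k-1}) + Theta(n).

Taking expectations, with E[X_k] = 1/n and X_k independent of T(max{k, n-k-1}),

E[T(n)] <= 1/n sum_{k=0}^{n-1} E[T(max{k,n-k-1})] + Theta(n)

            <= 2/n sum_{k=floor(n/2)}^{n-1} E[T(k)] + Theta(n).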

Claim: E[T(n)] <= cn for some constant c > 0 (proof by substitution).

E[T(n)] <= 2/n sum_{k=floor(n/2)}^{n-1} ck + Theta(n)

       <= cn - (cn/4 - Theta(n))

       <= cn if c is chosen large enough that cn/4 dominates the Theta(n) term.

 

So E[T(n)] = Theta(n), i.e. the expected running time is linear.
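The substitution step above goes from the sum to cn - cn/4 without showing the arithmetic. A short derivation of the bound it relies on, written out in LaTeX, is:

```latex
\sum_{k=\lfloor n/2 \rfloor}^{n-1} k
  = \sum_{k=0}^{n-1} k \;-\; \sum_{k=0}^{\lfloor n/2 \rfloor - 1} k
  = \frac{n(n-1)}{2} - \frac{\lfloor n/2 \rfloor \left(\lfloor n/2 \rfloor - 1\right)}{2}
  \le \frac{3n^2}{8}

% For even n the middle expression equals 3n^2/8 - n/4;
% for odd n it equals 3(n^2 - 1)/8. Both are at most 3n^2/8.

\frac{2}{n} \sum_{k=\lfloor n/2 \rfloor}^{n-1} ck
  \le \frac{2c}{n} \cdot \frac{3n^2}{8}
  = \frac{3cn}{4}
  = cn - \frac{cn}{4}
```

Adding the Theta(n) term back gives cn - (cn/4 - Theta(n)), which is the second line of the substitution.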

 

There is also a worst-case linear-time order statistics algorithm [Blum, Floyd, Pratt, Rivest, Tarjan 1973]. The idea is to generate a good pivot recursively (the pivot is guaranteed to be good).

 

SELECT(i, n)

1. Divide the n elements into floor(n/5) groups of 5 elements each. Find the median of each group.

2. Recursively select the median x of the floor(n/5) group medians.

3. Partition with x as pivot, let k = rank(x).

4. if i = k then return x

5. if i < k then recursively select the ith smallest element in the lower part

              otherwise recursively select the (i-k)th smallest element in the upper part.
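The five steps above can be sketched in Python. This sketch departs from the notes in two labeled ways: leftover elements form a final short group (so there are ceil(n/5) groups rather than floor(n/5)), and the partition is three-way (lower, equal, upper) so that duplicate values are handled. It also builds new lists instead of partitioning in place, purely for readability.

```python
def select(values, i):
    """Return the i-th smallest element (1-indexed) using median-of-medians."""
    if not 1 <= i <= len(values):
        raise ValueError("i must satisfy 1 <= i <= len(values)")
    if len(values) <= 5:
        # Base case: constant-size input, sort directly.
        return sorted(values)[i - 1]

    # Step 1: split into groups of 5 and take each group's median.
    # Assumption: the last group may hold fewer than 5 elements.
    groups = [values[j:j + 5] for j in range(0, len(values), 5)]
    medians = [sorted(g)[(len(g) - 1) // 2] for g in groups]

    # Step 2: recursively select the median x of the group medians.
    x = select(medians, (len(medians) + 1) // 2)

    # Step 3: partition around x (three-way, to cope with duplicates).
    lower = [v for v in values if v < x]
    equal = [v for v in values if v == x]
    upper = [v for v in values if v > x]

    # Steps 4-5: return x or recurse into the side holding rank i.
    if i <= len(lower):
        return select(lower, i)
    if i <= len(lower) + len(equal):
        return x
    return select(upper, i - len(lower) - len(equal))


if __name__ == "__main__":
    data = [(j * 37) % 101 for j in range(101)]  # a permutation of 0..100
    print(select(data, 51))  # the median
```

Each recursive call works on a strictly smaller list, because x itself always lands in `equal`, so the recursion terminates.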

 

Analysis:

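After the partition, at least 3*floor(floor(n/5)/2) = 3*floor(n/10) elements are less than or equal to x: at least half of the group medians are <= x, and each such median has two more elements in its group that are <= it. By symmetry, at least 3*floor(n/10) elements are greater than or equal to x.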

To simplify the analysis, note that for n >= 50, 3*floor(n/10) >= n/4.
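The threshold 50 can be checked numerically. The snippet below is only a sanity check over a finite range (the upper limit 10000 is arbitrary), not a proof; the helper name `guaranteed_count` is made up for illustration.

```python
def guaranteed_count(n):
    """Lower bound 3*floor(n/10) on the elements on each side of the pivot x."""
    return 3 * (n // 10)


# The inequality 3*floor(n/10) >= n/4 is tested as 4 * count >= n
# to stay in integer arithmetic.
holds_from_50 = all(4 * guaranteed_count(n) >= n for n in range(50, 10001))
failures_below_50 = [n for n in range(1, 50) if 4 * guaranteed_count(n) < n]

if __name__ == "__main__":
    print(holds_from_50)
    print(max(failures_below_50))
```

Writing n = 10q + r with 0 <= r <= 9, the inequality is equivalent to 2q >= r, which always holds once q >= 5, that is, once n >= 50.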

 

So the recursive call in step 5 runs on at most 3n/4 elements, and T(n) <= T(n/5) + T(3n/4) + Theta(n). Claim: T(n) <= cn.

T(n) <= cn/5 + 3cn/4 + Theta(n)

       = cn - (cn/20 - Theta(n))

      <= cn if cn/20 dominates Theta(n)

 

So SELECT runs in linear time in the worst case.
