Introduction to Algorithms, Problem 8-3: Sorting variable-length items


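a) You are given an array of integers, where different integers may have different numbers of digits, but the total number of digits over all the integers in the array is $n$. Show how to sort the array in $O(n)$ time.
b) You are given an array of strings, where different strings may have different numbers of characters, but the total number of characters over all the strings is $n$. Show how to sort the array in $O(n)$ time. (Note that the required order is the usual string order; for example, a < ab < b.)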

Both parts test how flexibly you can apply counting sort and radix sort.

a. The usual, unadorned radix sort algorithm will not solve this problem in the required time bound. The number of passes, $d$, would have to be the number of digits in the largest integer. Suppose that there are $m$ integers; we always have $m \le n$. In the worst case, we would have one integer with $n/2$ digits and $n/2$ integers with one digit each. We assume that the range of a single digit is constant. Therefore, we would have $d = n/2$ and $m = n/2 + 1$, and so the running time would be $\Theta(dm) = \Theta(n^2)$.

Let us assume without loss of generality that all the integers are positive and have no leading zeros. (If there are negative integers or 0, deal with the positive numbers, negative numbers, and 0 separately.) Under this assumption, we can observe that integers with more digits are always greater than integers with fewer digits. Thus, we can first sort the integers by number of digits (using counting sort), and then use radix sort to sort each group of integers with the same length. Noting that each integer has between 1 and $n$ digits, let $m_i$ be the number of integers with $i$ digits, for $i = 1, 2, \ldots, n$. Since there are $n$ digits altogether, we have $\sum_{i=1}^{n} i \cdot m_i = n$.

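It takes $O(n)$ time to compute how many digits all the integers have and, once the numbers of digits have been computed, it takes $O(m + n) = O(n)$ time to group the integers by number of digits. To sort the group of $m_i$ integers that have $i$ digits each by radix sort takes $\Theta(i \cdot m_i)$ time. The time to sort all groups, therefore, is $\sum_{i=1}^{n} \Theta(i \cdot m_i) = \Theta\left(\sum_{i=1}^{n} i \cdot m_i\right) = \Theta(n)$, which is within the required $O(n)$ bound.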
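The two-step approach for part (a) can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions: the integers are positive and are supplied as decimal digit strings without leading zeros (so that reading a digit costs constant time), and the names `counting_sort_by_digit` and `sort_digit_strings` are made up for this example.

```python
def counting_sort_by_digit(group, pos):
    """Stable counting sort of equal-length digit strings on the digit at index pos."""
    buckets = [[] for _ in range(10)]  # one bucket per decimal digit
    for s in group:
        buckets[ord(s[pos]) - ord("0")].append(s)
    return [s for bucket in buckets for s in bucket]


def sort_digit_strings(digit_strings):
    """Sort positive integers given as decimal strings without leading zeros."""
    if not digit_strings:
        return []
    max_len = max(len(s) for s in digit_strings)

    # Step 1: counting sort on the number of digits (keys 1..max_len, max_len <= n).
    groups = [[] for _ in range(max_len + 1)]
    for s in digit_strings:
        groups[len(s)].append(s)

    # Step 2: radix sort each non-empty group, least significant digit first.
    result = []
    for length in range(1, max_len + 1):
        group = groups[length]
        if not group:
            continue  # skipping empty groups keeps the total work linear
        for pos in range(length - 1, -1, -1):
            group = counting_sort_by_digit(group, pos)
        result.extend(group)
    return result
```

For example, `sort_digit_strings(["305", "7", "12", "9", "1000", "45"])` returns `["7", "9", "12", "45", "305", "1000"]`. Skipping empty length groups matters: without that check, the passes over empty groups alone could add up to quadratic work when one integer is very long.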

b. One way to solve this problem is by a radix sort from right to left. Since the strings have varying lengths, however, we have to pad out all strings that are shorter than the longest string. The padding is on the right end of the string, and it’s with a special character that is lexicographically less than any other character (e.g., in C, the character \0 with ASCII value 0). Of course, we don’t have to actually change any string; if we want to know the $j$th character of a string whose length is $k$, then if $j > k$, the $j$th character is the pad character.

Unfortunately, this scheme does not always run in the required time bound. Suppose that there are $m$ strings and that the longest string has $d$ characters. In the worst case, one string has $n/2$ characters and, before padding, $n/2$ strings have one character each. As in part (a), we would have $d = n/2$ and $m = n/2 + 1$. We still have to examine the pad characters in each pass of radix sort, even if we don’t actually create them in the strings. Assuming that the range of a single character is constant, the running time of radix sort would be $\Theta(dm) = \Theta(n^2)$.
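To make the padded right-to-left scheme concrete, here is a minimal Python sketch of it. It assumes, purely for illustration, that the strings contain only lowercase letters a to z; the names `char_code` and `padded_radix_sort` are hypothetical. The pad character is never stored; it is produced on demand when a position past the end of a string is read.

```python
def char_code(s, j):
    """Key of the character at 0-based index j of s; 0 stands for the virtual pad."""
    # Assumption: s contains only lowercase letters 'a'..'z', mapped to 1..26.
    return ord(s[j]) - ord("a") + 1 if j < len(s) else 0


def padded_radix_sort(strings):
    """Right-to-left radix sort that treats short strings as padded on the right."""
    d = max((len(s) for s in strings), default=0)  # length of the longest string
    out = list(strings)
    for j in range(d - 1, -1, -1):                 # d passes in total
        buckets = [[] for _ in range(27)]          # pad bucket plus 26 letters
        for s in out:                              # every pass touches all m strings
            buckets[char_code(s, j)].append(s)
        out = [s for bucket in buckets for s in bucket]
    return out
```

The result is correct (for instance, `padded_radix_sort(["b", "ab", "a"])` returns `["a", "ab", "b"]`), but the loop structure shows the cost: $d$ passes, each examining all $m$ strings, even those whose character at that position is only the pad.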

To solve the problem in $O(n)$ time, we use the property that, if the first letter of string $x$ is lexicographically less than the first letter of string $y$, then $x$ is lexicographically less than $y$, regardless of the lengths of the two strings. We take advantage of this property by sorting the strings on the first letter, using counting sort. We take an empty string as a special case and put it first. We gather together all strings with the same first letter as a group. Then we recurse, within each group, based on each string with the first letter removed.

The correctness of this algorithm is straightforward. Analyzing the running time is a bit trickier. Let us count the number of times that each string is sorted by a call of counting sort. Suppose that the $i$th string, $s_i$, has length $l_i$. Then $s_i$ is sorted by at most $l_i + 1$ counting sorts. (The “+1” is because it may have to be sorted as an empty string at some point; for example, "ab" and "a" end up in the same group in the first pass and are then ordered based on "b" and the empty string in the second pass. The string "a" is sorted once for its length, 1, plus one more time.) A call of counting sort on $t$ strings takes $\Theta(t)$ time (remembering that the number of different characters on which we are sorting is a constant). Thus, with $m \le n$ strings, the total time for all calls of counting sort is $O\left(\sum_{i=1}^{m} (l_i + 1)\right) = O\left(\sum_{i=1}^{m} l_i + m\right) = O(n + m) = O(n)$.
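The recursive first-letter approach for part (b) can be sketched in Python as follows. The sketch assumes a constant-size alphabet of lowercase letters a to z, and the name `sort_strings` and its `depth` parameter are illustrative choices. Instead of physically removing the first letter, it passes an index `depth` so that no string is ever copied.

```python
ALPHABET_SIZE = 26  # assumption: strings use only lowercase letters 'a'..'z'


def sort_strings(strings, depth=0):
    """Sort strings that all agree on their first `depth` characters."""
    if len(strings) <= 1:
        return list(strings)

    exhausted = []  # strings whose remainder is empty; they come first
    buckets = [[] for _ in range(ALPHABET_SIZE)]
    for s in strings:
        if len(s) == depth:
            exhausted.append(s)
        else:
            # Counting sort step: group by the character at position `depth`.
            buckets[ord(s[depth]) - ord("a")].append(s)

    result = exhausted
    for bucket in buckets:
        if bucket:
            # Recurse within the group, looking at the next character.
            result.extend(sort_strings(bucket, depth + 1))
    return result
```

For example, `sort_strings(["b", "ab", "a"])` returns `["a", "ab", "b"]`. Each string takes part in at most one grouping step per character plus one more as an exhausted string, which mirrors the $l_i + 1$ count in the analysis. Note that the recursion depth can reach the length of the longest string, so very long inputs would need an explicit stack in Python.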



FROM:http://hi.baidu.com/rangemq/blog/item/42929bccc44faa1201e9288c.html
