The maximum-subarray problem
Given an array A[1..n] that contains some negative numbers, find a nonempty, contiguous subarray of A whose values have the largest sum. (Note: we only need to find "a" maximum subarray rather than "the" maximum subarray, since more than one subarray may achieve the maximum sum.)
A brute-force solution
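We can easily devise a brute-force solution to this problem: just try every possible subarray. An array of n numbers has n(n + 1)/2 nonempty contiguous subarrays, so this solution takes Θ(n^2) time, provided the sum of each subarray A[i..j] is obtained in constant time from the sum of A[i..j - 1].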
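The brute-force idea can be sketched in Python as follows. This is only an illustrative sketch: the function name `brute_force_max_subarray` is made up for this example, and it uses 0-based inclusive indices rather than the 1-based indices of the text. The inner loop extends the running sum by one element at a time, so each candidate subarray costs constant extra work.

```python
def brute_force_max_subarray(a):
    """Return (i, j, total) for a maximum nonempty subarray a[i..j].

    Indices are 0-based and inclusive. The name and signature are
    assumptions made for this sketch, not part of the original text.
    """
    if not a:
        raise ValueError("array must be nonempty")
    best = (0, 0, a[0])
    for i in range(len(a)):
        running = 0
        for j in range(i, len(a)):
            running += a[j]  # sum of a[i..j], built from the sum of a[i..j-1]
            if running > best[2]:
                best = (i, j, running)
    return best
```

For example, `brute_force_max_subarray([-3, -1, -2])` picks the single element at index 1, since every longer subarray has a smaller sum.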
A solution using divide-and-conquer
Let's think about how we might solve the maximum-subarray problem using the divide-and-conquer technique. Suppose we want to find a maximum subarray of the subarray A[low..high]. Divide-and-conquer suggests that we divide the subarray into two subarrays of as equal size as possible. That is, we find the midpoint, say mid, of the subarray, and consider the subarrays A[low..mid] and A[mid + 1..high]. Then any contiguous subarray A[i..j] of A[low..high] must lie in exactly one of the following places:
- entirely in the subarray A[low..mid], so that low <= i <= j <= mid;
- entirely in the subarray A[mid + 1..high], so that mid < i <= j <= high;
- crossing the midpoint, so that low <= i <= mid < j <= high.
We can easily find a maximum subarray crossing the midpoint in time linear in the size of the subarray A[low..high]. This problem is not a smaller instance of our original problem, because it has the added restriction that the subarray it chooses must cross the midpoint. Any subarray crossing the midpoint is itself made of two subarrays A[i..mid] and A[mid + 1..j], where low <= i <= mid and mid < j <= high. Therefore, we just need to find maximum subarrays of the form A[i..mid] and A[mid + 1..j] and then combine them. The procedure FIND-MAX-CROSSING-SUBARRAY takes as input the array A and the indices low, mid, and high, and it returns a tuple containing the indices demarcating a maximum subarray that crosses the midpoint, along with the sum of the values in that maximum subarray.
**********************************************************************************************************************************************************
FIND-MAX-CROSSING-SUBARRAY(A, low, mid, high)
1  left-sum = -∞
2  sum = 0
3  for i = mid downto low
4      sum = sum + A[i]
5      if sum > left-sum
6          left-sum = sum
7          max-left = i
8  right-sum = -∞
9  sum = 0
10 for j = mid + 1 to high
11     sum = sum + A[j]
12     if sum > right-sum
13         right-sum = sum
14         max-right = j
15 return (max-left, max-right, left-sum + right-sum)

FIND-MAXIMUM-SUBARRAY(A, low, high)
1  if high == low
2      return (low, high, A[low])    // base case: only one element
   else
3      mid = ⌊(low + high)/2⌋
4      (left-low, left-high, left-sum) = FIND-MAXIMUM-SUBARRAY(A, low, mid)
5      (right-low, right-high, right-sum) = FIND-MAXIMUM-SUBARRAY(A, mid + 1, high)
6      (cross-low, cross-high, cross-sum) = FIND-MAX-CROSSING-SUBARRAY(A, low, mid, high)
7      if left-sum >= right-sum and left-sum >= cross-sum
8          return (left-low, left-high, left-sum)
9      elseif right-sum >= left-sum and right-sum >= cross-sum
10         return (right-low, right-high, right-sum)
       else
11         return (cross-low, cross-high, cross-sum)
**********************************************************************************************************************************************************
The procedure FIND-MAX-CROSSING-SUBARRAY works as follows. Lines 1-7 find a maximum subarray of the left half, A[low..mid]. Since this subarray must contain A[mid], the for loop of lines 3-7 starts the index i at mid and works down to low, so that every subarray it considers is of the form A[i..mid]. Lines 1-2 initialize the variables left-sum, which holds the greatest sum found so far, and sum, holding the sum of the entries in A[i..mid]. Whenever we find, in line 5, a subarray A[i..mid] with a sum of values greater than left-sum, we update left-sum to this subarray's sum in line 6, and in line 7 we update the variable max-left to record this index i. Lines 8-14 work analogously for the right half, A[mid+1..high]. Here, the for loop of lines 10-14 starts the index j at mid+1 and works up to high, so that every subarray it considers is of the form A[mid + 1.. j]. Finally, line 15 returns the indices max-left and max-right that demarcate a maximum subarray crossing the midpoint, along with the sum left-sum + right-sum of the values in the subarray A[max-left..max-right].
If the subarray A[low..high] contains n entries (so that n = high - low + 1), we claim that the call FIND-MAX-CROSSING-SUBARRAY(A, low, mid, high) takes Θ(n) time. Since each iteration of each of the two for loops takes Θ(1) time, we just need to count how many iterations there are altogether. The for loop of lines 3-7 makes mid - low + 1 iterations, and the for loop of lines 10-14 makes high - mid iterations, so the total number of iterations is (mid - low + 1) + (high - mid) = high - low + 1 = n.
With a linear-time FIND-MAX-CROSSING-SUBARRAY procedure in hand, we can write pseudocode for a divide-and-conquer algorithm to solve the maximum-subarray problem: the procedure FIND-MAXIMUM-SUBARRAY(A, low, high). Similar to FIND-MAX-CROSSING-SUBARRAY, the recursive procedure FIND-MAXIMUM-SUBARRAY returns a tuple containing the indices that demarcate a maximum subarray, along with the sum of the values in that maximum subarray. Line 1 tests for the base case, where the subarray has just one element. A subarray with just one element has only one subarray, itself, and so line 2 returns a tuple with the starting and ending indices of just the one element, along with its value. Lines 3-11 handle the recursive case. Line 3 does the divide part, computing the index mid of the midpoint. Let's refer to the subarray A[low..mid] as the left subarray and to A[mid + 1..high] as the right subarray. Because we know that the subarray A[low..high] contains at least two elements, each of the left and right subarrays must have at least one element. Lines 4 and 5 conquer by recursively finding maximum subarrays within the left and right subarrays, respectively. Lines 6-11 form the combine part. Line 6 finds a maximum subarray that crosses the midpoint. (Because line 6 solves a subproblem that is not a smaller instance of the original problem, we consider it to be in the combine part.) Line 7 tests whether the left subarray contains a subarray with the maximum sum, and line 8 returns that maximum subarray. Otherwise, line 9 tests whether the right subarray contains a subarray with the maximum sum, and line 10 returns that maximum subarray. If neither the left nor the right subarray contains a subarray achieving the maximum sum, then a maximum subarray must cross the midpoint, and line 11 returns it.
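The two procedures described above can be sketched in Python as follows. This is a minimal translation under a few assumptions: indices are 0-based and inclusive (the text uses 1-based indices), the snake_case function names are chosen for this example, and `math.inf` stands in for ∞.

```python
import math


def find_max_crossing_subarray(a, low, mid, high):
    """Best subarray a[i..j] with low <= i <= mid < j <= high."""
    left_sum, total, max_left = -math.inf, 0, mid
    for i in range(mid, low - 1, -1):  # grow leftwards from mid
        total += a[i]
        if total > left_sum:
            left_sum, max_left = total, i
    right_sum, total, max_right = -math.inf, 0, mid + 1
    for j in range(mid + 1, high + 1):  # grow rightwards from mid + 1
        total += a[j]
        if total > right_sum:
            right_sum, max_right = total, j
    return max_left, max_right, left_sum + right_sum


def find_maximum_subarray(a, low, high):
    """Return (i, j, total) for a maximum subarray of a[low..high]."""
    if low == high:  # base case: only one element
        return low, high, a[low]
    mid = (low + high) // 2
    left = find_maximum_subarray(a, low, mid)
    right = find_maximum_subarray(a, mid + 1, high)
    cross = find_max_crossing_subarray(a, low, mid, high)
    if left[2] >= right[2] and left[2] >= cross[2]:
        return left
    if right[2] >= left[2] and right[2] >= cross[2]:
        return right
    return cross
```

A call such as `find_maximum_subarray(a, 0, len(a) - 1)` covers the whole array. Because each half is nonempty whenever `low < high`, both loops in the crossing helper run at least once, so the infinite sentinels never reach the returned sum.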
Analyzing the divide-and-conquer algorithm
We denote by T(n) the running time of FIND-MAXIMUM-SUBARRAY on a subarray of n elements. To simplify the analysis, we assume that n is an exact power of 2, so that all subproblem sizes are integers. The base case, when n = 1, is easy: lines 1 and 2 take constant time, and so T(1) = Θ(1). The recursive case occurs when n > 1. Lines 1 and 3 take constant time. Each of the subproblems solved in lines 4 and 5 is on a subarray of n/2 elements, and so we spend T(n/2) time solving each of them. Because we have two subproblems (one for the left subarray and one for the right subarray), the contribution to the running time from lines 4 and 5 comes to 2T(n/2). As we have already seen, the call to FIND-MAX-CROSSING-SUBARRAY in line 6 takes Θ(n) time. Lines 7-11 take only Θ(1) time. For the recursive case, therefore, we have T(n) = 2T(n/2) + Θ(n). This recurrence has the solution T(n) = Θ(n lg n).
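One informal way to see where the n lg n comes from is to unroll the recurrence. The sketch below assumes that n is a power of 2 and replaces the Θ(n) term by cn for some constant c > 0; both are simplifications made for illustration, not a rigorous proof.

```latex
\begin{align*}
T(n) &= 2\,T(n/2) + cn \\
     &= 4\,T(n/4) + 2cn \\
     &= 8\,T(n/8) + 3cn \\
     &\;\;\vdots \\
     &= 2^{k}\,T(n/2^{k}) + k\,cn \\
     &= n\,T(1) + cn \lg n \qquad (k = \lg n) \\
     &= \Theta(n \lg n).
\end{align*}
```

Each of the lg n levels of recursion contributes cn work in total, and the n base cases contribute Θ(n), which the cn lg n term dominates.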
A linear-time algorithm for the maximum-subarray problem
Use the following ideas to develop a nonrecursive, linear-time algorithm for the maximum-subarray problem. Start at the left end of the array, and progress toward the right, keeping track of the maximum subarray seen so far. Knowing a maximum subarray of A[1..j], extend the answer to find a maximum subarray ending at index j + 1 by using the following observation: a maximum subarray of A[1..j + 1] is either a maximum subarray of A[1..j] or a subarray A[i..j + 1], for some 1 <= i <= j + 1. Determine a maximum subarray of the form A[i..j + 1] in constant time based on knowing a maximum subarray ending at index j. We can achieve this by adjusting the left index i of the candidate subarray A[i..j + 1] as the scan progresses: whenever the running sum of the best subarray ending at index j is negative, it cannot help any later subarray, so the candidate restarts at index j + 1. The pseudocode is as follows:
*****************************************************************************************************************************************************************
FIND-MAXIMUM-SUBARRAY-LINEAR-TIME(A, low, high)
1  if high == low
2      return (low, high, A[low])
   else
3      max-sum = A[low]
4      tmp-sum = A[low]
5      max-left = max-right = tmp-left = low
6      for j = low + 1 to high
7          if tmp-sum < 0              // a negative running sum cannot help: restart at j
8              tmp-left = j
9              tmp-sum = 0
10         tmp-sum = tmp-sum + A[j]    // best sum of a subarray ending at j
11         if tmp-sum > max-sum
12             max-sum = tmp-sum
13             max-left = tmp-left
14             max-right = j
15     return (max-left, max-right, max-sum)
*****************************************************************************************************************************************************************
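The linear-time idea can also be sketched in Python. This is a minimal sketch under assumptions of its own: the function name `max_subarray_linear` is hypothetical, indices are 0-based and inclusive, and the running sum is checked for negativity before each new element is added, so that arrays starting with (or consisting only of) negative numbers are handled correctly.

```python
def max_subarray_linear(a):
    """Return (i, j, total) for a maximum nonempty subarray a[i..j].

    Single left-to-right scan; 0-based inclusive indices. The name and
    signature are assumptions made for this sketch.
    """
    if not a:
        raise ValueError("array must be nonempty")
    best_sum = cur_sum = a[0]  # cur_sum: best sum of a subarray ending at j
    best_left = best_right = cur_left = 0
    for j in range(1, len(a)):
        if cur_sum < 0:
            # A negative prefix can only hurt, so start afresh at j.
            cur_sum, cur_left = a[j], j
        else:
            cur_sum += a[j]
        if cur_sum > best_sum:
            best_sum, best_left, best_right = cur_sum, cur_left, j
    return best_left, best_right, best_sum
```

Each element is visited once and every step does a constant amount of work, so the scan takes Θ(n) time and uses only a handful of extra variables.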