Majority Element问题

36 篇文章 0 订阅

问题: 数组中有一个数字出现的次数超过了数组长度的一半, 找出这个数字。

上述这个问题的有很多的解决办法。 下面一一描述(注意当要求时间复杂度为O(N)(数组的元素的个数为N), 也即线性时间完成任务的时候, 有一个非常elegant的算法, 称为Moore’s Voting Algorithm该算法的时间复杂度为O(n), 空间复杂度为O(1), 即算法运行的时候只需要常数的空间开销, simple, elegant as well as powerful)。

Majority Element: A majority element in an array A[] of size n is an element that appears more than n/2 times (and hence there is at most one such element).

Write a function which takes an array and emits the majority element (if it exists), otherwise prints NONE as follows:


stack overflow的提示。 试想, 既然是majority element, 那么用这个majority element同那些非majority 相对应, 遇到不是majority的element, 就用这个majority element将其抵消掉(假如我们知道那一个数是majority), 最后剩下的数字必然只有我们的majority element。 当然算法运行前, 我们并不知道哪一个数是majority element。 所以我们就把相异的元素抵消掉, 最后剩下的元素就是我们的majority element了。


METHOD 1 (最基本的解决办法)

从前到后遍历数组计数。 其实我们只需要遍历一半的数组就可以有两个结果: 要么找到了这个众数,要么数组中不存在众数。 我们从索引0开始, 记录这个位置, 从这个位置开始向后遍历。 对arr[0]的值进行统计整个数组中出现的次数, 如果统计完成之后超过n/2, 那么这个数就是我们要找的majority element了, 直接输出即可, 否则从arr[1]的值进行统计, 一次进行下去, 只需要到达arr[n/2]我们就能知道答案了。   

Time Complexity: O((n - 1)+(n-1) + ... + n/2) = O(n^2).
Auxiliary Space : O(1).


METHOD 2 (Using Binary Search Tree)

这个办法是使用二叉搜索树进行统计。 每个节点保存有数组元素数值以及对应的出现的频率信息, 每次插入都要检查插入元素的频率是否超过n/2, 超过, 即退出, 就是找到了这个元素。

Node of the Binary Search Tree (used in this approach) will be as follows.

structtree
{
  intelement;
  intcount;
}BST;

Insert elements in BST one by one and if an element is already present then increment the count of the node. At any stage, if count of a node becomes more than n/2 then return.
The method works well for the cases where n/2+1 occurrences of the majority element is present in the starting of the array, for example {1, 1, 1, 1, 1, 2, 3, 4}.

Time Complexity:
 If a binary search tree is used then time complexity will be O(n^2). If a self-balancing-binary-search tree is used then O(nlogn)
Auxiliary Space: O(n)


METHOD 3 (使用 Moore’s Voting Algorithm)

This is a two step process.
1. Get an element occurring most of the time in the array. This phase will make sure that if there is a majority element then it will return that only.
2. Check if the element obtained from above step is majority element.

1. Finding a Candidate:
The algorithm for first phase that works in O(n) is known as Moore’s Voting Algorithm. Basic idea of the algorithm is if we cancel out each occurrence of an element e with all the other elements that are different from e then e will exist till end if it is a majority element.

Above algorithm loops through each element and maintains a count of a[maj_index], If next element is same then increments the count, if next element is not same then decrements the count, and if the count reaches 0 then changes the maj_index to the current element and sets count to 1.
First Phase algorithm gives us a candidate element. In second phase we need to check if the candidate is really a majority element. Second phase is simple and can be easily done in O(n). We just need to check if count of the candidate element is greater than n/2.

Example:
A[] = 2, 2, 3, 5, 2, 2, 6
Initialize:
maj_index = 0, count = 1 –> candidate ‘2?
2, 2, 3, 5, 2, 2, 6

Same as a[maj_index] => count = 2
2, 2, 3, 5, 2, 2, 6

Different from a[maj_index] => count = 1
2, 2, 3, 5, 2, 2, 6

Different from a[maj_index] => count = 0
Since count = 0, change candidate for majority element to 5 => maj_index = 3, count = 1
2, 2, 3, 5, 2, 2, 6

Different from a[maj_index] => count = 0
Since count = 0, change candidate for majority element to 2 => maj_index = 4
2, 2, 3, 5, 2, 2, 6

Same as a[maj_index] => count = 2
2, 2, 3, 5, 2, 2, 6

Different from a[maj_index] => count = 1

Finally candidate for majority element is 2.

First step uses Moore’s Voting Algorithm to get a candidate for majority element.

2. Check if the element obtained in step 1 is majority

printMajority (a[], size)
1.  Find the candidate for majority
2.  If candidate is majority. i.e., appears more than n/2 times.
       Print the candidate
3.  Else
       Print "NONE"

Implementation of method 3:

<span style="font-size:14px;">/* Program for finding out majority element in an array */
# include<stdio.h>
# define bool int
  
int findCandidate(int *, int);
bool isMajority(int *, int, int);
  
/* Function to print Majority Element */
void printMajority(int a[], int size)
{
  /* Find the candidate for Majority*/
  int cand = findCandidate(a, size);
  
  /* Print the candidate if it is Majority*/
  if(isMajority(a, size, cand))
    printf(" %d ", cand);
  else
    printf("NO Majority Element");
}
  
/* Function to find the candidate for Majority */
int findCandidate(int a[], int size)
{
    int maj_index = 0, count = 1;
    int i;
    for(i = 1; i < size; i++)
    {
        if(a[maj_index] == a[i])
            count++;
        else
            count--;
        if(count == 0)
        {
            maj_index = i;
            count = 1;
        }
    }
    return a[maj_index];
}
  
/* Function to check if the candidate occurs more than n/2 times */
bool isMajority(int a[], int size, int cand)
{
    int i, count = 0;
    for (i = 0; i < size; i++)
      if(a[i] == cand)
         count++;
    if (count > size/2)
       return 1;
    else
       return 0;
}
  
/* Driver function to test above functions */
int main()
{
    int a[] = {1, 3, 3, 1, 2};
    printMajority(a, 5);
    getchar();
    return 0;
}</span>
Time Complexity: O(n)
Auxiliary Space : O(1)

算法证明:

 * A one-pass, linear time algorithm that returns the majority element of a
 * range of data.  The majority element is an element that appears strictly
 * more than half the time.  For example, in the sequence
 *
 *                                  0 1 0 0 2 0 3
 *
 * The number 0 is a majority element, since it appears 4/7 times.  However,
 * in the sequence
 *
 *                                 0 1 0 0 2 0 3 3
 *
 * There is no majority element, since even though 0 occurs 4/8 times, this
 * isn't strictly greater than half the elements.
 *
 * The algorithm for finding the majority element is remarkably simple, but its
 * correctness is not immediately obvious.  The algorithm works as follows.  At
 * each step, we maintain our "guess" of what the majority element will be, and
 * also a counter.  We then scan across the array.  At each point, if the new
 * element matches our current guess we increment the counter, and otherwise
 * we decrement it.  If the counter is ever zero, then on the next element we
 * change the counter to 1 and pick the next element as our guess.  Finally, we
 * output the guessed element.  For example, here is the algorithm running on
 * the earlier input.  The topmost row shows the input, below that our guess,
 * and below that the counter:
 *
 *  INPUT    0 1 0 0 2 0 3
 *  GUESS   ? 0 ? 0 0 0 0 0
 *  COUNTER 0 1 0 1 2 1 2 1
 *
 * Since our guess at the end is zero, we output zero.  This algorithm is
 * due to Boyer and Moore and is described in their paper "MJRTY - A Fast
 * Majority Vote Algorithm."
 *
 * There are many ways to think about why this algorithm works.  One good
 * intuition is to think of the algorithm as breaking the input down into lots
 * of stretches of consecutive copies of particular values.  Incrementing the
 * counter then corresponds to marking that multiple copies of the same value
 * were found, while decrementing it corresponds to some other sequences of
 * values "canceling out" the accumulation of values of a particular type.
 *
 * A formal proof of correctness of this algorithm (based on the proof in
 * Boyer and Moore's paper) relies on a key lemma.  In this section, we'll
 * let C be the number that is currently a candidate for the majority element,
 * K be its count after some number of steps, and N be the number of total
 * elements.
 *
 * Lemma 1: For any i, 1 <= i <= N, after i steps of the algorithm, the
 * elements in the range [1, i] can rearranged into two groups A and B such
 * that A is K copies of C, and B is a collection of elements with at most
 * i / 2 copies of any one element.
 *
 * Let's hold off of the proof of this lemma for now, and show that if it
 * holds and there is a majority element, the algorithm must be correct.  Using
 * the above lemma, note that when the algorithm terminates, there must be some
 * element C that was chosen with some count K.  Assume for the sake of
 * contradiction that C is not the majority element; then there is some other
 * element C' that must be the majority element.  Consequently, there are at
 * least n / 2 elements of the range equal to C'.  Let's consider where they
 * are.  By the above lemma, all the elements of the input can be broken up
 * into groups A and B, where everything in group A has value C and at most
 * |B| / 2 elements of |B| have value C'.  Since |A| = K and |A| + |B| = N,
 * this means that there are at most (N - K) / 2 copies of C', contradicting
 * the fact that C' is the actual majority element.  We have reached a
 * contradiction, and so C must be the majority element at the end of the
 * algorithm's run.
 *
 * We can now prove the claim of the lemma by induction on i.  As a base case,
 * if i = 1, then K = 1 and C is the first element of the range.  Then we can
 * let A be the singleton element and B be the empty set, which trivially obeys
 * the criteria of the lemma.  For the inductive step, assume that for some i
 * the claim holds and consider the execution of the algorithm on step i + 1.
 * Let A and B be the sets A and B from the ith step.  Then we consider three
 * possible cases:
 *
 * 1. On entry to this step, K = 0.  Then after this step finishes, K = 1
 *    and C is the newest element.  This means that on entry to this step,
 *    A was the empty set and B was some set where no element appeared more
 *    than i/2 times in B.  If we then let A' be the singleton set containing
 *    the new element and B' = B, then these sets satisfy the requirements of
 *    the lemma and the claim holds.
 * 2. On entry to this step, K > 0 and the new element matches the current
 *    majority element.  Then we can add this element to A to get a new set
 *    A' meeting the lemma's requirements, so the claim holds.
 * 3. On entry to this step, K > 0, and the new element does not match the
 *    current majority element.  This means that the new K is one minus the
 *    previous K, but the candidate majority element does not change.  If
 *    we then move one element from A into B, then place the new element into
 *    the set B, then the updated A and B will satisfy the lemma's claims.
 *    This is tedious but simple to check, so I'll leave it as an exercise
 *    to the reader. :-)
 *
 * In the case where there is no majority element, the element produced by the
 * algorithm will be arbitrary.  We can then check whether we have the majority
 * element by performing a linear scan over the input range and counting the
 * frequency of the element.






  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值