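Introduction
While idly working through practice problems, I came across a population count problem: count how many 1 bits (in binary) the input vector contains. The problem statement is:
A "population count" circuit counts the number of '1's in an input vector. Build a population count circuit for a 255-bit input vector.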
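The Popcount Algorithm
After some searching, I found a fairly good algorithm. It is based on divide and conquer and needs only O(log N) steps for an N-bit input.
The algorithm is called shift and add. As the name suggests, it is built mainly from shift, AND, and add operations.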
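How the Algorithm Works
The algorithm relies on two observations:
1. The most direct approach is to add up every bit to get the result. That summation can be optimized with divide and conquer plus parallel addition.
2. An n-bit integer contains at most n ones, and the value n always fits in n bits. So once the number of 1s in some k-bit group has been computed, the result can be stored directly in those same k bits, with no extra space.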
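For comparison, the direct bit-by-bit approach from the first observation might look like the following minimal sketch. The function name `popcount_naive` and the 32-bit width are assumptions chosen for illustration; they do not come from the problem statement.

```c
#include <stdint.h>

/* Baseline: examine one bit per iteration and add it to a running total.
 * The name and the 32-bit width are illustrative assumptions. */
unsigned popcount_naive(uint32_t x)
{
    unsigned count = 0;
    while (x != 0) {
        count += x & 1u;  /* add the lowest bit to the total */
        x >>= 1;          /* bring the next bit into the lowest position */
    }
    return count;
}
```

This loop runs once per bit up to the highest set bit, which is the cost the divide-and-conquer version reduces.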
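Take the 4-bit binary number abcd as an example. The final result is a+b+c+d, which takes 4 additions in a loop (one per bit).
Instead, we add adjacent bits in pairs, that is, a+b+c+d = [a+b]+[c+d]:
  0 b 0 d
+ 0 a 0 c
= e f g h
Here ef = a+b and gh = c+d, where 0b0d = (abcd) & 0101 and 0a0c = ((abcd) >> 1) & 0101.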
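The pairwise step can be sketched in C as below. This is only an illustration for a 4-bit value held in the low bits of a byte; the function name `pair_counts4` is hypothetical.

```c
#include <stdint.h>

/* First step for a 4-bit value abcd (stored in the low 4 bits of x).
 * Returns efgh, where ef = a+b and gh = c+d.
 * The function name is an illustrative assumption. */
uint8_t pair_counts4(uint8_t x)
{
    uint8_t low  = x & 0x5u;         /* 0b0d: the low bit of each pair  */
    uint8_t high = (x >> 1) & 0x5u;  /* 0a0c: the high bit of each pair */
    return (uint8_t)(low + high);    /* each 2-bit field now holds 0, 1, or 2 */
}
```

Because each 2-bit field holds at most 2 (binary 10), the addition never carries into the neighbouring field, which is why both pairs can be summed with a single add.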
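Next, add the two adjacent groups ef and gh:
  0 0 g h
+ 0 0 e f
= i j k l
Here ijkl = ef+gh, where 00gh = (efgh) & 0011 and 00ef = ((efgh) >> 2) & 0011.
Continuing in the same way, an N-bit input needs log2(N) such steps.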
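Extending the same pattern to a wider word gives log2(64) = 6 steps for 64 bits. The sketch below also shows one possible way to handle the 255-bit vector from the problem statement in software. It assumes the vector is packed into four 64-bit words with the top bit of the last word unused, which is a layout invented here for illustration, and the function names `popcount64` and `popcount255` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Shift-and-add popcount for one 64-bit word: log2(64) = 6 steps.
 * After each step, every field of the stated width holds the number
 * of 1s that were originally in that field. */
unsigned popcount64(uint64_t x)
{
    x = (x & 0x5555555555555555ULL) + ((x >> 1)  & 0x5555555555555555ULL); /* 2-bit fields  */
    x = (x & 0x3333333333333333ULL) + ((x >> 2)  & 0x3333333333333333ULL); /* 4-bit fields  */
    x = (x & 0x0F0F0F0F0F0F0F0FULL) + ((x >> 4)  & 0x0F0F0F0F0F0F0F0FULL); /* 8-bit fields  */
    x = (x & 0x00FF00FF00FF00FFULL) + ((x >> 8)  & 0x00FF00FF00FF00FFULL); /* 16-bit fields */
    x = (x & 0x0000FFFF0000FFFFULL) + ((x >> 16) & 0x0000FFFF0000FFFFULL); /* 32-bit fields */
    x = (x & 0x00000000FFFFFFFFULL) + ((x >> 32) & 0x00000000FFFFFFFFULL); /* whole word    */
    return (unsigned)x;
}

/* Assumed layout: bits 0..254 of the vector live in words[0..3],
 * and bit 63 of words[3] is not part of the vector. */
unsigned popcount255(const uint64_t words[4])
{
    unsigned total = 0;
    for (size_t i = 0; i < 3; i++) {
        total += popcount64(words[i]);
    }
    /* Mask off the unused top bit so it can never be counted. */
    total += popcount64(words[3] & 0x7FFFFFFFFFFFFFFFULL);
    return total;
}
```

Each line of `popcount64` is the same mask, shift, mask, add step as in the 4-bit example, only with wider masks and a doubled shift distance.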
Implementation
/* ===========================================================================
* Problem:
* Find the fastest way to count the 1s in a 32-bit integer.
*
* Algorithm:
* The problem is equivalent to calculating the Hamming weight of a 32-bit
* integer, or the Hamming distance between a 32-bit integer and 0. In the
* binary case, it is also called the population count, or popcount.[1]
*
* The best known solutions are based on adding counts in a tree pattern
* (divide and conquer). To save space, here is an example for an
* 8-bit binary number A=01101100:[1]
* | Expression | Binary | Decimal | Comment |
* | A | 01101100 | | the original number |
* | B = A & 01010101 | 01000100 | 1,0,1,0 | every other bit from A |
* | C = (A>>1) & 01010101 | 00010100 | 0,1,1,0 | remaining bits from A |
* | D = B + C | 01011000 | 1,1,2,0 | # of 1s in each 2-bit of A |
* | E = D & 00110011 | 00010000 | 1,0 | every other count from D |
* | F = (D>>2) & 00110011 | 00010010 | 1,2 | remaining counts from D |
* | G = E + F | 00100010 | 2,2 | # of 1s in each 4-bit of A |
* | H = G & 00001111 | 00000010 | 2 | every other count from G |
* | I = (G>>4) & 00001111 | 00000010 | 2 | remaining counts from G |
* | J = H + I | 00000100 | 4 | No. of 1s in A |
* Hence A has four 1s.
*
* [1] http://en.wikipedia.org/wiki/Hamming_weight
*
* ===========================================================================
*/