Bit Count: Parallel Counting - MIT HAKMEM

Q: Find the number of set bits in a given integer

Sol: Parallel Counting: MIT HAKMEM Count
HAKMEM (Hacks Memo) is a legendary collection of neat mathematical and programming hacks contributed mostly by people at MIT and some elsewhere. This source is from the MIT AI LABS and this brilliant piece of code orginally in assembly was probably conceived in the late 70’s.

int BitCount(unsigned int u)
{
unsigned int uCount;

uCount = u - ((u >> 1) & 033333333333) - ((u >> 2) & 011111111111);
return ((uCount + (uCount >> 3)) & 030707070707) % 63;
}

Lets take a look at the theory behind this idea.

Take a 32bit number n;
n = a31 * 231 + a30 * 230 +…..+ ak * 2k +….+ a1 * 2 + a0;

Here a0 through a31 are the values of bits (0 or 1) in a 32 bit number. Since the problem at hand is to count the number of set bits in the number, simply summing up these co-efficients would yeild the solution. (a0 + a1 +..+ a31 ).

How do we do this programmatically?

Take the original number n and store in the count variable.
count=n;

Shift the orignal number 1 bit to the right and subtract from the orignal.
count = n - (n >>1);

Now Shift the original number 2 bits to the right and subtract from count;
count = n - (n>>1) - (n>>2);

Keep doing this until you reach the end.
count = n - (n>>1) - (n>>2) - … -( n>>31);

Let analyze and see what count holds now.
n = a31 * 231 + a30 * 230 +…..+ ak * 2k +….+ a1 * 2 + a0;

n >> 1 = a31 * 230 + a30 * 229 +…..+ ak * 2k-1 +….+ a1;
n >> 2 = a31 * 229 + a30 * 228 +…..+ ak* 2k-2 +….+ a2

;
..
n >> k = a31 * 2(31-k) + a30 * 2(30-k) +…..+ ak * 2k;

..
n>>31 = a31;

You can quickly see that: (Hint: 2k - 2k-1 = 2k-1
count = n - (n>>1) - (n>>2) - … -( n>>31) =a31+ a30 +..+a0;
which is what we are looking for;

int BitCount(unsigned int u)
{
unsigned int uCount=u;
do
{
u=u>>1;
uCount -= u;
}
while(u);
}

This certainaly is an interesting way to solve this problem. But how do you make this brilliant? Run this in constant time with constant memory!!.

int BitCount(unsigned int u)
{
unsigned int uCount;

uCount = u - ((u >> 1) & 033333333333) - ((u >> 2) & 011111111111);
return ((uCount + (uCount >> 3)) & 030707070707) % 63;
}

For those of you who are still wondering whats going? Basically use the same idea, but instead of looping over the entire number, sum up the number in blocks of 3 (octal) and count them in parallel.

After this statement uCount = n - ((n >> 1) & 033333333333) - ((n >> 2) & 011111111111);
uCount has the sum of bits in each octal block spread out through itself.

So if you can a block of 3 bits

u = a2*22 + a1*2+ a0;
u>>1 = a2*2 + a1;
u>>2 = a2;

u - (u>>1) - (u>>2) is a2+a1+a0 which is the sum of bits in each block of 3 bits.

The nexe step is to grab all these and sum them up:

((uCount + (uCount >> 3)) will re-arrange them in blocks of 6 bits and sum them up in such a way the every other block of 3 has the sum of number of set bits in the original number plus the preceding block of 3. The only expection here is the first block of 3. The idea is not to spill over the bits to adjacent blocks while summing them up. How is that made possible. Well, the maximum number of set bits in a block of 3 is 3, in a block of 6 is 6. and 3 bits can represent upto 7. This way you make sure you dont spill the bits over. To mask out the junk while doing a uCount>>3. Do and AND with 030707070707. THe only expection is the first block as I just mentioned.

What does ((uCount + (uCount >> 3)) & 030707070707) hold now?
Its 2^0 * (2^6 - 1) * sum0 + 2^1 * (2^6 - 1) * sum1 + 2^2 * (2^6 - 1) * sum2 + 2^3 * (2^6 - 1) * sum3 + 2^4 * (2^6 - 1) * sum4 + 2^5 * (2^3 - 1) * sum5
where sum0 is the sum of number of set bits in every block of 6 bits starting from the ‘low’ position.
What we need is sum0 + sum1 + sum2 + sum3 + sum4 + sum5;
2^6-1 is 63. Clearly a modulo with 63 will get you what you want.

Remember, that this works only with 32 bits numbers, For 64-bit numbers (well I will wait for comments or do this in the next post).


检查错误原因 creating directory /data/primary/gpseg0 ... ok creating subdirectories ... ok selecting default max_connections ... 750 selecting default shared_buffers ... 125MB selecting default timezone ... Asia/Shanghai selecting dynamic shared memory implementation ... posix creating configuration files ... ok creating template1 database in /data/primary/gpseg0/base/1 ... child process was terminated by signal 9: Killed initdb: removing data directory "/data/primary/gpseg0" 2023-06-08 08:53:53.568563 GMT,,,p22007,th-604637056,,,,0,,,seg-10000,,,,,"LOG","00000","skipping missing configuration file ""/data/primary/gpseg0/postgresql.auto.conf""",,,,,,,,"ParseConfigFile","guc-file.l",563, 20230608:16:54:12:021728 gpcreateseg.sh:VM-0-5-centos:gpadmin-[INFO]:-Start Function BACKOUT_COMMAND 20230608:16:54:12:021728 gpcreateseg.sh:VM-0-5-centos:gpadmin-[INFO]:-End Function BACKOUT_COMMAND 20230608:16:54:12:021728 gpcreateseg.sh:VM-0-5-centos:gpadmin-[INFO]:-Start Function BACKOUT_COMMAND 20230608:16:54:12:021728 gpcreateseg.sh:VM-0-5-centos:gpadmin-[INFO]:-End Function BACKOUT_COMMAND 20230608:16:54:12:021728 gpcreateseg.sh:VM-0-5-centos:gpadmin-[FATAL][0]:-Failed to start segment instance database VM-0-5-centos /data/primary/gpseg0 20230608:16:54:12:019435 gpinitsystem:VM-0-5-centos:gpadmin-[INFO]:-End Function PARALLEL_WAIT 20230608:16:54:12:019435 gpinitsystem:VM-0-5-centos:gpadmin-[INFO]:-End Function PARALLEL_COUNT 20230608:16:54:12:019435 gpinitsystem:VM-0-5-centos:gpadmin-[INFO]:-Start Function PARALLEL_SUMMARY_STATUS_REPORT 20230608:16:54:12:019435 gpinitsystem:VM-0-5-centos:gpadmin-[INFO]:------------------------------------------------ 20230608:16:54:12:019435 gpinitsystem:VM-0-5-centos:gpadmin-[INFO]:-Parallel process exit status 20230608:16:54:12:019435 gpinitsystem:VM-0-5-centos:gpadmin-[INFO]:------------------------------------------------ 20230608:16:54:12:019435 gpinitsystem:VM-0-5-centos:gpadmin-[INFO]:-Total processes marked as completed = 0 20230608:16:54:12:019435 gpinitsystem:VM-0-5-centos:gpadmin-[INFO]:-Total processes marked as killed = 0 20230608:16:54:12:019435 gpinitsystem:VM-0-5-centos:gpadmin-[WARN]:-Total processes marked as failed = 1 <<<<< 20230608:16:54:12:019435 gpinitsystem:VM-0-5-centos:gpadmin-[INFO]:------------------------------------------------ 20230608:16:54:12:019435 gpinitsystem:VM-0-5-centos:gpadmin-[INFO]:-End Function PARALLEL_SUMMARY_STATUS_REPORT FAILED:VM-0-5-centos~6000~/data/primary/gpseg0~2~0
最新发布
06-09
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值