Instead of counting the differing bits for every pair in the array, which takes O(n^2) time, we count the differences one bit position at a time and accumulate them. More specifically, for each bit position, res += (n - count) * count, where count is how many of the n numbers have that bit set.
Why does it work?
For each bit position, we count how many pairs differ at that bit. E.g., at the first bit, if the first and second numbers have a 0 and the third number has a 1, then there are 2 * 1 = 2 differing pairs (first & third, second & third). Summing this over all bit positions gives the same total as adding up the differences pair by pair.
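The counting argument above can be sketched as follows. This is only an illustration, not the original solution: the function names, the `bits=32` default, and the assumption that the inputs are non-negative integers fitting in that many bits are all mine. A brute-force pairwise version is included just to cross-check the result.

```python
def total_hamming_distance(nums, bits=32):
    """Sum of differing bits over all pairs, counted one bit position at a time.

    Assumes non-negative integers that fit in `bits` bits (hypothetical default).
    """
    n = len(nums)
    res = 0
    for i in range(bits):
        # count = how many numbers have bit i set
        count = sum((x >> i) & 1 for x in nums)
        # every (bit set, bit clear) pair differs at this position
        res += (n - count) * count
    return res


def total_hamming_distance_pairs(nums):
    """O(n^2) reference: XOR each pair and count the 1 bits."""
    total = 0
    for idx, a in enumerate(nums):
        for b in nums[idx + 1:]:
            total += bin(a ^ b).count("1")
    return total
```

For example, with `[4, 14, 2]` (binary 0100, 1110, 0010), bits 1, 2 and 3 each split the numbers 2-vs-1 and contribute 2 apiece, so both functions return 6.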
update:
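Like radix sort, this works one bit at a time. Remember, do NOT compare the raw result of a bit & against 1; shift the bit you want down to the lowest position and do write ((x & 1) == 1).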
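One possible reading of this tip, sketched under my own assumptions (the helper name `bit_is_set` is hypothetical): masking with `1 << i` leaves the bit at position i, so the result is `1 << i` rather than 1, and comparing it with 1 fails for every i > 0. Shifting first and then masking with 1 avoids that. Separately, in Java and C-family languages `==` binds tighter than `&`, so the inner parentheses in `((x & 1) == 1)` are needed there; Python's precedence differs, but the parentheses are kept here for clarity.

```python
def bit_is_set(x, i):
    """True if bit i of x is 1: shift the bit down, then test the lowest bit."""
    return ((x >> i) & 1) == 1


x = 0b100  # only bit 2 is set

# Pitfall: the masked value is 4 (0b100), not 1, so this comparison is False.
mask_compared_to_one = (x & (1 << 2)) == 1

# Shift-then-mask yields exactly 0 or 1, so comparing with 1 is safe.
shifted_check = bit_is_set(x, 2)
```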