The cost of a mispredicted branch can be very high, but the branch prediction logic found in modern processors is very good at discerning regular patterns and long-term trends for the different branch instructions. Therefore, not all branches in a program will slow it down. However, branch prediction is only reliable for regular patterns. Many tests in a program are completely unpredictable, dependent on arbitrary features of the data, such as whether a number is negative or positive.
For inherently unpredictable cases, program performance can be greatly enhanced if the compiler is able to generate code using conditional data transfers rather than conditional control transfers. The C/C++ programmer cannot control this directly, but some ways of expressing conditional behavior translate more directly into conditional moves than others.
Suppose we are given two arrays of integers a and b, and at each position i, we want to set a[i] to the minimum of a[i] and b[i], and b[i] to the maximum. An imperative style of implementing this function, where we use conditionals to selectively update program state, is to check at each position i and swap the two elements if they are out of order:
// Rearrange two vectors so that for each i, b[i] >= a[i]
void minmax1(int a[], int b[], int n) {
    for (int i = 0; i != n; ++i) {
        if (a[i] > b[i]) {
            int t = a[i];
            a[i] = b[i];
            b[i] = t;
        }
    }
}
Measurements for this function show a CPE (cycles per element) of around 14.50 for random data and 3.00-4.00 for predictable data, a clear sign of a high misprediction penalty.
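The two kinds of input behind such a measurement can be sketched as follows. This is only an illustration under stated assumptions: the helper names fill_random, fill_predictable, pairs_ordered, and time_random are hypothetical, and clock() reports CPU seconds rather than CPE, so the numbers it yields are not directly comparable to the figures quoted above.

```c
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* Same swap-based function as in the text. */
void minmax1(int a[], int b[], int n) {
    for (int i = 0; i != n; ++i) {
        if (a[i] > b[i]) {
            int t = a[i];
            a[i] = b[i];
            b[i] = t;
        }
    }
}

/* Unpredictable input: each test a[i] > b[i] is close to a coin flip. */
void fill_random(int a[], int b[], int n, unsigned seed) {
    srand(seed);
    for (int i = 0; i != n; ++i) {
        a[i] = rand();
        b[i] = rand();
    }
}

/* Predictable input: a[i] < b[i] everywhere, so the swap is never taken. */
void fill_predictable(int a[], int b[], int n) {
    for (int i = 0; i != n; ++i) {
        a[i] = i;
        b[i] = i + 1;
    }
}

/* Returns 1 if a[i] <= b[i] for every i, and 0 otherwise. */
int pairs_ordered(const int a[], const int b[], int n) {
    for (int i = 0; i != n; ++i) {
        if (a[i] > b[i]) return 0;
    }
    return 1;
}

/* Hypothetical helper: total CPU seconds spent in f over `trials` runs on
   random data. The arrays are refilled before every run, because one pass
   of minmax1 leaves them ordered and therefore predictable. */
double time_random(void (*f)(int[], int[], int),
                   int a[], int b[], int n, int trials) {
    clock_t total = 0;
    for (int t = 0; t != trials; ++t) {
        fill_random(a, b, n, (unsigned)t + 1u);
        clock_t start = clock();
        f(a, b, n);
        total += clock() - start;
        assert(pairs_ordered(a, b, n)); /* sanity check on the result */
    }
    return (double)total / CLOCKS_PER_SEC;
}
```

Refilling inside the loop is the important design choice here: timing repeated calls on the same arrays would measure the predictable case after the first pass.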
A functional style of implementing this function, where we use conditional operations to compute values and then update the program state with those values, is to compute the minimum and maximum values at each position i and then assign them to a[i] and b[i], respectively:
// Rearrange two vectors so that for each i, b[i] >= a[i]
void minmax2(int a[], int b[], int n) {
    for (int i = 0; i != n; ++i) {
        int min = a[i] < b[i] ? a[i] : b[i];
        int max = a[i] < b[i] ? b[i] : a[i];
        a[i] = min;
        b[i] = max;
    }
}
The measurements for this function show a CPE of around 5.0 regardless of whether the data are arbitrary or predictable. We should still examine the generated assembly code to make sure that it indeed uses conditional moves.
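Before comparing the performance of the two versions, it is worth confirming that they compute the same result. The following is a minimal sketch of such a check; the helper minmax_agree is hypothetical and not part of the original code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Imperative version from the text: swap when out of order. */
void minmax1(int a[], int b[], int n) {
    for (int i = 0; i != n; ++i) {
        if (a[i] > b[i]) {
            int t = a[i];
            a[i] = b[i];
            b[i] = t;
        }
    }
}

/* Functional version from the text: compute min and max, then store. */
void minmax2(int a[], int b[], int n) {
    for (int i = 0; i != n; ++i) {
        int min = a[i] < b[i] ? a[i] : b[i];
        int max = a[i] < b[i] ? b[i] : a[i];
        a[i] = min;
        b[i] = max;
    }
}

/* Hypothetical helper: runs both versions on private copies of the input
   and returns 1 if the results are identical. Returns 0 if they differ or
   if memory allocation fails. Requires n > 0. */
int minmax_agree(const int a[], const int b[], int n) {
    assert(n > 0);
    size_t bytes = (size_t)n * sizeof(int);
    int *a1 = malloc(bytes), *b1 = malloc(bytes);
    int *a2 = malloc(bytes), *b2 = malloc(bytes);
    int same = 0;
    if (a1 && b1 && a2 && b2) {
        memcpy(a1, a, bytes);
        memcpy(b1, b, bytes);
        memcpy(a2, a, bytes);
        memcpy(b2, b, bytes);
        minmax1(a1, b1, n);
        minmax2(a2, b2, n);
        same = memcmp(a1, a2, bytes) == 0 && memcmp(b1, b2, bytes) == 0;
    }
    free(a1);
    free(b1);
    free(a2);
    free(b2);
    return same;
}
```

To inspect the generated code with GCC, one option is to compile with optimization and stop after the assembly stage, for example gcc -O2 -S minmax.c (the file name is an assumption), and then look for conditional move instructions such as cmovl or cmovg in the loop on x86-64. The exact output depends on the compiler, its version, and the target. An optimizing compiler may also vectorize the loop and use packed minimum and maximum instructions instead, which likewise avoids the branch.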