Generally Useful Optimizations(1)
(CMU15213)
1. Optimizations
1. Code Motion
- Reduce frequency with which computation is performed
- If it will always produce same result
- Especially moving code out of loop
//Original Code
void set_row(double *a, double *b, long i, long n)
{
    long j;
    for (j = 0; j < n; j++)
        a[n*i+j] = b[j];
}
//Optimized Code
void set_row(double *a, double *b, long i, long n)
{
    long j;
    long ni = n*i;   // long, not int: n and i are long
    for (j = 0; j < n; j++)
        a[ni+j] = b[j];
}
The fixed calculation n*i is only computed once.
-> gcc -O1
Optimizations visible in the gcc -O1 output for this code:
- the repeated calculation n*i is moved outside the loop
- the row is accessed through a pointer, rowp
2. Reduction in Strength
- Replace costly operation with simpler one
- Using addition instead of multiplication.
- Shift, add instead of multiply or divide
- 16*x --> x << 4
- Utility is machine dependent
- Depends on the cost of the multiply or divide instruction.
- For example, on Intel Nehalem an integer multiply takes 3 CPU cycles
- Recognize sequence of products
Turn multiplication into addition: the variable ni is updated in a predictable pattern (it grows by n on each outer iteration), so each new value can be obtained by adding n to the previous one.
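A minimal sketch of this pattern, assuming the set_row loop is wrapped in an outer loop over all n rows. The function names fill_rows_mul and fill_rows_add are hypothetical and only serve to put the two versions side by side.

```c
/* Baseline: one multiplication n*i per row. */
void fill_rows_mul(double *a, const double *b, long n)
{
    for (long i = 0; i < n; i++) {
        long ni = n * i;
        for (long j = 0; j < n; j++)
            a[ni + j] = b[j];
    }
}

/* Strength-reduced: ni takes the values 0, n, 2n, ...,
 * so an addition per row replaces the multiplication. */
void fill_rows_add(double *a, const double *b, long n)
{
    long ni = 0;
    for (long i = 0; i < n; i++) {
        for (long j = 0; j < n; j++)
            a[ni + j] = b[j];
        ni += n;   /* the next row starts n elements later */
    }
}
```

Both versions write the same values; whether the second is actually faster depends on the machine, as noted above.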
3. Share Common Subexpression
- Reuse portions of expressions
- GCC will do this with -O1
Example of the array indexing optimization:
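A minimal sketch of this kind of indexing code, assuming a row-major n-by-n array called val and summing the four neighbors of element (i, j). The function names are hypothetical, and the caller is assumed to pass an interior element so that all four neighbors exist.

```c
/* Baseline: three multiplications, (i-1)*n, (i+1)*n and i*n. */
double neighbor_sum_naive(const double *val, long n, long i, long j)
{
    double up    = val[(i - 1) * n + j];
    double down  = val[(i + 1) * n + j];
    double left  = val[i * n + j - 1];
    double right = val[i * n + j + 1];
    return up + down + left + right;
}

/* Shared subexpression: i*n + j is computed once and reused,
 * leaving a single multiplication. */
double neighbor_sum_shared(const double *val, long n, long i, long j)
{
    long inj = i * n + j;
    double up    = val[inj - n];
    double down  = val[inj + n];
    double left  = val[inj - 1];
    double right = val[inj + 1];
    return up + down + left + right;
}
```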
2. Optimization Blocker
1. Procedure Calls
void lower(char *s)
{
    size_t i;
    for (i = 0; i < strlen(s); i++)
        if (s[i] >= 'A' && s[i] <= 'Z')
            s[i] -= ('A' - 'a');
}
- Time quadruples when the string length doubles
- Quadratic performance
- strlen is executed on every iteration.
When calling strlen
…
size_t strlen(const char *s)
{
    size_t length = 0;
    while (*s != '\0') {
        s++;
        length++;
    }
    return length;
}
- Strlen performance
- Only way to determine length of string is to scan its entire length, looking for null character.
- Overall performance, string of length N
- N calls to strlen
- Each call scans the whole string, so each takes time proportional to N
- Overall O(N^2) performance
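The quadratic growth can be made visible by counting work instead of timing it. The sketch below is an illustration under assumptions: counting_strlen, lower_counted and chars_scanned are hypothetical names, and the counter simply tallies how many characters the length scan examines.

```c
#include <stddef.h>

/* Hypothetical instrumentation: characters examined so far. */
static unsigned long chars_scanned = 0;

/* Same scan as strlen above, plus a counter. */
size_t counting_strlen(const char *s)
{
    size_t length = 0;
    while (*s != '\0') {
        s++;
        length++;
        chars_scanned++;
    }
    return length;
}

/* lower() with the length computed in the loop condition,
 * so the whole string is rescanned before every iteration. */
void lower_counted(char *s)
{
    size_t i;
    for (i = 0; i < counting_strlen(s); i++)
        if (s[i] >= 'A' && s[i] <= 'Z')
            s[i] -= ('A' - 'a');
}
```

Resetting chars_scanned and calling lower_counted on strings of length N and 2N should show the count growing roughly fourfold, which matches the quadrupling of run time noted earlier.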
Improving Performance
void lower(char *s)
{
    size_t i;
    size_t len = strlen(s);
    for (i = 0; i < len; i++)
        if (s[i] >= 'A' && s[i] <= 'Z')
            s[i] -= ('A' - 'a');
}
- Move call to strlen outside of loop
- Since result does not change from one iteration to another
- Form of code motion
Question about the code inside the red box: why does it keep moving the value back and forth between memory and registers on every iteration?
--> Because in C, the compiler cannot be sure that there is no memory aliasing.
Memory Aliasing
- In C, one data structure can overlay another in memory, so two separate parts of a program may refer to the same memory locations. This is memory aliasing.
When b gets updated, it may also be changing a, and therefore changing what is being read in the summation. The compiler has to assume that aliasing might happen, that is, that these two memory regions might overlap, so it carefully writes the value out to memory and reads it back on every iteration.
![image-20211221160550181](https://raw.githubusercontent.com/BravoNathan/cloudimg/main/image/202112211605218.png)
--> Introducing local variables ☆
Hold the running value in a temporary variable instead of reading and writing the same memory location.
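A minimal sketch of the two versions, assuming a is a row-major n-by-n matrix and b receives the sum of each row. The names sum_rows1 and sum_rows2 follow the usual form of this lecture example and are an assumption here, since the code itself appears only in the image.

```c
/* Accumulates directly in b[i]. Because a and b might overlap,
 * b[i] has to be read from and written to memory on every pass
 * through the inner loop. */
void sum_rows1(double *a, double *b, long n)
{
    long i, j;
    for (i = 0; i < n; i++) {
        b[i] = 0;
        for (j = 0; j < n; j++)
            b[i] += a[i*n + j];
    }
}

/* Accumulates in a local variable, which can stay in a register;
 * b[i] is written once per row. */
void sum_rows2(double *a, double *b, long n)
{
    long i, j;
    for (i = 0; i < n; i++) {
        double val = 0;
        for (j = 0; j < n; j++)
            val += a[i*n + j];
        b[i] = val;
    }
}
```

When a and b do not overlap, both functions give the same result. When they do overlap (for example, calling with b pointing at a + 3 for a 3-by-3 matrix), the results can differ, which is why the compiler cannot make this change on its own and the programmer has to.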