Loop Unrolling
A prime example of an optimization with a speed-space tradeoff is loop unrolling:
- This form of optimization increases the speed of loops by eliminating the “end of loop” test on each iteration.
- Loop unrolling increases the speed of the resulting executable but generally also increases its size.
- Loop unrolling is also possible when the upper bound of the loop is unknown, provided the start and end conditions are handled correctly.
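
The points above can be sketched in C. This is a hand-written illustration of the transformation under stated assumptions, not the code GCC itself generates. The function names (fill_rolled, fill_unrolled, sum_unrolled4) and the unroll factor of 4 are hypothetical choices made for the example.

```c
#include <stddef.h>

/* Rolled form: the end-of-loop test (i < 8) is evaluated on every iteration. */
void fill_rolled(int y[8])
{
    for (int i = 0; i < 8; i++)
        y[i] = i;
}

/* Fully unrolled form: no loop test at all, but eight statements
   instead of one, so the code is larger. */
void fill_unrolled(int y[8])
{
    y[0] = 0;
    y[1] = 1;
    y[2] = 2;
    y[3] = 3;
    y[4] = 4;
    y[5] = 5;
    y[6] = 6;
    y[7] = 7;
}

/* Unknown upper bound n: unroll by an assumed factor of 4, then handle
   the remaining 0 to 3 elements in a short cleanup loop so that the
   end condition stays correct. */
long sum_unrolled4(const int *a, size_t n)
{
    long total = 0;
    size_t i = 0;

    /* Main body: one loop test per four elements. */
    for (; i + 4 <= n; i += 4) {
        total += a[i];
        total += a[i + 1];
        total += a[i + 2];
        total += a[i + 3];
    }

    /* Cleanup: the elements left over when n is not a multiple of 4. */
    for (; i < n; i++)
        total += a[i];

    return total;
}
```

Both fill functions leave the same values in y, and sum_unrolled4 returns the same total as a simple one-element-per-iteration loop for any n, including 0. The unrolled versions trade a larger body for fewer loop tests.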
Scheduling
- The lowest level of optimization is scheduling, in which the compiler determines the best ordering of individual instructions.
- Most CPUs allow one or more new instructions to start executing before others have finished. Many CPUs also support pipelining, where multiple instructions execute in parallel on the same CPU.
- When scheduling is enabled, the compiler arranges instructions so that their results become available to later instructions at the right time, and so that as many instructions as possible can execute in parallel.
- Scheduling improves the speed of an executable without increasing its size, but requires additional memory and time in the compilation process itself (due to its complexity).
Optimization Levels
To control compilation time and compiler memory usage, and the trade-offs between speed and space for the resulting executable, GCC provides a range of general optimization levels, as well as individual options for specific types of optimization.
- An optimization level is chosen with the command line option ‘-OLEVEL’, where LEVEL is a number from 0 to 3.
- It is important to remember that the benefit of optimization at the highest levels must be weighed against the cost. The cost of optimization includes greater complexity in debugging, and increased time and memory requirements during compilation.
- For most purposes it is satisfactory to use ‘-O0’ for debugging, and ‘-O2’ for development and deployment.
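
One way to see the effect of the chosen level from inside a program is GCC's predefined macro __OPTIMIZE__, which is defined in optimizing compilations (that is, at levels above ‘-O0’). The following is a minimal sketch. The function name build_kind and the file name opt.c in the comments are hypothetical, introduced only for this example.

```c
/* Possible builds, assuming the source is saved as opt.c:
 *   gcc -Wall -O0 -c opt.c    (typical debugging build)
 *   gcc -Wall -O2 -c opt.c    (typical development/deployment build)
 */

/* Report whether this translation unit was compiled with optimization.
   __OPTIMIZE__ is predefined by GCC when optimization is enabled. */
const char *build_kind(void)
{
#ifdef __OPTIMIZE__
    return "optimized";
#else
    return "unoptimized";
#endif
}
```

Calling build_kind from a program built with ‘-O0’ should give "unoptimized", while the same source built with ‘-O2’ should give "optimized". Other compilers may not define this macro.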