Re: Optimization - does O3 always generate faster code than O2?
%Therefore, I guess the conclusion (at least for gcc 3) is
%1. -O2 does not mandate loop unrolling;
%2. with -O or -O2, loop unrolling may or may not be turned on.
To clarify, what the gcc 3 manual means may be:
-O2 does not perform loop unrolling unless it is already performed by -O. Therefore, loop unrolling may or may not be performed under -O or -O2.
And it seems -O3 does not turn on loop unrolling either (unless it is already performed by -O).
Is my understanding correct?
As for the wording in the 4.1.1 manual, I have no clue what it means. In particular, it says "The compiler does not perform loop unrolling or function inlining when you specify -O2." It does not say "-O2 does not perform loop unrolling"; it says "the compiler does not perform loop unrolling". So it seems -O2 will turn off any loop unrolling that is enabled by -O!
Optimization - does O3 always generate faster code than O2?
Is it possible that code generated with the -O2 option runs faster than code generated with -O3, for example? Is it possible that an optimization technique used by -O3 is counter-productive for a particular algorithm?
And is there a more detailed explanation (preferably with examples) of each optimization technique used by gcc than the one in the GCC manual?
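One way I could check this myself is to build the same file once with -O2 and once with -O3 and time it. Below is a minimal sketch of such a timing helper. The names work, time_work and the seed parameter are made up for illustration, and it assumes clock() is fine-grained enough for the chosen repeat count:

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical workload: sum of squares, offset by a per-call seed so the
 * compiler cannot simply hoist the whole call out of the timing loop. */
static long work(const int *a, size_t n, int seed)
{
    long total = 0;
    size_t i;
    for (i = 0; i < n; i++)
        total += (long)a[i] * a[i] + seed;
    return total;
}

/* Calls work() `repeats` times, stores the accumulated result in
 * *result_out, and returns the elapsed CPU time in seconds. */
double time_work(const int *a, size_t n, int repeats, long *result_out)
{
    volatile long sink = 0;   /* volatile keeps the accumulation observable */
    clock_t start, end;
    int r;

    start = clock();
    for (r = 0; r < repeats; r++)
        sink += work(a, n, r);
    end = clock();

    *result_out = sink;
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Calling time_work from a small main() and comparing the seconds reported by the -O2 and -O3 builds would show which one is faster for that particular loop. It says nothing about other code.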
The article says: "In the GCC manpage, it's clearly written that: -O2 turns on all optional optimizations except for loop unrolling [...]" (In the 4.1.1 manual, the exact wording is: "The compiler does not perform loop unrolling or function inlining when you specify -O2.", which is even more confusing.) True, but -O2 turns on all flags that are turned on by -O, and -O turns on -floop-optimize, which "optionally" does loop unrolling. Therefore, I guess the conclusion (at least for gcc 3) is:
1. -O2 does not mandate loop unrolling;
2. with -O or -O2, loop unrolling may or may not be turned on.
However, based on the 4.1.1 wording, there is simply no loop unrolling under -O2, period. It somehow implies that if there is any loop unrolling optionally turned on by -O, -O2 would disable it. That is strange.
And how does -floop-optimize2 work?
GCC 4.1.1 manual:
-fprofile-use
Enable profile feedback directed optimizations, and optimizations generally profitable only with profile feedback available.
The following options are enabled: -fbranch-probabilities, -fvpt, -funroll-loops, -fpeel-loops, -ftracer.
-funroll-loops
Unroll loops whose number of iterations can be determined at compile time or upon entry to the loop. -funroll-loops implies -frerun-cse-after-loop. It also turns on complete loop peeling (i.e. complete removal of loops with small constant number of iterations). This option makes code larger, and may or may not make it run faster.
Enabled with -fprofile-use.
-floop-optimize
Perform loop optimizations: move constant expressions out of loops, simplify exit test conditions and optionally do strength-reduction and loop unrolling as well.
Enabled at levels -O, -O2, -O3, -Os.
-floop-optimize2
Perform loop optimizations using the new loop optimizer. The optimizations (loop unrolling, peeling and unswitching, loop invariant motion) are enabled by separate flags.
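For reference, this is what I understand "loop unrolling" to mean, written out by hand in C. It is only a sketch of the idea, not what gcc actually emits. The function names and the unroll factor of 4 are my own choices:

```c
#include <stddef.h>

/* "Rolled" loop: one addition and one loop test per element. */
long sum_rolled(const int *a, size_t n)
{
    long total = 0;
    size_t i;
    for (i = 0; i < n; i++)
        total += a[i];
    return total;
}

/* Same computation unrolled by hand, four elements per iteration.
 * Fewer loop tests and branches, at the cost of larger code. */
long sum_unrolled4(const int *a, size_t n)
{
    long total = 0;
    size_t i = 0;

    /* Main body: runs while at least four elements remain. */
    for (; i + 4 <= n; i += 4) {
        total += a[i];
        total += a[i + 1];
        total += a[i + 2];
        total += a[i + 3];
    }

    /* Remainder loop: handles the last 0 to 3 elements. */
    for (; i < n; i++)
        total += a[i];

    return total;
}
```

Both functions return the same sum. The unrolled one trades code size for fewer branches, which matches the manual's remark that the option "makes code larger, and may or may not make it run faster".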
That is, -O turns on -floop-optimize, which optionally does loop unrolling. On the other hand, -fprofile-use enables -funroll-loops. None of the -Ox flags turns on -fprofile-use, and none of them turns on -floop-optimize2 either. The manual appears to say that once -floop-optimize2 is turned on, loop unrolling is enabled by a separate flag, presumably -funroll-loops. That implies -floop-optimize2 would "disable" -floop-optimize, because -floop-optimize2 forces each loop optimization technique to be turned on individually. It follows that if I do this:
gcc -O -fprofile-use myprog.c or
gcc -O -floop-optimize2 myprog.c
then no loop optimization is performed, because any loop optimization that would otherwise be turned on by -O is turned off by -floop-optimize2. If I want to do loop unrolling using the so-called "new loop optimizer", and also benefit from the other optimizations (except loop optimization) offered by -O, then I need to do this:
gcc -O -floop-optimize2 -funroll-loops myprog.c
This would do:
1. optimization (except loop optimization) offered by -O
2. loop unrolling offered by the "new loop optimizer"
but would not do any other loop optimization.
Is my understanding correct?
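To test this, I suppose I could compile a small file like the one below with each of the command lines above, adding -S so gcc stops after generating assembly, and then compare the loops in the output. This is only a sketch. The function names and FIXED_COUNT are made up, and each loop is my guess at a case that matches one of the manual's descriptions:

```c
#include <stddef.h>

#define FIXED_COUNT 8   /* hypothetical small constant trip count */

/* Trip count known at compile time: a candidate for the complete
 * peeling/unrolling that the -funroll-loops description mentions. */
int sum_fixed(const int *a)
{
    int total = 0;
    int i;
    for (i = 0; i < FIXED_COUNT; i++)
        total += a[i];
    return total;
}

/* Trip count known only upon entry to the loop. */
int sum_runtime(const int *a, size_t n)
{
    int total = 0;
    size_t i;
    for (i = 0; i < n; i++)
        total += a[i];
    return total;
}

/* scale * offset does not change inside the loop: a candidate for
 * "move constant expressions out of loops". */
int sum_scaled(const int *a, size_t n, int scale, int offset)
{
    int total = 0;
    size_t i;
    for (i = 0; i < n; i++)
        total += a[i] + scale * offset;
    return total;
}
```

If the loop bodies in the assembly differ between the command lines, that would show which loop optimizations each combination of flags really performs.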