GCC 4.2 introduced “-march=native”. With this option, GCC automatically optimizes for the local computer.
I used to think GCC would simply replace it with an appropriate “-march=....” based on the results of CPUID, so, for example, it would be equivalent to “-march=core2” on a Core 2 CPU.
Actually, GCC does more than I guessed. We can observe what GCC actually replaces it with:
- Type
gcc -march=native -c -o /dev/null -x c -
in one terminal (GCC will sit waiting for source code on standard input, so the compiler process stays alive);
- Open another terminal and type
ps af | grep cc1
Then we will see how carefully GCC detects our CPU.
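The same information can probably be had from a single terminal, because the gcc driver prints the commands it runs when given -v. The sketch below is only an illustration: the helper name extract_target_flags is made up, and it assumes the “--param name=value” spelling (two separate words) that GCC 4.4 prints; other versions may format the parameters differently.

```shell
# Read "gcc -v" output on stdin and print the target-related options
# from the cc1 command line, one per line.
# extract_target_flags is a hypothetical helper name, not part of GCC.
extract_target_flags() {
    awk '/cc1/ {
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^-m/) {
                # -march=..., -mtune=..., -msse4.2 and friends
                print $i
            } else if ($i == "--param" && i < NF) {
                # "--param name=value" arrives as two words; print them together
                print $i " " $(i + 1)
                i++
            }
        }
    }'
}

# Usage (needs gcc installed):
# -E stops after preprocessing, -v echoes the cc1 command on stderr,
# and /dev/null stands in for an empty C source file.
# gcc -march=native -E -v -x c /dev/null 2>&1 | extract_target_flags
```

Since nothing waits on standard input here, there is no stuck process to kill afterwards.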
I tried this using GCC 4.4.1 on a computer equipped with a Core i3 CPU, and it turns out that “-march=native” becomes “-march=core2 -mcx16 -msahf -mpopcnt -msse4.2 --param l1-cache-size=32 --param l1-cache-line-size=64 --param l2-cache-size=256 -mtune=core2” on this computer. It says:
- We can use SSE4.2 instructions, SAHF/LAHF (in 64-bit mode), and POPCNT, in addition to all instructions available on a Core 2;
- The sizes of the L1 and L2 caches are 32 KiB and 256 KiB, respectively; each L1 cache line is 64 bytes;
- Use the instructions that are most suitable for Core 2 (sure, a Core i3 is different from a Core 2, but it is too new to have specific optimization support in compilers).
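One way to double-check the first point is to ask the preprocessor which feature macros it predefines under “-march=native”. This is a rough sketch, assuming GCC defines __SSE4_2__ and __POPCNT__ when the matching -m options are active; the helper name feature_macros is invented for illustration.

```shell
# Read a macro dump on stdin and keep only the SSE4.2 and POPCNT
# feature macros, sorted by name.
# feature_macros is a hypothetical helper name, not part of GCC.
feature_macros() {
    grep -E '^#define __(SSE4_2|POPCNT)__ ' | sort
}

# Usage (needs gcc installed):
# -dM -E prints the predefined #define lines instead of preprocessed source.
# gcc -march=native -dM -E -x c /dev/null | feature_macros
```

If a macro is missing from the output, the corresponding instructions are presumably not enabled for this build.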
So, the conclusion: always use “-march=native” when compiling code for local use.