When compiling code for ARM processors using the gcc toolchain, it is important to set the correct CPU target in order to generate optimized code. The CPU target tells gcc what ARM architecture and features to target, so it can take advantage of hardware capabilities while avoiding unsupported instructions. This article provides an overview of how to set CPU targets for ARM compilation with gcc and considerations when choosing a target.
ARM CPU Targets
The ARM architecture has evolved over many years, with new extensions added progressively to the instruction set. GCC distinguishes between an architecture, selected with -march, and a specific processor, selected with -mcpu. The main 32-bit architecture values accepted by -march include:
- armv7-a – For 32-bit Cortex-A series application processors
- armv7-r – For Cortex-R series real-time processors
- armv7-m – For the Cortex-M3 (Cortex-M4 and Cortex-M7 use armv7e-m; Cortex-M0 and Cortex-M0+ use armv6-m)
- armv6 – For ARM11 cores without Thumb-2 technology
- armv5te – For earlier ARM9/ARM10 cores
Within each architecture profile, -mcpu accepts the names of specific processors. For example, Cortex-A values include cortex-a5, cortex-a7, cortex-a8, cortex-a9, cortex-a15 and cortex-a17, among others. Naming the specific CPU lets gcc both select that processor’s instruction set and tune instruction scheduling for its pipeline.
Setting the CPU Target
The gcc and g++ compilers allow setting the ARM cpu target using the -mcpu option. For example: gcc -mcpu=cortex-a9 file.c -o output
This will compile file.c and optimize it for a Cortex-A9 cpu.
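The CPU target cannot be set from inside a source file with a #define. Instead, gcc predefines macros such as __ARM_ARCH from the -mcpu or -march option given on the command line, and source code can test them to select an implementation:
#if defined(__ARM_ARCH) && __ARM_ARCH >= 7
#define BUILT_FOR_ARMV7 1
#else
#define BUILT_FOR_ARMV7 0
#endif
These macros belong to the compiler and should not be overridden with -D. To change their values, change -mcpu or -march.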
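The macros that gcc predefines for the selected target can be read from C to report what a translation unit was built for. The following is a minimal sketch, not a definitive implementation: describe_target is a hypothetical helper name, and the code assumes a compiler that follows the ARM C Language Extensions naming (__ARM_ARCH, __ARM_FP) along with the long-standing __arm__ and __thumb__ macros. Nothing is set here; the values come from -mcpu, -march and related options.

```c
#include <stdio.h>

/* Hypothetical helper: write a one-line description of the build target.
   Only compiler-predefined macros are read; none are defined or changed.
   Returns the number of characters written, as snprintf does. */
int describe_target(char *buf, size_t len)
{
#if defined(__arm__) && defined(__ARM_ARCH)
#  if defined(__thumb__)
    const char *isa = "Thumb";          /* compiled with -mthumb */
#  else
    const char *isa = "ARM";            /* compiled for the 32-bit ARM instruction set */
#  endif
#  if defined(__ARM_FP)
    const char *fp = "hardware FP enabled";
#  else
    const char *fp = "hardware FP not reported";
#  endif
    return snprintf(buf, len, "ARMv%d, %s code, %s", __ARM_ARCH, isa, fp);
#else
    /* Built for something other than 32-bit ARM, such as a development host. */
    return snprintf(buf, len, "not a 32-bit ARM build");
#endif
}
```

Calling describe_target from main and printing the buffer at start-up is one way to record which target a given binary was compiled for.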
Considerations When Choosing a CPU Target
There are some important considerations when selecting which CPU target in gcc to use:
- Choose the target that matches the ARM processor being compiled for, if known. Using the explicit target enables all relevant optimizations for that CPU.
- If the exact CPU is unknown, choose the lowest common target that will work across the potential processors. For example, Cortex-A5 and Cortex-A9 both implement ARMv7-A, so cortex-a5 can be selected for broad compatibility, provided optional units such as the FPU or NEON are also chosen conservatively with -mfpu.
- Newer targets are not always better. For example, code tuned with -mcpu=cortex-a15 may run worse on a Cortex-A9 than code tuned for the Cortex-A9 itself, because instruction scheduling is chosen for a different pipeline.
- The target affects code size, but not in one fixed direction. Size depends mainly on the optimization level (for example -Os versus -O2) and on whether Thumb is used, so measure the result and ensure there is sufficient flash storage for it.
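- Code that uses instructions introduced in ARMv7, for example through inline assembly, must be built with an ARMv7 target such as armv7-a and will not run on a processor that only implements ARMv6. Select the minimum architecture version that the code actually needs and that every intended processor supports.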
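Code that depends on an instruction from a newer architecture version can guard it with the compiler’s predefined macros, so that a build for an older target falls back to something the target supports. The sketch below is illustrative only: full_barrier and publish are hypothetical names, the dmb instruction is used as an example of an instruction the A and R profiles gained in ARMv7, and the fallback relies on the GCC builtin __sync_synchronize.

```c
/* Full memory barrier chosen at compile time from the selected target. */
static inline void full_barrier(void)
{
#if defined(__arm__) && defined(__ARM_ARCH) && __ARM_ARCH >= 7
    /* ARMv7 or later: the dmb instruction is available. */
    __asm__ volatile ("dmb" ::: "memory");
#else
    /* Older ARM targets and non-ARM hosts: let the compiler pick a barrier. */
    __sync_synchronize();
#endif
}

/* Hypothetical helper: make earlier writes visible before setting a flag. */
void publish(volatile int *flag, int value)
{
    full_barrier();
    *flag = value;
}
```

With this pattern the same source builds for -march=armv6 and -march=armv7-a, and the assembler never sees dmb unless the selected target supports it.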
GCC ARM CPU Target Feature Options
In addition to the CPU target, gcc has options for targeting specific ARM architecture extensions. This allows further fine tuning of the compilation process. Some key options include:
- -mfpu=name – Specify ARM floating-point architecture version (vfpv2, vfpv3, vfpv3-d16, vfpv4, fpv5-d16, neon, neon-vfpv4, neon-fp16, etc)
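- -mfloat-abi=name – Select the floating-point ABI (soft, softfp or hard)
- -mthumb – Generate Thumb instructions (Thumb-2 where the target supports it) instead of 32-bit ARM instructions
- -mthumb-interwork – Generate code that supports calls between ARM and Thumb functions (mainly relevant to architectures before ARMv5T)
- -mtune=name – Tune instruction scheduling for CPU name without changing which instructions may be used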
For example, to compile Thumb code optimized for a Cortex-M4 processor with hardware floating point, where gcc stands for a cross compiler such as arm-none-eabi-gcc: gcc -mcpu=cortex-m4 -mthumb -mfpu=fpv4-sp-d16 -mfloat-abi=hard -c file.c -o file.o
Determining the Optimal CPU Target
To determine the optimal gcc CPU target for an ARM platform, follow these steps:
- Identify the ARM processor version – Search processor datasheets for architecture, CPU name, ARM core revisions etc.
- List processor features – Note key features like Thumb-2, FPU, DSP extensions, ARM architecture version.
- Map features to gcc targets – Use the gcc ARM options documentation to find the best match.
- Test compile with target – Try compiling a code sample with the target on the hardware or emulator.
- Measure performance and code size – Validate if the performance and code size meet requirements.
- Refine target if needed – Try tweaking target options or picking a higher or lower target if issues are seen.
With an iterative approach and performance measurement, an optimal balance of code size and efficiency can be found.
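The step of mapping processor features to gcc options can be sketched as a small helper that turns a feature description into a flag string. Everything named here is hypothetical (struct arm_target, arm_target_flags), and the choice of -mfloat-abi=hard whenever an FPU is present is an assumption that only holds if the libraries being linked were built for the hard-float ABI.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical description of a target, filled in from the datasheet. */
struct arm_target {
    const char *cpu;   /* value for -mcpu, e.g. "cortex-m4" */
    const char *fpu;   /* value for -mfpu, or NULL when there is no FPU */
    int thumb;         /* nonzero to request -mthumb */
};

/* Compose the gcc flags for a target.
   Returns the string length, or -1 if buf is too small. */
int arm_target_flags(const struct arm_target *t, char *buf, size_t len)
{
    int n = snprintf(buf, len, "-mcpu=%s%s", t->cpu, t->thumb ? " -mthumb" : "");
    if (n < 0 || (size_t)n >= len)
        return -1;

    if (t->fpu != NULL) {
        /* Assumption: hard-float ABI is wanted whenever an FPU exists. */
        size_t room = len - (size_t)n;
        int m = snprintf(buf + n, room, " -mfpu=%s -mfloat-abi=hard", t->fpu);
        if (m < 0 || (size_t)m >= room)
            return -1;
        n += m;
    }
    return n;
}
```

For a Cortex-M4 described as { "cortex-m4", "fpv4-sp-d16", 1 } this produces the same options as the Cortex-M4 command line shown earlier. Keeping the description in one place makes it easier to try a different target during the refine step.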
Examples of CPU Target Selection
Here are some examples of how to select a good default gcc CPU target for common ARM processors:
- ARM Cortex-M3 – A common microcontroller core with Thumb-2 and no FPU. A good default is likely -mcpu=cortex-m3.
- ARM11 MPCore – ARM11 is an older design without Thumb-2. -mcpu=mpcore is a good choice, or -mcpu=mpcorenovfp when the VFP unit is absent.
- Cortex-A8 – A common application processor. -mcpu=cortex-a8 will enable the necessary optimizations.
- Cortex-R4 – For the real-time Cortex-R series, try -mcpu=cortex-r4.
For SoCs with custom ARM CPU cores, determine the ARM architecture version and features first, then pick the closest generic target.
Checking GCC ARM CPU Target Selection
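To verify that the GCC ARM compiler is selecting the expected target architecture and tuning options, add -Q --help=target to the usual options. This prints the target-specific options in effect instead of compiling. Adding -v to a normal compile shows the full command line passed to the compiler proper. Either output can be checked against the intended target.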
In addition, cross-reference the assembly output to confirm instructions or features targeted are being generated by the compiler. For example, Thumb-2 specific instructions in the output would indicate Thumb mode is active.
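A complementary check is to let the build fail when the compiler flags do not match what the code expects. The sketch below assumes hypothetical project requirements (EXPECT_MIN_ARM_ARCH and EXPECT_THUMB are made-up names and values) and uses the predefined __arm__, __ARM_ARCH and __thumb__ macros. On a non-ARM host the checks are skipped.

```c
/* Hypothetical expectations for this project; adjust to the intended target. */
#define EXPECT_MIN_ARM_ARCH 7
#define EXPECT_THUMB        1

#if defined(__arm__)
#  if !defined(__ARM_ARCH) || (__ARM_ARCH < EXPECT_MIN_ARM_ARCH)
#    error "compiler flags select an older ARM architecture than expected"
#  endif
#  if EXPECT_THUMB && !defined(__thumb__)
#    error "expected Thumb code generation (-mthumb)"
#  endif
#endif

/* Returns 1 when the checks above were active, 0 on a non-ARM build. */
int target_checks_active(void)
{
#if defined(__arm__)
    return 1;
#else
    return 0;
#endif
}
```

Placing such a block in a header included by every source file turns a mistyped -mcpu or a missing -mthumb into a compile error rather than a surprise on the hardware.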
Troubleshooting Issues with ARM CPU Targets
Some common issues that may arise with ARM CPU target selection in gcc and troubleshooting tips include:
- Code size too large – Pick a lower target, ensure unused code is eliminated, or use thumb mode.
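- Unsupported instruction errors – The assembler reports that the selected processor does not support an instruction, typically one written in inline assembly. If the hardware does support it, raise the target or enable the missing feature; otherwise remove the instruction.
- Illegal instruction crashes – The compiler targeted an instruction that the CPU lacks. Select a lower target and ensure it matches the chip.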
- Slow performance – Try a higher target, enable floating point or other missing features.
- Features not enabled – Verify target includes all expected capabilities.
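GCC cannot know which processor the binary will eventually run on, so no compiler warning catches a target that exceeds the hardware. Disassembling the binary with objdump -d can help locate the instruction that the CPU rejects.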
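On Linux, a program can also ask the kernel at run time whether an optional feature is present, which helps confirm a suspected mismatch between the compiled target and the chip. The following sketch assumes 32-bit ARM Linux with a C library that provides getauxval; cpu_has_neon is a hypothetical helper name, and on any other platform it simply reports that the check is unavailable.

```c
#if defined(__linux__) && defined(__arm__)
#include <sys/auxv.h>   /* getauxval, AT_HWCAP */
#include <asm/hwcap.h>  /* HWCAP_NEON */
#endif

/* Returns 1 if the running CPU reports NEON, 0 if it does not,
   and -1 when the check is unavailable (not 32-bit ARM Linux). */
int cpu_has_neon(void)
{
#if defined(__linux__) && defined(__arm__) && defined(HWCAP_NEON)
    return (getauxval(AT_HWCAP) & HWCAP_NEON) ? 1 : 0;
#else
    return -1;
#endif
}
```

If this returns 0 on the target board while the binary was built with -mfpu=neon, the illegal instruction crash is explained by the -mfpu choice rather than by the code itself.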
Default ARM Targets for Common Compilation Scenarios
For common ARM compilation scenarios, here are some typical default CPU target options to consider as a starting point:
- Bare-metal ARM – Start with -mcpu=arm7tdmi and raise the target to suit.
- Linux ARM user-space – -mcpu=generic-armv7-a is a good baseline; if the exact processor is known, compile for that target.
- Linux ARM kernel – Start with -mcpu=arm926ej-s or -mcpu=arm1136j-s.
- RTOS on Cortex-M – Use -mcpu=cortex-m3 or the matching value for the Cortex-M core in use.
- Bare-metal on Cortex-R – Try -mcpu=cortex-r4 and add or remove features as needed.
The specific CPU model or ARM core version should always be used for the target if known.
Conclusion
Selecting the optimal CPU target for ARM code generation with gcc involves finding the best match for the target hardware capabilities. Newer targets are not necessarily better if unused features bloat the code size or slow performance. Validate the compiler target selection by inspecting compilation artifacts and measuring efficiency. Tuning gcc’s extensive options for the ARM architecture can help produce efficient code tailored to the target platform.