© S-O-C.ORG, All Rights Reserved.
Forcing GCC to generate only Thumb-16 instructions for Cortex-M

Ryan Ryan
Last updated: September 17, 2023 8:23 am

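Arm Cortex-M processors execute only Thumb code; they have no ARM (A32) state. The ARMv6-M cores (Cortex-M0, M0+ and M1) implement the 16-bit Thumb instruction set plus a handful of 32-bit instructions (BL, MSR, MRS, DSB, DMB and ISB), while the ARMv7-M cores (Cortex-M3, M4 and M7) implement the full Thumb-2 set, which mixes 16-bit and 32-bit encodings. When compiling for an ARMv7-M core, GCC freely mixes both widths. However, it is possible to restrict GCC to the 16-bit subset, referred to here as Thumb-16, by choosing the right target flags.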


Why limit code to Thumb-16 on Cortex-M?

There are a few reasons why you may want to restrict the generated code to only use the 16-bit Thumb instruction set when targeting Cortex-M processors:

  • One binary for every core – Thumb-16 code runs on all Cortex-M processors, from M0 to M7. Do not expect a size saving, though: Thumb-2 code is already mostly 16-bit, and restricting the compiler usually leaves code size similar or slightly larger.
  • Avoid unexpected faults – Cortex-M0, M0+ and M1 do not implement most 32-bit Thumb-2 instructions and take a HardFault if they encounter one.
  • Workaround errata – On some Cortex-M implementations, certain instructions may need to be avoided due to silicon bugs.
  • Ease debugging – Single-stepping through uniform 16-bit code can be simpler than stepping through mixed-width code.

For embedded applications that must run on ARMv6-M cores, or across several Cortex-M families, using only Thumb-16 instructions can be advantageous despite the loss of some optimization opportunities compared to Thumb-2.

GCC options to force Thumb-16 code generation

The instruction encodings GCC may use are determined by the target architecture, selected with -mcpu or -march, not by the interworking options. To restrict output to Thumb-16, name an ARMv6-M target alongside -mthumb:

    gcc -mcpu=cortex-m0 -mthumb [other options] files
    gcc -march=armv6-m -mthumb [other options] files

The two option pairs below are often suggested for this purpose, but neither changes instruction width.

-mthumb -mno-thumb-interwork

Passing -mthumb tells GCC to generate Thumb code instead of ARM code. On Cortex-M targets it is effectively mandatory, because these cores cannot execute ARM code. Adding -mno-thumb-interwork only disables support for calls between ARM and Thumb code. It is already the default and does not stop GCC from using 32-bit Thumb-2 instructions.

    gcc -mthumb -mno-thumb-interwork [other options] files

-mthumb -mthumb-interwork

-mthumb-interwork enables support for calls between ARM and Thumb code. Cortex-M has no ARM state to interwork with, and in AAPCS (EABI) configurations GCC documents the option as meaningless, so it has no effect on whether 16-bit or 32-bit encodings are emitted.

    gcc -mthumb -mthumb-interwork [other options] files
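Because the set of permitted encodings follows from the selected target core, a build script can guard against accidentally choosing a Thumb-2 core. The following is a minimal sketch under stated assumptions: the function name, the small core table and the arm-none-eabi-gcc toolchain name are illustrative choices, not something the article prescribes. Only -mcpu, -mthumb, -Os and -c are actual GCC options.

```python
# Cores that implement ARMv6-M: 16-bit Thumb plus BL and a few system instructions.
# (Assumed table for illustration; extend it to match your toolchain.)
ARMV6M_CORES = {"cortex-m0", "cortex-m0plus", "cortex-m1"}


def thumb16_gcc_command(sources, cpu="cortex-m0", gcc="arm-none-eabi-gcc"):
    """Return an argv list that compiles `sources` for an ARMv6-M core.

    Raises ValueError for any core outside ARMV6M_CORES, because GCC is
    free to emit 32-bit Thumb-2 encodings for ARMv7-M targets.
    """
    if cpu not in ARMV6M_CORES:
        raise ValueError(f"{cpu} is not an ARMv6-M core; GCC may emit Thumb-2 for it")
    return [gcc, f"-mcpu={cpu}", "-mthumb", "-Os", "-c", *sources]
```

Calling thumb16_gcc_command(["main.c"]) yields the argument list for arm-none-eabi-gcc -mcpu=cortex-m0 -mthumb -Os -c main.c, which could then be handed to subprocess.run.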

Verifying Thumb-16 code generation

It’s important to verify that GCC obeyed the request to restrict code generation to Thumb-16 after compilation. Here are some ways to confirm only 16-bit Thumb instructions were emitted:

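  • Check assembly listing – Compile with -S, confirm that the .cpu or .arch directive names an ARMv6-M target, and look for Thumb-2-only mnemonics such as movw, movt, tbb or sdiv.
  • Disassemble object file – Run objdump -d and examine the encodings. A 32-bit instruction shows up as two 16-bit halfwords; only BL and the system instructions (MSR, MRS, DSB, DMB, ISB) should be that wide.
  • Run on the target – On a Cortex-M0 or M0+, an unsupported Thumb-2 instruction raises a HardFault. A Cortex-M3 or later executes it without complaint, so a clean run there (a toggling LED, say) proves nothing.
  • Examine map file – Check that the linker pulled its libraries from an ARMv6-M build rather than a Thumb-2 one. Section sizes alone are not a reliable indicator, because Thumb-2 code is already mostly 16-bit.

Failing to verify proper Thumb-16 code generation could result in HardFaults at runtime on ARMv6-M devices such as the Cortex-M0.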
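The disassembly check can be automated. The sketch below scans the text produced by objdump -d and reports every instruction printed as two halfwords. It assumes the usual GNU objdump line layout (address, colon, tab, hex halfwords, tab, mnemonic); the function name and the default allow-list are hypothetical and would need adjusting for a real project, for example to permit msr, mrs, dsb, dmb and isb.

```python
import re

# One disassembly line, e.g. "   2:\tf000 f800 \tbl\t0 <bar>".
# Group 2 is the first halfword, group 3 the optional second halfword,
# group 4 the mnemonic. Literal-pool lines (.word, eight hex digits) do not match.
_INSN = re.compile(r"^\s*([0-9a-f]+):\t([0-9a-f]{4})(?: ([0-9a-f]{4}))?\s*\t(\S+)")


def wide_instructions(disassembly, allowed=("bl",)):
    """Return (address, mnemonic) for each 32-bit instruction not in `allowed`."""
    hits = []
    for line in disassembly.splitlines():
        m = _INSN.match(line)
        if m and m.group(3) is not None and m.group(4) not in allowed:
            hits.append((int(m.group(1), 16), m.group(4)))
    return hits
```

An empty result for every object file and library in the link is reasonable evidence that no unexpected 32-bit encodings slipped in. A non-empty result points at the address and mnemonic to investigate.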

GCC optimizations with Thumb-16

Limiting GCC to Thumb-16 often increases code size and reduces performance versus allowing Thumb-2 instructions. However, GCC still applies some optimizations when compiling for Thumb-16:

  • Constant propagation – Replace variables with constant values when known.
  • Common subexpression elimination – Compute a repeated expression once and reuse the result.
  • Code hoisting – Move loop invariants outside of loops.
  • Branch optimizations – Simplify or remove branches, for example by threading jumps.
  • Peephole optimizations – Replace short instruction sequences with cheaper equivalents.

For size-constrained targets, -Os is the usual starting point. Higher levels of optimization like -O2 or -O3 can achieve additional speed gains but increase compile time, and -O3 in particular tends to increase code size. Benchmarking is needed to determine if the extra optimization provides worthwhile gains.
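One simple way to benchmark the size side of this trade-off is to build the same sources at several optimization levels and compare section sizes. The helper below is a rough sketch that parses the default (Berkeley-format) output of the binutils size tool; the function names are made up for illustration, and it assumes one object or ELF file per optimization level.

```python
def parse_size_output(text):
    """Parse Berkeley-format `size` output into {filename: (text, data, bss)}."""
    sizes = {}
    for line in text.splitlines()[1:]:  # skip the "text data bss dec hex filename" header
        fields = line.split()
        if len(fields) >= 6:
            text_sz, data_sz, bss_sz = (int(f) for f in fields[:3])
            sizes[fields[5]] = (text_sz, data_sz, bss_sz)
    return sizes


def smallest_text(sizes):
    """Return the filename whose .text section is smallest."""
    return min(sizes, key=lambda name: sizes[name][0])
```

Feeding it the captured output of a command such as arm-none-eabi-size on one build per optimization level gives a quick ranking by code size. Speed still has to be measured on the target.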

When to avoid forcing Thumb-16 code

While limiting compilation to Thumb-16 can be beneficial for Cortex-M, there are also cases where it may not be ideal:

  • Code size not critical – Thumb-2 provides better performance without major size impact.
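  • Speed is very important – Thumb-2 adds hardware divide, bitfield operations, wide immediates and conditional execution through IT blocks.
  • Cortex-M3/M4/M7 models – These support Thumb-2, so there is no need to restrict instructions unless the same binary must also run on an M0.
  • Lots of floating point code – ARMv6-M cores have no FPU, so Thumb-16 builds fall back to software floating point. The FPU instructions on Cortex-M4F and M7 are themselves 32-bit encodings.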
  • Just prototyping – Extra optimization effort not warranted.

In performance sensitive applications where code size is not a major constraint, allowing Thumb-2 instructions can provide a noticeable speed boost.

Thumb-16 instruction issues to watch out for

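When limiting compilation to Thumb-16 instructions, there are some code patterns and toolchain behaviors to keep in mind:

  • Prebuilt libraries – libc, libgcc and vendor libraries built for ARMv7-M contain Thumb-2 instructions. Link against variants built for ARMv6-M.
  • Inline and standalone assembly – The compiler does not rewrite hand-written assembly. With -mcpu set to an ARMv6-M core the assembler rejects unsupported instructions, which then have to be replaced by hand.
  • Switch statements – The Thumb-2 table-branch instructions TBB and TBH are unavailable, so GCC dispatches through compare chains or libgcc helper routines instead.
  • Division and floating point – ARMv6-M has no divide instruction or FPU, so these operations become calls to runtime helpers such as __aeabi_idiv.
  • Large constants and long branches – Without MOVW/MOVT, wide constants are loaded from literal pools, and 16-bit branches have a short range, so far targets cost extra instructions.
  • Tail call optimization – GCC has historically not performed sibling-call optimization for Thumb-1 targets, so do not rely on it.
  • Unavoidable 32-bit instructions – BL and the system instructions MSR, MRS, DSB, DMB and ISB are 32 bits wide even on ARMv6-M.

Identifying and working around these issues ensures the resulting image contains only instructions that are valid on the target Cortex-M core.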
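When hunting for stray wide instructions it helps to know which 32-bit encodings are legitimate on an ARMv6-M core at all. The sketch below classifies a pair of Thumb halfwords using the encoding rule that a halfword whose top five bits are 0b11101, 0b11110 or 0b11111 starts a 32-bit instruction. The function name and return labels are assumptions for illustration, and the list is not exhaustive (it ignores the permanently undefined UDF.W encoding, for instance).

```python
def classify_wide(hw1, hw2):
    """Name the 32-bit Thumb instruction formed by halfwords hw1 and hw2.

    Returns "bl", "msr", "mrs" or "barrier" for encodings that ARMv6-M
    implements, "thumb2" for any other 32-bit encoding, and None when hw1
    is an ordinary 16-bit instruction.
    """
    if (hw1 >> 11) not in (0b11101, 0b11110, 0b11111):
        return None  # 16-bit encoding; hw2 belongs to the next instruction
    if (hw1 & 0xF800) == 0xF000 and (hw2 & 0xD000) == 0xD000:
        return "bl"
    if (hw1 & 0xFFF0) == 0xF380 and (hw2 & 0xFF00) == 0x8800:
        return "msr"
    if hw1 == 0xF3EF and (hw2 & 0xF000) == 0x8000:
        return "mrs"
    if hw1 == 0xF3BF and (hw2 & 0xFFF0) in (0x8F40, 0x8F50, 0x8F60):
        return "barrier"  # DSB, DMB or ISB
    return "thumb2"
```

For example, the halfword pair 0xF000, 0xF800 (a BL) is classified as "bl", while 0xF04F, 0x0001 (mov.w r0, #1) comes back as "thumb2" and would fault on a Cortex-M0.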

Special considerations for C++ code

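C++ compiles to Thumb-16 the same way as plain C; no language feature requires Thumb-2 instructions. The extra effort is mostly about code size and libraries:

  • Use -fno-rtti and -fno-exceptions to drop RTTI and exception-handling support when the application does not need them. This saves space but is unrelated to instruction width.
  • Virtual methods and multiple inheritance work on ARMv6-M; they cost vtable space and indirect calls.
  • Be careful with templates and inline functions, which can duplicate code. This matters more when each operation takes more instructions.
  • Link against C++ runtime libraries built for ARMv6-M, since a Thumb-2 build of the runtime would fault on a Cortex-M0.

With the right target flags and runtime libraries, it is possible to generate relatively efficient Thumb-16 code from C++ for Cortex-M.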

Conclusion

Forcing GCC to emit only 16-bit Thumb instructions requires -mthumb together with an ARMv6-M target such as -mcpu=cortex-m0 or -march=armv6-m; the -mthumb-interwork and -mno-thumb-interwork options do not affect instruction width. The resulting code runs on every Cortex-M core, at the cost of reduced optimization opportunities compared to Thumb-2. With appropriate benchmarking and testing, Thumb-16 code can deliver a workable balance of size and performance for resource-constrained Cortex-M applications.
