Forcing GCC to generate only Thumb-16 instructions for Cortex-M

Ryan Ryan
Last updated: September 17, 2023 8:23 am

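Cortex-M processors from ARM execute only Thumb code; they have no 32-bit ARM state. Which Thumb instructions are available depends on the core: Cortex-M0, M0+ and M1 implement ARMv6-M, which is almost entirely 16-bit Thumb (called Thumb-16 here), while Cortex-M3, M4 and M7 implement ARMv7-M with the full Thumb-2 set. When compiling for a Thumb-2 core, GCC generates a mix of 32-bit Thumb-2 and 16-bit Thumb instructions by default. However, it is possible to restrict GCC to 16-bit Thumb instructions, apart from a handful of unavoidable 32-bit encodings such as BL, by choosing the right compiler flags.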

Contents
  • Why limit code to Thumb-16 on Cortex-M?
  • GCC options to force Thumb-16 code generation
  • -mthumb -mno-thumb-interwork
  • -mthumb -mthumb-interwork
  • Verifying Thumb-16 code generation
  • GCC optimizations with Thumb-16
  • When to avoid forcing Thumb-16 code
  • Thumb-16 instruction issues to watch out for
  • Special considerations for C++ code
  • Conclusion

Why limit code to Thumb-16 on Cortex-M?

There are a few reasons why you may want to restrict the generated code to only use the 16-bit Thumb instruction set when targeting Cortex-M processors:

  • Code size – Each Thumb-16 instruction occupies two bytes instead of four. Whether the total shrinks depends on the code, because an operation that fits in one Thumb-2 instruction can take several Thumb-16 instructions.
  • Avoid unexpected faults – Cortex-M0, M0+ and M1 (ARMv6-M) do not implement most Thumb-2 instructions and raise a HardFault when they encounter one.
  • Workaround errata – On some Cortex-M implementations, certain Thumb-2 instructions need to be avoided due to silicon bugs; the vendor's errata sheet lists them.
  • Ease debugging – With uniform two-byte instructions, addresses in a disassembly advance evenly, which makes single-stepping and breakpoint placement easier to follow.

For embedded applications where code size and reliability are critical, using only Thumb-16 instructions can be advantageous despite the loss of some optimization opportunities compared to Thumb-2.

GCC options to force Thumb-16 code generation

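GCC has no single switch that bans 32-bit Thumb encodings. The instructions it may use are determined by the target architecture, so the practical way to get Thumb-16 code is to compile with -mthumb for an ARMv6-M target, using -mcpu=cortex-m0 (or -mcpu=cortex-m0plus) or -march=armv6-m. Because ARMv6-M is a subset of ARMv7-M, the resulting code also runs on Cortex-M3 and later. For bare-metal targets the compiler is typically invoked as arm-none-eabi-gcc. Two interworking options are often combined with -mthumb, but neither changes instruction width:

-mthumb -mno-thumb-interwork

Passing -mthumb tells GCC to generate Thumb code instead of ARM code. Adding -mno-thumb-interwork disables support for calls between ARM and Thumb code. It is already the default, and it does not prevent GCC from using 32-bit Thumb-2 instructions; the architecture flag does that:

gcc -mthumb -mno-thumb-interwork -march=armv6-m [other options] files

-mthumb -mthumb-interwork

Using -mthumb along with -mthumb-interwork enables support for calls between ARM and Thumb code. Cortex-M has no ARM state, and the GCC manual describes the option as meaningless in AAPCS configurations such as arm-none-eabi, so it has no effect on instruction selection either. The architecture flag is still what restricts the output:

gcc -mthumb -mthumb-interwork -march=armv6-m [other options] files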
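
As a rough sketch of how target selection could be scripted in a build helper, the Python snippet below maps a core name to GCC flags that keep the output within the ARMv6-M instruction set. The function names, the arm-none-eabi-gcc default and the -Os -c options are illustrative assumptions, not part of any existing tool; the -mcpu and -march values are standard GCC ARM options.

```python
# Sketch: choose GCC target flags that keep output within ARMv6-M,
# which is almost entirely 16-bit Thumb. Helper names are hypothetical.

# Architecture implemented by each core.
CORE_ARCH = {
    "cortex-m0": "armv6-m",
    "cortex-m0plus": "armv6-m",
    "cortex-m1": "armv6-m",
    "cortex-m3": "armv7-m",
    "cortex-m4": "armv7e-m",
    "cortex-m7": "armv7e-m",
}


def thumb16_flags(core):
    """Return GCC flags limiting code generation to ARMv6-M for `core`."""
    if core not in CORE_ARCH:
        raise ValueError(f"unknown core: {core}")
    if CORE_ARCH[core] == "armv6-m":
        # Native ARMv6-M core: -mcpu already implies the restricted set.
        return ["-mthumb", f"-mcpu={core}"]
    # Thumb-2 core: compile for the ARMv6-M subset, which it can also run.
    return ["-mthumb", "-march=armv6-m"]


def gcc_command(core, sources, compiler="arm-none-eabi-gcc"):
    """Build an argument list suitable for subprocess.run()."""
    return [compiler, *thumb16_flags(core), "-Os", "-c", *sources]


if __name__ == "__main__":
    print(" ".join(gcc_command("cortex-m4", ["main.c"])))
```

Keeping the flags in one place makes it harder for a stray -mcpu=cortex-m4 elsewhere in the build to reintroduce Thumb-2 instructions.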

Verifying Thumb-16 code generation

It’s important to verify that GCC obeyed the request to restrict code generation to Thumb-16 after compilation. Here are some ways to confirm only 16-bit Thumb instructions were emitted:

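  • Check assembly listing – Compile with -S and look for instructions that exist only as 32-bit Thumb-2 encodings, such as movw, movt or tbb.
  • Disassemble object file – Use objdump -d or similar to examine instruction encodings. A 16-bit instruction is shown as one four-digit halfword, a 32-bit instruction as two.
  • Toggle LED on run – On a Cortex-M0/M0+, an unsupported Thumb-2 instruction raises a HardFault, so a simple LED toggle confirms the code executes. On Cortex-M3 and later, Thumb-2 runs normally, so this test proves nothing there.
  • Examine map file – Compare the size of the code sections against a Thumb-2 build. Do not expect them to halve; Thumb-16 code is often somewhat larger.

Failing to verify proper Thumb-16 code generation could result in unexpected crashes or faults at runtime on Cortex-M devices that lack Thumb-2 support.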
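
The disassembly check can be automated. The sketch below scans text in the shape that arm-none-eabi-objdump -d usually prints and reports 32-bit instructions other than the few that ARMv6-M itself defines (bl, dmb, dsb, isb, mrs, msr). The exact line layout in the regular expression is an assumption about objdump's output, and the function names are hypothetical; the rule that a first halfword whose top five bits are 11101, 11110 or 11111 starts a 32-bit instruction comes from the Thumb encoding itself.

```python
import re

# 32-bit encodings that ARMv6-M defines, so they are expected even in a
# build restricted to the Cortex-M0 instruction set.
ARMV6M_WIDE = {"bl", "dmb", "dsb", "isb", "mrs", "msr"}

# Assumed objdump line shape: address, one or two 4-digit halfwords,
# then the mnemonic, e.g. "   6:\tf240 0000 \tmovw\tr0, #0".
INSN = re.compile(
    r"^\s*([0-9a-f]+):\s+([0-9a-f]{4})(?:\s+([0-9a-f]{4}))?\s+(\S+)"
)


def is_wide_prefix(halfword):
    """True if this first halfword starts a 32-bit Thumb instruction."""
    return (halfword >> 11) in (0b11101, 0b11110, 0b11111)


def unexpected_wide(listing):
    """Return (address, mnemonic) for each 32-bit instruction not in ARMv6-M."""
    hits = []
    for line in listing.splitlines():
        match = INSN.match(line)
        if not match:
            continue  # labels, section headers, .word data, blank lines
        addr, first, second, mnemonic = match.groups()
        if second is None or not is_wide_prefix(int(first, 16)):
            continue  # 16-bit instruction
        base = mnemonic.split(".")[0]  # drop a width suffix such as ".w"
        if base not in ARMV6M_WIDE:
            hits.append((int(addr, 16), mnemonic))
    return hits


if __name__ == "__main__":
    import sys
    for address, name in unexpected_wide(sys.stdin.read()):
        print(f"{address:#x}: {name}")
```

Piping the output of objdump -d into the script would then print one line per offending instruction. Literal pools that objdump decodes as instructions can produce false positives, so any hit is worth checking by hand.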

GCC optimizations with Thumb-16

Limiting GCC to Thumb-16 often increases code size and reduces performance versus allowing Thumb-2 instructions. However, GCC still applies some optimizations when compiling for Thumb-16:

  • Constant propagation – Replace variables with constant values when known.
  • Common subexpression elimination – Compute a repeated expression once and reuse the result.
  • Code hoisting – Move loop-invariant computations outside of loops.
  • Branch optimizations – Simplify control flow, for example by removing jumps to jumps and unreachable code.
  • Peephole optimizations – Replace short instruction sequences with cheaper equivalents.

Higher levels of optimization like -O2 or -O3 can achieve additional gains but increase compile time and, at -O3, often code size as well; -Os optimizes for size instead. Benchmarking is needed to determine if the extra optimization provides worthwhile gains.

When to avoid forcing Thumb-16 code

While limiting compilation to Thumb-16 can be beneficial for Cortex-M, there are also cases where it may not be ideal:

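  • Code size not critical – Thumb-2 provides better performance without major size impact.
  • Speed is very important – Thumb-2 offers instructions with no 16-bit equivalent, such as hardware divide, bit-field operations and wide immediates.
  • Cortex-M3/M4/M7 models – These support Thumb-2, so there is no need to restrict instructions.
  • Lots of floating point code – FPU instructions are 32-bit encodings available only on cores with an FPU, such as Cortex-M4F and Cortex-M7; Thumb-16 builds fall back to software floating point.
  • Just prototyping – Extra optimization effort not warranted.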

In performance sensitive applications where code size is not a major constraint, allowing Thumb-2 instructions can provide a noticeable speed boost.

Thumb-16 instruction issues to watch out for

When limiting compilation to Thumb-16 instructions, there are some code patterns and GCC behavior to keep in mind:

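  • Larger switch statements – On Thumb-2 targets GCC dispatches jump tables with the 32-bit TBB/TBH instructions. For ARMv6-M it calls small libgcc helpers such as __gnu_thumb1_case_uqi instead, so libgcc must be built for the same architecture.
  • Structure returns – Under the AAPCS, structs larger than four bytes are returned through a hidden pointer. The copy may call memcpy, which must also come from an ARMv6-M library build.
  • Larger loops – A 16-bit conditional branch reaches only about ±256 bytes. For longer loop bodies GCC inverts the condition and pairs it with an unconditional branch, which costs extra instructions.
  • Global register variables – Most 16-bit instructions can address only r0–r7, so pinning a value in a high register forces extra moves.
  • Tail call optimization – GCC has historically not performed sibling-call optimization for Thumb-16 targets, so deep call chains may use more stack than in a Thumb-2 build.

Identifying and working around these issues helps keep the resulting code compact and limited to instructions the target Cortex-M can execute.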
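
The loop-size concern comes down to branch reach. The sketch below encodes the two 16-bit Thumb branch forms and rejects offsets that do not fit, which is the point where a compiler has to fall back to a longer sequence. The function names are hypothetical; the bit layouts follow the B encodings T1 and T2 in the ARMv6-M architecture manual, with the offset measured from the branch address plus 4.

```python
def encode_b_cond16(cond, offset):
    """Encode a 16-bit conditional branch (encoding T1).

    `cond` is the condition code (0 = EQ, 1 = NE, ...).
    `offset` is target address minus (branch address + 4), in bytes.
    """
    if not 0 <= cond <= 0xD:
        # 0xE and 0xF in this slot encode UDF and SVC, not branches.
        raise ValueError("cond must be in 0..13")
    if offset % 2 or not -256 <= offset <= 254:
        raise ValueError("offset does not fit a 16-bit conditional branch")
    return 0xD000 | (cond << 8) | ((offset >> 1) & 0xFF)


def encode_b16(offset):
    """Encode a 16-bit unconditional branch (encoding T2)."""
    if offset % 2 or not -2048 <= offset <= 2046:
        raise ValueError("offset does not fit a 16-bit unconditional branch")
    return 0xE000 | ((offset >> 1) & 0x7FF)


if __name__ == "__main__":
    # A branch to itself has offset -4 because the PC reads as address + 4.
    print(hex(encode_b16(-4)))          # the familiar "b ." encoding
    print(hex(encode_b_cond16(0, -4)))  # "beq ."
```

A loop whose backward branch needs an offset below -256 no longer fits the conditional form, which is why long loop bodies cost an extra instruction in Thumb-16 code.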

Special considerations for C++ code

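C++ features do not themselves select an instruction encoding; GCC emits whatever the target architecture allows. C++ code still requires some extra care compared to plain C, mainly because of code size and the runtime libraries it pulls in.

  • Use -fno-rtti and -fno-exceptions to disable C++ RTTI and exceptions when they are not needed; this removes type information and unwind tables.
  • Use multiple inheritance and virtual methods sparingly; vtables and indirect calls work in Thumb-16 but cost flash and cycles.
  • Be careful with templates and inline functions, which can duplicate code.
  • If exceptions or RTTI are used, link a C++ runtime library built for the same architecture; a library built for ARMv7-M contains Thumb-2 instructions.

With care taken over library selection and code size, it is possible to generate relatively efficient Thumb-16 code from C++ for Cortex-M.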

Conclusion

Restricting GCC to 16-bit Thumb instructions requires -mthumb together with an ARMv6-M target such as -mcpu=cortex-m0 or -march=armv6-m; -mno-thumb-interwork and -mthumb-interwork do not affect instruction width. This approach produces code that runs on Cortex-M0 through Cortex-M7, at the cost of reduced optimization opportunities compared to Thumb-2. With appropriate benchmarking and testing, Thumb-16 code can deliver a workable balance of size and performance for resource constrained Cortex-M applications.
