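The Cortex-M3 executes only Thumb instructions, so ARM/Thumb interworking issues arise there in a specific form: any branch that would switch the core to ARM state faults. This matters when porting code from ARM cores that support both instruction sets, or when building branch targets by hand. Careful coding practices avoid these problems and keep your code running efficiently.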
ARM and Thumb Instruction Sets
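The ARM architecture defines two main instruction sets: the 32-bit ARM set and the Thumb set, whose original 16-bit encodings give higher code density. The Cortex-M3 implements the ARMv7-M architecture and supports only Thumb, including the Thumb-2 extensions that mix 16-bit and 32-bit encodings to recover most of the expressive power of the ARM set. It cannot execute ARM instructions, so the processor must remain in Thumb state at all times.
On cores that support both sets, interworking is done with BX and BLX: the processor examines bit 0 of the target address, switching to Thumb state if it is set and to ARM state if it is clear. The Cortex-M3 keeps the same convention. Bit 0 of the target must be set; a BX or BLX to an address with bit 0 clear attempts to enter ARM state, and the processor raises a UsageFault (INVSTATE) when it tries to execute the next instruction.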
Interworking Branches
The Cortex-M3 provides two interworking branch instructions, both taking the target from a register: BX branches to the address in the register, and BLX does the same while also saving the return address in LR. The immediate form of BLX found on other ARM cores, which always changes instruction set, is not available on the Cortex-M3.
For example:
        BX  R1      ; R1 contains target address
        BLX R2      ; R2 contains target address
The processor handles these branches by copying bit 0 of the register value into the Thumb state bit (the T bit in EPSR) and branching to the address with bit 0 cleared. If bit 0 is set, execution continues in Thumb state. If bit 0 is clear, the T bit is cleared and the attempt to execute the next instruction raises a UsageFault.
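The bit 0 rule can be sketched as a small host-side model in C. This is an illustration only, not processor or vendor code: the `bx_result` type and the `model_bx` function are hypothetical names introduced here, and the model assumes a Thumb-only core such as the Cortex-M3, where a clear bit 0 leads to a fault.

```c
#include <stdbool.h>
#include <stdint.h>

/* Outcome of BX/BLX <Rm> in this model (hypothetical type, not a real API). */
typedef struct {
    uint32_t fetch_address; /* address fetched from: register value, bit 0 cleared */
    bool     thumb_state;   /* value copied into the T bit */
    bool     faults;        /* true if executing the next instruction would fault */
} bx_result;

/* Model an interworking branch on a Thumb-only core: bit 0 selects the state. */
bx_result model_bx(uint32_t reg_value)
{
    bx_result r;
    r.thumb_state   = (reg_value & 1u) != 0u;
    r.fetch_address = reg_value & ~UINT32_C(1);
    r.faults        = !r.thumb_state; /* assumption: ARM state is unavailable */
    return r;
}
```

With a register value of 0x08000101 the model reports a fetch from 0x08000100 in Thumb state; with 0x08000100 it reports a fault.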
Interworking Procedure Calls
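On cores with both instruction sets, interworking is most commonly used for procedure calls between ARM and Thumb code. By convention, the address of a Thumb procedure, when taken as a data value such as a function pointer or a vector table entry, has its least significant bit set to 1. On the Cortex-M3 every procedure is Thumb, so the toolchain sets bit 0 on all function addresses, provided the symbol is marked as a Thumb function. A direct BL call encodes a PC-relative offset and does not change state, so it needs no bit 0 handling.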
For example:
        ; Call Thumb procedure
        BL thumb_proc
        ; Thumb procedure address
    thumb_proc:
        …
The BL instruction branches to thumb_proc without changing state, since caller and callee are both Thumb code, and stores the return address in LR with bit 0 set. The procedure returns with BX LR, which stays in Thumb state because of that bit.
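The value that a call leaves in LR can be worked out by hand, and the arithmetic is sketched below in C. The function and constant names are hypothetical and exist only for this illustration; the sketch assumes the Thumb encodings used on the Cortex-M3, where BL is a 32-bit instruction and BLX with a register is a 16-bit instruction.

```c
#include <assert.h>
#include <stdint.h>

/* Sizes in bytes of the two Thumb call encodings assumed in this sketch. */
enum { BL_SIZE = 4, BLX_REG_SIZE = 2 };

/* LR after a call: address of the following instruction, with bit 0 set. */
uint32_t lr_after_call(uint32_t call_instr_addr, uint32_t instr_size)
{
    assert(instr_size == BL_SIZE || instr_size == BLX_REG_SIZE);
    return (call_instr_addr + instr_size) | 1u;
}

/* Address execution resumes at after BX LR; bit 0 only selects the state. */
uint32_t return_address(uint32_t lr)
{
    return lr & ~UINT32_C(1);
}
```

For a BL located at 0x08000200, `lr_after_call` gives 0x08000205, and `return_address` maps that back to 0x08000204, the instruction after the call.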
Interworking Caveats
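Although the toolchain normally handles the Thumb bit automatically, there are some caveats to be aware of on the Cortex-M3:
- Every taken branch refills the pipeline, whether it is B, BL, BX, or BLX. The Cortex-M3 Technical Reference Manual gives the cost as 1 + P cycles, where P is 1 to 3 pipeline refill cycles. There is no extra penalty tied to instruction set state, because the state never changes.
- Switching instruction sets is not possible. A branch that clears the Thumb bit does not slow the code down; it raises a UsageFault, which escalates to a HardFault if the UsageFault handler is not enabled.
- Thumb code can use PC-relative addressing, but the value read from the PC is the address of the current instruction plus 4, and PC-relative literal loads and ADR word-align that value before adding the offset.
- Tail-chaining is an exception-handling optimization on the Cortex-M3 and is unrelated to instruction set state. Tail calls between procedures work normally, since caller and callee are both Thumb code.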
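PC-relative addressing in Thumb code follows a fixed rule that is easy to get wrong by hand: the PC reads as the current instruction address plus 4, and a literal load rounds that value down to a word boundary before adding its offset. A minimal C sketch of the arithmetic follows; the function names are hypothetical and the offset `imm` is assumed to be the byte offset already decoded from the instruction.

```c
#include <assert.h>
#include <stdint.h>

/* Value an instruction observes when it reads the PC in Thumb state. */
uint32_t thumb_pc_value(uint32_t instr_addr)
{
    assert((instr_addr & 1u) == 0u); /* instruction addresses are halfword-aligned */
    return instr_addr + 4u;
}

/* Address accessed by a PC-relative literal load: Align(PC, 4) + imm. */
uint32_t literal_address(uint32_t instr_addr, uint32_t imm)
{
    uint32_t base = thumb_pc_value(instr_addr) & ~UINT32_C(3);
    return base + imm;
}
```

Because of the alignment step, a load at 0x08000100 and a load at 0x08000102 with the same offset of 8 both access 0x0800010C.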
Guidelines for Efficient Interworking
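Follow these guidelines to avoid faults and wasted cycles related to interworking on the Cortex-M3:
- Use BL for direct calls to a label and BLX with a register for indirect calls, such as calls through function pointers. Both cost the same pipeline refill; BLX with a label does not exist on the Cortex-M3.
- Never branch to an address with bit 0 clear via BX, BLX, or a load into the PC. When porting code from a core that also runs ARM instructions, rebuild every ARM-state routine as Thumb code rather than trying to call it.
- Use BX for indirect branches that do not need a return address, such as returns via BX LR, and BLX when the callee must return. The choice has no effect on the pipeline refill.
- Build all code, including libraries and hand-written assembly, for the Thumb instruction set with a Cortex-M3 or ARMv7-M target option, so that no ARM-state object code reaches the link.
- In hand-written assembly, mark every procedure as a Thumb function with the assembler's directive for that purpose (.thumb_func in GNU as), so that the linker sets bit 0 wherever the procedure's address is taken.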
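For hand-built tables of branch targets, such as a dispatch table filled in by assembly code, one way to apply these guidelines is to check every entry for the Thumb bit before the table is used. The sketch below is a hypothetical helper written for this article, not part of any toolchain or vendor library, and it treats the targets as plain 32-bit values so that it can also run on a host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Count table entries that would fault if branched to with BX or BLX. */
size_t count_missing_thumb_bit(const uint32_t *targets, size_t count)
{
    assert(targets != NULL || count == 0);
    size_t missing = 0;
    for (size_t i = 0; i < count; i++) {
        if ((targets[i] & 1u) == 0u) {
            missing++; /* bit 0 clear: the branch would request ARM state */
        }
    }
    return missing;
}
```

A result of zero means every entry is a valid target for an interworking branch; any other value points to an entry whose symbol was probably not marked as a Thumb function.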
Debugging Interworking Issues
If you see unexpected faults or crashes around indirect branches, a missing Thumb bit may be the culprit. Here are some debugging tips for interworking issues:
- Examine the disassembly to check that all code is encoded as Thumb and that indirect branches use BX or BLX with a register.
- Inspect bit 0 of branch target addresses held in registers, function pointer tables, and the vector table. Every such value must have bit 0 set; a clear bit 0 leads to a UsageFault.
- Use a debugger to step through problematic branches and check the T bit (bit 24 of xPSR) after each one. It must remain set.
- Add comments in assembly sources to mark call sites that branch through a register, since those are the places where a bad bit 0 can appear.
- Check the assembler directives and compiler options that select the target architecture and instruction set, and confirm that hand-written procedures are marked as Thumb functions.
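When a fault has already occurred, two register values help confirm an interworking bug: the INVSTATE flag in the Configurable Fault Status Register (CFSR, at 0xE000ED28) and the T bit in the stacked xPSR. The C sketch below decodes both from plain 32-bit values so that it can be tried on a host; the macro and function names are hypothetical, and reading the real registers on a target is left to the fault handler.

```c
#include <stdbool.h>
#include <stdint.h>

#define CFSR_ADDRESS      UINT32_C(0xE000ED28) /* Configurable Fault Status Register */
#define CFSR_INVSTATE_BIT 17u /* INVSTATE: UsageFault status bit 1, CFSR bit 17 */
#define XPSR_T_BIT        24u /* Thumb state bit in xPSR */

/* True if the CFSR value reports an attempt to execute with the T bit clear. */
bool cfsr_reports_invstate(uint32_t cfsr)
{
    return ((cfsr >> CFSR_INVSTATE_BIT) & 1u) != 0u;
}

/* True if the given xPSR value has the Thumb state bit set. */
bool xpsr_thumb_bit_set(uint32_t xpsr)
{
    return ((xpsr >> XPSR_T_BIT) & 1u) != 0u;
}
```

On a target, a fault handler could pass `*(volatile uint32_t *)CFSR_ADDRESS` to `cfsr_reports_invstate` and the xPSR word from the exception stack frame to `xpsr_thumb_bit_set`.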
Interworking Example
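Here is an example of Thumb code on the Cortex-M3 calling a procedure correctly, both directly with BL and indirectly with BLX:
        ; Caller (Thumb code)
        MOV R0, #1          ; Thumb instruction
        BL  thumb_proc      ; Direct call, LR gets return address with bit 0 set
        LDR R2, =thumb_proc ; Load procedure address, bit 0 set by the linker
        BLX R2              ; Indirect call through a register
        ; Callee (Thumb code)
    thumb_proc:
        PUSH {R1}           ; Save R1
        ADD R1, R0, #2
        POP {R1}            ; Restore R1
        BX LR               ; Return, bit 0 of LR keeps Thumb state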
Key points:
- thumb_proc address has bit 0 set to indicate Thumb code.
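- BLX with a register checks bit 0 of the target; with bit 0 set, the processor stays in Thumb state.
- BX LR returns to the caller, which is also Thumb code, because BL and BLX store the return address in LR with bit 0 set.
- Correct handling of bit 0 avoids UsageFaults; the pipeline refill on each taken branch is unavoidable.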
Conclusion
The Thumb-only design of the Cortex-M3 delivers small code size with good performance. However, the interworking conventions inherited from other ARM cores still apply to branch targets and can lead to subtle faults if not managed properly. Following best practices for code generation and organization can help avoid these problems.
Understanding how BX and BLX treat bit 0, and identifying indirect branches clearly in disassembly and debugging, is key. Checking that every address loaded into the PC has bit 0 set is also recommended. With good techniques, code that relies on interworking conventions can run efficiently and safely on Cortex-M3 designs.