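The Cortex-M1 processor from ARM implements the ARMv6-M architecture, which executes Thumb instructions only. The processor starts up in Thumb state and stays there: it cannot switch to ARM state at runtime, and an attempt to do so causes a HardFault. Mixing 16-bit Thumb and 32-bit ARM instructions to balance code size and performance is a technique for cores that implement both instruction sets, such as the ARM7TDMI. This article compares the two instruction sets, shows how switching works on such cores, and explains what applies to the Cortex-M1.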
Thumb vs ARM Instructions
Thumb instructions are 16 bits wide, giving higher code density than 32-bit ARM instructions. Thumb code is typically about 65% of the size of equivalent ARM code. Thumb code also needs less instruction fetch bandwidth, since each instruction is half the size. The ARMv6-M Thumb set used by the Cortex-M1 also includes a few 32-bit encodings, such as BL.
However, most Thumb instructions can access only the low registers R0-R7, and the instruction set is more limited than the ARM instruction set. Some operations require more instructions in Thumb. On cores that implement both sets, ARM code can potentially perform better thanks to fuller register access and a richer instruction set.
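As a rough illustration of the density figure, the following sketch estimates the Thumb footprint of a routine from its ARM size. The 65% ratio is the approximate figure quoted in the text, not a guarantee for any particular program, and the function name is hypothetical.

```c
#include <stdint.h>

/* Approximate Thumb/ARM size ratio from the text, in percent (an estimate). */
#define THUMB_SIZE_PERCENT 65u

/* Hypothetical helper: estimate Thumb code size in bytes from ARM code size. */
static uint32_t estimate_thumb_bytes(uint32_t arm_bytes)
{
    /* Widen to 64 bits so the multiplication cannot overflow. */
    return (uint32_t)(((uint64_t)arm_bytes * THUMB_SIZE_PERCENT) / 100u);
}
```

Under this assumption, a 1000-byte ARM routine would occupy roughly 650 bytes as Thumb code.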
Switching Modes
The Cortex-M1 implements only the Thumb instruction set. Execution starts in Thumb state after reset, and the processor remains in Thumb state for as long as it runs. There is no ARM state to switch to.
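On cores that support both instruction sets, switching from Thumb to ARM state is done using the BX instruction with a register containing an ARM instruction address (bit 0 clear). This performs a branch and exchange, updating the instruction set state bit (the T bit) from bit 0 of the target address. On the Cortex-M1, a BX to an address with bit 0 clear does not enter ARM state; it results in a HardFault.
To switch back to Thumb state, the least significant bit of the target address is set to 1 when branching. This indicates a Thumb address and sets the T bit again. For the same reason, every BX target on the Cortex-M1 must have bit 0 set. The following sequence applies to interworking cores only; arm_code and thumb_code are placeholder labels.
/* Assembly code example */
    LDR R0, =arm_code         // Even address (bit 0 = 0): ARM target
    BX R0                     // Branch and exchange, switch to ARM state
    .arm
arm_code:
    …                         // ARM instructions
    LDR R0, =thumb_code + 1   // Odd address (bit 0 = 1): Thumb target
    BX R0                     // Branch and exchange, switch back to Thumb state
    .thumb
thumb_code: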
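The role of bit 0 in a BX target can be modeled in a few lines of C. This is a simplified host-side model, not processor code: the type and function names are hypothetical, and the `supports_arm` flag is an assumption used to represent a Thumb-only core such as the Cortex-M1 faulting when ARM state is requested.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { STATE_ARM, STATE_THUMB } isa_state_t;

typedef struct {
    uint32_t    pc;
    isa_state_t state;
    bool        supports_arm;  /* false models a Thumb-only core */
    bool        faulted;
} cpu_model_t;

/* Model of BX Rm: bit 0 of the target selects the state and is not part of the PC. */
static void model_bx(cpu_model_t *cpu, uint32_t target)
{
    assert(cpu != NULL);
    bool to_thumb = (target & 1u) != 0u;

    if (!to_thumb && !cpu->supports_arm) {
        cpu->faulted = true;   /* ARM state requested on a Thumb-only core */
        return;
    }
    cpu->state = to_thumb ? STATE_THUMB : STATE_ARM;
    cpu->pc = target & ~(uint32_t)1;
}
```

With `supports_arm` set to false, a target of 0x201 branches to 0x200 in Thumb state, while a target of 0x300 sets the fault flag and leaves the PC unchanged.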
Startup Mode Configuration
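The instruction set state after a Cortex-M1 reset cannot be configured: the processor always starts in Thumb state. Three mechanisms are sometimes presented as options for selecting the reset state. What each one actually does is described below: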
Option 1: Vector Table Offset
The Cortex-M1 has no vector table offset register; its vector table is fixed at address 0x00000000. On reset the processor reads two words from the start of the table:
- Offset 0x0 holds the initial value of the main stack pointer
- Offset 0x4 holds the reset vector, the address of the reset handler
/* Assembly code example */
    .word stack_top        // Initial main stack pointer (stack_top is a placeholder symbol)
    .word reset_handler    // Reset vector
    …
reset_handler:             // Startup code
In this example the first word of the table is a stack address, not a mode selector. The location of the table plays no part in choosing the instruction set state.
Option 2: Vector Table Entry
The reset vector table entry contains the address of the reset handler. On reset, bit 0 of this value is loaded into the T bit, which selects the instruction set state:
- Bit 0 = 1 (Thumb reset handler address) = Thumb state
- Bit 0 = 0 (ARM reset handler address) = invalid on Cortex-M1, the processor faults instead of entering ARM state
/* Assembly code example */
    .word stack_top        // Initial main stack pointer (placeholder symbol)
    .word reset_handler    // Reset vector, Thumb address (bit 0 set)
    …
    .thumb_func            // Mark reset_handler as a Thumb function so bit 0 is set
reset_handler:             // Startup code in Thumb state
Here the reset handler is marked as a Thumb function, so the reset vector has bit 0 set and the processor enters Thumb state after reset.
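A build-time check of the reset vector can catch a handler address with the wrong state bit. The sketch below runs on the host against a copy of the vector table. It assumes the standard Cortex-M layout (initial stack pointer in entry 0, reset vector in entry 1); the function name and sample values are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true if the reset vector (entry 1) has bit 0 set,
 * meaning it is a Thumb address as a Thumb-only core requires. */
static bool reset_vector_is_thumb(const uint32_t *vector_table, size_t entries)
{
    assert(vector_table != NULL);
    if (entries < 2u) {
        return false;  /* table too short to contain a reset vector */
    }
    return (vector_table[1] & 1u) != 0u;
}
```

A table whose second word is 0x000000C1 passes the check, while 0x000000C0 fails it.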
Option 3: CORTEX_M1_AIRCR Register
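The Application Interrupt and Reset Control Register (AIRCR) does not select the reset state. On the ARMv6-M architecture implemented by the Cortex-M1, its writable control bits are:
- Bit 1, VECTCLRACTIVE = reserved for debug use, write as 0
- Bit 2, SYSRESETREQ = write 1 to request a system reset
The VECTRESET bit belongs to ARMv7-M processors such as the Cortex-M3, where it is bit 0 and likewise has no effect on the instruction set state. Writes to AIRCR are ignored unless bits [31:16] carry the key 0x05FA. A reset can be requested from software as follows:
/* C code example */
#include <stdint.h>
#define SCS_AIRCR (*(volatile uint32_t*)0xE000ED0C) // AIRCR address
#define AIRCR_VECTKEY (0x05FAu << 16)               // Write key, bits [31:16]
#define AIRCR_SYSRESETREQ (1u << 2)                 // System reset request, bit 2
void system_reset(void) {
    SCS_AIRCR = AIRCR_VECTKEY | AIRCR_SYSRESETREQ;  // Request a system reset
    for (;;) { }                                    // Wait for the reset to take effect
}
The reset that follows starts the processor in Thumb state again, as required by the reset vector.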
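Writes to AIRCR take effect only when the upper halfword carries the key 0x05FA, so a plain read-modify-write of the register is ignored. The following host-testable sketch composes a valid write value; the macro and function names are hypothetical, and only the key value and the SYSRESETREQ bit position are taken from the architecture.

```c
#include <stdint.h>

#define AIRCR_VECTKEY_POS   16u
#define AIRCR_VECTKEY       0x05FAu
#define AIRCR_VECTKEY_MASK  (0xFFFFu << AIRCR_VECTKEY_POS)
#define AIRCR_SYSRESETREQ   (1u << 2)

/* Build the value to write to AIRCR: keep the low control bits of the
 * current value, set the requested bits, and insert the write key. */
static uint32_t aircr_write_value(uint32_t current, uint32_t set_bits)
{
    uint32_t value = current & ~AIRCR_VECTKEY_MASK;  /* drop the read-back key field */
    value |= set_bits & ~AIRCR_VECTKEY_MASK;         /* never let callers alter the key */
    value |= (uint32_t)AIRCR_VECTKEY << AIRCR_VECTKEY_POS;
    return value;
}
```

For a system reset request, the composed value is 0x05FA0004.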
Mixed Mode Code
On the Cortex-M1, all code must be Thumb, and the Thumb instruction set is sufficient for most tasks. On cores that implement both instruction sets, it is typically best to stay in Thumb and use ARM code minimally, for example for specialized operations or optimized math routines.
Some tips for mixed Thumb/ARM code on cores that support both instruction sets:
- Minimize switches between modes: perform Thumb/ARM transitions as infrequently as possible
- Use ARM code in performance critical sections
- Use Thumb code for more compact size in non-critical paths
- Place Thumb-to-ARM switch code in Thumb sections
- Place ARM-to-Thumb switch code in ARM sections
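Here is an example of mixed mode code for an interworking core. It will not run on the Cortex-M1, which cannot execute the ARM section:
/* Assembly code example */
    .thumb
    .thumb_func
thumb_func:
    // Thumb code
    BX LR              // Return to caller
    .thumb_func
main:
    BL arm_func        // Call ARM function; the linker is expected to add the state change (BLX or a veneer)
    // More Thumb code
    .arm
    .type arm_func, %function
arm_func:
    // ARM code
    BX LR              // Return to Thumb: LR has bit 0 set by the Thumb-state BL
This implements main() in Thumb state, calls the ARM state function arm_func(), then returns to Thumb code. The transition into ARM state happens at the call, and the BX LR in arm_func switches back because the return address in LR has bit 0 set.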
Instruction Fetching
The Cortex-M1 has no dedicated ICODE bus; that bus belongs to the Cortex-M3. The Cortex-M1 fetches instructions either from instruction tightly coupled memory (ITCM) or over its external 32-bit AHB-Lite interface, which is shared with data accesses.
Thumb instructions are 16 bits wide, so one 32-bit fetch can deliver two of them. The few 32-bit Thumb encodings, such as BL, occupy two consecutive halfwords.
ARM instructions are 32 bits wide, so on cores that execute ARM code each instruction needs a full 32-bit fetch.
The 32-bit bus of the Cortex-M1 therefore covers all of its instruction fetches, and the same width would be needed on a core that mixes Thumb and ARM code.
Instruction fetch bandwidth depends on the instruction set. At a rate of one instruction per cycle, Thumb code needs one 32-bit word every 2 cycles, while ARM code needs one 32-bit word every cycle.
So ARM code requires up to twice the instruction fetch bandwidth of Thumb code running at the same instruction rate. This should be considered in system design.
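The bandwidth comparison can be checked with simple arithmetic. The sketch below counts the 32-bit bus words needed to fetch a given number of instructions. It assumes straight-line code, 16-bit encodings for every Thumb instruction, and no refetching after branches; the names are hypothetical.

```c
#include <stdint.h>

#define THUMB_INSN_BYTES 2u  /* 16-bit Thumb encoding */
#define ARM_INSN_BYTES   4u  /* 32-bit ARM encoding */

/* Total instruction bytes for a run of fixed-size instructions. */
static uint64_t fetch_bytes(uint64_t instructions, uint32_t insn_bytes)
{
    return instructions * insn_bytes;
}

/* Number of 32-bit bus words needed to transfer the given byte count. */
static uint64_t bus_words_32(uint64_t bytes)
{
    return (bytes + 3u) / 4u;  /* round up to whole words */
}
```

For 1000 instructions this gives 500 bus words for Thumb and 1000 for ARM, which is the factor of two described in the text.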
The Cortex-M1 has no instruction cache. Placing code in ITCM serves a similar purpose by keeping instruction fetches off the external bus.
External Bus Interface
The external AHB-Lite interface used for instruction fetches consists of:
- Address bus: 32-bit address (HADDR) for fetch requests
- Data bus: 32-bit read data (HRDATA) for returning fetch results
- Control signals: transfer type and size (HTRANS, HSIZE) from the processor and a ready signal (HREADY) from the memory system
The processor drives the address and control signals during instruction fetches. The memory system returns instructions on the read data bus and uses HREADY to insert wait states.
For efficient instruction delivery, memories such as Flash controllers should return a full 32-bit word per access so that two Thumb instructions arrive in one transfer.
A 32-bit data bus suits this well. For a halfword access, the instruction is returned on the byte lanes selected by address bit 1: in a little-endian system, the lower halfword when bit 1 is 0 and the upper halfword when bit 1 is 1.
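How a 16-bit Thumb instruction is picked out of a 32-bit fetch can be sketched as follows. The example assumes a little-endian system, where address bit 1 selects the upper or lower halfword of the word; the function name and the sample word are hypothetical.

```c
#include <stdint.h>

/* Select the 16-bit instruction at 'address' from the 32-bit word that
 * contains it. Little-endian: address bit 1 = 0 selects bits [15:0],
 * address bit 1 = 1 selects bits [31:16]. */
static uint16_t halfword_from_word(uint32_t word, uint32_t address)
{
    unsigned shift = (address & 2u) ? 16u : 0u;
    return (uint16_t)(word >> shift);
}
```

With the word 0xBEEF4770 fetched from address 0x100, the instruction at 0x100 is 0x4770 and the instruction at 0x102 is 0xBEEF.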
Performance Considerations
On cores that offer both instruction sets, the choice between Thumb and ARM involves a tradeoff between code density and performance:
- Thumb code is smaller but may need more instructions for the same work because of its limited register access and 16-bit encodings.
- ARM code is larger but can do more work per instruction thanks to 32-bit encodings and fuller access to the registers.
For real-time applications, the execution timing of code sections should also be evaluated. The Cortex-M1 has a simple in-order pipeline with no out-of-order execution, so instruction timing is predictable; variation comes mainly from memory wait states.
On cores that support both instruction sets, ARM code often completes the same work in fewer instructions than Thumb, but code density suffers. A blend of compact Thumb code with ARM optimizations in key areas can help balance size and speed there.
The Cortex-M1 itself runs Thumb code only, so tuning focuses on the Thumb code and on where it is placed in memory. Profile guided analysis during development can determine which routines are worth optimizing or moving into ITCM.
Summary
The Cortex-M1 supports the Thumb instruction set only. It begins execution in Thumb state after reset and cannot change to ARM state at runtime.
The reset state is not configurable: the reset vector must have bit 0 set, there is no vector table offset register, and AIRCR has no VECTRESET bit on this processor.
Thumb code provides high density, at some cost in performance compared with ARM code on cores that offer both. On the Cortex-M1, Thumb is the only choice.
Instruction fetches use ITCM or the 32-bit external AHB-Lite interface. There is no instruction cache, but placing code in ITCM reduces external bus utilization.
Compact Thumb code together with an efficient instruction memory system allows building both compact and high performance Cortex-M1 systems.