Setting Up Thumb vs ARM Instruction Fetching on Cortex-M1

The Cortex-M1 processor from ARM can execute both Thumb and ARM instructions. The processor starts up in Thumb mode by default, but it is possible to switch between Thumb and ARM modes at runtime. This allows mixing 16-bit Thumb and 32-bit ARM instructions to optimize code size and performance.

Contents

Thumb vs ARM Instructions
Switching Modes
Startup Mode Configuration
    Option 1: Vector Table Offset
    Option 2: Vector Table Entry
    Option 3: CORTEX_M1_AIRCR Register
Mixed Mode Code
Instruction Fetching
ICODE Interface
Performance Considerations
Summary

Thumb vs ARM Instructions

Thumb instructions are 16 bits wide, giving higher code density than 32-bit ARM instructions. Thumb code is approximately 65% of the size of equivalent ARM code. Thumb mode also reduces instruction fetch bandwidth, since each instruction is half the size.

However, most 16-bit Thumb instructions can access only the low registers R0-R7, and the instruction set is more limited than in ARM mode. Some operations therefore require more instructions in Thumb mode. ARM instructions can potentially perform better because they reach more registers and offer a richer instruction set.
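As a rough illustration of the 65% figure, the sketch below estimates the Thumb size of a routine from its ARM size. The function name estimate_thumb_bytes is hypothetical, and the ratio is only the approximate average quoted above, not a guarantee for any particular piece of code.

```c
#include <stdint.h>

/* Approximate Thumb code size, using the ~65% ratio quoted in the text.
 * The ratio is an average; real code may come out above or below it. */
static uint32_t estimate_thumb_bytes(uint32_t arm_bytes)
{
    /* Widen to 64 bits so the multiplication cannot overflow. */
    return (uint32_t)(((uint64_t)arm_bytes * 65u) / 100u);
}
```

For example, a 10000-byte ARM image would be estimated at 6500 bytes of Thumb code. Measuring the real linker output is the only reliable way to size a specific build.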

Switching Modes

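The Cortex-M1 implements the ARMv6-M architecture, so execution starts in Thumb mode after reset. The processor remains in Thumb mode until a switch to ARM mode is performed. Note that ARM documents ARMv6-M as a Thumb-only architecture in which a branch to an address with bit 0 clear results in a HardFault, so check the Cortex-M1 Technical Reference Manual before relying on the ARM-mode switching described in this article.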

Switching from Thumb to ARM mode is done using the BX instruction with a register containing an ARM instruction address, that is, an address whose least significant bit is 0. BX performs a branch and exchange: it copies bit 0 of the target into the instruction set state bit, which switches modes.

To switch back to Thumb mode, the least significant bit of the target address is set to 1 when branching. This indicates a Thumb address and changes the mode bit back.

/* Assembly code example (arm_code and thumb_code are hypothetical labels) */
    LDR  R0, =arm_code     // Even address (bit 0 = 0), ARM mode address
    BX   R0                // Branch and exchange, switch to ARM mode
    …                      // ARM instructions
    LDR  R0, =thumb_code   // Address of the Thumb code
    ORR  R0, R0, #1        // Set bit 0 = 1, Thumb mode address
    BX   R0                // Branch and exchange, switch back to Thumb mode
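The bit 0 convention used by BX can be modelled on a host machine. The following is a minimal sketch, not processor code: the helper names bx_target_is_thumb and bx_target_address are hypothetical, and the addresses in the usage note are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 0 of a BX target selects the instruction set: 1 = Thumb, 0 = ARM. */
static bool bx_target_is_thumb(uint32_t target)
{
    return (target & 1u) != 0u;
}

/* Bit 0 is only a state flag; the branch goes to the address with it cleared. */
static uint32_t bx_target_address(uint32_t target)
{
    return target & ~UINT32_C(1);
}
```

With these helpers, a target of 0x08000101 is reported as Thumb with a branch address of 0x08000100, while 0x08000100 is reported as ARM.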

Startup Mode Configuration

There are a few ways to control the initial instruction set mode on Cortex-M1 reset:

Option 1: Vector Table Offset

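Whether a vector table offset register is implemented depends on the core, so confirm this in the Cortex-M1 Technical Reference Manual before using this option. Where it applies as described here, the processor resets to the address pointed to by the vector table offset register, and bit 0 of this value determines the reset mode: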

An odd offset indicates a Thumb address, resetting to Thumb mode

An even offset indicates an ARM address, resetting to ARM mode

/* Assembly code example */
    .word 0x0000001        // Vector table entries
    …
    .word reset_handler    // Reset vector
    …
reset_handler:
    // Startup code

In this example, the vector table offset register is initialized to an odd value. This causes reset in Thumb mode and execution of the Thumb reset handler.

Option 2: Vector Table Entry

The reset vector table entry can contain the address of either a Thumb or ARM reset handler. This determines the initial mode on reset:

Thumb reset handler address (bit 0 = 1) = Thumb mode
ARM reset handler address (bit 0 = 0) = ARM mode

/* Assembly code example */
    .word 0x00000000       // Vector table offset (even value)
    …
    .word reset_handler    // Reset vector, Thumb address
    …
reset_handler:
    // Startup code in Thumb mode

Here the vector table offset is even, but the reset handler address is Thumb. So after reset the processor enters Thumb mode.
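Because the mode is encoded in bit 0 of each handler address, a vector table image can be checked on a host before it is flashed. The sketch below is an illustration rather than vendor tooling: the function name count_arm_state_vectors is hypothetical, and it assumes the standard Cortex-M layout in which word 0 of the table holds the initial stack pointer and unused entries are zero.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Count vector table entries whose bit 0 is clear, i.e. entries that
 * would select ARM mode. Word 0 is skipped on the assumption that it
 * holds the initial stack pointer; zero entries are treated as unused.
 */
static size_t count_arm_state_vectors(const uint32_t *table, size_t words)
{
    size_t count = 0;
    for (size_t i = 1; i < words; ++i) {
        if (table[i] != 0u && (table[i] & 1u) == 0u) {
            ++count;
        }
    }
    return count;
}
```

A table that is meant to be all Thumb should give a count of zero; any other result points at a handler whose address was emitted without the Thumb bit.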

Option 3: CORTEX_M1_AIRCR Register

The Application Interrupt and Reset Control Register has a VECTRESET bit to indicate reset mode:

0 = Thumb mode on reset
1 = ARM mode on reset

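This overrides the vector table settings described above. The register can be initialized at startup. Writes to AIRCR take effect only when the upper halfword carries the key 0x05FA, so the write must include it. The bit position used below is the one given in this article and should be checked against the AIRCR description in the architecture reference manual for your core.

/* C code example */
#include <stdint.h>

#define SCS_AIRCR      (*(volatile uint32_t*)0xE000ED0C)  // AIRCR Address
#define AIRCR_VECTKEY  (0x05FAu << 16)                    // Key required on every write

int main(void) {
    // Keep the low halfword, replace the key field, then set the bit
    SCS_AIRCR = AIRCR_VECTKEY | (SCS_AIRCR & 0xFFFFu) | (1u << 11);  // Set VECTRESET bit 11
    // Rest of startup
}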

Setting the VECTRESET bit forces ARM mode on reset, regardless of the vector table configuration.
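AIRCR ignores writes whose upper halfword does not contain the key 0x05FA, and the same field reads back as a different status value, so a plain read-modify-write has no effect. The sketch below shows one way to build the value to write, as a pure function that can be tested on a host. The names aircr_write_value, AIRCR_VECTKEY and AIRCR_VECTKEY_SHIFT are hypothetical, and which low-halfword bit to set is left to the caller.

```c
#include <stdint.h>

#define AIRCR_VECTKEY       UINT32_C(0x05FA)  /* key expected in bits [31:16] */
#define AIRCR_VECTKEY_SHIFT 16

/*
 * Build a value suitable for writing to AIRCR: keep the current low
 * halfword, OR in the requested bits, and replace the upper halfword
 * (which reads back as a status value) with the write key.
 */
static uint32_t aircr_write_value(uint32_t current, uint32_t set_bits)
{
    uint32_t low_half = (current | set_bits) & UINT32_C(0xFFFF);
    return (AIRCR_VECTKEY << AIRCR_VECTKEY_SHIFT) | low_half;
}
```

On target, the result would be stored through the volatile register pointer in a single write, for example SCS_AIRCR = aircr_write_value(SCS_AIRCR, 1u << 11).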

Mixed Mode Code

For Cortex-M1, it is typically best to stay in Thumb mode and use ARM code minimally. The Thumb instruction set is sufficient for most tasks. ARM mode may be needed for specialized operations or optimized math routines.

Some tips for mixed Thumb/ARM code:

Minimize switches between modes – perform Thumb/ARM transitions as infrequently as possible
Use ARM code in performance critical sections
Use Thumb code for more compact size in non-critical paths

Place Thumb-to-ARM switch code in Thumb sections
Place ARM-to-Thumb switch code in ARM sections

Here is an example of mixed mode code for Cortex-M1:

/* Assembly code example */
    .thumb
thumb_func:
    // Thumb code
    BX   LR              // Return to caller

    .thumb
main:
    LDR  R0, =arm_func   // Even address, selects ARM mode
    BLX  R0              // Call arm_func and switch to ARM
    // More Thumb code

    .arm
arm_func:
    // ARM code
    BX   LR              // Return to Thumb (LR has bit 0 set)

This implements main() in Thumb mode, calls an ARM mode function arm_func(), then returns to Thumb code. The switch to ARM mode is made by the BLX in main, which also saves a Thumb return address in LR, and the switch back is made by the BX LR at the end of arm_func.

Instruction Fetching

The Cortex-M1 fetches instructions via its instruction interface. This contains a single 32-bit read-only bus, ICODE, which is used for both Thumb and ARM instruction fetches.

When executing Thumb code, the processor performs 16-bit fetches on the ICODE bus. The minimum ICODE bus size is therefore 16 bits.

For ARM code execution, 32-bit fetches are performed. So ICODE bus width must be 32 bits.

If mixing Thumb and ARM code, the ICODE interface needs to support both 16-bit and 32-bit instruction fetch transactions. A 32-bit bus width is required.

ICODE bus bandwidth usage depends on the instruction set mode. Each 32-bit word fetched holds two 16-bit Thumb instructions but only one ARM instruction, so at the same instruction rate ARM code needs about twice as many word fetches as Thumb code.

So ARM mode code may require higher ICODE bus bandwidth compared to equivalent Thumb code. This should be considered in system design.
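The difference in fetch traffic can be estimated with simple arithmetic. The sketch below is a simplified model, not a description of the actual bus protocol: it assumes every fetch returns a full 32-bit word, that all Thumb instructions are 16 bits wide, and that the code runs in a straight line from a word-aligned address. The names isa_mode and word_fetches are hypothetical.

```c
#include <stdint.h>

typedef enum { ISA_THUMB, ISA_ARM } isa_mode;

/* Number of 32-bit word fetches needed for a run of sequential instructions. */
static uint32_t word_fetches(isa_mode mode, uint32_t instructions)
{
    if (mode == ISA_ARM) {
        return instructions;                        /* one word per instruction */
    }
    return instructions / 2u + instructions % 2u;   /* two per word, rounded up */
}
```

Under these assumptions, 100 Thumb instructions need 50 word fetches while 100 ARM instructions need 100. Branches, wait states and differing instruction counts for the same work will shift the real numbers.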

Instruction buffering close to the core, such as a tightly coupled instruction memory or a cache in the memory system, can help reduce average ICODE bandwidth. Its operation is transparent regardless of Thumb or ARM modes.

ICODE Interface

The ICODE bus interface consists of:

Address bus – 32-bit address for instruction fetch requests
Data bus – 32-bit instruction data for returning fetch results
Control signals – Handshaking signals like ICODE peripheral ready and processor read enable

The processor drives the address and read enables during instruction fetches. The peripheral returns instructions on the data bus with handshake signaling.

For efficient instruction delivery, peripherals like Flash memory controllers can support the variable width accesses required by Cortex-M1 in Thumb/ARM modes.

Using a 32-bit ICODE data bus optimizes performance for mixed mode code by supporting full ARM bandwidth. The peripheral just needs to return 16-bit Thumb instructions in the lower half-word for 16-bit accesses.

Performance Considerations

The choice between Thumb and ARM modes involves a tradeoff between code density and performance:

Thumb code is smaller but can need more instructions for the same work, because most 16-bit encodings reach only a subset of the registers and operations.
ARM code is larger but can do more work per instruction thanks to 32-bit encodings and access to all registers.

For real-time applications, the execution timing of code sections should also be evaluated. Thumb code may take more cycles for the same work. In either mode, timing depends on memory wait states and on pipeline refills after branches. The Cortex-M1 executes instructions in order, so there are no out-of-order effects to account for.

In general, processor bandwidth utilization improves when using ARM instead of Thumb. However, ARM code density suffers. A blend of compact Thumb code along with ARM optimizations in key areas can help balance size and speed.

The Cortex-M1 allows mixing Thumb and ARM modes to take advantage of both instruction sets. Profile guided analysis during development can determine where ARM boosts performance over Thumb implementations.

Summary

The Cortex-M1 supports both 16-bit Thumb and 32-bit ARM instruction sets. It begins execution in Thumb mode after reset but can change modes during runtime.

Startup mode can be configured through the vector table offset, reset handler address, or AIRCR register VECTRESET bit.

Thumb code provides high density but has reduced performance versus ARM. A combination of the two instruction sets works best on Cortex-M1 in most cases.

The instruction fetch interface needs a 32-bit data bus to support full ARM bandwidth. However, instruction caching can help minimize average bus utilization.

Balancing ARM and Thumb usage along with an efficient ICODE interface allows building both compact and high performance Cortex-M1 systems.
