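Arm Cortex-M processors support pre-indexed addressing, which allows efficient access to arrays and structured data. A pre-indexed load or store adds an offset to the base register, uses the result as the access address, and writes that address back to the base register, all in one instruction. This reduces code size and improves performance when stepping through sequential data. The single-register pre-indexed forms of LDR and STR are available on Armv7-M and Armv8-M Mainline cores such as Cortex-M3, Cortex-M4, Cortex-M7 and Cortex-M33; Armv6-M cores such as Cortex-M0 and Cortex-M0+ provide base register writeback only through LDM, STM, PUSH and POP.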
Overview of Pre-Indexed Addressing
In pre-indexed addressing, an immediate offset is added to or subtracted from the base register before the memory access, and the resulting address is written back to the base register. The pointer is therefore already in place for the next access, eliminating the separate instruction that would otherwise update it.
For example, to read two consecutive 32-bit array elements when R2 points one element before the first of them, using separate pointer updates:
    ADD R2, R2, #4        ; Advance pointer to the first element
    LDR R0, [R2]          ; Read array element into R0
    ADD R2, R2, #4        ; Advance pointer to the next element
    LDR R3, [R2]          ; Read next array element into R3
With pre-indexed addressing, this can be done more efficiently as:
    LDR R0, [R2, #4]!     ; R2 = R2 + 4, then read array element into R0
    LDR R3, [R2, #4]!     ; R2 = R2 + 4, then read next element into R3
The base register R2 is advanced by 4 bytes (the element size) as part of each load, and the load uses the updated address.
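The difference between a plain offset load and a pre-indexed load can be modelled in a few lines of C. This is only an illustrative sketch: the word-array memory, the byte-offset addresses and the function names `ldr_offset` and `ldr_pre_indexed` are assumptions made for the example, not part of any Arm API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: memory is a small word array and an "address"
   is a byte offset into it. */
#define MEM_WORDS 8

uint32_t read_word(const uint32_t mem[MEM_WORDS], uint32_t addr)
{
    /* Word accesses in this model must be aligned and in range. */
    assert(addr % 4 == 0 && addr / 4 < MEM_WORDS);
    return mem[addr / 4];
}

/* Models LDR Rt, [Rn, #imm]: the base register is left unchanged. */
uint32_t ldr_offset(const uint32_t mem[MEM_WORDS], const uint32_t *rn, int32_t imm)
{
    return read_word(mem, *rn + (uint32_t)imm);
}

/* Models LDR Rt, [Rn, #imm]!: the base register is first updated to
   Rn + imm, and the load then uses the updated value. */
uint32_t ldr_pre_indexed(const uint32_t mem[MEM_WORDS], uint32_t *rn, int32_t imm)
{
    *rn += (uint32_t)imm;
    return read_word(mem, *rn);
}
```

Calling `ldr_pre_indexed` twice with an offset of 4 returns two consecutive words and leaves the modelled base register pointing at the second one, whereas `ldr_offset` returns the same word each time unless the caller updates the base separately.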
Pre-Indexed Load and Store Instructions
The main Arm Cortex-M pre-indexed addressing modes are provided by the LDR and STR instructions. The key pre-indexed forms are:
    LDR Rt, [Rn, #offset]!
    STR Rt, [Rn, #offset]!
These perform the memory access at the address Rn + offset and write that address back to Rn. This provides efficient sequential access without a separate instruction to update the pointer.
The offset is an immediate in the range -255 to 255. The byte and halfword variants (LDRB, LDRH, LDRSB, LDRSH, STRB, STRH) accept the same syntax. The register-offset form [Rn, Rm, LSL #n] accepts a logical left shift of 0-3 bits, which allows indexing by 1, 2, 4 or 8 bytes, but in the Thumb instruction set it does not support writeback: neither Rn nor Rm is modified.
For example:
    LDR R1, [R2, #4]!         ; R2 = R2 + 4, then read from [R2]
    STR R1, [R2, #-4]!        ; R2 = R2 - 4, then write to [R2]
    LDR R3, [R4, R8, LSL #3]  ; Read from R4 + R8*8; R4 and R8 are unchanged
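The scaled register-offset address calculation can be written out in C as a sanity check. The sketch below is illustrative only; `reg_offset_address` and `load_element` are hypothetical helper names. The first function computes the effective address of `[Rn, Rm, LSL #n]`, and the second shows the C array access that typically corresponds to it. In the Thumb encodings this form leaves both registers unchanged.

```c
#include <assert.h>
#include <stdint.h>

/* Effective address of LDR Rt, [Rn, Rm, LSL #shift].
   The shift is limited to 0..3, scaling the index by 1, 2, 4 or 8. */
uint32_t reg_offset_address(uint32_t rn, uint32_t rm, unsigned shift)
{
    assert(shift <= 3);
    return rn + (rm << shift);
}

/* The equivalent C view: indexing a 32-bit array scales the index by 4,
   which matches a shift of 2. */
uint32_t load_element(const uint32_t *base, uint32_t index)
{
    return base[index];
}
```

For instance, a base of 0x20000000 with index 3 and a shift of 2 yields 0x2000000C, the address of the fourth 32-bit element.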
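Loading Multiple Registers with Writeback
The LDM instruction can load multiple registers from memory and update the base register in the same instruction. For example:
    LDM R1!, {R2, R3, R4}  ; Load R2-R4 from [R1], [R1 + 4], [R1 + 8], then increment R1 by 12 bytes
Strictly speaking, this is an increment-after access: the first word is read from the original address in R1, so it behaves like post-indexing. The LDMDB and STMDB variants decrement the base register before the first access, which is the multiple-register counterpart of pre-indexing.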
This provides an efficient way to populate registers from a data structure.
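As a rough model of what a load-multiple with writeback does, the following C sketch copies consecutive words into a register array and then advances the base pointer. The function name `ldmia_writeback` and the use of a C pointer in place of a byte address are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Models LDM Rn!, {reglist}: reads `count` consecutive words starting at
   the address in *rn (lowest-numbered register first), then advances *rn
   past the words that were read. */
void ldmia_writeback(const uint32_t **rn, uint32_t *regs, size_t count)
{
    assert(rn != NULL && *rn != NULL && regs != NULL);
    const uint32_t *addr = *rn;
    for (size_t i = 0; i < count; i++) {
        regs[i] = addr[i];
    }
    *rn = addr + count;   /* writeback: base moves by 4 * count bytes */
}
```

With three registers in the list, the modelled base pointer ends up 12 bytes past its starting value, matching the R1 update in the LDM example.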
Pre-Indexed Base Register Update
Writeback is requested with the ! suffix and always updates the base register Rn; the index register Rm of a register-offset form is never modified. The base register is set to the address that was used for the access:
    LDR R5, [R6, #8]!   ; Load from [R6 + 8], R6 = R6 + 8
    LDR R2, [R1, #-4]!  ; Load from [R1 - 4], R1 = R1 - 4
This can be useful when traversing data structures with a moving base pointer.
Pre-Indexed Addressing Modes Summary
To summarize, the key pre-indexed addressing modes in Arm Cortex-M are:
- LDR/STR with an immediate offset and base register writeback (!)
- LDM/STM with base register writeback for transferring multiple registers
- Register-offset LDR/STR with LSL #0-3 for 1, 2, 4 or 8 byte indexing, without writeback
These optimized addressing modes reduce code size and improve performance when accessing:
- Arrays
- Structures
- Linked lists
- Other sequential data structures
The separate instruction for updating the pointer is eliminated, allowing more compact and efficient code to be generated.
Using Pre-Indexed Addressing Efficiently
Here are some tips for using pre-indexed addressing modes effectively in Arm Cortex-M code:
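- Use for array access and data structure traversal instead of separate pointer updates
- Initialize the base register with the offset in mind: the access uses base + offset, so for a forward walk the pointer starts one element before the first item
- Use LDM to load multiple registers from arrays or structures
- Use base register writeback (!) when the pointer itself should move with each access
- Use the register-offset form with an appropriate element size scaling (LSL #n) when the index must stay in its own register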
However, there are also some cautions when using pre-indexed addressing:
- Not ideal for random access of array elements
- The base register is overwritten by each access, so keep a copy if the original address is still needed
- Can make disassembly harder to understand
So pre-indexing is very useful for sequential accesses and structured data, but not always appropriate for more general code.
Pre-Indexed vs Post-Indexed Addressing
The Cortex-M instruction set also provides post-indexed addressing for the same LDR and STR instructions. With post-indexing, the memory access uses the unmodified base register, and the offset is applied to the base register after the access.
Post-indexing is the natural choice when the pointer already addresses the element to be accessed, as in the C expression *p++. Pre-indexing fits when the pointer sits just before the element, as in *++p, or when the updated address is the one to access, such as when pushing onto a descending stack.
The post-indexed addressing forms use a different syntax, with the offset outside the brackets:
    LDR R1, [R2], #4   ; Load from [R2], then increment R2 by 4
    STR R3, [R4], #-4  ; Store to [R4], then decrement R4 by 4
So in summary:
- Pre-indexed: base register updated first, then memory access at the updated address
- Post-indexed: memory access at the original address first, then base register updated
Choose appropriately depending on whether you need the original address or updated address for the next access.
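The same distinction exists in C as the difference between `*++p` and `*p++`. The sketch below is illustrative: the function names and the buffer layout with a leading count word are assumptions chosen so that both loops stay within the array. Whether a compiler actually emits pre-indexed or post-indexed instructions for them depends on the compiler, optimization level and target.

```c
#include <stddef.h>
#include <stdint.h>

/* Pre-indexed style: the pointer starts on a header word that holds the
   element count, so each step advances first and then reads (*++p). */
uint32_t sum_after_header(const uint32_t *buf)
{
    const uint32_t *p = buf;
    uint32_t n = *p;          /* header word: number of elements that follow */
    uint32_t sum = 0;
    while (n-- > 0) {
        sum += *++p;          /* advance, then read */
    }
    return sum;
}

/* Post-indexed style: the pointer starts on the first element, so each
   step reads first and then advances (*p++). */
uint32_t sum_from_start(const uint32_t *data, size_t n)
{
    const uint32_t *p = data;
    uint32_t sum = 0;
    while (n-- > 0) {
        sum += *p++;          /* read, then advance */
    }
    return sum;
}
```

Both functions visit the same elements; they differ only in where the pointer sits before the first access.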
Pre-Indexing and the Stack Pointer
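Base register writeback works very well with the stack pointer (SP) to efficiently push and pop data to/from the stack. For example:
    PUSH {R4-R7}  ; Same as STMDB SP!, {R4-R7}: decrement SP by 16, then store
    POP {R4-R7}   ; Same as LDMIA SP!, {R4-R7}: load, then increment SP by 16
PUSH decrements SP before storing, like a pre-indexed access, while POP reads from the current SP and increments it afterwards, like a post-indexed access.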
This builds upon the existing Full Descending Stack operation of the SP to optimize stack operations. The Cortex-M stack grows down from high addresses to low addresses.
Some key advantages of using pre-indexed push/pop with the SP:
- Single instruction to push or pop multiple registers
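- Interrupt-safe but not atomic: the processor may take an exception partway through a multi-register transfer and resume or restart it afterwards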
- Automatically updates SP
- Efficient for function prologs and epilogs
So in summary, pre-indexed addressing with the SP provides very efficient stack manipulation in Cortex-M assembly code or compiler output.
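A full descending stack can be modelled in C to make the order of the SP update and the memory access explicit. This is a sketch under stated assumptions: the `model_stack` type, its fixed size and the function names are invented for the example, and the stack pointer is represented as a word index rather than a byte address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_WORDS 16

/* Hypothetical model of a full descending stack: sp is the index of the
   most recently pushed word, or STACK_WORDS when the stack is empty. */
typedef struct {
    uint32_t mem[STACK_WORDS];
    size_t sp;
} model_stack;

void stack_init(model_stack *s)
{
    s->sp = STACK_WORDS;
}

/* Models PUSH {reglist}: SP is decremented first, then the registers are
   stored with the lowest-numbered register at the lowest address. */
void stack_push(model_stack *s, const uint32_t *regs, size_t count)
{
    assert(count <= s->sp);                  /* no overflow in the model */
    s->sp -= count;
    for (size_t i = 0; i < count; i++) {
        s->mem[s->sp + i] = regs[i];
    }
}

/* Models POP {reglist}: the registers are loaded from the current SP,
   then SP is incremented past them. */
void stack_pop(model_stack *s, uint32_t *regs, size_t count)
{
    assert(s->sp + count <= STACK_WORDS);    /* no underflow in the model */
    for (size_t i = 0; i < count; i++) {
        regs[i] = s->mem[s->sp + i];
    }
    s->sp += count;
}
```

Pushing four words and popping four words returns the same values in the same register order and restores the modelled SP to its starting position.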
Pre-Indexing in C Code
While pre-indexing is commonly used directly in assembly code, C compilers for Cortex-M can also make use of it when appropriate. For example:
    int buf[64];
    for (int i = 0; i < 64; i++) {
        buf[i] = i;  // Array access
    }
A C compiler may compile the array store in the loop as an STR with base register writeback, so that one instruction both writes the element and advances the pointer. Whether it does so depends on the compiler, the optimization level and the target architecture.
So pre-indexing can be leveraged implicitly by compiling C code, as well as explicitly in hand-written assembly.
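One way to make the intent explicit in source code is to walk the buffer with a pointer. The following sketch uses a hypothetical function name, `fill_ramp`, and performs the same fill as the loop above. The store-then-advance expression is a natural candidate for a single store with writeback, although the compiler is free to choose a different instruction sequence.

```c
#include <stddef.h>

/* Fills buf[0..n-1] with 0, 1, 2, ... using a moving pointer. */
void fill_ramp(int *buf, size_t n)
{
    int *p = buf;
    for (size_t i = 0; i < n; i++) {
        *p++ = (int)i;   /* store the value, then advance the pointer */
    }
}
```

Inspecting the compiler's assembly output for the actual target is the reliable way to see which addressing mode was chosen.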
Pre-Indexing and DSP Instructions
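Pre-indexed addressing can also be used effectively with the doubleword load/store instructions LDRD and STRD, which are common in DSP code. These instructions belong to the base Armv7-M instruction set and do not require the DSP extension. For example:
    LDRD R2, R3, [R1, #8]!   ; R1 = R1 + 8, then load R2 and R3 from [R1]
    STRD R2, R3, [R4, #-8]!  ; R4 = R4 - 8, then store R2 and R3 to [R4]
The offset must be a multiple of 4 in the range -1020 to 1020.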
This allows updating pointer registers during DSP operations like FIR filtering, matrix multiplication, FFTs, and more. Pre-indexing helps keep the code simple and efficient.
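A typical DSP inner loop that benefits from paired loads with pointer updates is a dot product that consumes two samples per iteration. The sketch below is an assumption-laden illustration: `dot_pairs` is a hypothetical name, the element count is required to be even, and nothing guarantees that a given compiler will map the paired reads to LDRD.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Dot product over n elements, reading two adjacent values from each
   array per iteration and then advancing both pointers by two elements. */
int64_t dot_pairs(const int32_t *a, const int32_t *b, size_t n)
{
    assert(n % 2 == 0);   /* this sketch only handles an even count */
    int64_t acc = 0;
    for (size_t i = 0; i < n; i += 2) {
        int32_t a0 = a[0], a1 = a[1];   /* two adjacent words from a */
        int32_t b0 = b[0], b1 = b[1];   /* two adjacent words from b */
        acc += (int64_t)a0 * b0 + (int64_t)a1 * b1;
        a += 2;                         /* advance by 8 bytes */
        b += 2;
    }
    return acc;
}
```

Each iteration reads 8 bytes from each array and moves each pointer by 8 bytes, which is the access pattern that a doubleword load with writeback covers in one instruction.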
Conclusion
In summary, pre-indexed addressing is an important and powerful feature of the Arm Cortex-M instruction set. It enables simple and efficient access to structured and array data without separate instructions to update the pointer.
Pre-indexing with load/store instructions and LDM provides significant performance and code size benefits for sequential data access. Combined with the stack pointer, it also enables very fast stack manipulation.
Software engineers should leverage pre-indexed addressing when appropriate for Cortex-M programming. It can be used directly in assembly code, as well as being utilized automatically by a good C compiler.