The ARM Cortex-M3 is a 32-bit reduced instruction set computing (RISC) processor designed for embedded applications. It implements the ARMv7-M architecture and executes only the Thumb-2 instruction set (it has no ARM instruction state), balancing high performance, low cost, and low power consumption.
Instruction Set Overview
The Cortex-M3 instruction set is based on Thumb-2 technology, which mixes 16-bit and 32-bit instructions in a single instruction stream. Thumb-2 extends the earlier 16-bit Thumb instruction set with 32-bit instructions, so that code keeps most of Thumb's density while gaining performance. Key features of the Thumb-2 instruction set include:
- 16-bit and 32-bit instructions intermixed freely, with no mode switch
- Compact 16-bit encodings for common operations, which reduce code size
- Conditional execution of short instruction sequences through the IT (If-Then) instruction
- Load/store architecture in which most data processing instructions execute in a single cycle
In Thumb-2, the 16-bit and 32-bit instructions can be freely intermixed within a program. The 32-bit instructions follow a regular format to simplify decoding, and they give access to larger immediates and to operations that have no 16-bit form. Apart from conditional branches, instructions are made conditional by a preceding IT instruction, which applies a condition to up to four following instructions. On the Cortex-M3 most data processing instructions execute in a single cycle, while loads and stores typically take longer (a single load usually takes two cycles).
Data Processing Instructions
Data processing instructions operate on registers and provide basic arithmetic, logical, and move operations. These include:
- Addition and subtraction (ADD, SUB)
- Bitwise logic (AND, ORR, EOR, BIC)
- Bit counting and reversal (CLZ, RBIT)
- Comparison (CMP, CMN, TST)
- Moves (MOV, MVN)
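The instructions can take register or immediate operands. Apart from branches, conditional execution is obtained by placing the instruction in an IT block and appending a condition suffix (EQ, NE, GT, etc.) to the mnemonic. Multiply and multiply-accumulate operations (MUL, MLA, MLS), together with hardware divide (SDIV, UDIV), speed up arithmetic-heavy code such as digital signal processing routines.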
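As an illustration of what CLZ and RBIT compute, the following is a minimal portable sketch in C. The function names `clz32` and `rbit32` are hypothetical and exist only for this example; on a real Cortex-M3 each operation is a single instruction, and this code only models the result.

```c
#include <assert.h>
#include <stdint.h>

/* Model of CLZ: count the leading zero bits of a 32-bit word.
 * Like the instruction, it returns 32 when the input is 0.
 * (clz32 is a hypothetical name used only in this sketch.) */
unsigned clz32(uint32_t x)
{
    unsigned n = 0;
    if (x == 0) {
        return 32;
    }
    while ((x & 0x80000000u) == 0) {
        x <<= 1;
        n++;
    }
    return n;
}

/* Model of RBIT: reverse the bit order of a 32-bit word,
 * so bit 0 becomes bit 31, bit 1 becomes bit 30, and so on.
 * (rbit32 is a hypothetical name used only in this sketch.) */
uint32_t rbit32(uint32_t x)
{
    uint32_t r = 0;
    for (int i = 0; i < 32; i++) {
        r = (r << 1) | (x & 1u);
        x >>= 1;
    }
    return r;
}
```

For example, `clz32(1)` yields 31 and `rbit32(1)` yields 0x80000000.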
Load/Store Instructions
Data transfer between memory and registers is handled by load/store instructions. Supported instructions include:
- Byte, halfword, and word loads and stores (LDRB/STRB, LDRH/STRH, LDR/STR)
- Sign-extending loads (LDRSB, LDRSH); LDRB and LDRH zero-extend
- Block data transfer (LDM, STM)
- Addressing modes (offset, pre/post-indexed, literal)
Loads perform sign or zero extension as required. Stores transfer register contents to memory. Block operations allow multiple registers to be transferred with a single instruction. Flexible addressing modes are supported including register offset, pre/post-indexed, and PC-relative literal.
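The difference between zero-extending and sign-extending loads can be sketched with a small model in C. The helper names below are hypothetical, memory is modelled as a plain byte array, and little-endian data layout is assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Model of LDRB: load one byte and zero-extend it to 32 bits. */
uint32_t load_u8(const uint8_t *mem, uint32_t addr)
{
    return (uint32_t)mem[addr];
}

/* Model of LDRSB: load one byte and sign-extend it to 32 bits.
 * The XOR/subtract idiom sign-extends without relying on
 * implementation-defined narrowing conversions. */
int32_t load_s8(const uint8_t *mem, uint32_t addr)
{
    return (int32_t)(mem[addr] ^ 0x80u) - 0x80;
}

/* Model of LDRH: load a little-endian halfword, zero-extended. */
uint32_t load_u16(const uint8_t *mem, uint32_t addr)
{
    return (uint32_t)mem[addr] | ((uint32_t)mem[addr + 1] << 8);
}

/* Model of LDRSH: load a little-endian halfword, sign-extended. */
int32_t load_s16(const uint8_t *mem, uint32_t addr)
{
    return (int32_t)(load_u16(mem, addr) ^ 0x8000u) - 0x8000;
}
```

With the byte 0x80 in memory, the LDRB model returns 128 while the LDRSB model returns -128, which is the distinction the two instruction families exist to provide.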
Branch Instructions
Branch instructions alter the instruction flow by changing the PC to a target address. The Cortex-M3 supports conditional and unconditional branches:
- Unconditional and conditional branches (B, Bcc)
- Subroutine calls (BL, BLX)
- Returns from subroutines (BX)
Conditional branches use a condition code, evaluated against the N, Z, C, and V flags, to specify whether to branch. Subroutine calls save the return address in the link register (LR). A return loads the PC from LR, typically with BX LR. Position-independent code can be created using PC-relative addressing.
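The test that a conditional branch performs on the flags can be sketched as follows. This is a model in C for illustration only: the `flags_t` type and `condition_passed` function are hypothetical names, and the 4-bit values follow the standard ARM condition-code numbering (0 = EQ, 1 = NE, ..., 0xE = AL).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the four condition flags. */
typedef struct {
    bool n; /* Negative */
    bool z; /* Zero */
    bool c; /* Carry (set when a subtraction does not borrow) */
    bool v; /* Overflow */
} flags_t;

/* Return true if the 4-bit condition code 'cond' passes for flags 'f'.
 * The upper three bits select a base test; bit 0 inverts it. */
bool condition_passed(unsigned cond, flags_t f)
{
    bool result;
    assert(cond <= 0xE); /* 0xF is not a condition code in this model */
    switch (cond >> 1) {
    case 0:  result = f.z; break;                    /* EQ / NE */
    case 1:  result = f.c; break;                    /* CS / CC */
    case 2:  result = f.n; break;                    /* MI / PL */
    case 3:  result = f.v; break;                    /* VS / VC */
    case 4:  result = f.c && !f.z; break;            /* HI / LS */
    case 5:  result = (f.n == f.v); break;           /* GE / LT */
    case 6:  result = (f.n == f.v) && !f.z; break;   /* GT / LE */
    default: result = true; break;                   /* AL */
    }
    if (cond & 1u) {
        result = !result;
    }
    return result;
}
```

For instance, after comparing 3 with 5 the flags are N=1, Z=0, C=0, V=0, so LT (0xB) and NE (0x1) pass while GE (0xA) and EQ (0x0) do not.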
Exceptions and Interrupts
The Cortex-M3 provides built-in exception and interrupt handling to respond quickly to events. This includes:
- Nested Vectored Interrupt Controller (NVIC)
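- SysTick timer for periodic OS ticks
- BusFault, MemManage, UsageFault, and HardFault exceptions
- SVCall for supervisor calls and PendSV for deferred context switching
The NVIC allows configuring the priority of each interrupt and dispatches handlers through a vector table. SysTick generates periodic interrupts, commonly used as the time base of a real-time OS. Fault exceptions report errors such as invalid memory accesses, bus errors, and undefined instructions. SVCall is raised by the SVC instruction to request OS services, while PendSV is a software-pended exception that an OS typically runs at the lowest priority to perform context switches.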
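How an exception maps to its vector table entry can be sketched in C. The enumerator and function names are hypothetical; the exception numbers are the ARMv7-M ones, with word 0 of the table holding the initial stack pointer and external interrupt n using exception number 16 + n. The `vtor` argument stands for the vector table base address.

```c
#include <assert.h>
#include <stdint.h>

/* ARMv7-M exception numbers (enumerator names are illustrative). */
enum {
    EXC_RESET      = 1,
    EXC_NMI        = 2,
    EXC_HARDFAULT  = 3,
    EXC_MEMMANAGE  = 4,
    EXC_BUSFAULT   = 5,
    EXC_USAGEFAULT = 6,
    EXC_SVCALL     = 11,
    EXC_PENDSV     = 14,
    EXC_SYSTICK    = 15,
    EXC_IRQ0       = 16  /* first external interrupt */
};

/* Exception number for external interrupt 'irq' (0-based). */
unsigned irq_to_exception(unsigned irq)
{
    return EXC_IRQ0 + irq;
}

/* Address of the 32-bit vector for an exception, given the
 * vector table base. Each entry is one word (4 bytes). */
uint32_t vector_address(uint32_t vtor, unsigned exception_number)
{
    return vtor + 4u * exception_number;
}
```

With the table at address 0, the SysTick vector is at offset 0x3C and the vector for external interrupt 0 is at offset 0x40.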
Exclusive Access Instructions
Atomic read-modify-write operations are supported using exclusive access instructions:
- Load-Exclusive and Store-Exclusive (LDREX, STREX)
- Clear-Exclusive (CLREX)
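An exclusive load (LDREX) marks the accessed address in the exclusive monitor. The matching store-exclusive (STREX) performs the write only if the monitor still holds that mark, and returns a status value: 0 on success, 1 on failure, in which case software retries the sequence. CLREX clears the monitor, cancelling a pending exclusive access. This enables thread-safe operations on shared data structures.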
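The load, modify, conditional store, retry pattern can be sketched in portable C11 with a compare-and-swap loop. This is not the instruction sequence itself: it is an assumption of this example that the compiler lowers such a loop to LDREX/STREX when targeting ARMv7-M, which common toolchains typically do. The function name `atomic_add_u32` is hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Atomically add 'delta' to '*counter' and return the new value.
 * Shape of the loop: read the current value, compute the update,
 * then attempt a store that succeeds only if nobody else wrote
 * in between. On failure 'old' is refreshed and the loop retries,
 * much like retrying after STREX reports failure. */
uint32_t atomic_add_u32(_Atomic uint32_t *counter, uint32_t delta)
{
    uint32_t old = atomic_load(counter);
    while (!atomic_compare_exchange_weak(counter, &old, old + delta)) {
        /* 'old' now holds the latest value; try again. */
    }
    return old + delta;
}
```

The weak form of the compare-exchange is allowed to fail spuriously, which matches the behaviour of a store-exclusive and is why it is always used inside a loop.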
System Control Instructions
System control instructions provide control over architecture features:
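- Special register access (MRS, MSR)
- CPS instruction for masking and unmasking interrupts
- WFI and WFE for power management
- SEV for event signalling and ISB for synchronization
MRS and MSR read and write special registers such as PRIMASK, BASEPRI, FAULTMASK, and CONTROL. Setting PRIMASK masks all exceptions that have a configurable priority. CPS sets or clears PRIMASK and FAULTMASK (CPSID i, CPSIE i, CPSID f, CPSIE f); it does not select between Thread and Handler modes, which change only on exception entry and return. WFI and WFE enter low power states until an interrupt or event arrives. SEV (Send Event) signals an event that wakes a processor waiting in WFE, and ISB (Instruction Synchronization Barrier) flushes the pipeline so that later instructions are fetched after earlier context changes have taken effect.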
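A common use of PRIMASK is a save/disable/restore critical section, sketched below. All function names are hypothetical. The target branch assumes a GCC-compatible compiler building for ARMv7-M and uses inline assembly; the other branch is only a host-side stand-in variable so that the pattern can be compiled and exercised off-target.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__ARM_ARCH_7M__) && defined(__GNUC__)
/* Target build (assumed GCC-compatible): access the real PRIMASK. */
uint32_t primask_read(void)
{
    uint32_t v;
    __asm__ volatile ("mrs %0, primask" : "=r"(v));
    return v;
}
void primask_write(uint32_t v)
{
    __asm__ volatile ("msr primask, %0" : : "r"(v) : "memory");
}
void irq_disable(void)
{
    __asm__ volatile ("cpsid i" : : : "memory");
}
#else
/* Host build: a plain variable stands in for PRIMASK (model only). */
uint32_t fake_primask = 0;
uint32_t primask_read(void)    { return fake_primask; }
void primask_write(uint32_t v) { fake_primask = v & 1u; }
void irq_disable(void)         { fake_primask = 1u; }
#endif

/* Enter a critical section: remember the old mask, then mask interrupts. */
uint32_t critical_enter(void)
{
    uint32_t saved = primask_read();
    irq_disable();
    return saved;
}

/* Leave a critical section: restore the mask that was saved on entry,
 * so nested sections do not re-enable interrupts too early. */
void critical_exit(uint32_t saved)
{
    primask_write(saved);
}
```

Restoring the saved value instead of unconditionally re-enabling interrupts is a design choice that keeps the pattern safe when critical sections nest.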
Coprocessor Instructions
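ARMv7-M defines coprocessor instruction encodings (CDP, MCR, MRC, LDC, STC), but the Cortex-M3 does not support coprocessor instructions, and attempting to execute one raises a UsageFault. The extensions sometimes described as coprocessors are handled differently:
- Floating point unit: not available on the Cortex-M3; floating point is performed in software
- Memory protection unit (MPU): optional, configured through memory-mapped registers
- Nested Vectored Interrupt Controller (NVIC): configured through memory-mapped registers
- Debug control (DCB, DWT): memory-mapped debug components
The MPU, when present, provides memory access control. The NVIC, MPU, and debug registers sit on the Private Peripheral Bus, which starts at address 0xE0000000, and are accessed with ordinary load and store instructions rather than dedicated coprocessor instructions.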
Thumb-2 Instruction Encoding
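The Thumb-2 instruction set uses a variable length encoding built from 16-bit halfwords. The length of an instruction is determined by the top five bits of its first halfword:
- If bits 15:11 are 11101, 11110, or 11111, the halfword is the first half of a 32-bit instruction
- Any other value in bits 15:11 denotes a complete 16-bit instruction
This allows the two instruction lengths to be intermixed freely, because the decoder can tell the length from the first halfword alone. 32-bit instructions have a regular format for more efficient decoding, with further fields in the first halfword selecting the instruction class.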
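The length decision a Thumb-2 decoder makes can be sketched in C. The rule used here is the architectural one: a first halfword whose bits 15:11 are 11101, 11110, or 11111 begins a 32-bit instruction, and every other value is a complete 16-bit instruction. The function name `thumb_insn_size` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Return the instruction size in bytes (2 or 4), given the first
 * halfword of a Thumb-2 instruction. Only bits 15:11 are examined. */
unsigned thumb_insn_size(uint16_t first_halfword)
{
    unsigned top5 = (unsigned)first_halfword >> 11;
    if (top5 == 0x1Du || top5 == 0x1Eu || top5 == 0x1Fu) {
        return 4u; /* 0b11101, 0b11110, 0b11111: 32-bit instruction */
    }
    return 2u;     /* anything else: 16-bit instruction */
}
```

For example, 0xBF00 (a 16-bit NOP) decodes as 2 bytes, while 0xF000 (the first halfword of a BL) decodes as 4 bytes.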
Conclusion
The Cortex-M3 Thumb-2 instruction set combines high code density with good performance. Mostly single-cycle data processing, together with hardware-managed exception entry through the NVIC, gives fast and predictable exception response. Overall, the Thumb-2 ISA is well-suited for embedded applications requiring a balance of cost, performance, and power efficiency.