The ARM Cortex-M is a group of 32-bit RISC ARM processor cores licensed by Arm Holdings. The Cortex-M cores are designed for embedded applications requiring low cost, low power consumption, and high performance. Cortex-M processors feature a streamlined instruction set optimized for minimal memory footprint and low-cost manufacturability. They have become very popular in the embedded market, being incorporated into microcontrollers from many major semiconductor companies.
Key Features of Cortex-M
Some of the key features of Cortex-M processors include:
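- M-profile architecture – Cortex-M cores implement the microcontroller (M) profile of the ARM architecture: ARMv6-M on the Cortex-M0, M0+ and M1, ARMv7-M on the Cortex-M3, and ARMv7E-M on the Cortex-M4 and M7. The M profile omits features such as an MMU and the classic ARM instruction set to reduce complexity and cost.
- Thumb-2 instruction set – Instructions are a mix of 16-bit and 32-bit encodings, which gives high code density without switching processor states.
- Short pipeline – Most Cortex-M cores use a 3-stage pipeline (2 stages on the Cortex-M0+); the Cortex-M7 uses a 6-stage superscalar pipeline.
- Memory Protection Unit (MPU) – On cores that offer it, an optional MPU enforces access permissions and memory attributes on a small number of regions.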
- Nested Vectored Interrupt Controller (NVIC) – The NVIC enables low latency interrupt handling.
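- Wake-up Interrupt Controller (WIC) – An optional WIC lets an interrupt wake the processor from deep-sleep low-power modes.
- Digital signal processing instructions – On the Cortex-M4 and M7, the DSP extension adds SIMD, saturating-arithmetic and single-cycle multiply-accumulate instructions for efficient signal processing.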
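To illustrate what a DSP instruction saves, the following is a minimal sketch in portable C of a signed 32-bit saturating add, the operation that the QADD instruction of the DSP extension performs in hardware. The helper name `q31_sat_add` is hypothetical and is not part of any Arm library; the hardware instruction also sets the Q flag on saturation, which this model does not track.

```c
#include <stdint.h>

/* Portable model of a signed 32-bit saturating add (the arithmetic that
   QADD performs). q31_sat_add is a hypothetical name used for illustration. */
static int32_t q31_sat_add(int32_t a, int32_t b)
{
    int64_t sum = (int64_t)a + (int64_t)b;   /* widen so the add cannot overflow */

    if (sum > INT32_MAX) {
        return INT32_MAX;                    /* clamp on positive overflow */
    }
    if (sum < INT32_MIN) {
        return INT32_MIN;                    /* clamp on negative overflow */
    }
    return (int32_t)sum;
}
```

On a core without the DSP extension a compiler would typically emit a compare-and-branch sequence along these lines; with the extension the whole function can map to a single instruction.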
Cortex-M Family
There are several processors in the Cortex-M family, offering different combinations of features, performance and power efficiency:
Cortex-M0/M0+
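The Cortex-M0 and Cortex-M0+ are ultra-low-power ARMv6-M cores designed for microcontrollers and deeply embedded applications. Key features:
- Clock frequency set by the chip implementation rather than the core; parts running at around 48 MHz are common
- 3-stage pipeline on the M0, 2-stage pipeline on the M0+
- No DSP extension or FPU; the M0+ offers an optional MPU, the M0 does not
- M0+ adds an optional single-cycle I/O port and improves energy efficiency over the M0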
Cortex-M1
The Cortex-M1 is an ARMv6-M core designed for implementation in FPGAs. Key features:
- Clock frequency depends on the target FPGA device
- 3-stage pipeline
- No MPU or DSP extension; optional tightly coupled instruction and data memories
Cortex-M3
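The Cortex-M3 implements ARMv7-M and delivers significantly higher performance than the Cortex-M0/M1 cores. Key features:
- Clock frequency set by the chip implementation; commercial parts commonly run from tens of MHz up to around 200 MHz
- 3-stage pipeline with branch speculation
- Optional MPU; hardware divide and saturation instructions, but no DSP extension
- Low-latency interrupts (12 cycles to handler entry with zero-wait-state memory)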
Cortex-M4
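Building on the M3, the Cortex-M4 implements ARMv7E-M, adding DSP instructions and an optional floating-point unit for signal processing. Key features:
- Clock frequency set by the chip implementation rather than the core
- 3-stage pipeline with branch speculation
- Optional single-precision floating-point unit
- DSP extension included as standard; optional MPU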
Cortex-M7
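The Cortex-M7 is the highest-performance core of those described here, intended for advanced embedded applications. Key features:
- Clock frequency set by the chip implementation; commercial parts run at several hundred MHz
- 6-stage pipeline
- Superscalar dual-issue for higher throughput
- Optional single- or double-precision floating-point unit
- DSP extension included as standard; optional MPU, instruction and data caches, and tightly coupled memories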
Microarchitecture
The microarchitecture of most Cortex-M processors (the M0, M1, M3 and M4) centers on a simple 3-stage integer pipeline. Instructions flow through the pipeline as follows:
- Fetch Stage – Instructions are fetched from memory into the pipeline.
- Decode Stage – Instructions are decoded into control signals.
- Execute Stage – Instructions execute and write results to the register file.
Other cores vary the depth: the Cortex-M0+ shortens the pipeline to two stages to save power, while the Cortex-M7 lengthens it to six stages with dual issue to increase performance. On the Cortex-M4, the optional FPU adds hardware acceleration of single-precision floating-point math.
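The effect of pipeline depth on timing can be sketched with a textbook model. The function below is an idealised illustration, not a cycle-accurate description of any Cortex-M core: it assumes one instruction issued per cycle, and `bubble_cycles` is a hypothetical parameter standing for all stall and flush cycles lumped together.

```c
#include <stdint.h>

/* Cycle count for an idealised in-order, single-issue pipeline.
   The first instruction needs `stages` cycles to pass through; each
   later instruction completes one cycle after its predecessor; every
   stall or flush adds one bubble cycle on top. */
static uint32_t pipeline_cycles(uint32_t instructions,
                                uint32_t stages,
                                uint32_t bubble_cycles)
{
    if (instructions == 0) {
        return 0;                       /* nothing to execute */
    }
    return stages + (instructions - 1) + bubble_cycles;
}
```

Under this model, ten instructions on a 3-stage pipeline with no stalls take `pipeline_cycles(10, 3, 0)` = 12 cycles, so throughput approaches one instruction per cycle as the instruction count grows.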
Pipeline Control
Pipeline control logic manages hazards and stalls to ensure correct instruction execution. Key mechanisms include:
- Bypass paths – Forward results between pipeline stages to avoid data hazards.
- Interlocking – Stall pipeline to avoid structural and data hazards.
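- Branch speculation and prediction – The Cortex-M3 and M4 speculatively fetch branch targets, and the Cortex-M7 adds a branch predictor, to reduce pipeline stalls. The smallest cores have neither.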
Memory Subsystem
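On the Cortex-M3 and M4, the memory subsystem provides separate instruction and data buses so that instruction fetches and data accesses can proceed simultaneously; the Cortex-M0 and M0+ use a single shared bus. On cores that support them, such as the Cortex-M1 and M7, tightly coupled memories provide low-latency access for performance-critical code and data. An optional Memory Protection Unit (MPU) can enforce privilege and memory access rules.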
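As an example of how MPU rules are expressed, the sketch below assumes the ARMv7-M MPU, in which each region is a power of two in size (minimum 32 bytes), is encoded in a SIZE field as 2^(SIZE+1) bytes, and must start at an address aligned to its size. The function names are hypothetical helpers, not CMSIS calls, and ARMv6-M and ARMv8-M MPUs use different limits and encodings.

```c
#include <stdint.h>

/* Returns the ARMv7-M MPU SIZE field for a region of `bytes` bytes,
   where region size = 2^(SIZE + 1). Returns -1 if the size is not a
   power of two between 32 bytes and 4 GB. Hypothetical helper. */
static int mpu_size_field(uint64_t bytes)
{
    int exponent = 0;

    if (bytes < 32u || bytes > (1ULL << 32)) {
        return -1;                          /* outside the encodable range */
    }
    if ((bytes & (bytes - 1u)) != 0u) {
        return -1;                          /* not a power of two */
    }
    while ((1ULL << exponent) < bytes) {
        exponent++;                         /* exponent = log2(bytes) */
    }
    return exponent - 1;
}

/* A region base address must be a multiple of the region size. */
static int mpu_base_is_aligned(uint32_t base, uint64_t bytes)
{
    if (mpu_size_field(bytes) < 0) {
        return 0;
    }
    return (base & (uint32_t)(bytes - 1u)) == 0u;
}
```

For instance, a 1 KB region gives `mpu_size_field(1024)` = 9, and a base of 0x20000400 passes the alignment check while 0x20000200 does not.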
Interrupts and Exceptions
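The Nested Vectored Interrupt Controller (NVIC) handles interrupt prioritization, preemption and register stacking in hardware, which keeps interrupt latency low. On an exception, the processor reads the handler's address from the vector table in memory and branches to it, so handlers can be written as ordinary C functions.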
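The vector table lookup can be sketched as follows. The layout assumed here is the M-profile one: word 0 holds the initial main stack pointer, words 1 to 15 hold the system exception handlers (Reset is 1, SysTick is 15), and external interrupt n uses exception number 16 + n. The function names are hypothetical, and `table_base` stands for whatever address the table is placed at (address 0 at reset, or the value of the vector table offset register on cores that have one).

```c
#include <stdint.h>

/* Number of table entries that precede the first external interrupt:
   entry 0 is the initial stack pointer, entries 1..15 are system exceptions. */
enum { SYSTEM_EXCEPTION_SLOTS = 16 };

/* Exception number of external interrupt `irq` (IRQ0 is exception 16). */
static uint32_t exception_number_for_irq(uint32_t irq)
{
    return SYSTEM_EXCEPTION_SLOTS + irq;
}

/* Address of the 32-bit vector table word that holds the handler
   address for the given exception number. */
static uint32_t vector_address(uint32_t table_base, uint32_t exception_number)
{
    return table_base + 4u * exception_number;
}
```

With the table at address 0, the Reset vector is read from 0x00000004 and the handler for IRQ0 from 0x00000040. Each stored handler address has bit 0 set to indicate Thumb state.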
Instruction Set Architecture
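Cortex-M processors execute only the Thumb instruction set with Thumb-2 technology, a compact encoding that is separate from the classic 32-bit ARM instruction set, which they do not support. ARMv6-M cores implement a small subset of Thumb-2; ARMv7-M cores implement nearly all of it. Thumb-2 provides:
- 16-bit and 32-bit instructions
- Uniform 32-bit address space
- Load/store architecture
- Conditional execution through conditional branches and, on ARMv7-M, IT (If-Then) blocks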
16-bit Thumb instructions provide basic functionality with high density. 32-bit instructions enable more complex operations and addressing modes. The unified 32-bit address space simplifies linking and loading.
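Because the two instruction widths are freely mixed, the processor determines the width from the first halfword. A minimal sketch of that rule follows: a halfword whose top five bits are 0b11101, 0b11110 or 0b11111 is the first half of a 32-bit instruction, and anything else is a complete 16-bit instruction. The function name is a hypothetical helper.

```c
#include <stdint.h>

/* Returns the length in bytes (2 or 4) of the Thumb instruction whose
   first halfword is `first_halfword`. */
static int thumb_insn_length(uint16_t first_halfword)
{
    unsigned top5 = (unsigned)first_halfword >> 11;   /* bits [15:11] */

    if (top5 == 0x1Du || top5 == 0x1Eu || top5 == 0x1Fu) {
        return 4;   /* first half of a 32-bit encoding */
    }
    return 2;       /* complete 16-bit instruction */
}
```

For example, 0x4770 (`BX LR`) and 0xBF00 (`NOP`) are 16-bit instructions, while a halfword of 0xF000 begins a 32-bit instruction such as `BL`.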
Core Register File
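The Cortex-M register file contains sixteen 32-bit registers: thirteen general-purpose registers (R0-R12), the stack pointer (SP, R13), the link register (LR, R14) and the program counter (PC, R15), plus special registers such as the program status register (xPSR). The stack pointer is banked into a main stack pointer (MSP) and a process stack pointer (PSP), so exception handlers and application threads can use separate stacks.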
Operating Modes
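The processor runs application code in Thread mode and switches to Handler mode when servicing exceptions. The current mode is reflected in the exception number field (IPSR) of the program status register: the field is zero in Thread mode and holds the number of the active exception in Handler mode.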
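As a sketch of how software can tell the two modes apart, the code below assumes the ARMv7-M layout in which the exception number (IPSR) occupies the low nine bits of the combined xPSR value: zero means Thread mode, and any other value is the number of the exception being handled. The names are hypothetical; on real hardware the value would come from reading the IPSR special register (for example via an `MRS` instruction).

```c
#include <stdint.h>

/* IPSR: exception number field, bits [8:0] of xPSR (ARMv7-M layout). */
#define IPSR_MASK 0x1FFu

/* Number of the exception currently being handled, or 0 in Thread mode. */
static uint32_t active_exception(uint32_t xpsr)
{
    return xpsr & IPSR_MASK;
}

/* Non-zero when the processor is in Handler mode. */
static int in_handler_mode(uint32_t xpsr)
{
    return active_exception(xpsr) != 0u;
}
```

A value of 0x01000000 (only the Thumb state bit set) therefore indicates Thread mode, while 0x0100000F indicates Handler mode servicing exception 15 (SysTick).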
Instruction Types
Major instruction types include:
- Data processing – Arithmetic, logical and move operations.
- Loads and stores – Move data between registers and memory.
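- Branches – Change the flow of execution with jumps, calls and returns.
- Floating point and DSP – Optional floating-point instructions (encoded in the coprocessor instruction space) and DSP instructions, on cores that implement these extensions.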
Addressing Modes
Thumb-2 supports a flexible set of addressing modes for accessing operands in registers and memory:
- Register – Operand is a processor register.
- Immediate – Operand is a constant value.
- Literal – PC-relative address, with the offset computed at assembly or link time.
- Register offset – Address computed by adding an immediate or an optionally shifted register offset to a base register.
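The address arithmetic behind the last two modes can be sketched as below. The sketch relies on two Thumb-state conventions: reading the PC yields the address of the current instruction plus 4, and literal loads use that value rounded down to a multiple of 4 as their base. The function names are hypothetical, and the shifted-index form applies to ARMv7-M encodings such as `LDR Rt, [Rn, Rm, LSL #2]`.

```c
#include <stdint.h>

/* Value the PC reads as while executing the instruction at insn_addr. */
static uint32_t thumb_pc_value(uint32_t insn_addr)
{
    return insn_addr + 4u;
}

/* Literal (PC-relative) addressing: word-aligned PC plus an immediate. */
static uint32_t literal_address(uint32_t insn_addr, uint32_t imm)
{
    return (thumb_pc_value(insn_addr) & ~3u) + imm;
}

/* Register-offset addressing: base register plus an index register
   shifted left by 0 to 3 bits. */
static uint32_t register_offset_address(uint32_t base, uint32_t index,
                                        unsigned shift)
{
    return base + (index << shift);
}
```

A literal load at 0x08000102 with an immediate of 8 thus reads from 0x0800010C, the same address as one placed at 0x08000100, because of the alignment step.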
Development Tools
Cortex-M processors are supported by a full range of development tools from Arm and partners. These include:
- Compilers – Arm Compiler 6, GCC, LLVM
- Debuggers – Arm Development Studio, Keil MDK, GDB with Eclipse-based IDEs
- Emulators and prototyping – QEMU, Arm MPS2+ and MPS3 FPGA prototyping boards
- Modeling – Fast Models, Cycle Models
- Operating Systems – Arm Mbed OS, FreeRTOS, Micrium uC/OS
High-quality development tools let software engineers write and debug Cortex-M applications productively in C and assembly language.