The Cortex-M3 processor from ARM is a 32-bit processor core designed for very low power microcontrollers. Its power efficiency comes from multiple architectural design choices that reduce power consumption during both active operation and idle periods.
Power Saving Modes
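The Cortex-M3 core itself defines two low power modes, Sleep and Deep sleep. Microcontroller vendors build further modes on top of these, so the names and exact behavior of the deeper modes below vary between devices. Together they allow the CPU and peripherals to be selectively turned off when not needed, which minimizes current draw during idle periods.
- Sleep mode – CPU clock is stopped while peripherals remain active.
- Deep sleep mode – The core signals deep sleep so the system can also stop most clocks and peripherals; RAM stays powered.
- Standby mode (vendor specific) – Most of the chip is powered down; typically only a small always-on domain and optional RAM retention logic remain powered.
- Shutdown mode (vendor specific) – Nearly the entire chip is powered off, drawing only nanoamps of current; waking up is equivalent to a reset.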
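On the core side, the choice between Sleep and Deep sleep comes down to one bit, SLEEPDEEP, in the System Control Register (SCR). The following is a minimal sketch rather than any vendor's API: the function name `select_sleep_mode` is hypothetical, and the register is passed in as a pointer so the logic can also be exercised off-target. Vendor-specific modes such as Standby usually need additional power controller registers that are not shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* System Control Register (SCR) in the Cortex-M3 System Control Block. */
#define SCB_SCR_ADDRESS  0xE000ED10u
#define SCR_SLEEPONEXIT  (1u << 1)  /* sleep again when the last ISR returns */
#define SCR_SLEEPDEEP    (1u << 2)  /* 1 = Deep sleep, 0 = Sleep */

/* Hypothetical helper: choose which mode the next WFI/WFE will enter.
 * Other SCR bits (for example SEVONPEND) are left unchanged. */
void select_sleep_mode(volatile uint32_t *scr, bool deep_sleep, bool sleep_on_exit)
{
    assert(scr != NULL);

    uint32_t value = *scr;
    value &= ~(SCR_SLEEPDEEP | SCR_SLEEPONEXIT);
    if (deep_sleep) {
        value |= SCR_SLEEPDEEP;
    }
    if (sleep_on_exit) {
        value |= SCR_SLEEPONEXIT;
    }
    *scr = value;
}
```

On a real device the call would look like `select_sleep_mode((volatile uint32_t *)SCB_SCR_ADDRESS, true, false);`. Projects that use CMSIS would normally write to `SCB->SCR` with the `SCB_SCR_SLEEPDEEP_Msk` mask instead of a raw address.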
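Entering and leaving Sleep mode is very fast and typically adds little to the normal interrupt latency, so the Cortex-M3 can sleep between interrupts and wake quickly to service them. Deeper modes save more power but take longer to wake from, because clocks and voltage regulators have to restart. In both cases the goal is to minimize the amount of time the processor needs to be fully powered on.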
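The benefit of spending most of the time asleep can be estimated from the duty cycle. The sketch below assumes a simple two-state model (fully active or asleep) and ignores wake-up overhead; the function name and the example figures are hypothetical, not measurements of any particular device.

```c
#include <assert.h>

/* Average current in microamps for a device that is active for
 * `active_ms` out of every `period_ms` and asleep for the rest. */
double average_current_ua(double active_ua, double sleep_ua,
                          double active_ms, double period_ms)
{
    assert(period_ms > 0.0);
    assert(active_ms >= 0.0 && active_ms <= period_ms);

    double duty = active_ms / period_ms;  /* fraction of time awake */
    return active_ua * duty + sleep_ua * (1.0 - duty);
}
```

With hypothetical figures of 10 mA while active, 2 µA while asleep, and 1 ms of activity per second, `average_current_ua(10000.0, 2.0, 1.0, 1000.0)` evaluates to roughly 12 µA, which is dominated by the brief active burst rather than by the sleep current.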
Clock Gating
The Cortex-M3 implements extensive clock gating within its microarchitecture. Individual logic blocks can have their clocks disabled when idle to prevent unnecessary switching that would waste power. For example, the clocks for debug and trace logic can be gated off when those features are not in use. Note that the Cortex-M3 has no floating point unit or MMU; its optional memory protection hardware is a simpler MPU.
In active mode, clocks are also gated at register level, so registers that do not need to capture a new value in a given cycle do not receive a clock edge. This fine-grained clock gating significantly reduces dynamic power consumption.
Operating Voltage
The Cortex-M3 is designed to operate at core voltages as low as 1.2V, which allows lower voltage transistors to be used. Lowering the voltage reduces dynamic power quadratically according to the equation P = C·V²·f, where C is the switched capacitance, V is the supply voltage, and f is the clock frequency. The core can even operate down to 1.0V in some cases depending on the silicon process used.
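The effect of the voltage term can be made concrete with a small calculation. This is a sketch of the dynamic power equation only; it ignores leakage, and the function names are chosen here for illustration.

```c
#include <assert.h>

/* Dynamic switching power P = C * V^2 * f, in watts. */
double dynamic_power(double c_farads, double v_volts, double f_hz)
{
    assert(c_farads >= 0.0 && f_hz >= 0.0);
    return c_farads * v_volts * v_volts * f_hz;
}

/* Ratio of dynamic power after a voltage change, with C and f held
 * constant: (v_new / v_old)^2. */
double voltage_scaling_ratio(double v_old, double v_new)
{
    assert(v_old > 0.0);
    double r = v_new / v_old;
    return r * r;
}
```

For example, `voltage_scaling_ratio(1.8, 1.2)` is about 0.44, so dropping the supply from 1.8V to 1.2V at the same clock frequency cuts dynamic power by more than half.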
Memory Architecture
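The Cortex-M3 implements the ARMv7-M architecture, which is optimized for microcontroller applications. For example, the processor includes bit banding, which maps each bit in the first 1MB of the SRAM region and the first 1MB of the peripheral region to its own word in an alias region. A single store to an alias word sets or clears one bit atomically, so software does not need a read-modify-write instruction sequence for bit operations, which saves cycles and therefore power.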
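The mapping used by bit banding is plain address arithmetic: each bit in a 1MB bit-band region (starting at 0x20000000 for SRAM and 0x40000000 for peripherals) has its own word in an alias region 32MB above the region base. The helper below is an illustrative sketch; the name `bitband_alias` is hypothetical and is not part of CMSIS.

```c
#include <assert.h>
#include <stdint.h>

#define BITBAND_SRAM_BASE     0x20000000u
#define BITBAND_PERIPH_BASE   0x40000000u
#define BITBAND_REGION_SIZE   0x00100000u  /* 1 MB per bit-band region */
#define BITBAND_ALIAS_OFFSET  0x02000000u  /* alias region starts 32 MB above base */

/* Return the alias word address for bit `bit` (0..7) of the byte at
 * `byte_addr`, or 0 if the address is outside both bit-band regions. */
uint32_t bitband_alias(uint32_t byte_addr, uint32_t bit)
{
    assert(bit < 8u);

    uint32_t base;
    if (byte_addr - BITBAND_SRAM_BASE < BITBAND_REGION_SIZE) {
        base = BITBAND_SRAM_BASE;
    } else if (byte_addr - BITBAND_PERIPH_BASE < BITBAND_REGION_SIZE) {
        base = BITBAND_PERIPH_BASE;
    } else {
        return 0u;  /* not a bit-band address */
    }

    uint32_t byte_offset = byte_addr - base;
    /* Each byte covers 8 alias words (32 bytes); each bit is one word (4 bytes). */
    return base + BITBAND_ALIAS_OFFSET + byte_offset * 32u + bit * 4u;
}
```

On target, writing 1 or 0 to `*(volatile uint32_t *)bitband_alias(addr, bit)` sets or clears that single bit. Bit 0 of the byte at 0x200FFFFF, for instance, maps to the alias word at 0x23FFFFE0.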
Microcontrollers built around the Cortex-M3 usually integrate Flash memory on chip, which avoids fetching instructions from external memory at significantly higher power. On many devices the Flash can also be placed in a low power state while the processor sleeps.
Efficient Pipeline
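The 3-stage pipeline of the Cortex-M3 is very simple and avoids power hungry techniques like history-based dynamic branch prediction, relying on simple branch speculation instead. A conditional branch that is not taken costs one cycle, while a taken branch costs one cycle plus a pipeline refill of one to three cycles. Short conditional sequences can use IT (If-Then) blocks to avoid branches altogether, so efficient conditional code can be written. The simple pipeline keeps logic to a minimum.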
Some microarchitectural techniques, such as partial forwarding and early decoding, are used to improve efficiency. In general, though, the pipeline is designed to be lean and to avoid complex, power-hungry logic.
Multicore Support
The Cortex-M3 is a single-core processor and does not implement the ARM MPCore multicore architecture, which belongs to ARM's application processor families. SoC designers can still place one or more Cortex-M3 cores alongside other processors on a single chip, and unused cores can be clock gated or powered off entirely to conserve power.
Such multicore designs also enable better power management, for example by letting a small core handle some real-time tasks or Bluetooth Low Energy functions without waking up the main processor.
Peripheral Architecture
The AMBA 3 AHB-Lite bus used in the Cortex-M3 platform is designed for low power operation. The protocol supports burst transfers, which avoid the overhead of many individual transfers. When no transfer is required, the bus master signals idle cycles, which gives idle blocks an opportunity to be clock gated.
DMA controllers, which are part of the surrounding SoC rather than the core, also avoid wasting CPU cycles on memory transfers. Smart peripherals with DMA engines can autonomously manage tasks without CPU intervention.
Silicon Process
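The Cortex-M3 is delivered as a synthesizable design, so it can be manufactured in a range of silicon geometries, including 40nm and below, and take advantage of the power and performance improvements of smaller transistors. Smaller feature sizes result in lower capacitance and lower operating voltages, and therefore lower active power, although leakage current tends to rise and has to be controlled with low-leakage process options.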
Implemented in a low power silicon process optimized for battery operation, the Cortex-M3 can achieve impressively low power numbers for a 32-bit processor. For example, the Cortex-M3 based STM32L1 series, which uses a 130nm low power process, can achieve down to 200nA in standby mode.
Tickless Operation
The Cortex-M3 supports tickless operation, a technique used by many real-time operating systems, by allowing the system timer (SysTick) to be stopped when it is not needed. This prevents the CPU from being woken at regular intervals while idle just to update the tick counter. Instead, a wake-up is scheduled only for the next time the system actually has work to do, such as the earliest pending timeout.
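The core of a tickless idle routine is working out how long the system can sleep before the next scheduled event. The sketch below shows only that calculation, under the assumption that pending timeouts are kept as absolute tick counts in an array; the function name is hypothetical, and programming the actual wake-up timer is device specific and omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return the number of ticks from `now` until the earliest deadline.
 * Returns 0 if a deadline has already passed, and UINT32_MAX if there
 * are no deadlines (the system can sleep until an external interrupt).
 * Unsigned subtraction keeps the result correct across counter wrap,
 * provided deadlines lie less than 2^31 ticks in the future. */
uint32_t ticks_until_next_deadline(const uint32_t *deadlines, size_t count,
                                   uint32_t now)
{
    assert(deadlines != NULL || count == 0u);

    uint32_t best = UINT32_MAX;
    for (size_t i = 0; i < count; i++) {
        uint32_t delta = deadlines[i] - now;  /* modulo 2^32 */
        if (delta > 0x7FFFFFFFu) {
            return 0u;  /* deadline is in the past: do not sleep */
        }
        if (delta < best) {
            best = delta;
        }
    }
    return best;
}
```

With deadlines at ticks 1500, 1200, and 4000 and the current tick at 1000, the function returns 200, so a single wake-up replaces the 200 periodic tick interrupts that would otherwise have occurred.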
Wait For Interrupt Instruction
The Cortex-M3 instruction set includes a WFI (Wait For Interrupt) instruction that places the processor into a low power state until the next interrupt occurs, along with a companion WFE (Wait For Event) instruction. This can reduce active power significantly whenever the CPU needs to wait for some external event. By executing WFI instead of burning cycles polling, power consumption is reduced.
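A typical idle loop built on WFI might look like the sketch below. The flag `event_pending` and the function names are hypothetical; the inline assembly is only compiled when targeting ARMv7-M, and a stand-in that pretends an interrupt handler ran is used elsewhere so the logic can be tried off-target.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag that an interrupt handler sets when work is ready. */
static volatile bool event_pending = false;

/* Halt the core until the next interrupt. */
static void cpu_wait_for_interrupt(void)
{
#if defined(__ARM_ARCH_7M__)
    __asm volatile ("wfi" ::: "memory");
#else
    /* Off-target stand-in: behave as if an interrupt handler just ran. */
    event_pending = true;
#endif
}

/* Sleep until *flag is set, then clear it.
 * Returns how many times the core went to sleep while waiting. */
uint32_t sleep_until_set(volatile bool *flag)
{
    assert(flag != NULL);

    uint32_t sleeps = 0;
    while (!*flag) {
        cpu_wait_for_interrupt();  /* core is halted here, not polling */
        sleeps++;
    }
    *flag = false;  /* consume the event */
    return sleeps;
}
```

In production code the flag check and the WFI are usually placed inside a short window with interrupts masked, so that an interrupt arriving between the check and the WFI is not missed; WFI still wakes the core when a masked interrupt becomes pending.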
Low Power Peripherals
Many standard peripherals on Cortex-M3 microcontrollers, such as GPIO, timers, UART, SPI, I2C, and ADC, have special low power modes to minimize current draw. I/O pins can be configured to hold their state or to drive at low power levels. Peripheral clocks can be gated off individually. DMA can manage transfers while the CPU sleeps.
Specialized peripherals such as a real-time clock module with a dedicated backup battery supply, a low power watchdog timer, and a Bluetooth Low Energy controller all enable ultra low power operation.
Power Management Controller
Some Cortex-M3 implementations include an integrated power management controller that can manage the various power modes, exercise clock gating control, and adjust voltages to optimize for low power. Having the power controller integrated and aware of the SoC state provides additional efficiency.
Summary
In summary, the Cortex-M3 combines architectural design choices made specifically for low power operation with implementations in low power silicon processes. Multiple low power sleep states, clock gating, low voltage operation, a simple pipeline, and an efficient bus architecture, together with SoC-level features such as integrated power management, low power peripherals, and the option of pairing the core with other processors, all contribute to making the Cortex-M3 an extremely power efficient 32-bit processor well suited for embedded and battery powered applications.