The Arm Cortex-M0 is a 32-bit processor core licensed by Arm. It is aimed at microcontroller applications that require a low-power, low-cost CPU with good real-time performance. The Cortex-M0 is one of the smallest and most energy-efficient Arm processors available, making it well suited to IoT and wearable devices.
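One way to measure the performance of a CPU like the Cortex-M0 is the Dhrystone benchmark. Dhrystone is a synthetic program made up of integer and string operations, and a run reports how many iterations of its main loop (Dhrystones) complete per second. Dividing that figure by 1757, the score of the VAX 11/780 used as the nominal 1 MIPS reference machine, gives Dhrystone MIPS (DMIPS). DMIPS therefore expresses performance relative to that reference machine and is not a literal count of instructions executed. The higher the DMIPS, the better the integer and string handling performance of the CPU, although results also depend on the compiler, its optimization settings, and memory wait states.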
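The conversion from a raw Dhrystone run to DMIPS and DMIPS per MHz can be sketched as follows. The function names and the sample measurement (200,000 iterations in 2.5 seconds at 48 MHz) are illustrative assumptions, not data from any particular chip.

```python
# Reference score of the VAX 11/780, which defines 1 DMIPS.
VAX_11_780_DHRYSTONES_PER_SEC = 1757


def dmips(dhrystones_per_second):
    """Convert a raw Dhrystones/second result into DMIPS."""
    return dhrystones_per_second / VAX_11_780_DHRYSTONES_PER_SEC


def dmips_per_mhz(dhrystones_per_second, clock_hz):
    """Normalize a DMIPS score by the core clock in MHz."""
    return dmips(dhrystones_per_second) / (clock_hz / 1_000_000)


if __name__ == "__main__":
    # Hypothetical measurement, for illustration only.
    iterations = 200_000
    elapsed_s = 2.5
    clock_hz = 48_000_000

    rate = iterations / elapsed_s
    print(f"{dmips(rate):.1f} DMIPS")
    print(f"{dmips_per_mhz(rate, clock_hz):.2f} DMIPS/MHz")
```

Normalizing by clock frequency is what makes scores comparable between parts that run at different speeds.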
Cortex-M0 Dhrystone Performance
Published data on the Dhrystone performance of the Cortex-M0 shows it achieves roughly 0.9 DMIPS per MHz (Arm quotes 0.87 DMIPS per MHz). This means a Cortex-M0 CPU running at 50 MHz would score around 45 DMIPS. Some specific DMIPS measurements reported for Cortex-M0 chips include:
- NXP LPC1114FN28 – 48 MHz Cortex-M0 scores 43 DMIPS
- Silicon Labs EFM32TG822 – 25 MHz, 26 DMIPS (the EFM32 Tiny Gecko family is built around a Cortex-M3 core, so this is not a like-for-like Cortex-M0 result)
- STM32F030R8 – 48 MHz Cortex-M0 scores 40 DMIPS
These results vary between Cortex-M0 implementations from vendors such as NXP and STMicroelectronics, because flash wait states, the compiler, and its optimization settings all affect the score. In general, a Cortex-M0 delivers a little under 1 DMIPS per MHz.
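The arithmetic behind these figures is simple enough to check directly. The sketch below normalizes the scores listed above by clock frequency and estimates a score from a clock speed; the helper name and the default of 0.9 DMIPS per MHz are assumptions taken from the approximate figure in the text.

```python
def estimated_dmips(clock_mhz, dmips_per_mhz=0.9):
    """Estimate a DMIPS score from clock speed, assuming ~0.9 DMIPS/MHz."""
    return clock_mhz * dmips_per_mhz


# (device, clock in MHz, reported DMIPS) as listed in the text above.
reported = [
    ("NXP LPC1114FN28", 48, 43),
    ("Silicon Labs EFM32TG822", 25, 26),
    ("STM32F030R8", 48, 40),
]

if __name__ == "__main__":
    for name, mhz, score in reported:
        print(f"{name}: {score / mhz:.2f} DMIPS/MHz")
    print(f"Estimate at 50 MHz: {estimated_dmips(50):.0f} DMIPS")
```

A per-MHz value well away from 0.9 is usually a sign that the memory configuration, the compiler settings, or the core itself differs from the others in the comparison.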
Cortex-M0 Design Efficiency
The Cortex-M0 is designed to be extremely efficient in terms of power and area. Built on a von Neumann architecture with a single bus interface for instructions and data, it has a compact 3-stage pipeline and 16 core registers (R0 to R15, of which R13, R14 and R15 serve as stack pointer, link register and program counter). The small silicon footprint allows the Cortex-M0 to be targeted at ASICs and FPGAs. Some key architectural features include:
- 3-stage pipeline – Fetch, Decode, Execute
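- Hardware multiplier, configured by the chip designer as either a single-cycle unit or a smaller 32-cycle iterative unit
- Hardware divide not supported
- Armv6-M instruction set: 16-bit Thumb instructions plus a small number of 32-bit Thumb-2 instructions (BL, MRS, MSR, DMB, DSB, ISB)
- No single-cycle I/O port (this was introduced with the Cortex-M0+)
- No cache support
- No Embedded Trace Macrocell (ETM); debug support is limited to breakpoints and watchpoints
- No Memory Protection Unit (MPU); the Cortex-M0+ offers one as an option
By omitting hardware divide, caches, trace, and memory protection, the Cortex-M0 minimizes gate count and power consumption. The 3-stage pipeline and the dense, mostly 16-bit Thumb encoding also improve efficiency by keeping code size and memory bandwidth low. Designs that need memory protection or single-cycle I/O typically move up to the Cortex-M0+.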
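Because the Cortex-M0 has no hardware divide instruction, integer division is carried out in software, typically by a compiler runtime routine such as `__aeabi_uidiv` in EABI toolchains. The sketch below illustrates the classic shift-and-subtract (restoring) algorithm in Python for readability. It is not the code any particular toolchain ships: real runtime libraries are written in optimized Thumb assembly and may use different algorithms, and the name `udiv32` is made up for this example.

```python
def udiv32(numerator, denominator):
    """Unsigned 32-bit division by shift-and-subtract.

    Returns (quotient, remainder). Illustrative only.
    """
    # Model 32-bit registers by masking to 32 bits.
    numerator &= 0xFFFFFFFF
    denominator &= 0xFFFFFFFF
    if denominator == 0:
        raise ZeroDivisionError("division by zero")

    quotient = 0
    remainder = 0
    # Process the numerator one bit at a time, most significant bit first.
    for bit in range(31, -1, -1):
        # Shift the next numerator bit into the running remainder.
        remainder = (remainder << 1) | ((numerator >> bit) & 1)
        # If the denominator fits, subtract it and set this quotient bit.
        if remainder >= denominator:
            remainder -= denominator
            quotient |= 1 << bit
    return quotient, remainder


if __name__ == "__main__":
    print(udiv32(100, 7))  # (14, 2)
```

The loop runs once per bit, which is why a software divide costs tens of cycles on the M0 and why performance-sensitive code often replaces division by a constant with shifts or multiplications.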
Cortex-M0 vs Cortex-M1
The Cortex-M1 is Arm's counterpart to the M0 for FPGA implementation, not a higher-performance alternative. Both cores implement the Armv6-M architecture and use a 3-stage pipeline, so they share the same instruction set and programmer's model. Key differences include:
- M1 is optimized for synthesis into FPGA fabric, while the M0 targets ASIC and microcontroller silicon
- M1 offers optional tightly coupled memories (ITCM and DTCM) built from FPGA block RAM
- M1 achieves roughly 0.8 DMIPS per MHz, slightly below the M0
- M1 clock speed depends on the FPGA family and speed grade
- Neither core supports hardware divide or bit-banding; those features appear in the Armv7-M Cortex-M3
Because both cores run the same Armv6-M code, software developed on an M1 in an FPGA carries over to M0-based silicon. The M0 remains popular for low-power, cost-sensitive applications needing good real-time responsiveness.
Cortex-M0 Benchmarks
In addition to Dhrystone, other common benchmarks used to test Cortex-M0 performance include:
- CoreMark – EEMBC benchmark covering list processing, matrix operations, state machines and CRC; Arm quotes 2.33 CoreMark/MHz for the Cortex-M0
- Whetstone – Floating point operations, which run through software floating-point libraries because the M0 has no FPU
- LINPACK – Linear algebra floating point, also executed in software on the M0
For real-world applications, factors like interrupt latency, context switching time, and peripheral interfacing also affect how an application behaves on a Cortex-M0. Lower-level profiling of pipeline stalls, flash wait states, and taken branches can provide further optimization insights. Because the core has no cache, cache misses are not a factor.
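Cycle-level figures such as interrupt latency translate into wall-clock time once the clock frequency is known. The sketch below performs that conversion. The 16-cycle value is Arm's published interrupt latency for the Cortex-M0 with zero-wait-state memory and should be treated as an assumption here, since real systems add cycles for flash wait states; the function and parameter names are illustrative.

```python
# Assumed best-case interrupt latency in core clock cycles
# (Arm's published figure for zero-wait-state memory).
INTERRUPT_LATENCY_CYCLES = 16


def cycles_to_ns(cycles, clock_hz):
    """Convert a cycle count into nanoseconds at the given clock."""
    return cycles * 1e9 / clock_hz


def interrupt_latency_ns(clock_hz, extra_wait_cycles=0):
    """Estimate interrupt latency, allowing for extra memory wait cycles."""
    return cycles_to_ns(INTERRUPT_LATENCY_CYCLES + extra_wait_cycles, clock_hz)


if __name__ == "__main__":
    for mhz in (8, 24, 48):
        latency = interrupt_latency_ns(mhz * 1_000_000)
        print(f"{mhz} MHz: {latency:.1f} ns")
```

Lowering the clock to save power lengthens every cycle-bound response time in proportion, which is the trade-off to weigh when choosing an operating frequency.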
Cortex-M0 Competitors
As one of the most popular 32-bit embedded microcontroller cores, the Cortex-M0 has many competitors including:
- PIC32MX1xx/2xx – Microchip Technology
- AVR Xmega – Microchip Technology
- MSP430 – Texas Instruments
- STM8 – STMicroelectronics
- RISC-V – Open source ISA
The AVR XMEGA and STM8 are 8-bit microcontrollers and the MSP430 is 16-bit, while the PIC32MX is a 32-bit MIPS-based family; all serve a similar market. RISC-V is an open standard ISA with a growing number of microcontroller-class implementations. Arm’s Cortex-M series retains the advantage of being widely supported across vendors and toolchains.
Cortex-M0 Use Cases
The Cortex-M0 targets simple, low-cost embedded applications including:
- Internet of Things (IoT) devices
- Wearables and hearables
- Home automation and appliances
- Toys and gaming
- Industrial sensors and equipment
- Medical devices and instruments
Its low power profile and real-time capabilities make the M0 a good fit for battery-powered sensors needing wireless connectivity. Cortex-M0 cores can be found in Wi-Fi, Bluetooth, Zigbee, Thread and other IoT products, often as the application or protocol processor, thanks to their cost/power/performance mix.
Conclusion
The Arm Cortex-M0 achieves roughly 0.9 DMIPS per MHz in the Dhrystone benchmark, making it well suited for low-power microcontroller applications. Its compact 3-stage pipeline enables an efficient 32-bit architecture while minimizing silicon area and power draw. The Cortex-M0’s combination of efficiency, real-time responsiveness and low cost has led to its widespread adoption in IoT, wearables, home automation, industrial devices and other embedded use cases.