ARM’s Cortex-M series of processor cores has become ubiquitous in the embedded systems market since the first core was announced in 2004. From tiny, ultra-low-power microcontrollers to high-performance embedded controllers, the Cortex-M cores have enabled rapid development of low-power, high-performance designs across a diverse range of markets including industrial, automotive, consumer electronics and IoT.
The Origins of the Cortex-M
The origins of the Cortex-M series can be traced back to ARM’s classic 32-bit cores that were popular in the 1990s and early 2000s. Cores such as the ARM7TDMI and ARM9TDMI, along with their compact 16-bit Thumb instruction encoding, formed the foundation for ARM’s push into the microcontroller market, which had long been dominated by 8-bit and 16-bit devices.
In 2004, ARM announced the Cortex-M3, the first Cortex-M core, designed from the ground up as a high-performance 32-bit processor for deeply embedded applications. It implemented the ARMv7-M architecture, which included the Thumb-2 instruction set (a mix of 16-bit and 32-bit encodings for good code density), an integrated Nested Vectored Interrupt Controller (NVIC) and an optional memory protection unit (MPU).
The performance and power efficiency of the Cortex-M3 made it a success in a wide range of embedded products. With the Thumb-2 instruction set, ARM quoted performance of about 1.25 DMIPS/MHz. The M3’s 3-stage pipeline allowed clock speeds from tens of MHz to well over 100 MHz, depending on the silicon implementation and process, while maintaining low power consumption. The M3 executes only Thumb and Thumb-2 instructions, so existing Thumb code from ARM7 and ARM9 cores could largely be reused, whereas code written for the 32-bit ARM instruction set had to be recompiled.
Cortex-M Evolution and Expansion
Following the success of the Cortex-M3, ARM continued expanding the Cortex-M family both down and up the performance spectrum from the late 2000s onward. This enabled scalable solutions for embedded designers needing different performance points.
In 2009, ARM announced the Cortex-M0, a small, low-cost core targeting ultra-low-power and size-constrained applications such as wearables and wireless sensors. Based on the ARMv6-M architecture, it featured a 3-stage pipeline and a very low gate count, and was typically clocked at tens of MHz in microcontroller implementations. The M0 provided an entry point for simple 8/16-bit designs needing to migrate to 32-bit.
At the higher end, ARM unveiled the Cortex-M4 in 2010, adding digital signal processing (DSP) instructions (single-cycle multiply-accumulate, SIMD and saturating arithmetic) and an optional single-precision floating point unit (FPU). This enabled advanced math-intensive software in embedded analytics, motor control, industrial automation and similar applications. The M4 kept the M3’s 3-stage pipeline, with maximum clock speeds determined by the silicon implementation.
As ARM added new cores, it took care to maintain compatibility across the Cortex-M family, so that software written for a smaller core generally runs on a larger one. This allowed developers to scale their designs between different performance points with minimal software changes. ARM also kept features like debug, memory architecture and bus interfaces consistent between the cores.
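To make the DSP point concrete, here is a minimal sketch of the kind of loop such instructions are meant to speed up: a dot product of 16-bit fixed-point (Q15) samples with a saturating accumulator. It is written in portable C without intrinsics, and the names saturate_i32 and q15_dot_sat are hypothetical helpers chosen for illustration, not part of any ARM library. On a core with DSP instructions, a compiler or an optimized library may map the multiply-accumulate and saturation steps to dedicated instructions; this version only shows the arithmetic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Clamp a 64-bit value into the signed 32-bit range, so overflow
 * saturates instead of wrapping around. */
static int32_t saturate_i32(int64_t v)
{
    if (v > INT32_MAX) return INT32_MAX;
    if (v < INT32_MIN) return INT32_MIN;
    return (int32_t)v;
}

/* Hypothetical helper: dot product of two Q15 vectors with a
 * saturating 32-bit accumulator. */
int32_t q15_dot_sat(const int16_t *a, const int16_t *b, size_t n)
{
    assert(a != NULL && b != NULL);
    int32_t acc = 0;
    for (size_t i = 0; i < n; i++) {
        /* A 16x16-bit product always fits in 32 bits. */
        int32_t prod = (int32_t)a[i] * (int32_t)b[i];
        /* Widen to 64 bits for the add, then clamp back to 32 bits. */
        acc = saturate_i32((int64_t)acc + prod);
    }
    return acc;
}
```

Saturation matters in signal processing because a wrapped-around sum flips sign and produces loud glitches or unstable control output, whereas a clamped sum degrades gracefully.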
Advanced Capabilities and the Cortex-M7
Through the 2010s, ARM saw increasing demand for higher performance, stronger security and software isolation in embedded devices. In response, it continued expanding the Cortex-M family’s capabilities.
The Cortex-M23 core, unveiled in 2016, specifically targeted ultra-low-power IoT applications with security needs. Based on the ARMv8-M Baseline architecture, it was among ARM’s first Cortex-M cores with the TrustZone security extension, which separates secure and non-secure software in hardware. Its optional MPU additionally lets OS-based designs isolate software modules in memory.
For high-performance embedded computing needs, ARM announced the Cortex-M7 in 2014. It pushed the performance envelope of the Cortex-M series, integrating a six-stage superscalar dual-issue pipeline, DSP instructions and an optional FPU. At its launch, the M7 also had ARM’s highest CoreMark/MHz score for a Cortex-M processor, quoted at about 5 CoreMark/MHz.
For applications like industrial control and automotive that need stronger security and isolation, ARM unveiled the Cortex-M33 core in 2016. It implements the ARMv8-M Mainline architecture and was one of the first Cortex-M cores to offer the optional TrustZone security extension, together with an improved MPU and optional FPU and DSP instructions.
The Machine Learning-Capable Cortex-M55
By the late 2010s, ARM observed soaring demand for machine learning capabilities at the edge. Applications ranging from voice assistants to predictive maintenance needed to run ML models locally on power-constrained devices.
To address this, ARM announced the Cortex-M55 in 2020, the first Cortex-M processor built on the ARMv8.1-M architecture and the first with the Helium vector extension (MVE) for DSP and machine learning workloads. Alongside it, ARM introduced the Ethos-U55, a separate micro-NPU that can be paired with the M55 for efficient inference. ARM claimed up to a 15x uplift in ML performance for the M55 compared to earlier Cortex-M cores, with much larger gains when the Ethos-U55 is added.
Pairing an M55 with an Ethos-U55 allows embedded systems to combine critical control tasks with ML workloads on low-power edge nodes. ARM also provides machine learning kernels such as CMSIS-NN in its CMSIS libraries to simplify application development.
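As a rough illustration of the workload involved, inference on microcontrollers is typically dominated by 8-bit integer multiply-accumulate loops like the fully connected layer sketched below. This is a plain C sketch under stated assumptions, not ARM’s or CMSIS’s implementation: the function name fc_int8_relu and its parameter layout (row-major weights, 32-bit bias and output) are hypothetical, the requantization step that real deployments apply to the output is omitted, and the layer is assumed small enough that the 32-bit accumulator cannot overflow.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: fully connected layer with int8 weights and
 * inputs, int32 accumulation and a ReLU activation.
 * out[r] = max(0, bias[r] + sum over c of w[r * cols + c] * x[c]) */
void fc_int8_relu(const int8_t *w, const int8_t *x, const int32_t *bias,
                  int32_t *out, size_t rows, size_t cols)
{
    assert(w != NULL && x != NULL && bias != NULL && out != NULL);
    for (size_t r = 0; r < rows; r++) {
        int32_t acc = bias[r];
        for (size_t c = 0; c < cols; c++) {
            /* Widen both operands before multiplying: the inner loop
             * is a chain of 8-bit multiply-accumulates. */
            acc += (int32_t)w[r * cols + c] * (int32_t)x[c];
        }
        /* ReLU: negative activations are clamped to zero. */
        out[r] = acc > 0 ? acc : 0;
    }
}
```

The inner loop performs the same operation on many independent 8-bit values, which is why vector extensions and dedicated accelerators, both of which process several such elements per step, can improve efficiency so markedly over a scalar core.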
The Future of the Cortex-M Series
While earlier cores such as the M0, M3 and M4 drove much of the early growth in Cortex-M adoption, newer cores like the M33 and M55 are increasingly chosen for new designs. ARM’s roadmap indicates it will continue expanding the capabilities of the Cortex-M series while maintaining a consistent architecture.
Ongoing areas of focus include security, functional safety, machine learning, advanced connectivity, application-specific acceleration and heterogeneous computing. ARM is also putting increased emphasis on software and tooling ecosystems to simplify development.
With tens of billions of chips shipped to date in products ranging from fitness bands to industrial robots, the Cortex-M series has been instrumental in enabling the global proliferation of smart connected devices. Its continued evolution will help fuel innovation across the intelligent edge for years to come.