The ARM Cortex-M3 and Cortex-M33 are two popular ARM processor cores designed for embedded and IoT applications. Both offer high performance and efficiency in a small footprint, making them well-suited for constrained and battery-powered devices.
The key difference between the M3 and M33 is that the M33 implements the newer Armv8-M Mainline architecture, which brings optional support for Arm’s TrustZone technology. TrustZone allows secure and non-secure software to run in isolation on the same core. The M33 also adds hardware stack limit checking, a more flexible MPU, and optional DSP and floating-point extensions. Overall, the M33 builds on the strengths of the M3 with advanced security features.
Architecture
The Cortex-M3 and M33 share a similar programming model but implement different architecture versions: the M3 implements Armv7-M, while the M33 implements Armv8-M Mainline. Architectural features common to both include:
- 3-stage pipeline – Fetch, Decode, Execute
- Advanced single-cycle multiply instructions
- Low latency interrupt handling
- Optional MPU for memory protection
- Efficient Thumb-2 instruction set
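This efficient RISC pipeline enables high performance while minimizing power consumption. Arm rates the M3 at 1.25 DMIPS/MHz and the M33 at about 1.5 DMIPS/MHz. The M3 has no DSP extension, whereas the M33 offers optional DSP instructions (SIMD and saturating arithmetic) and an optional single-precision FPU.
There are also some microarchitecture differences. The M33 keeps an in-order 3-stage pipeline but can dual-issue some pairs of 16-bit instructions, which raises throughput without a deeper pipeline or speculative branch prediction. For debug, both cores support an optional Embedded Trace Macrocell (ETM), and the M33 can additionally be configured with a Micro Trace Buffer (MTB) for low-cost instruction trace.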
Performance
In terms of raw performance, the Cortex-M33 generally outperforms the M3 thanks to microarchitecture improvements. Arm's published per-MHz figures are approximately:
- Dhrystone: 1.25 DMIPS/MHz (M3), 1.5 DMIPS/MHz (M33)
- CoreMark: 3.3 CoreMark/MHz (M3), 4.0 CoreMark/MHz (M33)
Real-world performance will depend on the specific chip implementation and clock speed. But overall, the M33 can execute more instructions per cycle leading to higher throughput for many workloads.
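Because benchmark scores are normalized per MHz, a rough estimate for a specific part is the per-MHz score multiplied by its clock frequency. The following is a minimal sketch of that arithmetic. The function names `estimated_score` and `speedup` are hypothetical, and both the per-MHz score and the clock are inputs that should come from the vendor's data for the actual chip.

```c
#include <assert.h>

/* Hypothetical helper: scales a per-MHz benchmark score (DMIPS/MHz or
 * CoreMark/MHz) by the core clock to give an absolute score.
 * Both inputs are assumptions supplied by the caller. */
double estimated_score(double score_per_mhz, double clock_mhz)
{
    assert(score_per_mhz >= 0.0 && clock_mhz >= 0.0);
    return score_per_mhz * clock_mhz;
}

/* Ratio of two absolute scores: how many times faster B is than A. */
double speedup(double score_a, double score_b)
{
    assert(score_a > 0.0);
    return score_b / score_a;
}
```

For example, at an assumed clock of 100 MHz, a core rated at 1.25 DMIPS/MHz would be estimated at 125 DMIPS. Flash wait states and bus contention on a real device can lower the achieved figure.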
Memory System
Both processors support Harvard architecture with separate instruction and data buses. This allows simultaneous access to program and data memory.
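The bus interfaces differ between the two cores. The M3 exposes ICode, DCode and System buses based on AHB-Lite. The M33 exposes a Code AHB (C-AHB) and a System AHB (S-AHB) based on AHB5, which carries a security attribute with each transfer so that the rest of the system can distinguish secure from non-secure accesses. The M33's optional MPU also follows the Armv8-M model, in which each region is defined by a base and a limit address rather than a power-of-two size.
In terms of addressable memory, both cores use 32-bit addresses and therefore support up to 4GB of address space. On the M33, that same address space is partitioned into secure and non-secure regions by the Security Attribution Unit (SAU) together with any Implementation Defined Attribution Unit (IDAU) the chip vendor provides.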
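A 32-bit address reaches 2^32 bytes (4 GB), and the Cortex-M architecture divides that range into fixed regions in its default system address map. The sketch below checks the size arithmetic and classifies an address by region. The type and function names are hypothetical, and a given chip populates only part of each region, so the device reference manual remains the authority for real addresses.

```c
#include <assert.h>
#include <stdint.h>

/* Regions of the default Cortex-M system address map. */
typedef enum {
    REGION_CODE,            /* 0x00000000 - 0x1FFFFFFF */
    REGION_SRAM,            /* 0x20000000 - 0x3FFFFFFF */
    REGION_PERIPHERAL,      /* 0x40000000 - 0x5FFFFFFF */
    REGION_EXTERNAL_RAM,    /* 0x60000000 - 0x9FFFFFFF */
    REGION_EXTERNAL_DEVICE, /* 0xA0000000 - 0xDFFFFFFF */
    REGION_SYSTEM           /* 0xE0000000 - 0xFFFFFFFF */
} mem_region_t;

/* Total bytes reachable with the given number of address bits (0..63). */
uint64_t address_space_bytes(unsigned address_bits)
{
    assert(address_bits < 64u);
    return (uint64_t)1 << address_bits;
}

/* The top three address bits select one of eight 512 MB blocks. */
mem_region_t classify_address(uint32_t addr)
{
    switch (addr >> 29) {
    case 0:  return REGION_CODE;
    case 1:  return REGION_SRAM;
    case 2:  return REGION_PERIPHERAL;
    case 3:
    case 4:  return REGION_EXTERNAL_RAM;
    case 5:
    case 6:  return REGION_EXTERNAL_DEVICE;
    default: return REGION_SYSTEM;
    }
}
```

On a TrustZone-enabled part, security attribution is layered on top of this map rather than replacing it.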
Instruction Set
The Cortex-M3 and M33 utilize Arm’s Thumb-2 instruction set. This provides a balance of high code density with improved performance compared to previous Thumb-only designs.
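The two instruction sets are almost identical. The key additions in the M33 are:
- TrustZone security extension – Adds the SG, BXNS, BLXNS and TT (test target) instructions
- Load-acquire and store-release instructions – Adds LDA, STL and their exclusive variants for synchronization
- Optional DSP and single-precision floating-point extensions
Armv8-M Mainline is a superset of Armv7-M at the instruction level, so code written for the M3 generally runs on the M33. The reverse holds only for code that avoids the M33 additions, and system-level code such as MPU setup needs porting because the register layout differs.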
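When one code base targets both cores, the architecture can be selected at compile time. The sketch below relies on the predefined macros `__ARM_ARCH_7M__` and `__ARM_ARCH_8M_MAIN__` that GCC, Clang and Arm Compiler set for these architectures, and on the ACLE macro `__ARM_FEATURE_DSP`. The enum and function names are hypothetical, and the fallback branch exists only so the code also builds on a host machine.

```c
/* Architecture the translation unit is being compiled for. */
typedef enum {
    TARGET_ARMV7M,       /* e.g. Cortex-M3 */
    TARGET_ARMV8M_MAIN,  /* e.g. Cortex-M33 */
    TARGET_OTHER         /* any other target, including a host PC */
} target_arch_t;

target_arch_t build_target_arch(void)
{
#if defined(__ARM_ARCH_8M_MAIN__)
    return TARGET_ARMV8M_MAIN;
#elif defined(__ARM_ARCH_7M__)
    return TARGET_ARMV7M;
#else
    return TARGET_OTHER;
#endif
}

/* Returns 1 when the compiler is allowed to emit DSP-extension
 * instructions for this build, 0 otherwise. */
int build_has_dsp_extension(void)
{
#if defined(__ARM_FEATURE_DSP) && (__ARM_FEATURE_DSP == 1)
    return 1;
#else
    return 0;
#endif
}
```

Guarding M33-only code paths this way keeps the shared sources buildable for the M3.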
TrustZone Security
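The major difference between the Cortex-M3 and M33 is the optional TrustZone security extension in the M33. When it is implemented, the processor has a secure and a non-secure state, each with its own stack pointers, MPU configuration and SysTick timer.
TrustZone provides isolation and protection for secure code, data and peripherals. Hardware blocks accesses to them from the non-secure state, laying the foundation for trusted execution environments.
Non-secure software enters the secure state by calling an entry point that begins with the Secure Gateway (SG) instruction and is located in a memory region marked Non-secure Callable. Secure code returns to, or calls into, the non-secure side with the BXNS and BLXNS instructions. The secure state can access both secure and non-secure resources, while non-secure accesses are limited to resources attributed as non-secure.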
Overall, TrustZone enables robust security crucial for sensitive applications like financial transactions, content protection, authentication and more.
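In C, a secure entry point is normally declared with the `cmse_nonsecure_entry` attribute defined by Arm's CMSE support, and the toolchain then generates the gateway veneer when the secure image is built with `-mcmse`. The following is a minimal sketch under stated assumptions: the `NSC_ENTRY` macro, the counter and the function `secure_counter_add` are hypothetical, and the fallback branch only lets the same logic be compiled and tested off-target.

```c
#include <stdint.h>

#if defined(__ARM_FEATURE_CMSE) && (__ARM_FEATURE_CMSE == 3)
/* Secure-state build: mark the function as callable from non-secure code. */
#include <arm_cmse.h>
#define NSC_ENTRY __attribute__((cmse_nonsecure_entry))
#else
/* Any other build: plain function, so the logic can be tested on a host. */
#define NSC_ENTRY
#endif

/* Hypothetical secure-side state. On a real device this would be placed
 * in memory attributed as Secure, out of reach of non-secure code. */
static uint32_t secure_counter;

/* Adds step to the counter and returns the new value.
 * The counter saturates at UINT32_MAX instead of wrapping. */
NSC_ENTRY uint32_t secure_counter_add(uint32_t step)
{
    if (step > UINT32_MAX - secure_counter) {
        secure_counter = UINT32_MAX;
    } else {
        secure_counter += step;
    }
    return secure_counter;
}
```

The entry point takes and returns plain values on purpose. An entry function that accepts pointers from the non-secure side would also need to validate them, for which CMSE provides `cmse_check_address_range`.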
Branch Target Identification (BTI)
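Branch Target Identification is often mentioned alongside TrustZone, but it is not a Cortex-M33 feature. BTI belongs to the PACBTI extension introduced with Armv8.1-M and is available on later cores such as the Cortex-M85. Within that extension, pointer authentication (PAC) defends against Return Oriented Programming (ROP) attacks, while BTI defends against Jump Oriented Programming (JOP) attacks.
BTI works by marking valid indirect branch targets with a dedicated instruction and having the processor check for that marker after an indirect branch. This detects and prevents jumps to unauthorized code sequences.
On the M33 itself, the security improvements over the M3 come from TrustZone, hardware stack limit checking through the MSPLIM and PSPLIM registers, and the more flexible Armv8-M MPU.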
Power Efficiency
Power efficiency is a critical metric for embedded devices. Both cores are designed for low power consumption, combining architected sleep modes in the core with power management that the chip vendor builds around it.
Key power features include:
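- Sleep and deep-sleep modes, entered with the WFI and WFE instructions or automatically with sleep-on-exit
- Optional Wakeup Interrupt Controller (WIC), which lets the core be powered down while interrupts can still wake it
- Clock gating within the core
- Dynamic voltage and frequency scaling and power gating, where the chip vendor implements them at the SoC level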
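Power figures for these cores are normally quoted as dynamic power per MHz for a given process node and configuration, while static (leakage) power does not scale with clock frequency. Arm's published dynamic-power figures for both cores are on the order of 10 μW/MHz on a 40nm low-power process, and a figure quoted on 28nm cannot be compared directly with one quoted on 40nm. Actual consumption depends on the silicon vendor's implementation, so the device datasheet is the authoritative source.
Overall, the M33 can achieve better energy efficiency than the M3 even at similar power per MHz. Because it completes more work per clock cycle, a given task finishes sooner and the device can spend more time in a sleep mode.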
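The effect of sleep modes on a battery budget can be estimated with a duty-cycle model: average power is the time-weighted mix of active and sleep power, and energy per task is active power multiplied by task duration. The sketch below shows that arithmetic. The function names are hypothetical, and every input value is an assumption to be replaced with datasheet numbers for the actual device.

```c
#include <assert.h>

/* Average power of a duty-cycled device, in microwatts.
 * active_uw : power while the core is running
 * sleep_uw  : power while the core is in a sleep mode
 * duty      : fraction of time spent active, 0.0 to 1.0 */
double average_power_uw(double active_uw, double sleep_uw, double duty)
{
    assert(duty >= 0.0 && duty <= 1.0);
    return duty * active_uw + (1.0 - duty) * sleep_uw;
}

/* Energy used by one task, in microjoules: power (uW) x time (s). */
double task_energy_uj(double active_uw, double task_seconds)
{
    assert(task_seconds >= 0.0);
    return active_uw * task_seconds;
}
```

With illustrative inputs of 1000 μW active, 8 μW asleep and a 50% duty cycle, the model gives an average of 504 μW. Halving the time a task takes at the same active power halves its energy, which is why higher per-cycle throughput can translate into longer battery life.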
Development Tools
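Both the Cortex-M3 and M33 can be programmed using Arm’s Keil MDK toolkit, which is built around the µVision IDE and provides a debugger and compilers for C and assembly code development. Other common options include the GCC-based Arm GNU Toolchain and IAR Embedded Workbench.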
CMSIS libraries provide standardized interfaces to simplify software development across different microcontrollers. Both cores are supported by extensive CMSIS libraries.
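For TrustZone development, toolchains for the M33 support the Cortex-M Security Extensions (CMSE), typically enabled with the -mcmse compiler option, and build the secure and non-secure sides as separate images. Debuggers for TrustZone-enabled parts are security-aware, so access to secure code and data during debugging can be restricted.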
Licensing and Cost
The Cortex-M3 and Cortex-M33 processor IP can be licensed from Arm for integration into custom chips and SoCs. Typical license fees are based on volume, process node and other factors.
Arm does not publish its license pricing, so no reliable percentage can be given for the difference. The M33 is generally expected to cost more than the M3, reflecting its newer architecture and the addition of TrustZone.
However, overall SoC cost depends heavily on other components beyond just the core license fee. Large production volumes can amortize the upfront IP cost.
Comparison Summary
In summary, the Cortex-M33 builds on the strong foundation of the M3 with enhancements in security, performance, efficiency and memory protection:
- Security – Adds optional TrustZone and hardware stack limit checking
- Performance – Higher per-MHz throughput (about 1.5 vs 1.25 DMIPS/MHz), plus optional DSP and FPU extensions
- Memory – Same 4GB address space, with a more flexible MPU and security attribution of memory regions
- Power – Similar power per MHz, with less energy per task when work completes in fewer cycles
The M33 maintains code compatibility with added features useful for IoT and connected applications. Overall, it provides a compelling upgrade path for demanding embedded designs.