Is Neon available with Cortex-M or Cortex-A series?

Holly Lindsey
Last updated: September 12, 2023 1:10 pm

The short answer is no: ARM’s Neon SIMD instruction set extension is not available on any Cortex-M series processor. Neon is part of the A-profile architecture and is found mainly on Cortex-A series application processors aimed at higher performance requirements.

Contents
  • Introduction to ARM’s Neon Technology
  • Cortex-M Series and Neon
  • Role of Cortex-M and Cortex-A Processors
  • Final Thoughts

Introduction to ARM’s Neon Technology

Neon (formally called Advanced SIMD) is ARM’s single instruction, multiple data (SIMD) architecture extension, introduced with the ARMv7-A architecture and carried forward into ARMv8-A and later. It provides SIMD processing capabilities to Cortex-A series application processors, enabling improved performance for multimedia, signal processing, and other computationally intensive workloads.

Neon supports 64-bit and 128-bit SIMD vector processing, allowing operations to be performed on multiple data elements concurrently using a single instruction. This can significantly boost performance for workloads that exhibit data parallelism.
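As a rough illustration of "one instruction, several data elements", the sketch below adds two float arrays four elements at a time. The function name `add_f32` is made up for this example. The intrinsics (`vld1q_f32`, `vaddq_f32`, `vst1q_f32`) come from `arm_neon.h`, and the code assumes the compiler defines `__ARM_NEON` when Neon is available. On any other target it falls back to a plain loop.

```c
#include <stddef.h>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: out[i] = a[i] + b[i] for i in [0, n). */
void add_f32(const float *a, const float *b, float *out, size_t n)
{
    size_t i = 0;
#if defined(__ARM_NEON)
    /* Four 32-bit floats fit in one 128-bit Q register. */
    for (; i + 4 <= n; i += 4) {
        float32x4_t va = vld1q_f32(a + i);
        float32x4_t vb = vld1q_f32(b + i);
        vst1q_f32(out + i, vaddq_f32(va, vb));
    }
#endif
    /* Scalar tail; also the entire loop on targets without Neon. */
    for (; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
}
```

The scalar tail handles lengths that are not a multiple of four, so callers do not need to pad their buffers.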

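Some of the key features of Neon include:

  • 64-bit and 128-bit SIMD vector registers
  • Support for 8, 16, 32 and 64-bit integer and single-precision floating point data types (double precision is added in AArch64)
  • Saturating arithmetic and rounding operations
  • Advanced SIMD load/store instructions for aligned and unaligned access, including interleaved structure loads and stores
  • Multiply-accumulate instructions, the building block for matrix multiplication and 2D convolution kernels
  • Optional cryptographic instructions (AES, SHA) that use the same registers, on cores that implement the Cryptographic Extension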
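Saturating arithmetic from the list above can be sketched as follows. With 8-bit pixel data, a normal add wraps around on overflow (200 + 100 becomes 44), whereas a saturating add clamps to 255. The helper name `add_sat_u8` is an assumption for this example; `vqaddq_u8` is the Neon intrinsic for a saturating add on sixteen unsigned bytes. The scalar branch shows the same behaviour for targets without Neon.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: out[i] = min(a[i] + b[i], 255). */
void add_sat_u8(const uint8_t *a, const uint8_t *b, uint8_t *out, size_t n)
{
    size_t i = 0;
#if defined(__ARM_NEON)
    /* Sixteen bytes per 128-bit register, clamped by the hardware. */
    for (; i + 16 <= n; i += 16) {
        uint8x16_t va = vld1q_u8(a + i);
        uint8x16_t vb = vld1q_u8(b + i);
        vst1q_u8(out + i, vqaddq_u8(va, vb));
    }
#endif
    /* Scalar tail and non-Neon fallback with explicit clamping. */
    for (; i < n; ++i) {
        unsigned sum = (unsigned)a[i] + (unsigned)b[i];
        out[i] = (uint8_t)(sum > 255u ? 255u : sum);
    }
}
```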

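Whether Neon is present depends on both the architecture version and the chip designer. Among ARMv7-A cores, Cortex-A8 always includes Neon, while on cores such as Cortex-A9 it is an implementation option that some SoCs omit. On ARMv8-A cores such as Cortex-A53, floating point and Advanced SIMD remain configurable in principle, but practically every general-purpose device ships with them. Software that must run across a range of chips should therefore check for Neon rather than assume it.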
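Because Neon may or may not be present, portable code usually checks for it before using it. A minimal compile-time sketch is shown below. It relies on the `__ARM_NEON` and `__aarch64__` predefined macros. The enum and function names are invented for this example.

```c
/* Hypothetical names; only the predefined macros are real. */
enum simd_backend {
    SIMD_SCALAR   = 0,  /* no Neon: plain C loops */
    SIMD_NEON_A32 = 1,  /* Neon on a 32-bit ARM target */
    SIMD_NEON_A64 = 2   /* Advanced SIMD on AArch64 */
};

/* Reports which code path this file was compiled for. */
enum simd_backend simd_backend_compiled(void)
{
#if defined(__ARM_NEON) && defined(__aarch64__)
    return SIMD_NEON_A64;
#elif defined(__ARM_NEON)
    return SIMD_NEON_A32;
#else
    return SIMD_SCALAR;
#endif
}

/* Human-readable label for logging. */
const char *simd_backend_name(enum simd_backend b)
{
    switch (b) {
    case SIMD_NEON_A64: return "neon (aarch64)";
    case SIMD_NEON_A32: return "neon (aarch32)";
    default:            return "scalar";
    }
}
```

This only reflects how the binary was built. A binary that must decide at run time, for example on 32-bit Linux where Neon is optional, would typically query the kernel's hardware capability bits through `getauxval(AT_HWCAP)` instead.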

Cortex-M Series and Neon

The Cortex-M series of processors from ARM are efficient, low-power microcontroller cores designed for embedded and IoT applications. They prioritize power efficiency, deterministic real-time performance, and minimal silicon area over raw processing performance.

Unlike high-end application processors, most Cortex-M cores are short, in-order, single-issue pipelines without out-of-order execution or elaborate branch prediction. Higher-end parts such as the dual-issue Cortex-M7 are the exception, and even they remain in-order. Cortex-M cores also have relatively simple memory subsystems compared to high performance application processors.

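As a result, no Cortex-M processor implements Neon. Cortex-M is not entirely without SIMD, though: cores with the DSP extension (such as Cortex-M4, Cortex-M7 and Cortex-M33) provide instructions that operate on packed 8-bit and 16-bit values in 32-bit registers, and ARMv8.1-M cores such as Cortex-M55 can include Helium (MVE), a vector extension designed specifically for microcontrollers. The key reasons Neon itself is left out are:

  • Neon is defined as part of the A-profile architecture, while the M-profile is a separate architecture with its own extensions
  • The small memory systems and narrow buses of typical microcontrollers would often struggle to keep 128-bit vector units fed
  • Saving and restoring a large vector register file on interrupts works against low-latency, deterministic real-time behavior
  • Embedded microcontroller applications often do not need high math performance
  • Neon increases core complexity, silicon area and power consumption
  • Additional instruction set extensions add toolchain and software complexity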

Enabling Neon requires significant microarchitectural changes and optimizations that go against the design goals of simplicity, efficiency and real-time determinism for Cortex-M series. The power and area overhead is difficult to justify given most microcontroller applications do not need SIMD acceleration.

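Where higher math performance is needed, a Cortex-M design can offload processing to dedicated math accelerators and DSPs optimized for signal processing workloads, or use a Cortex-M core that includes one of the M-profile's own extensions, such as the DSP extension or Helium (MVE).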

Role of Cortex-M and Cortex-A Processors

The Cortex-M and Cortex-A series have very different design goals and target applications. This leads to different architectural trade-offs regarding performance, power and cost:

  • Cortex-M – Microcontrollers for real-time applications like motor control, industrial automation, IoT sensors etc. Focused on power efficiency, determinism, minimal area.
  • Cortex-A – Application processors for devices like smartphones, tablets, computers. Optimized for high performance and advanced capabilities like computer vision, multimedia, gaming etc.

While Cortex-M forgoes power-hungry capabilities like Neon that are not needed for embedded use cases, Cortex-A application processors include these to address performance-critical application domains.

Neon provides a major performance boost for workloads like image processing, video encoding/decoding, speech recognition, physics simulations, machine learning inferencing etc. These workloads involve large amounts of vector and matrix data parallelism that Neon can efficiently accelerate.

For example, Neon can speed up convolution layers in neural networks by processing multiple input and filter values concurrently. This results in significantly faster deep learning inferencing compared to scalar execution.
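The inner loop of a convolution is a multiply-accumulate over input and filter values, which the following sketch vectorizes. It computes a single output sample as a dot product. The function name `dot_f32` is hypothetical. `vmlaq_f32` multiplies four input/filter pairs and accumulates them in one step, and the final horizontal sum uses `vadd_f32` and `vpadd_f32` so that it builds on both 32-bit and 64-bit ARM targets. Without Neon, the scalar loop does all the work.

```c
#include <stddef.h>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: returns the sum of x[i] * w[i] for i in [0, n). */
float dot_f32(const float *x, const float *w, size_t n)
{
    float sum = 0.0f;
    size_t i = 0;
#if defined(__ARM_NEON)
    float32x4_t acc = vdupq_n_f32(0.0f);
    /* Four multiply-accumulates per iteration. */
    for (; i + 4 <= n; i += 4) {
        acc = vmlaq_f32(acc, vld1q_f32(x + i), vld1q_f32(w + i));
    }
    /* Reduce the four partial sums to one value. */
    float32x2_t half = vadd_f32(vget_low_f32(acc), vget_high_f32(acc));
    half = vpadd_f32(half, half);
    sum = vget_lane_f32(half, 0);
#endif
    /* Scalar tail and non-Neon fallback. */
    for (; i < n; ++i) {
        sum += x[i] * w[i];
    }
    return sum;
}
```

Note that the vector path adds the products in a different order than the scalar loop, so floating point results can differ in the last bits for general inputs. Production kernels typically go further with multiple accumulators and loop unrolling.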

By including Neon, Cortex-A series processors like Cortex-A73, Cortex-A76 and Cortex-A77 provide the computational horsepower needed for complex workloads in mobile and laptop-class computing, and ARM's server-oriented Neoverse cores include it as well. The power and area trade-offs are acceptable given application performance requirements.

Final Thoughts

In summary, Neon SIMD acceleration is not suitable for the design goals and embedded target applications of Cortex-M class microcontroller cores. The power and complexity overheads cannot be justified.

Neon provides major performance benefits for high performance application processors like Cortex-A series that need to handle advanced workloads like AI inferencing, 3D graphics, image processing etc. The overhead is acceptable given their higher performance requirements.

The division of ARM CPU cores into the efficiency-oriented Cortex-M series and higher performance Cortex-A series allows optimal architectural trade-offs for vastly different use cases from microcontrollers to servers.

So in systems that need both, it often makes sense to pair a Cortex-M microcontroller for real-time control tasks with a Cortex-A application processor for number crunching workloads. The strengths of both core types can then be leveraged via coordination over a software interface.
