What are Helium vector instructions in Arm Cortex-M series?

David Moore
Last updated: September 18, 2023 2:25 am
Helium vector instructions, formally the M-profile Vector Extension (MVE), are a set of SIMD instructions introduced with the Armv8.1-M architecture and first implemented in the Arm Cortex-M55. They provide significant performance improvements for signal processing, machine learning, and digital signal control applications. The key benefit of Helium is that each instruction operates on a 128-bit vector holding sixteen 8-bit, eight 16-bit, or four 32-bit elements. The Cortex-M55 processes 64 bits of a vector per clock cycle, so a typical vector instruction completes in two cycles. This allows developers to achieve much higher performance for workloads involving vector math, matrix operations, FFTs, convolutions, and other computational tasks on Cortex-M series microcontrollers.

Contents
  • Overview of Helium Vector Extension
  • Helium Vector Registers
  • Helium Instruction Set
  • Benefits of Helium
  • Using Helium in C Code
  • Processor Support
  • Conclusion

Overview of Helium Vector Extension

The Arm Helium technology is a new vector extension for the Cortex-M processor family. It provides a set of 128-bit wide vector registers and associated SIMD instructions that operate on these registers.

Key features of Helium include:

  • 128-bit vectors that hold sixteen 8-bit, eight 16-bit, or four 32-bit integer elements
  • Arithmetic and logical operations (addition, subtraction, multiplication, shifting, etc.) applied to every element of a vector at once
  • Vector load/store instructions for efficient data transfer
  • Dot product instructions for ML workloads
  • Intrinsic functions for C programmers

Helium is an optional extension of the Armv8.1-M architecture. The Cortex-M33 implements the earlier Armv8-M architecture and does not support it. The first microcontroller core to support Helium is the Cortex-M55.

Compared to the earlier DSP extension, whose SIMD instructions pack four 8-bit or two 16-bit values into a 32-bit general-purpose register, Helium works on 128-bit vectors (sixteen 8-bit or eight 16-bit values per instruction) held in dedicated vector registers. This dramatically boosts performance on key workloads.

Helium Vector Registers

The Helium extension provides eight 128-bit wide vector registers named Q0-Q7, which share storage with the floating-point register file. Each register can be treated as:

  • Sixteen 8-bit elements
  • Eight 16-bit elements
  • Four 32-bit elements

The element width is chosen by the instruction rather than by the register: the same vector register can be processed as 8-bit, 16-bit, or 32-bit elements as needed. These registers supply the operands for the Helium SIMD instructions.
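To make the lane views concrete, here is a minimal sketch in portable C that models one 128-bit vector register as a union. The type `vreg128_model_t` and the helper names are hypothetical and exist only for illustration; real Helium code would use the vector types from `arm_mve.h` instead.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of one 128-bit vector register; not an Arm type. */
typedef union {
    uint8_t  u8[16];  /* sixteen 8-bit lanes */
    uint16_t u16[8];  /* eight 16-bit lanes  */
    uint32_t u32[4];  /* four 32-bit lanes   */
} vreg128_model_t;

/* All three views cover the same 16 bytes of storage. */
_Static_assert(sizeof(vreg128_model_t) == 16, "model must be 128 bits wide");

/* Number of lanes for a given element width in bits (8, 16 or 32). */
unsigned lanes_for_width(unsigned element_bits)
{
    assert(element_bits == 8 || element_bits == 16 || element_bits == 32);
    return 128u / element_bits;
}

/* Fill the register through its 8-bit view with the values 1..16. */
void fill_bytes(vreg128_model_t *r)
{
    for (unsigned i = 0; i < 16; i++) {
        r->u8[i] = (uint8_t)(i + 1);
    }
}
```

The same 16 bytes are visible through all three views, which mirrors how one vector register serves instructions of different element widths.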

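In addition, many arithmetic instructions have saturating variants (for example VQADD) that clamp results to the range of the data type instead of wrapping around on overflow. Saturation is selected by the instruction itself; a sticky status flag (QC in the FPSCR register) only records that saturation has occurred.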
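The difference between wrapping and saturating arithmetic can be sketched per lane in plain C. The function names below are hypothetical scalar models, not Arm APIs; on a Helium target the vector counterparts would typically be the `vaddq_u8` and `vqaddq_u8` style intrinsics from `arm_mve.h`.

```c
#include <stdint.h>

/* Wrapping add: the result is reduced modulo 256, as with a plain add. */
uint8_t add_wrap_u8(uint8_t a, uint8_t b)
{
    return (uint8_t)(a + b);
}

/* Saturating add: the result is clamped to the uint8_t maximum of 255. */
uint8_t add_sat_u8(uint8_t a, uint8_t b)
{
    unsigned sum = (unsigned)a + (unsigned)b;
    return (uint8_t)(sum > UINT8_MAX ? UINT8_MAX : sum);
}

/* Signed saturating add for 8-bit lanes: clamp to [-128, 127]. */
int8_t add_sat_s8(int8_t a, int8_t b)
{
    int sum = (int)a + (int)b;
    if (sum > INT8_MAX) return INT8_MAX;
    if (sum < INT8_MIN) return INT8_MIN;
    return (int8_t)sum;
}
```

For example, adding 200 and 100 as unsigned 8-bit values wraps to 44 but saturates to 255, which is usually the preferred behavior for audio samples and pixel data.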

Helium Instruction Set

Helium provides a range of vector instructions that operate on the Q registers. Key instruction categories include:

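  • Arithmetic – Add, subtract, multiply, absolute difference etc. Many have saturating variants.
  • Logical – Bitwise AND, OR, XOR, NOT etc.
  • Shift – Logical and arithmetic shift left/right by an immediate or register amount
  • Dot Product – Multiply the elements of two Q registers and accumulate the products into a scalar (VMLADAV and related instructions)
  • Load/Store – Load or store a Q register from memory, including interleaving (VLD2/VLD4, VST2/VST4) and gather/scatter forms
  • Table Lookup – Gather loads fetch table entries from memory using a vector of offsets
  • Permute – Element reversal (VREV) and narrowing/widening moves. Unlike Neon, Helium has no zip/unzip instructions, so data is usually interleaved or de-interleaved by the VLD2/VLD4 and VST2/VST4 instructions.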

By combining these instructions, most common vector and matrix operations can be implemented efficiently. The intrinsics provide higher level access to these instructions from C code.
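As an illustration of the dot product category, the following sketch computes a 16-bit dot product with a 32-bit accumulator. It assumes a toolchain that defines the ACLE macro `__ARM_FEATURE_MVE` and provides `arm_mve.h` when targeting a Helium core; on any other target it falls back to a scalar loop, so it can also be built on a desktop. The function name `dot_s16` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_FEATURE_MVE) && (__ARM_FEATURE_MVE & 1)
#include <arm_mve.h>
#endif

/* Dot product of two int16_t arrays with a 32-bit accumulator.
 * The caller must ensure that the sum fits in int32_t. */
int32_t dot_s16(const int16_t *a, const int16_t *b, size_t n)
{
    assert(n == 0 || (a != NULL && b != NULL));
    int32_t acc = 0;
#if defined(__ARM_FEATURE_MVE) && (__ARM_FEATURE_MVE & 1)
    while (n > 0) {
        /* The predicate enables min(n, 8) lanes; disabled lanes load as 0. */
        mve_pred16_t p = vctp16q((uint32_t)n);
        int16x8_t va = vldrhq_z_s16(a, p);
        int16x8_t vb = vldrhq_z_s16(b, p);
        /* acc += sum of va[i] * vb[i] over all eight lanes */
        acc = vmladavaq_s16(acc, va, vb);
        size_t step = n < 8 ? n : 8;
        a += step;
        b += step;
        n -= step;
    }
#else
    /* Scalar fallback for targets without Helium. */
    for (size_t i = 0; i < n; i++) {
        acc += (int32_t)a[i] * (int32_t)b[i];
    }
#endif
    return acc;
}
```

Using zeroing predicated loads for the final partial vector means the inactive lanes contribute nothing to the sum, so no separate scalar tail loop is needed.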

Benefits of Helium

Here are some of the major benefits provided by the Helium vector extension to Cortex-M processors:

  • Higher Performance – Each instruction processes up to 16 elements, which improves throughput for parallel workloads
  • Power Efficiency – Better utilization of core resources reduces energy per operation
  • Easy to use – Intrinsic functions integrate seamlessly with C/C++ code
  • Small Code Size – Compact ISA implementation suitable for MCUs
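  • Scalable – The same instructions run on implementations that process 32, 64, or 128 bits of a vector per cycle, so code carries over between Helium-capable cores of different performance levels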

In particular, Helium enables acceleration of:

  • Digital signal processing algorithms (filtering, FFTs etc.)
  • Computer vision and image processing
  • Machine learning inference using neural networks
  • Sensor fusion in IoT and edge devices
  • Control algorithms and predictive maintenance
  • Any workload involving vector/matrix math

This allows Cortex-M cores to achieve much higher throughput on these workloads while maintaining low cost and power efficiency.

Using Helium in C Code

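To use the Helium instructions in C/C++ code, Arm provides a set of intrinsic functions, declared in arm_mve.h, that map directly to the Helium ISA. Intrinsic names end in q followed by a type suffix such as _u8 or _s16. Some examples are:

  • vhaddq – Halving vector add (adds elements, then halves each result)
  • vaddq – Vector add
  • vaddvq – Sum across all elements of a vector (horizontal add)
  • vldrbq, vld1q – Vector load
  • vstrbq, vst1q – Vector store
  • vld2q – De-interleaving load (Helium has no zip intrinsic)
  • vmaxq – Element-wise vector maximum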

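Here is a simple example for vector addition:

    #include <arm_mve.h>

    /* Adds 16 unsigned 8-bit elements; a, b and res must each
       point to at least 16 bytes. */
    void add_vectors(uint8_t *res, const uint8_t *a, const uint8_t *b) {
        uint8x16_t va = vld1q_u8(a);      /* load 16 bytes from a */
        uint8x16_t vb = vld1q_u8(b);      /* load 16 bytes from b */
        uint8x16_t vc = vaddq_u8(va, vb); /* element-wise add, wraps on overflow */
        vst1q_u8(res, vc);                /* store 16 result bytes */
    }

This loads two vectors of sixteen 8-bit integers, adds them element by element, and stores the result. The compiler maps each intrinsic to the corresponding Helium instruction.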
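A single-vector function only covers a fixed number of elements. For arrays of arbitrary length, a common approach is a loop with tail predication, sketched below. This assumes the `arm_mve.h` intrinsics `vctp8q`, `vldrbq_z_u8`, `vaddq_u8` and `vstrbq_p_u8` on a Helium target, with a scalar fallback elsewhere; the function name `add_arrays_u8` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_FEATURE_MVE) && (__ARM_FEATURE_MVE & 1)
#include <arm_mve.h>
#endif

/* res[i] = a[i] + b[i] (wrapping) for i in [0, n). */
void add_arrays_u8(uint8_t *res, const uint8_t *a, const uint8_t *b, size_t n)
{
    assert(n == 0 || (res != NULL && a != NULL && b != NULL));
#if defined(__ARM_FEATURE_MVE) && (__ARM_FEATURE_MVE & 1)
    while (n > 0) {
        /* The predicate enables min(n, 16) lanes for this iteration. */
        mve_pred16_t p = vctp8q((uint32_t)n);
        uint8x16_t va = vldrbq_z_u8(a, p);
        uint8x16_t vb = vldrbq_z_u8(b, p);
        /* Only the enabled lanes are written back to memory. */
        vstrbq_p_u8(res, vaddq_u8(va, vb), p);
        size_t step = n < 16 ? n : 16;
        a += step;
        b += step;
        res += step;
        n -= step;
    }
#else
    /* Scalar fallback for targets without Helium. */
    for (size_t i = 0; i < n; i++) {
        res[i] = (uint8_t)(a[i] + b[i]);
    }
#endif
}
```

Because the final store is predicated, the loop never writes past the end of `res`, even when the length is not a multiple of 16.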

Arm also provides optimized implementations of common functions like matrix multiply, FIR filters, softmax etc. in its CMSIS-DSP and CMSIS-NN libraries, which include Helium code paths. These can be used to quickly implement complex algorithms without dealing directly with intrinsics.

Processor Support

Helium was first implemented in the Cortex-M55 processor, announced in 2020, which was also the first implementation of the Armv8.1-M architecture. The Cortex-M85, announced in 2022, supports Helium as well.

The Cortex-M55 is a new core design rather than a Cortex-M33 with an added accelerator: the Helium unit is integrated into the processor pipeline and provides significant speedups for workloads optimized with Helium intrinsics. For heavier ML workloads, the core can be paired with a separate NPU such as the Arm Ethos-U55.

Arm has stated that Helium will be adopted across the M-profile roadmap over time, so further Cortex-M cores can be expected to support it.

Helium is an optional part of the Armv8.1-M architecture. Any Armv8.1-M core may implement it, but a given chip may also be configured without it, so it is worth checking the device documentation.
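Because Helium is optional, portable code often checks for it at compile time. The sketch below assumes the ACLE feature macro `__ARM_FEATURE_MVE`, where bit 0 indicates the integer vector extension and bit 1 the floating-point vector extension; compilers typically define it when building for a Helium core (for example with an option such as `-mcpu=cortex-m55`). The function names are hypothetical.

```c
/* Returns 0 if Helium is unavailable at compile time,
 * 1 for integer-only Helium, 2 for integer plus floating-point Helium. */
int helium_level(void)
{
#if defined(__ARM_FEATURE_MVE) && (__ARM_FEATURE_MVE & 2)
    return 2;
#elif defined(__ARM_FEATURE_MVE) && (__ARM_FEATURE_MVE & 1)
    return 1;
#else
    return 0;
#endif
}

/* Human-readable description of a level returned by helium_level(). */
const char *helium_level_name(int level)
{
    switch (level) {
    case 2:  return "integer and floating-point vectors";
    case 1:  return "integer vectors only";
    default: return "no Helium support";
    }
}
```

The same macro test can guard a Helium code path with a scalar fallback, so one source file builds for both Helium and non-Helium Cortex-M targets.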

Conclusion

In summary, Helium vector instructions provide SIMD parallel processing capabilities to Cortex-M series processors, unlocking much higher performance and efficiency. The combination of compact ISA, easy programming through intrinsics, and scalability across the M-profile family enables new applications in signal processing, computer vision, control systems and machine learning.

As Helium gets adopted in more microcontrollers, it will become a key differentiating feature for the Cortex-M processors compared to competing architectures. The ability to accelerate advanced algorithms involving vector math while maintaining deterministic real-time performance allows Arm to target a wide range of embedded applications with Cortex-M series cores.
