ARM FPU Instruction Set

Graham Kruk
Last updated: September 8, 2023 10:48 am

The ARM Floating Point Unit (FPU) provides hardware support for calculations using floating point numbers. The FPU instruction set allows ARM processors to perform mathematical operations efficiently on single precision and double precision floating point values.

Contents
  • Overview of ARM FPU
  • FPU Data Types
  • FPU Instructions
      • Data Transfer
      • Arithmetic
      • Comparison
      • Conversion
      • Status and Control
  • Programming with the FPU
  • ARM FPU Architectures
      • VFP (Vector Floating Point)
      • VFPv2
      • VFPv3 / VFPv4
      • FPv5
  • Summary

Overview of ARM FPU

The ARM FPU is an optional extension to the ARM instruction set architecture. It provides hardware acceleration for floating point arithmetic, which improves performance compared to doing the computations in software. In many implementations the FPU has its own pipeline alongside the ARM integer pipeline, so floating point and integer instructions can overlap in execution.

There have been several generations of ARM FPU designs over the years. Precision support depends on the variant: the application-class VFP generations handle both single precision (32-bit) and double precision (64-bit) values, while some microcontroller FPUs are single precision only:

  • VFP (VFPv1) – Original design, now obsolete
  • VFPv2 – Optional in ARMv5TE, ARMv5TEJ and ARMv6; 16 doubleword registers
  • VFPv3 – ARMv7; 16 or 32 doubleword registers (VFPv3-D16 / VFPv3-D32)
  • VFPv4 – ARMv7; adds fused multiply-add
  • FPv5 – FPU for Cortex-M class cores (ARMv7E-M and ARMv8-M), for example Cortex-M7

The FPU registers are separate from the ARM general purpose registers. The register bank can be viewed as 32 single precision registers (s0-s31) or as doubleword registers (d0-d15 in VFPv2 and the D16 variants, d0-d31 in the D32 variants of VFPv3 and VFPv4). The two views share storage: s(2n) and s(2n+1) are the low and high halves of d(n), so s0-s31 overlay d0-d15, and d16-d31 have no single precision aliases.
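The overlay of single and double precision registers can be illustrated with a small software model. This is only a sketch: the type `vfp_bank_model` and the helpers `write_s`, `read_s`, `write_d` and `read_d` are hypothetical names invented here, and the model assumes a 16-doubleword bank with IEEE 754 `float` and `double` on the host.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of a 16-doubleword VFP register bank.
   word[2n] is s(2n) and the low half of d(n);
   word[2n+1] is s(2n+1) and the high half of d(n). */
typedef struct {
    uint32_t word[32];
} vfp_bank_model;

void write_s(vfp_bank_model *bank, unsigned n, float value)
{
    assert(n < 32);
    memcpy(&bank->word[n], &value, sizeof value);
}

float read_s(const vfp_bank_model *bank, unsigned n)
{
    float value;
    assert(n < 32);
    memcpy(&value, &bank->word[n], sizeof value);
    return value;
}

void write_d(vfp_bank_model *bank, unsigned n, double value)
{
    uint64_t bits;
    assert(n < 16);
    memcpy(&bits, &value, sizeof bits);
    bank->word[2 * n]     = (uint32_t)bits;          /* low half  */
    bank->word[2 * n + 1] = (uint32_t)(bits >> 32);  /* high half */
}

double read_d(const vfp_bank_model *bank, unsigned n)
{
    uint64_t bits;
    double value;
    assert(n < 16);
    bits = ((uint64_t)bank->word[2 * n + 1] << 32) | bank->word[2 * n];
    memcpy(&value, &bits, sizeof value);
    return value;
}
```

In this model, writing d1 changes what s2 and s3 read back, and writing s3 changes the upper half of d1. That is why code mixing precisions has to treat an S pair and its D register as the same storage.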

FPU Data Types

The ARM FPU supports the following floating point data types:

  • Single precision (32-bit) – Uses the IEEE 754 single precision format. Occupies one S register.
  • Double precision (64-bit) – Uses the IEEE 754 double precision format. Occupies one D register, which shares storage with two S registers in the d0-d15 range.

Each floating point value is stored as three fields:

  • Sign bit – 1 bit determining positive or negative value.
  • Exponent – 8 bits for single precision (bias 127), 11 bits for double precision (bias 1023).
  • Mantissa – 23 fraction bits for single precision, 52 for double. Normalized values have an implicit leading 1 that is not stored.

This format allows a wide range of magnitudes to be represented in a fixed number of bits.
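To make the field layout concrete, the following sketch splits a single precision value into its sign, exponent and fraction fields. The names `f32_fields`, `f32_decode` and `f32_unbiased_exponent` are made up for this illustration, and the code assumes the host `float` is IEEE 754 binary32.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical container for the three fields of a binary32 value. */
typedef struct {
    unsigned sign;      /* 1 bit                         */
    unsigned exponent;  /* 8 bits, stored with bias 127  */
    uint32_t fraction;  /* 23 bits, implicit leading 1   */
} f32_fields;

f32_fields f32_decode(float value)
{
    uint32_t bits;
    f32_fields f;
    memcpy(&bits, &value, sizeof bits);   /* reinterpret without aliasing issues */
    f.sign     = bits >> 31;
    f.exponent = (bits >> 23) & 0xFFu;
    f.fraction = bits & 0x7FFFFFu;
    return f;
}

/* Only meaningful for normalized values: exponent 0 marks zero or a
   denormal, exponent 0xFF marks infinity or NaN. */
int f32_unbiased_exponent(f32_fields f)
{
    assert(f.exponent != 0 && f.exponent != 0xFFu);
    return (int)f.exponent - 127;
}
```

For example, -6.5 is -1.625 x 2^2, so it decodes to sign 1, stored exponent 129 and fraction 0x500000 (0.625 x 2^23).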

FPU Instructions

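The ARM FPU instructions can be grouped into several categories. The mnemonics below are the older pre-UAL VFP names, mostly shown without the S (single) or D (double) suffix that selects the operand precision. Unified Assembler Language (UAL) renames them to V-prefixed forms, for example FADDS becomes VADD.F32 and FADDD becomes VADD.F64.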

Data Transfer

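Move data between memory, FPU registers and ARM registers:

  • FLDMX – Load multiple FPU registers from memory (unknown-precision form, deprecated; FLDMS and FLDMD are the typed forms)
  • FSTMX – Store multiple FPU registers to memory (unknown-precision form, deprecated; FSTMS and FSTMD are the typed forms)
  • FMSR – Move ARM register to single precision FPU register
  • FMRS – Move single precision FPU register to ARM register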

Arithmetic

Basic arithmetic operations:

  • FADD – Floating point add
  • FSUB – Floating point subtract
  • FMUL – Floating point multiply
  • FDIV – Floating point divide
  • FSQRT – Floating point square root

Comparison

Compare floating point values:

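  • FCMP – Floating point compare; raises Invalid Operation only for a signaling NaN operand
  • FCMPE – Floating point compare; raises Invalid Operation for any NaN operand
  • FCMPZ – Floating point compare with zero
  • FCMPEZ – Floating point compare with zero; raises Invalid Operation for any NaN operand

These set the N, Z, C and V flags in the FPSCR. The flags must be copied to the ARM status flags with FMSTAT before conditional instructions can test them.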
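A floating point compare has four possible outcomes, because a NaN operand makes the result unordered. The sketch below models the flag pattern a VFP compare leaves in the FPSCR for each outcome. The function name `vfp_compare_nzcv` and the `FLAG_*` constants are hypothetical; the function models the result and does not execute the instruction.

```c
#include <math.h>

/* Flag positions within a 4-bit NZCV value (hypothetical names). */
enum { FLAG_N = 8, FLAG_Z = 4, FLAG_C = 2, FLAG_V = 1 };

/* Returns the NZCV pattern for comparing a with b:
     a <  b    -> N        (1000)
     a == b    -> Z and C  (0110)
     a >  b    -> C        (0010)
     unordered -> C and V  (0011), when either operand is NaN */
unsigned vfp_compare_nzcv(float a, float b)
{
    if (isunordered(a, b)) {
        return FLAG_C | FLAG_V;
    }
    if (a == b) {
        return FLAG_Z | FLAG_C;
    }
    if (a < b) {
        return FLAG_N;
    }
    return FLAG_C;
}
```

Because the unordered case sets C and V, a condition chosen for integer compares can behave unexpectedly on NaN input. It is worth checking which condition codes treat unordered as true before relying on them.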

Conversion

Convert between data types:

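  • FTOSI – Floating point to signed integer
  • FTOUI – Floating point to unsigned integer
  • FSITO – Signed integer to floating point
  • FUITO – Unsigned integer to floating point
  • FTOSID – Double precision to signed integer (FTOSIS is the single precision form)
  • FTOUID – Double precision to unsigned integer (FTOUIS is the single precision form)

FTOSI and FTOUI round according to the rounding mode selected in the FPSCR. The FTOSIZ and FTOUIZ variants always round toward zero.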
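Float-to-integer conversion differs between hardware and plain C in the corner cases. A C cast truncates toward zero but is undefined for NaN and out-of-range inputs, whereas the VFP conversions saturate and convert NaN to 0. The sketch below shows a C conversion with those saturating, round-toward-zero semantics. The function names are hypothetical, and this is a model of the behavior rather than the instruction itself.

```c
#include <assert.h>
#include <stdint.h>

/* Float to signed 32-bit integer, rounding toward zero, saturating. */
int32_t f32_to_s32_rz(float x)
{
    if (x != x) {
        return 0;                    /* NaN converts to 0 */
    }
    if (x >= 2147483648.0f) {
        return INT32_MAX;            /* saturate high */
    }
    if (x <= -2147483648.0f) {
        return INT32_MIN;            /* saturate low */
    }
    /* The cast is only defined for in-range values, guaranteed here. */
    assert(x > -2147483648.0f && x < 2147483648.0f);
    return (int32_t)x;               /* C casts truncate toward zero */
}

/* Float to unsigned 32-bit integer, rounding toward zero, saturating. */
uint32_t f32_to_u32_rz(float x)
{
    if (x != x || x <= 0.0f) {
        return 0;                    /* NaN and negative inputs give 0 */
    }
    if (x >= 4294967296.0f) {
        return UINT32_MAX;           /* saturate high */
    }
    return (uint32_t)x;
}
```

A conversion that honors the current rounding mode would give different results for values such as 2.7, which this round-toward-zero version converts to 2.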

Status and Control

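Manage FPU status flags and control modes:

  • FMXR – Move general purpose register to an FPU system register such as the FPSCR
  • FMRX – Move an FPU system register such as the FPSCR to a general purpose register
  • FMSTAT – Copy the FPSCR comparison flags to the ARM status flags

FMSR and FMRS, despite their similar names, transfer single precision data between ARM and FPU registers and belong to the data transfer group.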
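Control modes live in individual bit fields of the FPSCR, so changing a mode is a read-modify-write of that register. The sketch below only manipulates a 32-bit value, on the assumption that it was read with FMRX and will be written back with FMXR. It performs no register access, and the macro and function names are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* FPSCR field positions used in this sketch. */
#define FPSCR_DN           (1u << 25)   /* default NaN mode          */
#define FPSCR_FZ           (1u << 24)   /* flush-to-zero mode        */
#define FPSCR_RMODE_SHIFT  22           /* rounding mode, bits 23:22 */
#define FPSCR_RMODE_MASK   (3u << FPSCR_RMODE_SHIFT)

enum fpscr_rmode {
    RMODE_NEAREST   = 0,   /* round to nearest, ties to even */
    RMODE_PLUS_INF  = 1,   /* round toward plus infinity     */
    RMODE_MINUS_INF = 2,   /* round toward minus infinity    */
    RMODE_ZERO      = 3    /* round toward zero              */
};

/* Return a copy of fpscr with the rounding mode field replaced. */
uint32_t fpscr_with_rmode(uint32_t fpscr, enum fpscr_rmode mode)
{
    assert((unsigned)mode <= 3u);
    return (fpscr & ~FPSCR_RMODE_MASK) | ((uint32_t)mode << FPSCR_RMODE_SHIFT);
}

enum fpscr_rmode fpscr_get_rmode(uint32_t fpscr)
{
    return (enum fpscr_rmode)((fpscr & FPSCR_RMODE_MASK) >> FPSCR_RMODE_SHIFT);
}

/* Return a copy of fpscr with flush-to-zero and default NaN enabled. */
uint32_t fpscr_with_ftz_and_dn(uint32_t fpscr)
{
    return fpscr | FPSCR_FZ | FPSCR_DN;
}
```

Keeping the helpers as pure functions on a value keeps the bit manipulation separate from the privileged or target-specific step of accessing the register.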

Programming with the FPU

Here are some key aspects to keep in mind when coding with the ARM FPU:

  • On many implementations the FPU can operate in parallel with the integer pipeline.
  • Plan data transfers to minimize stalls – load data before it is needed.
  • Maximize throughput by scheduling FPU and integer instructions together.
  • Pay attention to data dependencies and pipeline stalls.
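  • Copy the FPSCR comparison flags to the ARM status flags (FMSTAT) before any conditional instruction that depends on a floating point compare.
  • Consider enabling flush-to-zero and default NaN modes when strict IEEE 754 behavior is not required; they trade compliance for speed on many implementations.
  • Choose the precision of each variable deliberately. Single precision is usually faster, and it is the only type handled in hardware on single precision FPUs.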
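One common way precision slips in C is through an unsuffixed constant: `0.1` is a `double`, so multiplying a `float` by it promotes the whole expression to double precision. On an FPU without double precision hardware, that arithmetic is typically done in software. The sketch below contrasts the two forms; the function names are made up for the example.

```c
#include <assert.h>

/* The expression types are fixed by the C language, independent of target. */
static_assert(sizeof(1.0f * 0.1f) == sizeof(float),
              "float * float stays in single precision");
static_assert(sizeof(1.0f * 0.1) == sizeof(double),
              "float * double is promoted to double precision");

/* Stays in single precision: the constant carries the f suffix. */
float scale_single(float x)
{
    return x * 0.1f;
}

/* Promoted: x is converted to double, multiplied, then narrowed again. */
float scale_promoted(float x)
{
    return (float)(x * 0.1);
}
```

Both functions return nearly the same value, so the difference does not show up in results. It shows up in the generated code and in run time.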

Proper use of the FPU can provide large performance gains over software floating point for floating point intensive code. Applications such as 3D graphics, scientific computing, statistics, and digital signal processing benefit most from hardware accelerated floating point arithmetic.

ARM FPU Architectures

There have been several generations of ARM FPU implementations over time. Key enhancements include:

VFP (Vector Floating Point)

  • Initial ARM FPU design (VFPv1), introduced in the ARMv5 era.
  • Established the separate floating point register bank used by later generations.
  • 32 x 32-bit single precision registers.
  • Pipelined for high throughput.
  • Predates the Cortex-A series and is now obsolete.

VFPv2

  • Optional extension in the ARMv5TE, ARMv5TEJ and ARMv6 architectures.
  • Single and double precision arithmetic.
  • 32 x 32-bit single precision registers.
  • 16 x 64-bit double precision registers, overlaid on the single precision registers.
  • Improved pipelining and multi-processing.

VFPv3 / VFPv4

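  • Evolutionary improvements over VFPv2 for the ARMv7 architecture.
  • Faster context switching and register access.
  • 32 doubleword registers in the D32 variants, shared with the Advanced SIMD (NEON) unit; the D16 variants keep 16.
  • More execution units for higher throughput.
  • VFPv3 appears in Cortex-A8 and Cortex-A9. VFPv4, which adds fused multiply-add, appears in Cortex-A5, Cortex-A7 and Cortex-A15.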

FPv5

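  • FPU architecture for Cortex-M class cores (ARMv7E-M and ARMv8-M), not for the 64-bit ARMv8-A application profile.
  • IEEE 754-2008 compliant.
  • Single precision, with double precision as an option on cores such as Cortex-M7.
  • Scalar only; it has no SIMD or cryptography instructions of its own.
  • The 64-bit cores (Cortex-A35, A53, A55 and newer) instead use the ARMv8-A floating point and Advanced SIMD instructions, with cryptography as a separate optional extension.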

Each FPU generation expanded the capabilities and performance of floating point computation on ARM chips. The evolution continues as ARM adds new instructions and capabilities to support emerging workloads.
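Because hardware support varies this much between generations, portable code often checks at compile time what the target FPU handles. The sketch below uses the ACLE predefined macro `__ARM_FP`, a bit mask in which bit 1 indicates half precision, bit 2 single precision and bit 3 double precision hardware support. The `FP_HW_*` constants and function names are hypothetical, and the functions report no support when the macro is absent, for example on a non-ARM host.

```c
/* Bit meanings of the ACLE macro __ARM_FP (names are hypothetical). */
#define FP_HW_HALF    0x2u
#define FP_HW_SINGLE  0x4u
#define FP_HW_DOUBLE  0x8u

/* Precisions the compiler may use in hardware for this target,
   or 0 when __ARM_FP is not defined. */
unsigned fp_hw_precisions(void)
{
#if defined(__ARM_FP)
    return (unsigned)__ARM_FP;
#else
    return 0u;
#endif
}

int fp_hw_has_single(void)
{
    return (fp_hw_precisions() & FP_HW_SINGLE) != 0;
}

int fp_hw_has_double(void)
{
    return (fp_hw_precisions() & FP_HW_DOUBLE) != 0;
}
```

A build for a single precision FPU would report single but not double support. That is a cue to keep `double` out of hot paths on such a target.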

Summary

The ARM floating point unit provides hardware acceleration for mathematical calculations using single and double precision floating point values. Its dedicated FPU registers and pipelined execution improve performance substantially over software floating point on integer-only cores. Proper use of the FPU instruction set and data types can greatly speed up code involving complex math, 3D graphics, signal processing, and scientific computations.
