Double-precision (DP) floating-point instructions are operations on 64-bit IEEE 754 double-precision data on Arm Cortex-M series processors. They allow more precise computation on fractional values than 32-bit single-precision operations. Hardware support comes from an optional floating-point unit (FPU) configuration: the Cortex-M4 FPU is single-precision only, while the Cortex-M7 and later cores such as the Cortex-M55 and Cortex-M85 can be built with an FPU that also executes double-precision instructions.
Background on Floating-Point Representations
Floating-point numbers are used to represent fractional values in computing, as opposed to integer values. The floating-point representation uses a fixed number of bits, with some bits allocated for the significand (mantissa) which holds the significant digits of the value, and some bits for the exponent which specifies the power of two by which the significand is multiplied. This allows representing a wide range of very small to very large values with fractional precision.
Single-precision uses 32 bits: 1 sign bit, 8 exponent bits, and 23 stored fraction bits, which together with an implicit leading 1 give a 24-bit significand. Double-precision uses 64 bits: 1 sign bit, 11 exponent bits, and 52 stored fraction bits, for a 53-bit significand. The extra bits give more precision and a wider range. However, double-precision values take twice the memory, and operations on them generally take more processing time.
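The 64-bit layout can be made concrete with a small host-side sketch in C. It assumes the platform's double is IEEE 754 binary64 (true for Cortex-M toolchains and most desktop compilers); the names double_fields and decode_double are illustrative, not part of any library.

```c
#include <stdint.h>
#include <string.h>

/* Raw fields of an IEEE 754 binary64 value. */
typedef struct {
    unsigned sign;      /* 1 bit */
    unsigned exponent;  /* 11 bits, stored with a bias of 1023 */
    uint64_t fraction;  /* 52 stored bits; the 53rd significand bit is implicit */
} double_fields;

/* Split a double into its bit fields (assumes double is binary64). */
static double_fields decode_double(double value)
{
    uint64_t bits;
    double_fields f;

    memcpy(&bits, &value, sizeof bits);   /* well-defined way to reinterpret the bits */
    f.sign     = (unsigned)(bits >> 63);
    f.exponent = (unsigned)((bits >> 52) & 0x7FFu);
    f.fraction = bits & ((UINT64_C(1) << 52) - 1);
    return f;
}
```

For example, decode_double(1.0) gives sign 0, biased exponent 1023 and fraction 0, because 1.0 is the implicit leading 1 multiplied by two to the power zero.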
Cortex-M Processor Floating-Point Support
The Arm Cortex-M0, M0+, M1 and M3 processors have no hardware floating-point unit; floating-point code on them runs through software library routines. The Cortex-M4 adds an optional FPU with native 32-bit single-precision arithmetic. This includes add, subtract, multiply, divide, square root, compare, convert, and data-processing operations on the 32-bit registers s0-s31, along with single-precision loads, stores and moves.
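Hardware double-precision starts with the Cortex-M7, whose optional FPU can be configured as single-precision only or as single- and double-precision. The Cortex-M55 and Cortex-M85 offer a comparable option. The Cortex-M23 has no FPU, and the FPU option on the Cortex-M33 and M35P is single-precision only. Double-precision support belongs to the floating-point extension, not the DSP extension, which covers integer SIMD and saturating arithmetic. With a double-precision FPU, the register bank can be viewed as sixteen 64-bit registers d0-d15, each overlaying a pair of single-precision registers (d0 is s0 and s1, and so on), and the processor gains instructions that perform arithmetic, data-processing, compare, convert, load, store and move operations on 64-bit double-precision data.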
Benefits of Double-Precision for Cortex-M
The double-precision extension brings several benefits for Cortex-M applications:
- Greater precision for fractional computations, reduced rounding errors
- Support for 64-bit IEEE 754 format doubles in code and data
- Ability to port algorithms using doubles from other architectures
- Hardware support for both the float and double types used in C/C++ code
- IEEE 754 conformant results for numerical libraries that assume double arithmetic
- Allows high precision math for signal processing, analytics, and scientific applications
- Faster execution and smaller code than software double-precision emulation, since library routines are replaced by single instructions
The key advantage is reducing rounding errors and improving precision. With only 24 bits in the single-precision significand, rounding errors can accumulate over long computational sequences and simulations. Having 53 bits of precision with double reduces this. Double-precision is especially useful when the application relies on very small differences between floating-point values.
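The accumulation effect can be observed with a short experiment. The following is a minimal sketch in portable C, meant to run on any IEEE 754 host; the function names are illustrative. It adds 0.1 repeatedly in a float accumulator and in a double accumulator and measures how far each total lands from the exact product.

```c
/* Add `step` to a running total `count` times using a float accumulator. */
static float sum_float(float step, long count)
{
    volatile float total = 0.0f;  /* volatile forces every partial sum to be rounded to float */
    for (long i = 0; i < count; i++) {
        total = total + step;
    }
    return total;
}

/* Same loop with a double accumulator. */
static double sum_double(double step, long count)
{
    double total = 0.0;
    for (long i = 0; i < count; i++) {
        total += step;
    }
    return total;
}

static double abs_diff(double a, double b)
{
    return (a > b) ? (a - b) : (b - a);
}

/* Absolute error of each accumulator against 0.1 * count. */
static double float_error(long count)
{
    return abs_diff((double)sum_float(0.1f, count), 0.1 * (double)count);
}

static double double_error(long count)
{
    return abs_diff(sum_double(0.1, count), 0.1 * (double)count);
}
```

With one million additions, float_error is larger than double_error by many orders of magnitude. The exact figures depend on the platform, but the gap illustrates why long-running sums and simulations benefit from the 53-bit significand.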
Double-Precision Instruction Set
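Here are some examples of common double-precision floating-point instructions available on Cortex-M processors with a double-precision FPU. They are shown in the current unified assembler language (UAL) syntax, with the older pre-UAL VFP mnemonic in parentheses:
- VLDR, VSTR (FLDD, FSTD) – Load and store 64-bit doubles to/from memory
- VMOV (FMDRR, FMRRD) – Move a double between a D-register and a pair of core registers
- VADD.F64, VSUB.F64, VMUL.F64, VDIV.F64 (FADDD, FSUBD, FMULD, FDIVD) – Double add, subtract, multiply, divide
- VCMP.F64, VCMPE.F64 (FCMPD, FCMPED) – Double-precision compare
- VCVT.S32.F64, VCVT.U32.F64 (FTOSID, FTOUID) – Convert double to signed or unsigned 32-bit integer
- VCVT.F64.S32, VCVT.F64.U32 (FSITOD, FUITOD) – Convert signed or unsigned 32-bit integer to double
- VABS.F64, VNEG.F64 (FABSD, FNEGD) – Double absolute value and negation
- VSQRT.F64 (FSQRTD) – Double-precision square root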
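This extends hardware arithmetic from 32-bit to 64-bit floating-point values. Single- and double-precision instructions can be mixed freely in the same code, but they share one register bank: each D-register overlays two S-registers (d0 is s0 and s1), so a value held in d0 is overwritten by a write to s0 or s1. Conversion instructions (VCVT.F64.F32 and VCVT.F32.F64) move values between the two precisions, which enables mixed-precision math.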
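In C, mixed precision often appears without being requested, because an unsuffixed literal such as 0.5 has type double and promotes a float operand. The sketch below uses sizeof to show which type the compiler picks for each expression; the function names are illustrative.

```c
#include <stddef.h>

/* 0.5 is a double literal, so x is promoted and the multiply is done in double. */
static size_t size_with_double_literal(float x)
{
    return sizeof(x * 0.5);
}

/* 0.5f is a float literal, so the multiply stays in single precision. */
static size_t size_with_float_literal(float x)
{
    return sizeof(x * 0.5f);
}
```

On a device with a single-precision-only FPU, the first form would typically be compiled to a software library call, while with a double-precision FPU it stays in hardware but uses the slower 64-bit path. GCC's -Wdouble-promotion warning can flag such implicit promotions.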
Enabling Double-Precision in Cortex-M
Using double-precision floating-point on Cortex-M requires:
- Processor that includes an FPU with the double-precision option (for example Cortex-M7, M55 or M85)
- Toolchain support for double-precision operations
- Code that utilizes the double-precision registers and data types
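The double-precision FPU is optional, so the specific Cortex-M device must include it, and the FPU must be enabled at startup before any floating-point instruction runs (vendor startup code usually does this). The compiler and libraries must be configured to target the FPU; otherwise double arithmetic is compiled to software library calls. Finally, the application code needs to declare double variables. In C and C++ the compiler then allocates D-registers and selects the double-precision instructions automatically; only hand-written assembly has to name the D-registers and instruction mnemonics explicitly.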
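Whether a build actually targets a double-precision FPU can be checked at compile time. The following is a minimal sketch, assuming a compiler that implements the Arm C Language Extensions (ACLE), where the predefined macro __ARM_FP is a bit mask of hardware-supported precisions and bit 3 (0x08) means 64-bit double. HAS_HW_DOUBLE and has_hw_double are illustrative names.

```c
/* ACLE: __ARM_FP is a bit mask of the floating-point precisions supported
   in hardware for this build; bit 3 (0x08) indicates double precision. */
#if defined(__ARM_FP) && (__ARM_FP & 0x08)
#define HAS_HW_DOUBLE 1
#else
#define HAS_HW_DOUBLE 0   /* no FPU, single-precision FPU only, or not an Arm build */
#endif

/* Returns 1 when the compiler is generating double-precision FPU instructions. */
static int has_hw_double(void)
{
    return HAS_HW_DOUBLE;
}
```

With GCC, for example, a Cortex-M7 build using -mcpu=cortex-m7 -mfpu=fpv5-d16 -mfloat-abi=hard should take the first branch, whereas -mfpu=fpv5-sp-d16 selects the single-precision-only FPU and takes the second.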
Performing Common Double-Precision Operations
Here are examples of how some common double-precision floating-point operations can be performed on supported Cortex-M processors:
Load Double from Memory
// C: double var;
VLDR D0, [R1]    // Load the double at the memory address in R1 into D0
Store Double to Memory
// C: double var;
VSTR D1, [R2]    // Store D1 to the memory address in R2
Double Addition
// C: double a, b, c;
VADD.F64 D0, D1, D2    // D0 = D1 + D2
Double Multiplication
// C: double x, y, z;
VMUL.F64 D3, D4, D5    // D3 = D4 * D5
Double Division
// C: double num, denom, result;
VDIV.F64 D6, D7, D8    // D6 = D7 / D8
Similar code patterns work for other double arithmetic, conversion, compare and data processing operations.
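In practice these instructions are rarely written by hand, because the compiler generates them from ordinary C. The sketch below is an illustrative function (its name and parameters are hypothetical); the comments name the UAL instruction a compiler would typically select for each line when a double-precision FPU is targeted, although the actual output depends on the compiler and optimization level.

```c
/* Load a double, then add, multiply and divide it. */
static double load_add_mul_div(const double *src, double b, double c, double d)
{
    double a = *src;         /* typically VLDR     */
    double sum = a + b;      /* typically VADD.F64 */
    double prod = sum * c;   /* typically VMUL.F64 */
    return prod / d;         /* typically VDIV.F64 */
}
```

The same source compiles for a device without a double-precision FPU as well; in that case each operation becomes a call into the toolchain's software floating-point library.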
Limitations of Double-Precision on Cortex-M
While a double-precision FPU adds useful capability, there are some limitations to keep in mind:
- Not available on Cortex-M0, M0+, M1, M3 and M23 (no FPU), nor on M4, M33 and M35P (single-precision FPU only)
- Double-precision FPU is optional – needs to be included in device
- Higher latency vs single-precision operations
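- Division and square root are multi-cycle operations, much slower than add or multiply
- Only 16 64-bit registers available (d0-d15), and they share storage with s0-s31
- Memory impact vs single-precision: each double value or constant occupies 8 bytes rather than 4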
Double-precision is best suited for applications that require the higher precision and can tolerate the performance and code size impact compared to single-precision. It improves the accuracy of floating-point intensive code on Cortex-M.
Conclusion
The double-precision floating-point option provides hardware support for 64-bit double data types and arithmetic operations on Arm Cortex-M processors such as the Cortex-M7, M55 and M85. This enables more precise computations on fractional values for applications that require higher accuracy and range. Using the right compiler tools and libraries, developers can leverage double-precision to reduce rounding errors, match common data types, and port complex algorithms. A double-precision FPU is a valuable option for Cortex-M devices that are used in computationally intensive tasks.