
When to Use Hardware vs Software Floating Point with Arm Cortex M?

Mike Johnston
Last updated: October 5, 2023 9:19 am

When deciding whether to use hardware or software floating point with Arm Cortex M processors, there are a few key factors to consider. Hardware floating point support provides faster floating point math performance, while software floating point gives more flexibility and portability. The choice depends on the application requirements and constraints.

Contents
  • Introduction to Floating Point on Arm Cortex M
  • Benefits of Hardware Floating Point
  • Benefits of Software Floating Point
  • Floating Point Hardware Support in Arm Cortex M
  • Software Floating Point Libraries
  • Floating Point Code Size
  • Software Floating Point Precision
  • Floating Point Code Optimization
  • Error Handling
  • Floating Point Benchmarks
  • Power and Cost
  • Software vs Hardware Tradeoffs
  • Recommended Usage Guidelines
  • Conclusion

Introduction to Floating Point on Arm Cortex M

Floating point numbers represent real numbers with a fraction and exponent, allowing a wide range of values to be represented. Single precision floats use 32 bits, with 1 sign bit, 8 exponent bits, and 23 mantissa bits. Double precision uses 64 bits, with 1 sign, 11 exponent, and 52 mantissa bits.
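
As a small illustration of the single precision layout described above, the following sketch splits a 32-bit float into its three fields. The type and function names are hypothetical, chosen only for this example.

```c
#include <stdint.h>
#include <string.h>

/* Fields of an IEEE 754 single precision value (hypothetical helper type). */
typedef struct {
    uint32_t sign;      /* 1 bit */
    uint32_t exponent;  /* 8 bits, stored with a bias of 127 */
    uint32_t mantissa;  /* 23 bits, implicit leading 1 for normal numbers */
} float_fields_t;

/* Split a float into sign, biased exponent, and mantissa. */
static float_fields_t float_decompose(float value)
{
    uint32_t bits;
    float_fields_t fields;

    /* memcpy is the portable way to reinterpret the bytes of a float. */
    memcpy(&bits, &value, sizeof bits);

    fields.sign = bits >> 31;
    fields.exponent = (bits >> 23) & 0xFFu;
    fields.mantissa = bits & 0x7FFFFFu;
    return fields;
}
```

For example, 1.0f decomposes to sign 0, biased exponent 127, and mantissa 0.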

Arm Cortex M processors like Cortex-M4 and Cortex-M7 include optional hardware floating point units (FPUs). With an FPU, floating point operations can be performed in hardware quickly. Without an FPU, floating point math must be done in software using integer operations, which is much slower.
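
On Cortex-M4 and Cortex-M7 the FPU is disabled after reset and has to be enabled through the Coprocessor Access Control Register (CPACR) before any floating point instruction runs. The sketch below shows the idea. The helper name is hypothetical, and it takes the register pointer as a parameter only so that it can be exercised off-target. Vendor startup code or CMSIS usually does this for you.

```c
#include <stdint.h>

/* CPACR address on Armv7-M cores such as Cortex-M4 and Cortex-M7. */
#define CPACR_ADDRESS 0xE000ED88u

/* Bits 20-23 grant full access to coprocessors CP10 and CP11 (the FPU). */
#define CPACR_FPU_FULL_ACCESS (0xFu << 20)

/* Hypothetical helper: set the FPU access bits in the given register. */
static void fpu_enable(volatile uint32_t *cpacr)
{
    *cpacr |= CPACR_FPU_FULL_ACCESS;
    /* On real hardware, follow this with DSB and ISB barriers, for example
       the CMSIS __DSB() and __ISB() intrinsics, before using the FPU. */
}
```

On a target this would be called as fpu_enable((volatile uint32_t *)CPACR_ADDRESS) early in startup.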

Benefits of Hardware Floating Point

Using the built-in FPU provides significant performance benefits for floating point intensive code:

  • Hardware floating point is often 10-100x faster than equivalent software routines, depending on the operation and library
  • Speedup applies to common operations like add, subtract, multiply, divide
  • On dual-issue cores such as Cortex-M7, FPU instructions can often issue alongside integer instructions; software routines occupy the integer pipeline
  • Special values like infinities and NaNs handled in hardware
  • Square root and fused multiply-add are single instructions; transcendental functions remain library code but run faster on top of FPU instructions
  • Hardware maintains precision without needing large intermediate values
  • Code size is reduced by using compact hardware instructions

For applications doing a lot of floating point math, like digital signal processing, 3D graphics, or sensor fusion, the hardware FPU will provide a major performance boost and faster execution times. The hardware is optimized specifically for floating point.

Benefits of Software Floating Point

While hardware floating point has performance advantages, software implementations have benefits around flexibility and portability:

  • Works on any Cortex M, without requiring FPU support
  • Code is portable between devices with and without FPU
  • Can use same code on lower cost chips without hardware FP
  • Precision and error handling can be customized in software
  • Software routines can be modified and optimized
  • Small code footprint for basic operations
  • Avoids increase in chip cost, power usage from hardware FPU

Software floating point allows floating point code to work across any Arm Cortex M device. This provides flexibility in product development, letting you reuse code on lower cost microcontrollers missing the FPU. Software also gives more control over floating point precision and errors.

Floating Point Hardware Support in Arm Cortex M

The level of floating point support varies across the Arm Cortex M product line:

  • Cortex-M0/M0+ – No floating point hardware
  • Cortex-M3 – No floating point hardware
  • Cortex-M4 – Optional single precision FPU
  • Cortex-M7 – Optional FPU, either single precision only or single and double precision
  • Cortex-M23 – No floating point hardware
  • Cortex-M33 – Optional single precision FPU
  • Cortex-M35P – Optional single precision FPU

Higher end Cortex M cores add hardware floating point options. The most capable FPU support in this list is on Cortex-M7, with optional single and double precision. On every core that offers an FPU it is optional, so check the datasheet of the specific microcontroller. Software floating point is needed on cores without FPUs, including Cortex-M0/M0+, Cortex-M3, and Cortex-M23.
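
Code can check at compile time what the toolchain is targeting. The sketch below uses the ACLE predefined macro __ARM_FP, a bitmask in which bit 2 (0x4) indicates single precision hardware and bit 3 (0x8) indicates double precision hardware. The function name is hypothetical.

```c
/* Describe the floating point hardware the compiler is generating code for.
   __ARM_FP is only defined when hardware floating point is enabled. */
static const char *fp_support_description(void)
{
#if defined(__ARM_FP) && (__ARM_FP & 0x8)
    return "hardware single and double precision";
#elif defined(__ARM_FP) && (__ARM_FP & 0x4)
    return "hardware single precision";
#else
    return "no Arm hardware floating point reported by the compiler";
#endif
}
```

The same macro can guard alternative code paths, for example choosing a fixed-point routine when no FPU is available.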

Software Floating Point Libraries

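To enable software floating point on Arm Cortex M, libraries are available with optimized routines, typically written in C or assembly:

  • Newlib-nano – size-optimized configuration of the open source newlib C library, shipped with the Arm GNU Toolchain, under BSD-style licenses
  • Arm Compiler 6 – proprietary toolchain from Arm with its own software floating point library
  • LLVM compiler-rt – clang/LLVM runtime library with soft-float builtins, Apache 2.0 licensed with LLVM exceptions
  • Berkeley SoftFloat – BSD licensed pure software floating point
  • Cephes Math Library – transcendental functions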

These provide software implementations of float add, subtract, multiply, divide, comparison operations, type conversions, and math functions like sine, cosine, log, exponentiation. By linking in a software float library, code can perform floating point on any Cortex M.
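
The source code itself does not change between the two approaches; the compiler options decide what is emitted. As a sketch, assuming GCC for Arm, the function below (a hypothetical name) would typically compile to a single vmul.f32 with options such as -mcpu=cortex-m4 -mfpu=fpv4-sp-d16 -mfloat-abi=hard, and to a call to the Arm EABI run-time helper __aeabi_fmul with -mcpu=cortex-m0plus -mfloat-abi=soft.

```c
/* Multiply a sample by a gain.
   Hard-float build: typically one vmul.f32 instruction.
   Soft-float build: a call to the library helper __aeabi_fmul. */
static float scale_sample(float sample, float gain)
{
    return sample * gain;
}
```

Inspecting the disassembly of both builds is a quick way to confirm which path a given project is actually using.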

Floating Point Code Size

Software floating point code occupies more program memory than hardware floating point. Here are some typical instruction counts for common operations:

  • Float add – 1 instruction (FPU), ~100 instructions (software)
  • Float multiply – 1 instruction (FPU), ~200 instructions (software)
  • Float sin – 10-20 instructions (FPU), ~300 instructions (software)
  • Float exp – 10-20 instructions (FPU), ~400 instructions (software)

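Exact instruction counts depend on the implementation, and sin and exp are library routines in both cases because the FPU has no instruction for them. Even so, hardware floating point requires far fewer instructions than software routines for most operations, which reduces code size.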

Software Floating Point Precision

With software floating point, precision is customizable based on application needs:

  • Single precision – 32 bit floats
  • Double precision – 64 bit floats
  • Custom precisions – e.g. 40 bit floats
  • Configurable mantissa/exponent sizes

The FPU supports only single precision in hardware, plus double precision on some cores such as Cortex-M7. With software, custom float sizes are possible for applications needing higher or lower precision. Precision affects accuracy, performance, and memory usage.
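
As a minimal sketch of a custom precision, the code below defines a hypothetical 16-bit storage format with 1 sign bit, 8 exponent bits, and 7 mantissa bits. It is simply the upper half of an IEEE 754 single, so conversion is a shift that truncates the low mantissa bits. The type and function names are assumptions for this example, not part of any library mentioned above.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical 16-bit format: 1 sign, 8 exponent, 7 mantissa bits. */
typedef uint16_t float16x_t;

/* Pack a float by keeping its upper 16 bits (truncation, no rounding). */
static float16x_t float16x_from_float(float value)
{
    uint32_t bits;
    memcpy(&bits, &value, sizeof bits);
    return (float16x_t)(bits >> 16);
}

/* Unpack by restoring the upper 16 bits and zero-filling the rest. */
static float float16x_to_float(float16x_t packed)
{
    uint32_t bits = (uint32_t)packed << 16;
    float value;
    memcpy(&value, &bits, sizeof value);
    return value;
}
```

This halves the storage per value at the cost of accuracy: 3.14159f, for instance, comes back as 3.140625f after a round trip.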

Floating Point Code Optimization

There are optimization techniques to improve software floating point performance on Arm Cortex M:

  • Use hardware integer operations for add, subtract, multiply
  • Optimize division and remainder using constants
  • Lookup tables for trig, log, exp instead of calculations
  • Loop unrolling, function inlining to reduce overhead
  • Assembly optimizations in critical functions
  • Use MPU memory attributes (for example, cacheability on cores with caches) to keep execution times predictable

While software floating point is slower, methods such as lookup tables, hand-written math routines, and assembly can narrow the gap.
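
The lookup table approach can be sketched as follows: a 17-entry table of sine values over a quarter wave, with linear interpolation between entries. The table size, the function name, and the restriction to the range 0 to pi/2 are choices made for this example; with this spacing the interpolation error stays below roughly 0.002.

```c
#define SINE_TABLE_STEPS 16

/* sin(k * pi / 32) for k = 0..16, covering 0 to pi/2. */
static const float sine_table[SINE_TABLE_STEPS + 1] = {
    0.0000000f, 0.0980171f, 0.1950903f, 0.2902847f, 0.3826834f,
    0.4713967f, 0.5555702f, 0.6343933f, 0.7071068f, 0.7730105f,
    0.8314696f, 0.8819213f, 0.9238795f, 0.9569403f, 0.9807853f,
    0.9951847f, 1.0000000f
};

/* Approximate sin(x) for x in [0, pi/2]; inputs outside are clamped. */
static float sine_quarter_wave(float radians)
{
    const float half_pi = 1.57079633f;
    const float steps_per_radian = 10.1859164f;  /* 32 / pi */
    float position;
    float fraction;
    int index;

    if (radians <= 0.0f) {
        return 0.0f;
    }
    if (radians >= half_pi) {
        return 1.0f;
    }

    position = radians * steps_per_radian;
    index = (int)position;
    if (index >= SINE_TABLE_STEPS) {
        index = SINE_TABLE_STEPS - 1;  /* guard against rounding at the top */
    }
    fraction = position - (float)index;

    /* Linear interpolation between the two neighbouring table entries. */
    return sine_table[index]
         + fraction * (sine_table[index + 1] - sine_table[index]);
}
```

Each call costs two multiplies and a few additions instead of a full polynomial evaluation, trading accuracy and table storage for speed.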

Error Handling

Floating point hardware and software handle errors differently:

  • FPU follows IEEE 754 spec for exceptions, recording them as status flags
  • Software can implement custom error handling
  • Software lets you control precision loss behavior
  • FPU flags are cumulative, so code must check them after the fact rather than being interrupted by the FPU itself
  • Software exceptions can be caught directly by code

The FPU sets the cumulative exception flags defined in the IEEE 754 spec, held in its FPSCR status register, when errors occur, and code has to check them afterwards. Software floating point lets errors be detected in code at the point where they occur. This allows full control over error handling.
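
A software-side check might look like the following sketch, which wraps a division and reports problems through a return code instead of leaving them in status flags. The enum, the function names, and the choice of which conditions to report are all assumptions for this example.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical status codes for a checked floating point operation. */
typedef enum {
    FP_OK,
    FP_DIV_BY_ZERO,
    FP_INVALID,
    FP_OVERFLOW
} fp_status_t;

/* An exponent field of all ones means the value is infinity or NaN. */
static int is_non_finite(float value)
{
    uint32_t bits;
    memcpy(&bits, &value, sizeof bits);
    return ((bits >> 23) & 0xFFu) == 0xFFu;
}

/* Divide and report errors immediately; *result is written only on FP_OK. */
static fp_status_t checked_divide(float numerator, float denominator,
                                  float *result)
{
    float quotient;

    if (is_non_finite(numerator) || is_non_finite(denominator)) {
        return FP_INVALID;
    }
    if (denominator == 0.0f) {
        return FP_DIV_BY_ZERO;
    }
    quotient = numerator / denominator;
    if (is_non_finite(quotient)) {
        return FP_OVERFLOW;
    }
    *result = quotient;
    return FP_OK;
}
```

The caller handles each status at the call site, which is the kind of direct control the list above attributes to software error handling.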

Floating Point Benchmarks

Here are sample benchmark results for 32-bit float operations on Cortex-M7 with FPU vs. software float:

Operation   FPU Cycles   Software Cycles
Add         3            100
Multiply    3            350
Divide      16           1000
Sqrt        20           300
Sin         60           700
Exp         110          1200

In this table the hardware FPU is roughly 10x to over 100x faster across basic and transcendental operations, with the largest gain on multiply. Exact ratios depend on the software library used.
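
The speedup for each operation is simply software cycles divided by FPU cycles. The short sketch below computes it from the sample figures in the table; the struct and function names are hypothetical.

```c
/* One row of the sample benchmark table above. */
typedef struct {
    const char *operation;
    unsigned fpu_cycles;
    unsigned software_cycles;
} benchmark_row_t;

static const benchmark_row_t benchmark_rows[] = {
    { "Add",      3,   100  },
    { "Multiply", 3,   350  },
    { "Divide",   16,  1000 },
    { "Sqrt",     20,  300  },
    { "Sin",      60,  700  },
    { "Exp",      110, 1200 }
};

/* Speedup of the FPU over software for one operation. */
static double speedup(const benchmark_row_t *row)
{
    return (double)row->software_cycles / (double)row->fpu_cycles;
}
```

With these figures the ratios range from about 11x for Exp to about 117x for Multiply.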

Power and Cost

The FPU increases chip cost, complexity, and power usage. Cortex M cores with FPUs have:

  • Higher gate counts – FPU is over 20k gates
  • Increased silicon area used
  • Added power usage even when FPU not used
  • Higher cost per unit for FPU versions

For low power or size constrained applications, avoiding the FPU can reduce system power and cost overheads. The impact varies based on specific Arm chip being used.

Software vs Hardware Tradeoffs

Here is a summary of the key tradeoffs between hardware and software floating point:

                 Hardware FPU           Software Float
Performance      Much faster            Slower, but optimizable
Precision        Fixed single, double   Configurable precision
Code size        Much smaller           Larger code
Error handling   Defined by IEEE 754    Customizable
Portability      Only works with FPU    Works on any Cortex M
Power/Cost       Higher                 Lower without FPU

The right choice depends on whether the benefits of hardware speed and code size outweigh the need for software flexibility and portability in a given project.

Recommended Usage Guidelines

Based on the tradeoffs, here are some general guidelines on when to use hardware vs software floating point with Cortex M:

  • Use FPU for heavy floating point code to boost performance
  • Use FPU if code size constraints make software impractical
  • Use software float for portability across Cortex M devices
  • Use software if FPU cost or power are prohibitive
  • Use software float for custom precision needs
  • Use software if error handling requirements differ from IEEE 754

For performance critical applications doing significant floating point, favor using the FPU to speed up execution. In other cases where flexibility or portability are priorities, software floating point may be the better choice.

Conclusion

Hardware and software floating point both have benefits for Arm Cortex M chips. Hardware FPUs provide extremely fast floating point, while software gives portability and precision configurability. For lightweight floating point uses, software may be suitable, but for intensive processing, the massive speedup of hardware floating point is hard to ignore. By understanding the tradeoffs, developers can choose the best floating point approach for their particular application and constraints.
