When to Use Hardware vs Software Floating Point with Arm Cortex M?

When deciding whether to use hardware or software floating point with Arm Cortex M processors, there are a few key factors to consider. Hardware floating point support provides faster floating point math performance, while software floating point gives more flexibility and portability. The choice depends on the application requirements and constraints.

Contents

Introduction to Floating Point on Arm Cortex M
Benefits of Hardware Floating Point
Benefits of Software Floating Point
Floating Point Hardware Support in Arm Cortex M
Software Floating Point Libraries
Floating Point Code Size
Software Floating Point Precision
Floating Point Code Optimization
Error Handling
Floating Point Benchmarks
Power and Cost
Software vs Hardware Tradeoffs
Recommended Usage Guidelines
Conclusion

Introduction to Floating Point on Arm Cortex M

Floating point numbers represent real numbers with a fraction and exponent, allowing a wide range of values to be represented. Single precision floats use 32 bits, with 1 sign bit, 8 exponent bits, and 23 mantissa bits. Double precision uses 64 bits, with 1 sign, 11 exponent, and 52 mantissa bits.
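To make the single precision layout concrete, here is a minimal C sketch that splits a float into the three fields described above. The type and function names (float_fields, float_unpack) are made up for this illustration, and the code assumes the platform stores float in IEEE 754 binary32 format, which is the case on Arm Cortex M.

```c
#include <stdint.h>
#include <string.h>

/* Fields of a single precision value (illustrative names). */
typedef struct {
    uint32_t sign;      /* 1 bit */
    uint32_t exponent;  /* 8 bits, stored with a bias of 127 */
    uint32_t mantissa;  /* 23 bits, the implicit leading 1 is not stored */
} float_fields;

static float_fields float_unpack(float value)
{
    uint32_t bits;
    float_fields f;

    /* memcpy is the portable way to reinterpret the bytes of a float. */
    memcpy(&bits, &value, sizeof bits);

    f.sign     = bits >> 31;
    f.exponent = (bits >> 23) & 0xFFu;
    f.mantissa = bits & 0x7FFFFFu;
    return f;
}
```

For example, 1.0f unpacks to sign 0, exponent 127 (a true exponent of 0 after removing the bias), and mantissa 0.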

Arm Cortex M processors like Cortex-M4 and Cortex-M7 include optional hardware floating point units (FPUs). With an FPU, floating point operations can be performed in hardware quickly. Without an FPU, floating point math must be done in software using integer operations, which is much slower.

Benefits of Hardware Floating Point

Using the built-in FPU provides significant performance benefits for floating point intensive code:

Hardware floating point is roughly 10-100x faster than equivalent software routines, depending on the operation and library
Speedup applies to common operations like add, subtract, multiply, divide
Multiply-accumulate is available as a single instruction
Special values like infinities and NaNs are handled in hardware
Math functions like sine and exponent are still library routines, but run much faster because they are built from FPU instructions
Fused multiply-accumulate rounds once, instead of after the multiply and again after the add
Code size is reduced because each basic operation is one compact instruction rather than a library call

For applications doing a lot of floating point math, like digital signal processing, 3D graphics, or sensor fusion, the hardware FPU will provide a major performance boost and faster execution times. The hardware is optimized specifically for floating point.
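One caveat when relying on the FPU for speed: on a core whose FPU is single precision only, such as Cortex-M4, any arithmetic done in double falls back to software routines. In C, an unsuffixed constant like 0.5 has type double, so it silently promotes the whole expression. The sketch below shows the two variants; the function names are hypothetical.

```c
/* Stays in single precision: the 'f' suffix makes 0.5f a float. */
static float scale_single(float x)
{
    return x * 0.5f;
}

/* 0.5 is a double, so x is promoted, the multiply is done in double
 * precision, and the result is converted back to float. On a single
 * precision FPU the double multiply is a software routine. */
static float scale_promoted(float x)
{
    return (float)(x * 0.5);
}
```

Both functions return the same value here, but only the first maps to a single hardware multiply on a single precision FPU. GCC's -Wdouble-promotion warning can help find the second pattern.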

Benefits of Software Floating Point

While hardware floating point has performance advantages, software implementations have benefits around flexibility and portability:

Works on any Cortex M, without requiring FPU support
Code is portable between devices with and without FPU
Can use same code on lower cost chips without hardware FP
Precision and error handling can be customized in software
Software routines can be modified and optimized
Library routines are linked in only for the operations actually used
Avoids increase in chip cost, power usage from hardware FPU

Software floating point allows floating point code to work across any Arm Cortex M device. This provides flexibility in product development, letting you reuse code on lower cost microcontrollers missing the FPU. Software also gives more control over floating point precision and errors.

Floating Point Hardware Support in Arm Cortex M

The level of floating point support varies across the Arm Cortex M product line:

Cortex-M0/M0+ – No floating point hardware
Cortex-M3 – No floating point hardware
Cortex-M4 – Optional single precision FPU
Cortex-M7 – Optional FPU, either single precision only or single and double precision
Cortex-M23 – No floating point hardware
Cortex-M33 – Optional single precision FPU
Cortex-M35P – Optional single precision FPU

Higher end Cortex M cores add hardware floating point options. The most capable FPU support listed here is on Cortex-M7, with optional single and double precision. On every core that supports an FPU it is optional, so whether a particular microcontroller includes one depends on how the chip vendor configured the core. Software floating point is needed on cores without an FPU, and for double precision on cores with a single precision FPU.
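Code can check at compile time which level of support the build targets. The sketch below uses the __ARM_FP feature macro from the Arm C Language Extensions, whose bits 1, 2 and 3 indicate hardware support for half, single and double precision. The FPU_* names and fpu_classify are made up for this example. With GCC or Clang the value follows from options such as -mfloat-abi=hard and -mfpu=fpv4-sp-d16, and the macro is typically undefined for builds that target no FPU.

```c
/* Bit positions of the ACLE __ARM_FP feature macro. */
#define FPU_BIT_HALF   (1u << 1)  /* 16-bit format */
#define FPU_BIT_SINGLE (1u << 2)  /* 32-bit format */
#define FPU_BIT_DOUBLE (1u << 3)  /* 64-bit format */

/* Mask for the current build; 0 when the compiler targets no FPU. */
#ifdef __ARM_FP
#define FPU_BUILD_MASK ((unsigned)__ARM_FP)
#else
#define FPU_BUILD_MASK 0u
#endif

typedef enum {
    FPU_NONE,               /* all floating point done in software */
    FPU_SINGLE_ONLY,        /* float in hardware, double in software */
    FPU_SINGLE_AND_DOUBLE   /* float and double in hardware */
} fpu_support;

/* Classify a mask laid out like __ARM_FP. */
static fpu_support fpu_classify(unsigned mask)
{
    if ((mask & FPU_BIT_SINGLE) == 0u) {
        return FPU_NONE;
    }
    if ((mask & FPU_BIT_DOUBLE) != 0u) {
        return FPU_SINGLE_AND_DOUBLE;
    }
    return FPU_SINGLE_ONLY;
}
```

Calling fpu_classify(FPU_BUILD_MASK) reports the support level of the current build. Taking the mask as a parameter keeps the function easy to test on a host machine.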

Software Floating Point Libraries

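To enable software floating point on Arm Cortex M, libraries are available with optimized routines written in C or assembly:

Newlib / newlib-nano – open source C library shipped with the Arm GNU Toolchain, BSD-style licensed; its math library provides functions like sine and log, while GCC's libgcc supplies the basic arithmetic routines
Arm Compiler 6 – proprietary toolchain from Arm with its own floating point libraries
LLVM compiler-rt – clang/LLVM runtime library with software float routines, Apache 2.0 licensed with LLVM exceptions
Berkeley SoftFloat – BSD licensed pure software floating point
Cephes Math Library – transcendental functions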

These provide software implementations of float add, subtract, multiply, divide, comparison operations, type conversions, and math functions like sine, cosine, log, exponentiation. By linking in a software float library, code can perform floating point on any Cortex M.
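To give a feel for what such a routine does internally, here is a simplified sketch of a single precision multiply built only from integer operations. It is not taken from any of the libraries listed above. The name soft_fmul is made up, and the code deliberately handles only normal, finite inputs whose product is also a normal number: zeros, subnormals, infinities, NaNs, overflow and underflow are left out, and handling them is a large part of what a production routine adds.

```c
#include <stdint.h>
#include <string.h>

static uint32_t f32_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

static float f32_from_bits(uint32_t u)
{
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}

/* Multiply two normal floats using integer arithmetic only. */
static float soft_fmul(float x, float y)
{
    uint32_t a = f32_bits(x);
    uint32_t b = f32_bits(y);
    uint32_t sign = (a ^ b) & 0x80000000u;

    /* Add the biased exponents and remove one bias. */
    int32_t exp = (int32_t)((a >> 23) & 0xFFu)
                + (int32_t)((b >> 23) & 0xFFu) - 127;

    /* Restore the implicit leading 1 to get 24-bit significands. */
    uint64_t ma = (a & 0x7FFFFFu) | 0x800000u;
    uint64_t mb = (b & 0x7FFFFFu) | 0x800000u;
    uint64_t prod = ma * mb;              /* 47 or 48 significant bits */
    uint32_t mant, rest;

    /* Normalize so the leading 1 sits at bit 47. */
    if (prod & (1ull << 47)) {
        exp += 1;
    } else {
        prod <<= 1;
    }

    mant = (uint32_t)(prod >> 24);        /* top 24 bits */
    rest = (uint32_t)(prod & 0xFFFFFFu);  /* discarded low bits */

    /* Round to nearest, ties to even. */
    if (rest > 0x800000u || (rest == 0x800000u && (mant & 1u))) {
        mant += 1u;
        if (mant == 0x1000000u) {         /* rounding carried out */
            mant >>= 1;
            exp += 1;
        }
    }

    return f32_from_bits(sign | ((uint32_t)exp << 23) | (mant & 0x7FFFFFu));
}
```

Even this stripped-down version needs a 64-bit product, a normalization step and a rounding step, which is why a software multiply costs many instructions where an FPU needs one.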

Floating Point Code Size

Software floating point code takes up more space than hardware floating point. Here are some rough, illustrative instruction counts for common operations:

Float add – 1 instruction (FPU), ~100 instructions (software)
Float multiply – 1 instruction (FPU), ~200 instructions (software)
Float sin – 10-20 instructions (FPU), ~300 instructions (software)
Float exp – 10-20 instructions (FPU), ~400 instructions (software)

Exact instruction counts depend on the implementation. A software routine is linked once and shared by every call site, so the cost is a fixed library footprint plus a function call per operation, not the full count at each use. Even so, hardware floating point requires far fewer instructions than software routines for most operations, which reduces code size.

Software Floating Point Precision

With software floating point, precision is customizable based on application needs:

Single precision – 32 bit floats
Double precision – 64 bit floats
Custom precisions – e.g. 40 bit floats
Configurable mantissa/exponent sizes

Cortex M FPUs support only fixed formats in hardware: single precision, plus double precision on some cores such as Cortex-M7. With software, custom float sizes are possible for applications needing higher or lower precision. Precision affects accuracy, performance and memory usage.
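As a small example of a custom format, the sketch below stores values in 16 bits with 1 sign bit, 8 exponent bits and 7 mantissa bits, which is the same layout as the bfloat16 format. The names small_float, small_from_float and small_to_float are made up, and the conversion simply truncates the low mantissa bits rather than rounding. It halves the memory per value while keeping the full single precision range, at the cost of accuracy.

```c
#include <stdint.h>
#include <string.h>

/* 16-bit float: 1 sign, 8 exponent, 7 mantissa bits (illustrative). */
typedef uint16_t small_float;

/* Keep the top 16 bits of the single precision encoding.
 * The low 16 mantissa bits are dropped (truncation, no rounding). */
static small_float small_from_float(float value)
{
    uint32_t bits;
    memcpy(&bits, &value, sizeof bits);
    return (small_float)(bits >> 16);
}

/* Expand back to single precision by zero-filling the dropped bits. */
static float small_to_float(small_float packed)
{
    uint32_t bits = (uint32_t)packed << 16;
    float value;
    memcpy(&value, &bits, sizeof value);
    return value;
}
```

Values that need no more than 7 mantissa bits survive the round trip exactly. Others lose precision: 3.14159265f comes back as 3.140625f.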

Floating Point Code Optimization

There are optimization techniques to improve software floating point performance on Arm Cortex M:

Build float add, subtract, and multiply on the core's integer add, shift, and multiply instructions
Replace division by a constant with multiplication by its reciprocal
Lookup tables for trig, log, exp instead of calculations
Loop unrolling, function inlining to reduce overhead
Assembly optimizations in critical functions
On cores with caches, such as Cortex-M7, place time-critical routines in tightly coupled memory for more deterministic execution times

While software floating point is slower, methods like lookup tables, hand-written math routines, and assembly can narrow the gap.
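The lookup table technique can be sketched as follows: a 17-entry table of sine values over a quarter wave, with linear interpolation between entries. The names sin_table and sin_quarter_lut are made up, and the function only covers inputs from 0 to pi/2; a full implementation would fold other angles into this range using symmetry. With 16 steps the interpolation error stays below roughly 0.0012, so this trades accuracy for a handful of operations per call.

```c
#define SIN_TABLE_STEPS 16

/* sin(k * pi / 32) for k = 0..16, i.e. 0 to pi/2 in 16 steps. */
static const float sin_table[SIN_TABLE_STEPS + 1] = {
    0.00000000f, 0.09801714f, 0.19509032f, 0.29028468f,
    0.38268343f, 0.47139674f, 0.55557023f, 0.63439328f,
    0.70710678f, 0.77301045f, 0.83146961f, 0.88192126f,
    0.92387953f, 0.95694034f, 0.98078528f, 0.99518473f,
    1.00000000f
};

/* Approximate sin(x) for x in [0, pi/2]; inputs outside are clamped. */
static float sin_quarter_lut(float x)
{
    const float steps_per_radian = 10.18591636f;  /* 32 / pi */
    float pos, frac;
    int index;

    if (x <= 0.0f) {
        return 0.0f;
    }
    pos = x * steps_per_radian;
    if (pos >= (float)SIN_TABLE_STEPS) {
        return 1.0f;
    }

    index = (int)pos;             /* table entry at or below x */
    frac  = pos - (float)index;   /* position between two entries */

    /* Linear interpolation between neighbouring entries. */
    return sin_table[index]
         + frac * (sin_table[index + 1] - sin_table[index]);
}
```

A larger table reduces the error at the cost of flash, which is the usual tradeoff when tuning this approach.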

Error Handling

Floating point hardware and software handle errors differently:

FPU follows IEEE 754 spec for exceptions
Software can implement custom error handling
Software lets you control precision loss behavior
FPU records exceptions as cumulative status flags rather than trapping, so code has to check the flags afterwards
Software exceptions can be caught directly by code

The FPU will set exception bits defined in the IEEE 754 spec on errors. But software floating point lets errors be detected in code immediately when they occur. This allows full control over error handling.
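As an illustration of detecting errors in code at the point where they occur, here is a sketch of a divide wrapper that returns a status instead of silently producing an infinity or NaN. The div_status type and checked_div function are hypothetical and not part of any library mentioned here. The same wrapper works whether the division itself runs on the FPU or in a software routine.

```c
#include <float.h>

typedef enum {
    DIV_OK,
    DIV_BY_ZERO,    /* nonzero divided by zero */
    DIV_OVERFLOW,   /* result too large for a float */
    DIV_INVALID     /* NaN input, 0/0, or infinity/infinity */
} div_status;

/* Divide num by den; *result is written only when DIV_OK is returned. */
static div_status checked_div(float num, float den, float *result)
{
    float q;

    if (num != num || den != den) {      /* NaN is never equal to itself */
        return DIV_INVALID;
    }
    if (den == 0.0f) {
        return (num == 0.0f) ? DIV_INVALID : DIV_BY_ZERO;
    }

    q = num / den;
    if (q != q) {                        /* e.g. infinity / infinity */
        return DIV_INVALID;
    }
    if (q > FLT_MAX || q < -FLT_MAX) {   /* result is an infinity */
        return DIV_OVERFLOW;
    }

    *result = q;
    return DIV_OK;
}
```

The caller decides what each status means for the application, for example substituting a saturated value on DIV_OVERFLOW or rejecting the sample on DIV_INVALID.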

Floating Point Benchmarks

Here are sample benchmark results for 32-bit float operations on Cortex-M7 with FPU vs. software float:

Operation	FPU Cycles	Software Cycles
Add	3	100
Multiply	3	350
Divide	16	1000
Sqrt	20	300
Sin	60	700
Exp	110	1200

The hardware FPU provides around 10-100x speedup across basic and transcendental operations. Exact ratios depend on the software library used.
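To see what these figures mean for a real workload, the sketch below estimates the floating point cost of one output sample of a FIR filter, which needs one multiply and one add per tap. The cycle values are the sample numbers from the table above, and the names op_cycles and fir_cycles are made up. The estimate ignores loads, stores, loop overhead and pipelining, so it is a rough comparison rather than a prediction.

```c
#include <stdint.h>

/* Cycle cost of the two operations a FIR tap needs. */
typedef struct {
    uint32_t add;
    uint32_t mul;
} op_cycles;

/* Sample figures from the benchmark table above. */
static const op_cycles fpu_cycles  = { 3u, 3u };
static const op_cycles soft_cycles = { 100u, 350u };

/* Arithmetic cycles for one output sample of a FIR filter. */
static uint32_t fir_cycles(const op_cycles *cost, uint32_t taps)
{
    return taps * (cost->mul + cost->add);
}
```

For a 64-tap filter this gives 384 cycles per sample with the FPU and 28,800 in software, a ratio of 75. At a 48 kHz sample rate that is about 18.4 million cycles per second with the FPU versus about 1.38 billion in software, which is beyond the clock rate of typical Cortex M parts.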

Power and Cost

The FPU increases chip cost, complexity, and power usage. Cortex M cores with FPUs have:

Higher gate counts – FPU is over 20k gates
Increased silicon area used
Added power usage even when FPU not used
Higher cost per unit for FPU versions

For low power or size constrained applications, avoiding the FPU can reduce system power and cost overheads. The impact varies based on specific Arm chip being used.

Software vs Hardware Tradeoffs

Here is a summary of the key tradeoffs between hardware and software floating point:

	Hardware FPU	Software Float
Performance	Much faster	Slower, but optimizable
Precision	Single only, or single and double, depending on core	Configurable precision
Code size	Much smaller	Larger code
Error handling	Defined by IEEE 754	Customizable
Portability	Code built for the FPU only runs on parts that have one	Works on any Cortex M
Power/Cost	Higher	Lower without FPU

The right choice depends on whether, for a given project, the benefits of hardware speed and size outweigh the need for software flexibility and portability.

Recommended Usage Guidelines

Based on the tradeoffs, here are some general guidelines on when to use hardware vs software floating point with Cortex M:

Use FPU for heavy floating point code to boost performance
Use FPU if code size constraints make software impractical
Use software float for portability across Cortex M devices
Use software if FPU cost or power are prohibitive
Use software float for custom precision needs
Use software if error handling requirements differ from IEEE 754

For performance critical applications doing significant floating point, favor using the FPU to speed up execution. In other cases where flexibility or portability are priorities, software floating point may be the better choice.

Conclusion

Hardware and software floating point both have benefits for Arm Cortex M chips. Hardware FPUs provide extremely fast floating point, while software gives portability and precision configurability. For lightweight floating point uses, software may be suitable, but for intensive processing, the massive speedup of hardware floating point is hard to ignore. By understanding the tradeoffs, developers can choose the best floating point approach for their particular application and constraints.
