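The ARM Cortex-M3 processor has no hardware support for floating point operations. The core implements the ARMv7-M architecture without the optional floating point extension, and it has no coprocessor interface through which a floating point unit (FPU) could be attached, so floating point math on a Cortex-M3 is carried out in software by the compiler's runtime libraries. Hardware floating point first appears in the Cortex-M family with the Cortex-M4, which can optionally include the FPv4-SP single precision unit. Omitting the FPU keeps Cortex-M3 based microcontrollers low in cost and power consumption, at the price of slower floating point math.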
Cortex M3 Floating Point Coprocessor
Because the Cortex-M3 cannot be fitted with an FPU, designs that need hardware floating point use a core that offers one, most commonly the Cortex-M4 with its optional FPU (often written Cortex-M4F). That FPU is not a loosely coupled coprocessor. It is integrated into the processor, and floating point instructions are issued in the normal instruction stream. For compatibility with the ARM coprocessor model, access to it is controlled through the CP10 and CP11 coprocessor access fields.
The FPU that ARM offers for the Cortex-M4 is the FPv4-SP, which implements the optional floating point extension of the ARMv7-M architecture. It provides single precision (32-bit) floating point operations compliant with the IEEE 754 standard. The FPv4-SP contains its own register file and dedicated hardware for addition, multiplication, division and square root, so these operations do not have to be built from integer instructions.
Microcontroller manufacturers integrate the FPv4-SP into Cortex-M4 based MCU designs to give them floating point abilities. For example, the STM32F4 series from STMicroelectronics includes it, whereas the Cortex-M3 based STM32F103xx series has no FPU and performs all floating point math in software.
Software Support
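Whether floating point runs in software on a Cortex-M3 or in hardware on an FPv4-SP, the software toolchain must be configured to match. This includes the compiler, assembler, linker and libraries.
For compilers, ARM’s armcc and GCC both support generating floating point code for Cortex-M targets. When targeting the Cortex-M3, the compiler emits calls to software floating point routines. When targeting a core with the FPv4-SP, it emits FPU instructions directly. With GCC the choice is made with the -mcpu, -mfpu and -mfloat-abi options, for example -mcpu=cortex-m3 -mfloat-abi=soft versus -mcpu=cortex-m4 -mfpu=fpv4-sp-d16 -mfloat-abi=hard.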
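A quick way to confirm which floating point configuration a build is actually using is to inspect the compiler's predefined macros. The following is a minimal sketch, assuming a GCC or Clang toolchain that defines __SOFTFP__ for software floating point builds and the ACLE macro __ARM_FP (a bit mask in which 0x04 indicates single precision and 0x08 double precision hardware support). The function name fp_target_description is hypothetical and chosen only for illustration.

```c
/* Describe the floating point configuration this translation unit was
 * compiled for. Assumption: __SOFTFP__ and __ARM_FP are predefined by the
 * compiler on Arm targets; neither is defined on a non-Arm host. */
const char *fp_target_description(void)
{
#if defined(__SOFTFP__)
    /* Typical for -mcpu=cortex-m3 -mfloat-abi=soft */
    return "software floating point";
#elif defined(__ARM_FP) && (__ARM_FP & 0x08)
    return "hardware single and double precision";
#elif defined(__ARM_FP) && (__ARM_FP & 0x04)
    /* Typical for -mcpu=cortex-m4 -mfpu=fpv4-sp-d16 */
    return "hardware single precision only";
#else
    return "no Arm floating point macros defined";
#endif
}
```

Printing the returned string once at startup, or checking the same macros with #error in a header, can catch a project that was accidentally built with the wrong floating point options.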
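Likewise, the assembler and linker need to support the FPv4-SP instruction set when handwritten assembly code uses it. Assembly code can use floating point instructions directly for maximum efficiency in time critical routines, but only on a core that has the FPU. On a Cortex-M3 these instructions raise a usage fault.
The C standard library that comes with the toolchain also needs to match the floating point configuration. Newlib, the open source C library shipped with many GCC based Cortex-M toolchains, is typically provided in several prebuilt variants, and the linker selects the one that matches the chosen floating point options. In software floating point builds, basic arithmetic is handled by compiler runtime helper routines such as __aeabi_fadd. In hardware floating point builds, the library uses FPv4-SP instructions for better performance.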
Enabling Floating Point in Code
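For compilers to generate floating point instructions, the toolchain must be configured for a target that has an FPU. Including the <math.h> header is not enough, since the header only declares the math library functions. On a core with an FPv4-SP, the FPU is also disabled at reset and must be enabled through the Coprocessor Access Control Register (CPACR) before the first floating point instruction executes.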
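On Cortex-M cores that do include an FPU, such as the Cortex-M4F, the unit is enabled by granting full access to coprocessors CP10 and CP11 in the CPACR register. The sketch below only computes the register value so that it can be compiled and checked on any host. The helper names are hypothetical, and the actual register write is described in the note that follows.

```c
#include <stdint.h>

/* The CP10 and CP11 access fields occupy bits 20 to 23 of CPACR.
 * Setting both fields to 0b11 grants full access to the FPU. */
#define CPACR_CP10_CP11_FULL_ACCESS (0xFu << 20)

/* Return the CPACR value with the FPU enabled, leaving other bits as is. */
uint32_t cpacr_with_fpu_enabled(uint32_t cpacr)
{
    return cpacr | CPACR_CP10_CP11_FULL_ACCESS;
}

/* Return nonzero if both CP10 and CP11 are set to full access. */
int cpacr_fpu_is_enabled(uint32_t cpacr)
{
    return (cpacr & CPACR_CP10_CP11_FULL_ACCESS) == CPACR_CP10_CP11_FULL_ACCESS;
}
```

On a real target using the CMSIS headers, startup code would typically write SCB->CPACR = cpacr_with_fpu_enabled(SCB->CPACR); and then execute __DSB() and __ISB() so that the change takes effect before any floating point instruction runs. Many vendor startup files already do this in SystemInit().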
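The <math.h> header declares floating point math functions such as sin(), cos() and sqrt(), which operate on double, along with single precision variants such as sinf(), cosf() and sqrtf(). On a Cortex-M3 all of them run in software. On a core with the FPv4-SP, the single precision variants are built from hardware floating point instructions, and sqrtf() can compile down to a single VSQRT.F32 instruction, while the double versions still run in software.
For assembly code, the IDE or assembler can be configured to accept instructions from the FPv4-SP instruction set when the target core includes the FPU. This allows hand optimized floating point routines to be written at the assembly level.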
Floating Point Register Set
The FPv4-SP contains a bank of 32 single precision (32-bit) registers, S0 to S31, which can also be addressed in pairs as 16 doubleword registers, D0 to D15, for loads, stores and moves. These registers store floating point values and are accessed using dedicated load/store instructions such as VLDR and VSTR.
The FPSCR (Floating Point Status and Control Register) holds the comparison condition flags, the cumulative exception flags and the rounding mode for the floating point unit. The FPCCR (Floating Point Context Control Register) controls how floating point state is saved on exception entry, including lazy stacking.
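As an illustration of how the control fields are laid out, the rounding mode is held in the RMode field, bits 23:22 of the FPSCR. The following sketch decodes and updates that field on a plain 32-bit value. The enum and function names are hypothetical. On a target with CMSIS, the value would be read and written with __get_FPSCR() and __set_FPSCR().

```c
#include <stdint.h>

/* RMode encodings in FPSCR bits 23:22. */
typedef enum {
    FP_ROUND_NEAREST   = 0, /* round to nearest, ties to even (reset default) */
    FP_ROUND_PLUS_INF  = 1, /* round towards plus infinity */
    FP_ROUND_MINUS_INF = 2, /* round towards minus infinity */
    FP_ROUND_ZERO      = 3  /* round towards zero */
} fp_round_mode;

#define FPSCR_RMODE_SHIFT 22u
#define FPSCR_RMODE_MASK  (0x3u << FPSCR_RMODE_SHIFT)

/* Extract the rounding mode from an FPSCR value. */
fp_round_mode fpscr_get_rounding(uint32_t fpscr)
{
    return (fp_round_mode)((fpscr & FPSCR_RMODE_MASK) >> FPSCR_RMODE_SHIFT);
}

/* Return an FPSCR value with the rounding mode replaced, other bits kept. */
uint32_t fpscr_set_rounding(uint32_t fpscr, fp_round_mode mode)
{
    return (fpscr & ~FPSCR_RMODE_MASK) | ((uint32_t)mode << FPSCR_RMODE_SHIFT);
}
```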
Data Types
The FPv4-SP supports only single precision floating point in hardware. On a core that includes it, the C floating point types are handled as follows:
- float – 32-bit single precision
- double – 64-bit double precision supported in software
The float type maps directly to the 32-bit single precision registers and operations of the FPv4-SP. This provides efficient storage of 32-bit floats and fast single precision arithmetic. On a Cortex-M3, which has no FPU, float arithmetic is performed by software library routines instead.
The double type provides 64-bit double precision floating point but is implemented in software on both cores. Double precision operations compile to calls to library routines that perform the calculation with integer instructions in the core registers, not in the FPU.
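One practical consequence is that accidental use of double can silently pull in slow software routines. In C, a floating point literal without an f suffix has type double, so the whole expression is evaluated in double precision. The sketch below contrasts the two cases. The function names are hypothetical.

```c
/* Stays in single precision: the literal 0.1f has type float, so on an
 * FPv4-SP target this can be a single hardware multiply. */
float scale_single(float x)
{
    return x * 0.1f;
}

/* The literal 0.1 has type double, so x is promoted to double, the
 * multiply is done in double precision (software on FPv4-SP and on
 * Cortex-M3), and the result is converted back to float. */
float scale_via_double(float x)
{
    return (float)(x * 0.1);
}
```

Both functions return nearly the same value, but the second costs a conversion in each direction plus a double precision multiply. GCC's -Wdouble-promotion warning can help find such promotions.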
Floating Point Instructions
The FPv4-SP instruction set includes load, store, move, arithmetic, comparison and conversion operations. These use the same VFP style mnemonics as the floating point support in Cortex-A series processors, restricted to single precision arithmetic for the Cortex-M profile. None of them are available on a Cortex-M3.
Key FPv4-SP instruction groups include:
- Load and store – VLDR/VSTR and VLDM/VSTM transfer between memory and FP registers
- Data movement – VMOV between ARM core and FP registers, VMRS/VMSR to read and write the FPSCR
- Arithmetic – VADD, VSUB, VMUL, VDIV, VSQRT, plus VMLA, VMLS, VFMA, VFMS and their negated forms for multiply-accumulate type operations
- Comparisons – VCMP, VCMPE
- Conversions – VCVT between floating point and integer or fixed point, VCVTB/VCVTT to and from half precision
These instructions operate on the 32-bit float registers to deliver high performance floating point calculations with lower power consumption and smaller code size than software libraries.
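To show how one of these instructions might be reached from C, the sketch below wraps a single precision multiply-accumulate. It assumes GCC style inline assembly, in which the "t" constraint selects a single precision FPU register, and it only uses the instruction when the compiler reports a 32-bit Arm target with hardware single precision support. On any other target, including a Cortex-M3 or a desktop host, it falls back to plain C. The function name mac_f32 is hypothetical.

```c
/* Multiply-accumulate: returns acc + a * b. */
float mac_f32(float acc, float a, float b)
{
#if defined(__arm__) && defined(__ARM_FP) && (__ARM_FP & 0x04) && !defined(__SOFTFP__)
    /* VMLA.F32 Sd, Sn, Sm computes Sd = Sd + Sn * Sm.
     * Assumption: GCC/Clang extended asm with the "t" register constraint. */
    __asm__ ("vmla.f32 %0, %1, %2" : "+t"(acc) : "t"(a), "t"(b));
    return acc;
#else
    /* Portable fallback for targets without the FPU. */
    return acc + a * b;
#endif
}
```

In practice the compiler usually selects such instructions on its own from ordinary C expressions, so inline assembly like this is only worth considering for carefully measured hot spots.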
Floating Point Exceptions
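The FPv4-SP reports floating point exceptions through cumulative flag bits in the FPSCR register: IOC (invalid operation), DZC (divide by zero), OFC (overflow), UFC (underflow), IXC (inexact) and IDC (input denormal). A flag is set when the corresponding condition occurs during a floating point operation and stays set until software clears it.
By default floating point exceptions are ignored and do not halt execution. The FPv4-SP does not trap them, so software detects errors by reading the FPSCR flags, although a microcontroller vendor may route the exception signals to an interrupt. A usage fault is raised only when a floating point instruction executes while the FPU is absent or disabled.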
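Checking for errors after a block of calculations therefore comes down to testing bits in the FPSCR. The sketch below works on a plain 32-bit value so it can be compiled anywhere. The macro and function names are hypothetical, and the bit positions are the architectural ones for the cumulative exception flags.

```c
#include <stdint.h>

/* Cumulative exception flags in the FPSCR. */
#define FPSCR_IOC (1u << 0) /* invalid operation */
#define FPSCR_DZC (1u << 1) /* divide by zero */
#define FPSCR_OFC (1u << 2) /* overflow */
#define FPSCR_UFC (1u << 3) /* underflow */
#define FPSCR_IXC (1u << 4) /* inexact */
#define FPSCR_IDC (1u << 7) /* input denormal */

#define FPSCR_EXC_MASK \
    (FPSCR_IOC | FPSCR_DZC | FPSCR_OFC | FPSCR_UFC | FPSCR_IXC | FPSCR_IDC)

/* Return nonzero if any cumulative exception flag is set. */
int fpscr_any_exception(uint32_t fpscr)
{
    return (fpscr & FPSCR_EXC_MASK) != 0u;
}

/* Return nonzero if a divide by zero has been recorded. */
int fpscr_divide_by_zero(uint32_t fpscr)
{
    return (fpscr & FPSCR_DZC) != 0u;
}

/* Return the FPSCR value with all cumulative flags cleared. */
uint32_t fpscr_clear_exceptions(uint32_t fpscr)
{
    return fpscr & ~FPSCR_EXC_MASK;
}
```

On a target with CMSIS, the value passed in would come from __get_FPSCR(), and the cleared value would be written back with __set_FPSCR() before the next block of calculations.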
Performance
For floating point intensive software, moving from software floating point on a Cortex-M3 to FPv4-SP hardware can provide significant performance improvements. Speedups of 4x to 10x are possible depending on the algorithms used and the mix of operations.
Software libraries use integer operations to simulate floating point, which requires many more clock cycles per operation. The dedicated hardware in the FPv4-SP executes most arithmetic instructions in 1 to 3 cycles, with divide and square root taking longer (14 cycles on the Cortex-M4), for much higher throughput.
For real-time DSP algorithms or complex math functions, the difference between hardware assisted and pure software floating point is even more substantial. Hardware acceleration enables signal processing and analytics workloads that are not practical with software floating point on a Cortex-M3.
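To give a feel for the work a software library has to do, the sketch below performs only the first step of any software floating point operation: splitting an IEEE 754 single precision value into its sign, exponent and fraction fields with integer operations. It assumes that float is the 32-bit IEEE 754 format, which holds on Cortex-M toolchains and common desktop hosts. The type and function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

typedef struct {
    uint32_t sign;     /* 1 bit: 0 positive, 1 negative */
    uint32_t exponent; /* 8 bits, biased by 127 */
    uint32_t fraction; /* 23 bits, implicit leading 1 for normal numbers */
} float_fields;

/* Split a single precision value into its IEEE 754 fields. */
float_fields unpack_float(float value)
{
    uint32_t bits;
    float_fields f;

    /* memcpy is the portable way to reinterpret the bit pattern. */
    memcpy(&bits, &value, sizeof bits);

    f.sign     = bits >> 31;
    f.exponent = (bits >> 23) & 0xFFu;
    f.fraction = bits & 0x7FFFFFu;
    return f;
}
```

A full software add or multiply follows this with exponent alignment, integer arithmetic on the fractions, normalization, rounding and special case handling for zeros, infinities and NaNs, which is why each operation costs many cycles without an FPU.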
Energy Efficiency
In addition to performance benefits, the FPv4-SP floating point unit reduces power consumption for floating point math compared to software libraries. Because each floating point operation takes fewer CPU clock cycles, less energy is used per calculation.
Software floating point routines often require significantly higher CPU clock frequencies to achieve adequate performance. Hardware acceleration with FPv4-SP enables the CPU to run at lower frequencies since floating point is offloaded.
The dedicated floating point datapath can also complete an operation with fewer instruction fetches than the equivalent sequence of integer instructions. Together these effects can improve energy efficiency substantially, making floating point capabilities more practical for power sensitive devices.
Conclusion
The Cortex-M3 has no floating point hardware and performs floating point math in software, while the optional FPv4-SP unit of the Cortex-M4 enables high performance single precision math in hardware. With compiler and library support, the difference is mostly transparent to developers, allowing familiar APIs and data types to be used on either core.
For microcontroller applications requiring floating point capabilities like signal processing, control systems, and data analytics, a core with the FPv4-SP provides a strong combination of low cost, low power, and high performance. The Cortex-M3 remains a reasonable choice when floating point use is light enough for software routines to keep up.