How to Enable the FPU in Cortex-M4 Microcontrollers?

Scott Allen
Last updated: October 5, 2023 9:56 am

The Cortex-M4 processor can include an optional single-precision floating point unit (FPU); parts that have it are often labeled Cortex-M4F. The FPU can significantly improve performance for applications using floating point math. However, it is disabled by default and must be explicitly enabled before it can be used. This article provides a step-by-step guide on how to enable the FPU in Cortex-M4 based microcontrollers.


Overview of the Cortex-M4 FPU

The FPU in the Cortex-M4 implements the FPv4-SP floating point extension of the ARMv7E-M architecture. It supports single-precision (32-bit) floating point data types and operations compliant with the IEEE 754 standard. Key features of the Cortex-M4 FPU include:

  • Executes most arithmetic instructions (add, subtract, multiply) in a single cycle, while divide and square root take 14 cycles
  • Operates on 32-bit single precision floating point values
  • Provides full hardware support for converting between float and integer values
  • Implements commonly used mathematical operations like add, subtract, multiply, divide, square root, etc.
  • Uses registers s0-s31 for floating point values
  • Shares system resources like buses, memory, and peripherals with the CPU core

The FPU significantly boosts the performance of floating point code compared with software emulation. Speedups of 3x-10x are commonly quoted, but the actual gain depends on the instruction mix and should be measured on the target. This makes the FPU very beneficial for DSP algorithms, 3D graphics, control systems, and other applications using floating point calculations.

Enabling the FPU in Cortex-M4 Devices

The Cortex-M4 FPU is disabled out of reset. To use it, startup code must explicitly grant access to the floating point coprocessors, which is usually done during system initialization. There are two steps to consider:

  1. Enable FPU access in the Coprocessor Access Control Register (CPACR)
  2. Confirm or configure lazy stacking for efficient exception handling

Only the first step is mandatory. Executing an FPU instruction before access is enabled raises a UsageFault with the NOCP bit set, which escalates to a HardFault if the UsageFault handler is not enabled. Lazy stacking is already enabled at reset, so the second step matters mainly when earlier code has changed it. The following sections explain the steps in more detail.

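1. Enable FPU Access in CPACR

The Coprocessor Access Control Register (CPACR), located at address 0xE000ED88 in the System Control Block, controls access to coprocessors in Cortex-M4 processors. The FPU is accessed as coprocessors CP10 and CP11, and both access fields must be set to full access before FPU instructions can execute. The similarly named Auxiliary Control Register (ACTLR) is a different register and does not enable the FPU.

To enable the FPU, set bits 20-23 in CPACR, then execute DSB and ISB barriers so the change takes effect before the first FPU instruction:

// Enable full access to CP10 and CP11 in CPACR
SCB->CPACR |= 0x00F00000;
__DSB();
__ISB();

CPACR is generally configured very early during boot, because C runtime startup and library code built with hard-float options may itself use FPU registers. In CMSIS-based projects this is typically done in SystemInit().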
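The enable step can be sketched without a vendor header, assuming direct register access. The address 0xE000ED88 is the architectural location of the Coprocessor Access Control Register (CPACR) on ARMv7-M; the helper names `fpu_grant_access` and `fpu_access_granted`, and the choice to pass the register as a pointer, are hypothetical and only there so the bit manipulation can also be checked on a host machine.

```c
#include <stdint.h>

/* Architectural address of the Coprocessor Access Control Register (ARMv7-M). */
#define CPACR_ADDR       0xE000ED88u
/* CP10 is bits [21:20], CP11 is bits [23:22]; 0b11 in each field = full access. */
#define CPACR_CP10_CP11  (0xFu << 20)

/* Hypothetical helper: grants full CP10/CP11 access through the given pointer.
 * On a target, pass (volatile uint32_t *)CPACR_ADDR and follow the call with
 * DSB and ISB barriers before the first FPU instruction executes. */
static void fpu_grant_access(volatile uint32_t *cpacr)
{
    *cpacr |= CPACR_CP10_CP11;
}

/* Returns 1 when both coprocessor fields read back as full access. */
static int fpu_access_granted(uint32_t cpacr_value)
{
    return (cpacr_value & CPACR_CP10_CP11) == CPACR_CP10_CP11;
}
```

Setting only bit 20 is not enough: each access field is two bits wide, and both CP10 and CP11 must read 0b11, which is why the mask is 0x00F00000.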

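2. Enable Lazy Stacking for Exceptions

Once a context has used the FPU, exception entry must also preserve floating point state (s0-s15 and FPSCR), which costs stack space and latency. Lazy stacking optimizes this: the processor reserves stack space for the floating point registers on exception entry but only writes them if the handler itself executes a floating point instruction.

Lazy stacking is controlled by the ASPEN (bit 31) and LSPEN (bit 30) bits of the Floating-Point Context Control Register (FPCCR) at address 0xE000EF34, not by the CONTROL register. Both bits are set at reset, so lazy stacking is normally already active. To set them explicitly:

// Enable automatic and lazy floating point state preservation
FPU->FPCCR |= 0xC0000000;

This keeps overhead minimal for exceptions whose handlers run only integer code. If FPCCR is changed, do so after enabling access in CPACR and before any use of the FPU.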
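As a rough sketch of the lazy stacking configuration, the snippet below sets the ASPEN and LSPEN bits of the Floating-Point Context Control Register (FPCCR, architecturally at 0xE000EF34). The helper names are hypothetical, and the register is passed as a pointer only so the logic can be exercised off-target. Both bits are already set at reset on the Cortex-M4, so on most systems this is a confirmation rather than a required step.

```c
#include <stdint.h>

#define FPCCR_ADDR   0xE000EF34u  /* Floating-Point Context Control Register */
#define FPCCR_ASPEN  (1u << 31)   /* automatic state preservation enable */
#define FPCCR_LSPEN  (1u << 30)   /* lazy state preservation enable */

/* Hypothetical helper: turns on automatic and lazy FP state preservation.
 * On a target, pass (volatile uint32_t *)FPCCR_ADDR. */
static void fpu_enable_lazy_stacking(volatile uint32_t *fpccr)
{
    *fpccr |= FPCCR_ASPEN | FPCCR_LSPEN;
}

/* Returns 1 when both preservation bits are set in a value read from FPCCR. */
static int fpu_lazy_stacking_enabled(uint32_t fpccr_value)
{
    const uint32_t mask = FPCCR_ASPEN | FPCCR_LSPEN;
    return (fpccr_value & mask) == mask;
}
```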

Modifying Compiler Settings

After enabling the FPU in hardware, compiler settings need to be modified to generate code using floating point instructions. This requires configuring the compiler to:

  • Use hardware floating point calling convention
  • Use FPU registers instead of soft-float emulation
  • Perform FPU-specific optimizations

Exact compiler settings depend on which toolchain you are using. Some common examples are shown below:

GNU ARM Toolchain

arm-none-eabi-gcc -mcpu=cortex-m4 -mthumb -mfloat-abi=hard -mfpu=fpv4-sp-d16
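One way to confirm that these options took effect is to check the predefined macros from the Arm C Language Extensions (ACLE). The sketch below assumes a compiler that defines `__ARM_FP` (a bitmask in which 0x4 indicates hardware single precision) and `__ARM_PCS_VFP` (defined when the hard-float calling convention is in use), as GCC does; the function name `fp_build_config` is hypothetical.

```c
/* Returns a short description of the floating point configuration this
 * translation unit was compiled with, based on ACLE predefined macros. */
static const char *fp_build_config(void)
{
#if defined(__ARM_FP) && (__ARM_FP & 0x4) && defined(__ARM_PCS_VFP)
    return "hardware single precision, hard-float ABI";
#elif defined(__ARM_FP) && (__ARM_FP & 0x4)
    return "hardware single precision, integer-register calling convention";
#elif defined(__arm__)
    return "software floating point";
#else
    return "not a 32-bit Arm build";
#endif
}
```

With the command line above, the first branch should be selected. Building the same file with -mfloat-abi=soft should fall through to the software floating point branch.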

ARM Compiler 5

armcc --cpu=Cortex-M4.fp --fpmode=fast

IAR Embedded Workbench

--cpu_mode thumb --fpu=VFPv4_sp

Consult your compiler documentation for exact options. The key flags are enabling hardware float ABI and selecting the fpv4-sp FPU architecture.

Changes to Source Code

Aside from compiler settings, the following source code changes should be made when using the FPU:

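  • Use float instead of double for floating point values, and add the f suffix to literals (1.5f rather than 1.5); the FPU is single precision only, so double arithmetic still runs in software
  • Call the single-precision math functions from math.h, such as sinf(), cosf(), and sqrtf(), instead of the double-precision sin(), cos(), and sqrt()
  • Link against a math library (libm.a) built for the same hard-float ABI as the application
  • Do not wrap float to integer conversions in __disable_irq() and __enable_irq(); the conversion is a single VCVT instruction, and the hardware preserves FPU state across exceptions

With these changes, existing code should work correctly using the hardware FPU. Results can differ in the low-order bits from a build that previously computed in double precision.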
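The effect of literal suffixes can be illustrated with a small sketch. The filter function and the two `sizeof` helpers below are hypothetical examples; they show that multiplying a float by an unsuffixed literal promotes the expression to double, which a single-precision FPU cannot execute in hardware.

```c
#include <stddef.h>

/* One step of a first-order low-pass filter, written entirely in float so
 * every operation can map to a single-precision FPU instruction. */
static float lowpass_step(float state, float sample, float alpha)
{
    return state + alpha * (sample - state);
}

/* Size of the type the compiler uses for each product. The operand of
 * sizeof is not evaluated; only its type matters. */
static size_t product_size_float_literal(float x)
{
    return sizeof(x * 0.5f);   /* float * float -> float */
}

static size_t product_size_double_literal(float x)
{
    return sizeof(x * 0.5);    /* float * double -> double */
}
```

With GCC, the -Wdouble-promotion warning can help find places where a float is implicitly promoted to double.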

Tips for Using the FPU

Here are some additional tips for working with the Cortex-M4 FPU:

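  • Avoid floating point in interrupt handlers where possible; with lazy stacking, a handler that executes no FPU instructions never pays the cost of saving floating point registers.
  • Use float rather than double in performance critical code, since only single-precision operations run on the FPU.
  • Keep in mind that integer or fixed-point arithmetic can be as fast as float for simple operations; the FPU's advantage is over software floating point emulation.
  • Measure cycle counts between soft-float emulation and FPU to quantify performance gains.
  • Monitor stack usage: once a context has used the FPU, each exception frame grows from 8 words (32 bytes) to 26 words (104 bytes).
  • Enable the FPU before any hard-float code runs, including in debug sessions that skip the normal startup code; otherwise the first FPU instruction faults.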
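To put a number on the stack cost mentioned in the tips above, the following sketch computes exception frame sizes. It assumes the ARMv7-M frame layouts: a basic frame of 8 words (R0-R3, R12, LR, PC, xPSR) and an extended frame of 26 words (the basic frame plus S0-S15, FPSCR, and one reserved word). The function names are hypothetical, and the estimate ignores alignment padding and each handler's own locals.

```c
#include <stdint.h>

#define BASIC_FRAME_WORDS     8u   /* R0-R3, R12, LR, PC, xPSR */
#define EXTENDED_FRAME_WORDS  26u  /* basic frame + S0-S15 + FPSCR + reserved */

/* Bytes pushed by hardware on exception entry for one frame. */
static uint32_t exception_frame_bytes(int fp_context_active)
{
    return 4u * (fp_context_active ? EXTENDED_FRAME_WORDS : BASIC_FRAME_WORDS);
}

/* Rough worst case for a chain of preempting exceptions, assuming every
 * interrupted context has the same FP state. */
static uint32_t worst_case_frame_bytes(uint32_t nesting_depth,
                                       int fp_context_active)
{
    return nesting_depth * exception_frame_bytes(fp_context_active);
}
```

For three nested exceptions, the hardware-pushed frames alone grow from 96 bytes to 312 bytes when floating point context is active. Lazy stacking defers the register writes but still reserves the space.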

Debugging and Profiling the FPU

It can take some effort to efficiently utilize the Cortex-M4 FPU. Here are some techniques for debugging and profiling floating point code:

  • Set breakpoints on floating point instructions like VADD, VDIV, etc.
  • Single step through code to verify correct registers are used.
  • Print out emulation vs. FPU cycle counts for code segments.
  • Generate assembly listing to analyze compiler output.
  • Check lazy stacking behavior during exceptions by monitoring the FPCA bit in the CONTROL register and the LSPACT bit in FPCCR.
  • Measure FPU impact on interrupt latency and context switching.
  • Use debugger to view and modify FPU register contents.
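When inspecting FPU state in a debugger, it can help to decode the relevant status bits from raw register values. The sketch below assumes the ARMv7-M bit positions: FPCA is bit 2 of CONTROL, LSPACT is bit 0 of FPCCR, and NOCP is bit 3 of the UsageFault status, which is bit 19 of the combined CFSR at 0xE000ED28. The struct and function names are hypothetical.

```c
#include <stdint.h>

#define CONTROL_FPCA  (1u << 2)   /* floating point context active */
#define FPCCR_LSPACT  (1u << 0)   /* lazy state preservation pending */
#define CFSR_NOCP     (1u << 19)  /* UsageFault: coprocessor access denied */

typedef struct {
    int fp_context_active;     /* current context has used the FPU */
    int lazy_save_pending;     /* frame space reserved, registers not yet saved */
    int no_coprocessor_fault;  /* FPU instruction ran while access was disabled */
} fpu_debug_state;

/* Decodes values read from CONTROL, FPCCR, and CFSR (for example copied out
 * of a debugger's register view) into individual flags. */
static fpu_debug_state fpu_decode_state(uint32_t control, uint32_t fpccr,
                                        uint32_t cfsr)
{
    fpu_debug_state s;
    s.fp_context_active    = (control & CONTROL_FPCA) != 0u;
    s.lazy_save_pending    = (fpccr & FPCCR_LSPACT) != 0u;
    s.no_coprocessor_fault = (cfsr & CFSR_NOCP) != 0u;
    return s;
}
```

A set NOCP flag after a fault is a strong hint that hard-float code ran before CP10 and CP11 access was enabled in CPACR.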

With careful debugging and profiling, the true performance benefits of the Cortex-M4 FPU can be realized.
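Cycle comparisons between soft-float and FPU builds usually come from a free-running 32-bit counter. The sketch below assumes the part implements the optional DWT cycle counter (DWT_CYCCNT); the helper names are hypothetical, and the two cycle counts would come from timing the same routine in each build.

```c
#include <stdint.h>

/* Elapsed cycles between two reads of a free-running 32-bit counter.
 * Unsigned subtraction gives the right answer across a single wraparound. */
static uint32_t cycles_elapsed(uint32_t start, uint32_t end)
{
    return end - start;
}

/* Speedup of the FPU build over the soft-float build, in percent
 * (800 means 8x). Integer math keeps the measurement code free of
 * floating point. Returns 0 if the FPU cycle count is zero. */
static uint32_t speedup_percent(uint32_t softfloat_cycles, uint32_t fpu_cycles)
{
    if (fpu_cycles == 0u) {
        return 0u;
    }
    return (uint32_t)(((uint64_t)softfloat_cycles * 100u) / fpu_cycles);
}
```

On hardware, the counter has to be started first by setting TRCENA in DEMCR and CYCCNTENA in DWT_CTRL.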

Conclusion

Enabling the FPU in Cortex-M4 microcontrollers requires granting CP10 and CP11 access in CPACR, checking the lazy stacking configuration in FPCCR, and selecting matching compiler settings. This activates floating point hardware support for significant performance gains in math-heavy code. With the FPU enabled, existing code can benefit from hardware acceleration with only minor source modifications. Overall, the Cortex-M4 FPU is an extremely useful feature for applications that rely on floating point calculations.
