What causes hard fault in arm cortex?

Eileen David
Last updated: September 13, 2023 6:40 am

A hard fault on an ARM Cortex-M processor is the catch-all exception taken when an error cannot be handled by any other fault handler. It interrupts normal program execution and transfers control to the HardFault handler. Hard faults indicate serious problems such as hardware failures, memory faults, or invalid instruction execution that the system cannot handle gracefully. Identifying the root cause of a hard fault is key to resolving the issue and restoring proper functionality.

Contents

  • Invalid memory access
  • Unaligned memory access
  • Integer divide by zero
  • Invalid instructions and opcode issues
  • Stack overflows
  • Floating point exceptions
  • Bus faults
  • Undefined exceptions
  • Debug events
  • OS task switching errors
  • Power, clock and EMI issues
  • Identifying Root Cause

There are several potential causes of hard faults on ARM Cortex chips:

Invalid memory access

One major cause of hard faults is invalid memory accesses. This occurs when code attempts to read or write to restricted regions of memory or access memory using invalid addresses. Examples include:

  • Accessing null pointer addresses
  • Reading or writing outside array bounds
  • Executing code from invalid addresses
  • Stack overflow errors corrupting the stack memory

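On Cortex-M3, M4, and M7, an access that violates a Memory Protection Unit (MPU) region generates a MemManage exception, and an access to an address that no memory or peripheral responds to generates a BusFault. Either one escalates to a hard fault if its handler is not enabled or cannot run. Cortex-M0 and M0+ have no separate configurable fault exceptions, so these errors are reported as hard faults directly. Cortex-M processors have no Memory Management Unit (MMU); enabling the MPU and programming it correctly helps catch invalid accesses close to their source.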
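
The null pointer and out-of-bounds cases listed above can often be rejected in software before they reach the bus. The following is a minimal sketch, not a definitive implementation; the helper name table_read and its signature are assumptions made for this example and do not come from any ARM or vendor API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: read table[index] only when the access is valid.
 * Returns false instead of dereferencing a NULL table or reading past
 * the end of the array. */
bool table_read(const uint16_t *table, size_t len, size_t index,
                uint16_t *out)
{
    assert(out != NULL);            /* caller must supply a result slot */
    if (table == NULL || index >= len) {
        return false;               /* reject instead of faulting */
    }
    *out = table[index];
    return true;
}
```

A caller checks the return value and takes an error path when it is false, so a bad index becomes a handled condition rather than a fault.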

Unaligned memory access

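Unaligned memory accesses read or write data at addresses that are not integer multiples of the data size. For example, a 32-bit read from address 0x123 is unaligned. The Cortex-M0 and Cortex-M0+ do not support unaligned accesses at all, so any unaligned access leads to a hard fault on those processors. The Cortex-M3 and Cortex-M4 handle unaligned single loads and stores (such as LDR and STR) in hardware, but still fault on unaligned multi-word accesses such as LDM, STM, LDRD, and STRD.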

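Aligning data structures properly and avoiding casts of byte buffers to wider pointer types can prevent this issue. On Cortex-M3 and Cortex-M4, setting the UNALIGN_TRP bit in SCB->CCR makes every unaligned access raise a UsageFault. This does not prevent the fault; it makes unaligned accesses visible during development so they can be fixed.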
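
One common way to avoid alignment-related faults is to copy bytes with memcpy rather than casting a byte pointer to a wider type. The sketch below assumes a value stored in the target's native byte order; the function names load_u32 and store_u32 are illustrative.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Read a 32-bit value from a possibly unaligned address.
 * memcpy lets the compiler choose an access sequence that is legal on
 * the target, unlike a *(uint32_t *)p cast, which may compile to a
 * load that requires alignment. */
uint32_t load_u32(const void *p)
{
    uint32_t v;
    assert(p != NULL);
    memcpy(&v, p, sizeof v);
    return v;
}

/* Store counterpart of load_u32. */
void store_u32(void *p, uint32_t v)
{
    assert(p != NULL);
    memcpy(p, &v, sizeof v);
}
```

Typical use is parsing a field at an odd offset in a receive buffer, for example load_u32(&buf[1]).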

Integer divide by zero

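Division by zero with the SDIV and UDIV instructions does not fault by default on Cortex-M3 and Cortex-M4: the result is simply zero. When the DIV_0_TRP bit in SCB->CCR is set, a zero divisor raises a UsageFault, which escalates to a hard fault if the UsageFault handler is not enabled. Cortex-M0 and M0+ have no hardware divide instructions, so division there is performed by library routines. Rigorously checking divisors before dividing avoids both the fault and silently wrong results.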
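
Operand checking can be wrapped in a small helper so that every division goes through the same guard. This is a sketch under the assumption that the caller wants an error return instead of a trap; checked_div_i32 is a hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Divide num by den, refusing the two inputs that have no valid
 * 32-bit result: a zero divisor, and INT32_MIN / -1 (overflow). */
bool checked_div_i32(int32_t num, int32_t den, int32_t *quot)
{
    assert(quot != NULL);
    if (den == 0) {
        return false;
    }
    if (num == INT32_MIN && den == -1) {
        return false;
    }
    *quot = num / den;
    return true;
}
```

The quotient is written only on success, so a stale value in *quot is never mistaken for a result.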

Invalid instructions and opcode issues

Execution of undefined or invalid opcodes generates a UsageFault exception that escalates to a hard fault if the UsageFault handler is not enabled. Potential causes include:

  • Memory corruption changing instruction opcodes
  • Jumping to non-executable memory addresses
  • Improper code modifications via JTAG/SWD
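  • Coprocessor instructions for a coprocessor that is absent or disabled
  • Instructions from an extension the core does not implement, such as DSP or floating point instructions in code built for a different Cortex-M variant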

Enabling the MPU to limit instruction execution to verified memory regions can mitigate invalid opcode related hard faults.
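
A corrupted function pointer is a frequent route to an invalid opcode. As a hedged sketch, a callback address can be sanity-checked before it is called: Cortex-M executes only Thumb code, so a valid branch target has bit 0 set, and it should fall inside the code region. The function name and the address range passed in are assumptions for illustration; real bounds would come from the linker script.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns true when addr looks like a valid Thumb function pointer:
 * bit 0 is set and the instruction address lies in
 * [code_start, code_end). */
bool is_plausible_thumb_target(uint32_t addr, uint32_t code_start,
                               uint32_t code_end)
{
    assert(code_start < code_end);
    if ((addr & 1u) == 0u) {
        return false;               /* would request ARM state */
    }
    uint32_t insn = addr & ~1u;     /* actual instruction address */
    return insn >= code_start && insn < code_end;
}
```

This does not prove the target is a real function, only that it is not obviously wrong.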

Stack overflows

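The processor stack holds return addresses, function parameters, saved registers, and local variables for each subroutine call. A stack overflow caused by deep nesting, recursion, or large stack allocations overwrites whatever memory lies below the stack. If an MPU guard region protects that memory, the overflow raises a MemManage fault; if the stack pointer leaves valid RAM, the result is a BusFault. Either one escalates to a hard fault when its handler is not enabled. Without such protection the overflow silently corrupts data, and the hard fault may appear much later.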

Stack overflows can be avoided by:

  • Increasing the stack size appropriately
  • Profiling stack usage to catch overflow issues
  • Minimizing large stack allocations
  • Avoiding deep or unbounded recursion
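
Stack usage can be profiled with a fill pattern ("stack painting"): the stack is filled with a known value at startup, and later the untouched words are counted. The sketch below simulates this on an ordinary array; the pattern value and function names are assumptions, and on a real target the array would be the stack region defined by the linker script.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_FILL 0xDEADBEEFu  /* arbitrary pattern chosen for this sketch */

/* Fill the whole stack region with the pattern before it is used. */
void stack_paint(uint32_t *stack, size_t words)
{
    assert(stack != NULL);
    for (size_t i = 0; i < words; i++) {
        stack[i] = STACK_FILL;
    }
}

/* Cortex-M stacks grow downward, so index 0 is the last word that
 * would be used. Count the words from the bottom that still hold the
 * pattern; zero means the stack was fully used or has overflowed. */
size_t stack_unused_words(const uint32_t *stack, size_t words)
{
    assert(stack != NULL);
    size_t n = 0;
    while (n < words && stack[n] == STACK_FILL) {
        n++;
    }
    return n;
}
```

Logging the unused count periodically shows how much margin remains before the size needs to be increased.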

Floating point exceptions

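The Cortex-M4 and Cortex-M7 cores can include a hardware floating point unit (FPU). Floating point operations that divide by zero, underflow, overflow, or are otherwise invalid do not trap on these cores; they set cumulative status flags in the FPSCR register and return a default result such as infinity or NaN. The hard faults typically associated with floating point code come from executing FPU instructions before the FPU has been enabled in the CPACR register, which raises a UsageFault that escalates to a hard fault if unhandled.

Proper input validation and checking the FPSCR exception flags after critical calculations can catch these conditions early, before bad values propagate through the system.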
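
On cores with an FPU, floating point exception conditions are recorded as cumulative flag bits in the FPSCR register. The sketch below decodes an FPSCR value passed in as a plain integer so that it can run anywhere; on a target the value would typically be read with the CMSIS intrinsic __get_FPSCR(). The choice of which flags count as errors is an assumption of this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Cumulative exception flags in FPSCR. */
#define FPSCR_IOC (1u << 0)  /* invalid operation */
#define FPSCR_DZC (1u << 1)  /* division by zero */
#define FPSCR_OFC (1u << 2)  /* overflow */
#define FPSCR_UFC (1u << 3)  /* underflow */
#define FPSCR_IXC (1u << 4)  /* inexact */
#define FPSCR_IDC (1u << 7)  /* input denormal */

/* Assumed policy: inexact and denormal inputs are tolerated. */
#define FP_ERROR_MASK (FPSCR_IOC | FPSCR_DZC | FPSCR_OFC | FPSCR_UFC)
#define FP_ALL_FLAGS  (FP_ERROR_MASK | FPSCR_IXC | FPSCR_IDC)

bool fp_error_pending(uint32_t fpscr)
{
    return (fpscr & FP_ERROR_MASK) != 0u;
}

/* Value to write back so the next calculation starts with clean flags. */
uint32_t fp_clear_flags(uint32_t fpscr)
{
    return fpscr & ~FP_ALL_FLAGS;
}

/* Usage example with a made-up FPSCR value. */
void fp_flags_example(void)
{
    uint32_t fpscr = FPSCR_DZC | FPSCR_IXC;
    assert(fp_error_pending(fpscr));
    assert(!fp_error_pending(fp_clear_flags(fpscr)));
}
```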

Bus faults

Bus faults indicate an error occurred during instruction or data bus transactions. These could arise from:

  • External memory errors – ECC errors, timing violations
  • Flash memory errors – ECC errors, access timing issues
  • System bus contention with peripherals leading to wait state violations
  • Memory controller configuration issues – incorrect timing parameters

Bus faults can be debugged by checking memory interfaces and buses for electrical or timing issues. The ARM CoreSight components like ETM trace can help record bus transactions leading up to the fault.
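
In software, the first step is usually to check whether the bus fault status bits report a precise fault with a valid address. The sketch below takes the CFSR and BFAR values as arguments; on Cortex-M3 and later they would be read from SCB->CFSR and SCB->BFAR. The function name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bus fault status bits as they appear in the 32-bit CFSR. */
#define CFSR_IBUSERR     (1u << 8)   /* instruction fetch error */
#define CFSR_PRECISERR   (1u << 9)   /* precise data access error */
#define CFSR_IMPRECISERR (1u << 10)  /* imprecise data access error */
#define CFSR_BFARVALID   (1u << 15)  /* BFAR holds the fault address */

/* Writes the faulting address and returns true only when the fault is
 * precise and BFAR is flagged valid. For imprecise faults the address
 * is not available, so the function returns false. */
bool bus_fault_address(uint32_t cfsr, uint32_t bfar, uint32_t *addr)
{
    assert(addr != NULL);
    if ((cfsr & CFSR_PRECISERR) != 0u && (cfsr & CFSR_BFARVALID) != 0u) {
        *addr = bfar;
        return true;
    }
    return false;
}
```

When only IMPRECISERR is set, the stacked program counter points past the instruction that caused the error, which is why trace data is often needed for those cases.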

Undefined exceptions

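On Cortex-M processors, attempting to execute an instruction that is undefined for the current processor state raises a UsageFault (older ARM cores report this as an Undefined Instruction exception). For example:

  • Attempting to execute ARM instructions on a Thumb-only core
  • Branching to an address with bit 0 clear, which requests a switch to ARM state that Cortex-M does not support
  • Executing data or corrupted memory that does not decode to a valid Thumb instruction

A conditional instruction that fails its condition check is simply skipped and does not fault. Building all code for the Thumb instruction set of the target core, and making sure function pointers and vector table entries have bit 0 set, prevents undefined exceptions.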

Debug events

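The debug logic can trigger debug events such as breakpoints, watchpoints, and vector catches. If such an event occurs when no debugger is attached and the DebugMonitor exception is not enabled, for example a BKPT instruction left in release code, it escalates to a hard fault and sets the DEBUGEVT bit in HFSR. Removing breakpoint instructions before code release prevents these faults. The FAULTMASK register should not be used to suppress them: it blocks the HardFault handler itself, so a fault taken while it is set puts the processor into lockup.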
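
One way to keep software breakpoints from faulting in the field is to execute them only when halting debug is enabled. The sketch below shows the check on a DHCSR value passed as an argument; on cores where software can read the register it would come from CoreDebug->DHCSR. The function names are assumptions for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* C_DEBUGEN is bit 0 of the Debug Halting Control and Status Register. */
#define DHCSR_C_DEBUGEN (1u << 0)

/* True when a debugger has enabled halting debug, so a BKPT would
 * halt the core instead of escalating to a hard fault. */
bool halting_debug_enabled(uint32_t dhcsr)
{
    return (dhcsr & DHCSR_C_DEBUGEN) != 0u;
}

/* Usage example with made-up register values. */
void debug_check_example(void)
{
    assert(halting_debug_enabled(0x00000001u));
    assert(!halting_debug_enabled(0x00000000u));
}
```

Firmware would then guard the breakpoint with this predicate and fall back to logging or a reset when it returns false.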

OS task switching errors

In RTOS based systems, task switching can sometimes trigger hard faults. Common causes include:

  • Stack overflow during task switch corrupting stack memory
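  • Requesting a task switch with an SVC instruction while interrupts are disabled, so the SVCall exception cannot be taken and escalates to a hard fault
  • Misconfigured priorities for the exceptions the scheduler relies on, such as SVCall, PendSV, and SysTick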
  • Trying to switch to invalid or non-existing tasks

Analysis of the task switching patterns and scheduler state helps isolate OS related hard faults.
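
Part of that analysis can be automated by validating a task's saved stack pointer before switching to it. The following sketch uses a hypothetical task control block; the structure and field names are illustrative and are not taken from any particular RTOS.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical task control block. */
typedef struct {
    uint32_t *stack_base;   /* lowest address of the task stack */
    size_t    stack_words;  /* stack size in 32-bit words */
    uint32_t *saved_sp;     /* stack pointer saved at the last switch-out */
} task_t;

/* Returns true when the task exists and its saved stack pointer lies
 * within its own stack, including the empty-stack position at the top. */
bool task_sp_in_bounds(const task_t *t)
{
    if (t == NULL) {
        return false;       /* invalid or non-existing task */
    }
    assert(t->stack_base != NULL && t->stack_words > 0);
    uintptr_t lo = (uintptr_t)t->stack_base;
    uintptr_t hi = (uintptr_t)(t->stack_base + t->stack_words);
    uintptr_t sp = (uintptr_t)t->saved_sp;
    return sp >= lo && sp <= hi;
}
```

A scheduler could call this just before restoring context and divert to an error handler when it fails, which turns a later hard fault into an immediate, attributable report.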

Power, clock and EMI issues

Incorrect power or clock configurations can also lead to hard faults. Examples include:

  • Brownout issues corrupting processor state during voltage drops
  • PLL losing lock due to board noise or poor layout
  • Clock glitches during system state changes
  • Excessive Electromagnetic Interference (EMI) disrupting processor operation

Careful review of power supply stability, clock trees, and board layout is needed to identify potential faults from these sources.

Identifying Root Cause

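When a hard fault exception occurs, the ARM Cortex-M processor stops normal execution and enters the hard fault handler. Register and stack contents provide crucial clues to the fault origin (the fault status and address registers below exist on Cortex-M3 and later, not on Cortex-M0 and M0+):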

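  • HFSR – HardFault Status Register indicates the source of the hard fault: FORCED for an escalated fault, VECTTBL for a vector table read error, DEBUGEVT for a debug event
  • CFSR – Configurable Fault Status Register combines the MMFSR, BFSR, and UFSR status fields
  • MMFAR – MemManage Fault Address Register holds the fault address for memory management faults, valid only when MMARVALID is set
  • BFAR – BusFault Address Register holds the fault address for bus faults, valid only when BFARVALID is set
  • Stacked PC – the program counter saved in the exception stack frame indicates the instruction that triggered the fault
  • Stacked LR – the link register saved in the exception stack frame holds the interrupted function's return address; inside the handler, the live LR holds the EXC_RETURN code instead
  • Other stacked registers and local variables help recreate the full context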
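
The registers above can be decoded with a few small helpers. This sketch works on plain integers so that it runs on a host; on a target, the stack pointer would be captured in the handler and the status words read from SCB->HFSR and SCB->CFSR. It assumes the basic eight-word exception frame without FPU state, and all type and function names are illustrative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Basic exception frame, in ascending address order from the stack
 * pointer at handler entry. */
typedef struct {
    uint32_t r0, r1, r2, r3, r12;
    uint32_t lr;    /* return address of the interrupted function */
    uint32_t pc;    /* usually at or near the faulting instruction */
    uint32_t xpsr;
} fault_frame_t;

/* Copy the eight stacked words into a frame structure. */
fault_frame_t frame_read(const uint32_t *sp)
{
    fault_frame_t f;
    assert(sp != NULL);
    memcpy(&f, sp, sizeof f);
    return f;
}

/* EXC_RETURN bit 2 tells which stack holds the frame: 0 = MSP, 1 = PSP. */
bool frame_is_on_psp(uint32_t exc_return)
{
    return (exc_return & (1u << 2)) != 0u;
}

#define HFSR_VECTTBL  (1u << 1)
#define HFSR_FORCED   (1u << 30)
#define HFSR_DEBUGEVT (1u << 31)

/* FORCED means another fault escalated; the cause is then in CFSR. */
bool hfsr_is_escalated(uint32_t hfsr)
{
    return (hfsr & HFSR_FORCED) != 0u;
}

typedef struct {
    uint8_t  mmfsr;  /* CFSR bits 7:0 */
    uint8_t  bfsr;   /* CFSR bits 15:8 */
    uint16_t ufsr;   /* CFSR bits 31:16 */
} cfsr_fields_t;

cfsr_fields_t cfsr_split(uint32_t cfsr)
{
    cfsr_fields_t c;
    c.mmfsr = (uint8_t)(cfsr & 0xFFu);
    c.bfsr  = (uint8_t)((cfsr >> 8) & 0xFFu);
    c.ufsr  = (uint16_t)(cfsr >> 16);
    return c;
}
```

A handler would typically log the decoded fields and the stacked PC and LR to non-volatile storage before resetting, so the fault can be analyzed afterwards.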

Trace outputs from CoreSight components like Embedded Trace Macrocell (ETM) or Data Watchpoint and Trace (DWT) unit can also provide detailed history of program flow, data access, bus transactions etc. leading up to the fault event.

For hard faults during development, debuggers such as Segger Ozone, Eclipse-based IDEs, and vendor IDEs provide debug, inspection, and tracing tools. For faults after deployment, trace captured through the CoreSight System Trace Macrocell (STM) can prove invaluable.

With the root cause identified, developers can apply fixes like firmware upgrades, hardware design changes, or software patches to resolve underlying issues and prevent future hard fault occurrences.

In summary, hard faults on ARM Cortex processors can arise from a range of software and hardware issues. Thoughtful programming and robust system design can eliminate many common causes. Recurring hard faults point to systemic underlying problems that require dedicated investigation, analysis, and remediation.
