The Cortex-M series of ARM processors is extremely popular in embedded systems thanks to low cost, low power consumption, and good performance. All Cortex-M variants have built-in fault handling, with the HardFault exception acting as the catch-all handler for faults that no other fault exception handles, either because the dedicated exception is disabled or because the variant does not implement it. While the HardFault mechanism is consistent across Cortex-M variants, there are behavioral differences to be aware of when migrating code between variants or debugging HardFaults.
Root Causes of HardFaults
Some common root causes of HardFault exceptions include:
- Attempting to execute an undefined instruction
- Accessing invalid or misaligned memory addresses
- Dividing by zero (on variants with hardware divide, when divide-by-zero trapping is enabled)
- Stack overflow or stack pointer corruption
- Invalid exception return, for example a corrupted EXC_RETURN value
- Faulty exception handlers
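How these faults reach HardFault varies between Cortex-M variants. On the Cortex-M3 and later mainline variants, most of them first raise a MemManage, BusFault, or UsageFault exception and reach HardFault only by escalation. On the Cortex-M0 and Cortex-M0+, every fault goes straight to HardFault.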
Handler Mode and Stack Usage
On all Cortex-M variants, the HardFault handler executes in Handler mode using the main stack pointer (MSP). The exception stack frame, however, is pushed onto whichever stack was active when the fault occurred: the process stack (PSP) if Thread mode code was running on the PSP, otherwise the main stack.
For example, a fault inside an RTOS task typically leaves its frame on the process stack, while a fault inside an interrupt handler leaves it on the main stack. Bit 2 of the EXC_RETURN value placed in LR on handler entry records which stack holds the frame, so the handler must check it before reading the stacked registers. Reading the wrong stack complicates root cause analysis; reading the right one gives direct insight into the origin of the fault.
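As a minimal sketch of locating the stack frame, the function below picks between the two stack pointers based on bit 2 of EXC_RETURN. The function name and parameters are hypothetical; on real hardware the three inputs would be captured from LR, MSP, and PSP at the very start of the handler, usually in a few lines of assembly.

```c
#include <stdint.h>

/* Bit 2 of EXC_RETURN: 0 = frame is on the main stack, 1 = on the process stack. */
#define EXC_RETURN_SPSEL (1u << 2)

/* Hypothetical helper: returns the address of the exception stack frame,
 * given the EXC_RETURN value from LR and the MSP/PSP values at handler entry. */
uint32_t fault_frame_address(uint32_t exc_return, uint32_t msp, uint32_t psp)
{
    return (exc_return & EXC_RETURN_SPSEL) ? psp : msp;
}
```

With the common EXC_RETURN value 0xFFFFFFFD (return to Thread mode using the PSP) the function selects the process stack; with 0xFFFFFFF9 or 0xFFFFFFF1 it selects the main stack.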
Automatic Stack Limit Checking
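Armv8-M variants such as the Cortex-M33 provide stack limit registers (MSPLIM and PSPLIM). When a stack pointer update would move below the configured limit, the processor raises a UsageFault with the STKOF status bit set, which escalates to HardFault if UsageFault is disabled. The check fires before memory below the limit is overwritten, which prevents stack corruption. On the Cortex-M23, stack limit registers are available only for the Secure state, and a violation is reported as a HardFault.
Variants based on Armv6-M and Armv7-M, including the Cortex-M0+, Cortex-M3, Cortex-M4, and Cortex-M7, have no hardware stack limit checking. A stack overflow silently overwrites adjacent memory, and a HardFault occurs only later, if at all, when the corrupted data is used or the stack pointer leaves valid memory. On variants with an MPU, a guard region below the stack is a common workaround.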
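The limit comparison itself is simple enough to model in software, which can be handy for a host-side stack budget check. The sketch below assumes the Armv8-M rule that a stack pointer value below the limit register is a violation; the function name is hypothetical and the code is a model of the comparison, not a description of the hardware implementation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Models a stack limit check for a full-descending stack: pushing
 * bytes_to_push from sp is a violation if the new stack pointer would
 * fall below splim. */
bool stack_limit_violation(uint32_t sp, uint32_t bytes_to_push, uint32_t splim)
{
    if (bytes_to_push > sp) {
        return true;  /* the subtraction would wrap below address zero */
    }
    return (sp - bytes_to_push) < splim;
}
```

A push that lands exactly on the limit is allowed in this model; one word further is flagged.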
Fault Status Registers
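On the Cortex-M3 and later mainline variants, the HardFault handler can read the Configurable Fault Status Register (CFSR) and the HardFault Status Register (HFSR) to determine the origin of a fault. The CFSR combines the MemManage, BusFault, and UsageFault status fields and reports causes such as divide-by-zero, an invalid PC load on exception return, or an undefined instruction. The HFSR reports whether the HardFault was escalated from a configurable fault (FORCED), caused by a vector table read error (VECTTBL), or caused by a debug event (DEBUGEVT).
However, not all Cortex-M variants have the same complement of status registers. The Armv6-M variants (Cortex-M0, Cortex-M0+, and Cortex-M1) implement neither the CFSR nor the HFSR, which limits fault diagnosis to the stacked registers. The Cortex-M33, when built with the Security Extension, adds the Secure Fault Status Register (SFSR) for diagnosing security violations. The Debug Fault Status Register (DFSR) is not specific to the newer cores; it is also present on the Cortex-M3 and Cortex-M4.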
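To illustrate how a handler or a post-mortem tool might interpret a captured CFSR value, here is a minimal table-driven decoder. The bit positions follow the Armv7-M CFSR layout (STKOF exists only on Armv8-M Mainline); the struct, table, and function names are hypothetical, and the table lists only a subset of the defined flags.

```c
#include <stddef.h>
#include <stdint.h>

struct cfsr_flag {
    uint32_t mask;
    const char *name;
};

/* Subset of CFSR flags, ordered from bit 0 upward. */
static const struct cfsr_flag cfsr_flags[] = {
    { 1u << 0,  "IACCVIOL"    },  /* MemManage: instruction access violation */
    { 1u << 1,  "DACCVIOL"    },  /* MemManage: data access violation */
    { 1u << 7,  "MMARVALID"   },  /* MMFAR holds a valid address */
    { 1u << 8,  "IBUSERR"     },  /* BusFault: instruction bus error */
    { 1u << 9,  "PRECISERR"   },  /* BusFault: precise data bus error */
    { 1u << 10, "IMPRECISERR" },  /* BusFault: imprecise data bus error */
    { 1u << 15, "BFARVALID"   },  /* BFAR holds a valid address */
    { 1u << 16, "UNDEFINSTR"  },  /* UsageFault: undefined instruction */
    { 1u << 17, "INVSTATE"    },  /* UsageFault: invalid execution state */
    { 1u << 18, "INVPC"       },  /* UsageFault: invalid PC load on return */
    { 1u << 20, "STKOF"       },  /* UsageFault: stack overflow (Armv8-M) */
    { 1u << 24, "UNALIGNED"   },  /* UsageFault: unaligned access trap */
    { 1u << 25, "DIVBYZERO"   },  /* UsageFault: divide-by-zero trap */
};

/* Stores the names of the set flags in out (at most max entries) and
 * returns how many were stored. */
size_t cfsr_decode(uint32_t cfsr, const char **out, size_t max)
{
    size_t count = 0;
    for (size_t i = 0; i < sizeof cfsr_flags / sizeof cfsr_flags[0]; i++) {
        if ((cfsr & cfsr_flags[i].mask) && count < max) {
            out[count++] = cfsr_flags[i].name;
        }
    }
    return count;
}
```

On a target with CMSIS headers, the input would typically come from SCB->CFSR; on Armv6-M parts there is no such register to read.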
Fault Address Registers
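The Cortex-M3 and later mainline variants provide two fault address registers: the MemManage Fault Address Register (MMFAR) and the BusFault Address Register (BFAR). The Cortex-M0, Cortex-M0+, and Cortex-M1 lack both, making it difficult to identify an invalid memory access without disassembling the instruction at the stacked PC or using external debug hardware.
A fault address register holds a meaningful value only when the matching valid bit in the CFSR (MMARVALID or BFARVALID) is set. Imprecise bus faults, typically caused by buffered writes, do not record an address at all. The Cortex-M3 and Cortex-M4 have no stack limit checking, so neither register directly reports a stack overflow address on those cores.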
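A handler should therefore gate any use of the address registers on the valid bits. The following sketch shows that check on captured register values; the function name and parameter names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define CFSR_MMARVALID (1u << 7)   /* MMFAR contents are valid */
#define CFSR_BFARVALID (1u << 15)  /* BFAR contents are valid */

/* Writes the faulting address to *addr and returns true if either address
 * register is flagged valid; returns false (leaving *addr untouched) when
 * no address was recorded, as with an imprecise bus fault. */
bool fault_address(uint32_t cfsr, uint32_t mmfar, uint32_t bfar, uint32_t *addr)
{
    if (cfsr & CFSR_MMARVALID) {
        *addr = mmfar;
        return true;
    }
    if (cfsr & CFSR_BFARVALID) {
        *addr = bfar;
        return true;
    }
    return false;
}
```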
Stacked Return Address
Cortex-M processors have no dedicated return address register for faults. On every variant, exception entry pushes the program counter onto the stack as part of the exception stack frame, alongside R0-R3, R12, LR, and xPSR. The HardFault handler recovers the faulting location by reading the stacked PC.
How precisely the stacked PC identifies the culprit depends on the fault type. For precise faults, the stacked PC is the address of the faulting instruction. For imprecise bus faults, the processor may have executed several more instructions before the fault was taken, so the stacked PC only points near the cause. Variants with write buffers and caches, such as the Cortex-M7, make imprecise faults more likely.
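On every Cortex-M variant the program counter at the time of the fault is available in the basic exception stack frame, which is eight words long: R0, R1, R2, R3, R12, LR, PC, xPSR. A minimal sketch of reading it from a copy of the frame follows; the struct and function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Basic exception stack frame, lowest address first. */
struct basic_frame {
    uint32_t r0, r1, r2, r3, r12, lr, pc, xpsr;
};

/* Returns the stacked PC from the eight words starting at frame_words. */
uint32_t stacked_pc(const uint32_t *frame_words)
{
    struct basic_frame frame;
    memcpy(&frame, frame_words, sizeof frame);  /* avoids aliasing issues */
    return frame.pc;
}
```

Looking up the returned address in the linker map or with a tool such as addr2line usually identifies the faulting function.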
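Exception Return and Floating-Point Context
Cortex-M processors have no separate exception link register. On exception entry the processor loads LR with an EXC_RETURN value that encodes how to return: which mode to return to, which stack pointer holds the frame, and which type of frame was stacked.
Variants with a floating-point unit, such as the Cortex-M4F and FPU-equipped builds of the Cortex-M7 and Cortex-M33, can push an extended frame that also holds S0-S15 and FPSCR. Bit 4 of EXC_RETURN distinguishes the basic frame from the extended one. A HardFault handler ported from an FPU-less variant such as the Cortex-M3 must account for the larger frame when reconstructing the pre-fault stack pointer.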
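On variants with a floating-point unit, the size of the stacked frame depends on bit 4 of the EXC_RETURN value in LR, and bit 9 of the stacked xPSR records whether a padding word was inserted for alignment. The sketch below reconstructs the stack pointer value from before the fault under those rules; the function name is hypothetical, and the frame sizes are 32 bytes for the basic frame and 104 bytes for the extended one.

```c
#include <stdint.h>

#define EXC_RETURN_FTYPE (1u << 4)  /* 1 = basic frame, 0 = extended (FP) frame */
#define XPSR_STACK_ALIGN (1u << 9)  /* set when a 4-byte padding word was added */

/* Returns the stack pointer value in effect just before exception entry. */
uint32_t pre_fault_sp(uint32_t frame_addr, uint32_t exc_return,
                      uint32_t stacked_xpsr)
{
    uint32_t size = (exc_return & EXC_RETURN_FTYPE) ? 32u : 104u;
    if (stacked_xpsr & XPSR_STACK_ALIGN) {
        size += 4u;
    }
    return frame_addr + size;
}
```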
Debug Fault Handling
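When a debugger is attached and halting debug is enabled, a breakpoint halts the processor on every Cortex-M variant. The variants differ when a BKPT instruction executes with no debugger attached. On the Cortex-M3 and later mainline variants, the debug event goes to the DebugMonitor exception if it is enabled; otherwise it escalates to HardFault and sets the DEBUGEVT bit in the HFSR. On the Cortex-M0 and Cortex-M0+, which have no DebugMonitor exception, the BKPT instruction results in a HardFault.
A HardFault that appears only when running without a debugger is therefore often a leftover breakpoint or semihosting call rather than a true fault. On variants that implement the HFSR, the DEBUGEVT bit confirms this.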
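On variants that implement the HardFault Status Register, its bits separate breakpoint-induced HardFaults from genuine ones. A minimal sketch of that classification, with a hypothetical function name and return strings chosen for illustration:

```c
#include <stdint.h>

#define HFSR_VECTTBL  (1u << 1)   /* bus error on a vector table read */
#define HFSR_FORCED   (1u << 30)  /* escalated from a configurable fault */
#define HFSR_DEBUGEVT (1u << 31)  /* debug event such as BKPT with no debugger */

/* Returns a short description of why the HardFault was taken. */
const char *hardfault_origin(uint32_t hfsr)
{
    if (hfsr & HFSR_DEBUGEVT) {
        return "debug event";
    }
    if (hfsr & HFSR_FORCED) {
        return "escalated configurable fault";
    }
    if (hfsr & HFSR_VECTTBL) {
        return "vector table read error";
    }
    return "unknown";
}
```

When the result is an escalated configurable fault, the CFSR holds the underlying cause.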
HardFault Priority
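On every Cortex-M variant, the fixed exception priorities are Reset (-3), the Non-Maskable Interrupt (NMI, -2), and HardFault (-1), so NMI outranks HardFault and can preempt the HardFault handler. The Armv8-M variants with the Security Extension (Cortex-M23 and Cortex-M33) add one exception to this rule: when AIRCR.BFHFNMINS is set, Secure HardFault runs at priority -3, above NMI.
Because an NMI can preempt the HardFault handler in most configurations, an NMI handler that changes state the HardFault handler is examining, or that itself faults, can alter fault visibility in corner-case debug scenarios.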
Lockup Fault Detection
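Lockup is an architectural state present on all Cortex-M variants, not an optional extension. The processor enters lockup when a fault occurs while it is already running at priority -1 or higher, for example a bus fault inside the HardFault or NMI handler, or a failed vector fetch for the HardFault handler itself. It is not a detector for deadlock or live-lock; those conditions require a separate watchdog timer.
Lockup does not trigger a HardFault, since it results from a fault that could not be handled. The processor stays in lockup until it is reset, halted by a debugger, or, if lockup was entered from the HardFault handler, preempted by an NMI. Many microcontrollers route the processor's LOCKUP signal to the reset controller so that the system recovers automatically.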
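Architecturally, lockup is tied to the fixed exception priorities: a fault that needs the HardFault handler cannot be taken if the processor is already running at HardFault priority or above. The sketch below models that decision; the enum and function names are hypothetical, and lower numbers mean higher priority as in the architecture.

```c
/* Fixed priorities defined by the architecture. */
enum {
    PRIO_RESET     = -3,
    PRIO_NMI       = -2,
    PRIO_HARDFAULT = -1
};

enum fault_outcome {
    TAKE_HARDFAULT,  /* HardFault can preempt the current code */
    ENTER_LOCKUP     /* nothing left to escalate to */
};

/* Decides what happens when a fault requiring HardFault occurs while the
 * processor runs at current_priority. */
enum fault_outcome fault_escalation(int current_priority)
{
    return (current_priority <= PRIO_HARDFAULT) ? ENTER_LOCKUP : TAKE_HARDFAULT;
}
```

This is why HardFault and NMI handlers should be kept short and should avoid touching memory that may itself be the cause of the fault.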
Memory Protection Units
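Many Cortex-M variants integrate an optional memory protection unit (MPU) to control access permissions for memory regions. On the Cortex-M3 and later mainline variants, a violation raises the MemManage fault if it is enabled in the System Handler Control and State Register (SHCSR); the dedicated handler and its status bits improve debuggability. If MemManage is disabled, the violation escalates to HardFault.
On the Cortex-M0+, which offers an optional MPU but no MemManage exception, MPU violations enter HardFault directly, lumped together with other causes like undefined instructions. The Cortex-M0 has no MPU option.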
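On the Cortex-M3 and later mainline variants, whether a fault reaches its dedicated handler or HardFault depends on enable bits in the System Handler Control and State Register (SHCSR). The sketch below models that routing on a captured SHCSR value. The function and enum names are hypothetical, the handler names follow the usual CMSIS convention, and the model ignores the additional case where a fault escalates because the current priority prevents its handler from running.

```c
#include <stdint.h>

#define SHCSR_MEMFAULTENA (1u << 16)
#define SHCSR_BUSFAULTENA (1u << 17)
#define SHCSR_USGFAULTENA (1u << 18)

enum fault_kind { FAULT_MEMMANAGE, FAULT_BUS, FAULT_USAGE };

/* Returns the name of the handler that would service the fault. */
const char *fault_handler_for(enum fault_kind kind, uint32_t shcsr)
{
    switch (kind) {
    case FAULT_MEMMANAGE:
        if (shcsr & SHCSR_MEMFAULTENA) return "MemManage_Handler";
        break;
    case FAULT_BUS:
        if (shcsr & SHCSR_BUSFAULTENA) return "BusFault_Handler";
        break;
    case FAULT_USAGE:
        if (shcsr & SHCSR_USGFAULTENA) return "UsageFault_Handler";
        break;
    }
    return "HardFault_Handler";  /* disabled faults escalate */
}
```

After reset all three enable bits are clear, so every fault lands in HardFault until startup code sets them.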
Exception Tracing
Many Cortex-M variants include optional trace features. The Cortex-M3, Cortex-M4, Cortex-M7, and Cortex-M33 can emit exception entry and exit events through the Data Watchpoint and Trace unit (DWT) and the Instrumentation Trace Macrocell (ITM), and can be built with an Embedded Trace Macrocell (ETM) for full instruction trace. The Cortex-M0+ offers neither, but can include a Micro Trace Buffer (MTB) that records recent branches in on-chip RAM.
Trace improves root cause analysis compared to variants without tracing capabilities. It is particularly useful for diagnosing HardFaults whose stacked context is unreliable, such as those that follow stack corruption or an invalid exception return.
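On variants with a DWT and ITM, exception trace is switched on through two register bits: TRCENA in the Debug Exception and Monitor Control Register (DEMCR) and EXCTRCENA in DWT_CTRL. The sketch below only computes the updated register values; the function names are hypothetical, and a working setup also needs the ITM and the trace port configured, which a debug probe usually handles.

```c
#include <stdint.h>

#define DEMCR_TRCENA       (1u << 24)  /* global enable for DWT and ITM */
#define DWT_CTRL_EXCTRCENA (1u << 16)  /* emit exception entry/exit packets */

/* Returns the DEMCR value with trace enabled, other bits preserved. */
uint32_t demcr_enable_trace(uint32_t demcr)
{
    return demcr | DEMCR_TRCENA;
}

/* Returns the DWT_CTRL value with exception trace enabled. */
uint32_t dwt_enable_exception_trace(uint32_t dwt_ctrl)
{
    return dwt_ctrl | DWT_CTRL_EXCTRCENA;
}
```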
Fault Handling Extensions
Some Cortex-M variants add features that enrich fault handling. The Armv8-M variants add hardware stack limit registers, which catch a stack overflow at the moment it occurs rather than after memory has been corrupted.
The Cortex-M33 with the Security Extension also adds the Security Attribution Unit (SAU) and a dedicated SecureFault exception, which reports illegal accesses and transitions between the Secure and Non-secure states. If SecureFault is disabled, these violations escalate to a Secure HardFault.
Fault Isolation Extensions
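Mainline variants (Cortex-M3, Cortex-M4, Cortex-M7, and Cortex-M33) provide more control and configurability over fault handling than the baseline variants, where every fault ends in HardFault. Key capabilities include:
- Separate MemManage, BusFault, and UsageFault exceptions, each enabled individually
- A configurable priority for each of these fault exceptions
- Optional traps for divide-by-zero and unaligned access, selected in the Configuration and Control Register (CCR)
- On the Cortex-M33 with the Security Extension, stack pointers and fault status registers banked per security state
These give developers more control over tailoring fault handling behaviors compared to the fixed HardFault handling of the Cortex-M0 and Cortex-M0+.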
Conclusion
While the HardFault mechanism provides a baseline fault handling capability across all Cortex-M variants, there are a number of behavioral differences in HardFault triggering, register usage, debuggability, extensions, and configurability across variants.
Software architects should carefully assess these differences when selecting Cortex-M processors or migrating code between variants. HardFault debugging may require tuning expectations and debug flows to match the specific strengths and weaknesses of each variant’s HardFault implementation.