Stack overflows are a common source of crashes and vulnerabilities in embedded systems built on Arm Cortex-M microcontrollers. A stack overflow occurs when a program's stack grows past its allocated region, corrupting adjacent memory that may hold data critical to the system’s operation. Cortex-M processors based on the Armv8-M architecture, such as the Cortex-M33, include stack limit registers that detect an overflow in hardware before it causes a system failure. Older cores such as the Cortex-M0, M3, M4 and M7 lack these registers and must rely on other techniques.
Cortex-M Stack Organization
The Cortex-M uses a full-descending stack: it grows from high memory addresses toward lower ones, and the stack pointer (SP) always holds the address of the most recently pushed item. Pushing data decrements the SP and popping data increments it. The memory region reserved for each stack is fixed by the linker script or by the RTOS. On Armv8-M Mainline processors the lowest permitted address of that region can be programmed into the stack limit registers MSPLIM and PSPLIM. If an operation would move the SP below the limit address, the processor treats it as a stack overflow.
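The push and pop behaviour described above can be sketched as a small host-side model. This is only an illustration, not a description of the hardware implementation: the `stack_model_t` type and its functions are hypothetical names, and the model assumes 32-bit words and an 8-byte-aligned limit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical host-side model of a full-descending stack with a lower limit. */
typedef struct {
    uint32_t sp;     /* current stack pointer */
    uint32_t limit;  /* lowest address the stack pointer may take */
    uint32_t top;    /* initial stack pointer (highest address) */
} stack_model_t;

void stack_model_init(stack_model_t *s, uint32_t top, uint32_t limit)
{
    assert(s != 0 && limit <= top);
    assert((top % 8u) == 0u && (limit % 8u) == 0u);
    s->sp = top;
    s->limit = limit;
    s->top = top;
}

/* Push n_words 32-bit words: the stack pointer moves DOWN.
 * Returns false and leaves the stack pointer unchanged if the push
 * would take it below the limit (the modelled overflow). */
bool stack_model_push(stack_model_t *s, uint32_t n_words)
{
    uint32_t free_words = (s->sp - s->limit) / 4u;
    if (n_words > free_words) {
        return false;
    }
    s->sp -= n_words * 4u;
    return true;
}

/* Pop n_words 32-bit words: the stack pointer moves UP.
 * Returns false if more words are requested than were pushed. */
bool stack_model_pop(stack_model_t *s, uint32_t n_words)
{
    uint32_t used_words = (s->top - s->sp) / 4u;
    if (n_words > used_words) {
        return false;
    }
    s->sp += n_words * 4u;
    return true;
}
```

In this model a failed push stands in for the fault that real hardware would raise; the addresses passed to `stack_model_init` are arbitrary example values.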
MSP and PSP Stack Pointers
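The Cortex-M has two banked stack pointers, MSP and PSP. The Main Stack Pointer (MSP) is selected after reset and is always used in handler mode, that is, by exception and interrupt handlers. Thread mode uses the MSP by default, but switches to the Process Stack Pointer (PSP) when the SPSEL bit of the CONTROL register is set. An RTOS typically runs the kernel and handlers on the MSP and application threads on the PSP, which isolates thread stacks from the handler stack. On Armv8-M Mainline each pointer has its own limit register, MSPLIM or PSPLIM, holding the lowest valid stack address. If the stack would grow below that limit, the processor raises a fault.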
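The architectural rule for which stack pointer is active (handler mode always uses the MSP, thread mode uses the PSP only when CONTROL.SPSEL is set) can be written as a small pure function. The enum and function names below are illustrative assumptions; only the SPSEL bit position (bit 1 of CONTROL) comes from the architecture.

```c
#include <assert.h>
#include <stdint.h>

typedef enum { MODE_THREAD, MODE_HANDLER } cpu_mode_t;
typedef enum { STACK_MAIN, STACK_PROCESS } stack_sel_t;

/* CONTROL.SPSEL is bit 1 of the CONTROL register. */
#define CONTROL_SPSEL_MSK (1u << 1)

/* Returns which stack pointer the core would use for the given mode
 * and CONTROL register value. */
stack_sel_t active_stack(cpu_mode_t mode, uint32_t control)
{
    assert(mode == MODE_THREAD || mode == MODE_HANDLER);

    if (mode == MODE_HANDLER) {
        return STACK_MAIN;  /* handlers always run on the MSP */
    }
    return (control & CONTROL_SPSEL_MSK) ? STACK_PROCESS : STACK_MAIN;
}
```

Because MSPLIM and PSPLIM are checked against their own pointer, the same function also tells you which limit register guards the code that is currently running.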
Enabling Stack Limit Checking
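Stack limit checking has no separate enable bit. The limit registers reset to zero, so the check has no practical effect until software writes a real lower bound to MSPLIM or PSPLIM. A violation is reported as a UsageFault with the STKOF bit set in the UsageFault Status Register (UFSR). UsageFault is disabled out of reset, so unless software sets the USGFAULTENA bit in the System Handler Control and State Register (SHCSR), the fault escalates to HardFault. The STKOFHFNMIGN bit lives in the Configuration and Control Register (CCR), not in the CONTROL register, and it does not set a priority: when it is set, stack limit violations are ignored while code runs at a priority below zero, that is, inside a HardFault or NMI handler. The priority of the UsageFault exception itself is configured separately through the system handler priority registers.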
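The two configuration bits involved can be sketched as follows. The bit positions (SHCSR.USGFAULTENA at bit 18, CCR.STKOFHFNMIGN at bit 10) match the Armv8-M register layout; the function name and the idea of passing register pointers are assumptions made so the sketch can run on a host. On a target using CMSIS one would pass `&SCB->SHCSR` and `&SCB->CCR`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SHCSR_USGFAULTENA_MSK (1u << 18)  /* enable the UsageFault exception */
#define CCR_STKOFHFNMIGN_MSK  (1u << 10)  /* ignore limit faults in HardFault/NMI */

/* Hypothetical helper: enables UsageFault so that stack limit violations
 * are reported there instead of escalating to HardFault, and selects
 * whether violations are ignored while running HardFault or NMI handlers. */
void stack_fault_config(volatile uint32_t *shcsr,
                        volatile uint32_t *ccr,
                        bool ignore_in_hardfault_nmi)
{
    assert(shcsr != NULL && ccr != NULL);

    *shcsr |= SHCSR_USGFAULTENA_MSK;

    if (ignore_in_hardfault_nmi) {
        *ccr |= CCR_STKOFHFNMIGN_MSK;
    } else {
        *ccr &= ~CCR_STKOFHFNMIGN_MSK;
    }
}
```

Leaving STKOFHFNMIGN clear is the stricter choice; setting it can be useful when a HardFault handler must keep running on an already exhausted main stack.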
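Setting Stack Limits at Startup
MSPLIM and PSPLIM define the lower boundary of the main and process stacks respectively. The hardware does not load them from the linker script: both reset to zero, and startup code must write the limit, usually taken from a linker-defined symbol that marks the lowest address of the stack region. CMSIS provides the __set_MSPLIM() and __set_PSPLIM() intrinsics for this purpose. The low three bits of each limit register read as zero, so limits are 8-byte aligned. The stack grows down from its initial high address toward the limit, and an operation that would take the stack pointer below the limit generates the exception.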
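Deriving the two boundary values from a stack region is simple arithmetic, sketched here under the assumption that the limit must be 8-byte aligned and should never fall outside the reserved region. The function names are hypothetical; on a target the results would be written with the CMSIS intrinsics `__set_MSPLIM()` or `__set_PSPLIM()` and used as the initial stack pointer.

```c
#include <assert.h>
#include <stdint.h>

/* Lowest usable stack address: round the region base UP to 8 bytes so the
 * limit stays inside the reserved region. */
uint32_t stack_limit_for_region(uint32_t base, uint32_t size_bytes)
{
    uint32_t limit = (base + 7u) & ~UINT32_C(7);

    assert(size_bytes >= 16u);          /* region must hold at least one aligned slot */
    assert(limit - base < size_bytes);  /* aligned limit is still inside the region */
    return limit;
}

/* Initial stack pointer: round the region end DOWN to 8 bytes. */
uint32_t stack_top_for_region(uint32_t base, uint32_t size_bytes)
{
    return (base + size_bytes) & ~UINT32_C(7);
}
```

For a region that is already 8-byte aligned both functions return the region boundaries unchanged.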
Setting Stack Limits in Software
The stack limit registers can also be rewritten at run time by privileged software. The most common use is in a real-time OS, where every thread has its own stack: on each context switch the kernel loads PSPLIM with the lowest address of the incoming thread's stack, alongside the thread's saved PSP. Because thread stacks can be created and destroyed dynamically, the limit always follows whichever stack is currently active, and each stack can be sized to what its thread needs.
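A per-thread limit update during a context switch might look like the sketch below. It is a host-side model, not code from any particular RTOS: `tcb_t`, `cpu_regs_t` and `switch_thread` are hypothetical, and the `cpu_regs_t` structure stands in for the real PSP and PSPLIM registers, which on a target would be accessed with `__get_PSP()`, `__set_PSP()` and `__set_PSPLIM()`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical thread control block: one saved stack pointer and one
 * stack limit per thread. */
typedef struct {
    uint32_t saved_psp;    /* PSP value saved when the thread was switched out */
    uint32_t stack_limit;  /* lowest valid address of this thread's stack */
} tcb_t;

/* Stand-in for the process stack registers of the core. */
typedef struct {
    uint32_t psp;
    uint32_t psplim;
} cpu_regs_t;

/* Save the outgoing thread's stack pointer, then install the incoming
 * thread's stack pointer together with its own limit. */
void switch_thread(cpu_regs_t *cpu, tcb_t *prev, const tcb_t *next)
{
    assert(cpu != NULL && next != NULL);
    assert(next->saved_psp >= next->stack_limit);  /* incoming stack is within bounds */

    if (prev != NULL) {
        prev->saved_psp = cpu->psp;
    }
    cpu->psplim = next->stack_limit;
    cpu->psp = next->saved_psp;
}
```

Storing the limit in the thread control block costs one word per thread and keeps the check tied to whichever stack is currently in use.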
Exception Handling on Stack Overflow
When a stack limit violation occurs, the processor takes the UsageFault exception, or HardFault if UsageFault is not enabled, and sets the STKOF bit in the UFSR. The handler can test this bit to distinguish a stack overflow from other usage faults. Recovery is rarely practical. An RTOS might terminate the offending thread, but for most applications a stack overflow indicates a serious bug, and the handler simply records diagnostic information and halts or resets the system to prevent further corruption.
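A fault handler can identify a stack limit violation by testing the STKOF flag. The sketch below assumes the Armv8-M Mainline layout, where the UFSR occupies the upper half of the Configurable Fault Status Register (CFSR) and STKOF is therefore bit 20 of CFSR. The function and enum names are hypothetical; on a target using CMSIS the argument would be `&SCB->CFSR`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* UFSR.STKOF is bit 4 of the UFSR, which sits in CFSR[31:16]. */
#define CFSR_STKOF_MSK (1u << 20)

typedef enum { FAULT_OTHER, FAULT_STACK_OVERFLOW } fault_kind_t;

/* Classify the pending fault from the fault status register value. */
fault_kind_t classify_usage_fault(const volatile uint32_t *cfsr)
{
    assert(cfsr != NULL);

    if ((*cfsr & CFSR_STKOF_MSK) != 0u) {
        return FAULT_STACK_OVERFLOW;
    }
    return FAULT_OTHER;
}
```

A real handler would typically log the result and then halt or reset; on hardware the status bits are cleared by writing a one to them, which this host-side sketch does not model.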
Debugging Stack Overflows
During development, a debugger can be set to halt when the fault handler is entered, either with a breakpoint in the handler or with the debugger's vector catch feature. The exception stack frame saved on fault entry holds the program counter and link register of the interrupted code, which identify the instruction that caused the overflow. From there, developers can walk back through the call sequence to find the function with an oversized stack frame or an unexpectedly deep call chain.
Stack Overflow Protection Levels
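Stack overflow protection on Cortex-M can be applied at two complementary levels:
- Level 1: hardware stack limit checking with MSPLIM and PSPLIM
- Level 2: hardware limit checking plus compiler-based stack protection
Level 1 raises an exception when the stack pointer would drop below its limit. Level 2 adds compiler-inserted canary values, for example with the -fstack-protector option of GCC and Clang, to detect buffer overruns that corrupt a stack frame without moving the stack pointer.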
Enhanced Stack Overflow Detection
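Support for stack limit checking varies across Cortex-M variants. The Cortex-M33, an Armv8-M Mainline core, implements MSPLIM and PSPLIM, and when TrustZone for Armv8-M is included the stack pointers and their limit registers are banked between the Secure and Non-secure states, so each security state has independently bounded stacks. The Cortex-M23, an Armv8-M Baseline core, provides stack limit registers only for the Secure state. Pointer authentication is not part of these cores. It arrives with the optional PACBTI extension of Armv8.1-M, implemented for example by the Cortex-M85.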
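Separating Kernel and Application Stacks
Cortex-M separates privileged kernel stacks from unprivileged application stacks by combining the two stack pointers with the privilege model: the kernel and exception handlers run privileged on the MSP, while application threads run unprivileged on the PSP. On the Cortex-M33 and Cortex-M35P each pointer is bounded by its own limit register, and only privileged code can write the limit registers, so an unprivileged thread cannot move its own bounds. The optional Memory Protection Unit (MPU) can additionally make the kernel stack inaccessible to unprivileged code. Together these measures isolate kernel stacks from bugs in unprivileged software.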
Tradeoffs of Stack Overflow Detection
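Stack limit checking costs very little. MSPLIM and PSPLIM are processor registers, so they consume no RAM, and the comparison is done by hardware rather than by extra instructions in the program. The main overhead is in software: an RTOS must store a limit value for each thread and reload PSPLIM on every context switch, which adds a few instructions and one word per thread control block. The mechanism also has limits of its own. It detects only the stack pointer moving below the limit, not a buffer overrun inside a valid stack frame, and it is unavailable on Armv6-M and Armv7-M cores, which must fall back on software techniques.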
Alternatives to Hardware Checking
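If hardware stack limit checking is not available, overflow can be detected in software. An RTOS can compare the saved stack pointer against the stack's lower bound at each context switch. The stack can also be filled with a known pattern at startup and a guard area at its low end checked periodically; the same pattern reveals the high-water mark of stack usage. Finally, compiler-generated stack canaries place a sentinel value between a function's local buffers and its saved return address and verify it on function exit. These methods detect an overflow only after it has happened, so some corruption may already have occurred.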
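One common software approach is to fill the stack with a known pattern and later look for how much of it survives. The sketch below is a minimal host-side version under stated assumptions: the fill value `0xDEADBEEF`, the function names and the idea of a fixed-size guard area are illustrative choices, and the stack is modelled as an array whose index 0 is the lowest address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_FILL UINT32_C(0xDEADBEEF)  /* arbitrary pattern chosen for this sketch */

/* Fill the whole stack region with the pattern before the stack is used. */
void stack_paint(uint32_t *stack, size_t n_words)
{
    assert(stack != NULL);
    for (size_t i = 0; i < n_words; i++) {
        stack[i] = STACK_FILL;
    }
}

/* Count the words at the low end that still hold the pattern.
 * Since the stack grows downward, this is the space never used. */
size_t stack_unused_words(const uint32_t *stack, size_t n_words)
{
    size_t i = 0;

    assert(stack != NULL);
    while (i < n_words && stack[i] == STACK_FILL) {
        i++;
    }
    return i;
}

/* The stack is considered healthy while at least guard_words at its
 * low end are still untouched. */
bool stack_guard_intact(const uint32_t *stack, size_t n_words, size_t guard_words)
{
    return stack_unused_words(stack, n_words) >= guard_words;
}
```

A periodic task or the context-switch code could call `stack_guard_intact` for each thread. The check can miss data that happens to equal the fill pattern, so it is a heuristic rather than a guarantee.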
Software Fault Prevention Techniques
In addition to stack monitoring in hardware/software, several programming disciplines can prevent overflows:
- Minimize stack frame sizes through efficiency
- Allocate large data structures statically or on the heap instead of on the stack
- Validate inputs to prevent buffer overflows
- Use stack canaries to detect corruption
- Static analysis to determine max stack usage
Proactive techniques like these can eliminate bugs before they occur and reduce reliance on run-time overflow detection.
Conclusion
Stack overflows are serious bugs in embedded systems, leading to crashes or security vulnerabilities. Armv8-M Cortex-M processors provide low-overhead protection through stack limit registers and fault exceptions. This mechanism allows early detection of overflows, which minimizes damage and speeds up debugging. Along with careful programming practices, stack limit checking can greatly improve the reliability of Cortex-M embedded designs.