When developing bare metal applications for Arm Cortex-M microcontrollers, determining the maximum stack usage is crucial for allocating sufficient stack memory and avoiding stack overflows. This article provides a detailed guide on techniques and tools for accurately measuring stack usage in bare metal Cortex-M apps.
Introduction to Stack Usage in Cortex-M Apps
In Cortex-M microcontrollers, the stack is a region of memory used for storing temporary data like function parameters, return addresses, and local variables. The stack grows downward from the top of the allocated stack memory region. As functions are called, data is pushed onto the stack. As functions return, data is popped off the stack.
If the stack grows beyond the allocated region, it will overwrite other memory possibly containing code or valuable data. This leads to crashes or unexpected behavior. So it’s important to determine the maximum stack depth during worst-case program execution.
For bare metal Cortex-M apps, the toolchain does not check stack usage automatically. The stack is typically reserved in the linker script or startup file, and at reset the core loads the main stack pointer (MSP) from the first entry of the vector table. So we have to determine the worst-case stack requirement ourselves and reserve enough memory for it when the application is built.
Key Factors Affecting Stack Usage
The maximum stack usage depends on a variety of factors:
- Recursion levels of function calls
- Number and size of local variables and function parameters
- Interrupt handler requirements
- Usage of stack-consuming library functions
In general, deeply recursive functions with large stack frames consume the most stack space. Interrupt handlers also use the stack, so concurrent interrupts can increase stack usage. Library functions like printf() use the stack internally and contribute to maximum usage.
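The interaction between these factors can be made concrete with a small budget calculation. The sketch below assumes a single stack (MSP only) and one possible preemption per interrupt priority level. The type irq_level_t and the function worst_case_stack_bytes() are hypothetical names introduced for illustration. The frame sizes are the 8 words the core stacks on exception entry, or 26 words when floating-point context is stacked too.

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes the core pushes automatically on exception entry:
 * 8 words without FP context, 26 words when FP context is stacked. */
#define EXC_FRAME_BASIC_BYTES 32u
#define EXC_FRAME_FP_BYTES    104u
/* Exception entry may insert one padding word to keep SP 8-byte aligned. */
#define EXC_ALIGN_PAD_BYTES   4u

/* Hypothetical description of one interrupt priority level. */
typedef struct {
    uint32_t handler_bytes; /* deepest call chain of any handler at this level */
    int      fp_active;     /* nonzero if the preempted code may hold FP context */
} irq_level_t;

/* Worst case: the deepest thread-mode call chain, plus one handler per
 * priority level, each preceded by a hardware-stacked exception frame. */
uint32_t worst_case_stack_bytes(uint32_t thread_bytes,
                                const irq_level_t *levels, size_t n_levels)
{
    uint32_t total = thread_bytes;
    for (size_t i = 0; i < n_levels; ++i) {
        total += levels[i].fp_active ? EXC_FRAME_FP_BYTES
                                     : EXC_FRAME_BASIC_BYTES;
        total += EXC_ALIGN_PAD_BYTES;
        total += levels[i].handler_bytes;
    }
    return total;
}
```

The result is a conservative upper bound: it assumes every priority level preempts the one below it at its deepest point, which is exactly the nested-interrupt case that is hardest to reproduce in testing.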
Techniques for Measuring Stack Usage
Here are some techniques for measuring stack usage in Cortex-M apps:
1. Compiler Warnings
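Compiler and linker reports can provide a static estimate of usage. For example, GCC's -fstack-usage option writes a .su file listing the stack frame size of each function, and -Wstack-usage=<bytes> warns about any function whose frame exceeds the given limit. With Arm Compiler, the linker options --callgraph and --info=stack report estimated stack usage per function and per call chain. These figures cannot account for recursion, calls through function pointers, or assembly routines without frame information, but they provide a quick estimate during development.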
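Per-function figures only become a stack estimate once they are combined along the call graph. The following host-side sketch shows one way to do that, assuming the call graph is acyclic and small enough to write down by hand. The func_node_t table layout and max_chain_bytes() are hypothetical and not the output format of any particular tool.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_CALLEES 4

/* Hypothetical call-graph node: frame size plus indices of callees. */
typedef struct {
    const char *name;
    uint32_t    frame_bytes;          /* e.g. taken from a per-function report */
    int         callees[MAX_CALLEES]; /* indices into the table, -1 = unused */
} func_node_t;

/* Deepest stack use of any call chain starting at function `idx`.
 * Assumes the graph has no cycles (no recursion). */
uint32_t max_chain_bytes(const func_node_t *table, int idx)
{
    uint32_t deepest = 0;
    for (int i = 0; i < MAX_CALLEES; ++i) {
        int callee = table[idx].callees[i];
        if (callee < 0) {
            continue;
        }
        uint32_t d = max_chain_bytes(table, callee);
        if (d > deepest) {
            deepest = d;
        }
    }
    return table[idx].frame_bytes + deepest;
}
```

Starting the search at main gives the thread-mode worst case; starting it at each interrupt handler gives the per-handler figures.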
2. Linker Map File
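The linker map file generated during the build process lists the stack section together with its address and size. This shows how much stack memory is reserved for a specific build configuration, not how much the application actually uses, so it is mainly useful for confirming that the reserved size matches the estimate.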
3. Debug Session
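Debuggers like GDB allow viewing the current stack pointer during program execution (for example with the command info registers sp). Halting at the deepest points of worst-case scenarios and comparing the stack pointer with the top of the stack gives a dynamic measurement of stack usage. A single snapshot only shows usage at that instant, so the peak can be missed unless the breakpoints are chosen carefully.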
4. Software Stack Overflow Detection
Placing a known value at the end of stack memory, located through linker-defined symbols, and checking it during execution provides dynamic detection of stack overflow. If the value has changed, the stack has grown into it. The check is cheap, but it reports an overflow only after the fact and can miss one that skips over the guard value.
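A common extension of the known-value idea is stack painting: instead of a single guard word, the whole unused stack region is filled with a pattern, and the deepest overwritten word gives a high-water mark. The sketch below is a minimal version that takes the region as a parameter so it can be tried on a host. The names stack_paint() and stack_high_water_bytes() and the fill pattern are assumptions, and on a target the region bounds would come from the linker-defined stack symbols.

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_FILL_PATTERN 0xDEADBEEFu /* arbitrary, unlikely data value */

/* Fill `words` words starting at the lowest stack address with the pattern.
 * On a target, paint only the part below the current stack pointer. */
void stack_paint(volatile uint32_t *base, size_t words)
{
    for (size_t i = 0; i < words; ++i) {
        base[i] = STACK_FILL_PATTERN;
    }
}

/* The stack grows downward, so scan upward from the lowest address until
 * the first overwritten word; everything above it counts as used. */
size_t stack_high_water_bytes(const volatile uint32_t *base, size_t words)
{
    size_t untouched = 0;
    while (untouched < words && base[untouched] == STACK_FILL_PATTERN) {
        ++untouched;
    }
    return (words - untouched) * sizeof(uint32_t);
}
```

A result equal to the full region size means the pattern at the lowest address was overwritten, which should be treated as an overflow. The measurement slightly underestimates usage if the application happens to write the pattern value itself or reserves stack space it never writes.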
5. Dedicated Hardware Tools
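A debug probe can observe the stack without changing application code, for example by setting a data watchpoint on the lowest stack address through the Cortex-M debug hardware so that the core halts as soon as that word is written. Trace and recording tools such as Segger SystemView can help correlate stack growth with the interrupts and tasks that were active. Because the check runs in hardware, it catches overflows that a periodic software check could miss.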
Estimating Stack Requirements
Once maximum usage is measured, add an appropriate margin to determine the total stack size to allocate. Common rules of thumb are:
- 10% for simple apps with well-understood usage
- 20-30% for complex apps with interrupts and libraries
- 50% or more for deeply recursive code
Failing to add an adequate margin increases the risk of unanticipated stack overflows in deployed applications.
Factors like compiler version and optimization settings can influence stack usage. So measurements and margins may need tweaking across configurations.
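Applying a margin is simple arithmetic, but it is worth rounding the result so the stack size stays a multiple of 8 bytes, matching the 8-byte stack alignment the Arm procedure call standard expects. A minimal sketch, with stack_size_with_margin() as a hypothetical helper name:

```c
#include <stdint.h>

/* Add a percentage margin to the measured usage (rounding up), then round
 * the result up to the next multiple of 8 bytes. */
uint32_t stack_size_with_margin(uint32_t measured_bytes, uint32_t margin_percent)
{
    uint64_t scaled = (uint64_t)measured_bytes * (100u + margin_percent);
    uint32_t bytes  = (uint32_t)((scaled + 99u) / 100u); /* ceiling division */
    return (bytes + 7u) & ~7u;
}
```

For a measured peak of 592 bytes and a 30% margin this gives 776 bytes: 592 × 1.3 = 769.6, rounded up to 770 and then to the next multiple of 8.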
Allocating Stack Memory in Cortex-M Apps
For bare metal Cortex-M development, there is no operating system to set up stack memory for us. The stack region must be defined explicitly, usually in the linker script or the startup code.
Some options for allocating stack space include:
- Defining a static array variable
- Allocating stack memory at runtime before main()
- Using linker-defined symbols to allow runtime allocation
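For example, the stack can be reserved as a static array and its top address computed from it:
    uint32_t stack[STACK_SIZE];   /* STACK_SIZE counts 32-bit words */
    int main() {
        uint32_t stackTop = (uint32_t)&stack[STACK_SIZE];   /* one past the last word */
        __stack_chk_guard_setup(stackTop);
        …
Because the stack grows downward, its top is the address just past the last array element, and the array's first element is the lowest address the stack may reach. For the array to actually serve as the stack, its top address must also be the initial stack pointer value in the vector table, since main() is already running on whatever stack the reset sequence selected. Note that __stack_chk_guard_setup() is not a standard toolchain function; here it stands for a project-supplied routine that arms runtime overflow detection.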
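The address arithmetic for an array-based stack can be sketched as follows. This is an illustration under assumptions rather than a complete startup file: STACK_WORDS, stack_mem, stack_initial_sp() and stack_limit() are hypothetical names, and on a real target the value returned by stack_initial_sp() would have to be placed in the first vector table entry.

```c
#include <stdint.h>

#define STACK_WORDS 256u /* hypothetical size: 256 words = 1 KiB */

/* 8-byte alignment keeps the initial stack pointer aligned as the Arm
 * procedure call standard expects. */
static _Alignas(8) uint32_t stack_mem[STACK_WORDS];

/* Full-descending stack: the initial SP is one past the last element. */
uintptr_t stack_initial_sp(void)
{
    return (uintptr_t)&stack_mem[STACK_WORDS];
}

/* Lowest address the stack may reach before it overflows. */
uintptr_t stack_limit(void)
{
    return (uintptr_t)&stack_mem[0];
}
```

Counting the size in words and taking the address of the one-past-the-end element avoids mixing byte counts with element counts, which is an easy mistake when the top address is computed by hand.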
Setting Up Stack Limit Checking
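To protect against stack overflows, runtime stack limit checks should be enabled using the linker-defined stack boundary symbols. In the example above this is the job of __stack_chk_guard_setup(), which stands for a project-supplied routine rather than a standard toolchain function.
Such a routine takes the stack boundary address and arms whatever mechanism the device offers. On Armv8-M Mainline cores such as the Cortex-M33, the MSPLIM and PSPLIM stack limit registers raise a UsageFault, which escalates to a HardFault if UsageFault is not enabled, when the stack pointer drops below the limit. Armv6-M and Armv7-M cores lack these registers, so an MPU region (where an MPU is available) or a guard value below the stack serves the same purpose. The fault handler can then respond appropriately on overflow, for example by logging the event and resetting.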
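A software-only variant of such a limit check compares the current stack pointer with the lowest valid stack address. The sketch below uses the hypothetical helpers stack_headroom_bytes() and stack_headroom_ok(). The stack pointer is passed in as a parameter so the logic can be exercised on a host; on a target it could be read with the CMSIS intrinsic __get_MSP().

```c
#include <stdbool.h>
#include <stdint.h>

/* Bytes left between the current stack pointer and the lowest valid stack
 * address. Returns 0 if the stack pointer is already at or below the limit. */
uintptr_t stack_headroom_bytes(uintptr_t sp, uintptr_t limit)
{
    return (sp > limit) ? (sp - limit) : 0u;
}

/* True while at least `min_bytes` of stack remain. */
bool stack_headroom_ok(uintptr_t sp, uintptr_t limit, uintptr_t min_bytes)
{
    return stack_headroom_bytes(sp, limit) >= min_bytes;
}
```

A check like this only sees the stack depth at the moment it runs, so it is best called from the deepest or most frequently interrupted code paths and combined with a hardware mechanism where one exists.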
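Stack canaries offer additional protection. With compiler options such as -fstack-protector, a canary value is placed between a function's local buffers and its saved return address and is checked before the function returns; a mismatch calls __stack_chk_fail(). This catches buffer overruns within a stack frame rather than exhaustion of the whole stack, so it complements a stack limit check instead of replacing it.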
Validating Stack Usage in Practice
It’s important to validate stack usage once an application is running on hardware. Techniques include:
- Measuring maximum usage with debug probes
- Monitoring HardFaults triggered by the Stack Guard
- Testing worst-case scenarios like nested interrupts
- Injecting stack overflows to validate robust handling
No amount of upfront estimating can replace actually running tests on real hardware under different conditions. Build testing rigs to simulate worst-case scenarios and maximize chances of catching any issues early.
Optimizing Stack Usage
Some ways to optimize stack usage include:
- Minimizing local variables and passing pointers instead
- Declaring large arrays and buffers static or global instead of local
- Reducing unnecessary function call recursion depth
- Allocating large data structures from the heap or a static pool instead, where the cost of dynamic allocation is acceptable
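Compiler optimization settings also change stack usage. Options like -Os often reduce it, but aggressive inlining at higher optimization levels can merge several frames into one larger frame, so compare the per-function figures for each setting rather than assuming one is best. Analyze stack usage per function to identify and target hotspots. A little optimization goes a long way in reducing stack requirements.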
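As an illustration of moving a large buffer off the stack, the sketch below keeps a working buffer in static storage instead of declaring it as a local array. The names frame_buf and build_frame_checksum() and the 512-byte size are hypothetical. The trade-off is that the function is no longer reentrant, so it must not be called from both thread mode and an interrupt handler.

```c
#include <stddef.h>
#include <stdint.h>

#define FRAME_BUF_BYTES 512u

/* Lives in .bss rather than on the stack, so calling the function costs
 * only a small frame. Not reentrant: one caller at a time. */
static uint8_t frame_buf[FRAME_BUF_BYTES];

/* Copy the payload into the working buffer and return a simple additive
 * checksum over the copied bytes. Payloads longer than the buffer are
 * truncated. */
uint8_t build_frame_checksum(const uint8_t *payload, size_t len)
{
    if (len > FRAME_BUF_BYTES) {
        len = FRAME_BUF_BYTES;
    }
    uint8_t sum = 0;
    for (size_t i = 0; i < len; ++i) {
        frame_buf[i] = payload[i];
        sum = (uint8_t)(sum + frame_buf[i]);
    }
    return sum;
}
```

The memory is still consumed, but it now appears in the map file as static data, where it is visible at build time instead of contributing to a worst-case stack depth that has to be measured.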
Conclusion
Carefully calculating and allocating sufficient stack memory is crucial for developing robust bare metal Arm Cortex-M applications. Accurate measurement using software and hardware tools combined with tested margins and runtime limit checking helps mitigate dangerous stack overflows in embedded systems.
With the techniques outlined in this guide, developers can determine worst-case stack usage, allocate adequate memory with overhead, and validate stack behavior during execution. Properly implementing stack management reduces headaches and results in stable and resilient Cortex-M apps.