The Cortex-M1 processor implements the ARMv6-M architecture, which provides several features that help keep interrupt latency and task-switching time low under a real-time operating system such as Keil RTX. The key aspects to focus on are proper configuration of the Nested Vectored Interrupt Controller (NVIC), careful placement and design of interrupt service routines (ISRs), and sensible configuration of the RTX scheduler.
Configuring the NVIC
The NVIC on Cortex-M1 supports up to 32 external interrupts (the exact number is chosen when the core is configured), each with one of four programmable priority levels. A numerically lower value means a higher priority. Proper priority assignment ensures that the most time-critical interrupts can preempt less critical ones, so set the NVIC priorities in the startup code according to the requirements of the application. An interrupt can preempt only a handler of strictly lower priority. Interrupts of the same priority never preempt each other, and if two of them are pending at the same time, the lower numbered interrupt executes first.
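As a minimal sketch of how a priority value is packed into the ARMv6-M interrupt priority registers, the following host-side model mimics the layout: four interrupts per 32-bit word, with the priority held in the top two bits of each byte. The array nvic_ipr and the function names are hypothetical stand-ins for this illustration; on a target, the CMSIS function NVIC_SetPriority() does this work.

```c
#include <stdint.h>

/* Assumption: 2 implemented priority bits and at most 32 interrupts. */
#define PRIO_BITS 2u
#define IRQ_COUNT 32u

/* Host-side stand-in for the eight priority register words (4 IRQs per word). */
static uint32_t nvic_ipr[IRQ_COUNT / 4u];

/* Set the priority (0 = most urgent, 3 = least urgent) of an interrupt.
 * The whole 32-bit word is read, modified and written back, because
 * ARMv6-M only supports word access to these registers. */
void irq_set_priority(uint32_t irq, uint32_t priority)
{
    if (irq >= IRQ_COUNT) {
        return;                               /* ignore invalid interrupt numbers */
    }
    uint32_t index = irq >> 2;                /* which priority word */
    uint32_t shift = (irq & 3u) * 8u;         /* which byte lane in that word */
    uint32_t field = (priority << (8u - PRIO_BITS)) & 0xFFu; /* top bits of the byte */

    nvic_ipr[index] = (nvic_ipr[index] & ~(0xFFu << shift)) | (field << shift);
}

/* Read the priority of an interrupt back from the model. */
uint32_t irq_get_priority(uint32_t irq)
{
    if (irq >= IRQ_COUNT) {
        return 0u;
    }
    uint32_t shift = (irq & 3u) * 8u;
    return ((nvic_ipr[irq >> 2] >> shift) & 0xFFu) >> (8u - PRIO_BITS);
}
```

Priority values above 3 do not fit in the two implemented bits and are truncated by the mask, so it is worth validating them in the caller.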
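Unlike ARMv7-M cores such as the Cortex-M3, the ARMv6-M NVIC has no priority grouping: there is no PRIGROUP field, and each of the four levels is a distinct preemption level. To guarantee that a time-critical interrupt can always preempt the others, give it a numerically lower priority value than every interrupt it must be able to interrupt. The priorities of system exceptions such as SysTick and PendSV are set through the System Handler Priority Registers rather than the NVIC priority registers.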
In addition, use the NVIC's interrupt enable and pending registers at run time for time-critical interrupts, rather than relying only on the peripheral's interrupt enable bits. This lets software mask a single interrupt source without touching the peripheral and clear a stale pending state during the ISR, which avoids unwanted back-to-back triggers.
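The NVIC enable and pending registers use write-1-to-set and write-1-to-clear semantics, so no read-modify-write sequence is needed. The sketch below models that behaviour on a host; the variables and function names are hypothetical, and on a target the CMSIS functions NVIC_EnableIRQ(), NVIC_DisableIRQ(), NVIC_SetPendingIRQ() and NVIC_ClearPendingIRQ() perform the equivalent register writes.

```c
#include <stdint.h>

/* Host-side model of the enable and pending state of IRQ0..IRQ31. */
static uint32_t nvic_enabled;
static uint32_t nvic_pending;

/* Models of the set/clear register pairs: bits written as 1 take effect,
 * bits written as 0 leave the corresponding interrupt untouched. */
static void write_iser(uint32_t v) { nvic_enabled |= v; }
static void write_icer(uint32_t v) { nvic_enabled &= ~v; }
static void write_ispr(uint32_t v) { nvic_pending |= v; }
static void write_icpr(uint32_t v) { nvic_pending &= ~v; }

void irq_enable(uint32_t irq)        { write_iser(1u << (irq & 31u)); }
void irq_disable(uint32_t irq)       { write_icer(1u << (irq & 31u)); }
void irq_set_pending(uint32_t irq)   { write_ispr(1u << (irq & 31u)); }
void irq_clear_pending(uint32_t irq) { write_icpr(1u << (irq & 31u)); }
```

Because each call writes a single bit, an ISR can clear its own pending state without disturbing other interrupts that became pending in the meantime.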
Optimizing Interrupt Service Routines
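Time spent in an ISR delays every interrupt of equal or lower priority as well as every task, so it is important to keep ISR execution time short. The Cortex-M1 provides some helpful mechanisms in this regard.
ARMv6-M does not implement the BASEPRI register, so interrupts cannot be masked by priority level as they can on ARMv7-M. To block specific lower priority interrupts during a critical section, disable them individually through the NVIC interrupt clear-enable register and re-enable them afterwards. Use the PRIMASK register to disable all configurable interrupts globally, but only for very short code segments, because every interrupt that arrives in the meantime is delayed by the length of the masked section.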
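A common way to use PRIMASK safely is to save its state on entry to a critical section and restore it on exit, so that nested critical sections do not re-enable interrupts too early. The following is a host-side sketch of that pattern; primask_model and the three helper functions are stand-ins for the PRIMASK register and the CMSIS intrinsics __get_PRIMASK(), __set_PRIMASK() and __disable_irq(), and the critical_enter() and critical_exit() names are hypothetical.

```c
#include <stdint.h>

/* Host-side stand-in for the PRIMASK register: 1 = interrupts masked. */
static uint32_t primask_model;

/* Stand-ins for the CMSIS intrinsics named in the text above. */
static uint32_t get_primask(void)       { return primask_model; }
static void     set_primask(uint32_t v) { primask_model = v & 1u; }
static void     disable_irq(void)       { primask_model = 1u; }

/* Enter a critical section and return the mask state seen on entry. */
uint32_t critical_enter(void)
{
    uint32_t saved = get_primask();
    disable_irq();
    return saved;
}

/* Leave a critical section by restoring the saved state. Interrupts are
 * re-enabled only if they were enabled when the section was entered. */
void critical_exit(uint32_t saved)
{
    set_primask(saved);
}
```

With this pattern, an inner critical_exit() leaves interrupts masked, and only the outermost call restores the original state.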
Place the ISR code in tightly coupled memory, that is, the Cortex-M1 instruction TCM (ITCM), whenever possible to reduce access time, and keep the data and stacks it uses in the data TCM (DTCM). This is especially beneficial for the highest priority ISRs.
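Keep the ISR code as short and fast as possible. Use intrinsics like __disable_irq() and __enable_irq() to protect shared resources rather than hand-written assembly. Avoid function calls within ISRs unless they are inlined or absolutely necessary.
Returning from an ISR needs no special intrinsic. The handler returns like an ordinary function, and the processor restores the stacked registers and the previous priority level automatically. If the handler clears the peripheral's interrupt flag as one of its last actions, a __DSB() before the return helps ensure that the write has completed, so the same interrupt is not taken again spuriously. __ISB() and __SEV() play no part in exception return.
Use the ISR-safe RTX interfaces, such as fixed-size memory pools and the isr_ mailbox, event and semaphore functions, rather than dynamic memory allocation or blocking synchronization calls in ISR code.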
Configuring RTX Task Switching
The RTX RTOS provides several mechanisms to optimize task switching latency on Cortex-M1.
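RTX does not provide preemption thresholds, which are a feature of some other kernels. To keep a time-critical task from being preempted during a short critical section, give it a higher task priority or briefly lock the scheduler with tsk_lock() and tsk_unlock(). Keep locked sections short, because task switching and kernel timer processing are deferred while the lock is held.
Lazy stacking is a feature of Cortex-M cores with a floating-point unit and does not apply to the Cortex-M1. On exception entry the hardware pushes R0-R3, R12, LR, PC and xPSR automatically, and the RTX context switch saves and restores the remaining registers R4-R11 in software. The cost of a single context switch is therefore largely fixed, and the practical way to reduce its impact is to avoid unnecessary switches.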
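Increase the ISR FIFO queue size in the RTX configuration (OS_FIFOSZ in the RTX_Conf_CM.c file of RTX v4) from its default of 16 entries to 32 or higher if ISRs can issue many isr_ requests in a burst. The queue buffers requests made by ISRs until the kernel processes them. A larger queue prevents overflow but does not in itself shorten latency.
Set the round-robin option (OS_ROBIN in RTX_Conf_CM.c) to 0 to use purely priority based scheduling rather than time-sliced round-robin scheduling. Round-robin switching only affects tasks of equal priority, so disabling it removes time-slice switches among those tasks.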
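To see why the queue depth matters, consider a simple model of a fixed-size request queue that is filled from ISRs and drained when the kernel runs. This is an illustrative sketch only, not the RTX implementation; the type, the function names and the FIFO_SIZE constant are hypothetical, and the model is single-threaded, so it omits the protection a shared counter would need on real hardware.

```c
#include <stdbool.h>
#include <stdint.h>

#define FIFO_SIZE 16u   /* hypothetical depth, mirroring the default of 16 */

typedef struct {
    uint32_t items[FIFO_SIZE];
    uint32_t head;      /* next slot to write (ISR side) */
    uint32_t tail;      /* next slot to read (kernel side) */
    uint32_t count;     /* entries currently queued */
    uint32_t dropped;   /* requests lost because the queue was full */
} isr_fifo_t;

/* Called from an ISR: queue a request, or record an overflow. */
bool fifo_put(isr_fifo_t *q, uint32_t request)
{
    if (q->count == FIFO_SIZE) {
        q->dropped++;
        return false;
    }
    q->items[q->head] = request;
    q->head = (q->head + 1u) % FIFO_SIZE;
    q->count++;
    return true;
}

/* Called when the kernel gets to run: take the oldest request. */
bool fifo_get(isr_fifo_t *q, uint32_t *request)
{
    if (q->count == 0u) {
        return false;
    }
    *request = q->items[q->tail];
    q->tail = (q->tail + 1u) % FIFO_SIZE;
    q->count--;
    return true;
}
```

In this model, a burst of 20 requests that arrives before the kernel drains the queue loses the last 4. Sizing the queue for the worst-case burst avoids that loss.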
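Drive the RTX kernel tick from the SysTick timer, which on Cortex-M1 is part of the optional OS extensions and must be included when the core is configured. The tick period sets the granularity of delays and timeouts. RTX normally runs SysTick and PendSV at the lowest exception priority so that kernel activity does not add latency to device interrupts. Raising the tick priority makes the tick more regular only at the cost of delaying those ISRs.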
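Call os_sys_init_user() from main() once basic hardware initialization is complete. The function starts the kernel and the first task and does not return, so create the remaining tasks and RTOS objects from that initial task.
Give periodic tasks a fixed period with os_itv_set() and os_itv_wait(), and use os_dly_wait() for simple delays. Interval waits are timed from the previous wake-up rather than from the call, so the period does not drift with the task's execution time, which lets RTX schedule tasks predictably.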
Set the task counts in the RTX configuration (OS_TASKCNT and OS_PRIVCNT in RTX_Conf_CM.c) to match the number of tasks in the application, and size each stack from measured usage rather than a generous guess to avoid wasting memory.
Measuring Optimization Improvements
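To quantify the effects of the optimization techniques, benchmark the interrupt latency, task switching times and overall system jitter before and after the changes, using profiling tools like the SEGGER SystemView recorder or a hardware timer combined with a logic analyzer. Note that the ARMv6-M debug unit has no DWT cycle counter (CYCCNT), so use SysTick or a free-running peripheral timer as the timestamp source.
Interrupt latency is the time from the assertion of the interrupt request to the execution of the first instruction of the ISR. Measure it externally by toggling a GPIO pin at ISR entry and comparing it with the request signal on an oscilloscope or logic analyzer, or, for a timer interrupt, by reading the timer's counter at ISR entry. Context switch time can be measured by reading a free-running timer on exit from one task and on entry to the next scheduled task.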
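When SysTick serves as the timestamp source (an assumption made here because ARMv6-M has no DWT cycle counter), keep in mind that it is a 24-bit down counter that reloads after reaching zero, so the difference between two readings must allow for one wrap. The helper names below are hypothetical, and the sketch assumes the two readings are less than one full counter period apart.

```c
#include <stdint.h>

/* Ticks elapsed between two readings of a down-counting timer that reloads
 * to `reload` after reaching zero (one period = reload + 1 ticks).
 * Valid only if less than one full period passed between the readings. */
uint32_t ticks_elapsed(uint32_t start, uint32_t end, uint32_t reload)
{
    if (start >= end) {
        return start - end;                 /* no wrap between the readings */
    }
    return start + (reload + 1u) - end;     /* the counter wrapped once */
}

/* Convert a cycle count to nanoseconds for a given core clock in Hz. */
uint32_t cycles_to_ns(uint32_t cycles, uint32_t clock_hz)
{
    return (uint32_t)(((uint64_t)cycles * 1000000000u) / clock_hz);
}
```

For example, 50 cycles at a 50 MHz core clock correspond to 1000 ns. Reporting results in cycles as well as in time makes measurements comparable across clock frequencies.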
Jitter is more difficult to measure, but a histogram of the differences between expected and actual task wake-up times and interrupt service intervals will show the improvement from each optimization. The goal is to reduce both the standard deviation and the extremes of these histograms as much as possible.
As rough targets for a properly optimized Cortex-M1 system, interrupt latency should be under 10 microseconds, context switch times around 5 microseconds, and overall jitter less than 1 microsecond RMS. These figures depend on the core clock frequency and the memory system, so compare results in clock cycles as well. Significant deviations above these values indicate room for tuning via the techniques described.
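The histogram can be reduced to a few summary figures for comparing runs. The sketch below computes the minimum, maximum, mean and population variance of a set of recorded deviations (actual minus expected time, in timer ticks). The type and function names are hypothetical, and the variance is computed around the truncated integer mean, which is accurate enough for comparison purposes.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    int32_t  min;        /* most negative deviation (earliest wake-up) */
    int32_t  max;        /* most positive deviation (latest wake-up) */
    int32_t  mean;       /* average deviation, truncated toward zero */
    uint64_t variance;   /* population variance, in ticks squared */
} jitter_stats_t;

/* Summarise n deviations; returns all zeros for an empty sample set. */
jitter_stats_t jitter_summarise(const int32_t *dev, size_t n)
{
    jitter_stats_t s = {0, 0, 0, 0};
    if (n == 0u) {
        return s;
    }

    int64_t sum = 0;
    s.min = dev[0];
    s.max = dev[0];
    for (size_t i = 0; i < n; i++) {
        sum += dev[i];
        if (dev[i] < s.min) { s.min = dev[i]; }
        if (dev[i] > s.max) { s.max = dev[i]; }
    }
    s.mean = (int32_t)(sum / (int64_t)n);

    uint64_t squares = 0;
    for (size_t i = 0; i < n; i++) {
        int64_t d = (int64_t)dev[i] - s.mean;
        squares += (uint64_t)(d * d);
    }
    s.variance = squares / n;
    return s;
}
```

The standard deviation is the square root of the variance, and the difference between max and min gives the peak-to-peak jitter.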
Conclusion
Optimizing interrupt response and task switching on Cortex-M1 requires utilizing the full range of tools provided by the architecture, RTOS and compiler. Prioritizing interrupts properly via the NVIC, minimizing ISR latency, and configuring RTX preemption and switching behavior all contribute to improving real-time performance. Quantitative benchmarks before and after applying these optimization techniques will validate the gains.