The Cortex-M3 memory system allows memory regions to be configured as shareable or non-shareable between bus masters, and allows a cache policy to be assigned to each region specifying whether accesses to it may be cached or must bypass any cache. Note that the Cortex-M3 core itself contains no internal caches; these attributes are exported on the bus for system-level caches and flash accelerators to honor. Proper configuration is important for system performance and determinism.
Shareability
The Cortex-M3 allows memory regions to be configured as shareable or non-shareable. Shareable regions may be accessed by multiple bus masters (for example, a DMA controller or a second processor in the system), while non-shareable regions are private to a single master.
Shareability is configured per region through the S bit of the MPU Region Attribute and Size Register (MPU_RASR). Marking a region non-shareable can provide faster access, since no coordination with other masters is required, but care must be taken that only one master ever accesses a non-shareable region.
Typically, code and read-only data such as constants can be configured as non-shareable, since they are only read. Read-write data such as global variables visible to another master should be shareable to avoid consistency issues, and memory-mapped peripherals are usually marked shareable as well.
The benefits of configuring regions as non-shareable are:
- Faster access to local regions without arbitration
- Avoiding bus contention with other processors
- Better real-time determinism for memory access
The tradeoffs are:
- Care must be taken to avoid data consistency issues
- Startup code must set up regions correctly
- Memory map must align with shareability requirements
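As a concrete illustration, the value written to MPU_RASR for a shareable region simply has the S bit set alongside the size, access-permission, and enable fields. A minimal host-side sketch, with bit positions taken from the ARMv7-M architecture manual (the 64 KB full-access SRAM region is an assumption for illustration):

```c
#include <assert.h>
#include <stdint.h>

/* MPU_RASR bit positions per the ARMv7-M architecture manual */
#define RASR_ENABLE   (1u << 0)
#define RASR_SIZE(n)  ((uint32_t)(n) << 1)   /* region size = 2^(n+1) bytes */
#define RASR_S        (1u << 18)             /* shareable */
#define RASR_AP_FULL  (0x3u << 24)           /* AP=011: full access */

/* RASR value for a 64 KB (2^16, so n=15) shareable, full-access region */
static uint32_t shared_sram_rasr(void) {
    return RASR_AP_FULL | RASR_S | RASR_SIZE(15) | RASR_ENABLE;
}
```

Clearing RASR_S in the same word would make the region non-shareable while leaving everything else unchanged.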
Cache Policies
The Cortex-M3 core itself contains no instruction or data cache, but the ARMv7-M architecture lets each memory region carry a cacheability attribute that a system-level cache or vendor flash accelerator can honor. The policy for each region can be configured as:
- Non-cacheable – Region is not cached
- Write-through – Writes go to cache and memory
- Write-back – Writes only go to cache, dirty data is flushed later
Cache policy is configured per region through the TEX, C, and B bits of MPU_RASR. Non-cacheable regions bypass any cache completely, which provides deterministic access latency but no caching benefit.
Write-through policy writes to both the cache and memory on every store. Reads can benefit from caching, but writes see memory latency, and the cache never holds dirty data.
Write-back policy writes only to the cache; modified data is written out to memory later, when cache lines are evicted or explicitly cleaned. This provides the best performance but means the cache can hold dirty data.
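The difference can be seen in a deliberately tiny toy model (purely illustrative, not real cache hardware): a single cached word backed by one memory location. With write-back, memory stays stale until an explicit flush; with write-through, it never does.

```c
#include <assert.h>
#include <stdint.h>

/* Toy single-line "cache" over one memory word, for illustration only */
typedef struct {
    uint32_t  line;   /* cached copy */
    int       dirty;  /* modified but not yet written to memory? */
    uint32_t *mem;    /* backing memory word */
} toy_cache;

/* Write-through: update cache and memory together; memory is never stale */
static void write_through(toy_cache *c, uint32_t v) {
    c->line = v;
    *c->mem = v;
}

/* Write-back: update only the cache; memory stays stale until a flush */
static void write_back(toy_cache *c, uint32_t v) {
    c->line  = v;
    c->dirty = 1;
}

/* Flush: write dirty data out to memory */
static void flush(toy_cache *c) {
    if (c->dirty) { *c->mem = c->line; c->dirty = 0; }
}
```

The stale window between write_back and flush is exactly the "dirty data" hazard described above: any other bus master reading memory during that window sees the old value.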
Key considerations for configuring cache policies:
- Code regions can be write-through or non-cacheable
- Read-only data like constants can be write-through
- Shared read-write data and memory-mapped peripherals should be non-cacheable
- Stacks, heaps can be write-back for performance
The benefits of caching include:
- Faster access to cached code and data
- Higher hit rates improve performance
- Write-back improves write performance
The tradeoffs are:
- Cache misses result in slower, variable latency
- Write-back can result in dirty data in caches
- Improper caching can cause bugs and data inconsistencies
Setup and Configuration
The Cortex-M3 memory regions and their attributes are configured via the MPU registers in the System Control Space. This is typically done in startup code.
Key steps for configuration:
- Define memory map with regions for code, data, peripherals
- Program the MPU region registers (MPU_RNR, MPU_RBAR, MPU_RASR) for each region
- Set shareability attribute for each region
- Set cache policy for each region
- Invalidate or clean any system-level caches, if present (this is vendor-specific on Cortex-M3)
- Enable the MPU, and any system-level caches, if used
Tools such as Keil uVision and middleware such as CMSIS provide abstractions that make memory region configuration easier.
Example configuration using the CMSIS MPU register definitions (region numbers, base addresses, and sizes are illustrative):

// Region 0: code – write-through cacheable, non-shareable, read-only
MPU->RNR  = 0;
MPU->RBAR = 0x00000000;                                        // Flash base
MPU->RASR = (0x6u << MPU_RASR_AP_Pos) | MPU_RASR_C_Msk         // read-only; TEX=000 C=1 B=0
          | (17u << MPU_RASR_SIZE_Pos) | MPU_RASR_ENABLE_Msk;  // 256 KB, enabled
// Region 1: data – non-cacheable, shareable, full access
MPU->RNR  = 1;
MPU->RBAR = 0x20000000;                                        // SRAM base
MPU->RASR = (0x3u << MPU_RASR_AP_Pos) | (1u << MPU_RASR_TEX_Pos) | MPU_RASR_S_Msk
          | (15u << MPU_RASR_SIZE_Pos) | MPU_RASR_ENABLE_Msk;  // 64 KB, enabled
// Region 2: peripherals – shareable device, full access, no execute
MPU->RNR  = 2;
MPU->RBAR = 0x40000000;                                        // peripheral base
MPU->RASR = (0x3u << MPU_RASR_AP_Pos) | MPU_RASR_XN_Msk | MPU_RASR_B_Msk
          | (28u << MPU_RASR_SIZE_Pos) | MPU_RASR_ENABLE_Msk;  // 512 MB, enabled
// Enable the MPU with the default memory map as the privileged background
MPU->CTRL = MPU_CTRL_PRIVDEFENA_Msk | MPU_CTRL_ENABLE_Msk;
__DSB(); __ISB();                                              // ensure new attributes take effect
This configures the code and data regions with appropriate shareability and cacheability attributes, and marks the peripheral region as shareable device memory so that peripheral accesses are never cached.
Validation
Memory region configuration should be validated by:
- Reviewing startup code for correct MPU configuration
- Checking memory map aligns with requirements
- Verifying cache hit/miss rates during execution
- Confirming real-time performance with caches on vs off
- Testing inter-processor communication for shareable regions
- Confirming non-shareable region access is local only
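Part of the "memory map aligns with requirements" check can be automated off-target. A small host-side sketch of the ARMv7-M MPU region rules (size must be a power of two, at least 32 bytes, and the base address must be aligned to the region size):

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if (base, size) is a legal ARMv7-M MPU region, else 0 */
static int region_valid(uint32_t base, uint32_t size) {
    if (size < 32u) return 0;                 /* minimum region size is 32 bytes */
    if (size & (size - 1u)) return 0;         /* size must be a power of two */
    return (base & (size - 1u)) == 0;         /* base must be size-aligned */
}
```

Running such a check over the linker-defined memory map in a host-side unit test catches misaligned regions before they ever reach the target.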
Bugs from improper configuration include:
- Data corruption from multiple masters accessing a non-shareable region
- Race conditions on shared data
- Crashes due to instruction/data aborts
- Cache coherency issues
- Increased latency for cache misses
Validation should cover boot testing, runtime monitoring, and system integration testing.
Tuning and Optimization
Based on validation results, the memory configuration can be tuned for better performance:
- Increase non-shareable regions to reduce contention
- Configure more write-through regions to reduce dirty data
- Make stacks/heaps write-back cacheable for speed
- Make frequently used code regions non-cacheable to improve determinism
Cache usage and hit/miss rates should be monitored. Additional caching may help memory-bound applications, while disabling caches entirely can improve real-time determinism.
Latency of shareable vs non-shareable accesses should be measured. More non-shareable regions can be added if latency is critical.
Data cache policies can also be tuned depending on how much dirty data can be tolerated.
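On-target, the access latencies discussed above can be measured with the Cortex-M3 DWT cycle counter. A sketch assuming the CMSIS register definitions (hardware-dependent, so shown as a configuration fragment rather than host-runnable code):

```c
#include "core_cm3.h"   /* CMSIS Cortex-M3 register definitions */

/* Measure the approximate cycle cost of a single read from the given address.
 * The result includes a few cycles of instruction overhead, so compare
 * measurements against each other rather than treating them as absolute. */
static uint32_t read_latency_cycles(volatile uint32_t *addr) {
    CoreDebug->DEMCR |= CoreDebug_DEMCR_TRCENA_Msk;  /* enable the DWT unit */
    DWT->CTRL        |= DWT_CTRL_CYCCNTENA_Msk;      /* start the cycle counter */
    uint32_t start = DWT->CYCCNT;
    (void)*addr;                                     /* the timed access */
    return DWT->CYCCNT - start;
}
```

Comparing this measurement for shareable versus non-shareable regions, or with caching enabled versus disabled, gives the latency data the tuning decisions above depend on.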
Use of MPU
The Cortex-M3 MPU provides an additional layer of memory protection. It can enforce:
- Access permissions – read/write/execute
- Privilege levels – unprivileged vs privileged mode access
- Memory maps – which addresses are accessible
The MPU is configured via its registers in the System Control Space to set up to eight protected regions and their attributes. It should be used to protect and isolate memory.
Important considerations when using MPU:
- Enable MPU early in startup before memory access
- Setup regions, permissions, privileges correctly
- Ensure good coverage of memory map with regions
- Minimize region count for performance
- Keep privileged-mode accesses to a minimum
The MPU can mitigate the impact of software bugs by limiting the damage a faulty access can do. Proper configuration is necessary for both security and correct operation.
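One common isolation use is a small no-access guard region at the bottom of a task stack, so that a stack overflow raises a MemManage fault instead of silently corrupting adjacent data. A sketch assuming the CMSIS MPU definitions (the guard base address is a placeholder, not a real map):

```c
#include "core_cm3.h"   /* CMSIS Cortex-M3 register definitions */

#define STACK_GUARD_BASE  0x20000000u   /* placeholder: lowest address of the task stack */

/* Program the highest-numbered (highest-priority) region as a 32-byte
 * no-access guard; any access into it raises a MemManage fault. */
static void install_stack_guard(void) {
    MPU->RNR  = 7u;
    MPU->RBAR = STACK_GUARD_BASE;
    MPU->RASR = (0u << MPU_RASR_AP_Pos)          /* AP=000: no access, any privilege */
              | (4u << MPU_RASR_SIZE_Pos)        /* 2^(4+1) = 32 bytes */
              | MPU_RASR_ENABLE_Msk;
    SCB->SHCSR |= SCB_SHCSR_MEMFAULTENA_Msk;     /* enable the MemManage fault handler */
    __DSB(); __ISB();                            /* ensure the guard takes effect */
}
```

Using the highest region number matters because, in ARMv7-M, higher-numbered regions take priority when regions overlap, so the guard overrides any broader full-access SRAM region beneath it.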
Interactions with Debug
Debug and trace capabilities can interact with memory regions, caches and MPU in various ways.
Considerations when debugging:
- Debug can override memory protections
- Watchpoints may not work properly on non-shared regions
- Tracing can be limited by MPU permissions
- Debug can invalidate or enable caches
Steps to take:
- Use debug authentication to control override abilities
- Understand MPU effects on watchpoints
- Give trace/debug access to needed resources
- Test debug carefully with caches on and off
Proper debug configuration and security settings are needed to avoid issues. Debug capabilities should be validated with production memory configuration.
Effects on Power
Memory configuration also impacts power consumption in various ways:
- MPU checks consume power
- Cache hits reduce memory accesses, saving power
- Non-shareable regions reduce bus contention, saving power
- Watchpoints and breakpoints can alter power consumption
Steps to optimize power:
- Minimize MPU region count
- Use appropriate caching to reduce accesses
- Increase non-shareable regions if possible
- Profile power with debug vs without
Tuning the memory configuration appropriately can lead to measurable power savings. This could be reinvested to enable more caching for better performance.
Conclusion
In summary, proper configuration of Cortex-M3 memory regions, caches, and the MPU is vital for building effective systems. A good memory map forms the foundation. Shareability, cache policies, and MPU protections must be set correctly during startup.
Validation should be thorough since bugs can be severe and hard to isolate. Performance and power can be optimized by tuning the configuration based on profiling. Debug must also be configured properly to match the system’s security needs.
Following best practices for the target use cases will enable developers to maximize the benefits of the Cortex-M3 memory architecture and mitigate any pitfalls.