The Cortex-M3 memory system allows memory regions to be configured as shareable or non-shareable between bus masters, and allows a cache policy to be assigned to each region specifying whether accesses to it may be cached or must bypass any cache. Note that the Cortex-M3 core itself contains no internal caches; these attributes are exported on the bus for system-level caches and flash accelerators to honor. Proper configuration is important for system performance and determinism.
Shareability
The Cortex-M3 allows memory regions to be configured as shareable or non-shareable. Shareable regions may be accessed by multiple bus masters (for example, a DMA controller or a second processor in the system), while non-shareable regions are private to a single master.
Shareability is configured per region through the S bit of the MPU Region Attribute and Size Register (MPU_RASR). Marking a region non-shareable can provide faster access, since no coordination with other masters is required, but care must be taken that only one master ever accesses a non-shareable region.
Typically, code and read-only data such as constants can be configured as non-shareable, since they are only read. Read-write data such as global variables visible to another master should be shareable to avoid consistency issues, and memory-mapped peripherals are usually marked shareable as well.
The benefits of configuring regions as non-shareable are:
- Faster access to local regions without arbitration
- Avoiding bus contention with other processors
- Better real-time determinism for memory access
The tradeoffs are:
- Care must be taken to avoid data consistency issues
- Startup code must set up regions correctly
- Memory map must align with shareability requirements
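As a concrete illustration, the value written to MPU_RASR for a shareable region simply has the S bit set alongside the size, access-permission, and enable fields. A minimal host-side sketch, with bit positions taken from the ARMv7-M architecture manual (the 64 KB full-access SRAM region is an assumption for illustration):

```c
#include <assert.h>
#include <stdint.h>

/* MPU_RASR bit positions per the ARMv7-M architecture manual */
#define RASR_ENABLE   (1u << 0)
#define RASR_SIZE(n)  ((uint32_t)(n) << 1)   /* region size = 2^(n+1) bytes */
#define RASR_S        (1u << 18)             /* shareable */
#define RASR_AP_FULL  (0x3u << 24)           /* AP=011: full access */

/* RASR value for a 64 KB (2^16, so n=15) shareable, full-access region */
static uint32_t shared_sram_rasr(void) {
    return RASR_AP_FULL | RASR_S | RASR_SIZE(15) | RASR_ENABLE;
}
```

Clearing RASR_S in the same word would make the region non-shareable while leaving everything else unchanged.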
Cache Policies
The Cortex-M3 core itself contains no instruction or data cache, but the ARMv7-M architecture lets each memory region carry a cacheability attribute that a system-level cache or vendor flash accelerator can honor. The policy for each region can be configured as:
- Non-cacheable – Region is not cached
- Write-through – Writes go to cache and memory
- Write-back – Writes only go to cache, dirty data is flushed later
Cache policy is configured per region through the TEX, C, and B bits of MPU_RASR. Non-cacheable regions bypass any cache completely, which provides deterministic access latency but no caching benefit.
Write-through policy writes to both the cache and memory on every store. Reads can benefit from caching, but writes see memory latency, and the cache never holds dirty data.
Write-back policy writes only to the cache; modified data is written out to memory later, when cache lines are evicted or explicitly cleaned. This provides the best performance but means the cache can hold dirty data.
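The difference can be seen in a deliberately tiny toy model (purely illustrative, not real cache hardware): a single cached word backed by one memory location. With write-back, memory stays stale until an explicit flush; with write-through, it never does.

```c
#include <assert.h>
#include <stdint.h>

/* Toy single-line "cache" over one memory word, for illustration only */
typedef struct {
    uint32_t  line;   /* cached copy */
    int       dirty;  /* modified but not yet written to memory? */
    uint32_t *mem;    /* backing memory word */
} toy_cache;

/* Write-through: update cache and memory together; memory is never stale */
static void write_through(toy_cache *c, uint32_t v) {
    c->line = v;
    *c->mem = v;
}

/* Write-back: update only the cache; memory stays stale until a flush */
static void write_back(toy_cache *c, uint32_t v) {
    c->line  = v;
    c->dirty = 1;
}

/* Flush: write dirty data out to memory */
static void flush(toy_cache *c) {
    if (c->dirty) { *c->mem = c->line; c->dirty = 0; }
}
```

The stale window between write_back and flush is exactly the "dirty data" hazard described above: any other bus master reading memory during that window sees the old value.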
Key considerations for configuring cache policies:
- Code regions can be write-through or non-cacheable
- Read-only data like constants can be write-through
- Shared read-write data and memory-mapped peripherals should be non-cacheable
- Stacks, heaps can be write-back for performance
The benefits of caching include:
- Faster access to cached code and data
- Higher hit rates improve performance
- Write-back improves write performance
The tradeoffs are:
- Cache misses result in slower, variable latency
- Write-back can result in dirty data in caches
- Improper caching can cause bugs and data inconsistencies
Setup and Configuration
The Cortex-M3 memory regions and their attributes are configured via the MPU registers in the System Control Space. This is typically done in startup code.
Key steps for configuration:
- Define memory map with regions for code, data, peripherals
- Program the MPU region registers (MPU_RNR, MPU_RBAR, MPU_RASR) for each region
- Set shareability attribute for each region
- Set cache policy for each region
- Invalidate or clean any system-level caches, if present (this is vendor-specific on Cortex-M3)
- Enable the MPU, and any system-level caches, if used
Tools such as Keil uVision and middleware such as CMSIS provide abstractions that make memory region configuration easier.
Example configuration using the CMSIS MPU register definitions (region numbers, base addresses, and sizes are illustrative):

// Region 0: code – write-through cacheable, non-shareable, read-only
MPU->RNR  = 0;
MPU->RBAR = 0x00000000;                                        // Flash base
MPU->RASR = (0x6u << MPU_RASR_AP_Pos) | MPU_RASR_C_Msk         // read-only; TEX=000 C=1 B=0
          | (17u << MPU_RASR_SIZE_Pos) | MPU_RASR_ENABLE_Msk;  // 256 KB, enabled
// Region 1: data – non-cacheable, shareable, full access
MPU->RNR  = 1;
MPU->RBAR = 0x20000000;                                        // SRAM base
MPU->RASR = (0x3u << MPU_RASR_AP_Pos) | (1u << MPU_RASR_TEX_Pos) | MPU_RASR_S_Msk
          | (15u << MPU_RASR_SIZE_Pos) | MPU_RASR_ENABLE_Msk;  // 64 KB, enabled
// Region 2: peripherals – shareable device, full access, no execute
MPU->RNR  = 2;
MPU->RBAR = 0x40000000;                                        // peripheral base
MPU->RASR = (0x3u << MPU_RASR_AP_Pos) | MPU_RASR_XN_Msk | MPU_RASR_B_Msk
          | (28u << MPU_RASR_SIZE_Pos) | MPU_RASR_ENABLE_Msk;  // 512 MB, enabled
// Enable the MPU with the default memory map as the privileged background
MPU->CTRL = MPU_CTRL_PRIVDEFENA_Msk | MPU_CTRL_ENABLE_Msk;
__DSB(); __ISB();                                              // ensure new attributes take effect
This configures the code and data regions with appropriate shareability and cacheability attributes, and marks the peripheral region as shareable device memory so that peripheral accesses are never cached.
Validation
Memory region configuration should be validated by:
- Reviewing startup code for correct MPU configuration
- Checking memory map aligns with requirements
- Verifying cache hit/miss rates during execution
- Confirming real-time performance with caches on vs off
- Testing inter-processor communication for shareable regions
- Confirming non-shareable region access is local only
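Part of the "memory map aligns with requirements" check can be automated off-target. A small host-side sketch of the ARMv7-M MPU region rules (size must be a power of two, at least 32 bytes, and the base address must be aligned to the region size):

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if (base, size) is a legal ARMv7-M MPU region, else 0 */
static int region_valid(uint32_t base, uint32_t size) {
    if (size < 32u) return 0;                 /* minimum region size is 32 bytes */
    if (size & (size - 1u)) return 0;         /* size must be a power of two */
    return (base & (size - 1u)) == 0;         /* base must be size-aligned */
}
```

Running such a check over the linker-defined memory map in a host-side unit test catches misaligned regions before they ever reach the target.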
Bugs from improper configuration include:
- Data corruption from multiple masters accessing a non-shareable region
- Race conditions on shared data
- Crashes due to instruction/data aborts
- Cache coherency issues
- Increased latency for cache misses
Validation should cover boot testing, runtime monitoring, and system integration testing.
Tuning and Optimization
Based on validation results, the memory configuration can be tuned for better performance:
- Increase non-shareable regions to reduce contention
- Configure more write-through regions to reduce dirty data
- Make stacks/heaps write-back cacheable for speed
- Make frequently used code regions non-cacheable to improve determinism
Cache usage and hit/miss rates should be monitored. Additional caching may help memory-bound applications, while disabling caches entirely can improve real-time determinism.
Latency of shareable vs non-shareable accesses should be measured. More non-shareable regions can be added if latency is critical.
Data cache policies can also be tuned depending on how much dirty data can be tolerated.
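On-target, the access latencies discussed above can be measured with the Cortex-M3 DWT cycle counter. A sketch assuming the CMSIS register definitions (hardware-dependent, so shown as a configuration fragment rather than host-runnable code):

```c
#include "core_cm3.h"   /* CMSIS Cortex-M3 register definitions */

/* Measure the approximate cycle cost of a single read from the given address.
 * The result includes a few cycles of instruction overhead, so compare
 * measurements against each other rather than treating them as absolute. */
static uint32_t read_latency_cycles(volatile uint32_t *addr) {
    CoreDebug->DEMCR |= CoreDebug_DEMCR_TRCENA_Msk;  /* enable the DWT unit */
    DWT->CTRL        |= DWT_CTRL_CYCCNTENA_Msk;      /* start the cycle counter */
    uint32_t start = DWT->CYCCNT;
    (void)*addr;                                     /* the timed access */
    return DWT->CYCCNT - start;
}
```

Comparing this measurement for shareable versus non-shareable regions, or with caching enabled versus disabled, gives the latency data the tuning decisions above depend on.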
Use of MPU
The Cortex-M3 MPU provides an additional layer of memory protection. It can enforce:
- Access permissions – read/write/execute
- Privilege levels – unprivileged vs privileged mode access
- Memory maps – which addresses are accessible
The MPU is configured via its registers in the System Control Space to set up to eight protected regions and their attributes. It should be used to protect and isolate memory.
Important considerations when using MPU:
- Enable MPU early in startup before memory access
- Setup regions, permissions, privileges correctly
- Ensure good coverage of memory map with regions
- Minimize region count for performance
- Keep privileged-mode accesses to a minimum
The MPU can mitigate the impact of software bugs by limiting the damage a faulty access can do. Proper configuration is necessary for both security and correct operation.
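One common isolation use is a small no-access guard region at the bottom of a task stack, so that a stack overflow raises a MemManage fault instead of silently corrupting adjacent data. A sketch assuming the CMSIS MPU definitions (the guard base address is a placeholder, not a real map):

```c
#include "core_cm3.h"   /* CMSIS Cortex-M3 register definitions */

#define STACK_GUARD_BASE  0x20000000u   /* placeholder: lowest address of the task stack */

/* Program the highest-numbered (highest-priority) region as a 32-byte
 * no-access guard; any access into it raises a MemManage fault. */
static void install_stack_guard(void) {
    MPU->RNR  = 7u;
    MPU->RBAR = STACK_GUARD_BASE;
    MPU->RASR = (0u << MPU_RASR_AP_Pos)          /* AP=000: no access, any privilege */
              | (4u << MPU_RASR_SIZE_Pos)        /* 2^(4+1) = 32 bytes */
              | MPU_RASR_ENABLE_Msk;
    SCB->SHCSR |= SCB_SHCSR_MEMFAULTENA_Msk;     /* enable the MemManage fault handler */
    __DSB(); __ISB();                            /* ensure the guard takes effect */
}
```

Using the highest region number matters because, in ARMv7-M, higher-numbered regions take priority when regions overlap, so the guard overrides any broader full-access SRAM region beneath it.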
Interactions with Debug
Debug and trace capabilities can interact with memory regions, caches and MPU in various ways.
Considerations when debugging:
- Debug can override memory protections
- Watchpoints may not work properly on non-shared regions
- Tracing can be limited by MPU permissions
- Debug can invalidate or enable caches
Steps to take:
- Use debug authentication to control override abilities
- Understand MPU effects on watchpoints
- Give trace/debug access to needed resources
- Test debug carefully with caches on and off
Proper debug configuration and security settings are needed to avoid issues. Debug capabilities should be validated with production memory configuration.
Effects on Power
Memory configuration also impacts power consumption in various ways:
- MPU checks consume power
- Cache hits reduce memory accesses, saving power
- Non-shareable regions reduce bus contention, saving power
- Watchpoints and breakpoints can alter power consumption
Steps to optimize power:
- Minimize MPU region count
- Use appropriate caching to reduce accesses
- Increase non-shareable regions if possible
- Profile power with debug vs without
Tuning the memory configuration appropriately can lead to measurable power savings. This could be reinvested to enable more caching for better performance.
Conclusion
In summary, proper configuration of Cortex-M3 memory regions, caches, and the MPU is vital for building effective systems. A good memory map forms the foundation. Shareability, cache policies, and MPU protections must be set correctly during startup.
Validation should be thorough since bugs can be severe and hard to isolate. Performance and power can be optimized by tuning the configuration based on profiling. Debug must also be configured properly to match the system’s security needs.
Following best practices for the target use cases will enable developers to maximize the benefits of the Cortex-M3 memory architecture and mitigate any pitfalls.