Cortex-M3 Memory Region Shareability and Cache Policies (Explained)

Scott Allen
Last updated: November 6, 2023 2:01 am

The Cortex-M3 memory system allows configuring memory regions to be shareable or non-shareable between processors. It also allows configuring cache policies for each region to specify whether the region is cached or bypassed. Proper configuration is important for system performance and determinism.

Contents
  • Shareability
  • Cache Policies
  • Setup and Configuration
  • Validation
  • Tuning and Optimization
  • Use of MPU
  • Interactions with Debug
  • Effects on Power
  • Conclusion

Shareability

The Cortex-M3 allows configuring memory regions as shareable or non-shareable. Shareable regions can be accessed by multiple processors, while non-shareable regions are local to each processor.

Shareability is configured per region through the S (shareable) bit of the MPU Region Attribute and Size Register (MPU_RASR). Without an MPU, the default memory map fixes the attribute for each address range. Marking a region non-shareable can provide faster access, since the memory system may assume the local processor has exclusive access. But care must be taken to avoid multiple processors accessing non-shareable regions.
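As a minimal sketch of how the shareable attribute is expressed in a register value, assuming the ARMv7-M MPU layout in which bit 18 (S) of the Region Attribute and Size Register marks a region shareable. The helper names are hypothetical; only the bit position comes from the architecture.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 18 of MPU_RASR is the S (shareable) bit on ARMv7-M. */
#define RASR_S_BIT (1u << 18)

/* Hypothetical helper: return a RASR value with the S bit set or cleared. */
static inline uint32_t rasr_set_shareable(uint32_t rasr, bool shareable)
{
    return shareable ? (rasr | RASR_S_BIT) : (rasr & ~RASR_S_BIT);
}

/* Hypothetical helper: report whether a RASR value marks the region shareable. */
static inline bool rasr_is_shareable(uint32_t rasr)
{
    return (rasr & RASR_S_BIT) != 0u;
}
```

The helpers only transform a value; writing it to the hardware register is left to the startup code.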

Typically, code and read-only data like constants can be configured as non-shareable, since they are only read by processors. Read-write data that more than one processor or bus master touches, such as shared global variables, should be shareable to avoid consistency issues. Memory-mapped peripherals are usually shareable too.

The benefits of configuring regions as non-shareable are:

  • Faster access to local regions without arbitration
  • Avoiding bus contention with other processors
  • Better real-time determinism for memory access

The tradeoffs are:

  • Care must be taken to avoid data consistency issues
  • Startup code must set up regions correctly
  • Memory map must align with shareability requirements

Cache Policies

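The Cortex-M3 core itself contains no instruction or data cache. Caching, where present, comes from a system-level cache or flash accelerator added by the chip vendor, and the core exports each region's memory attributes on the bus so that such a cache can honor them. Cache policy for each region can be configured as: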

  • Non-cacheable – Region is not cached
  • Write-through – Writes go to cache and memory
  • Write-back – Writes only go to cache, dirty data is flushed later

Cache policy is configured through the TEX, C and B fields of MPU_RASR for each region. Non-cacheable regions bypass the cache completely. This provides deterministic latency but no caching benefits.

Write-through policy writes to both cache and memory simultaneously. Reads can benefit from caching, but writes have longer latency. Dirty data is never present.

Write-back only writes to the cache, and data is flushed to memory later when cache lines are evicted. This provides optimal performance but can result in dirty data in caches.
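The three policies above map onto attribute bits roughly as follows. This is a sketch assuming the ARMv7-M encodings for Normal memory (TEX in bits 21:19, C in bit 17, B in bit 16); the enum and function names are hypothetical, and the write-back variant shown is the one without write allocation.

```c
#include <stdint.h>

/* MPU_RASR attribute field positions on ARMv7-M. */
#define RASR_TEX_POS 19u
#define RASR_C_BIT   (1u << 17)
#define RASR_B_BIT   (1u << 16)

/* Hypothetical names for the three policies discussed in the text. */
typedef enum {
    POLICY_NON_CACHEABLE,  /* TEX=001, C=0, B=0 */
    POLICY_WRITE_THROUGH,  /* TEX=000, C=1, B=0 */
    POLICY_WRITE_BACK      /* TEX=000, C=1, B=1 (no write allocate) */
} cache_policy_t;

/* Return the TEX/C/B bits for a policy, ready to OR into a RASR value. */
uint32_t policy_to_rasr_bits(cache_policy_t policy)
{
    switch (policy) {
    case POLICY_WRITE_THROUGH:
        return RASR_C_BIT;
    case POLICY_WRITE_BACK:
        return RASR_C_BIT | RASR_B_BIT;
    case POLICY_NON_CACHEABLE:
    default:
        return 1u << RASR_TEX_POS;
    }
}
```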

Key considerations for configuring cache policies:

  • Code regions can be write-through or non-cacheable
  • Read-only data like constants can be write-through
  • Shared RW data (for example DMA buffers) and memory-mapped peripherals should be non-cacheable
  • Stacks, heaps can be write-back for performance
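One way to keep these recommendations in one reviewable place is a small lookup table. The sketch below simply restates the list above as data; the region-kind strings and all identifiers are hypothetical, and where the list offers two options for code the table picks write-through.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef enum { NON_CACHEABLE, WRITE_THROUGH, WRITE_BACK } policy_t;

typedef struct {
    const char *kind;   /* hypothetical region-kind label */
    policy_t    policy; /* suggested cache policy from the list above */
} region_rule_t;

static const region_rule_t k_rules[] = {
    { "code",       WRITE_THROUGH },
    { "rodata",     WRITE_THROUGH },
    { "rwdata",     NON_CACHEABLE },
    { "peripheral", NON_CACHEABLE },
    { "stack",      WRITE_BACK    },
    { "heap",       WRITE_BACK    },
};

/* Look up the suggested policy; returns false for an unknown kind. */
bool lookup_policy(const char *kind, policy_t *out)
{
    for (size_t i = 0; i < sizeof k_rules / sizeof k_rules[0]; i++) {
        if (strcmp(k_rules[i].kind, kind) == 0) {
            *out = k_rules[i].policy;
            return true;
        }
    }
    return false;
}
```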

The benefits of caching include:

  • Faster access to cached code and data
  • Higher hit rates improve performance
  • Write-back improves write performance

The tradeoffs are:

  • Cache misses make access latency slower and variable
  • Write-back can result in dirty data in caches
  • Improper caching can cause bugs and data inconsistencies

Setup and Configuration

The Cortex-M3 memory regions and cache policies are configured via the MPU registers, which sit in the System Control Space alongside the System Control Block (SCB). This is typically done in startup code.

Key steps for configuration:

  1. Define memory map with regions for code, data, peripherals
  2. Program MPU_RNR, MPU_RBAR and MPU_RASR for each region
  3. Set shareability attribute for each region
  4. Set cache policy for each region
  5. Invalidate any system-level cache before enabling it
  6. Enable the MPU, and the caches if used

Tools such as Keil µVision and the CMSIS headers provide abstractions to configure memory regions more easily.

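Example pseudocode for MPU configuration, using CMSIS register names (region numbers, base addresses and sizes are illustrative):

// Code region: write-through cacheable, non-shareable, read-only, 512 KB
MPU->RNR  = 0;
MPU->RBAR = 0x00000000;
MPU->RASR = (6UL << MPU_RASR_AP_Pos) | MPU_RASR_C_Msk | (18UL << MPU_RASR_SIZE_Pos) | MPU_RASR_ENABLE_Msk;
// Data region: non-cacheable, shareable, no execute, 64 KB
MPU->RNR  = 1;
MPU->RBAR = 0x20000000;
MPU->RASR = (3UL << MPU_RASR_AP_Pos) | MPU_RASR_XN_Msk | (1UL << MPU_RASR_TEX_Pos) | MPU_RASR_S_Msk | (15UL << MPU_RASR_SIZE_Pos) | MPU_RASR_ENABLE_Msk;
// Peripheral region: non-cacheable (Device), shareable, no execute, 512 MB
MPU->RNR  = 2;
MPU->RBAR = 0x40000000;
MPU->RASR = (3UL << MPU_RASR_AP_Pos) | MPU_RASR_XN_Msk | MPU_RASR_S_Msk | MPU_RASR_B_Msk | (28UL << MPU_RASR_SIZE_Pos) | MPU_RASR_ENABLE_Msk;
// Invalidate and enable any system-level cache here through its vendor-specific registers; the Cortex-M3 core has no cache control registers of its own
// Enable the MPU, keeping the default memory map for privileged code
MPU->CTRL = MPU_CTRL_PRIVDEFENA_Msk | MPU_CTRL_ENABLE_Msk;
__DSB();
__ISB();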

This configures code and data regions appropriately with the correct shareability and caching. Peripherals are marked as non-cacheable and shareable, which is required.
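Region sizes in the MPU are not written in bytes. Assuming the ARMv7-M rule that a region covers 2^(SIZE+1) bytes with a minimum of 32 bytes, the SIZE field can be derived as in the sketch below. The function name is hypothetical, and a full 4 GB region (SIZE = 31) is outside the range of the 32-bit byte count used here.

```c
#include <stdint.h>

/*
 * Hypothetical helper: return the MPU SIZE field for a region of `bytes`
 * bytes, where region size = 2^(SIZE+1). Returns -1 if `bytes` is smaller
 * than 32 or is not a power of two.
 */
int rasr_size_field(uint32_t bytes)
{
    if (bytes < 32u || (bytes & (bytes - 1u)) != 0u) {
        return -1;
    }
    int exponent = 0;
    while ((bytes >> exponent) != 1u) {
        exponent++;
    }
    return exponent - 1;
}
```

For example, a 64 KB region yields a SIZE field of 15 and a 512 MB region yields 28.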

Validation

Memory region configuration should be validated by:

  • Reviewing startup code for correct MPU configuration
  • Checking memory map aligns with requirements
  • Verifying cache hit/miss rates during execution
  • Confirming real-time performance with caches on vs off
  • Testing inter-processor communication for shareable regions
  • Confirming non-shareable region access is local only
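For the code-review step, it can help to decode a region attribute value back into readable fields. The sketch below assumes the ARMv7-M MPU_RASR bit layout; the struct and function names are hypothetical, and obtaining the raw value (from source or by reading the register) is not shown.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    bool     enabled;        /* ENABLE, bit 0 */
    uint64_t size_bytes;     /* 2^(SIZE+1), SIZE in bits 5:1 */
    bool     bufferable;     /* B, bit 16 */
    bool     cacheable;      /* C, bit 17 */
    bool     shareable;      /* S, bit 18 */
    uint8_t  tex;            /* TEX, bits 21:19 */
    uint8_t  ap;             /* AP, bits 26:24 */
    bool     execute_never;  /* XN, bit 28 */
} rasr_fields_t;

/* Split a raw MPU_RASR value into its attribute fields. */
rasr_fields_t rasr_decode(uint32_t rasr)
{
    rasr_fields_t f;
    f.enabled       = (rasr & 1u) != 0u;
    f.size_bytes    = (uint64_t)1 << (((rasr >> 1) & 0x1Fu) + 1u);
    f.bufferable    = ((rasr >> 16) & 1u) != 0u;
    f.cacheable     = ((rasr >> 17) & 1u) != 0u;
    f.shareable     = ((rasr >> 18) & 1u) != 0u;
    f.tex           = (uint8_t)((rasr >> 19) & 0x7u);
    f.ap            = (uint8_t)((rasr >> 24) & 0x7u);
    f.execute_never = ((rasr >> 28) & 1u) != 0u;
    return f;
}
```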

Bugs from improper configuration include:

  • Data corruption when several processors access a non-shareable region
  • Race conditions on shared data
  • Crashes due to MemManage or BusFault exceptions
  • Cache coherency issues
  • Increased latency for cache misses

Validation should cover boot testing, runtime monitoring, and system integration testing.

Tuning and Optimization

Based on validation results, the memory configuration can be tuned for better performance:

  • Increase non-shareable regions to reduce contention
  • Configure more write-through regions to reduce dirty data
  • Make stacks/heaps write-back cacheable for speed
  • Make frequently used code regions non-cacheable to improve determinism

Cache usage and hit/miss rates should be monitored. Additional caching may help memory-bound applications, whereas disabling caches can improve real-time determinism.
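To judge whether caching pays off, measured hit and miss counts can be turned into an average access cost. The sketch below uses the usual weighted-average model; the function name is hypothetical and the per-access cycle costs are inputs you would measure on your own system, not values from the Cortex-M3 documentation.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Average cycles per access, given hit and miss counts and the cost in
 * cycles of a hit and of a miss. At least one access must be recorded.
 */
double average_access_cycles(uint32_t hits, uint32_t misses,
                             uint32_t hit_cycles, uint32_t miss_cycles)
{
    assert(hits > 0u || misses > 0u);
    double total = (double)hits * hit_cycles + (double)misses * miss_cycles;
    return total / ((double)hits + (double)misses);
}
```

A large gap between the hit cost and the resulting average indicates that misses dominate, which is also the case where latency is least predictable.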

Latency of shareable vs non-shareable accesses should be measured. More non-shareable regions can be added if latency is critical.
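A common way to take such measurements is to sample a free-running 32-bit cycle counter before and after a batch of accesses. The sketch below assumes the samples come from a counter such as the DWT cycle counter, which is an optional feature and may not be present on every device; reading the counter itself is not shown, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Cycles elapsed between two samples of a free-running 32-bit counter.
 * Unsigned subtraction wraps modulo 2^32, so a single counter overflow
 * between the samples is handled correctly.
 */
uint32_t cycles_elapsed(uint32_t start, uint32_t end)
{
    return end - start;
}

/* Average cycles per access over a batch of `accesses` accesses. */
double cycles_per_access(uint32_t start, uint32_t end, uint32_t accesses)
{
    assert(accesses > 0u);
    return (double)cycles_elapsed(start, end) / (double)accesses;
}
```

Running the same batch against a shareable and a non-shareable region gives two per-access figures that can be compared directly.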

Data cache policies can also be tuned depending on how much dirty data can be tolerated.

Use of MPU

The Cortex-M3 MPU provides an additional layer of memory protection. It can enforce:

  • Access permissions – read/write/execute
  • Privilege levels – unprivileged vs privileged mode access
  • Memory maps – which addresses are accessible

The MPU is optional on the Cortex-M3. When present, it is configured via MPU registers to set up as many as 8 protected regions and their attributes, and it should be used to protect and isolate memory.
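Access permissions and the execute restriction are encoded in the same attribute register as the memory type. The sketch below assumes the ARMv7-M encodings for the AP field (bits 26:24) and the XN bit (bit 28); the enum and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define RASR_AP_POS 24u
#define RASR_XN_BIT (1u << 28)

/* AP field encodings (privileged / unprivileged access). */
typedef enum {
    AP_NO_ACCESS       = 0, /* none / none */
    AP_PRIV_RW         = 1, /* read-write / none */
    AP_PRIV_RW_USER_RO = 2, /* read-write / read-only */
    AP_FULL_ACCESS     = 3, /* read-write / read-write */
    AP_PRIV_RO         = 5, /* read-only / none */
    AP_READ_ONLY       = 6  /* read-only / read-only */
} mpu_ap_t;

/* Build the AP and XN bits of a RASR value. */
uint32_t rasr_permission_bits(mpu_ap_t ap, bool executable)
{
    uint32_t bits = (uint32_t)ap << RASR_AP_POS;
    if (!executable) {
        bits |= RASR_XN_BIT; /* XN set means instruction fetches fault */
    }
    return bits;
}
```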

Important considerations when using MPU:

  • Program regions, permissions and privileges correctly before enabling the MPU
  • Enable the MPU early in startup, before application code runs
  • Ensure good coverage of memory map with regions
  • Minimize region count for performance
  • Keep privileged-mode accesses to a minimum
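Part of setting up regions correctly is respecting the MPU's structural rules. The sketch below checks a planned region against three of them, assuming the ARMv7-M constraints (size a power of two of at least 32 bytes, base address aligned to the size) and the 8-region limit mentioned above. The function name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define MPU_REGION_COUNT 8u

/* Return true if the planned region satisfies the basic MPU constraints. */
bool mpu_region_is_valid(uint32_t region, uint32_t base, uint32_t size_bytes)
{
    if (region >= MPU_REGION_COUNT) {
        return false; /* only regions 0..7 exist */
    }
    if (size_bytes < 32u || (size_bytes & (size_bytes - 1u)) != 0u) {
        return false; /* size must be a power of two, at least 32 bytes */
    }
    return (base & (size_bytes - 1u)) == 0u; /* base aligned to size */
}
```

Running such a check over the memory map at build or boot time catches misaligned regions before they turn into hard-to-trace faults.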

By limiting the damage a stray access can do, the MPU helps contain bugs. Proper configuration is necessary for both security and correct operation.

Interactions with Debug

Debug and trace capabilities can interact with memory regions, caches and MPU in various ways.

Considerations when debugging:

  • Debug can override memory protections
  • Watchpoints may not work properly on non-shareable regions
  • Tracing can be limited by MPU permissions
  • Debug can invalidate or enable caches

Steps to take:

  • Use debug authentication to control override abilities
  • Understand MPU effects on watchpoints
  • Give trace/debug access to needed resources
  • Test debug carefully with caches on and off

Proper debug configuration and security settings are needed to avoid issues. Debug capabilities should be validated with production memory configuration.

Effects on Power

Memory configuration also impacts power consumption in various ways:

  • MPU checks consume power
  • Cache hits reduce memory accesses, saving power
  • Non-shareable regions reduce bus contention, saving power
  • Watchpoints, breakpoints can alter power

Steps to optimize power:

  • Minimize MPU region count
  • Use appropriate caching to reduce accesses
  • Increase non-shareable regions if possible
  • Profile power with debug vs without

Tuning the memory configuration appropriately can lead to measurable power savings. This could be reinvested to enable more caching for better performance.

Conclusion

In summary, proper configuration of Cortex-M3 memory regions, caches and MPU is vital for building effective systems. A good memory map forms the foundation. Shareability, cache policies and MPU protections must be set correctly during startup.

Validation should be thorough since bugs can be severe and hard to isolate. Performance and power can be optimized by tuning the configuration based on profiling. Debug must also be configured properly to match the system’s security needs.

Following best practices for the target use cases will enable developers to maximize the benefits of the Cortex-M3 memory architecture and mitigate any pitfalls.
