© S-O-C.ORG, All Rights Reserved.

What is Data Cache in Arm Cortex-M series?

David Moore
Last updated: September 17, 2023 1:06 pm

The data cache in Arm Cortex-M series microcontrollers is a small, fast memory that stores copies of data from the main memory. The purpose of the data cache is to reduce the number of accesses to the main memory, which is slower, and improve the performance of data retrieval operations.

Contents

  • Overview of Caches
  • Benefits of Caching
  • Cache Organization
  • Cache Operation
  • Write Policies
  • Data Cache in Cortex-M
  • Cache Features
  • Cache Maintenance
  • Cache Coherence
  • Cache Performance
  • Guidelines for Optimizing Cache Performance
  • Configuring Cache in Cortex-M
  • Use Cases
  • Limitations
  • Conclusion

Overview of Caches

In computer systems, caches are small memories used to store copies of frequently used data. They serve as temporary staging areas for data that the processor is likely to need next. Reading data from a cache is much faster than reading from main memory.

Caches exploit the locality of reference principle – the tendency for programs to reuse data and instructions they have used recently. By keeping copies of recently accessed data in the fast cache, the processor avoids having to read slower main memory every time that data is needed.

Benefits of Caching

The key benefits of using caches are:

  • Reduced latency – Cache hits are served much faster than main memory reads
  • Increased throughput – With fewer stalls waiting on main memory, the processor completes more work per unit time
  • Lower power consumption – A cache access consumes less energy than an external memory access
  • Software transparency – Caches hide memory latency without requiring changes to the code

Cache Organization

A cache comprises a cache controller, cache memory, and a cache directory (tag store). The cache controller manages the data flow between main memory and the cache memory, the cache memory holds the actual copies of data, and the cache directory records which memory addresses are currently mapped to which cache locations.

Caches are organized into cache lines (or blocks). Each cache line corresponds to a contiguous block of memory that is copied as a unit to the cache. Typical cache line sizes range from 16 to 128 bytes. Data is moved between memory and cache in units of cache lines.
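To make this organization concrete, here is a minimal sketch of how a controller splits an address into its fields, assuming a hypothetical 4 KB, 4-way cache with 32-byte lines (the sizes are illustrative, not a specific Cortex-M configuration):

```c
#include <stdint.h>

/* Hypothetical geometry for illustration: a 4 KB, 4-way cache with
 * 32-byte lines has 4096 / (32 * 4) = 32 sets. */
#define CACHE_SIZE 4096u
#define LINE_SIZE  32u
#define NUM_WAYS   4u
#define NUM_SETS   (CACHE_SIZE / (LINE_SIZE * NUM_WAYS))

/* The controller splits every address into a byte offset within the
 * line, a set index, and a tag stored in the cache directory. */
static inline uint32_t line_offset(uint32_t addr) { return addr % LINE_SIZE; }
static inline uint32_t set_index(uint32_t addr)   { return (addr / LINE_SIZE) % NUM_SETS; }
static inline uint32_t addr_tag(uint32_t addr)    { return addr / (LINE_SIZE * NUM_SETS); }
```

For example, address 0x20001234 falls at byte offset 20 within its line and maps to set 17; the remaining high-order bits form the tag, so any address in the same 32-byte block shares the same set and tag.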

Cache Operation

When the processor needs to read data, it first checks if the data is present in the cache. If so, a cache hit occurs and the data is returned quickly. If not, a cache miss occurs and the data must be read from the slower main memory.

On a cache miss, the cache line containing the requested data is copied from memory into the cache, evicting whatever previously occupied that location. Because an entire line is fetched at once, neighbouring data arrives along with the requested word, which exploits spatial locality; some processors additionally prefetch subsequent lines to improve performance further.

Write Policies

With write operations, caches implement either a write-through or write-back policy. In a write-through cache, data is written to both the cache and main memory. In a write-back cache, data is only written to the cache initially. Writes are forwarded to main memory later when the cache line is evicted.

Data Cache in Cortex-M

Within the Cortex-M family, only the higher-performance cores such as the Cortex-M7 (and the later Cortex-M55 and Cortex-M85) include a data cache; the Cortex-M0/M0+/M3/M4 have none. The Cortex-M7 data cache is 4-way set associative with an implementation-defined size from 4 KB to 64 KB and a cache line size of 8 words (32 bytes). Its write policy (write-through or write-back) is determined by the attributes of the memory region being accessed.

The data cache sits between the CPU and the bus interface, reducing bus traffic and average memory latency. Line fill buffers assemble and align incoming data as cache lines are fetched, and write buffers hold pending writes until the bus is available.

Cache Features

Key features of the Cortex-M data cache include:

  • 4-way set associative organization
  • Write-through and write-back support, selected by memory region attributes
  • Allocate on read misses
  • Write allocation dependent on memory region attributes
  • 32-byte (8-word) cache lines
  • LRU replacement policy
  • Optional ECC protection

Cache Maintenance

The Cortex-M cache controller provides cache maintenance operations to manage cache coherence and consistency. These operations include:

  • Invalidate – Mark cache line as invalid
  • Clean – Write dirty data to memory
  • Clean and invalidate – Clean then invalidate cache line
  • Flush – Clean and invalidate entire cache

The Armv7-M architecture exposes these operations through memory-mapped cache maintenance registers in the System Control Block, rather than the dedicated maintenance instructions used in Armv7-A/R. Maintenance can be performed on a single line by address, by set/way, or on the entire cache, and CMSIS-Core provides helper functions that wrap these registers.
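On the Cortex-M7, a common use of these operations is keeping buffers shared with a DMA controller consistent. The sketch below uses the standard CMSIS-Core helper functions; the buffer names, sizes, and commented-out driver calls are hypothetical:

```c
#include "core_cm7.h"   /* CMSIS-Core; normally pulled in via the device header */

#define BUF_SIZE 512u
/* Maintenance by address operates on whole lines, so buffers shared
 * with DMA should be 32-byte aligned and a multiple of 32 bytes. */
static uint8_t tx_buf[BUF_SIZE] __attribute__((aligned(32)));
static uint8_t rx_buf[BUF_SIZE] __attribute__((aligned(32)));

void dma_transmit(void)
{
    /* Clean: push dirty cached tx_buf contents to memory so the
     * DMA controller reads what the CPU actually wrote. */
    SCB_CleanDCache_by_Addr((uint32_t *)tx_buf, BUF_SIZE);
    /* dma_start_tx(tx_buf, BUF_SIZE);   -- hypothetical driver call */
}

void dma_receive_complete(void)
{
    /* Invalidate: discard stale cached copies so CPU reads of rx_buf
     * fetch the data the DMA controller just delivered. */
    SCB_InvalidateDCache_by_Addr((uint32_t *)rx_buf, BUF_SIZE);
    /* process(rx_buf, BUF_SIZE);        -- hypothetical */
}
```

Cleaning before a DMA read and invalidating before consuming DMA-written data is the canonical pairing; mixing them up leads to the DMA engine or CPU seeing stale data.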

Cache Coherence

In multicore Cortex-M systems, each core has its own caches, and the hardware does not keep the data caches coherent with one another. Coherence for shared data must therefore be maintained by a software protocol.

The protocol involves flushing or cleaning data caches at synchronization points. Multicore semaphores, locks, and shared data structures are designed to force cache maintenance operations when entering and exiting critical sections. This prevents cores from operating on stale cached data.
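A sketch of that pattern on a Cortex-M7-class core, using the CMSIS-Core maintenance helpers (the lock functions and the shared buffer are hypothetical):

```c
#include "core_cm7.h"   /* CMSIS-Core helpers for Cortex-M7 */

/* Shared buffer, 32-byte aligned so maintenance by address does not
 * disturb neighbouring variables that might share a cache line. */
static int32_t shared[64] __attribute__((aligned(32)));

void publish(void)            /* runs on the producing core */
{
    /* ... write results into shared[] ... */
    /* Clean: make the writes visible in memory before releasing the lock. */
    SCB_CleanDCache_by_Addr((uint32_t *)shared, sizeof(shared));
    /* unlock(&shared_lock);   -- hypothetical */
}

void consume(void)            /* runs on the consuming core */
{
    /* lock(&shared_lock);     -- hypothetical */
    /* Invalidate: drop any stale cached copy before reading. */
    SCB_InvalidateDCache_by_Addr((uint32_t *)shared, sizeof(shared));
    /* ... read shared[] ... */
}
```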

Cache Performance

The performance benefit of caching depends on the cache hit rate. This is the fraction of memory accesses that are satisfied by the cache without accessing main memory. A higher hit rate results in lower average memory access time.

The hit rate depends on the cache size, access locality of the application, and other policies like replacement and write strategy. By optimizing cache usage, a system can significantly improve performance.
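The effect of the hit rate can be quantified with the standard average memory access time (AMAT) formula; the cycle counts below are illustrative values, not figures for any particular Cortex-M device:

```c
/* AMAT = hit_time + miss_rate * miss_penalty.
 * All values in CPU cycles; the numbers used in the example calls
 * are made up for illustration. */
static double amat(double hit_time_cycles, double miss_rate, double miss_penalty_cycles)
{
    return hit_time_cycles + miss_rate * miss_penalty_cycles;
}

/* With a 1-cycle hit and a 20-cycle miss penalty:
 *   95% hit rate -> amat(1.0, 0.05, 20.0) = 2.0 cycles on average
 *   80% hit rate -> amat(1.0, 0.20, 20.0) = 5.0 cycles on average */
```

Even a modest drop in hit rate more than doubles the average access time in this example, which is why locality-aware code matters.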

Guidelines for Optimizing Cache Performance

Here are some guidelines for optimizing cache performance in a Cortex-M system:

  • Organize data structures to maximize spatial locality and sequential access
  • Improve temporal locality by reusing data while it is still cached
  • Increase the cache size, where the implementation allows, if the hit rate is low
  • Write cache-friendly code, for example by iterating over arrays in memory order
  • Minimize cache misses by prefetching data that will be needed soon
  • Use cache coloring or padding to avoid conflict misses
  • Confine multicore cache maintenance to well-defined synchronization points
  • Place frequently used stack and global variables so hot data does not thrash the cache

Profiling cache behavior and optimizing based on real usage is key. Tools like ARM Streamline can be used to analyze cache performance.
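As a concrete illustration of the spatial-locality guideline, the two traversals below compute the same sum but differ greatly in cache behaviour:

```c
#define ROWS 64
#define COLS 64

/* Row-major traversal walks memory sequentially, so one line fill
 * serves the next several elements (good spatial locality). */
long sum_row_major(int m[ROWS][COLS])
{
    long s = 0;
    for (int r = 0; r < ROWS; r++)
        for (int c = 0; c < COLS; c++)
            s += m[r][c];
    return s;
}

/* Column-major traversal strides COLS * sizeof(int) bytes between
 * accesses, touching a different cache line almost every time. */
long sum_col_major(int m[ROWS][COLS])
{
    long s = 0;
    for (int c = 0; c < COLS; c++)
        for (int r = 0; r < ROWS; r++)
            s += m[r][c];
    return s;
}
```

Both functions return the same value; on a cached system the row-major version is the one that turns each miss into several subsequent hits.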

Configuring Cache in Cortex-M

The Cortex-M7 data cache geometry (size and associativity) is fixed when the chip is implemented; how the cache is used is then controlled by software at runtime. Key configuration aspects include:

  • Cache size, selected at implementation time from 4 KB to 64 KB
  • Enabling and disabling the cache at runtime
  • Per-region cacheability, write policy, and shareability attributes, configured through the MPU
  • Cache maintenance operations through the System Control Block registers

At runtime, the data cache is enabled or disabled through the DC bit of the Configuration and Control Register (SCB->CCR); CMSIS wraps the required invalidate-then-enable sequence in SCB_EnableDCache(). The cache is disabled at reset.
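With CMSIS-Core, the enable and disable sequences reduce to single helper calls. The helper names below are the standard CMSIS ones; the surrounding boot/shutdown functions are hypothetical:

```c
#include "core_cm7.h"   /* CMSIS-Core for Cortex-M7 */

/* Typical early-boot bring-up: the CMSIS helpers invalidate the cache
 * contents before setting the enable bits in SCB->CCR. */
void caches_init(void)
{
    SCB_EnableICache();   /* invalidate, then set CCR.IC */
    SCB_EnableDCache();   /* invalidate, then set CCR.DC */
}

/* Before jumping to a bootloader or another image, clean out and
 * switch off the data cache so memory holds the up-to-date data. */
void caches_shutdown(void)
{
    SCB_DisableDCache();  /* clean, disable, then invalidate */
    SCB_DisableICache();
}
```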

Use Cases

Some typical use cases for leveraging the Cortex-M data cache are:

  • Storing frequently used data structures
  • Speeding up repeated accesses to constant tables held in slower flash memory
  • Buffering data transferred over low bandwidth buses
  • Avoiding wait states when accessing high latency memories
  • Keeping the working data of signal processing algorithms close to the CPU

For memory intensive applications, the data cache can help avoid stalls and improve throughput. It works best when access patterns have good locality.

Limitations

While caches improve performance, they have some limitations:

  • Added latency on cache misses
  • Complexity of cache coherence in multicore systems
  • Overhead of managing cache with limited memory
  • Power consumption of cache memories
  • Reduced timing determinism, which complicates worst-case execution-time analysis

The benefits of caching may be less noticeable for small, deterministic real-time systems. Cache usage should be tailored to the application requirements.

Conclusion

The Cortex-M data cache reduces memory latency by storing local copies of frequently used data. It improves performance by exploiting locality of memory accesses in embedded applications. Cache optimization can provide significant speedups for memory-bound use cases.

Understanding cache organization, operation, and configuration is key to utilizing it effectively. Paying attention to cache usage and tuning cache policies accordingly helps unlock the benefits of caching in embedded Arm processors.
