© S-O-C.ORG, All Rights Reserved.
Configuring Memory and Caches for Arm Cortex-M1

Mike Johnston
Last updated: September 20, 2023 1:02 pm

The Arm Cortex-M1 is a soft processor core designed for implementation in FPGAs. It has a simple memory system with no caches and no memory management or protection unit. Performance and power efficiency therefore depend on how the memories around the processor are configured.

Contents

  • Cortex-M1 Memory Architecture
  • TCM Configuration
  • External RAM Configuration
  • Cache Configuration
  • Optimizing Memory Performance
  • Conclusion

Cortex-M1 Memory Architecture

The Cortex-M1 has a single external bus interface shared by instruction fetches and data accesses, so external memory is seen as one unified (von Neumann style) address space. In addition, the processor provides dedicated interfaces for instruction and data tightly coupled memories (ITCM and DTCM) that hold critical code and data. TCM provides single-cycle access but its capacity is limited: the size of each TCM is chosen when the processor is configured, with Arm's documentation listing options up to 1MB each.

For larger memories, the Cortex-M1 connects to external memory through the AMBA Advanced High-performance Bus (AHB), in its AHB-Lite form. The bus serves as the system interconnect for on-chip peripherals and external memories and devices. AHB-Lite is a single-master protocol, so the processor's own bus needs no arbitration. Systems with additional bus masters, such as a DMA controller, add an interconnect or bus matrix that arbitrates between them.

The Cortex-M1 AHB interface has a 32-bit data width and runs at the CPU clock frequency, giving a peak transfer rate of one word per cycle when the addressed slave inserts no wait states. External memories are commonly SRAM or SDRAM. They offer more capacity than the TCMs but take multiple clock cycles per access.
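The bandwidth figures above follow from simple arithmetic. The sketch below is a rough model only: it assumes every word transfer costs one cycle plus a fixed number of wait states, and the function names are illustrative rather than part of any Arm library.

```c
#include <assert.h>
#include <stdint.h>

#define AHB_WORD_BYTES 4u   /* 32-bit data bus: 4 bytes per transfer */

/* Bus cycles needed to move `bytes` when each word transfer costs
 * (1 + wait_states) cycles. A partial word still takes a full transfer. */
uint64_t ahb_transfer_cycles(uint32_t bytes, uint32_t wait_states)
{
    uint64_t words = ((uint64_t)bytes + AHB_WORD_BYTES - 1u) / AHB_WORD_BYTES;
    return words * ((uint64_t)wait_states + 1u);
}

/* Sustained bandwidth in bytes per second at bus clock `bus_hz`.
 * With wait_states == 0 this is the peak rate of one word per cycle. */
uint64_t ahb_bandwidth_bytes_per_sec(uint32_t bus_hz, uint32_t wait_states)
{
    assert(bus_hz > 0u);
    return ((uint64_t)bus_hz * AHB_WORD_BYTES) / ((uint64_t)wait_states + 1u);
}
```

For example, at a 50 MHz bus clock the model gives 200 MB/s with zero wait states and 50 MB/s with three wait states per transfer.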

TCM Configuration

The TCM provides single-cycle access, which maximizes Cortex-M1 performance for critical code and data. For best performance, frequently used code should be placed in ITCM and frequently used data in DTCM. Functions and data can be assigned to TCM with compiler section attributes or pragmas, together with a linker script (or scatter file) that maps those sections onto the TCM address ranges.
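As a minimal sketch of such placement, assuming a GCC-compatible toolchain: the section names `.itcm_text` and `.dtcm_data` and the 1KB budget are hypothetical and must correspond to whatever the project's linker script and TCM configuration actually define.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical section names; they must match output sections that the
 * project's linker script maps onto the ITCM and DTCM address ranges. */
#if defined(__GNUC__) && defined(__ELF__)
#define ITCM_CODE __attribute__((section(".itcm_text")))
#define DTCM_DATA __attribute__((section(".dtcm_data")))
#else
#define ITCM_CODE
#define DTCM_DATA
#endif

#define FILTER_TAPS       4u
#define DTCM_BUDGET_BYTES 1024u   /* assumed DTCM space reserved for this table */

/* Coefficient table kept in DTCM so that each load is single-cycle. */
DTCM_DATA int32_t filter_coeffs[FILTER_TAPS] = { 1, 2, 2, 1 };

/* Compile-time check that the table fits the assumed DTCM budget. */
static_assert(sizeof(filter_coeffs) <= DTCM_BUDGET_BYTES,
              "coefficient table exceeds the DTCM budget");

/* Inner loop placed in ITCM: dot product of the taps with FILTER_TAPS samples. */
ITCM_CODE int32_t fir_sample(const int32_t *samples)
{
    int32_t acc = 0;
    for (uint32_t i = 0; i < FILTER_TAPS; i++) {
        acc += filter_coeffs[i] * samples[i];
    }
    return acc;
}
```

The attributes only name the sections. The actual addresses come from the linker script, so the map file is the place to confirm that the symbols ended up in TCM.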

ITCM is best used for time-critical interrupt handlers, DSP algorithms, and the inner loops of key functions. DTCM should hold performance-critical variables and data structures. Unused TCM wastes on-chip memory and area, so memory requirements should be analyzed to right-size TCM capacity.

TCM access takes one clock cycle, so the TCM should be clocked at the CPU frequency. Running it faster than the CPU wastes power, while running it slower stalls the CPU. TCM also draws static current, so keeping its capacity as small as the application allows saves leakage power.

External RAM Configuration

External RAM provides higher-capacity memory for code and data but has longer access latency. SRAM offers access times down to about 10 ns but costs more per bit. SDRAM has an access latency of around 50-70 ns but is cheaper per bit.
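These access times translate into wait states on the bus. The following sketch uses a deliberately simplified model that ignores address setup and controller overhead: the access needs enough whole cycles to cover the access time, and every cycle beyond the first counts as a wait state. The function name is illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* Wait states needed for a memory with access time `access_ns` at bus
 * clock `bus_hz`. The access occupies ceil(access_ns * bus_hz / 1e9)
 * cycles; the first cycle is the normal data phase, the rest are waits. */
uint32_t wait_states_for(uint32_t access_ns, uint32_t bus_hz)
{
    assert(bus_hz > 0u);
    uint64_t cycles =
        ((uint64_t)access_ns * bus_hz + 999999999u) / 1000000000u;
    return cycles > 1u ? (uint32_t)(cycles - 1u) : 0u;
}
```

Under this model a 10 ns SRAM needs no wait states at 100 MHz, while a 60 ns SDRAM access at the same clock needs five.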

The AHB interface runs at the CPU frequency, so SDRAM may need a higher clock rate to match AHB bandwidth. A common approach is to run SDRAM at 1-2x the CPU clock to avoid starving the CPU. The external memory controller must also meet the SDRAM timing requirements.

SDRAM incurs an initial latency of several clock cycles, or more at high clock rates, whenever a new row has to be opened. Some memory controllers prefetch code and data to hide this latency. Multi-bank SDRAM also improves average access time because accesses to different banks can be interleaved.
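The effect of row locality on average latency can be estimated with a short calculation. This is a sketch under stated assumptions: a row hit costs only the CAS latency, a row miss costs precharge plus activate plus CAS latency, and the timing values in `example_timing` are illustrative placeholders, not figures for any particular device.

```c
#include <assert.h>
#include <stdint.h>

/* SDRAM timing parameters, all expressed in memory clock cycles. */
struct sdram_timing {
    uint32_t cas_latency;  /* CL: column read latency */
    uint32_t t_rcd;        /* row activate to column command delay */
    uint32_t t_rp;         /* precharge time before a new row can open */
};

/* Illustrative values only; real figures come from the device datasheet. */
const struct sdram_timing example_timing = { 2u, 2u, 2u };

/* Read latency when the requested row is already open in its bank. */
uint32_t sdram_row_hit_cycles(const struct sdram_timing *t)
{
    return t->cas_latency;
}

/* Read latency when another row is open: precharge, activate, then read. */
uint32_t sdram_row_miss_cycles(const struct sdram_timing *t)
{
    return t->t_rp + t->t_rcd + t->cas_latency;
}

/* Average read latency in hundredths of a cycle for a given row hit rate. */
uint32_t sdram_avg_latency_centicycles(const struct sdram_timing *t,
                                       uint32_t row_hit_percent)
{
    assert(row_hit_percent <= 100u);
    return row_hit_percent * sdram_row_hit_cycles(t)
         + (100u - row_hit_percent) * sdram_row_miss_cycles(t);
}
```

With the placeholder timings, a 75% row hit rate averages 3.00 cycles per read, compared with 6 cycles when every access misses the open row.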

The Cortex-M1 issues only one outstanding external memory access at a time, so a long multi-cycle SDRAM access stalls the CPU pipeline until it completes. Compiler optimizations that reduce the number of external accesses, for example by keeping frequently used values in registers, can help limit these stalls.

Cache Configuration

The Cortex-M1 does not contain instruction or data caches. A cache would reduce average access latency to external memory, but at the cost of area and power consumption. Deterministic single-cycle TCM access also reduces the need for caching.

External memory subsystems can still implement caches that are transparent to the processor. Some SDRAM controllers include small SRAM caches or buffers to hide row access latency. Memory-mapped peripherals may also contain local buffers for their registers and data.

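These system-level caches do not participate in any coherency protocol with the processor, and the Cortex-M1 itself has no cache maintenance instructions. Software drivers may therefore need to invalidate or flush such caches through the controller's own control registers, or access memory-mapped peripherals through a path that bypasses the cache.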
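Whatever the system-level cache arrangement, driver code usually also has to stop the compiler from holding register values in CPU registers or dropping repeated reads, which is what `volatile` is for. The sketch below uses a hypothetical UART-style register layout. `host_stub_uart` is a plain struct standing in for the device so the code can run on a host; on hardware the pointer would come from the peripheral's base address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register block of a memory-mapped peripheral. */
struct uart_regs {
    volatile uint32_t data;    /* write: byte to transmit */
    volatile uint32_t status;  /* bit 0: transmitter ready */
};

#define UART_STATUS_TX_READY (1u << 0)

/* Host-side stand-in for the device, initially reporting "ready". */
struct uart_regs host_stub_uart = { 0u, UART_STATUS_TX_READY };

/* Poll the status register up to max_polls times and send one byte.
 * volatile forces a real load of status on every loop iteration.
 * Returns 0 on success, -1 if the transmitter never became ready. */
int uart_try_send(struct uart_regs *uart, uint8_t byte, uint32_t max_polls)
{
    assert(uart != NULL);
    for (uint32_t i = 0; i < max_polls; i++) {
        if ((uart->status & UART_STATUS_TX_READY) != 0u) {
            uart->data = byte;
            return 0;
        }
    }
    return -1;
}
```

Note that `volatile` only constrains the compiler. It does nothing about a cache or buffer sitting in the memory system between the processor and the peripheral.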

Optimizing Memory Performance

There are several techniques to optimize memory performance when configuring the Cortex-M1 system:

  • Place critical code and data in ITCM and DTCM
  • Size TCM to minimize leakage while meeting performance needs
  • Run TCM at CPU frequency to avoid stalls
  • Use SDRAM configuration that matches AHB bandwidth
  • Enable SDRAM controller prefetch if available
  • Use multi-bank SDRAM to increase concurrency
  • Reduce external accesses in hot code paths to limit pipeline stalls during SDRAM access
  • Bypass or disable system-level caches when accessing memory-mapped peripherals

Profiling memory access patterns and timing is essential to ensuring that the Cortex-M1 meets its performance requirements. Memory system configuration and optimization provide significant opportunities to improve performance and efficiency.
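One simple way to time a code region is to read a free-running cycle counter before and after it. The sketch below assumes the design includes a SysTick-style 24-bit down-counter reloading from 0xFFFFFF, and that the measured interval is shorter than one full counter period. The helper name is illustrative, and reading the counter itself is left to the platform code.

```c
#include <assert.h>
#include <stdint.h>

#define SYSTICK_MASK 0x00FFFFFFu   /* counter is 24 bits wide */

/* Cycles elapsed between two readings of a 24-bit down-counter.
 * The counter counts down, so elapsed = start - end, and masking to
 * 24 bits handles a single wrap from 0 back to 0xFFFFFF. */
uint32_t systick_elapsed(uint32_t start, uint32_t end)
{
    assert(start <= SYSTICK_MASK && end <= SYSTICK_MASK);
    return (start - end) & SYSTICK_MASK;
}
```

Timing the same routine once from external memory and once from TCM with a helper like this gives a direct measure of what a placement change is worth.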

Conclusion

The Cortex-M1 memory architecture, built around TCM and an AHB interface, provides flexible options for embedded systems. Optimizing the use and configuration of TCM, SDRAM, and any system-level caches can maximize performance and efficiency. Careful memory system design is key to building high-performance Cortex-M1 applications.
