What is Data TCM (DTCM) Memory in Arm Cortex-M series?

Data TCM (DTCM) is a small, fast memory located inside the Cortex-M processor that can be used for time-critical data accesses. It allows frequently accessed data to be stored close to the processor, reducing access latency and improving performance.

Contents

Overview of Tightly Coupled Memories in Cortex-M
DTCM Use Cases
Storing Time Critical Data
Sharing Data Between Interrupts and Tasks
Stack for Real-Time Interrupts
Holding Time Critical Code
Configuring DTCM in Cortex-M Devices
DTCM vs External RAM
DTCM vs ITCM
DTCM Performance Factors
DTCM Limitations
Conclusion

Overview of Tightly Coupled Memories in Cortex-M

Some Cortex-M processors, such as the Cortex-M7, Cortex-M55, and Cortex-M85, can include small memories called Tightly Coupled Memories (TCMs) that connect directly to the core through dedicated interfaces rather than through the system bus. There are two types of TCM:

Instruction TCM (ITCM) – Intended mainly for program instructions
Data TCM (DTCM) – Intended mainly for data

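The key features of TCMs are:

Small size – The size is chosen by the chip designer. The Cortex-M7 allows each TCM to be anywhere from absent up to 16MB, but shipping microcontrollers typically provide tens to a few hundred kilobytes
Very low access latency – Typically accessed at core clock speed with no wait states, compared to many cycles for external memory
Dedicated interface – TCMs connect to the core over their own interfaces, so accesses do not compete with traffic on the system bus
Optional – TCMs are an implementation option and supplement the other on-chip and external memories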

Because TCMs are so fast, time critical code and data can be placed in them to improve real-time performance. However, their small size means they can’t hold an entire program or dataset, so they are used judiciously.

DTCM Use Cases

Here are some common uses of DTCM in Cortex-M designs:

Storing Time Critical Data

DTCM is ideal for data that needs very fast access times. Examples include:

Real-time control loop data
Sensor data that is sampled very frequently
Buffers for time critical peripheral data like ADC samples or radio packets
Critical system data structures

Storing this kind of data in DTCM instead of external RAM can improve performance and responsiveness.
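With GCC-style toolchains, placement is usually done by tagging a variable with a section attribute and mapping that section onto the DTCM address range in the linker script. The following is a minimal sketch under stated assumptions: the section name `.dtcm_data`, the `DTCM_DATA` macro, and the buffer and function names are all hypothetical, and the linker script and startup code for a given chip must actually define and initialize that section.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical section name: the linker script must map ".dtcm_data"
 * onto the DTCM address range. On non-Arm hosts the macro expands to
 * nothing so the example still builds for testing. */
#if defined(__arm__)
#define DTCM_DATA __attribute__((section(".dtcm_data")))
#else
#define DTCM_DATA
#endif

#define ADC_BUF_LEN 64u

/* Frequently accessed sample buffer, intended to live in DTCM.
 * Note: default startup code often clears only .bss, so a custom
 * section may need its own zeroing or copy loop at boot. */
static int16_t adc_samples[ADC_BUF_LEN] DTCM_DATA;

/* Store one sample (for example from an ADC interrupt); the index wraps. */
void adc_store(uint32_t index, int16_t value)
{
    adc_samples[index % ADC_BUF_LEN] = value;
}

/* Average of the whole buffer, as a control loop might compute each cycle. */
int16_t adc_average(void)
{
    int32_t sum = 0;
    for (size_t i = 0; i < ADC_BUF_LEN; i++) {
        sum += adc_samples[i];
    }
    return (int16_t)(sum / (int32_t)ADC_BUF_LEN);
}
```

The code that touches the buffer does not change; only the placement does, which is why it is worth checking the linker map file to confirm the symbol really landed in the DTCM range.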

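Sharing Data Between Interrupts and Tasks

DTCM can hold a buffer that is shared between different execution contexts. For example:

A real-time interrupt stores sensor samples in a DTCM buffer
A lower priority task periodically processes the samples from the same buffer

This keeps the shared buffer off the system bus. The two contexts still need a safe handoff, such as a single-producer, single-consumer queue or a short critical section, because DTCM makes accesses faster but does not make them atomic.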
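One common way to hand samples from an interrupt to a task is a single-producer, single-consumer ring buffer. The sketch below uses C11 atomics and assumes exactly one writer (the ISR) and one reader (the task); the type and function names are illustrative, and on a real target the `sample_ring_t` instance would be the object placed in DTCM.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_LEN 8u /* must be a power of two so index wraparound stays consistent */

typedef struct {
    int16_t data[RING_LEN];
    atomic_uint head; /* total pushes; written only by the producer (ISR) */
    atomic_uint tail; /* total pops; written only by the consumer (task) */
} sample_ring_t;

/* Producer side, called from the interrupt. Returns false if the ring is full. */
bool ring_push(sample_ring_t *r, int16_t value)
{
    unsigned head = atomic_load_explicit(&r->head, memory_order_relaxed);
    unsigned tail = atomic_load_explicit(&r->tail, memory_order_acquire);
    if (head - tail == RING_LEN) {
        return false;
    }
    r->data[head % RING_LEN] = value;
    /* Release: the sample is visible before the new head is. */
    atomic_store_explicit(&r->head, head + 1u, memory_order_release);
    return true;
}

/* Consumer side, called from the task. Returns false if the ring is empty. */
bool ring_pop(sample_ring_t *r, int16_t *out)
{
    unsigned tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    unsigned head = atomic_load_explicit(&r->head, memory_order_acquire);
    if (head == tail) {
        return false;
    }
    *out = r->data[tail % RING_LEN];
    /* Release: the slot is read before it is handed back to the producer. */
    atomic_store_explicit(&r->tail, tail + 1u, memory_order_release);
    return true;
}
```

Because each index has a single writer, neither side needs to disable interrupts. A design with several producers or consumers would need a different scheme.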

Stack for Real-Time Interrupts

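The stack used by real-time interrupt service routines can be placed in DTCM. On exception entry and return the processor itself pushes and pops a register frame on the stack, so a low-latency stack directly shortens ISR entry and exit.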
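Since DTCM is small, it helps to budget how much of it an interrupt stack needs. On exception entry the core automatically stacks 8 words (R0-R3, R12, LR, PC, xPSR), or 26 words when floating-point context is also saved, and it may add one padding word to keep the stack 8-byte aligned. The rough estimate below is a sketch only: the function name and the per-handler figure are assumptions, and real usage should be confirmed with the toolchain's stack analysis.

```c
#include <stdint.h>

#define BASIC_FRAME_BYTES 32u  /* R0-R3, R12, LR, PC, xPSR */
#define FP_FRAME_BYTES    104u /* basic frame + S0-S15, FPSCR, reserved word */
#define ALIGN_PAD_BYTES   4u   /* possible padding word added on entry */

/* Worst-case stack bytes for `nesting_levels` preempting handlers,
 * each using up to `handler_bytes` of its own locals and saved registers. */
uint32_t isr_stack_budget(uint32_t nesting_levels,
                          uint32_t handler_bytes,
                          int uses_fpu)
{
    uint32_t frame = uses_fpu ? FP_FRAME_BYTES : BASIC_FRAME_BYTES;
    return nesting_levels * (frame + ALIGN_PAD_BYTES + handler_bytes);
}
```

For example, three nested integer-only handlers that each use 64 bytes come to 300 bytes, which is easy to reserve even in a small DTCM.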

Holding Time Critical Code

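Time critical code itself normally belongs in ITCM, which is the interface designed for instruction fetch. DTCM complements it by holding the data that such code depends on. For example:

Lookup tables and coefficients for performance critical math functions
Tuning parameters for compute intensive algorithms

Some implementations can also fetch instructions from DTCM, but this is implementation dependent and may be slower than running from ITCM, so check the Technical Reference Manual for the core before placing code there.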

Configuring DTCM in Cortex-M Devices

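Here are some key points about configuring DTCM in Cortex-M devices:

DTCM size is fixed in hardware – The chip designer chooses it. The Cortex-M7 allows up to 16MB, while the Cortex-M3 and Cortex-M4 do not define TCM interfaces, although some vendors add a similar core-coupled RAM to those parts
DTCM is optional – Some devices omit it, and where it is present it can usually be enabled or disabled in software
The base address is set by the implementation – On the Cortex-M7, DTCM starts at 0x20000000 and ITCM at 0x00000000
Partitioning is vendor specific – Some devices let software divide a shared RAM block between ITCM, DTCM, and general purpose RAM
DTCM can be used as ordinary RAM for variables, stacks, and buffers
Access permissions for the core are set through the Memory Protection Unit (MPU). Whether DMA and other bus masters can reach DTCM depends on how the chip is wired
On the Cortex-M7, the TCMs are enabled through the ITCM and DTCM control registers, not through the MPU

The specific configuration options depend on the Cortex-M implementation in the chip. Configuration is done via vendor provided tools or initialization code.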
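On the Cortex-M7, the DTCM control register (DTCMCR, exposed in CMSIS-Core as `SCB->DTCMCR`) reports whether the DTCM is enabled and how large it is. The decoder below is a sketch that assumes the field layout from the Cortex-M7 documentation (EN in bit 0, SZ in bits 6:3, with the size equal to 2^(SZ-1) KB); verify it against the Technical Reference Manual for the target core. It takes the raw register value as a parameter so it can be exercised off-target, and the type and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed Cortex-M7 ITCMCR/DTCMCR field layout; check the TRM for your core. */
#define TCMCR_EN_MASK  (1u << 0)
#define TCMCR_SZ_SHIFT 3u
#define TCMCR_SZ_MASK  (0xFu << TCMCR_SZ_SHIFT)

typedef struct {
    bool enabled;     /* EN bit */
    uint32_t size_kb; /* 0 if no TCM is implemented */
} tcm_info_t;

/* Decode a raw TCM control register value,
 * e.g. tcm_decode(SCB->DTCMCR) on a CMSIS-based Cortex-M7 project. */
tcm_info_t tcm_decode(uint32_t tcmcr)
{
    tcm_info_t info;
    uint32_t sz = (tcmcr & TCMCR_SZ_MASK) >> TCMCR_SZ_SHIFT;

    info.enabled = (tcmcr & TCMCR_EN_MASK) != 0u;
    /* SZ = 3 means 4KB and each step doubles the size. SZ = 0 means no TCM;
     * values 1 and 2 are treated here as "none" (assumed reserved). */
    info.size_kb = (sz >= 3u) ? (1u << (sz - 1u)) : 0u;
    return info;
}
```

A startup routine could use a check like this to fail early if firmware built for one DTCM size is flashed onto a part with a smaller one.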

DTCM vs External RAM

Compared to external RAM, DTCM provides:

Faster access – typically no wait states, versus several to tens of cycles for external RAM
More predictable access time – no cache misses and no contention with peripherals
No bus arbitration – a dedicated interface instead of a system bus shared with DMA and other masters

However, DTCM capacity is very small compared to external RAM. So it is used only for the most timing critical data. The majority of data still resides in slower external RAM.

DTCM vs ITCM

DTCM and ITCM are similar in nature – fast, tightly coupled processor memories. But there are some key differences:

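DTCM is intended for data, ITCM is intended for instructions
Both are read/write RAM. ITCM is not read-only, since code has to be copied into it at startup, and it can also hold data
On the Cortex-M7, ITCM sits in the code region starting at 0x00000000 and DTCM in the SRAM region starting at 0x20000000
DTCM can be used as general purpose RAM for variables, stacks, and buffers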

In summary, DTCM is specialized for low latency data accesses, while ITCM is optimized for instruction fetch.

DTCM Performance Factors

There are several factors that determine how much performance benefit DTCM provides compared to external memory:

Access frequency – DTCM helps most for data that is accessed many times
Latency delta – Bigger improvement if the alternative memory has high latency
Bus contention – DTCM is isolated from a congested system bus
Cache usage – The improvement is smaller if the alternative memory is cached and the data usually hits in the cache
Working set size – Only data that fits in DTCM benefits, so large or scattered datasets see little gain

So DTCM gives the biggest boost for frequently accessed, localized data, with high external memory latency and bus contention. Caching and non-localized data reduce the impact.
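These factors can be combined into a back-of-the-envelope estimate. The sketch below is a simple model, not a measurement: it assumes every access costs a fixed number of cycles, models a cache as a hit percentage, and uses a hypothetical function name. Real gains should be confirmed with a cycle counter on the target.

```c
#include <stdint.h>

/* Estimated cycles saved by moving data from a (possibly cached) memory to DTCM.
 * hit_percent is the cache hit rate for that data (0 if the memory is uncached). */
uint64_t dtcm_cycles_saved(uint64_t accesses,
                           uint32_t tcm_cycles,
                           uint32_t hit_cycles,
                           uint32_t miss_cycles,
                           uint32_t hit_percent)
{
    if (hit_percent > 100u) {
        hit_percent = 100u;
    }
    /* Average latencies scaled by 100 to stay in integer arithmetic. */
    uint64_t ext_x100 = (uint64_t)hit_percent * hit_cycles
                      + (uint64_t)(100u - hit_percent) * miss_cycles;
    uint64_t tcm_x100 = (uint64_t)tcm_cycles * 100u;

    if (ext_x100 <= tcm_x100) {
        return 0u; /* no benefit under this model */
    }
    return accesses * (ext_x100 - tcm_x100) / 100u;
}
```

With illustrative figures of one million accesses, a 1-cycle DTCM, and a 10-cycle uncached memory, the model gives 9,000,000 cycles saved. If the same memory is cached with a 90% hit rate and 1-cycle hits, the saving drops to 900,000 cycles, which is the caching effect described above.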

DTCM Limitations

Some limitations to keep in mind when using DTCM:

Very small capacity – even 64KB is tiny by modern standards
Manual data management – placement is decided at build time through linker sections, not by an automatic allocator
Data transfers – copying data into and out of DTCM has overhead
Memory fragmentation – a general purpose heap wastes space quickly in such a small memory
Limited or no DMA access – on some devices peripherals and DMA controllers cannot access DTCM

So DTCM improves performance but requires careful design to work within its limitations.
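One way to manage DTCM by hand without fragmentation is a bump allocator: memory is carved from a fixed arena once at startup and never freed. This is a minimal sketch, assuming a hypothetical `.dtcm_data` linker section and an arbitrary 4KB arena; all names are illustrative.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical section name; the linker script must map it onto DTCM.
 * On non-Arm hosts the macro expands to nothing so the sketch still builds. */
#if defined(__arm__)
#define DTCM_DATA __attribute__((section(".dtcm_data")))
#else
#define DTCM_DATA
#endif

#define DTCM_ARENA_BYTES 4096u

static _Alignas(8) uint8_t dtcm_arena[DTCM_ARENA_BYTES] DTCM_DATA;
static size_t dtcm_used; /* bytes handed out so far */

/* Allocate `bytes` from the arena, rounded up to 8-byte alignment.
 * Returns NULL for a zero-size request or when the arena is exhausted.
 * There is deliberately no free(): allocations live until reset. */
void *dtcm_alloc(size_t bytes)
{
    size_t rounded = (bytes + 7u) & ~(size_t)7u;

    if (bytes == 0u || rounded < bytes ||
        rounded > DTCM_ARENA_BYTES - dtcm_used) {
        return NULL;
    }
    void *p = &dtcm_arena[dtcm_used];
    dtcm_used += rounded;
    return p;
}

/* Remaining capacity, useful for a startup-time sanity check. */
size_t dtcm_free_bytes(void)
{
    return DTCM_ARENA_BYTES - dtcm_used;
}
```

This allocator is not interrupt safe, so it suits one-time setup code. Buffers that a DMA controller must reach should only come from this arena if the device actually allows DMA access to DTCM.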

Conclusion

In summary, Data TCM is a fast, tightly coupled memory in Cortex-M processors that allows time critical data to be accessed with low, consistent latency. It works best for frequently used data that fits in its relatively small capacity. With careful design, DTCM can greatly improve determinism and real-time performance in Cortex-M systems.
