Software debuggers and CoreSight component configuration (Arm Cortex-M)

Debugging software on Arm Cortex-M devices often involves configuring CoreSight components such as the Embedded Trace Macrocell (ETM), Trace Port Interface Unit (TPIU) and Embedded Trace Buffer (ETB). Choosing the right debugger and properly setting up the debug hardware is key to an efficient software debugging workflow.

Overview of CoreSight Components in Cortex-M

Depending on the core and the vendor's implementation, Cortex-M processors can include various CoreSight debug and trace components, such as:

Debug Access Port (DAP)

Embedded Trace Macrocell (ETM)
Trace Port Interface Unit (TPIU)
Embedded Trace Buffer (ETB)

Instrumentation Trace Macrocell (ITM)
Micro Trace Buffer (MTB)
Cross Trigger Interface (CTI)

Data Watchpoint and Trace Unit (DWT)

These components allow debugging through a debugger connected to the chip’s debug port. The DAP provides the debug interface. The ETM generates instruction trace, which the TPIU exports off-chip or the ETB buffers on-chip. The ITM allows instrumented software to generate custom debug data, while the MTB offers a lightweight form of instruction trace on smaller cores such as the Cortex-M0+. The CTI handles cross-triggering between cores and debug components. The DWT provides comparators for data watchpoints, along with profiling counters.

Debug Access Port

The Debug Access Port or DAP provides the interface for an external debugger to access the internal debug components. It has two interfaces:

JTAG interface for 4/5 pin standard JTAG connection
Serial Wire Debug (SWD) interface for 2-pin clock + data connection

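The DAP implements the Arm Debug Interface v5 (ADIv5) architecture to communicate with the debugger. It consists of a Debug Port (DP), which faces the external pins, and one or more Access Ports (APs). On Cortex-M devices an AHB-AP typically gives the debugger access to the system memory map, including the memory-mapped debug and trace registers. The DP also handles debug power-up requests, and some devices add vendor-specific debug authentication.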

Embedded Trace Macrocell

The Embedded Trace Macrocell or ETM provides instruction trace capabilities for Cortex-M devices. It traces program flow by capturing branch addresses, exceptions and related execution information. The ETM's trace data is exported off-chip via the TPIU or captured on-chip in an ETB. Main features of the ETM include:

Instruction tracing (data tracing only on some implementations)
Branch tracing

Exception and interrupt tracing
Timestamp support
Trace start/stop logic

FIFO for trace data

The ETM trace gives detailed visibility into the software execution sequence on Cortex-M devices. However, it requires significant bandwidth to export the trace data off-chip. The ETB can be used to mitigate this by buffering trace data on-chip.

Trace Port Interface Unit

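The Trace Port Interface Unit or TPIU handles exporting trace data to off-chip tools like a debug probe or logic analyzer. It formats trace from sources such as the ETM and ITM and drives it out over either a single-pin Serial Wire Output (SWO) or a parallel trace port, which on most Cortex-M devices consists of a trace clock plus 1, 2 or 4 data pins. SWO bandwidth is usually sufficient only for ITM and DWT trace, whereas ETM instruction trace generally needs the parallel port. The TPIU can be configured in various modes, balancing throughput and pin requirements.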
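
As a rough illustration of the throughput trade-off, the SWO bit rate in asynchronous (UART-style) mode is derived from the TPIU's trace clock by a prescaler. The sketch below assumes the Armv7-M relationship baud = trace clock / (prescaler + 1), where the prescaler is the value written to the TPIU Asynchronous Clock Prescaler Register. The function names and the 13-bit field limit are illustrative assumptions; check the device's reference manual.

```c
#include <stdint.h>

/* Assumed width of the prescaler field (13 bits on Cortex-M3/M4 TPIUs). */
#define SWO_PRESCALER_MAX 0x1FFFu

/* Return the prescaler value that gives the achievable SWO baud rate
 * nearest to the requested one. Returns 0 (no division) if the request
 * is zero or faster than the trace clock. */
uint32_t swo_prescaler(uint32_t trace_clk_hz, uint32_t baud_hz)
{
    if (baud_hz == 0u || trace_clk_hz < baud_hz) {
        return 0u;
    }
    uint32_t divider = (trace_clk_hz + baud_hz / 2u) / baud_hz; /* rounded */
    uint32_t prescaler = divider - 1u;
    return (prescaler > SWO_PRESCALER_MAX) ? SWO_PRESCALER_MAX : prescaler;
}

/* Baud rate actually produced by a given prescaler value. */
uint32_t swo_actual_baud(uint32_t trace_clk_hz, uint32_t prescaler)
{
    return trace_clk_hz / (prescaler + 1u);
}
```

The debugger must be told the same trace clock and baud rate, otherwise it cannot decode the SWO stream.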

Embedded Trace Buffer

The Embedded Trace Buffer or ETB captures the instruction trace from the ETM in an on-chip RAM buffer. This avoids the high pin bandwidth requirements of exporting ETM trace continuously off-chip. The buffer size is implementation-defined and is typically a few kilobytes. Trace data is written circularly until the capture is stopped, so the buffer always holds the most recent trace. The debugger then reads the buffer out through the DAP. This trace capture approach provides significant benefits:

Reduces pin usage – no trace pins needed, only the debug port
No continuous real-time trace streaming needed
Avoids losing trace data to limited pin bandwidth
Captures the execution history leading up to a stop or trigger event
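
The circular write behaviour can be modelled in a few lines. This is a host-side sketch only, not the ETB's register interface: the buffer size, type and function names are assumptions made for illustration. It shows why, once the buffer wraps, only the most recent words are available at readout.

```c
#include <stddef.h>
#include <stdint.h>

#define TRACE_BUF_WORDS 8u /* deliberately tiny; real buffers are larger */

typedef struct {
    uint32_t ram[TRACE_BUF_WORDS];
    size_t   write_idx; /* next slot to be written */
    int      wrapped;   /* set once old data has been overwritten */
} trace_buf_t;

void trace_buf_init(trace_buf_t *b)
{
    b->write_idx = 0u;
    b->wrapped = 0;
}

/* Store one trace word, overwriting the oldest word when full. */
void trace_buf_write(trace_buf_t *b, uint32_t word)
{
    b->ram[b->write_idx] = word;
    b->write_idx = (b->write_idx + 1u) % TRACE_BUF_WORDS;
    if (b->write_idx == 0u) {
        b->wrapped = 1;
    }
}

/* Copy the captured words to out[], oldest first. Returns the count. */
size_t trace_buf_read(const trace_buf_t *b, uint32_t out[TRACE_BUF_WORDS])
{
    size_t count = b->wrapped ? TRACE_BUF_WORDS : b->write_idx;
    size_t start = b->wrapped ? b->write_idx : 0u;
    for (size_t i = 0u; i < count; i++) {
        out[i] = b->ram[(start + i) % TRACE_BUF_WORDS];
    }
    return count;
}
```

Writing 11 words into this 8-word buffer leaves words 3 to 10 available; the first three have been overwritten.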

Instrumentation Trace Macrocell

The Instrumentation Trace Macrocell or ITM provides software-driven trace and printf-style debugging support in Cortex-M devices. It offers up to 32 stimulus ports to which software can write data values, printf strings and other markers. The ITM packetizes this instrumentation data, optionally adds timestamps, and exports it to off-chip tools via the TPIU. The stimulus ports are memory-mapped registers at fixed addresses, so instrumentation data can be written from anywhere in software. Key features of the ITM include:

32 stimulus ports for instrumentation data
Output bandwidth limited by the SWO or trace port clock
FIFO buffers for holding data

Programmable port attributes

The ITM allows efficient diagnostics and profiling without detailed instruction trace. However, care must be taken to ensure instrumentation points have low overhead.
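
A minimal sketch of a low-overhead instrumentation point follows. It assumes the Armv7-M behaviour in which reading a stimulus port returns bit 0 set when the port can accept data, and that the ITM and the chosen port have already been enabled by the debugger or startup code. The port is passed in as a pointer so the routine can be exercised on a host; on Armv7-M hardware stimulus port 0 is at address 0xE0000000. The function names are illustrative.

```c
#include <stdint.h>

#define ITM_STIM_FIFOREADY 1u /* bit 0 of a port read: 1 = can accept data */

/* Write one value to a stimulus port. Returns 1 on success, or 0 if the
 * port did not become ready within max_spins polls (for example because
 * no debugger has enabled trace), so the caller never blocks forever. */
int itm_write(volatile uint32_t *stim_port, uint32_t value, uint32_t max_spins)
{
    while ((*stim_port & ITM_STIM_FIFOREADY) == 0u) {
        if (max_spins-- == 0u) {
            return 0;
        }
    }
    /* A word write is used here for simplicity; byte-wide writes produce
     * smaller packets and are what printf-style output normally uses. */
    *stim_port = value;
    return 1;
}

/* Send a NUL-terminated string; returns the number of characters sent. */
uint32_t itm_write_string(volatile uint32_t *stim_port, const char *s,
                          uint32_t max_spins)
{
    uint32_t sent = 0u;
    while (*s != '\0' && itm_write(stim_port, (uint8_t)*s, max_spins)) {
        s++;
        sent++;
    }
    return sent;
}
```

Bounding the wait keeps the overhead predictable when the trace output is not being drained.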

Micro Trace Buffer

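The Micro Trace Buffer or MTB provides lightweight instruction trace on small cores such as the Cortex-M0+, which have no ETM. Rather than using a dedicated trace RAM, it records program flow into a configurable region of the device's on-chip SRAM. Each time the program counter changes non-sequentially (a branch, exception entry or exception return), the MTB stores the source and destination addresses. Tracing can be started and stopped by software or by hardware event signals, for example from the DWT. Key capabilities include:

Records branch source and destination addresses
Uses a configurable portion of on-chip SRAM as a circular buffer
Selective trace capture using hardware start/stop triggers
Trace data readout via DAP

The MTB enables collecting a short execution history without extra pins or off-chip streaming. The downside is that it captures less detail than an ETM and consumes SRAM that the application could otherwise use.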

Cross Trigger Interface

The Cross Trigger Interface or CTI allows chaining trace and debugging operations across multiple cores and debug components. It provides trigger in and out signals to connect debugger break/watchpoint events to ETM trace start/stop logic. This enables advanced use cases like:

Stopping trace upon debugger breakpoint hit
Starting trace upon DWT watchpoint match
Synchronizing trace across multiple ETMs

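A CTI typically provides up to 8 trigger inputs and 8 trigger outputs; the exact number is implementation-defined. These are mapped onto a small number of channels (commonly 4), which a Cross Trigger Matrix (CTM) broadcasts between CTIs so that an event in one component can trigger an action in another.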
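
The routing idea can be sketched as a small software model: each trigger input is enabled onto one or more channels, and each trigger output listens to one or more channels. This is a conceptual model under assumed trigger and channel counts, not the CTI register interface, and which hardware event is wired to which trigger number is device-specific.

```c
#include <stdint.h>

#define CTI_NUM_TRIGGERS 4u /* assumed; set to match the device's CTI */

typedef struct {
    /* For each trigger input: bitmask of channels it drives. */
    uint8_t in_en[CTI_NUM_TRIGGERS];
    /* For each trigger output: bitmask of channels that drive it. */
    uint8_t out_en[CTI_NUM_TRIGGERS];
} cti_model_t;

/* Channels activated by the set of asserted trigger inputs (bitmask). */
uint8_t cti_active_channels(const cti_model_t *cti, uint8_t trig_in)
{
    uint8_t channels = 0u;
    for (unsigned i = 0u; i < CTI_NUM_TRIGGERS; i++) {
        if (trig_in & (1u << i)) {
            channels |= cti->in_en[i];
        }
    }
    return channels;
}

/* Trigger outputs (bitmask) asserted by the given active channels. */
uint8_t cti_trigger_outputs(const cti_model_t *cti, uint8_t channels)
{
    uint8_t trig_out = 0u;
    for (unsigned i = 0u; i < CTI_NUM_TRIGGERS; i++) {
        if (cti->out_en[i] & channels) {
            trig_out |= (uint8_t)(1u << i);
        }
    }
    return trig_out;
}
```

For example, mapping a hypothetical "watchpoint match" input 0 onto channel 0 and a hypothetical "trace start" output 2 from channel 0 means that asserting input 0 raises output 2 and nothing else.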

Data Watchpoint and Trace Unit

The Data Watchpoint and Trace or DWT unit provides data watchpoint support through comparator registers in the Cortex-M processors. Depending on the implementation, it contains up to 4 comparators that can match on data accesses or instruction addresses. Each comparator can be configured independently for different match conditions. A match can halt the processor or raise a debug monitor exception. Comparators can also output events to the CTI and ETM. Key features of the DWT include:

Up to 4 watchpoint comparators

Programmable match address range and attributes
Halt the processor or raise a debug event on match
Connect to ETM and CTI for trace

The DWT performs address and data matching in hardware at full speed, so the debugger does not need to single-step or poll memory to catch an access, enabling more efficient debugging.
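
As a sketch of how a comparator is programmed, the code below fills in the three registers of one Armv7-M DWT comparator (DWT_COMPn, DWT_MASKn, DWT_FUNCTIONn). It assumes the Armv7-M layout, in which the mask is the number of low address bits to ignore and function values 5, 6 and 7 select watch-on-read, watch-on-write and watch-on-read-or-write; Armv8-M cores use a different scheme. The struct and function names are illustrative, and on hardware comparator 0 starts at 0xE0001020.

```c
#include <stdint.h>

/* One comparator's register group (16-byte stride on Armv7-M). */
typedef struct {
    volatile uint32_t comp;     /* DWT_COMPn: address to match */
    volatile uint32_t mask;     /* DWT_MASKn: low address bits ignored */
    volatile uint32_t function; /* DWT_FUNCTIONn: match action */
    uint32_t reserved;
} dwt_comparator_t;

enum {
    DWT_FUNC_DISABLED    = 0x0,
    DWT_FUNC_WATCH_READ  = 0x5,
    DWT_FUNC_WATCH_WRITE = 0x6,
    DWT_FUNC_WATCH_RW    = 0x7
};

/* Watch a naturally aligned, power-of-two sized region.
 * Returns 0 on success, -1 if the size or alignment is invalid.
 * The maximum supported mask size is implementation-defined and is
 * not checked here. */
int dwt_watch_setup(dwt_comparator_t *c, uint32_t addr, uint32_t size_bytes,
                    uint32_t func)
{
    if (size_bytes == 0u || (size_bytes & (size_bytes - 1u)) != 0u) {
        return -1; /* not a power of two */
    }
    if ((addr & (size_bytes - 1u)) != 0u) {
        return -1; /* region not aligned to its size */
    }
    uint32_t mask_bits = 0u;
    while ((1u << mask_bits) < size_bytes) {
        mask_bits++;
    }
    c->function = DWT_FUNC_DISABLED; /* disable while reprogramming */
    c->comp = addr;
    c->mask = mask_bits;
    c->function = func;
    return 0;
}
```

Debuggers normally do this on the user's behalf when a data breakpoint is set; the sketch only shows what ends up in the registers.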

Debugger Options for Cortex-M Devices

There are various on-chip and external debugger choices for software development using Cortex-M devices:

JTAG/SWD debug port with IDE/debugger

On-chip ROM bootloader over UART
In-Circuit Debug Interface (ICDI)
J-Link debug probes

ST-LINK debug adapters
Segger J-Trace Probes

The most common approach is using a JTAG/SWD debug adapter with an IDE like Keil MDK or IAR EWARM, or with software packages like OpenOCD. Low-cost JTAG/SWD debug adapters are available from various vendors. More advanced debug probes also support trace capabilities via additional pins.

UART Bootloader Debugging

Many Cortex-M MCUs come with an on-chip ROM UART bootloader that enables communication over a UART port without a debug adapter. Commands can be sent to flash firmware, read memory, execute code etc. However, UART communication is slow and this method does not support advanced debugging and trace.

ICDI Debug Interface

The In-Circuit Debug Interface or ICDI is the on-board debug probe found on Texas Instruments Stellaris and Tiva C evaluation boards. A dedicated microcontroller on the board bridges USB to the target's JTAG/SWD port, so no external debug adapter is required. ICDI supports flash programming and breakpoint debugging but has limited trace support compared to dedicated JTAG/SWD probes.

J-Link Debug Probes

Segger J-Link debug probes are widely used JTAG/SWD adapters for Cortex-M debugging. Various models are available with features like:

USB to JTAG/SWD communication
Hi-Speed USB 2.0 support on most models
Ethernet connectivity option

Additional pins for trace support
Faster SWD clock rates

J-Link probes provide robust JTAG/SWD debug support at a low cost. Higher end models also support advanced tracing using additional pins.

ST-LINK Debug Adapters

ST-LINK debug adapters from STMicroelectronics are also commonly used for Cortex-M debugging. Key features supported include:

JTAG/SWD target interface
Virtual COM port over USB
Flash programming
SWO trace capture on some models

ST-LINK provides basic JTAG/SWD debugging at low cost and is commonly bundled with STM32 boards and IDEs.

J-Trace Pro Debug and Trace Probes

Segger J-Trace Pro probes provide advanced debug and trace capabilities by combining J-Link debug with trace modules. Key features offered:

JTAG/SWD debugging
TPIU SWO tracing – high-speed USB 3.0 streaming

Trace port – parallel trace pin connection
Probe supports ETB trace buffer readout

J-Trace Pro probes allow leveraging the full debug and trace capabilities of Cortex-M devices. Their high bandwidth allows capturing large trace datasets.

Setting Up SWD Debugging

The Serial Wire Debug (SWD) interface is the most common debug interface for Cortex-M devices today. The key steps for setting it up are:

Connect the SWD lines between the debugger and target device
Power up the target board

Configure debugger with correct target device and interface settings
Connect and initialize the debugger
Halt the target device after connection

Load the firmware image to flash on the device

Debugger software packages such as Keil MDK, Segger Ozone and OpenOCD provide options to set the target device and interface. USB-connected debug adapters like J-Link and ST-LINK are usually detected automatically. Launching the debugger connects it to the device and typically halts execution. Flash programming can then be done, followed by running and stepping through code.

SWD Interface Connections

The SWD interface uses just two signals for bidirectional communication:

SWDIO – Serial Wire Data I/O line
SWCLK – Serial Wire Clock

These two signals along with device ground and power need to be connected from the debugger probe to the target board. They should be connected directly between the devices or routed using short traces to minimize signal integrity issues.
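
To give a feel for what travels over these two wires, every SWD transfer begins with an 8-bit request that the probe shifts out least significant bit first: a start bit, an AP/DP select bit, a read/write bit, address bits A[3:2], a parity bit over those four bits, a stop bit and a park bit. The sketch below builds that request byte as defined by ADIv5; the function name is illustrative.

```c
#include <stdint.h>

/* Build an SWD request byte (transmitted LSB first).
 *   ap_n_dp : 0 = Debug Port register, 1 = Access Port register
 *   read    : 0 = write, 1 = read
 *   addr    : register address; only bits [3:2] are used */
uint8_t swd_request(int ap_n_dp, int read, uint8_t addr)
{
    uint8_t apndp = ap_n_dp ? 1u : 0u;
    uint8_t rnw   = read ? 1u : 0u;
    uint8_t a2    = (uint8_t)((addr >> 2) & 1u);
    uint8_t a3    = (uint8_t)((addr >> 3) & 1u);
    uint8_t parity = (uint8_t)((apndp ^ rnw ^ a2 ^ a3) & 1u);

    return (uint8_t)((1u << 0)       /* start, always 1 */
                   | (apndp << 1)
                   | (rnw << 2)
                   | (a2 << 3)
                   | (a3 << 4)
                   | (parity << 5)
                   | (0u << 6)       /* stop, always 0 */
                   | (1u << 7));     /* park, always 1 */
}
```

For instance, reading the DP identification register at address 0x0 yields the request byte 0xA5, which is usually the first transfer a probe makes after connecting.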

Powering the Target

The target Cortex-M device needs to be powered on before attempting to connect with the debugger. This may require an external power supply or connecting a battery. Any boards with power sequencing requirements must have those completed before debugging. The debugger software will allow powering the board from the probe in some cases.

Configuring Debugger Software

The debugger software needs to be set up with the target device details and the debug interface being used. Cortex-M device options include:

Cortex-M0/M0+

Cortex-M3
Cortex-M4
Cortex-M7

Cortex-M23/M33

The SWD interface is selected as the debug access method. The correct device and core information ensures that the debugger configures the DAP and Debug Port (DP) registers optimally.

Connecting Debugger to Target

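The debugger can be connected to the target device once the interface connection is made and the target is powered up. This may require pressing a connect button in the debugger software or may happen automatically. The debugger first performs an initial handshake with the Debug Port (DP), typically reading its ID register and requesting debug power-up, and then accesses the core through the DAP. Most debuggers halt the device on connect by default, although attaching to a running target is often possible.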
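
Halting is done by writing the core's Debug Halting Control and Status Register (DHCSR, at 0xE000EDF0 on Armv6-M/Armv7-M) through the DAP. The sketch below only computes the values involved: writes must carry the key 0xA05F in the upper halfword, C_DEBUGEN is bit 0, C_HALT is bit 1, and the S_HALT status flag is bit 17 of the value read back. The helper names are illustrative; how the write is issued depends on the probe software.

```c
#include <stdint.h>

#define DHCSR_DBGKEY    (0xA05Fu << 16) /* required in bits [31:16] on writes */
#define DHCSR_C_DEBUGEN (1u << 0)       /* enable halting debug */
#define DHCSR_C_HALT    (1u << 1)       /* request halt */
#define DHCSR_S_HALT    (1u << 17)      /* read-only: core is halted */

/* Value a debugger writes to DHCSR to halt the core. */
uint32_t dhcsr_halt_request(void)
{
    return DHCSR_DBGKEY | DHCSR_C_DEBUGEN | DHCSR_C_HALT;
}

/* Value a debugger writes to DHCSR to let the core run again while
 * keeping halting debug enabled. */
uint32_t dhcsr_resume_request(void)
{
    return DHCSR_DBGKEY | DHCSR_C_DEBUGEN;
}

/* Interpret a value read back from DHCSR. */
int dhcsr_is_halted(uint32_t dhcsr_read_value)
{
    return (dhcsr_read_value & DHCSR_S_HALT) != 0u;
}
```

After writing the halt request, a debugger typically polls DHCSR until the halted flag is set before reading core registers.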

Loading Firmware

Once connected, the debugger can be used to program the flash memory on the Cortex-M device. The firmware image is loaded into the debugger software, which then erases and programs the target flash memory using a device-specific flash algorithm. The programmed contents can be verified after programming completes. This leaves the device ready to be debugged.

JTAG Interface Configuration

While less common today, the JTAG interface can also be used for debugging Cortex-M devices. The key aspects of setting this up include:

Connecting the JTAG interface pins

Selecting JTAG on the debugger
Ensuring target device enables JTAG

JTAG Pin Connections

The JTAG interface uses up to 5 signals between the debugger and target:

TCK – Test Clock
TMS – Test Mode Select
TDI – Test Data In

TDO – Test Data Out
TRST – Test Reset (optional)

In addition to these, the device ground and power pins must be connected. Having short connections or minimizing stub traces is recommended for signal integrity.

Configuring Debugger for JTAG

The debugger software needs to be configured for JTAG interface setting instead of the default SWD mode. The target device settings remain the same. The JTAG pinout on the debugger probe needs to match that of the target board.
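
On devices whose debug port is a combined Serial Wire/JTAG Debug Port (SWJ-DP), the probe selects the protocol by clocking a special sequence on the TMS/SWDIO line: at least 50 clocks with the line high, a 16-bit selection value sent least significant bit first, then another run of high bits. Debugger software does this automatically when the interface setting is changed; the sketch below just generates the bit pattern under those assumptions. The selection values 0xE79E (JTAG to SWD) and 0xE73C (SWD to JTAG) come from ADIv5; the function name and the choice of 56 reset clocks are illustrative.

```c
#include <stddef.h>
#include <stdint.h>

#define SWJ_JTAG_TO_SWD 0xE79Eu
#define SWJ_SWD_TO_JTAG 0xE73Cu
#define SWJ_RESET_BITS  56u /* at least 50 are required */
#define SWJ_SEQ_BITS    (2u * SWJ_RESET_BITS + 16u)

/* Fill bits[] with the TMS/SWDIO level for each clock of the switching
 * sequence, in transmission order. Returns the number of bits written. */
size_t swj_switch_sequence(uint16_t select, uint8_t bits[SWJ_SEQ_BITS])
{
    size_t n = 0u;
    for (unsigned i = 0u; i < SWJ_RESET_BITS; i++) {
        bits[n++] = 1u; /* line reset: hold high */
    }
    for (unsigned i = 0u; i < 16u; i++) {
        bits[n++] = (uint8_t)((select >> i) & 1u); /* LSB first */
    }
    for (unsigned i = 0u; i < SWJ_RESET_BITS; i++) {
        bits[n++] = 1u; /* hold high again after the selection value */
    }
    return n;
}
```

Parts that implement only one of the two protocols do not need this sequence.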

Enabling JTAG on Target

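By default JTAG may be disabled or unavailable on some Cortex-M devices after reset. How it is enabled is vendor-specific: the JTAG pins are often multiplexed with GPIO functions, and some devices gate debug access through option bytes or security settings. Some smaller parts, including many Cortex-M0/M0+ devices, provide SWD only. Note that bit 0 of the Debug Halting Control and Status Register (DHCSR), C_DEBUGEN, enables halting debug rather than the JTAG interface itself. For robust operation, the recommendation is to ensure that the target firmware or bootloader leaves the JTAG pins assigned to their debug function.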

Trace Configuration

To leverage the instruction trace capabilities of Cortex-M devices, the trace components like ETM, TPIU, ETB/ITM need to be properly configured. The key steps are:

Enable trace clocks
Enable ETM trace
Connect trace pins

Configure debugger for trace
Capture or stream trace

Trace Clock Configuration

The ETM and TPIU need their clocks running before they can be used, and the TPIU's trace clock is often a separate clock source. These trace clocks need to be enabled on the Cortex-M device. On some MCUs, this may require configuring the clock tree to provide these clocks. Trace clock frequencies are device-specific and can range from tens of MHz to 100 MHz or more. The trace clocks need to be active before trace components are enabled.
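
Besides any device-specific clock tree setup, Armv7-M cores gate the trace blocks (DWT, ITM, ETM and TPIU) behind the TRCENA bit, bit 24 of the Debug Exception and Monitor Control Register (DEMCR, at 0xE000EDFC). A minimal sketch of setting and checking it follows; the register is passed in as a pointer so the code can be exercised on a host, and the function names are illustrative.

```c
#include <stdint.h>

#define DEMCR_ADDR   0xE000EDFCu /* address on Armv7-M hardware */
#define DEMCR_TRCENA (1u << 24)  /* global enable for DWT, ITM, ETM, TPIU */

/* Set TRCENA without disturbing the other DEMCR bits. */
void trace_blocks_enable(volatile uint32_t *demcr)
{
    *demcr |= DEMCR_TRCENA;
}

/* Returns nonzero if the trace blocks are enabled. */
int trace_blocks_enabled(const volatile uint32_t *demcr)
{
    return (*demcr & DEMCR_TRCENA) != 0u;
}
```

On a target, firmware would call trace_blocks_enable((volatile uint32_t *)DEMCR_ADDR) before touching any ITM, DWT or TPIU register; many debuggers set the bit themselves when trace is enabled.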

Enabling ETM Trace

By default the ETM component may be disabled on Cortex-M devices after reset. The debugger can enable it by setting the ETM Control Register appropriately. However, the recommendation is to explicitly enable ETM from the target firmware or bootloader code for robust operation. ETM trace can produce large amounts of data so it is advisable to enable tracing only when required.
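
The following sketch shows the general shape of bringing an ETM out of its reset state. It assumes an ETMv3-style programmer's model as found on Cortex-M3/M4, where the control register has a power-down bit (bit 0) and a programming bit (bit 10), and where the CoreSight lock access register must first be written with the key 0xC5ACCE55. Cores with an ETMv4 (such as Cortex-M7) use different registers, so treat this as an illustration rather than a drop-in routine. Registers are passed as pointers and the function names are hypothetical.

```c
#include <stdint.h>

#define CORESIGHT_UNLOCK_KEY 0xC5ACCE55u
#define ETMCR_POWERDOWN      (1u << 0)  /* 1 = ETM powered down */
#define ETMCR_PROGRAMMING    (1u << 10) /* 1 = ETM held for programming */

/* Unlock the ETM, power it up and hold it in programming mode so that
 * the remaining configuration registers can be written safely. */
void etm_begin_programming(volatile uint32_t *lock_access,
                           volatile uint32_t *etmcr)
{
    *lock_access = CORESIGHT_UNLOCK_KEY;
    *etmcr &= ~ETMCR_POWERDOWN;
    *etmcr |= ETMCR_PROGRAMMING;
}

/* Leave programming mode; the ETM starts tracing according to the
 * configuration written in between. */
void etm_end_programming(volatile uint32_t *etmcr)
{
    *etmcr &= ~ETMCR_PROGRAMMING;
}
```

A real sequence would also wait for the ETM status register to confirm each programming-bit change and would configure the trace enable logic between the two calls.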

Trace Pin Connections

To get the trace data output from the device, the TPIU pins need to be connected to the debugger probe. There are typically 2 options:

SWO pin – single-wire output, suited to lower-bandwidth ITM and DWT trace
TRACEPORT – parallel trace port (trace clock plus up to 4 data pins on most Cortex-M devices) for maximum bandwidth

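SWO is usually multiplexed onto a specific pin, often shared with JTAG TDO or a GPIO, while TRACEPORT requires a dedicated group of pins. Because TRACEPORT signals switch at high speed, they should be routed as short, length-matched, controlled-impedance traces, with series termination near the driver where the probe vendor recommends it.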
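
Which of the two outputs the TPIU drives is chosen by two registers on Armv7-M: the Selected Pin Protocol Register (0 = parallel trace port, 1 = SWO with Manchester encoding, 2 = SWO with NRZ/UART encoding) and the Current Parallel Port Size Register, in which bit n-1 is set for an n-pin port. The sketch below computes both values. It assumes a TPIU limited to 1, 2 or 4 data pins, as on Cortex-M3/M4; supported widths are implementation-defined. The type and function names are illustrative.

```c
#include <stdint.h>

typedef enum {
    TPIU_MODE_PARALLEL       = 0,
    TPIU_MODE_SWO_MANCHESTER = 1,
    TPIU_MODE_SWO_NRZ        = 2
} tpiu_mode_t;

typedef struct {
    uint32_t sppr;  /* value for the Selected Pin Protocol Register */
    uint32_t cspsr; /* value for the Current Parallel Port Size Register */
} tpiu_port_cfg_t;

/* Compute register values for the requested output.
 * width_bits is only used in parallel mode. Returns 0 on success,
 * -1 if the width is not supported under the stated assumption. */
int tpiu_port_config(tpiu_mode_t mode, unsigned width_bits,
                     tpiu_port_cfg_t *out)
{
    if (mode == TPIU_MODE_PARALLEL) {
        if (width_bits != 1u && width_bits != 2u && width_bits != 4u) {
            return -1;
        }
        out->cspsr = 1u << (width_bits - 1u);
    } else {
        out->cspsr = 1u; /* single-pin output */
    }
    out->sppr = (uint32_t)mode;
    return 0;
}
```

The debugger's trace settings must match what is programmed here, in particular the port width and the SWO encoding.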

Debugger Trace Configuration

The debugger needs to be aware that trace is enabled to capture and decode the trace output data. Trace options need to be enabled in the debugger. For basic SWO tracing, no additional configuration may be needed. For TRACEPORT, the pinout and modes have to be configured.

Capturing Trace Output

There are two main options for capturing trace data from the Cortex-M device:

Streaming trace in real-time over SWO or TRACEPORT to the debugger
Logging trace to the on-chip ETB buffer and then reading it back

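Real-time streaming requires high pin and probe bandwidth, and trace can overflow if the port cannot keep up, but the capture length is limited only by the host's storage. ETB capture needs no trace pins, but the buffer holds only a short window of execution, so anything before or after the capture window is lost.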

Tips for Effective Tracing

Here are some tips to help make best use of the Cortex-M trace capabilities:

Enable trace clocks early in system startup

Initialize ETM configuration before enabling it
Enable the SWO alternate function on its pin if the pin is shared with a GPIO
Check TRACEPORT connections and terminations

Use debugger settings to reduce trace data captured
Minimize instrumentation overhead when using ITM
Employ CTI and DWT triggers to control trace capture

Properly configuring trace enables obtaining the required program flow visibility with minimal overhead and without losing key data.

Troubleshooting Trace

Some steps for troubleshooting trace capture issues are:

Confirm trace clocks enabled and running

Check ETM and TPIU control registers configured correctly
Verify trace pin connections and signal integrity
Check debugger recognizing trace data on pin(s)

Examine target trace configurations for issues
Reduce trace data rate and scenarios being traced

Narrowing down trace problems requires verifying each step from target to debugger is working correctly. Checking control registers, signals, and data at each stage can identify where things are going wrong.
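
The stage-by-stage check can be sketched as a function that inspects register values read back from the target and reports the first stage that looks wrong. This version covers only the ITM-over-SWO path on an Armv7-M core and assumes the debugger can supply the register values; the struct, enum and function names are hypothetical.

```c
#include <stdint.h>

typedef struct {
    uint32_t demcr;     /* Debug Exception and Monitor Control Register */
    uint32_t itm_tcr;   /* ITM Trace Control Register */
    uint32_t itm_ter;   /* ITM Trace Enable Register (one bit per port) */
    uint32_t tpiu_sppr; /* TPIU Selected Pin Protocol Register */
    uint32_t tpiu_acpr; /* TPIU Asynchronous Clock Prescaler Register */
} swo_trace_regs_t;

typedef enum {
    TRACE_OK = 0,
    TRACE_ERR_TRCENA,       /* trace blocks not enabled in DEMCR */
    TRACE_ERR_ITM_DISABLED, /* ITMENA bit clear in ITM_TCR */
    TRACE_ERR_NO_PORTS,     /* no stimulus port enabled */
    TRACE_ERR_PIN_PROTOCOL, /* TPIU not set to an SWO mode */
    TRACE_ERR_PRESCALER     /* baud prescaler differs from expectation */
} trace_check_t;

/* Walk the chain from core to pin and return the first problem found. */
trace_check_t swo_trace_check(const swo_trace_regs_t *r,
                              uint32_t expected_acpr)
{
    if ((r->demcr & (1u << 24)) == 0u) return TRACE_ERR_TRCENA;
    if ((r->itm_tcr & 1u) == 0u)       return TRACE_ERR_ITM_DISABLED;
    if (r->itm_ter == 0u)              return TRACE_ERR_NO_PORTS;
    if (r->tpiu_sppr != 1u && r->tpiu_sppr != 2u)
        return TRACE_ERR_PIN_PROTOCOL;
    if (r->tpiu_acpr != expected_acpr) return TRACE_ERR_PRESCALER;
    return TRACE_OK;
}
```

If all register checks pass and no data arrives, the problem is more likely in the pin multiplexing, the wiring or the debugger's own trace settings.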

Conclusion

The CoreSight debug and trace architecture on Arm Cortex-M devices provides very powerful capabilities. Leveraging them requires understanding the components and configuring them properly. The debugger and probe must also support the capabilities being used. With some initial setup effort, efficient workflows can be built around Cortex-M debug and trace.