Debugging software on Arm Cortex-M devices often involves configuring CoreSight components such as the Embedded Trace Macrocell (ETM), Trace Port Interface Unit (TPIU), and Embedded Trace Buffer (ETB). Choosing the right debugger and setting up the debug hardware properly is key to an efficient software debugging workflow.
Overview of CoreSight Components in Cortex-M
Cortex-M based devices can include the following CoreSight debug and trace components. Which ones are present depends on the core and on the chip vendor's implementation:
- Debug Access Port (DAP)
- Embedded Trace Macrocell (ETM)
- Trace Port Interface Unit (TPIU)
- Embedded Trace Buffer (ETB)
- Instrumentation Trace Macrocell (ITM)
- Micro Trace Buffer (MTB)
- Cross Trigger Interface (CTI)
- Data Watchpoint and Trace Unit (DWT)
These components allow debugging through a debugger connected to the chip’s debug port. The DAP provides the debug interface, the ETM generates instruction trace, and the TPIU exports trace off-chip. The ETB buffers trace data on-chip. The ITM lets instrumented software emit custom debug data, while the MTB records a simple execution trace into on-chip SRAM on cores such as the Cortex-M0+. The CTI handles cross-triggering between cores and debug components. The DWT provides comparators for setting data watchpoints.
Debug Access Port
The Debug Access Port or DAP provides the interface for an external debugger to access the internal debug components. It has two interfaces:
- JTAG interface for 4/5 pin standard JTAG connection
- Serial Wire Debug (SWD) interface for 2-pin clock + data connection
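The DAP implements the Arm Debug Interface v5 (ADIv5) architecture to communicate with the debugger. It consists of a Debug Port (DP) that faces the probe and one or more Access Ports (APs) that reach the internal debug components and system memory over internal buses. The debugger uses the DP to request debug power-up when it connects, and debug authentication signals can restrict what the DAP is allowed to access.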
Embedded Trace Macrocell
The Embedded Trace Macrocell or ETM provides instruction trace capabilities for Cortex-M devices. It traces program flow by recording branches, exceptions, and other changes of flow in compressed form. ETM trace data is exported off-chip via the TPIU or captured on-chip in an ETB. Main features of the ETM include:
- Instruction tracing (data trace depends on the ETM version; on Cortex-M3 and Cortex-M4, data trace comes from the DWT instead)
- Branch tracing
- Exception and interrupt tracing
- Timestamp support
- Trace start/stop logic
- FIFO for trace data
The ETM trace gives detailed visibility into the software execution sequence on Cortex-M devices. However, it requires significant bandwidth to export the trace data off-chip. The ETB can be used to mitigate this by buffering trace data on-chip.
Trace Port Interface Unit
The Trace Port Interface Unit or TPIU exports trace data from the ETM and ITM to off-chip tools such as a debug probe or logic analyzer. It formats the trace streams and drives them out either on the single-pin Serial Wire Output (SWO) or on a parallel trace port made up of a trace clock and, on most Cortex-M devices, up to 4 data pins. The port mode and width can be configured to balance throughput against pin requirements.
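The trade-off between throughput and pin count can be estimated with simple arithmetic. The sketch below is only a rough model: the function names are hypothetical, it assumes the parallel port transfers data on both edges of the trace clock, and it assumes SWO uses UART-style framing with one start and one stop bit per byte. Check the device reference manual before relying on either assumption.

```python
def trace_port_bits_per_second(traceclk_hz: int, data_pins: int,
                               double_data_rate: bool = True) -> int:
    """Raw bit rate of a parallel trace port (before any formatter overhead)."""
    if data_pins < 1:
        raise ValueError("a parallel trace port needs at least one data pin")
    edges_per_cycle = 2 if double_data_rate else 1  # assumption: DDR by default
    return traceclk_hz * data_pins * edges_per_cycle


def swo_payload_bits_per_second(baud: int) -> float:
    """Payload rate of SWO in UART (NRZ) mode: 8 data bits per 10-bit frame."""
    return baud * 8 / 10


if __name__ == "__main__":
    # Hypothetical figures, for comparison only.
    print(trace_port_bits_per_second(25_000_000, 4))
    print(swo_payload_bits_per_second(2_000_000))
```

Numbers like these are upper bounds. The achievable rate also depends on the probe, the board layout, and how well the trace stream compresses.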
Embedded Trace Buffer
The Embedded Trace Buffer or ETB captures the instruction trace from the ETM in an on-chip RAM buffer. This avoids the high pin bandwidth requirements of exporting ETM trace continuously off-chip. The buffer size is implementation-defined and is typically a few kilobytes. Trace data is written circularly until the capture is stopped, so the buffer always holds the most recent trace. The buffer is read out through the DAP using the Arm Debug Interface. This trace capture approach provides significant benefits:
- Reduces pin usage – no trace pins are needed to read the buffer
- No continuous real-time trace streaming needed
- Avoids losing trace data to trace port bandwidth limits
- Works with a basic JTAG/SWD probe that has no trace capture hardware
The trade-off is depth: only as much trace as fits in the buffer is kept, and older data is overwritten once the buffer wraps.
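To make the circular capture behaviour concrete, here is a small host-side model. It is an illustration only: the class name and the word-based interface are hypothetical and do not mirror the ETB register interface.

```python
from collections import deque


class CircularTraceBuffer:
    """Toy model of an on-chip trace buffer that overwrites its oldest words."""

    def __init__(self, depth_words: int):
        # A deque with maxlen drops the oldest entry when a new one is appended.
        self._words = deque(maxlen=depth_words)
        self.total_written = 0

    def write(self, word: int) -> None:
        self._words.append(word)
        self.total_written += 1

    @property
    def wrapped(self) -> bool:
        """True once more words were written than the buffer can hold."""
        return self.total_written > self._words.maxlen

    def read_out(self) -> list:
        """Return the surviving words, oldest first, as a debugger would read them."""
        return list(self._words)


if __name__ == "__main__":
    buf = CircularTraceBuffer(depth_words=4)
    for word in range(6):
        buf.write(word)
    print(buf.read_out(), buf.wrapped)
```

After six writes into a four-word buffer, only the last four words remain. This is why stopping the capture close to the event of interest matters.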
Instrumentation Trace Macrocell
The Instrumentation Trace Macrocell or ITM provides software-generated trace and printf-style debugging support in Cortex-M devices. It offers up to 32 stimulus ports to which software can write data values, printf strings, and similar information. The ITM packetizes this instrumentation data, optionally adds timestamps, and exports it to off-chip tools via the TPIU. Each stimulus port is a memory-mapped register at a fixed address, so instrumentation data can be written from anywhere in software. Key features of the ITM include:
- Up to 32 stimulus ports for instrumentation data
- Output bandwidth limited by the SWO or trace port rate
- FIFO buffers for holding data
- Ports can be enabled individually and restricted to privileged access
The ITM allows efficient diagnostics and profiling without detailed instruction trace. However, care must be taken to ensure instrumentation points have low overhead.
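On the host side, each stimulus port write arrives as a small packet. The sketch below assumes the Armv7-M ITM packet layout, in which a software source packet has a one-byte header (port number in bits 7:3, bit 2 clear, payload size code in bits 1:0) followed by 1, 2, or 4 payload bytes. The function names are hypothetical, and the decoder deliberately rejects everything else (synchronization, timestamp, overflow, and DWT packets), so it is not a complete trace decoder.

```python
SIZE_BY_CODE = {0b01: 1, 0b10: 2, 0b11: 4}
CODE_BY_SIZE = {size: code for code, size in SIZE_BY_CODE.items()}


def encode_stimulus(port: int, payload: bytes) -> bytes:
    """Build the packet the ITM would emit for one stimulus port write."""
    if not 0 <= port < 32:
        raise ValueError("stimulus port must be 0..31")
    if len(payload) not in CODE_BY_SIZE:
        raise ValueError("payload must be 1, 2, or 4 bytes")
    header = (port << 3) | CODE_BY_SIZE[len(payload)]
    return bytes([header]) + payload


def decode_stimulus(stream: bytes) -> list:
    """Decode a stream that contains only software source packets."""
    packets = []
    i = 0
    while i < len(stream):
        header = stream[i]
        size_code = header & 0b11
        if size_code == 0 or header & 0b100:
            raise ValueError(f"not a software source packet: 0x{header:02X}")
        size = SIZE_BY_CODE[size_code]
        payload = stream[i + 1:i + 1 + size]
        if len(payload) < size:
            raise ValueError("truncated packet")
        packets.append((header >> 3, payload))
        i += 1 + size
    return packets


if __name__ == "__main__":
    stream = encode_stimulus(0, b"H") + encode_stimulus(0, b"i")
    print(decode_stimulus(stream))
```

Writing single bytes costs one header byte per payload byte, so packing data into 32-bit writes where possible roughly halves the trace bandwidth used per payload byte.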
Micro Trace Buffer
The Micro Trace Buffer or MTB provides a lightweight execution trace on small cores such as the Cortex-M0+, which have no ETM. It records changes in program flow as pairs of branch source and destination addresses and writes them into a region of on-chip SRAM that is shared with the application. No trace pins are needed because the debugger reads the buffer back through the DAP. Key capabilities include:
- Records non-sequential program flow such as branches and exceptions
- Uses a configurable portion of on-chip SRAM as a circular buffer
- Trace start and stop controlled by software or by hardware event signals
- Trace data readout via DAP
The MTB shows the most recent execution history without streaming off-chip. The downside is that trace depth is limited by the SRAM set aside for it, and the MTB does not carry ITM instrumentation data.
Cross Trigger Interface
The Cross Trigger Interface or CTI allows chaining trace and debugging operations across multiple cores and debug components. It provides trigger in and out signals to connect debugger break/watchpoint events to ETM trace start/stop logic. This enables advanced use cases like:
- Stopping trace upon debugger breakpoint hit
- Starting trace upon DWT watchpoint match
- Synchronizing trace across multiple ETMs
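The number of trigger inputs and outputs on a CTI is implementation-defined. Triggers are mapped onto a small set of shared channels, commonly 4, and the CTIs of different cores and components are linked through a Cross Trigger Matrix (CTM) that broadcasts those channels.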
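The routing idea can be modelled as two sets of bitmasks: each trigger input is enabled onto zero or more channels, and each trigger output listens to zero or more channels. The sketch below is a simplified model of that scheme. The function name and the trigger numbering are hypothetical, since the actual trigger assignments are device-specific.

```python
def routed_outputs(trigger_in: int, in_enable: dict, out_enable: dict) -> list:
    """Return the trigger outputs that fire when one trigger input is asserted.

    in_enable maps a trigger-input number to a bitmask of channels it drives.
    out_enable maps a trigger-output number to a bitmask of channels it listens to.
    """
    channels = in_enable.get(trigger_in, 0)
    return sorted(out for out, mask in out_enable.items() if mask & channels)


if __name__ == "__main__":
    # Hypothetical wiring: input 0 = "core halted", output 2 = "stop trace".
    in_enable = {0: 0b0001}               # input 0 drives channel 0
    out_enable = {2: 0b0001, 3: 0b0010}   # output 2 listens on channel 0
    print(routed_outputs(0, in_enable, out_enable))
```

Keeping unrelated events on separate channels avoids accidental coupling, for example a trace-start event that also halts a second core.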
Data Watchpoint and Trace Unit
The Data Watchpoint and Trace or DWT unit provides data watchpoint support through comparators in the Cortex-M processors. Instruction breakpoints are handled by a separate breakpoint unit. The number of comparators is implementation-defined, with up to 4 on Cortex-M3 and Cortex-M4 devices. Each comparator can be configured independently for different match conditions, such as a data address with a given access type. A match can halt program execution, generate a trace packet, or output an event to the CTI and ETM. The DWT also contains a cycle counter and profiling counters. Key features of the DWT include:
- Up to 4 watchpoint comparators (implementation-defined)
- Programmable match address, address mask, and access type
- Halt the core or raise a debug event on match
- Connect to ETM and CTI for trace
The DWT performs its comparisons in hardware while the code runs at full speed, so the debugger only gets involved when a match occurs. This makes watchpoint debugging far more efficient than single-stepping.
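A comparator that covers an address range can be modelled with a mask that ignores a number of low address bits. The sketch below assumes the Armv7-M style, where a mask value of n makes the comparator ignore the lowest n address bits. Other architecture versions express ranges differently, and the function name is hypothetical.

```python
def dwt_address_match(access_addr: int, comp_addr: int, mask_bits: int) -> bool:
    """Return True if an access falls inside the comparator's masked range."""
    ignore = (1 << mask_bits) - 1          # low bits that are not compared
    keep = 0xFFFFFFFF & ~ignore            # 32-bit mask of the compared bits
    return (access_addr & keep) == (comp_addr & keep)


if __name__ == "__main__":
    # Watch the 16-byte block starting at 0x20000100 (hypothetical address).
    print(dwt_address_match(0x20000104, 0x20000100, 4))
    print(dwt_address_match(0x20000110, 0x20000100, 4))
```

Because the range is a power-of-two block aligned to its own size, watching an arbitrary struct may need a slightly larger block or a second comparator.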
Debugger Options for Cortex-M Devices
There are various on-chip and external debugger choices for software development using Cortex-M devices:
- JTAG/SWD debug port with IDE/debugger
- On-chip ROM bootloader over UART
- In-Circuit Debug Interface (ICDI)
- J-Link debug probes
- ST-LINK debug adapters
- Segger J-Trace Probes
The most common approach is to use a JTAG/SWD debug adapter with an IDE such as Keil MDK or IAR EWARM, or with a software package such as OpenOCD. Low-cost JTAG/SWD debug adapters are available from various vendors. More advanced debug probes also support trace capture via additional pins.
UART Bootloader Debugging
Many Cortex-M MCUs come with an on-chip ROM UART bootloader that enables communication over a UART port without a debug adapter. Commands can be sent to flash firmware, read memory, execute code etc. However, UART communication is slow and this method does not support advanced debugging and trace.
ICDI Debug Interface
The In-Circuit Debug Interface or ICDI is the on-board debug interface found on Texas Instruments Stellaris and Tiva C evaluation boards. A companion microcontroller on the board bridges USB to the target's JTAG/SWD pins, so no external probe is needed. ICDI supports flash programming and breakpoint debugging but has limited trace support compared to dedicated debug and trace probes.
J-Link Debug Probes
Segger J-Link debug probes are widely used JTAG/SWD adapters for Cortex-M debugging. Various models are available with features like:
- USB to JTAG/SWD communication
- Full-Speed or High-Speed USB host connection, depending on the model
- Ethernet connectivity option
- Additional pins for trace support
- Faster SWD clock rates
J-Link probes provide robust JTAG/SWD debug support at a low cost. Higher end models also support advanced tracing using additional pins.
ST-LINK Debug Adapters
ST-LINK debug adapters from STMicroelectronics are also commonly used for Cortex-M debugging. Key features supported include:
- JTAG/SWD host interface
- Virtual COM port over USB
- Flash programming
- SWO trace capture on most models, but no parallel trace port capture
ST-LINK provides basic JTAG/SWD debugging at low cost and is commonly bundled with STM32 boards and IDEs.
J-Trace Pro Debug and Trace Probes
Segger J-Trace Pro probes provide advanced debug and trace capabilities by combining J-Link debug with trace modules. Key features offered:
- JTAG/SWD debugging
- Streaming trace to the host over SuperSpeed USB 3.0
- Trace port – parallel trace pin connection
- Probe supports ETB trace buffer readout
J-Trace Pro probes allow leveraging the full debug and trace capabilities of Cortex-M devices. Their high bandwidth allows capturing large trace datasets.
Setting Up SWD Debugging
The Serial Wire Debug (SWD) interface is the most common debug interface for Cortex-M devices today. The key steps for setting it up are:
- Connect the SWD lines between the debugger and target device
- Power up the target board
- Configure debugger with correct target device and interface settings
- Connect and initialize the debugger
- Halt the target device after connection
- Load the firmware image to flash on the device
Debugger software packages such as Keil MDK, Segger Ozone, and OpenOCD provide options to set the target device and interface. USB-connected debug adapters such as J-Link and ST-LINK are usually detected automatically. Launching the debugger connects it to the device and, in most configurations, halts execution. The flash can then be programmed, followed by running and stepping through code.
SWD Interface Connections
The SWD interface uses just two signals for bidirectional communication:
- SWDIO – Serial Wire Data I/O line
- SWCLK – Serial Wire Clock
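These two signals, along with ground and the target's supply voltage as a reference for the probe's level shifters, need to be connected from the debugger probe to the target board. A reset line is often connected as well so that the probe can reset the target. The signals should be connected directly between the devices or routed using short traces to minimize signal integrity issues.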
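Every SWD transfer on these two wires starts with an 8-bit request sent by the probe. As a sketch of the ADIv5 request format, the helper below builds that byte: start bit, APnDP, RnW, address bits A[3:2], even parity over those four bits, stop bit, and park bit, transmitted least significant bit first. The function name is hypothetical.

```python
def swd_request(apndp: int, rnw: int, addr: int) -> int:
    """Build the 8-bit SWD request header (bit 0 is sent first on the wire).

    apndp: 0 for a Debug Port register, 1 for an Access Port register
    rnw:   0 for a write, 1 for a read
    addr:  register address; only bits 3:2 are used
    """
    a2 = (addr >> 2) & 1
    a3 = (addr >> 3) & 1
    parity = (apndp + rnw + a2 + a3) & 1   # even parity over the four fields
    return (
        (1 << 0)          # start bit, always 1
        | (apndp << 1)
        | (rnw << 2)
        | (a2 << 3)
        | (a3 << 4)
        | (parity << 5)
        | (0 << 6)        # stop bit, always 0
        | (1 << 7)        # park bit, always 1
    )


if __name__ == "__main__":
    # Reading the DP identification register at address 0x0.
    print(hex(swd_request(apndp=0, rnw=1, addr=0x0)))  # 0xa5
```

Seeing this byte on SWDIO with a logic analyzer right after the line reset is a quick way to confirm that the probe is driving the bus and that the wiring is correct.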
Powering the Target
The target Cortex-M device needs to be powered on before attempting to connect with the debugger. This may require an external power supply or a battery. On boards with power sequencing requirements, the sequence must be complete before debugging starts. Some probes can also supply power to the target, which is usually enabled from the debugger software.
Configuring Debugger Software
The debugger software needs to be set up with the target device details, typically the exact part number or the core type, and with the debug interface being used. Here, SWD is selected as the debug access method. Correct device and core information ensures that the debugger configures the DAP and Debug Port (DP) registers appropriately and uses the right flash programming algorithm.
Connecting Debugger to Target
The debugger can be connected to the target device once the interface connection is made and the target is powered up. This may require pressing a connect button in the debugger software or may happen automatically. An initial handshake takes place with the chip’s DP and the rest of the DAP. Many debuggers halt the device on connect by default, although this is usually configurable.
Once connected, the debugger can be used to program the flash memory on the Cortex-M device. The firmware image is loaded into the debugger software. Flash programming options are then used to erase and program the target flash memory. Verification can be done after programming completes. This leaves the device ready to be debugged.
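Halting on connect comes down to a single register write by the debugger. The sketch below assumes an Armv7-M core, where the Debug Halting Control and Status Register (DHCSR) sits at 0xE000EDF0 and accepts writes only when the upper halfword carries the key 0xA05F. The helper name is hypothetical, and the code only computes the value that a probe would write.

```python
DHCSR_ADDR = 0xE000EDF0
DBGKEY = 0xA05F << 16   # key required in bits 31:16 of every DHCSR write
C_DEBUGEN = 1 << 0      # enable halting debug
C_HALT = 1 << 1         # request the core to halt
C_STEP = 1 << 2         # single-step when leaving the halted state


def dhcsr_write_value(*, halt: bool = False, step: bool = False) -> int:
    """Compute a DHCSR write value that keeps halting debug enabled."""
    value = DBGKEY | C_DEBUGEN
    if halt:
        value |= C_HALT
    if step:
        value |= C_STEP
    return value


if __name__ == "__main__":
    print(hex(dhcsr_write_value(halt=True)))   # 0xa05f0003
    print(hex(dhcsr_write_value()))            # 0xa05f0001
```

Writing the second value resumes execution while leaving halting debug enabled, so breakpoints and watchpoints keep working.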
JTAG Interface Configuration
While less common today, the JTAG interface can also be used for debugging Cortex-M devices. The key aspects of setting this up include:
- Connecting the JTAG interface pins
- Selecting JTAG on the debugger
- Ensuring target device enables JTAG
JTAG Pin Connections
The JTAG interface requires connecting 4 signals between the debugger and target, plus an optional reset:
- TCK – Test Clock
- TMS – Test Mode Select
- TDI – Test Data In
- TDO – Test Data Out
- TRST – Test Reset (optional)
In addition to these, the device ground and power pins must be connected. Having short connections or minimizing stub traces is recommended for signal integrity.
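TRST can be optional because the TAP controller defined by IEEE 1149.1 can always be reset through TMS alone: holding TMS high for five TCK cycles reaches Test-Logic-Reset from any state. The sketch below models the standard 16-state TAP state machine to show this. The state names and the `walk` helper are naming choices made for this example.

```python
# Next state for each TAP state, indexed by the TMS value (0 or 1).
TAP_NEXT = {
    "TEST_LOGIC_RESET": ("RUN_TEST_IDLE", "TEST_LOGIC_RESET"),
    "RUN_TEST_IDLE": ("RUN_TEST_IDLE", "SELECT_DR"),
    "SELECT_DR": ("CAPTURE_DR", "SELECT_IR"),
    "CAPTURE_DR": ("SHIFT_DR", "EXIT1_DR"),
    "SHIFT_DR": ("SHIFT_DR", "EXIT1_DR"),
    "EXIT1_DR": ("PAUSE_DR", "UPDATE_DR"),
    "PAUSE_DR": ("PAUSE_DR", "EXIT2_DR"),
    "EXIT2_DR": ("SHIFT_DR", "UPDATE_DR"),
    "UPDATE_DR": ("RUN_TEST_IDLE", "SELECT_DR"),
    "SELECT_IR": ("CAPTURE_IR", "TEST_LOGIC_RESET"),
    "CAPTURE_IR": ("SHIFT_IR", "EXIT1_IR"),
    "SHIFT_IR": ("SHIFT_IR", "EXIT1_IR"),
    "EXIT1_IR": ("PAUSE_IR", "UPDATE_IR"),
    "PAUSE_IR": ("PAUSE_IR", "EXIT2_IR"),
    "EXIT2_IR": ("SHIFT_IR", "UPDATE_IR"),
    "UPDATE_IR": ("RUN_TEST_IDLE", "SELECT_DR"),
}


def walk(state: str, tms_bits) -> str:
    """Clock the TAP once per TMS value and return the final state."""
    for tms in tms_bits:
        state = TAP_NEXT[state][tms]
    return state


if __name__ == "__main__":
    # Five clocks with TMS high reset the TAP from every possible state.
    print(all(walk(s, [1] * 5) == "TEST_LOGIC_RESET" for s in TAP_NEXT))
    # From reset, TMS = 0, 1, 0, 0 reaches the data register shift state.
    print(walk("TEST_LOGIC_RESET", [0, 1, 0, 0]))
```

Probes typically send more than five TMS-high clocks at connect time for exactly this reason.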
Configuring Debugger for JTAG
The debugger software needs to be set to the JTAG interface instead of the default SWD mode. The target device settings remain the same. The JTAG pinout on the debugger probe needs to match that of the target board.
Enabling JTAG on Target
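By default, JTAG may be unavailable on some Cortex-M devices after reset, or the firmware may disable it by reassigning the JTAG pins to other functions. How the pins are enabled is vendor-specific and is usually controlled by pin-multiplexing or debug configuration registers rather than by a core register. Note that bit 0 of the Debug Halting Control and Status Register (DHCSR), C_DEBUGEN, enables halting debug once the debugger is already connected; it does not enable the JTAG pins. Many probes can recover a device whose firmware has taken over the pins by connecting while holding it in reset. For robust operation, make sure that the target firmware or bootloader leaves the JTAG pins in their debug function.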
To leverage the instruction trace capabilities of Cortex-M devices, the trace components like ETM, TPIU, ETB/ITM need to be properly configured. The key steps are:
- Enable trace clocks
- Enable ETM trace
- Connect trace pins
- Configure debugger for trace
- Capture or stream trace
Trace Clock Configuration
The ETM and TPIU need a running trace clock before they can be used. How this clock is enabled is device-specific: on some MCUs, the clock tree or a vendor debug register must be configured to provide it. At the core level, trace is enabled globally by the TRCENA bit in the Debug Exception and Monitor Control Register (DEMCR). The usable trace clock frequency depends on the device and on what the probe and board layout can handle. The trace clocks need to be active before the trace components are enabled.
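Separately from any vendor clock-tree setup, Armv7-M cores have an architectural enable for the trace blocks: the TRCENA bit (bit 24) of the Debug Exception and Monitor Control Register (DEMCR) at 0xE000EDFC. The sketch below shows the read-modify-write that sets it. The `read32` and `write32` callables are hypothetical stand-ins for whatever memory access the probe library or firmware provides, and a plain dictionary plays the role of target memory here.

```python
DEMCR_ADDR = 0xE000EDFC
TRCENA = 1 << 24   # global enable for the DWT, ITM, ETM, and TPIU


def enable_trace(read32, write32) -> bool:
    """Set DEMCR.TRCENA without disturbing the other bits; return the new state."""
    value = read32(DEMCR_ADDR)
    if not value & TRCENA:
        write32(DEMCR_ADDR, value | TRCENA)
    return bool(read32(DEMCR_ADDR) & TRCENA)


if __name__ == "__main__":
    # Simulated target memory with one unrelated DEMCR bit already set.
    memory = {DEMCR_ADDR: 0x00000001}
    print(enable_trace(memory.__getitem__, memory.__setitem__))
    print(hex(memory[DEMCR_ADDR]))   # 0x1000001
```

Reading the register back, as the helper does, is a cheap check that the write reached the target before the other trace registers are touched.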
Enabling ETM Trace
By default the ETM component may be disabled on Cortex-M devices after reset. The debugger can enable it by setting the ETM Control Register appropriately. However, the recommendation is to explicitly enable ETM from the target firmware or bootloader code for robust operation. ETM trace can produce large amounts of data so it is advisable to enable tracing only when required.
Trace Pin Connections
To get trace data out of the device, the TPIU pins need to be connected to the debugger probe. There are typically 2 options:
- SWO pin – single-wire output that carries ITM and DWT trace; it is too slow for full ETM instruction trace
- TRACEPORT – parallel trace port (a trace clock plus, typically, 1, 2, or 4 data pins) for maximum bandwidth
Both are alternate functions of specific pins: SWO often shares a pin with JTAG TDO, while the parallel port takes several additional pins. Parallel trace signals should be kept short and length-matched, with series termination where the device or probe vendor recommends it.
Debugger Trace Configuration
The debugger needs to know that trace is enabled in order to capture and decode the trace output data, so the trace options need to be enabled in the debugger. For SWO tracing, the debugger usually needs the core or trace clock frequency and the SWO baud rate so that it can program the prescaler and decode the stream. For TRACEPORT, the port width and mode have to be configured.
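For SWO in UART mode, the debugger and the target must agree on the bit rate, which the TPIU derives from its input clock with a prescaler. The sketch below assumes the usual relationship, baud rate = trace clock / (prescaler + 1), and assumes that the trace clock is known. On many devices it equals the core clock, but that is device-specific. The function name is hypothetical.

```python
def swo_prescaler(trace_clock_hz: int, target_baud: int):
    """Return (prescaler register value, actual baud rate) for a target SWO rate."""
    divider = round(trace_clock_hz / target_baud)
    if divider < 1:
        raise ValueError("target baud rate is higher than the trace clock")
    return divider - 1, trace_clock_hz / divider


if __name__ == "__main__":
    # Hypothetical 72 MHz trace clock and 2 Mbit/s SWO.
    print(swo_prescaler(72_000_000, 2_000_000))   # (35, 2000000.0)
```

If the actual rate differs noticeably from the rate configured in the debugger, the stream decodes as garbage, which is one of the most common causes of a silent or corrupted SWO console.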
Capturing Trace Output
There are two main options for capturing trace data from the Cortex-M device:
- Streaming trace in real-time over SWO or TRACEPORT to the debugger
- Logging trace to the on-chip ETB buffer and then reading it back
Real-time streaming needs high pin and probe bandwidth but can record long runs; trace is lost only if the port cannot keep up. ETB capture needs no trace pins, but it holds only as much trace as fits in the buffer, so anything outside the capture window is lost.
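The choice between the two can be framed as a comparison of three numbers: the rate at which trace is generated, the depth of the on-chip buffer, and the rate the trace port can sustain. The helper below is a hypothetical decision aid rather than a vendor tool, and all figures passed to it are estimates that the user has to supply.

```python
def choose_capture(required_seconds: float, trace_bytes_per_s: float,
                   etb_bytes: int, port_bytes_per_s: float) -> str:
    """Suggest a capture method for a desired trace duration.

    Returns "etb" if the on-chip buffer covers the window, "stream" if the
    trace port can keep up with the trace rate, and "filter" if neither
    works and the amount of trace has to be reduced first.
    """
    etb_window_s = etb_bytes / trace_bytes_per_s
    if etb_window_s >= required_seconds:
        return "etb"
    if port_bytes_per_s >= trace_bytes_per_s:
        return "stream"
    return "filter"


if __name__ == "__main__":
    # Hypothetical numbers: 2 MB/s of trace, a 4 KB buffer, a 25 MB/s port.
    print(choose_capture(0.001, 2_000_000, 4096, 25_000_000))  # etb
    print(choose_capture(1.0, 2_000_000, 4096, 25_000_000))    # stream
    print(choose_capture(1.0, 2_000_000, 4096, 1_000_000))     # filter
```

The "filter" outcome corresponds to narrowing what is traced, for example with trace start and stop conditions, until one of the other two options becomes viable.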
Tips for Effective Tracing
Here are some tips to help make best use of the Cortex-M trace capabilities:
- Enable trace clocks early in system startup
- Initialize ETM configuration before enabling it
- Route SWO through its pin's alternate function, and keep that pin free if it is shared with JTAG TDO
- Check TRACEPORT connections and terminations
- Use debugger settings to reduce trace data captured
- Minimize instrumentation overhead when using ITM
- Employ CTI and DWT triggers to control trace capture
Properly configuring trace enables obtaining the required program flow visibility with minimal overhead and without losing key data.
Some steps for troubleshooting trace capture issues are:
- Confirm that the trace clocks are enabled and running
- Check that the ETM and TPIU control registers are configured correctly
- Verify trace pin connections and signal integrity
- Check that the debugger recognizes trace data on the pin(s)
- Examine target trace configurations for issues
- Reduce trace data rate and scenarios being traced
Narrowing down trace problems requires verifying each step from target to debugger is working correctly. Checking control registers, signals, and data at each stage can identify where things are going wrong.
The CoreSight debug and trace architecture on Arm Cortex-M devices provides very powerful capabilities. Leveraging them requires understanding the components and configuring them properly. The debugger in use also needs to support the capabilities that the device offers. With some effort spent getting set up, efficient workflows can be built around Cortex-M debug and trace.