Designing an optimized clocking system is critical for any FPGA implementation, and especially so for DesignStart Cortex-M0 cores, which have specific clocking requirements. This article gives an overview of key considerations and techniques for optimizing clock design when targeting Cortex-M0 DesignStart on FPGAs.
Cortex-M0 Clocking Overview
The Cortex-M0 processor has a three-stage pipeline. This article assumes clock frequencies of up to 50 MHz; the frequency actually achievable depends on the FPGA family and speed grade. The key clock domains are:
- System clock – Core system/bus clock, max 50 MHz
- Processor clock – Cortex-M0 core/pipeline clock, max 50 MHz
- Debug clock – Debug/trace interface clock, max 75 MHz
The processor and system clocks must have a set ratio of 1:1, 2:1, or 4:1. The debug clock can be decoupled and run at a faster rate.
The Cortex-M0 requires a clean, low jitter clock source to meet timing. On an FPGA, this generally requires using a dedicated PLL/MMCM rather than direct oscillators.
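The limits and ratios above lend themselves to a quick sanity check before any PLL is configured. The following is a minimal Python sketch, not part of any DesignStart deliverable: the function name `check_clock_plan` and its parameters are hypothetical, and the default limits are simply the figures quoted in this section.

```python
ALLOWED_RATIOS = (1, 2, 4)  # processor:system ratios quoted in this section


def check_clock_plan(processor_mhz, system_mhz, debug_mhz,
                     core_max_mhz=50.0, debug_max_mhz=75.0):
    """Return a list of problems found; an empty list means the plan passes."""
    problems = []
    if processor_mhz > core_max_mhz:
        problems.append(f"processor clock {processor_mhz} MHz exceeds {core_max_mhz} MHz")
    if system_mhz > core_max_mhz:
        problems.append(f"system clock {system_mhz} MHz exceeds {core_max_mhz} MHz")
    if debug_mhz > debug_max_mhz:
        problems.append(f"debug clock {debug_mhz} MHz exceeds {debug_max_mhz} MHz")
    # The processor:system ratio must match one of the allowed values.
    ratio = processor_mhz / system_mhz
    if not any(abs(ratio - r) < 1e-9 for r in ALLOWED_RATIOS):
        problems.append(f"processor:system ratio {ratio:g} is not one of {ALLOWED_RATIOS}")
    return problems


print(check_clock_plan(50, 50, 75))  # no problems reported
print(check_clock_plan(50, 30, 75))  # reports the unsupported ratio
```

Keeping the limits as parameters means the same check can be reused if a different target frequency is chosen for a particular device.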
Clock Source Selection
For most FPGA families, a PLL/MMCM should be used to generate the Cortex-M0 system/processor clocks from a board-level oscillator. This allows generating a low jitter, high frequency clock from a lower frequency source.
Some key considerations for the PLL/MMCM settings:
- Select clean source oscillator – Typically 25-100 MHz from a crystal or LVDS oscillator
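- Keep PLL/MMCM multiplication modest – Use the lowest multiplier that keeps the VCO within its specified range
- Choose the PLL/MMCM loop bandwidth deliberately – A lower bandwidth setting filters more input jitter; output dividers set the frequency but do not remove jitter
- Adjust PLL/MMCM output settings incrementally – Step the output frequency up in small increments while checking timing margin
For example, on a Xilinx FPGA a 50 MHz output clock could be generated from a 100 MHz LVDS oscillator using a PLL with settings like the ones below. The multiplier is chosen to place the VCO inside the range given in the device datasheet (several hundred MHz or higher on 7 series parts), so check that range before reusing these values:
- Input: 100 MHz LVDS oscillator
- PLL multiplication: 10x (1000 MHz VCO)
- PLL division: 20x output divider – 50 MHz output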
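A multiplier and divider pair like the one above can be found with a small search. The sketch below is illustrative only: `pll_settings` is a hypothetical helper, the multiplier and divider ranges are assumptions, and the VCO limits passed in the usage line are placeholders that should be replaced with the values from the device datasheet.

```python
def pll_settings(f_in_mhz, f_out_mhz, vco_min_mhz, vco_max_mhz,
                 mult_range=range(1, 65), div_range=range(1, 129)):
    """List (mult, div, vco_mhz) combinations where f_in * mult / div == f_out
    and the VCO frequency f_in * mult lies inside [vco_min, vco_max].
    Results are ordered by increasing multiplier."""
    results = []
    for mult in mult_range:
        vco = f_in_mhz * mult
        if not (vco_min_mhz <= vco <= vco_max_mhz):
            continue  # VCO would run outside its legal range
        for div in div_range:
            if abs(vco / div - f_out_mhz) < 1e-9:
                results.append((mult, div, vco))
    return results


# Placeholder VCO limits; take the real ones from the device datasheet.
options = pll_settings(100, 50, vco_min_mhz=600, vco_max_mhz=1200)
print(options[0])  # lowest legal multiplier: (6, 12, 600)
```

Because the results are ordered by multiplier, the first entry is the lowest multiplication that still satisfies the VCO range, which matches the guidance to keep multiplication modest.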
The debug clock can come from a separate PLL or from another output of the PLL that generates the system clock. A separate PLL allows its frequency to be chosen independently of the system clock.
Clock Network Resources
In addition to the PLL/MMCM source, the clock network resources used to route the clocks are critical for low skew and jitter. Key tips here:
- Use dedicated routing – Global buffers, regional clocks
- Avoid excessive fanout – Replicate/buffer clocks if needed
- Balance routing delays – Match delays on paired clock networks
Also pay close attention to meeting timing constraints from the PLLs to endpoints, reducing skew, and avoiding excess delay variation.
Processor Clock Gating
The Cortex-M0 DesignStart processor implements extensive clock gating internally to gate the clock when idle. Because of this, activity on the processor clock net may be very bursty.
To avoid issues such as supply noise, some tips for processor clock gating are:
- Use a PLL/MMCM – buffers the clock source from changes in gated load better than driving the fabric directly from the oscillator
- Increase regulator bandwidth – reduce supply noise
- Gate unused peripherals – reduce simultaneous switching
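When peripheral clocks are gated, the gate itself must not produce runt pulses. On an FPGA a clock buffer with a clock-enable input is normally the preferred way to do this; the Python model below only illustrates the underlying idea of a latch-based gate, where the enable is captured while the clock is low. The function names and the sample waveforms are assumptions made for this sketch.

```python
def gate_clock_latched(clk_samples, en_samples):
    """Latch-based clock gate: the enable is only sampled while clk is low,
    so the gated clock never changes in the middle of a high phase."""
    latched_en = 0
    gated = []
    for clk, en in zip(clk_samples, en_samples):
        if clk == 0:
            latched_en = en  # latch is transparent during the low phase
        gated.append(clk & latched_en)
    return gated


def gate_clock_naive(clk_samples, en_samples):
    """Plain AND gate, shown only for contrast; it can emit runt pulses."""
    return [clk & en for clk, en in zip(clk_samples, en_samples)]


# Two samples per clock phase; the enable changes in the middle of high phases.
clk = [0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1]
en  = [0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0]

print(gate_clock_naive(clk, en))    # contains half-width pulses
print(gate_clock_latched(clk, en))  # only one full-width pulse
```

In the naive output, the enable edges that fall inside a high phase create pulses half as wide as the clock's high phase, whereas the latched version passes exactly one complete pulse.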
Debug and Trace Clocks
The debug and trace clocks can run at a higher frequency than the system clock. This allows debug components like SWD and ETB to operate faster.
Some guidelines for these clocks:
- Can use same PLL/MMCM as system clock or separate source
- Derive the debug clock from the system clock PLL using its own output divider
- Target a debug clock up to 1.5x the system clock (75 MHz against 50 MHz) if possible, which is the limit given in the overview
When sourcing the debug/trace clocks, avoid adding jitter to the system clocks. Use dedicated routes if possible.
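Whether the debug clock can share a PLL/MMCM with the system clock depends on whether both frequencies divide evenly out of one VCO frequency. The sketch below checks this under the assumption of integer-only output dividers; `shared_vco_dividers` is a hypothetical helper and the VCO values are illustrative, not datasheet figures.

```python
def shared_vco_dividers(vco_mhz, targets_mhz):
    """Map each target frequency to its integer output divider,
    or to None when the VCO is not an integer multiple of that target."""
    plan = {}
    for f in targets_mhz:
        div = vco_mhz / f
        nearest = round(div)
        plan[f] = nearest if abs(div - nearest) < 1e-9 else None
    return plan


print(shared_vco_dividers(600, [50, 75]))   # {50: 12, 75: 8}
print(shared_vco_dividers(1000, [50, 75]))  # {50: 20, 75: None}
```

A `None` entry suggests either picking a different VCO frequency or moving the debug clock to its own PLL.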
Reset Design and Clocks
Proper reset design is also crucial for robust clocking. The key aspects are:
- Use proper synchronized reset logic
- Assert reset for enough cycles for PLLs to lock
- Control clocks and reset state machines properly at init
- Ensure no glitches on reset de-assert if gating clocks
The reset logic and state machines must hold the Cortex-M0 in reset while PLLs lock, and cleanly release when clocks are stable.
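The intended sequencing can be modelled cycle by cycle before writing any RTL. The following is a simplified behavioural sketch in Python, not the DesignStart reset controller: `reset_sequence` and `hold_cycles` are hypothetical names, and the model ignores the synchronizer flip-flop delays that real hardware would add.

```python
def reset_sequence(locked_samples, hold_cycles=4):
    """Return the active-low reset value for each clock cycle.

    Reset stays asserted (0) until the PLL lock signal has been high for
    hold_cycles consecutive cycles, and is re-asserted as soon as lock drops.
    """
    count = 0
    reset_n = []
    for locked in locked_samples:
        if not locked:
            count = 0            # lock lost: restart the hold-off counter
        elif count < hold_cycles:
            count += 1           # lock present: count stable cycles
        reset_n.append(1 if count >= hold_cycles else 0)
    return reset_n


# Lock glitches low once before settling.
locked = [0, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1]
print(reset_sequence(locked, hold_cycles=3))
# [0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
```

The brief loss of lock in the example restarts the counter, so reset is released only after three uninterrupted locked cycles.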
Clock Constraints and Analysis
Applying complete timing constraints and doing thorough analysis is a must for complex clocking schemes. Some tips here:
- Constrain all clock frequencies, sources, edges
- Verify timing paths meet multi-cycle and false path constraints
- Examine clock interaction reports and skew
- Iteratively improve constraints based on analysis
Timing closure may require relaxing certain paths with multi-cycle constraints. Use these sparingly: an exception applied to the wrong path hides real violations, while over-tight constraints make closure needlessly hard.
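Clock constraints are usually written in SDC-style syntax, and generating them from one table of frequencies helps keep periods consistent. The sketch below emits `create_clock` and `set_clock_groups` lines; the clock and port names (`osc_clk`, `dbg_osc_clk`, and so on) are hypothetical, and declaring two clocks asynchronous is only appropriate when they really come from unrelated sources. Depending on the tool, PLL output clocks may be derived automatically from the input clock constraint.

```python
def create_clock(name, freq_mhz, port):
    """Build an SDC create_clock line with the period in nanoseconds."""
    period_ns = 1000.0 / freq_mhz
    return f"create_clock -name {name} -period {period_ns:.3f} [get_ports {port}]"


def async_groups(*clock_names):
    """Build an SDC set_clock_groups line marking the clocks as asynchronous."""
    groups = " ".join(f"-group {{{name}}}" for name in clock_names)
    return f"set_clock_groups -asynchronous {groups}"


# Hypothetical board: a 100 MHz system oscillator and a separate debug oscillator.
print(create_clock("osc_clk", 100, "osc_clk_p"))
print(create_clock("dbg_osc_clk", 75, "dbg_osc"))
print(async_groups("osc_clk", "dbg_osc_clk"))
```

The generated text can be pasted into a constraints file and then checked against the tool's clock interaction report.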
Simulation of Clocks
Simulating the clocking logic helps validate functionality before hardware. Areas to validate in simulation:
- PLL/MMCM locking and output frequencies
- Clock interaction and gating logic
- Reset sequencing relative to clocks
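- Functional behaviour of clock domain crossings
Full post-implementation timing simulation is slow and rarely practical for a whole design, but behavioural simulation with ideal clocks can still validate many interactions early. Timing margins themselves are checked by static timing analysis rather than simulation.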
Bring-Up Recommendations
During FPGA bring-up and prototyping, there are some additional clocking guidelines that can help identify and resolve issues:
- Monitor supply noise, jitter, and frequencies
- Start at low clock speeds and scale up gradually
- Observe clock waveforms for noise or coupling issues
- Measure timing margins through CDC logic
Following structured bring-up procedures will help catch any board-level issues impacting clocks before hitting timing failures.
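Frequency and jitter can be estimated from rising-edge timestamps exported from an oscilloscope or logic analyzer. The sketch below is a rough post-processing aid under that assumption; `period_stats` is a hypothetical helper, the sample timestamps are made up for illustration, and the result is limited by the timing resolution of the instrument.

```python
import statistics


def period_stats(edge_times_ns):
    """Summarize clock periods from consecutive rising-edge timestamps (ns)."""
    periods = [b - a for a, b in zip(edge_times_ns, edge_times_ns[1:])]
    mean_period = statistics.fmean(periods)
    return {
        "mean_period_ns": mean_period,
        "freq_mhz": 1000.0 / mean_period,
        "pk_pk_jitter_ns": max(periods) - min(periods),  # period jitter, peak to peak
        "rms_jitter_ns": statistics.pstdev(periods),      # period jitter, RMS
    }


# Illustrative timestamps for a nominal 50 MHz clock with one early edge.
stats = period_stats([0.0, 19.9, 40.0, 60.0, 80.0])
for key, value in stats.items():
    print(f"{key}: {value:.3f}")
```

Repeating the measurement at each step while scaling the clock up makes it easier to see when jitter starts to consume the timing margin.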
Conclusion
Designing optimized clocking schemes is key to realizing the maximum performance capabilities of the Cortex-M0 DesignStart. By following the recommendations outlined here on clock sources, network resources, constraints, and design practices, a robust implementation can be achieved.
The most critical guidelines are using clean PLL/MMCM sources, minimizing jitter and skew, constraining CDC paths, and validating through rigorous simulation. Following these steps will help the clocking design converge and support a successful FPGA implementation of the Cortex-M0 DesignStart processor.