Implementing an ARM Cortex-M0 processor on an FPGA can be an excellent way to prototype and test an embedded system design before committing to an ASIC or production FPGA. The Cortex-M0 DesignStart model provides a basic Cortex-M0 core in synthesizable RTL that can be targeted to many different FPGA devices. This article walks through the full implementation process, from obtaining the DesignStart files to running code on the processor in simulation and on FPGA hardware.
Overview of the Cortex-M0 Processor
The Cortex-M0 is ARM’s smallest and simplest 32-bit microcontroller CPU. It is meant for deeply embedded applications that require an inexpensive yet modern 32-bit processor. The M0 has a 3-stage pipeline, a single-cycle ALU, and an integrated nested vectored interrupt controller (NVIC). The full core can be configured with a SysTick timer, sleep modes, and basic debug capabilities (a small number of hardware breakpoints and watchpoints). It does not include a memory protection unit, bit-banding, or a Flash Patch and Breakpoint (FPB) unit; those features belong to other Cortex-M cores, such as the Cortex-M0+ (MPU) and the Cortex-M3/M4 (bit-banding, FPB). The DesignStart model is a fixed, reduced configuration that keeps only the minimum necessary to implement the core in RTL.
Some key specs of the Cortex-M0 core:
- 32-bit RISC architecture, 3-stage pipeline
- Roughly 0.9 DMIPS/MHz, or about 45 DMIPS at 50 MHz
- ARMv6-M architecture
- 32-bit registers, 32-bit memory addressing
- Thumb instruction set (16-bit Thumb plus a small subset of 32-bit Thumb-2 instructions)
- Hardware multiply (no hardware divide)
- SysTick timer
- Nested Vectored Interrupt Controller
- Sleep and wake-up modes
- Optional breakpoint and watchpoint units for debugging
Obtaining the DesignStart Cortex-M0
The Cortex-M0 DesignStart model is available free of charge from ARM, but you must register for a license. Go to www.arm.com/designstart and follow the steps to register and request access to the Cortex-M0 RTL. ARM will review the request and grant access through the DesignStart portal. The license allows use of the RTL for non-commercial engineering prototyping purposes.
Once access is granted, download the .zip file containing the DesignStart Cortex-M0 RTL from the portal. This will provide the Verilog RTL code along with documentation, integration guides, and example testbenches.
Constraints and Tools Setup
The DesignStart RTL itself is not tied to one device; this article targets the Altera (Intel) Cyclone IV GX, so constraints and settings files for that device are needed. Download a Quartus Prime Lite release that supports Cyclone IV from Intel’s website, along with the Cyclone IV FPGA development kit files.
You will need the Quartus synthesis and place-and-route software, an HDL simulator, and the Platform Designer system integration tool. Quartus Prime Lite, which includes Platform Designer, is free of charge, and Intel provides free starter editions of the simulator (ModelSim-Intel FPGA Starter Edition in older releases, Questa-Intel FPGA Starter Edition in newer ones). Install them on your system and familiarize yourself with the basics of FPGA design if you have not worked with them before.
Top Level System Integration
With the tools set up, we can now integrate the Cortex-M0 RTL into a top-level FPGA design. This consists of instantiating the processor core and adding the necessary peripherals, memories, interconnect, and I/O.
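Platform Designer provides predefined wrappers and interconnects to speed up system integration. Create a new Platform Designer system with a Cyclone IV GX target. The Cortex-M0 is not part of the default IP catalog, so first package the DesignStart RTL as a custom component that exposes its AHB-Lite master interface, then add it to the system. Depending on the interconnect you use, a bridge from AHB-Lite may be needed. Add a JTAG UART module for console I/O. Add a timer, GPIO, and connections to external DDR2 memory.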
Configure the memory maps, IRQ assignments, clock rates, and other system-level settings. Generate the Platform Designer interconnect logic and top level synthesis files.
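One way to sanity-check a memory map before generating the interconnect is to list each region and look for overlaps and misaligned bases. The sketch below does this in Python. The region names, base addresses, and sizes are illustrative assumptions loosely following the usual Cortex-M address layout; they are not values from the DesignStart package or from Platform Designer. The power-of-two alignment rule is also an assumption that you may need to relax.

```python
from itertools import combinations
from typing import NamedTuple


class Region(NamedTuple):
    name: str
    base: int  # first byte address
    size: int  # length in bytes

    @property
    def end(self) -> int:
        """Address one past the last byte of the region."""
        return self.base + self.size


def find_problems(regions):
    """Return messages describing misaligned or overlapping regions."""
    problems = []
    for r in regions:
        # Assumed rule: power-of-two sizes with size-aligned bases keep
        # address decoding simple. Relax this if your interconnect allows it.
        if r.size <= 0 or r.size & (r.size - 1):
            problems.append(f"{r.name}: size is not a power of two")
        elif r.base % r.size:
            problems.append(f"{r.name}: base is not aligned to size")
    # Compare every pair so no overlap is missed.
    for a, b in combinations(regions, 2):
        if a.base < b.end and b.base < a.end:
            problems.append(f"{a.name} overlaps {b.name}")
    return problems


# Hypothetical memory map for illustration only.
EXAMPLE_MAP = [
    Region("onchip_rom", 0x0000_0000, 0x4000),
    Region("onchip_ram", 0x2000_0000, 0x4000),
    Region("gpio", 0x4000_0000, 0x1000),
    Region("timer", 0x4000_1000, 0x1000),
    Region("jtag_uart", 0x4000_2000, 0x1000),
    Region("ddr2", 0x6000_0000, 0x0400_0000),
]

if __name__ == "__main__":
    for message in find_problems(EXAMPLE_MAP) or ["memory map looks consistent"]:
        print(message)
```

Keeping the map in one list like this also gives you a single place to copy addresses from when writing firmware headers.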
Simulation Testbench
With the top-level design complete, create a basic self-checking testbench to simulate the system. Instantiate the top level and initialize the JTAG UART, timers, memories, and other interfaces. Initialize any firmware code into the instruction and data memories.
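A common way to initialize firmware into a simulated memory is to convert the firmware binary into a text file of hex words that the memory model loads with Verilog's `$readmemh`. The sketch below shows one possible conversion in Python. It assumes a raw little-endian binary (for example, one produced by `arm-none-eabi-objcopy -O binary`), a 32-bit-wide memory, and a memory model that reads one word per line; the function name `bin_to_readmemh` is hypothetical, and your testbench may expect a different format.

```python
def bin_to_readmemh(image: bytes, word_bytes: int = 4) -> str:
    """Convert a little-endian binary image to one hex word per line."""
    # Pad the image with zeros up to a whole number of words.
    image += b"\x00" * ((-len(image)) % word_bytes)
    lines = []
    for offset in range(0, len(image), word_bytes):
        word = int.from_bytes(image[offset:offset + word_bytes], "little")
        lines.append(f"{word:0{word_bytes * 2}x}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Stand-in image: initial stack pointer 0x20004000, reset vector 0x00000009
    # (handler at offset 8, Thumb bit set), then a Thumb "b ." instruction.
    demo = bytes.fromhex("00400020" + "09000000" + "fee7")
    print(bin_to_readmemh(demo), end="")
```

In practice you would read the binary from disk with `open(path, "rb").read()` and write the returned string to the file named in the testbench's `$readmemh` call.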
The ARM-provided testbench can be used as a starting point. It includes clocks, resets, and default connections. Modify the testbench to match your Platform Designer system integration. Compile the simulation using ModelSim and load the design.
Running Code in Simulation
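The Cortex-M0 starts executing as soon as reset is released; it is held halted only if a debugger requests it. Assert and then release the reset signal in the testbench. The core first reads the initial stack pointer from address 0x00000000 and the reset vector from address 0x00000004, then branches to the reset handler. Watch the bus address and read-data signals in the ModelSim waveform viewer to confirm the reset vector is being fetched and executed. Note that the JTAG UART is a serial console, not a debug port, so halt, resume, and single-step operations go through the core's debug interface and require a configuration that includes debug.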
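Before looking for the reset vector fetch in simulation, it can help to confirm that the firmware image holds a plausible vector table. On ARMv6-M, the first word of the table is the initial main stack pointer and the second is the reset handler address, which must have bit 0 set to indicate Thumb state. The sketch below decodes those two words and checks them against ROM and RAM ranges. The function name and the address ranges used in the example are hypothetical.

```python
def check_vector_table(image: bytes, rom_base: int, rom_size: int,
                       ram_base: int, ram_size: int):
    """Decode the first two vector table words and report obvious problems.

    Returns (initial_sp, handler_address, problems).
    """
    if len(image) < 8:
        raise ValueError("image too short to hold SP and reset vector")
    initial_sp = int.from_bytes(image[0:4], "little")
    reset_vector = int.from_bytes(image[4:8], "little")
    handler = reset_vector & ~1  # strip the Thumb bit to get the address
    problems = []
    # The stack grows downward, so the initial SP may equal the top of RAM.
    if not ram_base < initial_sp <= ram_base + ram_size:
        problems.append(f"initial SP 0x{initial_sp:08x} is outside RAM")
    if initial_sp % 4:
        problems.append(f"initial SP 0x{initial_sp:08x} is not word-aligned")
    if not reset_vector & 1:
        problems.append("reset vector has bit 0 clear (not Thumb)")
    if not rom_base <= handler < rom_base + rom_size:
        problems.append(f"reset handler 0x{handler:08x} is outside ROM")
    return initial_sp, handler, problems


if __name__ == "__main__":
    # Hypothetical 16 KiB ROM at 0x00000000 and 16 KiB RAM at 0x20000000.
    demo = bytes.fromhex("0040002009000000fee7")
    sp, handler, problems = check_vector_table(
        demo, 0x0000_0000, 0x4000, 0x2000_0000, 0x4000)
    print(f"SP=0x{sp:08x} handler=0x{handler:08x}")
    for message in problems:
        print("problem:", message)
```

The decoded handler address is also the first instruction fetch address to look for on the bus after the two vector reads.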
Load a simple firmware program into the instruction memory that toggles GPIO pins in a loop. Resume simulation and confirm the GPIO outputs respond as expected. If your configuration includes debug, features such as breakpoints can also be tested through the debug interface.
With basic code execution confirmed, simulate any benchmark code or applications you aim to run on the hardware. Identify any issues with peripherals or interconnect early, before implementing the design on the FPGA.
FPGA Synthesis and Implementation
The Cortex-M0 DesignStart model is designed to synthesize efficiently for FPGAs. However, proper constraints are still required to achieve timing closure. Apply the recommended timing constraints from the ARM documentation, expressed as SDC files for Quartus.
Run through synthesis and place-and-route. The goal is to hit at least 50 MHz fMAX. If timing is not met, review the failing paths in the timing report and adjust constraints and optimization settings to improve placement and routing. The ARM implementation guide provides suggestions for optimization directives as well.
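The 50 MHz target corresponds to a 20 ns clock period, which is the value a clock constraint would carry (for example, an SDC `create_clock` with `-period 20.000` on your clock port). The sketch below shows the arithmetic for turning a reported worst-case setup slack into an estimated fMAX. It assumes a single clock domain where launch and latch clocks are the same; the function names and the slack values in the example are hypothetical, not results from a real compile.

```python
def period_ns(freq_mhz: float) -> float:
    """Clock period in nanoseconds for a frequency in MHz."""
    return 1000.0 / freq_mhz


def fmax_from_slack(target_mhz: float, worst_setup_slack_ns: float) -> float:
    """Estimate fMAX from the worst setup slack at the target frequency.

    Negative slack means the critical path is longer than the period.
    """
    critical_path_ns = period_ns(target_mhz) - worst_setup_slack_ns
    if critical_path_ns <= 0:
        raise ValueError("slack is not smaller than the clock period")
    return 1000.0 / critical_path_ns


if __name__ == "__main__":
    print(f"target period: {period_ns(50.0):.3f} ns")
    for slack in (-2.5, 5.0):  # hypothetical slack values in ns
        print(f"slack {slack:+.1f} ns -> fMAX ~ {fmax_from_slack(50.0, slack):.2f} MHz")
```

A negative slack of 2.5 ns at 50 MHz means the critical path is 22.5 ns long, so the design would need to run closer to 44 MHz or be optimized further.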
Examine the resource utilization reports after implementation. The Cortex-M0 should take under 20k LEs, which is well under half the capacity of the larger Cyclone IV GX devices (the family ranges from roughly 15k to 150k LEs, so check the figure against your specific part). If utilization is substantially higher, ensure optimization options are enabled in the Quartus project settings.
Debug and Validation on FPGA
With timing closure achieved, the design can be tested on real FPGA hardware. Program the Cyclone IV development board with the compiled programming file (.sof) generated by Quartus. Connect to the onboard JTAG interface.
Issue a reset and initialize registers and memories as needed. Load firmware code as in simulation and, if debug is available, single-step through instructions. Run application code and debug any crashes or other issues. Use the Signal Tap logic analyzer if needed to trace internal signals, inputs, and outputs.
Benchmark performance against simulation by running code and measuring cycles. Characterize power consumption of the processor subsystem. Identify any discrepancies versus simulation results.
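One possible way to measure cycles on hardware is to read the SysTick current-value register before and after the code under test; SysTick is a 24-bit down-counter, so the elapsed count has to be taken modulo 2^24. The sketch below shows that arithmetic and a simple comparison against a simulated cycle count. It assumes the reload value is 0xFFFFFF and that the measured interval is shorter than 2^24 cycles; the function names and the example numbers are hypothetical.

```python
SYSTICK_MASK = (1 << 24) - 1  # SysTick is a 24-bit down-counter


def elapsed_cycles(start: int, end: int) -> int:
    """Cycles elapsed between two SysTick reads, allowing one wrap.

    Assumes a reload value of 0xFFFFFF and an interval below 2**24 cycles.
    """
    return (start - end) & SYSTICK_MASK


def compare_runs(sim_cycles: int, hw_cycles: int, clock_mhz: float) -> dict:
    """Summarize the difference between simulated and measured cycle counts."""
    delta = hw_cycles - sim_cycles
    return {
        "delta_cycles": delta,
        "delta_percent": 100.0 * delta / sim_cycles,
        "hw_time_us": hw_cycles / clock_mhz,  # cycles / (cycles per microsecond)
    }


if __name__ == "__main__":
    # Hypothetical SysTick readings taken before and after a benchmark loop.
    hw = elapsed_cycles(start=0x00F000, end=0x00C7F6)
    print(compare_runs(sim_cycles=10000, hw_cycles=hw, clock_mhz=50.0))
```

A small, consistent difference often points to memory wait states that the simulation model does not reproduce, while a large one is worth tracing with a logic analyzer.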
Conclusion
Successfully implementing the Cortex-M0 DesignStart model onto an Altera FPGA requires attention to detail in the integration, simulation, synthesis, and verification phases. Following ARM’s guidelines and adjusting the flows as needed enables the construction of a prototype embedded system showcasing the Cortex-M0 processor’s capabilities.
The experience gained working through this implementation provides valuable insight into the hardware needs and customizations required to utilize the Cortex-M0 in a production design scenario. With a functioning prototype on FPGA, the Cortex-M0 RTL and software can be fine-tuned to meet requirements before taping out any custom ASIC or production FPGA release.