The Cortex-M0 DesignStart core from ARM is an ultra-low-power 32-bit RISC processor whose small gate count makes it a good fit for low-cost, small-footprint FPGA implementations. By running the DesignStart core on a low-cost FPGA, developers can quickly prototype an embedded system and evaluate Cortex-M0 performance for their application. This article examines the logic utilization and performance trade-offs of targeting different low-cost FPGAs such as the Xilinx Spartan-6 and Intel/Altera Cyclone IV devices.
Overview of the Cortex-M0 DesignStart Core
The Cortex-M0 DesignStart core is a pre-verified 32-bit RISC processor implementing the ARMv6-M architecture. It is an ultra-low-power core with a small gate count, which suits FPGA implementation. The core includes the Cortex-M0 CPU, the nested vectored interrupt controller (NVIC), and an AMBA AHB-Lite bus interface. The Cortex-M0 itself does not provide an embedded trace macrocell (ETM) or a memory protection unit (MPU); an optional MPU first appears in the Cortex-M0+.
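For FPGA implementation, the DesignStart core is delivered as obfuscated Verilog RTL rather than readable source code. The Cortex-M0 can also be implemented in ASIC flows under a full license. The DesignStart deliverable has a largely fixed configuration and exposes a single AHB-Lite master interface; the Cortex-M0 has no AXI option, so any AXI peripherals need a bridge in the surrounding logic. Clock generation and distribution are left to the integrator, which leaves various clocking options open.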
The small silicon footprint and low power consumption make the Cortex-M0 well suited for FPGA implementation, especially on low-cost devices. Typical applications include motor control, industrial automation, IoT edge nodes, and embedded vision. The core can also be extended by integrating custom logic in the FPGA.
Logic Utilization on Spartan-6 FPGAs
The Xilinx Spartan-6 FPGA is a low-cost option that works well with the Cortex-M0 DesignStart core. The Spartan-6 family features densities ranging from 6K to 147K logic cells, with power consumption as low as 0.141W.
Here are some example implementation results for the Cortex-M0 on select Spartan-6 devices:
- Spartan-6 LX25: Cortex-M0 uses ~1650 LCs out of 15K available LCs (~11% utilization)
- Spartan-6 LX75: Cortex-M0 uses ~1650 LCs out of 35K available LCs (~5% utilization)
- Spartan-6 LX150: Cortex-M0 uses ~1650 LCs out of 85K available LCs (~2% utilization)
This shows that even modest Spartan-6 devices can fit the Cortex-M0 with plenty of margin. Most of the fabric remains free for custom peripherals, accelerators, memory interfaces, and other glue logic.
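The utilization arithmetic behind these figures can be reproduced with a short script. This is a minimal sketch that uses only the approximate numbers quoted in this article; the function names `utilization_percent` and `headroom` are illustrative and not part of any vendor tool.

```python
# Approximate Cortex-M0 footprint on Spartan-6, as quoted in this article.
CORTEX_M0_LCS = 1650

# Available logic per device, as listed in this article (approximate).
SPARTAN6_CAPACITY = {
    "LX25": 15_000,
    "LX75": 35_000,
    "LX150": 85_000,
}


def utilization_percent(used: int, available: int) -> float:
    """Return the share of device logic consumed, in percent."""
    if available <= 0:
        raise ValueError("available must be positive")
    return 100.0 * used / available


def headroom(used: int, available: int) -> int:
    """Return the logic left over for peripherals and accelerators."""
    return available - used


for device, capacity in SPARTAN6_CAPACITY.items():
    pct = utilization_percent(CORTEX_M0_LCS, capacity)
    free = headroom(CORTEX_M0_LCS, capacity)
    print(f"{device}: {pct:.1f}% used, {free} LCs free")
```

The same two functions apply unchanged to the Cyclone IV figures by substituting LE counts for LC counts.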
The Cortex-M0 achieves 66 – 100 MHz in the Spartan-6 devices. The maximum clock frequency scales up in the larger devices. Power consumption ranges from 0.15W – 0.30W depending on target frequency and device size.
Logic Utilization on Cyclone IV FPGAs
For Intel/Altera FPGA users, the low-cost Cyclone IV series also works well with the Cortex-M0 DesignStart core. Cyclone IV densities range from 8K to 149K LEs, with power as low as 0.25W.
Here are sample results for the Cortex-M0 implemented in Cyclone IV FPGAs:
- Cyclone IV E 25: Cortex-M0 uses ~1600 LEs out of 25K available (~6% utilization)
- Cyclone IV E 75: Cortex-M0 uses ~1600 LEs out of 75K available (~2% utilization)
- Cyclone IV GX 150: Cortex-M0 uses ~1600 LEs out of 149K available (~1% utilization)
Again, the small CPU footprint leaves ample room for additional logic functions around the Cortex-M0. Cyclone IV FPGAs can achieve up to 100 MHz operation with the Cortex-M0 DesignStart core.
Optimizing the Design for Specific FPGAs
To maximize clock frequency and minimize resource usage, developers can tune the Cortex-M0 system for their target FPGA device. This includes:
- Trimming unneeded interfaces and functions
- Modifying memory interfaces for optimal resource usage
- Tuning clock constraints and clock logic for timing closure
- Setting FPGA-specific synthesis and place-and-route optimization options
With optimization, it may be possible to fit the Cortex-M0 into even smaller low-cost FPGAs. The core clock speed can also be increased to approach the FPGA’s maximum frequency.
However, aggressive optimization requires detailed knowledge of the target device architecture and timing models. It may require custom RTL modifications. For many applications, the baseline Cortex-M0 implementation will be adequate when using sufficiently sized, low-cost FPGAs.
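When tuning clock constraints, it helps to convert between target frequency, clock period, and timing slack. The sketch below shows that arithmetic; the 12.5 ns critical path is a hypothetical value chosen for illustration, not a measured result for either FPGA family, and the function names are assumptions of this example.

```python
def clock_period_ns(freq_mhz: float) -> float:
    """Convert a clock frequency in MHz to a period in nanoseconds."""
    if freq_mhz <= 0:
        raise ValueError("freq_mhz must be positive")
    return 1000.0 / freq_mhz


def max_frequency_mhz(critical_path_ns: float) -> float:
    """Highest clock frequency a given critical-path delay can support."""
    if critical_path_ns <= 0:
        raise ValueError("critical_path_ns must be positive")
    return 1000.0 / critical_path_ns


def slack_ns(target_mhz: float, critical_path_ns: float) -> float:
    """Positive slack means timing is met; negative means it is violated."""
    return clock_period_ns(target_mhz) - critical_path_ns


# Hypothetical critical path reported by a timing analyzer.
CRITICAL_PATH_NS = 12.5

# Check the two ends of the 66-100 MHz range quoted for Spartan-6.
for target in (66.0, 100.0):
    slack = slack_ns(target, CRITICAL_PATH_NS)
    status = "met" if slack >= 0 else "violated"
    print(f"{target:.0f} MHz: slack {slack:+.2f} ns, timing {status}")
```

With this hypothetical path delay, the design would close timing at 66 MHz but not at 100 MHz, which is the kind of gap that constraint tuning or RTL changes would need to close.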
Integrating Custom Logic with the Cortex-M0
A major benefit of implementing the Cortex-M0 in an FPGA is the ability to integrate custom logic for hardware acceleration. Some examples include:
- Custom peripherals – Application-specific peripherals and I/O devices can be designed to interface with the Cortex-M0 AHB-Lite bus, either directly or through a bridge to another bus such as APB or AXI.
- Hardware accelerators – Compute-intensive functions can be migrated to custom hardware blocks in the FPGA for higher performance and power efficiency.
- Video and image processing – Image sensors and video pipelines can be implemented directly in the FPGA logic.
- Floating point units – Floating point accelerators can offload intensive math from the Cortex-M0.
With a single integrated FPGA design, the flexibility of software running on the Cortex-M0 is combined with the efficiency of custom hardware accelerators tuned for the application.
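Custom blocks become visible to software through the address map that the bus decoder implements. The following behavioral sketch models that decode step in Python. The region names, base addresses, and sizes are hypothetical; the only architectural assumption is the ARMv6-M convention that code starts at 0x00000000, SRAM at 0x20000000, and peripherals at 0x40000000.

```python
from typing import NamedTuple, Optional


class Region(NamedTuple):
    name: str
    base: int
    size: int


# Hypothetical address map for a small Cortex-M0 system in an FPGA.
ADDRESS_MAP = [
    Region("code_ram", 0x0000_0000, 0x0001_0000),  # 64 KB block RAM
    Region("data_ram", 0x2000_0000, 0x0001_0000),  # 64 KB block RAM
    Region("gpio", 0x4000_0000, 0x1000),           # custom peripheral
    Region("accel", 0x4000_1000, 0x1000),          # custom accelerator
]


def decode(addr: int) -> Optional[str]:
    """Return the name of the region selected by addr, or None if unmapped."""
    for region in ADDRESS_MAP:
        if region.base <= addr < region.base + region.size:
            return region.name
    return None


# A register at offset 0x4 inside the accelerator's window.
print(decode(0x4000_1004))
```

In hardware the same comparison is usually done on the upper address bits only, so aligning each region to a power-of-two boundary keeps the decoder small.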
Conclusion
The Cortex-M0 DesignStart core lets developers exploit the low cost and flexibility of FPGAs for embedded applications. The best results come from selecting a suitably sized low-cost FPGA family and tuning the system around the core for that device. Integrating custom logic in the FPGA alongside the processor makes it possible to tailor the hardware architecture to demanding embedded workloads.