The bootloader is a crucial software component in ARM Cortex-based systems. It is responsible for initializing the hardware, setting up the environment for the operating system, and eventually loading the OS kernel. Understanding how the boot process works clarifies the startup sequence and shows how to configure the bootloader for different needs.
Overview of the Boot Process
Upon power-on or reset, the ARM Cortex CPU begins executing code from a pre-configured address in internal ROM or flash memory. This is the start of the first-stage bootloader, which initializes the RAM, clocks, and low-level peripherals. It then looks for the second-stage bootloader in external flash or on an SD card, copies it to RAM, and hands over control. The second-stage bootloader initializes higher-level peripherals, locates the OS image, verifies its integrity, and loads it into RAM. Finally, it puts the CPU into the state the OS expects (for Linux on ARM, this means the MMU and data cache are turned off) and jumps to the OS entry point to begin execution.
The ARM Cortex CPUs have configurable boot addresses to support booting from different sources:
- Internal ROM – Many Cortex-A SoCs and some Cortex-M microcontrollers include a built-in ROM bootloader that runs on power-on or when selected by boot pins.
- Internal flash – Cortex-M microcontrollers typically boot directly from on-chip flash, taking the initial stack pointer and reset vector from the vector table.
- External flash – Cortex-A SoCs commonly boot from external QSPI NOR, parallel NOR, or NAND flash chips.
- SD card and eMMC – Managed NAND flash such as eMMC and SD cards needs a more complex bootloader with a block-device driver.
- Serial ROM – Some SoCs use a small SPI ROM chip that holds the first-stage bootloader.
- Network – Ethernet bootloader allows booting Linux images over the network.
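As a rough illustration of how a configurable boot source might be chosen, the sketch below models a boot ROM that samples boot-mode pins and tries sources in a fallback order. The pin encoding, the source names, and the `probe` callback are all hypothetical; a real SoC defines its own strap table in its reference manual.

```python
from typing import Callable, Optional

# Hypothetical strap encoding: two boot-mode pins select an ordered list of sources.
BOOT_ORDER = {
    0b00: ["internal_flash"],
    0b01: ["qspi_nor", "sd_card"],
    0b10: ["sd_card", "emmc"],
    0b11: ["network", "uart"],
}

def select_boot_source(pins: int, probe: Callable[[str], bool]) -> Optional[str]:
    """Return the first source in the strap-selected order that probes successfully."""
    for source in BOOT_ORDER[pins & 0b11]:
        if probe(source):
            return source
    # Nothing bootable was found; a real ROM might enter a recovery mode here.
    return None
```

For example, `select_boot_source(0b10, lambda s: s == "emmc")` skips the missing SD card and returns `"emmc"`.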
Stages of the Bootloader
The boot process involves one or more bootloader stages, each handing control to the next.
The primary responsibilities of the first-stage bootloader include:
- Minimal initialization of core clocks, RAM, and flash memory interface.
- Set up the stack pointer for the processor mode it executes in (for example, Supervisor mode on ARMv7-A or Thread mode on Cortex-M).
- Copy the second-stage bootloader from flash to RAM.
- Hand over control to the second-stage bootloader.
As the first code to run, it must be kept simple and robust. On Cortex-M microcontrollers, a vendor ROM bootloader or a small bootloader in on-chip flash performs this role. On Cortex-A application processors, U-Boot SPL or ARM Trusted Firmware BL1 acts as the first-stage bootloader, usually started by the SoC's boot ROM.
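On Cortex-M, the stack pointer setup and handover are driven by the vector table: its first word holds the initial main stack pointer and its second word holds the reset handler address, with bit 0 set to indicate Thumb state. The sketch below is a host-side check of those two words before a bootloader would jump to an image. The function names and the RAM bounds passed in are illustrative assumptions, not part of any vendor API.

```python
import struct
from typing import Tuple

def parse_vector_table(image: bytes) -> Tuple[int, int]:
    """Return (initial_sp, reset_vector) from the first 8 bytes of a Cortex-M image."""
    if len(image) < 8:
        raise ValueError("image too short to hold a vector table")
    # Cortex-M vector tables are little-endian 32-bit words.
    initial_sp, reset_vector = struct.unpack_from("<II", image, 0)
    if reset_vector & 1 == 0:
        raise ValueError("reset vector must have bit 0 set (Thumb state)")
    return initial_sp, reset_vector

def looks_bootable(image: bytes, ram_start: int, ram_end: int) -> bool:
    """Sanity check: the initial stack pointer should be word-aligned and inside RAM."""
    try:
        sp, _ = parse_vector_table(image)
    except ValueError:
        return False
    return ram_start < sp <= ram_end and sp % 4 == 0
```

A bootloader that jumps to an application in flash performs the same two reads on the target, loading the first word into the main stack pointer and branching to the second.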
The second-stage bootloader, executed from RAM, carries out more complex initialization tasks:
- Fully initialize RAM and clock modules.
- Configure board-specific pinmux and IO modules.
- Detect supported storage and interfaces like USB, Ethernet, etc.
- Locate, verify, and load the OS image from boot device to RAM.
- Pass control to the OS image.
On ARM Cortex-A systems, U-Boot proper is the most common second-stage bootloader. ARM Trusted Firmware BL2 also serves this purpose in some cases.
Optional Third-Stage Bootloader
An additional bootloader stage may exist to load the OS image from network or modular boot partitions:
- Tertiary bootloader – Ethernet bootloaders like Etherboot/gPXE can download a system image using BOOTP/TFTP.
- Additional firmware images – On ARMv8-A SoCs, ARM Trusted Firmware can also load an EL3 runtime monitor (BL31) and an optional Trusted OS (BL32) that runs in the Secure world.
The main tasks carried out by a typical ARM Cortex bootloader are:
1. Processor Initialization
The bootloader brings the CPU into a known state and switches to the desired exception level:
- Enable instruction and data caches to improve code execution speed.
- Initialize on-chip RAM and set up stack pointers.
- Configure the Memory Protection Unit (MPU) or MMU for memory access control.
- Set the Exception Level (EL) on ARMv8-A Cortex-A processors for the desired privilege.
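To make the exception-level step concrete: on AArch64, the current level can be read from the `CurrentEL` system register, where bits [3:2] hold the EL number. The sketch below decodes such a value on the host and decides whether a drop to a lower level is needed. The helper names are illustrative, and the actual transition (programming the saved state and executing `ERET`) is done in assembly on the target and is not shown.

```python
def decode_current_el(current_el_reg: int) -> int:
    """Extract the exception level (0-3) from a CurrentEL register value."""
    return (current_el_reg >> 2) & 0b11

def needs_el_drop(current_el_reg: int, target_el: int) -> bool:
    """True if the CPU is running at a higher EL than the one the next stage expects."""
    return decode_current_el(current_el_reg) > target_el
```

For instance, a `CurrentEL` value of `0xC` decodes to EL3, so a bootloader handing off to a kernel that expects EL2 or EL1 would need to lower the exception level first.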
2. Hardware Initialization
The SoC platform hardware must be initialized before loading the OS:
- Clock modules – Configure CPU and peripheral clocks.
- Power management – Bring the SoC and the power domains needed for boot into normal run mode.
- External buses – Set up interface timings for RAM, flash, etc.
- Board-specific I/O – Pinmux configuration and GPIO access.
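Clock configuration usually comes down to choosing PLL multiplier and divider values for a target frequency. The sketch below assumes a generic PLL whose output is `ref * mult / (prediv * postdiv)` and does a brute-force search for the closest match. The formula, the parameter ranges, and the function names are assumptions for illustration; real PLLs have device-specific formulas and add constraints such as a valid VCO range.

```python
from typing import Tuple

def pll_output_hz(ref_hz: int, mult: int, prediv: int, postdiv: int) -> int:
    """Output frequency of the assumed generic PLL (integer arithmetic)."""
    return ref_hz * mult // (prediv * postdiv)

def find_pll_settings(ref_hz: int, target_hz: int, max_mult: int = 255,
                      max_prediv: int = 8,
                      postdivs: Tuple[int, ...] = (1, 2, 4, 8)) -> Tuple[int, int, int]:
    """Return (mult, prediv, postdiv) whose output is closest to target_hz."""
    best, best_err = (1, 1, 1), None
    for prediv in range(1, max_prediv + 1):
        for postdiv in postdivs:
            for mult in range(1, max_mult + 1):
                err = abs(pll_output_hz(ref_hz, mult, prediv, postdiv) - target_hz)
                # Strict comparison keeps the first of several equally good settings.
                if best_err is None or err < best_err:
                    best, best_err = (mult, prediv, postdiv), err
    return best
```

With a 24 MHz reference and a 600 MHz target, the search settles on a multiplier of 25 with both dividers at 1.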
3. Storage Initialization
The boot source storage medium needs to be identified and initialized:
- Boot device identification – Detect attached storage devices.
- Flash initialization – Set up flash memory access timings and drivers.
- Storage drivers – Activate SD/eMMC, SATA, USB drivers as needed.
4. OS Image Loading
Locating, verifying and loading the OS image is the primary purpose of the bootloader:
- Partition parsing – Understand storage partitioning to find boot partitions.
- Image formats – Decode boot image formats such as FIT (Flattened Image Tree), used by U-Boot.
- Verification – Authenticate and check integrity of the OS image.
- Decompression – Uncompress gzipped kernel and ramdisk images.
- Relocation – Copy the kernel, ramdisk, and other images to their load addresses in RAM.
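The locate, verify, and decompress steps can be sketched on the host. Because FIT parsing needs a device-tree library, this example substitutes U-Boot's simpler legacy uImage format: a 64-byte big-endian header carrying a magic number, load and entry addresses, a compression code, and CRC32 values for the header and the payload. The function and class names are illustrative. Note that CRC32 only detects corruption; it does not authenticate the image the way signature verification does.

```python
import gzip
import struct
import zlib
from typing import NamedTuple

UIMAGE_MAGIC = 0x27051956
HEADER_FMT = ">7I4B32s"  # 7 words, 4 single-byte codes, 32-byte name (big-endian)
HEADER_SIZE = struct.calcsize(HEADER_FMT)  # 64 bytes
IH_COMP_NONE, IH_COMP_GZIP = 0, 1

class ImageInfo(NamedTuple):
    load_addr: int
    entry_point: int
    compression: int
    name: str
    payload: bytes

def parse_uimage(blob: bytes) -> ImageInfo:
    """Validate a legacy uImage and return its header fields and payload."""
    if len(blob) < HEADER_SIZE:
        raise ValueError("truncated header")
    (magic, hcrc, _time, size, load, entry, dcrc,
     _os, _arch, _type, comp, name) = struct.unpack_from(HEADER_FMT, blob)
    if magic != UIMAGE_MAGIC:
        raise ValueError("bad magic")
    # The header CRC is computed with the CRC field itself set to zero.
    header = bytearray(blob[:HEADER_SIZE])
    header[4:8] = b"\x00" * 4
    if zlib.crc32(bytes(header)) != hcrc:
        raise ValueError("header CRC mismatch")
    payload = blob[HEADER_SIZE:HEADER_SIZE + size]
    if len(payload) != size or zlib.crc32(payload) != dcrc:
        raise ValueError("data CRC mismatch")
    return ImageInfo(load, entry, comp,
                     name.rstrip(b"\x00").decode("ascii", "replace"), payload)

def unpack_payload(info: ImageInfo) -> bytes:
    """Return the payload bytes that would be placed at info.load_addr."""
    if info.compression == IH_COMP_GZIP:
        return gzip.decompress(info.payload)
    if info.compression == IH_COMP_NONE:
        return info.payload
    raise ValueError("unsupported compression in this sketch")
```

A bootloader on the target performs the same checks before copying the payload to its load address, and refuses to boot if any of them fail.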
5. Handoff to OS
For starting the loaded OS, the bootloader must:
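- Pass boot parameters – Provide command line arguments and the ramdisk address, typically through a device tree blob (or an ATAGS list on older 32-bit ARM kernels).
- Set up boot registers – Load the registers the OS expects. For 32-bit ARM Linux, r0 is 0, r1 is the machine type, and r2 is the address of the device tree blob or ATAGS list. On Cortex-M, load the main stack pointer and the reset handler address from the application's vector table.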
- Flush caches – Clean and invalidate cache contents before boot.
- Jump to entry point – Transfer control to OS entry point.
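As a sketch of the register setup for a Linux handoff, the function below returns the values the kernel's documented boot protocol expects: on 32-bit ARM, r0 = 0, r1 = machine type (all ones for device-tree-only platforms), and r2 = the device tree address; on AArch64, x0 = the device tree address and x1 to x3 = 0, with the device tree on an 8-byte boundary. The function name and the dictionary representation are illustrative; on the target these values are written by a few assembly instructions just before the jump.

```python
from typing import Dict

def linux_handoff_registers(arch: str, dtb_addr: int,
                            machine_id: int = 0xFFFFFFFF) -> Dict[str, int]:
    """Return the register values to set before jumping to a Linux kernel."""
    if arch == "arm":
        # 32-bit ARM: r0 = 0, r1 = machine type, r2 = DTB (or ATAGS) address.
        return {"r0": 0, "r1": machine_id, "r2": dtb_addr}
    if arch == "arm64":
        if dtb_addr % 8 != 0:
            raise ValueError("device tree blob must be 8-byte aligned on arm64")
        # AArch64: x0 = DTB address, x1-x3 reserved and must be zero.
        return {"x0": dtb_addr, "x1": 0, "x2": 0, "x3": 0}
    raise ValueError("unknown architecture: " + arch)
```

The same protocol also requires the MMU and data cache to be off at the moment of the jump, which is why the cache clean and invalidate step comes first.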
For custom bootloader development the following open-source projects offer a starting point:
- U-Boot (Das U-Boot) – Universal bootloader for embedded devices, supports Cortex-A, R and M-series and is most often used to boot Linux.
- Barebox – Modular bootloader focused on embedded Linux systems.
- ARM Trusted Firmware (now Trusted Firmware-A) – Reference bootloader and runtime services for ARMv8-A.
- mbed – Bootloader libraries for Cortex-M microcontrollers.
Additionally, SoC vendors like Xilinx, NXP, TI, etc. provide bootloader ports for their ARM processors. The boot source, boot stages required, and level of boot customization will determine the best starting point.
The main aspects of the boot process configured via the bootloader include:
- Boot interface – Choose between SD, eMMC, Ethernet, UART, USB, etc.
- Boot device – Select from multiple storage devices attached.
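- Boot mode – Apply a boot selection once or persist it across reboots.
- Security options – Image verification using signatures, for example U-Boot verified boot.
- Boot timing – Delay before the next stage starts automatically, or a shortened path such as U-Boot Falcon mode, where SPL loads the kernel directly.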
- Console interface – Debug logs via UART, Ethernet console, etc.
- Initialize drivers – Early or late initialization for storage drivers.
Boot parameters can be specified via compile-time configuration defines, environment variables set in earlier stages, hardware signals like GPIO toggles, or interactively via the bootloader console.
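When several of these sources set the same parameter, the bootloader needs a precedence rule. The sketch below assumes one plausible ordering (console input overrides GPIO selections, which override stored environment variables, which override compile-time defaults). The ordering, the parameter names, and the default values are assumptions for illustration, not the behavior of any particular bootloader.

```python
from collections import ChainMap
from typing import Dict, Optional

# Hypothetical compile-time defaults.
COMPILED_DEFAULTS = {"bootdev": "mmc0", "console": "ttyS0,115200", "bootdelay": "2"}

def resolve_boot_params(env: Optional[Dict[str, str]] = None,
                        gpio_overrides: Optional[Dict[str, str]] = None,
                        console_overrides: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Merge parameter sources; ChainMap returns the first mapping that has a key."""
    layers = ChainMap(console_overrides or {}, gpio_overrides or {},
                      env or {}, COMPILED_DEFAULTS)
    return dict(layers)
```

For example, `resolve_boot_params(env={"bootdev": "mmc1"}, gpio_overrides={"bootdev": "usb0"})` yields `bootdev` set to `usb0`, while `console` and `bootdelay` keep their compiled-in defaults.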
Boot Time Optimization
To optimize the boot time on Cortex-based designs, the following techniques help:
- Initialize only the hardware required for boot, and minimize the number of boot stages.
- Reduce initialization code by consolidating repeated setup calls.
- Use optimized libraries for decompression and image loading.
- Keep bootloader resident in internal RAM whenever possible.
- Prefetch next bootloader stage while current one executes.
- Use the parallelism available on Cortex-A class SoCs, for example by overlapping DMA transfers from storage with CPU work.
- Enable early cache initialization to improve code execution speed.
Careful bootloader optimization can, on some embedded ARM devices, reduce Linux boot time from several seconds to under a second.
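Whether a compressed image actually boots faster depends on how storage read speed compares with decompression speed. The sketch below is a back-of-the-envelope model under simplifying assumptions: reading and decompressing happen one after the other, and both run at constant throughput. The function name and all figures used with it are hypothetical, not measurements.

```python
from typing import Optional

def load_time_s(stored_bytes: int, read_bytes_per_s: float,
                raw_bytes: Optional[int] = None,
                decompress_bytes_per_s: Optional[float] = None) -> float:
    """Estimate seconds to read an image and, if compressed, to decompress it."""
    total = stored_bytes / read_bytes_per_s
    if raw_bytes is not None:
        if not decompress_bytes_per_s:
            raise ValueError("decompression throughput is required for compressed images")
        # Decompression cost is modeled on the size of the uncompressed output.
        total += raw_bytes / decompress_bytes_per_s
    return total
```

With made-up figures of an 8 MB kernel on 20 MB/s storage, the uncompressed load takes about 0.4 s. A 4 MB compressed copy decompressed at 80 MB/s takes about 0.3 s, so compression wins here, but on faster storage or with a slower decompressor the result can reverse.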
The bootloader fulfills a vital role in all ARM Cortex-based systems. It handles the initial SoC and board bring-up before loading the main OS. Choosing the right boot scheme, customizing initialization, and optimizing the boot sequence are key for fast and reliable boot. With open source bootloaders and ARM reference implementations available, engineers have a robust starting point for building boot firmware on ARM Cortex CPUs and platforms.