ARM processors are extremely popular in embedded systems due to their low cost, low power consumption, and wide range of options. Here is an overview of ARM embedded programming to help you get started with developing on these versatile chips.
Introduction to ARM
ARM originally stood for Acorn RISC Machine and later for Advanced RISC Machines. ARM processors are based on a RISC (Reduced Instruction Set Computer) architecture, which is known for its simplicity and efficiency compared to CISC (Complex Instruction Set Computer) chips. The ARM instruction set is smaller and more regular than x86, allowing for a smaller, less complex, lower-power processor design.
Arm Holdings designs 32-bit and 64-bit RISC CPU cores, many of which are used in multi-core configurations, and also offers GPU IP. Rather than manufacture its own chips, Arm licenses its IP (intellectual property) to other semiconductor companies like Qualcomm, Apple, Samsung, Nvidia, etc. These companies integrate ARM cores, or their own ARM-compatible cores, into their own System-on-Chip (SoC) designs.
The ARM ecosystem is vast, with thousands of ARM-powered microcontroller and microprocessor variants targeted at different applications like smartphones, tablets, IoT devices, wearables, game consoles, automotive tech, and more. ARM core designs are grouped into families by use case: Cortex-A for application processors, Cortex-R for real-time systems, and Cortex-M for microcontrollers.
ARM Programming Model
From a programmer’s perspective, classic 32-bit ARM processors support two instruction sets: the ARM instruction set, in which every instruction is 32 bits wide, and the Thumb instruction set, which originally encoded a subset of ARM instructions in 16 bits for higher code density. Thumb code is more compact, while ARM code has traditionally run faster in performance-critical sections. Processors that implement both can switch between them at function boundaries, a technique known as interworking. Thumb-2 extends Thumb with 32-bit encodings that can be mixed freely with the 16-bit ones, giving performance close to ARM code at a density close to Thumb. Cortex-M microcontrollers execute only Thumb and Thumb-2 code.
The ARM Application Binary Interface (ABI) specifies standard conventions for register usage, stack organization, function calls, and so on. Adhering to the ABI ensures compatibility across compilers and operating systems, and makes porting between ARM chips easier.
Another key feature of the ARM architecture is its support for coprocessors and optional core components. Many ARM CPUs include an optional floating point unit (FPU), exposed through the coprocessor interface, to perform floating point math in hardware. Cortex-M cores also commonly include system components such as the Memory Protection Unit (MPU) and the Nested Vectored Interrupt Controller (NVIC); these are memory-mapped core peripherals rather than coprocessors.
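As a concrete illustration of the FPU point: on Cortex-M4F and Cortex-M7 parts the FPU is disabled at reset, and firmware enables it by granting full access to coprocessors CP10 and CP11 in the Coprocessor Access Control Register (CPACR, at address 0xE000ED88). The following is a minimal sketch, not vendor startup code. The function name fpu_enable and the pointer parameter are assumptions made so the snippet can also run on a host against an ordinary variable.

```c
#include <stdint.h>

/* CP10 and CP11 access fields occupy bits 20-23 of CPACR.
 * Setting each 2-bit field to 0b11 grants full access. */
#define CPACR_CP10_CP11_FULL (0xFu << 20)

/* Hypothetical helper: on a real Cortex-M4F/M7, pass the address of
 * CPACR (0xE000ED88, SCB->CPACR in CMSIS). On a host, pass the address
 * of a plain variable to see the effect. */
void fpu_enable(volatile uint32_t *cpacr)
{
    *cpacr |= CPACR_CP10_CP11_FULL;
    /* Real firmware follows this write with DSB and ISB barriers
     * before executing any floating point instruction. */
}
```

Many vendor startup files already perform this step, so it is usually only written by hand in custom startup code.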
Peripheral Programming Model
ARM chips include various internal peripherals like timers, serial interfaces, ADCs, GPIO controllers, etc. These are mapped into the processor’s memory address space. Software can configure and control peripherals by reading and writing to their registers using normal load/store instructions.
For example, writing a value to a specific timer control register would start the timer counting. When the timer expires, it triggers an interrupt causing the CPU to run a corresponding interrupt service routine (ISR). This allows peripherals to notify the processor asynchronously of events.
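The timer scenario above might look roughly like the following sketch. The register layout, bit positions, and names (timer_regs_t, TIMER0, timer_start, timer_isr) are all hypothetical and do not describe any real part; a real layout comes from the chip's reference manual. To keep the example runnable on a host, the register block lives in ordinary RAM instead of at a fixed peripheral address.

```c
#include <stdint.h>

/* Register layout of a HYPOTHETICAL timer peripheral. */
typedef struct {
    volatile uint32_t CTRL;    /* bit 0: enable, bit 1: interrupt enable */
    volatile uint32_t LOAD;    /* reload value in timer ticks */
    volatile uint32_t STATUS;  /* bit 0: expired flag, cleared by software */
} timer_regs_t;

#define TIMER_CTRL_ENABLE     (1u << 0)
#define TIMER_CTRL_IRQ_EN     (1u << 1)
#define TIMER_STATUS_EXPIRED  (1u << 0)

/* On hardware this would be a fixed address from the datasheet, e.g.
 *   #define TIMER0 ((timer_regs_t *)0x40001000u)   (made-up address)
 * Here it points at a RAM variable so the sketch runs anywhere. */
static timer_regs_t fake_timer0;
#define TIMER0 (&fake_timer0)

/* Written by the ISR, read by the main loop, hence volatile. */
static volatile uint32_t timer_expired_count;

/* Program the reload value, then enable the timer and its interrupt. */
void timer_start(timer_regs_t *t, uint32_t ticks)
{
    t->LOAD = ticks;
    t->CTRL |= TIMER_CTRL_ENABLE | TIMER_CTRL_IRQ_EN;
}

/* Interrupt service routine: acknowledge the event and record it. */
void timer_isr(timer_regs_t *t)
{
    if (t->STATUS & TIMER_STATUS_EXPIRED) {
        t->STATUS &= ~TIMER_STATUS_EXPIRED;
        timer_expired_count++;
    }
}
```

The volatile qualifier matters here: it stops the compiler from caching or removing register accesses that look redundant but have side effects in hardware.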
Higher-level libraries and device drivers abstract access to peripherals behind simple functions like startTimer(), readAdc(), etc. Understanding how peripherals work at the register level is still important for configuring them and for multiplexing pins between them.
Toolchain Setup
To develop ARM software, you will need an ARM compiler and other build tools. Compilers convert C/C++ code into ARM assembly language. Assemblers convert assembly into machine code binaries. Linkers combine different code modules into executables. Debuggers help analyze program execution.
The Arm Compiler toolchain provided by Arm includes all these tools. The GNU Arm Embedded Toolchain (GCC) and LLVM/Clang are popular open source alternatives. These toolchains can be used from the command line or integrated with IDEs like Eclipse, Visual Studio Code, etc.
Target hardware will require flashing tools to load binaries onto the chip. Many IDEs integrate flashing/debugging features. Dedicated JTAG/SWD debug adapters are also commonly used for debugging ARM chips.
Bare Metal Embedded C
The most direct way to program ARM microcontrollers is through direct register manipulation, referred to as “bare metal”. In this case, no operating system is used. The programmer is responsible for configuring the various hardware peripherals directly.
C and assembly language are commonly used. Startup code executed at reset sets up the stack pointer, initializes memory, and then calls main(). System-level settings such as the vector table location are configured through registers in the System Control Block (SCB), while each peripheral is configured through its own registers. On Cortex-M, interrupts and exceptions are managed directly through the Nested Vectored Interrupt Controller.
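One job typically handled by startup code is preparing RAM before main() runs: initialized globals are copied from flash into the .data section, and the .bss section is zeroed. The sketch below shows those two steps as plain functions so they can run on a host. The function names init_data and init_bss are hypothetical, and the linker symbols mentioned in the comment (_sidata, _sdata, _edata, _sbss, _ebss) follow a common convention but vary between linker scripts.

```c
#include <stdint.h>
#include <stddef.h>

/* Copy initialized variables from their load address (flash) to RAM. */
void init_data(uint32_t *dst, const uint32_t *src, size_t words)
{
    for (size_t i = 0; i < words; i++) {
        dst[i] = src[i];
    }
}

/* Zero the .bss region so uninitialized globals start at 0. */
void init_bss(uint32_t *start, size_t words)
{
    for (size_t i = 0; i < words; i++) {
        start[i] = 0u;
    }
}

/* A real reset handler would call these with linker-provided symbols,
 * roughly (symbol names are an assumption, check your linker script):
 *
 *   init_data(&_sdata, &_sidata, (size_t)(&_edata - &_sdata));
 *   init_bss(&_sbss, (size_t)(&_ebss - &_sbss));
 *   main();
 */
```

Vendor-supplied startup files usually do this in assembly or C already, so writing it by hand is mainly useful for understanding or for custom boot flows.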
With bare metal ARM programming, the developer has complete control over the hardware. But this low-level control comes at the cost of extra complexity vs using a hardware abstraction layer or OS.
Real-Time Operating Systems
Real-time operating systems (RTOS) are commonly used on ARM chips to simplify development. An RTOS provides pre-emptive multitasking allowing multiple application tasks to run concurrently while efficiently sharing CPU time.
Common RTOSes like FreeRTOS, Micrium uC/OS, and Zephyr provide useful high-level constructs like tasks (threads), mutexes, semaphores, queues, etc. RTOSes also implement common functionality needed across embedded applications, such as resource management, time delays, and inter-task communication and synchronization.
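To give a feel for what an RTOS queue does underneath, here is a minimal sketch of a fixed-capacity message queue built on a ring buffer. It is not the API of any particular RTOS: the names msg_queue_t, queue_send, and queue_receive are hypothetical, and the sketch deliberately omits what a real kernel adds on top, namely blocking with timeouts, waking waiting tasks, and protection against concurrent access from tasks and interrupts.

```c
#include <stdint.h>
#include <stdbool.h>

#define QUEUE_CAPACITY 4u

typedef struct {
    uint32_t items[QUEUE_CAPACITY];
    uint32_t head;   /* index of the next item to read */
    uint32_t count;  /* number of items currently stored */
} msg_queue_t;

/* Append a message; returns false if the queue is full. */
bool queue_send(msg_queue_t *q, uint32_t msg)
{
    if (q->count == QUEUE_CAPACITY) {
        return false;
    }
    uint32_t tail = (q->head + q->count) % QUEUE_CAPACITY;
    q->items[tail] = msg;
    q->count++;
    return true;
}

/* Remove the oldest message; returns false if the queue is empty. */
bool queue_receive(msg_queue_t *q, uint32_t *out)
{
    if (q->count == 0u) {
        return false;
    }
    *out = q->items[q->head];
    q->head = (q->head + 1u) % QUEUE_CAPACITY;
    q->count--;
    return true;
}
```

In FreeRTOS, for comparison, the corresponding operations are xQueueSend() and xQueueReceive(), which additionally let the calling task block until space or data becomes available.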
RTOS kernels are highly configurable to optimize for performance vs memory footprint. For example, unused kernel features can be disabled to conserve space. RTOSes may also support tickless operation for low power consumption.
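Taking FreeRTOS as one example, this kind of tuning is done with preprocessor macros in a project-level FreeRTOSConfig.h file. The excerpt below is only a sketch: the macro names are FreeRTOS configuration options, but the values are illustrative assumptions, not recommendations for any particular board, and a complete configuration file needs further settings.

```c
/* FreeRTOSConfig.h (excerpt) - illustrative values only */
#define configUSE_PREEMPTION     1           /* pre-emptive scheduling */
#define configUSE_TICKLESS_IDLE  1           /* stop the tick while idle, if the port supports it */
#define configTICK_RATE_HZ       1000        /* 1 ms tick period */
#define configMAX_PRIORITIES     5           /* fewer priority levels, less RAM */
#define configUSE_MUTEXES        1           /* keep mutex support */
#define configUSE_TIMERS         0           /* compile out software timers */
#define configUSE_CO_ROUTINES    0           /* compile out co-routines */
#define configTOTAL_HEAP_SIZE    (8 * 1024)  /* bytes available to the kernel heap */
```

Setting an option to 0 removes the corresponding kernel code from the build, which is how unused features stop costing flash and RAM.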
High Level Languages
While C and assembly programming are commonly used in embedded systems, higher level languages can simplify development. For example, MicroPython brings Python 3.x to microcontrollers allowing GPIO, peripherals, and hardware modules to be easily controlled using Python scripts.
MicroPython supports the ESP32, STM32, NXP i.MX RT, and other chips, with library modules providing threading and cooperative multitasking. Scripts can be written in a text editor and copied to the board over USB. Interactive debugging is possible through the REPL (read-eval-print loop) and other methods.
Other higher-level options include JavaScript runtimes like JerryScript for IoT devices, and managed runtime environments like .NET nanoFramework, which enables C# on microcontrollers. Higher-level languages can accelerate development at the cost of some memory and runtime overhead.
Operating Systems
Full featured operating systems like Linux can run on application processors like the Cortex-A family. Linux boots up the system and manages processes, memory, I/O devices, networking, etc.
Embedded Linux build systems like the Yocto Project produce a compact, customizable Linux distribution tailored for ARM boards. A bootloader like U-Boot is used to load the Linux kernel image. Device trees describe the system hardware. Board support packages provide device drivers and libraries.
Application code runs as standard Linux processes using traditional OS services. Programming in C/C++ using POSIX APIs or higher level languages provides consistency with desktop Linux development.
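As a small illustration of that consistency, the following sketch uses only POSIX calls (pipe, write, read, close) and would build unchanged for a desktop Linux host or, with a cross compiler, for an ARM Linux board. The function name pipe_round_trip is hypothetical, and the sketch assumes the message is short enough to fit in the pipe buffer in a single write.

```c
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Send msg through a pipe and read it back into out.
 * Returns the number of bytes read, or -1 on error. */
ssize_t pipe_round_trip(const char *msg, char *out, size_t cap)
{
    int fds[2];

    if (cap == 0 || pipe(fds) != 0) {
        return -1;
    }

    ssize_t written = write(fds[1], msg, strlen(msg));
    close(fds[1]);                      /* done writing */
    if (written < 0) {
        close(fds[0]);
        return -1;
    }

    ssize_t got = read(fds[0], out, cap - 1);
    close(fds[0]);
    if (got < 0) {
        return -1;
    }
    out[got] = '\0';                    /* make the result a C string */
    return got;
}
```

Nothing in this code is ARM-specific, which is the point: on a Linux-based system the kernel and C library hide the hardware differences from ordinary application code.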
Debugging on ARM
Debugging is critical for finding bugs and analyzing program flow. GDB is the standard debugger used with the GCC ARM toolchain. A JTAG/SWD debug adapter, driven by a GDB server such as OpenOCD or by an IDE's built-in debugger, connects GDB to the target for hardware debugging.
GDB can set breakpoints and inspect registers, memory, the call stack, and variables. Watchpoints halt execution when a given memory location is accessed. Debug builds keep symbol information so that machine state can be mapped back to source code.
For bare metal ARM debugging, semihosting routes I/O such as printf output through the debugger to the host console. RTOSes support hook functions to log kernel diagnostics. Hardware trace features allow real-time event streaming: instrumentation trace can be sent out over a trace port, and instruction trace can be captured on chip in an Embedded Trace Buffer (ETB).
Performance Optimization
Embedded programming requires optimizing for performance within tight resource constraints. Compiler optimizations like function inlining and dead code elimination improve code efficiency. Static analysis helps find issues.
Profiling using hardware events or instrumentation identifies software bottlenecks. Common optimizations include loop unrolling, prefetching, using caches effectively, and tailoring algorithms to the core architecture.
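As one example of these techniques, the sketch below manually unrolls a summation loop by a factor of four and uses four independent accumulators, so that consecutive additions do not depend on each other. The function name sum_unrolled is hypothetical. Treat this as an illustration rather than a recommendation: modern compilers often unroll loops themselves at higher optimization levels, so any manual version should be confirmed with a profiler.

```c
#include <stdint.h>
#include <stddef.h>

/* Sum n 16-bit samples, processing four elements per iteration. */
uint32_t sum_unrolled(const uint16_t *samples, size_t n)
{
    uint32_t s0 = 0u, s1 = 0u, s2 = 0u, s3 = 0u;
    size_t i = 0;

    /* Main loop: four independent accumulators per iteration. */
    for (; i + 4 <= n; i += 4) {
        s0 += samples[i];
        s1 += samples[i + 1];
        s2 += samples[i + 2];
        s3 += samples[i + 3];
    }

    /* Tail loop: handle the remaining 0 to 3 elements. */
    for (; i < n; i++) {
        s0 += samples[i];
    }

    return s0 + s1 + s2 + s3;
}
```

The trade-off is larger code size for fewer loop-control instructions, which is why unrolling is usually reserved for hot loops identified by profiling.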
C coding best practices like minimizing heap allocation and avoiding recursion also help, mainly by making memory and stack usage predictable and avoiding fragmentation. Assembly can optimize key functions, and compiler intrinsics give access to hardware acceleration like SIMD.
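A common way to minimize heap allocation is to replace malloc() with a statically sized pool of fixed-size blocks. The following is a minimal sketch of that idea; the names (pool_init, pool_alloc, pool_free) and the sizes are hypothetical, and the code is not safe to call from multiple tasks or interrupts without added locking.

```c
#include <stdint.h>
#include <stddef.h>

#define POOL_BLOCKS      4u
#define POOL_BLOCK_SIZE  32u

/* A free block stores the link to the next free block in its own
 * storage; an allocated block is used entirely as payload. */
typedef union pool_block {
    union pool_block *next;
    uint8_t payload[POOL_BLOCK_SIZE];
} pool_block_t;

static pool_block_t pool_storage[POOL_BLOCKS];
static pool_block_t *pool_free_list;

/* Link every block into the free list. Call once before use. */
void pool_init(void)
{
    pool_free_list = NULL;
    for (size_t i = 0; i < POOL_BLOCKS; i++) {
        pool_storage[i].next = pool_free_list;
        pool_free_list = &pool_storage[i];
    }
}

/* Take one block from the free list; returns NULL when exhausted. */
void *pool_alloc(void)
{
    pool_block_t *b = pool_free_list;
    if (b != NULL) {
        pool_free_list = b->next;
    }
    return b;
}

/* Return a block obtained from pool_alloc() to the free list. */
void pool_free(void *p)
{
    if (p == NULL) {
        return;
    }
    pool_block_t *b = (pool_block_t *)p;
    b->next = pool_free_list;
    pool_free_list = b;
}
```

Allocation and release both run in constant time, and because all blocks are the same size the pool cannot fragment, which is the main reason this pattern is preferred over a general-purpose heap in long-running firmware.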
Conclusion
ARM’s RISC architecture and ecosystem enable extremely efficient embedded designs. With the right development tools and an understanding of the processor programming model, you can take full advantage of ARM’s capabilities for your application.
A variety of programming approaches are possible on ARM, from bare metal to RTOS to Linux. Debugging and profiling techniques allow ARM code to be analyzed and optimized in detail. With ARM so widely deployed in embedded products, there are many opportunities for developers skilled in ARM embedded programming.