ARM Cortex-M microcontrollers are extremely popular in embedded systems due to their low cost, low power consumption, and extensive ecosystem. By programming them in assembly language and C, developers can optimize performance, memory usage, and power consumption for resource-constrained embedded applications.
Introduction to ARM Cortex-M
The ARM Cortex-M series targets microcontroller applications and includes the ultra-low-power Cortex-M0/M0+, the mainstream Cortex-M3/M4, the high-performance Cortex-M7, and the Armv8-M based Cortex-M23/M33, which add TrustZone security extensions. Key features include:
- ARM Thumb instruction set for high code density
- NVIC for handling interrupts and exceptions
- SysTick timer for operating system timekeeping
- Optional Memory Protection Unit (MPU) for system reliability
- DSP extensions in M4/M7 for digital signal processing
- Optional floating point unit (FPU): single precision on the M4, single or double precision on the M7
Cortex-M cores are available in microcontrollers from semiconductor vendors such as STMicroelectronics, NXP, Microchip, TI, and Silicon Labs. Each vendor adds its own peripherals such as GPIO, timers, ADC, DAC, I2C, SPI, and UART, connected to the core over AMBA buses (AHB and APB). ARM CoreSight, by contrast, is the debug and trace architecture.
Programming ARM Cortex-M in Assembly Language
Assembly language provides full control over the microcontroller hardware. Cortex-M3/M4/M7 cores use the Thumb-2 instruction set, which mixes 16-bit and 32-bit instructions for better performance than the original 16-bit Thumb; the Cortex-M0/M0+ implement a smaller, mostly 16-bit subset. Key instructions include:
- MOV/ADD/SUB/CMP/BX for math and branching
- LDR/STR for memory access
- PUSH/POP for stack manipulation
- BL/BLX for function calls
- SVC for supervisor calls, and BX LR (or POP {PC}) with an EXC_RETURN value for exception returns
Assembly code can directly access registers like R0-R12, SP, LR, and PC. Special registers include xPSR, PRIMASK, and CONTROL for handling status, interrupt masking, and execution modes. Instructions that read or write these registers, such as MRS, MSR, and CPSID/CPSIE, enable advanced techniques like nested interrupt handling.
Inline assembly can be used in C programs when developers need absolute control over instructions. This avoids the overhead of calling a separate assembly function in performance-critical code. Assembly programming requires paying close attention to instruction side effects, pipeline stalls, and data alignment for efficiency.
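As a small illustration of inline assembly in C, the sketch below wraps the REV byte-reverse instruction using GCC-style extended asm. The helper name `byte_reverse32` is hypothetical, and the preprocessor check assumes a compiler that defines the ACLE macro `__ARM_ARCH_PROFILE`. A portable fallback lets the same source build off-target.

```c
#include <stdint.h>

/* Hypothetical helper: reverse the byte order of a 32-bit word. */
static inline uint32_t byte_reverse32(uint32_t value)
{
#if defined(__GNUC__) && defined(__ARM_ARCH_PROFILE) && (__ARM_ARCH_PROFILE == 'M')
    uint32_t result;
    /* Single REV instruction; the "l" constraint asks for low
     * registers (r0-r7), which also suits the 16-bit encoding. */
    __asm__ ("rev %0, %1" : "=l" (result) : "l" (value));
    return result;
#else
    /* Portable fallback so the sketch also builds on a host PC. */
    return ((value & 0x000000FFu) << 24) |
           ((value & 0x0000FF00u) << 8)  |
           ((value & 0x00FF0000u) >> 8)  |
           ((value & 0xFF000000u) >> 24);
#endif
}
```

Letting the compiler choose the registers through constraints, rather than hard-coding R0 or R1, usually keeps the surrounding C code free to optimize.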
Leveraging C and the ARM Toolchain
While assembly coding gives full control, C programming enables faster and more portable software development. Most Cortex-M vendors provide or support GCC-based toolchains (commercial compilers from Arm, IAR, and others are also common), which include:
- Compiler for converting C to machine code
- Assembler for low-level assembly programming
- Linker for combining object code files
- Debugger for runtime control and inspection
- Simulator for testing without target hardware
The compiler hides hardware details while generating efficient assembly instructions. Developers can use intrinsic functions for special instructions, such as those that change execution modes, and compiler options like -O3 (speed) or -Os (size) enable further code optimizations.
C language features like const, volatile, restrict, and inline hints provide additional control over generated code. Pointers to volatile data can access memory-mapped peripherals, and libraries support common embedded functionality for math, I/O, connectivity, OS services, and more.
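The use of volatile pointers for memory-mapped peripherals can be sketched as follows. The register layout, the `gpio_regs_t` type, the function names, and the example address are all made up for illustration. Real layouts and base addresses come from the vendor's reference manual or device header.

```c
#include <stdint.h>

/* Hypothetical GPIO register block; not any real device's layout. */
typedef struct {
    volatile uint32_t MODE;  /* assumed: 1 = pin is an output */
    volatile uint32_t OUT;   /* output data latch */
} gpio_regs_t;

/* On a target the block would live at a fixed address, for example:
 *   #define GPIOA ((gpio_regs_t *)0x40020000u)   (address is made up)
 */

static inline void gpio_make_output(gpio_regs_t *port, unsigned pin)
{
    port->MODE |= (1u << pin);
}

static inline void gpio_write(gpio_regs_t *port, unsigned pin, int level)
{
    if (level) {
        port->OUT |= (1u << pin);   /* read-modify-write, not atomic */
    } else {
        port->OUT &= ~(1u << pin);
    }
}
```

The volatile qualifier tells the compiler that every access has a side effect, so it will not cache the value in a register or remove what looks like a redundant write. Passing the register block as a pointer also lets the same functions be exercised on a host against an ordinary struct.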
Startup Code for Cortex-M Reset and Initialization
After reset, a Cortex-M core loads the initial stack pointer from the first vector table entry and begins execution at the address stored in the Reset vector. Startup code there performs critical system initialization like:
- Configuring the stack pointer
- Initializing static and global variables
- Setting up the vector table
- Enabling FPU if required
The vector table defines the location of exception and interrupt handlers. Positioning this table in code memory avoids wasting SRAM. Startup also copies initialized variables from Flash to RAM before calling main().
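The copy from Flash to RAM can be sketched as a plain C routine. In a real startup file the five pointers would come from linker-script symbols (names such as `_sidata`, `_sdata`, or `_ebss` vary by toolchain); here they are passed as parameters, and the function name `init_ram` is hypothetical.

```c
#include <stdint.h>

/* Copy initial values of .data from Flash to RAM, then zero .bss.
 * All ranges are word-aligned and end pointers are exclusive. */
static void init_ram(const uint32_t *flash_src,
                     uint32_t *data_start, uint32_t *data_end,
                     uint32_t *bss_start, uint32_t *bss_end)
{
    /* Initialized globals: load image in Flash -> run address in RAM. */
    for (uint32_t *dst = data_start; dst < data_end; ) {
        *dst++ = *flash_src++;
    }
    /* Zero-initialized globals: C requires them to start at 0. */
    for (uint32_t *dst = bss_start; dst < bss_end; ) {
        *dst++ = 0u;
    }
}
```

A reset handler would typically call a routine like this before any code that relies on global variables, and only then branch to main().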
Careful startup code optimization can save hundreds of bytes vs the default provided by vendors. Techniques like only initializing required peripherals, merging vector tables, and reducing unnecessary linking and copying all minimize resource usage for a smaller memory footprint.
Interrupts for Real-Time Performance
Interrupts allow Cortex-M processors to respond quickly to events. The Nested Vectored Interrupt Controller (NVIC) manages up to 240 interrupt sources on Cortex-M3/M4/M7 (up to 32 on Cortex-M0/M0+) with configurable priority levels. Key concepts include:
- Interrupt Service Routines (ISRs) – low latency event handlers
- Vector table – jump table for dispatching to ISR
- Priority groups – priority splitting between preemption and sub-priority
- Preemption – suspending lower priority ISRs
Efficient firmware requires designing priority levels and preemption to match application needs. Critical ISRs like servo control loops may need higher priorities to reduce jitter, while less critical interrupts like ADC sampling can run at lower priorities.
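To make the split between preemption level and sub-priority concrete, the sketch below packs both into an 8-bit priority field. It assumes a core with priority grouping (for example Cortex-M3/M4/M7) and treats the number of implemented priority bits as a parameter, since that is device-specific. `pack_priority` is a hypothetical helper, not a CMSIS function.

```c
#include <stdint.h>

/* Pack a preemption level and a sub-priority into one priority byte.
 * prio_bits:    implemented priority bits (device-specific, e.g. 4)
 * preempt_bits: how many of those bits select the preemption level
 *               (0..prio_bits); the rest select the sub-priority. */
static uint8_t pack_priority(unsigned prio_bits, unsigned preempt_bits,
                             unsigned preempt, unsigned sub)
{
    unsigned sub_bits     = prio_bits - preempt_bits;
    unsigned preempt_mask = (1u << preempt_bits) - 1u;
    unsigned sub_mask     = (1u << sub_bits) - 1u;
    unsigned level = ((preempt & preempt_mask) << sub_bits) |
                     (sub & sub_mask);
    /* Implemented bits sit at the most significant end of the byte. */
    return (uint8_t)(level << (8u - prio_bits));
}
```

With 4 implemented bits split 2/2, `pack_priority(4, 2, 1, 3)` yields 0x70. Lower numeric values mean higher urgency, and only the preemption level decides whether one ISR can interrupt another; the sub-priority only orders interrupts that are pending at the same level.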
Interrupt-driven firmware also requires management of shared data to avoid race conditions between ISRs and background tasks. Techniques like reentrancy protection, critical sections, and mutexes help write reliable concurrent code.
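A critical section can be sketched by saving PRIMASK, masking interrupts, and restoring the saved state afterwards, so that nested critical sections behave correctly. The names `irq_save`, `irq_restore`, `take_events`, and `event_count` are hypothetical, and the host stand-in exists only so the sketch can be compiled off-target.

```c
#include <stdint.h>

#if defined(__GNUC__) && defined(__ARM_ARCH_PROFILE) && (__ARM_ARCH_PROFILE == 'M')
/* Read PRIMASK, then mask configurable-priority interrupts. */
static inline uint32_t irq_save(void)
{
    uint32_t primask;
    __asm__ volatile ("mrs %0, primask\n\tcpsid i"
                      : "=r" (primask) : : "memory");
    return primask;
}
/* Put PRIMASK back to what it was before irq_save(). */
static inline void irq_restore(uint32_t primask)
{
    __asm__ volatile ("msr primask, %0" : : "r" (primask) : "memory");
}
#else
/* Host stand-in: a plain variable plays the role of PRIMASK. */
static uint32_t fake_primask;
static inline uint32_t irq_save(void)
{
    uint32_t old = fake_primask;
    fake_primask = 1u;
    return old;
}
static inline void irq_restore(uint32_t primask) { fake_primask = primask; }
#endif

/* Counter incremented by a hypothetical ISR. */
static volatile uint32_t event_count;

/* Read and clear the counter without losing an increment in between. */
static uint32_t take_events(void)
{
    uint32_t state = irq_save();
    uint32_t n = event_count;
    event_count = 0u;
    irq_restore(state);
    return n;
}
```

Restoring the saved value, rather than unconditionally re-enabling interrupts, keeps the routine safe to call from code that already runs with interrupts masked. Keeping the protected region to a few instructions limits the added interrupt latency.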
Timers, PWM, and Scheduling
Cortex-M timers enable microsecond resolution timekeeping and pulse width modulation (PWM) for control applications:
- SysTick – simplified timer for RTOS time slicing
- General Purpose Timers – advanced PWM and input capture capabilities
- Low power timers – continue running in sleep modes
Timers can also trigger interrupts periodically to enable time sliced scheduling. Tasks get preempted on timer overflow so higher priority work can run. Schedulers help structure bigger firmware as a set of discrete threads.
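As a simpler relative of preemptive time slicing, the sketch below shows a cooperative tick scheduler: a periodic timer interrupt only marks tasks as ready, and the main loop runs them to completion. This is a swapped-in, non-preemptive variant chosen for brevity; all type and function names are hypothetical.

```c
#include <stdint.h>
#include <stddef.h>

typedef struct {
    void (*run)(void);       /* task body */
    uint32_t period_ticks;   /* run once every this many ticks (>= 1) */
    uint32_t countdown;      /* ticks until next run, must start >= 1 */
    volatile uint8_t ready;  /* set in the tick handler, cleared in main */
} task_t;

/* Call from the periodic timer interrupt (for example a SysTick handler). */
static void scheduler_tick(task_t *tasks, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (--tasks[i].countdown == 0u) {
            tasks[i].countdown = tasks[i].period_ticks;
            tasks[i].ready = 1u;
        }
    }
}

/* Call repeatedly from the main loop; earlier table entries run first. */
static void scheduler_run(task_t *tasks, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (tasks[i].ready) {
            tasks[i].ready = 0u;
            tasks[i].run();
        }
    }
}

/* Two demo tasks that only count how often they ran. */
static unsigned fast_runs, slow_runs;
static void fast_task(void) { fast_runs++; }
static void slow_task(void) { slow_runs++; }
```

A task table such as `{ { fast_task, 1, 1, 0 }, { slow_task, 3, 3, 0 } }` runs the first task every tick and the second every third tick. A preemptive RTOS goes further by saving and restoring task context inside the timer or PendSV exception.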
Precision timer peripherals generate PWM for LED brightness control, motor speed regulation, and other tasks that need an analog-like output. Developers can tune PWM pulse widths and frequencies to support ranges like 1-10 kHz with 8 or more bits of duty-cycle resolution.
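The trade-off between PWM frequency and resolution follows from the timer clock: the period in timer counts is roughly f_clk / f_pwm, and the resolution is about log2 of that count. The sketch below works this out; the 48 MHz clock in the usage note is an assumed example value, and the function names are hypothetical.

```c
#include <stdint.h>

/* Timer counts in one PWM period for a given counter clock. */
static uint32_t pwm_period_ticks(uint32_t timer_clk_hz, uint32_t pwm_hz)
{
    return timer_clk_hz / pwm_hz;
}

/* Whole bits of duty-cycle resolution: floor(log2(period_ticks)). */
static unsigned pwm_resolution_bits(uint32_t period_ticks)
{
    unsigned bits = 0u;
    while (period_ticks > 1u) {
        period_ticks >>= 1;
        bits++;
    }
    return bits;
}

/* Compare value for a duty cycle given in tenths of a percent (0..1000). */
static uint32_t pwm_compare(uint32_t period_ticks, uint32_t duty_permille)
{
    return (uint32_t)(((uint64_t)period_ticks * duty_permille) / 1000u);
}
```

With an assumed 48 MHz timer clock, a 10 kHz PWM period is 4800 counts, which gives 12 whole bits of resolution, and a 25% duty cycle corresponds to a compare value of 1200. Many timers count from 0 to a reload value, so the register would usually hold the period minus one.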
Sleep Modes for Low Power
Sleep modes minimize power draw during idle periods by stopping the CPU clock and gating or powering down unused peripherals. Mode names and exact behavior vary by vendor, but a typical set looks like:
- Active mode – CPU running, peripherals enabled
- Sleep mode – CPU stopped, peripherals and SRAM on
- Deep sleep – peripherals disabled, only wakeup sources active
- Standby mode – SRAM off, only low power timers running
Lower power modes offer huge battery life benefits – sleep may use 5x less power than run mode, while deep sleep can reduce power 100x vs active, though exact figures are device-specific. Effective low power design involves:
- Minimizing active runtime with streamlined code
- Using peripherals judiciously – disable when idle
- Leveraging low power libraries and OS frameworks
Interrupt-driven operation also helps by allowing most tasks to remain asleep until needed. Developers balance power savings with acceptable wake latency for the application.
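At the core level, entering a sleep state usually comes down to selecting the depth through the SLEEPDEEP bit of the System Control Register and executing WFI. The sketch below takes the register as a pointer so it can be exercised off-target; `enter_sleep` is a hypothetical name, and vendor-specific power controller setup is not shown.

```c
#include <stdint.h>
#include <stdbool.h>

#define SCR_SLEEPDEEP (1u << 2)  /* SLEEPDEEP bit of the System Control Register */

static inline void wait_for_interrupt(void)
{
#if defined(__GNUC__) && defined(__ARM_ARCH_PROFILE) && (__ARM_ARCH_PROFILE == 'M')
    /* Finish outstanding memory accesses, then sleep until an interrupt. */
    __asm__ volatile ("dsb\n\twfi" : : : "memory");
#endif
    /* On a host build this is a no-op. */
}

/* Select Sleep or Deep Sleep, then wait for the next interrupt.
 * On a target, scr would point at SCB->SCR (address 0xE000ED10). */
static void enter_sleep(volatile uint32_t *scr, bool deep)
{
    if (deep) {
        *scr |= SCR_SLEEPDEEP;
    } else {
        *scr &= ~SCR_SLEEPDEEP;
    }
    wait_for_interrupt();
}
```

What Deep Sleep actually switches off is decided by the vendor's power management logic, so the wake-up latency of each mode has to be taken from the device documentation.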
Debugging Cortex-M Firmware
ARM CoreSight debugging provides comprehensive visibility into runtime code execution. Debuggers connect via SWD or JTAG to enable:
- Breakpoints – pausing execution at instructions
- Watchpoints – trigger on memory accesses
- Stepping – single instruction execution
- Registers and peripherals – viewing internal processor state
- Memory access – reading variables and buffers
Advanced use of breakpoints and watchpoints is key for diagnosing complex bugs. Debuggers also support profiling for performance optimization – analyzing cycle counts and hotspots.
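Cycle-count profiling often relies on a free-running 32-bit counter such as the DWT cycle counter found on Cortex-M3/M4/M7 (it is not present on Cortex-M0/M0+). The sketch below only shows the arithmetic on two counter readings; how the counter is enabled and read is left out, and the function names are hypothetical.

```c
#include <stdint.h>

/* Cycles between two readings of a free-running 32-bit counter.
 * Unsigned subtraction stays correct across one wraparound. */
static inline uint32_t cycles_elapsed(uint32_t start, uint32_t end)
{
    return end - start;
}

/* Convert a cycle count to whole microseconds for a given core clock. */
static inline uint32_t cycles_to_us(uint32_t cycles, uint32_t core_clk_hz)
{
    /* 64-bit intermediate avoids overflow in cycles * 1000000. */
    return (uint32_t)(((uint64_t)cycles * 1000000u) / core_clk_hz);
}
```

For example, 16800 cycles at an assumed 168 MHz core clock correspond to 100 microseconds. Intervals longer than one full counter wrap cannot be recovered from two readings alone.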
Post-mortem techniques like dumping registers and stacks on exceptions provide visibility when a hard fault crashes a system. This helps identify root causes like stack overflow, invalid memory access, or unhandled exceptions.
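On exception entry the core pushes eight registers onto the active stack, so a fault handler can recover the faulting PC and LR from that frame. The sketch below copies a basic frame (no floating point state) into a static record; `capture_fault_frame` and `last_fault` are hypothetical names. On a target, a short assembly stub in the HardFault handler would pass in MSP or PSP, chosen by bit 2 of the EXC_RETURN value in LR.

```c
#include <stdint.h>

/* Registers stacked automatically on exception entry (basic frame). */
typedef struct {
    uint32_t r0, r1, r2, r3, r12, lr, pc, xpsr;
} exception_frame_t;

/* Kept in RAM so a debugger or logger can read it after the fault. */
static exception_frame_t last_fault;

/* stacked: stack pointer value at exception entry (lowest address = R0). */
static void capture_fault_frame(const uint32_t *stacked)
{
    last_fault.r0   = stacked[0];
    last_fault.r1   = stacked[1];
    last_fault.r2   = stacked[2];
    last_fault.r3   = stacked[3];
    last_fault.r12  = stacked[4];
    last_fault.lr   = stacked[5];  /* return address of the faulting function */
    last_fault.pc   = stacked[6];  /* instruction address at the fault */
    last_fault.xpsr = stacked[7];
}
```

Looking up the saved PC in the linker map or with the debugger's symbol lookup often points straight at the offending line, while the saved LR shows the caller.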
Conclusion
ARM Cortex-M microcontrollers enable highly efficient embedded applications through careful coding in assembly and C. Direct register and hardware access via assembly combined with the expressiveness and portability of C allows rapid development. Interrupts, timers, sleep modes, debugging, and more provide building blocks for both simple and complex systems.