ARM Cortex M Assembly Tutorial

Graham Kruk
Last updated: September 8, 2023 10:35 am

Assembly language is a low-level programming language whose instructions correspond closely to a processor’s machine code. Unlike high-level languages like C/C++, assembly language consists of mnemonics that directly control a microprocessor’s components, such as its registers and arithmetic logic unit. Learning assembly language allows programmers to write optimized code by directly controlling the CPU’s functions.

Contents
  • Prerequisites
  • Development Environment Setup
  • ARM Assembly Basics
  • Registers
  • Data Instructions
  • Branching
  • Linking and Functions
  • ARM Programming Model
  • Memory Segments
  • Endianness
  • Writing ARM Assembly Code
  • Advanced ARM Assembly Coding
  • Stack and Subroutines
  • Inline and Embedded Assembly
  • Intrinsic Functions
  • SIMD and DSP
  • Timers and Interrupts
  • Code Optimization
  • Conclusion

ARM Cortex M is a family of 32-bit RISC ARM processor cores licensed by Arm Holdings. The Cortex-M series targets microcontroller applications and is widely used in IoT devices, wearables, robotics, and other embedded systems. This tutorial provides a beginner’s guide to programming ARM Cortex M cores using assembly language.

Prerequisites

To follow this ARM assembly programming tutorial, you should have a basic understanding of:

  • Digital logic and microprocessor architectures
  • Hexadecimal number system
  • Basic C programming

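We’ll be using the ARM Thumb-2 instruction set, which is supported by Cortex-M3, Cortex-M4 and Cortex-M7. Cortex-M0 and Cortex-M0+ implement only a smaller subset of Thumb, so some instructions shown here are not available on them. Make sure you have access to an ARM Cortex M development board, or to a toolchain with a simulator and debugger such as Keil MDK or IAR EWARM.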

Development Environment Setup

You need a development environment with an ARM toolchain to assemble, link and debug ARM assembly code. We’ll use the Keil uVision IDE, which provides a free MDK-Lite version for Cortex-M processors. Here are the steps to set it up:

  1. Download and install Keil uVision IDE with MDK-Lite from the Arm website.
  2. Add a device pack like STM32F4xx to support your target device.
  3. Create a new uVision project, select your target device and add a source file with .s extension.
  4. Open the options for the source file and set the assembler to ARM Macro Assembler.
  5. Now you are ready to write ARM assembly code which can be assembled and downloaded to the target device.

ARM Assembly Basics

Registers

Like any microprocessor, ARM Cortex M cores contain registers, which are small storage locations inside the CPU that can be accessed much faster than memory. ARM follows a Reduced Instruction Set Computer (RISC) load/store design: data processing instructions operate only on registers, and separate load and store instructions move data between registers and memory. Here are the key 32-bit registers in Cortex-M:

  • R0-R12 – General purpose registers for data operations
  • SP (R13) – Stack pointer, which holds the address of the top of the stack
  • LR (R14) – Link register, which holds the return address of a function call
  • PC (R15) – Program counter; reading it returns the address of the current instruction plus 4

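Cores with the optional floating-point unit, such as the Cortex-M4F, add floating-point registers S0-S31. There are also special registers like APSR, PRIMASK, etc. Cortex-M cores do not have the Advanced SIMD (NEON) registers found on Cortex-A processors. We’ll focus on the general purpose registers for now.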

Data Instructions

ARM assembly provides instructions to load data into registers and to store register contents into memory. For example (comments start with a semicolon in the ARM Macro Assembler):

    LDR R1, =0x20001000   ; Load the 32-bit value 0x20001000 into R1
    STR R2, [R3]          ; Store the value in R2 to the memory address held in R3

Other data processing instructions perform addition, subtraction, and logical operations such as AND and ORR, either between registers or between a register and an immediate value.
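For readers coming from C, the LDR and STR forms above behave much like reads and writes through a pointer. The sketch below is a host-side model rather than target code: the array fake_sram and the helper names str_word, ldr_word and store_then_load are made up for illustration and stand in for real memory and for the STR and LDR instructions.

```c
#include <stdint.h>

/* Hypothetical stand-in for a few words of SRAM. On a real target the
   pointer would hold an address such as 0x20001000 instead. */
static uint32_t fake_sram[4];

/* Models STR Rt, [Rn]: write a 32-bit value to the word Rn points at. */
void str_word(uint32_t *rn, uint32_t rt)
{
    *rn = rt;
}

/* Models LDR Rt, [Rn]: read the 32-bit word Rn points at. */
uint32_t ldr_word(const uint32_t *rn)
{
    return *rn;
}

/* Stores a value through a "register" that holds an address,
   then loads it back from the same address. */
uint32_t store_then_load(uint32_t value)
{
    uint32_t *r3 = &fake_sram[1];   /* R3 holds an address        */
    str_word(r3, value);            /* like STR R2, [R3]          */
    return ldr_word(r3);            /* like LDR R1, [R3]          */
}
```

Calling store_then_load(0x12345678) returns the same value, because the load reads back the word the store just wrote.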

Branching

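Conditional branches such as BEQ and BNE test the condition flags (N, Z, C, V), which are set by an earlier instruction such as CMP. For example, comparing R0 with zero:

    CMP R0, #0      ; Compare R0 with 0 and update the flags
    BEQ loop        ; Branch to loop if the result was equal
    …
loop                ; Destination label

This branches to loop if the Z (zero) flag is set, meaning the compared values were equal. Other conditional branch instructions include BGT, BLT, BCS, etc. An IT (If-Then) instruction is not required before a conditional branch; Thumb-2 uses IT to make up to four following instructions conditional.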
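Conditional branches depend on the condition flags left behind by an earlier instruction. As a rough illustration, the following host-side C model computes the N, Z, C and V flags the way a CMP (a subtraction that discards its result) would, and evaluates a few branch conditions from them. The type flags_t and the function names are hypothetical and exist only for this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* The four condition flags tested by conditional branches. */
typedef struct {
    bool n;  /* Negative: bit 31 of the result is set            */
    bool z;  /* Zero: the result is 0                            */
    bool c;  /* Carry: for a subtraction, set when no borrow     */
    bool v;  /* Overflow: signed result did not fit in 32 bits   */
} flags_t;

/* Models CMP Rn, Rm: computes Rn - Rm and keeps only the flags. */
flags_t cmp_flags(uint32_t rn, uint32_t rm)
{
    uint32_t result = rn - rm;
    flags_t f;
    f.n = (result >> 31) != 0;
    f.z = (result == 0);
    f.c = (rn >= rm);
    /* Signed overflow: operands differ in sign and the result's
       sign differs from Rn's sign. */
    f.v = (((rn ^ rm) & (rn ^ result)) >> 31) != 0;
    return f;
}

/* Conditions as tested by BEQ, BNE, BCS, BGT and BLT. */
bool cond_eq(flags_t f) { return f.z; }
bool cond_ne(flags_t f) { return !f.z; }
bool cond_cs(flags_t f) { return f.c; }
bool cond_gt(flags_t f) { return !f.z && (f.n == f.v); }
bool cond_lt(flags_t f) { return f.n != f.v; }
```

For instance, cond_eq(cmp_flags(5, 5)) is true, which corresponds to BEQ being taken after comparing two equal registers.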

Linking and Functions

BL (Branch with Link) and BX (Branch and Exchange) instructions are used to call and return from functions in ARM assembly. BL branches to the target label while also saving the return address in the Link Register (LR). BX branches to the address held in a register, usually LR, to return.

    BL func         ; Call function
    …
func                ; Function label
    …
    BX LR           ; Return to caller

ARM Programming Model

All ARM Cortex M cores present a single 4 GB address space that holds both code and data. Flash memory typically stores code and constants, while SRAM stores variables. Cortex-M0 and Cortex-M0+ follow a Von Neumann design in which instructions and data share one bus. Cortex-M3 and Cortex-M4 use a Harvard bus design with separate instruction and data buses, but both buses still address the same unified memory map.

Code in Flash is normally read-only during execution, while data in SRAM can be read and written freely. Because the address space is unified in both designs, we’ll focus on that single memory model.

Memory Segments

The ARM Cortex M memory map consists of multiple segments such as Flash, SRAM, and peripherals, each with a pre-defined base address. Each segment is an array of bytes numbered sequentially from its base address.

For example, SRAM may start at address 0x20000000. The four bytes from 0x20000004 to 0x20000007 can store a 32-bit variable var_1. The processor uses load/store instructions to access variables located at specific addresses.
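To make the idea of fixed segments concrete, here is a small sketch that classifies an address according to the architectural ARMv7-M memory map (0.5 GB each for Code, SRAM and Peripheral, followed by the external and system regions). The function name cortex_m_region is hypothetical, and a particular microcontroller populates only part of each region, so its datasheet remains the authority on where Flash and SRAM actually sit.

```c
#include <stdint.h>

/* Returns the name of the architectural ARMv7-M region that
   contains addr. Boundaries follow the default memory map. */
const char *cortex_m_region(uint32_t addr)
{
    if (addr < 0x20000000u) return "Code";             /* Flash usually lives here */
    if (addr < 0x40000000u) return "SRAM";
    if (addr < 0x60000000u) return "Peripheral";
    if (addr < 0xA0000000u) return "External RAM";
    if (addr < 0xE0000000u) return "External device";
    return "System";                                   /* includes NVIC, SysTick   */
}
```

With this map, the variable var_1 at 0x20000004 falls in the SRAM region, while a peripheral register at 0x40020000 falls in the Peripheral region.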

Endianness

Cortex-M processors use little-endian format by default, in which the least significant byte of a multi-byte value is stored at the lowest address. For the 32-bit value 0x11223344 stored at 0x20000004, memory will contain:

    0x20000004 – 0x44
    0x20000005 – 0x33
    0x20000006 – 0x22
    0x20000007 – 0x11

The processor takes care of proper endianness handling during load/store. Programmers just need to know the addressing details.
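The byte layout described above can be reproduced on any machine with a short C sketch. The helpers store_le32 and load_le32 are hypothetical names; they write and read the bytes explicitly, so the result does not depend on the endianness of the machine running the code.

```c
#include <stdint.h>

/* Writes value into mem[0..3], least significant byte first,
   which is the order a little-endian 32-bit store produces. */
void store_le32(uint8_t *mem, uint32_t value)
{
    for (int i = 0; i < 4; i++) {
        mem[i] = (uint8_t)(value >> (8 * i));
    }
}

/* Rebuilds the 32-bit value from four little-endian bytes. */
uint32_t load_le32(const uint8_t *mem)
{
    uint32_t value = 0;
    for (int i = 0; i < 4; i++) {
        value |= (uint32_t)mem[i] << (8 * i);
    }
    return value;
}
```

Storing 0x11223344 with store_le32 leaves 0x44 in mem[0] and 0x11 in mem[3], matching the four addresses listed earlier.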

Writing ARM Assembly Code

Now that we have covered the key concepts, let’s look at a simple example of ARM Thumb assembly code for Cortex-M processors:

            AREA program, CODE, READONLY
            ENTRY

            EXPORT Start
    Start

            MOVS R0, #10
            MOVS R1, #20

            ADD R2, R0, R1

    Stop    B Stop
            ALIGN
            END

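Here is what each non-blank line does, in order:

  1. AREA: declares a read-only code section named “program”
  2. ENTRY: marks the entry point for the toolchain
  3. EXPORT Start: makes the Start symbol visible to the linker
  4. Start: label marking the first instruction
  5. MOVS R0, #10: loads 10 into R0
  6. MOVS R1, #20: loads 20 into R1
  7. ADD R2, R0, R1: adds R0 and R1 and stores the result in R2
  8. Stop B Stop: branches to itself forever, so execution goes no further
  9. ALIGN: aligns the end of the code to a word boundary
  10. END: marks the end of the source file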

After assembling, this simple program can be run on a Cortex-M target. We can observe the register values like R0=10, R1=20, R2=30 in the debugger after stepping through each instruction.

This demonstrates the basic ARM assembly syntax. We can build more complex applications using loops, functions, variables, etc.

Advanced ARM Assembly Coding

Here are some more advanced ARM Thumb-2 assembly programming topics useful for functions, real-world projects and optimizations:

Stack and Subroutines

The stack allows storing temporary data and passing arguments during function calls. ARM stack grows downwards from high to low memory. We use the stack pointer register SP (R13) to Push/Pop data to and from the stack.

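Under the Arm procedure call standard (AAPCS), the first four function arguments are passed in registers R0-R3, and any further arguments are pushed onto the stack. The return value comes back in R0. BL saves the return address in LR. Subroutines use PUSH/POP to preserve the callee-saved registers R4-R11, and LR when they call other functions.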
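The downward-growing stack can be modelled in a few lines of C. In this sketch the type stack_model_t and its functions are hypothetical; the array stands in for stack memory and sp is an index rather than a real address. Real PUSH and POP instructions perform no bounds checks, so the error returns here are a convenience of the model only.

```c
#include <stdint.h>

#define STACK_WORDS 8u

/* A tiny model of a full-descending stack: SP always refers to the
   most recently pushed word, and pushing moves SP downwards. */
typedef struct {
    uint32_t mem[STACK_WORDS];
    uint32_t sp;   /* index of the last pushed word; STACK_WORDS when empty */
} stack_model_t;

void stack_init(stack_model_t *s)
{
    s->sp = STACK_WORDS;   /* start just past the highest word */
}

/* Models PUSH {Rx}: decrement SP first, then store the value.
   Returns 0 on success, -1 if the model's stack is full. */
int stack_push(stack_model_t *s, uint32_t value)
{
    if (s->sp == 0u) {
        return -1;
    }
    s->sp--;
    s->mem[s->sp] = value;
    return 0;
}

/* Models POP {Rx}: load the value first, then increment SP.
   Returns 0 on success, -1 if the model's stack is empty. */
int stack_pop(stack_model_t *s, uint32_t *out)
{
    if (s->sp == STACK_WORDS) {
        return -1;
    }
    *out = s->mem[s->sp];
    s->sp++;
    return 0;
}
```

Pushing two values lowers sp by two, and popping returns them in reverse order, which is why a subroutine restores registers in the opposite order to the one in which it saved them.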

Inline and Embedded Assembly

For Cortex-M projects, most code is written in C using ARM compiler toolchain. Time-critical functions and optimizations can use inline or embedded assembly written within C code.

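Inline assembly is written inside C functions and can refer to C variables directly. The exact syntax depends on the compiler: GNU-style compilers take the instructions as strings, while the legacy Arm Compiler 5 accepts an __asm block. Embedded assembly, an Arm Compiler 5 feature, defines a whole function in assembly inside a C source file using the __asm keyword. Standalone assembly can also live in separate .s files that are assembled and linked into the C project.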
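As a minimal sketch of inline assembly, the function below adds two C variables with a single ADD instruction. It assumes the GNU-style extended asm syntax accepted by GCC and Clang-based compilers; the legacy Arm Compiler 5 uses a different syntax. The function name add_u32 is hypothetical, and the preprocessor guard falls back to plain C so the same file also builds on a non-Arm host.

```c
#include <stdint.h>

/* Adds a and b. On a Thumb-2 target built with a GNU-compatible
   compiler this uses an inline ADD; elsewhere it uses plain C. */
uint32_t add_u32(uint32_t a, uint32_t b)
{
#if defined(__GNUC__) && defined(__thumb2__)
    uint32_t result;
    __asm volatile ("add %0, %1, %2"      /* result = a + b          */
                    : "=r" (result)       /* output: any register    */
                    : "r" (a), "r" (b));  /* inputs: any registers   */
    return result;
#else
    return a + b;                         /* host fallback           */
#endif
}
```

The compiler chooses the registers for %0, %1 and %2, which is how inline assembly gets direct access to C variables without the programmer naming R0-R12 explicitly.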

Intrinsic Functions

Compiler intrinsic functions like __disable_irq() map to specific instructions (CPSID i in this case) and are called like ordinary C functions. This gives control of hardware features like interrupts, DSP extensions, etc. from C without hand-written assembly.
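A common use of these intrinsics is a short critical section that masks interrupts, updates shared data, and then restores the previous state. The sketch below shows that pattern on a host machine. The sim_ functions and the sim_primask variable are hypothetical stand-ins; on a real target, the CMSIS-Core intrinsics __get_PRIMASK(), __disable_irq() and __set_PRIMASK() would take their place.

```c
#include <stdint.h>

/* Hypothetical host stand-ins for the PRIMASK register and the
   CMSIS-Core intrinsics that read and write it. */
static uint32_t sim_primask;                 /* 1 = interrupts masked */

uint32_t sim_get_primask(void)       { return sim_primask; }
void     sim_disable_irq(void)       { sim_primask = 1u; }
void     sim_set_primask(uint32_t v) { sim_primask = v & 1u; }

/* Data shared between main code and an interrupt handler. */
static volatile uint32_t shared_counter;

/* Increments shared_counter with interrupts masked, then restores
   the mask state that was in effect on entry. Restoring (rather
   than unconditionally enabling) keeps nested use safe. */
void counter_increment_atomic(void)
{
    uint32_t saved = sim_get_primask();
    sim_disable_irq();
    shared_counter++;
    sim_set_primask(saved);
}
```

If interrupts were already masked when the function was called, they stay masked afterwards, because the saved value rather than a fixed one is written back.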

SIMD and DSP

ARM processors include Single Instruction Multiple Data (SIMD) instructions to perform parallel computations on vectors. Cortex-M4 and Cortex-M7 include DSP extensions whose SIMD instructions operate on packed 8-bit or 16-bit values held in the 32-bit general purpose registers, which accelerates signal processing algorithms.

These instructions are available from C through DSP intrinsics, and hand-written assembly can use them directly for further optimization.
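To show what packed SIMD arithmetic means in practice, here is a portable C model of the SADD16 instruction, which adds the two 16-bit halves of its operands independently. The function name sadd16_model is hypothetical, and the model ignores the GE flags that the real instruction also updates. On a DSP-capable target, the CMSIS-Core intrinsic __SADD16 would normally be used instead.

```c
#include <stdint.h>

/* Portable model of SADD16: the low halfwords and the high
   halfwords are added separately. Each lane wraps modulo 2^16,
   and no carry passes from the low lane into the high lane. */
uint32_t sadd16_model(uint32_t a, uint32_t b)
{
    uint32_t lo = ((a & 0xFFFFu) + (b & 0xFFFFu)) & 0xFFFFu;
    uint32_t hi = ((a >> 16) + (b >> 16)) & 0xFFFFu;
    return (hi << 16) | lo;
}
```

The difference from an ordinary 32-bit ADD shows up when the low lane overflows: sadd16_model(0x0001FFFF, 0x00010001) gives 0x00020000, whereas a plain addition would carry into the upper half and give 0x00030000.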

Timers and Interrupts

Assembly language gives full control over microcontroller peripherals like timers, GPIO, communication buses etc. This allows optimizing interrupt service routines and peripheral initialization code.

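Bit banding, available on many Cortex-M3 and Cortex-M4 devices, is a useful technique for setting or clearing an individual bit with a single store. Each bit in the bit-band regions of SRAM and peripheral memory is mapped to its own word in a separate alias region, so writing to that alias word changes just one bit.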
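The mapping used by bit banding on Cortex-M3 and Cortex-M4 is a simple formula: alias address = alias base + (byte offset × 32) + (bit number × 4). The sketch below computes it for the two 1 MB bit-band regions. The function name bitband_alias is hypothetical, and returning 0 for an address outside the bit-band regions is a convention chosen only for this example.

```c
#include <stdint.h>

#define SRAM_BB_BASE     0x20000000u  /* start of SRAM bit-band region       */
#define SRAM_BB_ALIAS    0x22000000u  /* start of its alias region           */
#define PERIPH_BB_BASE   0x40000000u  /* start of peripheral bit-band region */
#define PERIPH_BB_ALIAS  0x42000000u  /* start of its alias region           */
#define BB_REGION_SIZE   0x00100000u  /* each bit-band region is 1 MB        */

/* Returns the alias word address for bit `bit` (0-7) of the byte at
   byte_addr, or 0 if the address is outside both bit-band regions. */
uint32_t bitband_alias(uint32_t byte_addr, uint32_t bit)
{
    uint32_t base;
    uint32_t alias;

    if (bit > 7u) {
        return 0u;
    }
    if (byte_addr >= SRAM_BB_BASE && byte_addr < SRAM_BB_BASE + BB_REGION_SIZE) {
        base = SRAM_BB_BASE;
        alias = SRAM_BB_ALIAS;
    } else if (byte_addr >= PERIPH_BB_BASE && byte_addr < PERIPH_BB_BASE + BB_REGION_SIZE) {
        base = PERIPH_BB_BASE;
        alias = PERIPH_BB_ALIAS;
    } else {
        return 0u;
    }
    /* Each byte of the region expands to 32 bytes of alias space
       (8 bits, one 4-byte word per bit). */
    return alias + (byte_addr - base) * 32u + bit * 4u;
}
```

On a target that implements bit banding, writing 1 to the word at the returned address sets the chosen bit and writing 0 clears it, without a read-modify-write sequence in software.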

Code Optimization

Assembly programming enables code size and performance optimization using processor-specific features. Loop unrolling, instruction scheduling, and reducing pipeline stalls are some common optimizations.

Inline assembly and intrinsic functions can be used to optimize hotspots and critical code segments without rewriting the full application.
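Loop unrolling is easiest to see side by side. The sketch below sums an array once with a plain loop and once with the body unrolled four times, which reduces the number of compare-and-branch steps per element. The function names are hypothetical, and whether the unrolled form is actually faster depends on the core and the compiler, so it is worth measuring on the target.

```c
#include <stddef.h>
#include <stdint.h>

/* Straightforward loop: one add and one loop branch per element. */
uint32_t sum_rolled(const uint32_t *data, size_t n)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        sum += data[i];
    }
    return sum;
}

/* Unrolled by four: one loop branch per four elements, followed
   by a short loop for the 0-3 elements that remain. */
uint32_t sum_unrolled4(const uint32_t *data, size_t n)
{
    uint32_t sum = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        sum += data[i] + data[i + 1] + data[i + 2] + data[i + 3];
    }
    for (; i < n; i++) {
        sum += data[i];
    }
    return sum;
}
```

Both functions return the same result for any input; the unrolled version trades a little code size for fewer branches.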

Conclusion

ARM Cortex M assembly language enables writing optimized code for microcontrollers used in IoT, embedded, robotics and other applications. This tutorial covers ARM Thumb-2 assembly basics like syntax, registers, data processing, branching and functions needed to get started.

Advanced techniques like stack usage, SIMD instructions, peripherals control and code optimization help build real-world projects. With both high-level languages and low-level assembly, the ARM Cortex M family provides a strong, flexible platform for tomorrow’s embedded systems.
