ARM processors use a reduced instruction set computing (RISC) architecture which is optimized to provide high performance at low cost and power consumption. The assembly language for ARM processors is simple and straightforward compared to some other instruction set architectures.
ARM Instruction Set Architecture
The classic ARM (A32) instruction set uses fixed-length 32-bit instructions, so each instruction is 4 bytes long. The architecture provides 16 32-bit registers, R0-R15, three of which have special roles. R15 is the Program Counter (PC); because of the processor pipeline, reading the PC in ARM state returns the address of the current instruction plus 8. R14 is the Link Register (LR), which receives the return address when a function is called with BL. R13 is used by convention as the Stack Pointer (SP). R0-R12 are general purpose registers used to store data.
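Data processing instructions take three operands: a destination register, a first source register, and a flexible second operand that can take one of three forms:
- Register – e.g. ADD R1, R2, R3
- Immediate – e.g. ADD R1, R2, #10
- Shifted register – e.g. ADD R1, R2, R3, LSL #2
ARM is a load/store architecture, so arithmetic instructions cannot take memory operands. A form such as ADD R1, [R2], R3 is not valid; the value must first be brought into a register with a load. The flexible second operand lets a single instruction combine a shift with an arithmetic or logical operation.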
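Immediate operands such as #10 are not arbitrary 32-bit constants. In the A32 data processing encoding, an immediate is an 8-bit value rotated right by an even number of bits (a 4-bit rotation field, doubled). The following Python sketch checks whether a constant fits that scheme; the function names are illustrative only and do not come from any assembler.

```python
MASK32 = 0xFFFFFFFF

def rotate_left(value, amount):
    """Rotate a 32-bit value left by `amount` bits (0 <= amount < 32)."""
    value &= MASK32
    return ((value << amount) | (value >> (32 - amount))) & MASK32

def encode_immediate(value):
    """Return (rotation_field, imm8) if `value` is encodable, else None.

    The hardware computes the constant as imm8 rotated RIGHT by
    2 * rotation_field, so rotating the target LEFT by the same amount
    must leave a value that fits in 8 bits.
    """
    for rotation_field in range(16):
        imm8 = rotate_left(value, 2 * rotation_field)
        if imm8 <= 0xFF:
            return rotation_field, imm8
    return None

if __name__ == "__main__":
    for constant in (10, 0xFF000000, 0x3FC, 0x101):
        print(hex(constant), encode_immediate(constant))
```

A constant such as 0x101 has set bits too far apart to fit in any rotated 8-bit window, which is why assemblers reject it as an immediate and it has to be loaded some other way, for example from memory.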
ARM Assembly Language Syntax
Assemblers differ in detail. The points below follow the GNU assembler; ARM's own armasm differs where noted:
- Instructions, registers and directives are case-insensitive; labels are case-sensitive
- Labels end with a colon ‘:’ (armasm labels have no colon)
- Comments start with ‘@’ (armasm uses a semicolon ‘;’). C-style /* ... */ comments are also accepted, and // comments are typically handled by running the file through the C preprocessor
- Hexadecimal numbers start with ‘0x’ e.g. 0xFF
- Most instructions can be conditionally executed based on status flags
Some examples:
    ADD R1, R2, R3    @ Add using registers
    MOV R5, #100      @ Load immediate value
    STR R1, [R2]      @ Store register to memory
    BEQ LOOP          @ Branch on equal
    LOOP:             @ Label
The syntax is compact yet readable, and each line typically corresponds to one machine instruction.
ARM Processor Modes
Classic 32-bit ARM processors have seven processor modes. User mode is unprivileged; the other six are privileged:
- User Mode – Has limited access to system resources but can execute most instructions. Used for regular applications.
- FIQ Mode – Entered on a fast interrupt request. It has its own banked copies of R8-R14, so a handler can run without first saving those registers.
- IRQ Mode – Entered on a normal interrupt request.
- Supervisor Mode – Entered on reset and when a supervisor call (SVC, formerly SWI) is executed. Operating system kernels typically run in this mode.
- Abort Mode – Entered when an instruction fetch (prefetch abort) or a data access (data abort) generates a memory fault.
- Undefined Mode – Entered when an undefined instruction is executed.
- System Mode – Privileged mode that shares the User mode registers. It has the same privileges as the other privileged modes, not more.
The operating system kernel runs in a privileged mode while applications run in User mode. This provides system protection and security.
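The current mode is recorded in the low five bits of the Current Program Status Register (CPSR). As a rough illustration, the Python sketch below decodes that field for the seven modes listed above, using the mode encodings of classic 32-bit ARM cores. Later architecture versions add further modes that this table does not cover, and the function names are illustrative only.

```python
# Mode field values (CPSR bits 4-0) for the seven classic modes.
MODE_NAMES = {
    0b10000: "User",
    0b10001: "FIQ",
    0b10010: "IRQ",
    0b10011: "Supervisor",
    0b10111: "Abort",
    0b11011: "Undefined",
    0b11111: "System",
}

def decode_mode(cpsr):
    """Return the mode name stored in the low five bits of `cpsr`."""
    return MODE_NAMES.get(cpsr & 0x1F, "Unknown")

def is_privileged(cpsr):
    """Every known mode except User is privileged."""
    return decode_mode(cpsr) not in ("User", "Unknown")

if __name__ == "__main__":
    for cpsr in (0x00000010, 0x600000D3, 0x0000001F):
        print(hex(cpsr), decode_mode(cpsr), is_privileged(cpsr))
```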
ARM Instruction Types
Here are some of the main types of ARM instructions:
- Data Processing – Arithmetic, logical, and move operations on registers and immediates.
- Load/Store – Transfer data between registers and memory.
- Branch – Change flow of execution by altering the PC.
- Coprocessor – Communicate and transfer data to coprocessors.
Within these classes are many different instructions for various purposes. We’ll go over some of the most common and important ones.
Data Processing Instructions
These perform arithmetic, logical, and move operations. Some examples:
- ADD – Add a register and a second operand (register or immediate).
- SUB – Subtract the second operand from a register.
- RSB – Reverse subtract: subtract the register from the second operand, so RSB R0, R1, #0 negates R1.
- AND – Bitwise AND of two registers.
- ORR – Bitwise OR of registers.
- EOR – Bitwise XOR of registers.
- MOV – Copy a register or immediate value into a register.
- MVN – Move the bitwise NOT of the operand into a register.
These provide basic integer math, logical, and move operations for manipulation of data.
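To make the semantics concrete, here is a small Python model of these operations on 32-bit values. It is a sketch only: it ignores status flags and shifts, and the `execute` helper is a hypothetical name used for illustration. `rn` stands for the first source register value and `op2` for the second operand.

```python
MASK32 = 0xFFFFFFFF  # registers are 32 bits wide, so results wrap around

OPERATIONS = {
    "ADD": lambda rn, op2: rn + op2,
    "SUB": lambda rn, op2: rn - op2,
    "RSB": lambda rn, op2: op2 - rn,   # operands reversed relative to SUB
    "AND": lambda rn, op2: rn & op2,
    "ORR": lambda rn, op2: rn | op2,
    "EOR": lambda rn, op2: rn ^ op2,
    "MOV": lambda rn, op2: op2,        # rn is ignored
    "MVN": lambda rn, op2: ~op2,       # rn is ignored
}

def execute(mnemonic, rn, op2):
    """Return the 32-bit result the destination register would receive."""
    return OPERATIONS[mnemonic](rn, op2) & MASK32

if __name__ == "__main__":
    print(hex(execute("ADD", 0xFFFFFFFF, 1)))  # wraps around to zero
    print(hex(execute("RSB", 5, 0)))           # two's complement of 5
```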
Load/Store Instructions
These transfer data between registers and memory. For example:
- LDR – Load register from memory.
- STR – Store register to memory.
- LDM – Load multiple registers from memory.
- STM – Store multiple registers to memory.
Loads copy memory contents into a register while stores copy a register to memory. Multiple registers can be transferred with a single LDM/STM instruction.
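The multiple-register transfers can be modelled in a few lines. The Python sketch below imitates the "increment after" variants (STMIA and LDMIA): the lowest-numbered register always goes to the lowest address, and the base register is updated only when writeback is requested. Memory is modelled as a dictionary keyed by word address, an assumption made purely for illustration, as are the function names.

```python
def stmia(regs, memory, base, reg_list, writeback=False):
    """Store the listed registers to ascending addresses starting at regs[base]."""
    address = regs[base]
    for reg in sorted(reg_list):      # lowest register number first
        memory[address] = regs[reg]
        address += 4                  # each register occupies one 32-bit word
    if writeback:
        regs[base] = address

def ldmia(regs, memory, base, reg_list, writeback=False):
    """Load the listed registers from ascending addresses starting at regs[base]."""
    address = regs[base]
    for reg in sorted(reg_list):
        regs[reg] = memory[address]
        address += 4
    if writeback:
        regs[base] = address

if __name__ == "__main__":
    regs = [0] * 16
    regs[0], regs[1], regs[2], regs[3] = 0x1000, 11, 22, 33
    memory = {}
    stmia(regs, memory, 0, [1, 2, 3], writeback=True)
    print(memory, hex(regs[0]))
```

The order in which registers are written in the list does not matter in this model, which mirrors the way the register list is encoded as a bit mask rather than a sequence.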
Branch Instructions
These alter control flow by changing the PC. Some branches:
- B – Unconditional branch.
- BL – Branch with Link to subroutine.
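- BX – Branch to the address held in a register, optionally switching between the ARM and Thumb instruction sets.
- BLX – Branch with Link that can also switch between the ARM and Thumb instruction sets.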
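B and BL do not store an absolute target address. In the A32 encoding they hold a signed 24-bit word offset, measured from the PC value seen by the instruction, which in ARM state is the address of the branch plus 8. The Python sketch below computes the encoding for an always-executed branch; `encode_branch` is a hypothetical helper name, not part of any toolchain.

```python
def encode_branch(instruction_address, target_address, link=False):
    """Encode an unconditional A32 B (or BL when link=True) instruction."""
    # The PC reads as the branch's own address plus 8 in ARM state.
    offset = target_address - (instruction_address + 8)
    if offset % 4 != 0:
        raise ValueError("target must be word-aligned relative to the branch")
    word_offset = offset >> 2
    if not -(1 << 23) <= word_offset < (1 << 23):
        raise ValueError("target out of range for a 24-bit word offset")
    opcode = 0xEB000000 if link else 0xEA000000  # condition AL, bit 24 = link
    return opcode | (word_offset & 0x00FFFFFF)

if __name__ == "__main__":
    # A branch to itself needs an offset of -2 words.
    print(hex(encode_branch(0x8000, 0x8000)))
```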
Conditional variants test status flags:
- BEQ – Branch if Equal
- BNE – Branch if Not Equal
- BGT – Branch if Greater Than (signed comparison)
Branches are used for loops, if-statements, and function calls.
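The status flags tested by these conditional branches are N (negative), Z (zero), C (carry) and V (overflow), typically set by a preceding compare. As a minimal sketch, the Python code below models how a CMP of two 32-bit values sets the flags and how the EQ, NE and GT conditions read them. The function and dictionary names are illustrative only.

```python
MASK32 = 0xFFFFFFFF

def cmp_flags(a, b):
    """Return the flags a CMP a, b would set (it computes a - b and discards it)."""
    a &= MASK32
    b &= MASK32
    result = (a - b) & MASK32
    return {
        "N": result >> 31,                      # sign bit of the result
        "Z": int(result == 0),                  # result was zero
        "C": int(a >= b),                       # no borrow (unsigned a >= b)
        "V": ((a ^ b) & (a ^ result)) >> 31,    # signed overflow
    }

CONDITIONS = {
    "EQ": lambda f: f["Z"] == 1,
    "NE": lambda f: f["Z"] == 0,
    "GT": lambda f: f["Z"] == 0 and f["N"] == f["V"],  # signed greater than
}

def branch_taken(condition, flags):
    """Return True if a branch with this condition suffix would be taken."""
    return CONDITIONS[condition](flags)

if __name__ == "__main__":
    flags = cmp_flags(7, 3)
    print(flags, branch_taken("GT", flags))
```

Comparing N with V rather than testing N alone is what keeps GT correct when the subtraction overflows, for example when comparing the most negative value with 1.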
ARM Instruction Encoding
ARM instructions are encoded in 32 bits with different fields. Data processing instructions, for example, are laid out as follows:
    Bits    31-28   27-26   25   24-21    20   19-16   15-12   11-0
    Field   cond    0 0     I    Opcode   S    Rn      Rd      Operand2
This encodes:
- cond – The condition for conditional execution
- I – Whether Operand2 is an immediate (1) or a register (0)
- Opcode – The operation to perform (ADD, SUB, MOV, etc)
- S – Whether the instruction updates the status flags
- Rn – The first operand register
- Rd – The destination register
- Operand2 – The second operand (register, immediate, etc)
Other instruction classes have different field encodings but follow a regular consistent format. This architecture enables efficient decoding and execution.
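As a worked example of the data processing format, the following Python sketch packs the fields into a 32-bit word: condition in bits 31-28, an immediate flag in bit 25, the opcode in bits 24-21, the S bit in bit 20, Rn in bits 19-16, Rd in bits 15-12 and the second operand in bits 11-0. The opcode values are the standard A32 ones; the function name and its parameters are illustrative, and `operand2` is taken as an already-encoded 12-bit field (a plain register number when no shift is applied).

```python
# Standard A32 data processing opcodes for the instructions covered above.
OPCODES = {
    "AND": 0x0, "EOR": 0x1, "SUB": 0x2, "RSB": 0x3,
    "ADD": 0x4, "ORR": 0xC, "MOV": 0xD, "MVN": 0xF,
}
COND_ALWAYS = 0xE  # the AL condition: execute unconditionally

def encode_data_processing(mnemonic, rd, rn, operand2,
                           immediate=False, set_flags=False, cond=COND_ALWAYS):
    """Pack the instruction fields into one 32-bit word."""
    return (
        (cond << 28)
        | (int(immediate) << 25)
        | (OPCODES[mnemonic] << 21)
        | (int(set_flags) << 20)
        | (rn << 16)
        | (rd << 12)
        | (operand2 & 0xFFF)
    )

if __name__ == "__main__":
    # ADD R1, R2, R3
    print(hex(encode_data_processing("ADD", rd=1, rn=2, operand2=3)))
    # MOV R5, #100 (MOV ignores Rn, which is encoded as zero)
    print(hex(encode_data_processing("MOV", rd=5, rn=0, operand2=100, immediate=True)))
```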
ARM Assembly Programming
Here is a simple example ARM assembly program that adds two numbers:
    /* Add two numbers */
    .global main
    main:
        LDR R0, =num1       // R0 = Address of num1
        LDR R1, [R0]        // R1 = Value of num1
        LDR R0, =num2       // R0 = Address of num2
        LDR R2, [R0]        // R2 = Value of num2
        ADD R3, R1, R2      // R3 = R1 + R2 (num1 + num2)
        // R3 now contains the sum
        BX LR               // Return from subroutine
    .data
    num1: .word 10
    num2: .word 20
This shows loading values from memory, performing an addition, and returning from a function call. More complex programs can be built up using structured programming techniques.
Conclusion
The ARM assembly language provides a low level programming model for ARM processors. It gives direct control over the CPU and memory while being more readable than raw machine code. Key characteristics include:
- 32-bit fixed width instructions
- Load/store architecture with 16 registers
- Conditional execution for flexibility
- Flexible second operand (register, immediate, or shifted register)
- Regular, consistent instruction encoding
Knowledge of ARM assembly is useful for system-level programming, embedded systems, OS kernels, drivers, and performance-critical code. It remains a commonly used assembly language on mobile and embedded devices.