The ARM Cortex-M4 is a 32-bit RISC processor core designed for microcontroller applications. It implements the ARMv7E-M architecture (ARMv7-M plus the DSP extension) and executes the Thumb-2 instruction set, which mixes 16-bit and 32-bit encodings to give better code density than the classic 32-bit ARM instruction set and better performance than the original 16-bit Thumb instruction set.
Instruction Types
The Cortex-M4 instruction set can be divided into several categories:
- Data-processing instructions – Arithmetic, logical, and move operations on registers and immediates.
- Load/Store instructions – Transfer data between registers and memory.
- Branch instructions – Change the flow of execution, either unconditionally or based on the condition flags.
- Supervisor Call instruction – Trigger the SVCall exception, typically to make a system call.
- Coprocessor instructions – Interact with the optional FPU or other coprocessors.
- Miscellaneous control instructions – Set control registers, hints, barriers, etc.
Registers
The Cortex-M4 has 16 core 32-bit registers, R0-R15. R0-R12 are general purpose; the remaining three have specific roles:
- R13 – Stack pointer
- R14 – Link register
- R15 – Program counter
Additionally, there are control and status registers:
- xPSR – Combined program status flags
- MSP – Main stack pointer
- PSP – Process stack pointer
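- PRIMASK – Mask all exceptions with configurable priority
- FAULTMASK – Mask all exceptions except NMI
- BASEPRI – Mask exceptions whose priority value is at or above a threshold (0 disables the threshold)
- CONTROL – Select the Thread-mode stack pointer and privilege level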
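The way the three mask registers interact can be sketched with a small Python model. It is a simplification, not a description of the hardware: it ignores priority grouping and the number of implemented priority bits, and the names `execution_priority`, `can_preempt`, and `THREAD_LEVEL` are made up for this illustration. Lower numbers mean higher urgency; NMI is modeled as -2 and HardFault as -1.

```python
THREAD_LEVEL = 256  # hypothetical stand-in for "below every configurable priority"

def execution_priority(active_priority=None, primask=0, faultmask=0, basepri=0):
    """Return the current execution priority (lower number = more urgent)."""
    priority = THREAD_LEVEL if active_priority is None else active_priority
    if basepri != 0:      # BASEPRI of 0 means "no threshold"
        priority = min(priority, basepri)
    if primask & 1:       # PRIMASK raises the execution priority to 0
        priority = min(priority, 0)
    if faultmask & 1:     # FAULTMASK raises it to -1, the HardFault level
        priority = min(priority, -1)
    return priority

def can_preempt(pending_priority, active_priority=None,
                primask=0, faultmask=0, basepri=0):
    """True if an exception with pending_priority would be taken now."""
    return pending_priority < execution_priority(
        active_priority, primask, faultmask, basepri)

print(can_preempt(0x40))                 # no masks set
print(can_preempt(0x40, basepri=0x40))   # blocked by the BASEPRI threshold
print(can_preempt(-1, primask=1))        # HardFault still gets through PRIMASK
print(can_preempt(-2, faultmask=1))      # only NMI gets through FAULTMASK
```

In this model each mask only ever raises the execution priority, which is why the three checks can be applied independently with `min`.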
Data Processing Instructions
These perform arithmetic, logical, and move operations. Some examples:
- ADD Rd, Rn, Rm – Add two registers
- SUB Rd, Rn, #imm – Subtract immediate from register
- MOV Rd, Rn – Copy register
- CMP Rn, #imm – Compare to immediate
- EOR Rd, Rn, Rm – Bitwise XOR
- LSL Rd, Rm, #imm – Logical shift left
The shift amount can be an immediate value or a register. A left shift by n multiplies by 2^n; a right shift (LSR for unsigned values, ASR for signed values) divides by 2^n.
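The power-of-two arithmetic behind the shift instructions can be illustrated with a minimal Python model of a 32-bit register. LSR and ASR are the right-shift counterparts of LSL; the function names and the restriction to shift amounts of 0 to 31 are assumptions made for this sketch.

```python
MASK32 = 0xFFFFFFFF

def lsl(value, shift):
    """LSL: shift left, discarding bits that leave the 32-bit register."""
    return (value << shift) & MASK32

def lsr(value, shift):
    """LSR: shift right, filling with zeros (unsigned divide by 2**shift)."""
    return (value & MASK32) >> shift

def asr(value, shift):
    """ASR: shift right, replicating the sign bit (signed divide, rounding down)."""
    value &= MASK32
    if value & 0x80000000:
        value -= 1 << 32          # reinterpret the bits as a negative number
    return (value >> shift) & MASK32

print(lsl(5, 3))                  # 40, i.e. 5 * 8
print(lsr(40, 3))                 # 5, i.e. 40 // 8
print(hex(asr(0xFFFFFFF0, 2)))    # 0xfffffffc, i.e. -16 // 4 == -4
print(hex(lsr(0xFFFFFFF0, 2)))    # 0x3ffffffc, the sign is not preserved
```

The last two lines show why the choice between LSR and ASR matters as soon as the value may be negative.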
Load/Store Instructions
These load data from memory into registers or store data from registers to memory. Some examples:
- LDR Rt, [Rn, #offset] – Load word
- STR Rt, [Rn, #offset] – Store word
- LDMIA Rn!, {Rlist} – Load multiple
- STMIA Rn!, {Rlist} – Store multiple
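Single-register loads use LDR and stores use STR; LDM and STM transfer several registers at once. Square brackets enclose the address, which is calculated from the base register Rn and an optional offset. The ! suffix writes the updated address back to Rn.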
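The address calculation and the write-back behavior can be sketched in Python. This is a toy model under stated assumptions: memory is a hypothetical 64-byte RAM starting at address 0, data is little-endian, and the helper names (`str_word`, `ldr_word`, `ldmia`) are invented for the example. It does not model alignment rules or the case where the base register also appears in the register list.

```python
import struct

memory = bytearray(64)   # hypothetical RAM, addresses 0..63
regs = [0] * 16          # R0-R15

def str_word(rt, rn, offset=0):
    """STR Rt, [Rn, #offset]: store the word in Rt at address Rn + offset."""
    struct.pack_into("<I", memory, regs[rn] + offset, regs[rt] & 0xFFFFFFFF)

def ldr_word(rt, rn, offset=0):
    """LDR Rt, [Rn, #offset]: load the word at address Rn + offset into Rt."""
    regs[rt] = struct.unpack_from("<I", memory, regs[rn] + offset)[0]

def ldmia(rn, reg_list):
    """LDMIA Rn!, {reg_list}: load consecutive words, then write back to Rn."""
    addr = regs[rn]
    for r in sorted(reg_list):       # lowest register gets the lowest address
        regs[r] = struct.unpack_from("<I", memory, addr)[0]
        addr += 4
    regs[rn] = addr                  # the "!" write-back

regs[0] = 8
regs[1] = 0xDEADBEEF
str_word(1, 0, 4)        # STR R1, [R0, #4]  -> writes address 12
ldr_word(2, 0, 4)        # LDR R2, [R0, #4]
ldmia(0, [3, 4])         # LDMIA R0!, {R3, R4}
print(hex(regs[2]), hex(regs[4]), regs[0])   # 0xdeadbeef 0xdeadbeef 16
```

Note that the offset form leaves Rn unchanged, while LDMIA with write-back advances it by four bytes per register loaded.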
Branch Instructions
These alter the flow of program execution by changing the program counter conditionally or unconditionally:
- B label – Branch to label
- BX Rm – Branch to the address in Rm (bit 0 must be 1, because the Cortex-M4 only executes Thumb code)
- BL label – Branch with link to label
- BLX Rm – Branch with link to the address in Rm (the same bit 0 rule applies)
Conditional variants test the condition flags in the APSR before branching:
- BEQ label – Branch if equal
- BNE label – Branch if not equal
- BGT label – Branch if greater than (signed)
- BLT label – Branch if less than (signed)
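How a preceding CMP feeds these conditional branches can be sketched with a small Python model. It computes the N, Z, C, and V flags that a 32-bit subtraction would produce and then evaluates the four conditions listed above; the function and table names are assumptions for this illustration.

```python
def cmp_flags(rn, operand):
    """Return the (N, Z, C, V) flags that CMP Rn, operand would set."""
    rn &= 0xFFFFFFFF
    operand &= 0xFFFFFFFF
    result = (rn - operand) & 0xFFFFFFFF
    n = result >> 31                      # negative: bit 31 of the result
    z = int(result == 0)                  # zero
    c = int(rn >= operand)                # carry set means "no borrow"
    # signed overflow: operands differ in sign and the result's sign differs from Rn's
    v = ((rn ^ operand) & (rn ^ result)) >> 31
    return n, z, c, v

CONDITIONS = {
    "EQ": lambda n, z, c, v: z == 1,
    "NE": lambda n, z, c, v: z == 0,
    "GT": lambda n, z, c, v: z == 0 and n == v,   # signed greater than
    "LT": lambda n, z, c, v: n != v,              # signed less than
}

def branch_taken(cond, rn, operand):
    """Would B<cond> be taken right after CMP Rn, operand?"""
    return CONDITIONS[cond](*cmp_flags(rn, operand))

print(cmp_flags(5, 5))                      # (0, 1, 1, 0)
print(branch_taken("GT", 5, 3))             # True
print(branch_taken("LT", 0xFFFFFFFF, 1))    # True: 0xFFFFFFFF is -1 when signed
```

GT and LT combine N with V so that the comparison stays correct even when the subtraction overflows, for example when comparing 0x80000000 with 1.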
Supervisor Call Instruction
SVC triggers the SVCall exception, whose handler runs in privileged Handler mode:
- SVC #imm – Make supervisor call
It is used by unprivileged code to request OS kernel services (system calls).
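The immediate is not passed to the handler in a register; a handler that needs it usually reads it back from the SVC instruction itself, which sits two bytes before the stacked return address. The Python sketch below models that lookup on a byte string standing in for code memory. The 16-bit SVC encoding (0xDF00 plus an 8-bit immediate) comes from the Thumb instruction set; the function names and the sample code bytes are made up for the example.

```python
def encode_svc(imm8):
    """Encode SVC #imm8 as a 16-bit Thumb instruction."""
    if not 0 <= imm8 <= 0xFF:
        raise ValueError("SVC immediate must fit in 8 bits")
    return 0xDF00 | imm8

def svc_number(halfword):
    """Extract the immediate from an SVC instruction halfword."""
    if (halfword & 0xFF00) != 0xDF00:
        raise ValueError("not an SVC instruction")
    return halfword & 0xFF

def svc_number_from_return_address(code, return_address):
    """Read the SVC immediate as a handler would, from return_address - 2."""
    halfword = int.from_bytes(code[return_address - 2:return_address], "little")
    return svc_number(halfword)

# Hypothetical code image: NOP (0xBF00), SVC #42 (0xDF2A), BX LR (0x4770)
code = bytes([0x00, 0xBF, 0x2A, 0xDF, 0x70, 0x47])
print(hex(encode_svc(42)))                       # 0xdf2a
print(svc_number_from_return_address(code, 4))   # 42
```

On real hardware the return address would be taken from the exception stack frame; here it is simply passed in as an offset into `code`.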
Coprocessor Instructions
These provide access to optional coprocessors such as the floating-point unit:
- VMOV s0, r1 – Copy a core register to an FPU register
- VLDR s0, [r1] – Load FPU register from memory
- VADD s0, s1, s2 – Add FPU registers
They allow high-performance single-precision math on Cortex-M4F cores, which include the FPU.
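Two details are easy to miss: VMOV copies the 32 raw bits without converting an integer to a float, and VADD on single-precision registers rounds its result to 32 bits. The Python sketch below illustrates both using the standard `struct` module to emulate single precision. The function names are assumptions, and the model ignores FPU status flags, flush-to-zero mode, and results too large for single precision (where `struct.pack` raises `OverflowError`).

```python
import struct

def vmov_core_to_s(bits):
    """VMOV Sn, Rt: reinterpret 32 raw bits as a single-precision value."""
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFFFFFF))[0]

def to_f32(x):
    """Round a Python float (double precision) to single precision."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def vadd_f32(a, b):
    """VADD.F32: add two single-precision values, rounding the result."""
    return to_f32(to_f32(a) + to_f32(b))

print(vmov_core_to_s(0x3F800000))   # 1.0, the bit pattern of float 1.0
print(vmov_core_to_s(1) == 1.0)     # False: integer 1 is a tiny denormal as a float
print(vadd_f32(1.5, 2.25))          # 3.75
print(vadd_f32(16777216.0, 1.0))    # 16777216.0, the 1 is lost in 24 bits of precision
```

Converting an integer value to floating point requires a separate VCVT instruction after the VMOV.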
Miscellaneous Instructions
Some other instructions:
- NOP – No operation
- WFI – Wait for interrupt
- SEV – Send event
- ISB – Instruction synchronization barrier
- MRS – Move to general register from special register
- MSR – Move to special register from general register
They are used for hints, synchronization, and reading or writing special registers.
Thumb-2 Instruction Encoding
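Thumb-2 uses a variable-length encoding: each instruction is either 2 or 4 bytes long.
- 16-bit format – BX, CBZ, and the narrow forms of common instructions such as MOV, ADD, LDR, and STR
- 32-bit format – BL, and the wide forms (for example LDR.W and STR.W) that allow larger immediates and more register choices
The opcode bits specify the operation and, for many data-processing instructions, whether the condition flags are updated; the operands are encoded in the remaining bits. Apart from conditional branches, instructions are made conditional by a preceding IT instruction rather than by a condition field in their own encoding.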
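This gives code density close to the original 16-bit Thumb encoding while retaining performance close to the 32-bit ARM instruction set. The 1.6x improvement over the earlier Thumb encoding quoted for Thumb-2 depends on the workload and compiler.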
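A decoder can tell the two formats apart from the first halfword alone: if its top five bits are 0b11101, 0b11110, or 0b11111, the instruction is 32 bits long, otherwise it is 16 bits long. The Python sketch below applies that rule to walk a short byte sequence; the function names and the sample bytes are assumptions for the example.

```python
def thumb_instruction_length(first_halfword):
    """Return the instruction size in bytes, given its first halfword."""
    top5 = (first_halfword >> 11) & 0x1F
    return 4 if top5 in (0b11101, 0b11110, 0b11111) else 2

def split_instructions(code):
    """Split little-endian Thumb code into (offset, length) pairs."""
    result = []
    pos = 0
    while pos + 2 <= len(code):
        first = int.from_bytes(code[pos:pos + 2], "little")
        length = thumb_instruction_length(first)
        result.append((pos, length))
        pos += length
    return result

# Hypothetical code image: NOP (0xBF00), BL (0xF000 0xF800), BX LR (0x4770)
code = bytes([0x00, 0xBF, 0x00, 0xF0, 0x00, 0xF8, 0x70, 0x47])
print(split_instructions(code))             # [(0, 2), (2, 4), (6, 2)]
print(thumb_instruction_length(0xE7FE))     # 2: a 16-bit unconditional branch
```

Because the length is known after reading two bytes, instructions only need halfword alignment and the two formats can be mixed freely.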
Instruction Set Summary
In summary, the Cortex-M4 instruction set provides:
- Efficient RISC architecture
- Low-latency interrupt handling
- Thumb-2 variable-length encoding
- Single-cycle multiplies
- Optional single-precision floating point
- Efficient branch and control
Together with the ARMv7-M exception model, this makes the instruction set well suited to embedded microcontrollers that need low cost, low power consumption, and real-time responsiveness. The mix of 16- and 32-bit Thumb-2 encodings provides both good performance and the high code density that matters when flash memory is limited.