Thumb-2 is an extension of the 16-bit Thumb instruction set that adds 32-bit instruction encodings. Arm Cortex-M series microcontrollers execute Thumb instructions exclusively and do not support the classic 32-bit ARM instruction set. Thumb-2 provides both 16-bit and 32-bit instructions to improve code density while maintaining high performance. The key features of Thumb-2 instructions are:
- 16-bit and 32-bit instructions intermixed in Thumb-2 code
- Common operations have compact 16-bit encodings; 32-bit encodings cover the operations, registers and immediates that do not fit in 16 bits
- Provides Cortex-M cores with both high code density and high performance
- Upwards compatible extension of previous 16-bit Thumb instruction set
- Supports unconditional and conditional branch instructions
- Includes support for interrupts, coprocessor instructions and specialized control instructions
Background
The original Thumb instruction set was designed as a 16-bit compressed subset of the 32-bit ARM instruction set used in earlier Arm processors. Thumb provides higher code density by using 16-bit instructions instead of 32-bit instructions, which typically reduces code size by around 30-40%. However, a 16-bit instruction can express less than a 32-bit one: most Thumb instructions reach only registers R0-R7, immediates are small, and only branches can be conditional. Thumb code therefore often needs more instructions for the same work and runs slower than equivalent ARM code.
To provide both high code density and high performance, Arm designed the Thumb-2 instruction set based on a variable-length encoding that allows both 16-bit and 32-bit instructions to be intermixed freely. This provides the code density benefits of 16-bit Thumb with the performance benefits of 32-bit ARM instructions in a unified instruction set architecture. The Thumb-2 instruction set is utilized in Arm Cortex-M series processors like Cortex-M3, Cortex-M4, Cortex-M7, etc.
Main Features of Thumb-2
Here are some of the main features of the Thumb-2 instruction set architecture:
1. 16-bit and 32-bit instructions intermixed
The defining feature of Thumb-2 is the ability to freely intermix 16-bit and 32-bit instructions within the same code stream, with no mode switch between them. This maintains backwards compatibility with previous Thumb code while allowing use of 32-bit instructions for higher performance.
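2. Compact 16-bit encodings for common operations
The most frequently used operations have 16-bit Thumb-2 encodings, usually restricted to the low registers R0-R7 and to small immediates. The 32-bit Thumb-2 encodings provide almost all of the functionality of the 32-bit ARM instruction set, so an operation that does not fit a 16-bit encoding can still be expressed in a single instruction. Some less frequently used operations exist only in 32-bit form.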
3. High code density and high performance
By combining 16-bit and 32-bit instructions, Thumb-2 provides both high code density through use of 16-bit instructions and high performance by using 32-bit instructions when needed. This delivers the best of both worlds.
4. Upwards compatible extension of previous Thumb ISA
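The 16-bit instruction encodings are carried over from the previous Thumb ISA, so Thumb-2 remains backwards compatible at the instruction level. Existing Thumb routines can generally run unchanged on Cortex-M processors that support Thumb-2. Code that switches to ARM state, or that relies on the classic ARM exception model, does need changes, because Cortex-M processors execute Thumb code only.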
5. Branch instructions
Thumb-2 provides both conditional and unconditional branch instructions to efficiently control program flow. Both 16-bit and 32-bit branch instructions are available.
6. Interrupt and exception support
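Thumb-2 on Cortex-M includes instructions that support the low-latency exception and interrupt handling required for embedded applications. Exception entry and return are performed largely by hardware: the processor stacks registers automatically, and a handler returns with an ordinary branch or pop that loads a special EXC_RETURN value into the PC. Related instructions include SVC (supervisor call), CPSID/CPSIE (mask and unmask interrupts), and WFI/WFE (wait for interrupt or event).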
7. Coprocessor and specialized control instructions
Thumb-2 provides instructions for coprocessor control and communication along with specialized control instructions. These provide capabilities like setting privileged/unprivileged execution modes, memory barrier instructions, and no operation (NOP) instructions.
16-bit vs 32-bit Thumb-2 Instructions
Many common operations have both a 16-bit and a 32-bit Thumb-2 encoding, while others exist only in 32-bit form. The main differences are:
- 16-bit T2 instructions: Basic register-register operations, short branches, and loads/stores with small offsets. Most are limited to the low registers R0-R7 and to small immediates. Higher code density.
- 32-bit T2 instructions: Operations such as long multiplies, multiply-accumulate, divides, bit-field operations, and floating point. Also used when an operation needs high registers, larger immediates, or control over flag setting.
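Guidelines for Thumb-2 code density vs performance:
- Prefer 16-bit T2 instructions where they express the operation directly
- Use a 32-bit instruction when it replaces two or more 16-bit instructions
- Mixing 16-bit and 32-bit instructions carries no mode-switch penalty, so there is no need to group 32-bit instructions into separate blocks or subroutines
- In practice, let the compiler or assembler choose the encoding; in unified assembler syntax the .N and .W suffixes force a 16-bit or 32-bit encoding when needed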
Thumb-2 Instruction Encodings
The different types of Thumb-2 instruction encodings are:
- 16-bit T2 encoding – For common operations; the upper bits of the halfword select the instruction class
- 32-bit T2 encoding – For operations that do not fit in 16 bits; bits [15:11] of the first halfword are 11101, 11110 or 11111
- Branches – Conditional and unconditional branches, 16 or 32-bit forms
- Load/Store – Variety of addressing modes, with 5-bit, 8-bit or 12-bit immediate offsets depending on the encoding
- Miscellaneous Control – Hint instructions, exception return, debug hint, etc.
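Within a Thumb-2 code sequence, the processor determines whether an instruction is 16-bit or 32-bit from bits [15:11] of its first halfword: the values 11101, 11110 and 11111 indicate a 32-bit instruction, and any other value indicates a 16-bit instruction. This allows 16-bit and 32-bit T2 instructions to be intermixed seamlessly.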
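The length check can be sketched in a few lines of Python. This is an illustration only, not a disassembler: the function names are made up for this example, and the rule it applies is that a first halfword whose bits [15:11] are 11101, 11110 or 11111 starts a 32-bit instruction.

```python
def thumb_instruction_length(first_halfword: int) -> int:
    """Return the instruction length in bytes (2 or 4) from its first halfword."""
    top5 = (first_halfword >> 11) & 0x1F
    # 0b11101, 0b11110 and 0b11111 mark the first half of a 32-bit instruction.
    return 4 if top5 in (0b11101, 0b11110, 0b11111) else 2


def split_instructions(halfwords):
    """Group a list of halfwords into (byte offset, length, raw value) tuples."""
    result = []
    i = 0
    while i < len(halfwords):
        hw = halfwords[i]
        if thumb_instruction_length(hw) == 4:
            if i + 1 >= len(halfwords):
                raise ValueError("truncated 32-bit instruction")
            result.append((i * 2, 4, (hw << 16) | halfwords[i + 1]))
            i += 2
        else:
            result.append((i * 2, 2, hw))
            i += 1
    return result


# Example stream: ADDS r0, r1, r2 ; LDR.W r0, [r1, #4] ; NOP
stream = [0x1888, 0xF8D1, 0x0004, 0xBF00]
for offset, length, raw in split_instructions(stream):
    print(f"+{offset}: {length} bytes, 0x{raw:0{length * 2}X}")
```

Because the decision depends only on the first halfword, a decoder never needs to know what came before in order to find the next instruction boundary, provided it starts at a valid boundary.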
16-bit Thumb-2 Instruction Encoding
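There is no single format shared by all 16-bit Thumb-2 instructions; the field layout depends on the instruction class. A typical three-register data-processing format, used for example by ADDS Rd, Rn, Rm, is:
- Bits [15:9]: opcode (0001100 for ADDS)
- Bits [8:6]: Rm
- Bits [5:3]: Rn
- Bits [2:0]: Rd
In this format:
- The upper bits determine the instruction class and the operation
- The Rm, Rn and Rd fields are 3 bits wide, so they encode only the low registers R0-R7
- Other 16-bit formats trade register fields for immediates, shift amounts or condition codes as needed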
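As a concrete case, the 16-bit ADDS Rd, Rn, Rm encoding places the fixed bits 0001100 in bits [15:9], followed by three 3-bit register fields. The sketch below packs and unpacks that one encoding; the helper names are hypothetical and it makes no attempt to cover other 16-bit formats.

```python
ADDS_REG_OPCODE = 0b0001100  # bits [15:9] of the 16-bit ADDS Rd, Rn, Rm encoding


def encode_adds_reg(rd: int, rn: int, rm: int) -> int:
    """Pack ADDS Rd, Rn, Rm into a single 16-bit halfword."""
    for reg in (rd, rn, rm):
        if not 0 <= reg <= 7:
            # 3-bit register fields can only name the low registers.
            raise ValueError("16-bit ADDS only reaches r0-r7")
    return (ADDS_REG_OPCODE << 9) | (rm << 6) | (rn << 3) | rd


def decode_adds_reg(halfword: int) -> dict:
    """Unpack the register fields, checking the fixed opcode bits first."""
    if (halfword >> 9) != ADDS_REG_OPCODE:
        raise ValueError("not a 16-bit ADDS (register) encoding")
    return {
        "rd": halfword & 0x7,
        "rn": (halfword >> 3) & 0x7,
        "rm": (halfword >> 6) & 0x7,
    }


encoded = encode_adds_reg(rd=0, rn=1, rm=2)   # ADDS r0, r1, r2
print(f"0x{encoded:04X}", decode_adds_reg(encoded))
```

The range check makes the density trade-off visible: an addition involving R8 or above cannot use this form and needs a different encoding.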
32-bit Thumb-2 Instruction Encoding
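As with the 16-bit encodings, there is no single format shared by all 32-bit Thumb-2 instructions. A 32-bit instruction is stored as two consecutive halfwords, and bits [15:11] of the first halfword are always 11101, 11110 or 11111. As an example, LDR.W Rt, [Rn, #imm12] is laid out as follows, with bits [31:16] being the first halfword:
- Bits [31:20]: 111110001101 (fixed opcode bits)
- Bits [19:16]: Rn
- Bits [15:12]: Rt
- Bits [11:0]: imm12
In this format:
- The fixed bits in the first halfword determine the instruction type
- Rn and Rt are 4 bits wide, so they can name any of R0-R15
- imm12 provides a 12-bit unsigned immediate offset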
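One 32-bit encoding that follows a "fixed bits, Rn, destination register, 12-bit immediate" layout is LDR.W Rt, [Rn, #imm12], whose first halfword is 0xF8D0 combined with Rn and whose second halfword holds Rt and the offset. The following sketch encodes and decodes only that instruction; the function names are illustrative and not part of any toolchain.

```python
def encode_ldr_w_imm12(rt: int, rn: int, imm12: int):
    """Return (first halfword, second halfword) for LDR.W Rt, [Rn, #imm12]."""
    if not 0 <= rt <= 15:
        raise ValueError("Rt must be r0-r15")
    if not 0 <= rn <= 14:
        # Rn = 15 selects the PC-relative (literal) form, not handled here.
        raise ValueError("Rn must be r0-r14 for this encoding")
    if not 0 <= imm12 <= 0xFFF:
        raise ValueError("offset must fit in 12 unsigned bits")
    first = 0xF8D0 | rn            # fixed bits 1111 1000 1101 followed by Rn
    second = (rt << 12) | imm12    # Rt in bits [15:12], offset in bits [11:0]
    return first, second


def decode_ldr_w_imm12(first: int, second: int) -> dict:
    """Unpack the fields, checking the fixed bits of the first halfword."""
    if (first & 0xFFF0) != 0xF8D0:
        raise ValueError("not an LDR.W (immediate, 12-bit offset) encoding")
    return {"rt": second >> 12, "rn": first & 0xF, "imm12": second & 0xFFF}


hw1, hw2 = encode_ldr_w_imm12(rt=0, rn=1, imm12=4)   # LDR.W r0, [r1, #4]
print(f"{hw1:04X} {hw2:04X}", decode_ldr_w_imm12(hw1, hw2))
```

Compared with a 16-bit load, the wider register fields and the 12-bit offset are what the extra halfword buys.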
Branch Instructions
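Thumb-2 supports conditional and unconditional branch instructions:
- B: Unconditional branch (16-bit or 32-bit)
- Bcond (for example BEQ, BNE): Conditional branch (16-bit or 32-bit)
- CBZ: Compare and branch if zero (16-bit only)
- CBNZ: Compare and branch if not zero (16-bit only)
- BL/BLX: Branch with link to subroutine (BL is 32-bit; on Cortex-M, BLX takes a register operand and is 16-bit)
Maximum branch displacements depend on the encoding: about +-16MB for 32-bit unconditional branches (B.W and BL), +-1MB for 32-bit conditional branches, +-2KB for 16-bit unconditional branches, and -256 to +254 bytes for 16-bit conditional branches. CBZ and CBNZ branch forward only, with an offset of at most 126 bytes. Branch targets must be 2-byte aligned.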
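To show how displacement limits drive the choice between a 16-bit and a 32-bit branch, here is a small sketch of the decision an assembler makes. It assumes the commonly documented ranges for the B encodings, measured from the instruction address plus 4: -256 to +254 bytes (16-bit conditional), -2048 to +2046 (16-bit unconditional), about +-1MB (32-bit conditional) and about +-16MB (32-bit unconditional). The function name is hypothetical.

```python
# (size in bytes, lowest offset, highest offset), smallest encoding first.
CONDITIONAL_RANGES = [(2, -256, 254), (4, -1048576, 1048574)]
UNCONDITIONAL_RANGES = [(2, -2048, 2046), (4, -16777216, 16777214)]


def branch_encoding_size(src: int, dst: int, conditional: bool):
    """Return 2 or 4 for the smallest B encoding that reaches dst, else None.

    src is the address of the branch instruction and dst the target address.
    Offsets are taken relative to src + 4, the PC value seen by the branch.
    """
    if dst % 2 != 0:
        raise ValueError("branch targets must be 2-byte aligned")
    offset = dst - (src + 4)
    ranges = CONDITIONAL_RANGES if conditional else UNCONDITIONAL_RANGES
    for size, low, high in ranges:
        if low <= offset <= high:
            return size
    return None  # out of range: needs an indirect branch through a register


print(branch_encoding_size(0x1000, 0x1040, conditional=True))    # short hop
print(branch_encoding_size(0x1000, 0x2000, conditional=True))    # too far for 16-bit
print(branch_encoding_size(0x1000, 0x2000, conditional=False))   # too far for 16-bit
```

A target that falls outside even the 32-bit range is typically reached by loading the address into a register and using BX or BLX.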
Load/Store Instructions
Thumb-2 contains extensive load/store instructions with flexible addressing modes:
- LDR: Load register from memory
- STR: Store register to memory
Addressing modes supported:
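- Offset – Address is base register + immediate offset; the base register is unchanged
- Pre-indexed – Address is base register + immediate offset, and the base register is updated to that address
- Post-indexed – Address is the base register alone; the base register is then updated by adding the offset
The available immediate width depends on the encoding: 16-bit encodings use a 5-bit scaled offset (8-bit for the SP-relative and PC-relative forms), while 32-bit encodings offer a 12-bit unsigned offset, or an 8-bit offset for the negative, pre-indexed and post-indexed forms.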
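The three addressing modes differ only in which address is accessed and whether the base register is written back. A minimal model, using assembler syntax in the comments for orientation and a hypothetical helper name, might look like this:

```python
MASK32 = 0xFFFFFFFF  # registers and addresses are 32 bits wide


def effective_address(mode: str, base: int, offset: int):
    """Return (address accessed, base register value after the instruction)."""
    if mode == "offset":        # LDR Rt, [Rn, #offset]
        return (base + offset) & MASK32, base
    if mode == "pre":           # LDR Rt, [Rn, #offset]!
        updated = (base + offset) & MASK32
        return updated, updated
    if mode == "post":          # LDR Rt, [Rn], #offset
        return base, (base + offset) & MASK32
    raise ValueError(f"unknown addressing mode: {mode}")


for mode in ("offset", "pre", "post"):
    address, new_base = effective_address(mode, base=0x20000000, offset=8)
    print(f"{mode:>6}: access 0x{address:08X}, base becomes 0x{new_base:08X}")
```

Post-indexed addressing suits walking through an array, since each access uses the current pointer and then advances it in the same instruction.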
Cortex-M Processor Support for Thumb-2
Here are some examples of Cortex-M series processors and their Thumb-2 support:
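- Cortex-M3: Implements Armv7-M and supports the full base Thumb-2 instruction set
- Cortex-M4: Supports Thumb-2 + DSP extensions, with an optional single-precision floating-point unit
- Cortex-M7: Supports Thumb-2 + DSP extensions, with an optional single- or double-precision floating-point unit
- Cortex-M23/M33: Implement Armv8-M. Cortex-M23 (Baseline) supports a subset of Thumb-2, while Cortex-M33 (Mainline) supports the full set with optional DSP and floating-point extensions. Both can include the TrustZone security extension
Cortex-M3, M4, M7 and M33 support the complete base Thumb-2 instruction set. Smaller cores such as Cortex-M0, M0+ and M23 implement mostly 16-bit Thumb instructions plus a small number of 32-bit ones. Cores like Cortex-M4/M7 add Thumb-2 extensions for DSP and floating point.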
Benefits of Using Thumb-2
Here are some of the benefits provided by using Thumb-2 instructions in Arm Cortex-M cores:
- Higher code density reduces code size compared to 32-bit ARM
- Intermixing 16-bit and 32-bit instructions provides both small code and high performance
- Efficient branch and load/store instructions for embedded applications
- Compatible upgrade from previous Thumb instruction set
- All Cortex-M cores execute Thumb instructions, although the smallest cores (such as Cortex-M0/M0+) implement only a subset of Thumb-2
- Allows building of efficient and compact firmware for microcontrollers
Overall, by combining high code density 16-bit instructions with high performance 32-bit instructions in a flexible variable-length encoding, Thumb-2 provides an instruction set architecture that is well-tailored for embedded and microcontroller applications requiring both compact code size and excellent performance.