The Thumb instruction set is a compressed variant of the ARM instruction set, introduced with the ARMv4T architecture. It reduces code size by using 16-bit instructions instead of 32-bit instructions. Thumb code is typically 30-40% smaller than equivalent ARM code, which lowers memory requirements and can improve performance in embedded systems with narrow or slow instruction memory.
Overview of Thumb
In the 32-bit ARM instruction set, all instructions are 32 bits long. This provides flexibility and power, but also results in larger code size. With Thumb, most instructions are 16 bits long, reducing code size while still providing good performance for most tasks. The processor tracks which instruction set is active in the T bit of the current program status register (CPSR), and software switches between ARM and Thumb with interworking branches such as BX, which take the new state from bit 0 of the target address.
The Thumb instruction set maintains most of the features of the ARM set, with limitations on conditional execution and shifts: in the original 16-bit set only branches are conditional, and a shift is a separate instruction rather than a modifier on another operation. Thumb sees the same 16 general purpose 32-bit registers as ARM, but most instructions can address only the low registers R0-R7, and fewer instructions are available. Memory access instructions, branch instructions, and basic ALU operations are present, providing ample functionality for most embedded applications.
Thumb code runs on the same 32-bit core as ARM code. Registers, the ALU, and the barrel shifter remain 32 bits wide; only the instruction encoding is narrower. On the ARM7TDMI, for example, the decode stage expands each Thumb instruction into its ARM equivalent before execution. Thumb usually needs more instructions than ARM for the same work, but on systems with 16-bit or slow instruction memory it can match or exceed ARM performance because each fetch delivers a whole instruction.
Main Features of Thumb
Here are some of the main features of the Thumb instruction set:
- 16-bit instruction length – Most instructions are halfword (16-bit) in size.
- High code density – Typically provides 30-40% code size reduction compared to 32-bit ARM code.
- Access to ARM registers – Most instructions operate on the low registers R0-R7; the high registers R8-R15 are reachable through a few instructions such as MOV, ADD, CMP, and BX.
- Load/store architecture – Uniform and systematic memory access instructions like ARM.
- Conditional execution – Only branches are conditional in the original 16-bit set; other instructions always execute.
- PC-relative branches – Branch instructions use PC-relative offsets.
- Stack operations – Push, pop, link register (LR) stacking supported.
- ALU instructions – Provides basic arithmetic, logical, and comparison operations.
- Shifts and rotates – Available as separate instructions (LSL, LSR, ASR, ROR) rather than as a shifted operand on other instructions.
- Interworking with ARM – Switching between ARM and Thumb states through interworking branches.
This core subset of instructions allows Thumb code to be very compact while still supporting most operations needed for embedded C programs and compiler output. Combining 16-bit Thumb code with 32-bit ARM code where it matters strikes a good balance between size and performance.
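The PC-relative branch feature can be made concrete with a small calculation. The sketch below assumes the 16-bit unconditional branch encoding (top five bits 11100 followed by an 11-bit signed offset counted in halfwords) and the Thumb convention that the PC reads as the instruction address plus 4. The function names are illustrative only.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of `value` as a two's-complement number."""
    value &= (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    return (value ^ sign_bit) - sign_bit


def thumb_branch_target(instr_addr: int, halfword: int) -> int:
    """Return the target of a 16-bit unconditional Thumb branch."""
    if (halfword >> 11) != 0b11100:
        raise ValueError("not a 16-bit unconditional Thumb branch")
    # The 11-bit field counts halfwords, so double it to get a byte offset.
    offset = sign_extend(halfword & 0x7FF, 11) * 2
    # In Thumb state the PC reads as the instruction address plus 4.
    return instr_addr + 4 + offset


# 0xE7FE encodes an offset of -2 halfwords, i.e. a branch to itself.
print(hex(thumb_branch_target(0x08000100, 0xE7FE)))
```

Because the field is only 11 bits wide, this form reaches roughly 2 KB in either direction; longer jumps need a different encoding.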
16-bit Thumb Instruction Formats
Thumb instructions use several formats to encode operations within 16 bits. The ARM7TDMI documentation lists 19 of them; four representative layouts are:
- Shift by immediate – 3-bit prefix (000), 2-bit opcode, 5-bit shift amount, 3-bit Rs, 3-bit Rd.
- Move/compare/add/subtract immediate – 3-bit prefix (001), 2-bit opcode, 3-bit Rd, 8-bit immediate.
- Register ALU operation – 6-bit prefix (010000), 4-bit opcode, 3-bit Rs, 3-bit Rd.
- Unconditional branch – 5-bit opcode (11100), 11-bit signed offset.
The shift layout's 5-bit field covers shift amounts from 0 to 31. The immediate layout fits a constant from 0 to 255 alongside a single register, which serves as both source and destination. The register ALU layout names two low registers, with Rd doubling as the first operand. The branch layout stores its offset in halfwords, giving a range of roughly 2 KB in either direction.
Register fields are 3 bits wide, which is why most instructions reach only R0-R7. Additional special-case formats exist for PC-relative loads, high-register operations, stack push and pop, and software interrupts.
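As an illustration of how such a 16-bit layout packs its fields, the sketch below encodes and decodes the move/compare/add/subtract-immediate form as documented for the ARM7TDMI: prefix 001, a 2-bit opcode, a 3-bit register, and an 8-bit immediate. The helper names `encode_imm8` and `decode_imm8` are hypothetical, chosen for this example.

```python
# Opcode values for the immediate form: 0=MOV, 1=CMP, 2=ADD, 3=SUB.
OPS = ("MOV", "CMP", "ADD", "SUB")


def encode_imm8(op: str, rd: int, imm8: int) -> int:
    """Pack an immediate-form Thumb instruction into a 16-bit halfword."""
    if op not in OPS:
        raise ValueError(f"unsupported operation: {op}")
    if not 0 <= rd <= 7:
        raise ValueError("only low registers R0-R7 fit the 3-bit field")
    if not 0 <= imm8 <= 255:
        raise ValueError("immediate must fit in 8 bits")
    return (0b001 << 13) | (OPS.index(op) << 11) | (rd << 8) | imm8


def decode_imm8(halfword: int) -> tuple[str, int, int]:
    """Unpack (operation, register, immediate) from an immediate-form halfword."""
    if (halfword >> 13) != 0b001:
        raise ValueError("not an immediate-form instruction")
    op = OPS[(halfword >> 11) & 0b11]
    rd = (halfword >> 8) & 0b111
    return op, rd, halfword & 0xFF


print(hex(encode_imm8("MOV", 0, 1)))   # 0x2001
print(decode_imm8(0x3105))             # ('ADD', 1, 5)
```

The range checks mirror the field widths: a register number above 7 or a constant above 255 simply has no encoding in this form.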
32-bit Thumb-2 Instruction Extensions
Later revisions of the Thumb instruction set known as Thumb-2 added some 32-bit instructions to improve performance and functionality:
- Branches – Long PC-relative branches with 24-bit offsets.
- Load/Store – Wider load/store instructions with larger immediate offsets.
- ALU – 32-bit arithmetic/logical instructions with more operands.
- Multiply – 32-bit multiply instructions for better performance.
- Coprocessor – Improved coprocessor control instructions.
These 32-bit Thumb-2 instructions augment the space-saving 16-bit Thumb instruction set with more powerful operations commonly needed for performance. They allow Thumb code to avoid having to switch to ARM state as frequently.
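Because 16-bit and 32-bit encodings share one instruction stream, a decoder has to tell them apart from the first halfword. In Thumb-2, a first halfword whose top five bits are 11101, 11110, or 11111 begins a 32-bit instruction; anything else is a complete 16-bit instruction. The sketch below applies that rule; the function names and the sample stream are illustrative.

```python
def thumb2_instruction_size(first_halfword: int) -> int:
    """Return the instruction length in bytes implied by its first halfword."""
    top5 = (first_halfword >> 11) & 0x1F
    return 4 if top5 in (0b11101, 0b11110, 0b11111) else 2


def split_instructions(halfwords: list[int]) -> list[tuple[int, ...]]:
    """Group a stream of halfwords into 16-bit and 32-bit instructions."""
    result = []
    i = 0
    while i < len(halfwords):
        if thumb2_instruction_size(halfwords[i]) == 4:
            if i + 1 >= len(halfwords):
                raise ValueError("stream ends inside a 32-bit instruction")
            result.append((halfwords[i], halfwords[i + 1]))
            i += 2
        else:
            result.append((halfwords[i],))
            i += 1
    return result


# MOVS r0, #1 ; BL (two halfwords) ; BX lr
stream = [0x2001, 0xF000, 0xF800, 0x4770]
print([tuple(hex(h) for h in ins) for ins in split_instructions(stream)])
```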
Interworking Between Thumb and ARM
To allow mixing Thumb and ARM code, processors with Thumb support provide interworking capabilities to switch states smoothly:
- The T bit in the CPSR indicates Thumb (1) or ARM (0) execution state.
- BX Rm copies bit 0 of the register into the T bit and branches to the address with that bit cleared.
- Plain B and BL branches stay in the current state; the address bit is only examined by interworking instructions.
- BLX, available from ARMv5T, allows calling ARM code from Thumb and vice versa.
- On ARMv4T, which lacks BLX, calls across states go through short veneer sequences, typically generated by the linker.
With interworking, Thumb code can be used for the bulk of a program where compactness matters, with ARM code used for speed-critical loops and, on ARMv4T-class cores, for exception handlers, which are always entered in ARM state. Compilers typically select the instruction set per source file or per function, so a program can mix Thumb and ARM code.
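The state change performed by an interworking branch can be modelled in a few lines. This is a minimal sketch, not a processor model: `CpuState` and `bx` are hypothetical names, and the `thumb` field stands in for the T bit in the CPSR. It assumes the architectural behaviour of BX, where bit 0 of the target selects the state and is not part of the fetch address.

```python
from dataclasses import dataclass


@dataclass
class CpuState:
    pc: int = 0
    thumb: bool = False  # stands in for the T bit in the CPSR


def bx(state: CpuState, target: int) -> CpuState:
    """Model BX: bit 0 of the target selects the state, the rest is the address."""
    state.thumb = bool(target & 1)
    state.pc = target & ~1  # the state bit is never part of the fetch address
    return state


cpu = CpuState()
bx(cpu, 0x08000101)          # odd target: enter Thumb state
print(cpu.thumb, hex(cpu.pc))
bx(cpu, 0x00008000)          # even target: return to ARM state
print(cpu.thumb, hex(cpu.pc))
```

This is why function pointers to Thumb code carry an odd address: the low bit is the state request, not a misaligned location.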
Thumb-2 Extensions in ARMv6T2 and ARMv7
The ARMv6T2 architecture introduced Thumb-2 and, with it, more instructions and changes:
- More 32-bit Thumb instructions
- If-Then (IT) instructions for better conditional execution
- Unified Assembler Language (UAL), a common assembly syntax for ARM and Thumb
- Compare-and-branch (CBZ, CBNZ), table-branch (TBB, TBH), and bitfield (BFI, BFC, UBFX, SBFX) instructions
ARMv7 added further enhancements:
- A Thumb-only profile, ARMv7-M, for microcontrollers
- Hardware divide instructions (SDIV, UDIV) in the ARMv7-R and ARMv7-M profiles
- Memory barrier instructions (DMB, DSB, ISB) for multiprocessing
These changes made the Thumb-2 instruction set nearly as capable as ARM in many applications while retaining most of the code size advantage. Most 32-bit ARM processors now provide full Thumb-2 support, and the Cortex-M microcontroller family executes Thumb code only.
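The If-Then instruction is a good example of how Thumb-2 restores conditional execution inside 16 bits. The sketch below builds the IT encoding under the following assumptions about its layout: a fixed upper byte of 0xBF, a 4-bit first condition, and a 4-bit mask in which each extra slot stores the low bit of the condition (for "then") or its inverse (for "else"), followed by a terminating 1. The function name `encode_it` is hypothetical.

```python
CONDITIONS = ("EQ", "NE", "CS", "CC", "MI", "PL", "VS", "VC",
              "HI", "LS", "GE", "LT", "GT", "LE", "AL")


def encode_it(mnemonic: str, cond: str) -> int:
    """Encode an IT instruction such as ('ITE', 'EQ') into a 16-bit halfword."""
    mnemonic = mnemonic.upper()
    suffix = mnemonic[2:]  # the T/E letters after the leading "IT"
    if not mnemonic.startswith("IT") or len(suffix) > 3 or set(suffix) - {"T", "E"}:
        raise ValueError(f"malformed IT mnemonic: {mnemonic}")
    firstcond = CONDITIONS.index(cond.upper())
    if cond.upper() == "AL" and "E" in suffix:
        raise ValueError("an 'else' slot cannot be combined with AL")
    low_bit = firstcond & 1
    mask = 0
    for letter in suffix:
        # 'T' repeats the condition's low bit, 'E' stores its inverse.
        mask = (mask << 1) | (low_bit if letter == "T" else low_bit ^ 1)
    mask = (mask << 1) | 1          # terminating 1 marks the end of the block
    mask <<= 3 - len(suffix)        # left-align within the 4-bit field
    return 0xBF00 | (firstcond << 4) | mask


print(hex(encode_it("IT", "EQ")))    # 0xbf08
print(hex(encode_it("ITE", "EQ")))   # 0xbf0c
print(hex(encode_it("ITT", "NE")))   # 0xbf1c
```

One IT instruction covers up to four following instructions, so short conditional sequences avoid a branch entirely.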
Benefits of Using Thumb
Here are some of the key benefits of using the Thumb instruction set:
- Reduced code size – The main advantage of Thumb is its smaller 16-bit instructions reducing code size compared to 32-bit ARM.
- Lower memory requirements – Less code allows embedded systems to use smaller, lower cost memory and reduce memory bandwidth needs.
- Better performance – Smaller code fits better in instruction caches and needs fewer fetches over narrow memory buses, which can improve overall performance.
- Power efficiency – Fetching less code consumes less memory power, important in battery-powered devices.
- Binary compatibility – Thumb binaries work on all ARMv4T+ processors with interworking.
- Ease of use – Compilers handle Thumb code generation automatically.
For embedded operating systems and RTOS kernels, code size and memory footprint are critical concerns. Thumb provides a compelling way to reduce overhead and improve performance across almost all ARM embedded designs.
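To see what the quoted 30-40% reduction means for a memory budget, the arithmetic can be sketched as below. The function name and the 100 KiB image size are made-up inputs for illustration; real savings depend on the code and the compiler.

```python
def thumb_size_range(arm_bytes: int, low_pct: int = 30, high_pct: int = 40) -> tuple[int, int]:
    """Estimate (smallest, largest) Thumb image size for a given ARM image size."""
    smallest = arm_bytes * (100 - high_pct) // 100
    largest = arm_bytes * (100 - low_pct) // 100
    return smallest, largest


arm_image = 100 * 1024  # hypothetical 100 KiB ARM build
smallest, largest = thumb_size_range(arm_image)
print(f"Thumb estimate: {smallest} to {largest} bytes")
print(f"Saved: {arm_image - largest} to {arm_image - smallest} bytes")
```

For a 100 KiB image this puts the estimate between 60 KiB and 70 KiB, which can be the difference between fitting a part with 64 KiB of flash or needing the next size up.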
Limitations of Thumb
The original 16-bit Thumb set also has some disadvantages and limitations to consider:
- Reduced functionality versus full ARM instruction set.
- Overhead to switch between Thumb and ARM states.
- More instructions needed for some operations, such as building large constants or conditional code.
- Limited support for some DSP algorithms.
- Less suited to compute-heavy application code, where the extra instructions cost speed.
For performance-sensitive code segments, ARM instructions may be preferable. Thumb is best suited for simpler high-volume code where code size matters most.
Conclusion
The Thumb instruction set has become an essential part of most ARM embedded processors. Its 16-bit compressed instructions provide much better code density compared to 32-bit ARM code while retaining good performance. The interworking capabilities allow Thumb and ARM code to be mixed cleanly within programs.
Modern Thumb-2 extensions improve performance for 32-bit operations while maintaining compact code size. Even as memory costs decline, the code size benefits of Thumb remain relevant for optimizing instruction caches, buses, and power efficiency.
For embedded developers, Thumb provides an easy way to reduce system cost and improve performance. ARM’s excellent support for Thumb makes it simple to utilize in new designs for significant benefits.