ARM processors are known for their power efficiency compared to x86 processors typically found in PCs and laptops. There are several architectural and design differences that contribute to the improved energy efficiency of ARM processors:
Reduced Instruction Set Computer (RISC)
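ARM processors use a RISC architecture built around fixed-length instructions and a load/store model, in which only dedicated load and store instructions access memory. Many simple instructions complete in a single clock cycle, although loads, multiplies, and divides can take longer. x86 processors use a Complex Instruction Set Computing (CISC) architecture with variable-length instructions, which modern x86 cores translate into simpler internal micro-operations before execution. The regular encoding of ARM instructions keeps the decode logic small, which saves area and power. It does not by itself guarantee higher instructions per clock, which depends on the microarchitecture.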
Instruction Pipelining
ARM processors pipeline instruction execution, which allows multiple instructions to be in flight at once, each in a different stage. For example, while one instruction is being executed, the next can be decoded and the one after that fetched from memory. This overlap reduces idle time and raises instruction throughput. Pipeline depth varies by core: small efficiency-oriented cores use short pipelines to limit power and the cost of a mispredicted branch, while high-performance cores use deeper ones to reach higher clock speeds.
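The benefit of overlapping stages can be put into numbers with a small sketch. It assumes an idealized pipeline with no stalls and one instruction entering per cycle. The function names and the three-stage fetch/decode/execute split are illustrative and do not describe any specific ARM core.

```python
def cycles_unpipelined(n_instructions: int, n_stages: int) -> int:
    """Each instruction passes through every stage before the next one starts."""
    return n_instructions * n_stages


def cycles_pipelined(n_instructions: int, n_stages: int) -> int:
    """Ideal pipeline: the first instruction fills the pipe, then one completes per cycle."""
    if n_instructions == 0:
        return 0
    return n_stages + (n_instructions - 1)


if __name__ == "__main__":
    n, stages = 100, 3  # hypothetical workload on a fetch/decode/execute pipeline
    print(cycles_unpipelined(n, stages))  # 300
    print(cycles_pipelined(n, stages))    # 102
```

Real pipelines fall short of this ideal because of cache misses, data dependencies, and mispredicted branches.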
Branch Prediction
ARM application processors use dynamic branch prediction to guess which path a branch will take before it is resolved. A correct prediction avoids stalling the pipeline, while a misprediction forces the speculatively fetched instructions to be discarded, wasting both cycles and energy. Dynamic predictors record each branch's recent behavior, so their accuracy improves as a program's branching pattern repeats.
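One classic way a dynamic predictor can learn a branch's behavior is a two-bit saturating counter. The sketch below is a textbook scheme shown for illustration only. It is not the predictor of any particular ARM core, and the function name and the example branch pattern are assumptions.

```python
def simulate_two_bit_predictor(outcomes, initial_state=1):
    """Count correct predictions for a sequence of branch outcomes.

    States 0 and 1 predict "not taken"; states 2 and 3 predict "taken".
    """
    state = initial_state
    correct = 0
    for taken in outcomes:
        predicted_taken = state >= 2
        if predicted_taken == taken:
            correct += 1
        # Move the counter toward the actual outcome, saturating at 0 and 3.
        state = min(state + 1, 3) if taken else max(state - 1, 0)
    return correct


if __name__ == "__main__":
    # A loop branch taken 9 times and then falling through, executed twice.
    loop_branch = ([True] * 9 + [False]) * 2
    print(simulate_two_bit_predictor(loop_branch), "of", len(loop_branch))  # 17 of 20
```

After the first pass the counter stays biased toward "taken", so the second run of the loop mispredicts only its exit.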
Simpler Execution Units
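The regular RISC instruction format allows parts of an ARM core, above all the instruction decoder, to be simpler and to use fewer transistors than their x86 counterparts. Execution units such as Arithmetic Logic Units (ALUs) are broadly comparable across architectures, so most of the saving comes from the front end. Less logic means lower power consumption and heat dissipation, although in large high-performance cores the decoder is typically only a modest share of the total power budget.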
Advanced Power Management
ARM processors integrate power-saving techniques including extensive clock gating, power gating of unused blocks, low-power states entered with instructions such as Wait for Interrupt (WFI), and dynamic voltage and frequency scaling (DVFS) that adjusts to the workload. Together these keep power draw minimal when the processor is idle or lightly loaded.
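The idea behind scaling frequency and voltage with workload can be sketched as a lookup over a table of operating points. The table values and the function name below are hypothetical, and real governors in an operating system use more elaborate heuristics than this.

```python
# Hypothetical operating points: (frequency in MHz, voltage in volts), sorted by frequency.
OPERATING_POINTS = [(300, 0.60), (800, 0.75), (1500, 0.90), (2400, 1.05)]


def pick_operating_point(required_mhz, points=OPERATING_POINTS):
    """Pick the slowest operating point that still meets the required frequency.

    Falls back to the fastest point when demand exceeds every option.
    """
    for freq, volt in points:
        if freq >= required_mhz:
            return freq, volt
    return points[-1]


if __name__ == "__main__":
    for demand in (100, 1000, 5000):
        print(demand, "->", pick_operating_point(demand))
```

Choosing the lowest sufficient point matters because each step down in frequency also permits a lower voltage.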
Optimized Memory Architecture
ARM processors employ multi-level cache hierarchies and hardware prefetching to shorten memory access times and reduce stalls. Fast memory access matters for efficiency because a stalled core consumes power without doing useful work, and an off-chip DRAM access costs far more energy than a cache hit. Caches sit on the same die as the cores, and some ARM-based SoCs also mount the DRAM in the same package as the processor to shorten the path to main memory.
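The effect of a cache hierarchy on access time is often summarized with the average memory access time (AMAT) formula. The sketch below applies it to a two-level hierarchy. All latencies and miss rates are made-up illustrative values, not measurements of any ARM chip.

```python
def average_memory_access_time(l1_hit_ns, l1_miss_rate, l2_hit_ns, l2_miss_rate, dram_ns):
    """AMAT for a two-level cache: each miss pays the cost of the next level down."""
    l2_amat = l2_hit_ns + l2_miss_rate * dram_ns
    return l1_hit_ns + l1_miss_rate * l2_amat


if __name__ == "__main__":
    # Hypothetical numbers: 1 ns L1, 5 ns L2, 100 ns DRAM.
    with_l2 = average_memory_access_time(1.0, 0.05, 5.0, 0.20, 100.0)
    # Same L1 but no effective L2: every L1 miss goes to DRAM.
    without_l2 = average_memory_access_time(1.0, 0.05, 0.0, 1.0, 100.0)
    print(f"with L2: {with_l2:.2f} ns, without L2: {without_l2:.2f} ns")
```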
Smaller Manufacturing Process Nodes
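ARM cores are designed from the ground up for power efficiency and are comparatively small. Mobile SoCs built around them have often been among the first products on new process nodes such as 7nm and 5nm. Smaller transistors switch faster and use less energy per switch, although leakage has to be kept in check with techniques such as FinFET transistors. This is an advantage of the manufacturing process rather than of the instruction set: x86 processors are also built on leading-edge nodes, so any gap depends on the specific products being compared.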
Customizable Core Design
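Arm does not manufacture chips. It licenses ready-made core designs to partners and also offers architecture licenses that let a partner design its own compatible core. This allows each vendor to tailor the core, or the SoC around it, to the target application. For example, mobile SoCs are tuned for the lowest possible power, while server chips are tuned for performance per watt at higher power levels. This flexibility lets vendors avoid paying a power cost for features their product does not need.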
Simplified Instructions
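The Thumb instruction set encodes common operations in 16 bits, and Thumb-2 mixes 16-bit and 32-bit encodings, so programs take less space to store and fetch from memory than with 32-bit-only ARM instructions. The smaller code footprint reduces memory traffic and makes better use of the instruction cache. Early Thumb implementations expanded each instruction into its 32-bit ARM equivalent during decode, while later cores decode Thumb-2 instructions directly. The 64-bit AArch64 instruction set does not include Thumb and uses fixed 32-bit instructions.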
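The code-size effect of 16-bit encodings can be estimated with simple arithmetic. The sketch below assumes a program whose instruction count stays the same regardless of encoding, which is a simplification, and the 70% share of 16-bit instructions is a hypothetical figure.

```python
def code_size_bytes(n_instructions, fraction_16bit):
    """Estimate code size when a fraction of instructions use 16-bit encodings
    and the remainder use 32-bit encodings."""
    n_16 = round(n_instructions * fraction_16bit)
    n_32 = n_instructions - n_16
    return n_16 * 2 + n_32 * 4


if __name__ == "__main__":
    n = 1000
    all_32bit = code_size_bytes(n, 0.0)
    mixed = code_size_bytes(n, 0.7)  # hypothetical mix of 16-bit and 32-bit encodings
    print(all_32bit, mixed)  # 4000 2600
```

In practice a 16-bit instruction can express less than a 32-bit one, so a compiler may need a few more instructions and the real saving is smaller than this estimate.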
Uni-processor Design
Each ARM core is designed as a compact, self-contained processor. Because a single core is small and simple, vendors can replicate it to build multicore clusters for parallel processing without a large area or power penalty. A simpler core also has less logic switching on every cycle, which lowers its power draw.
Out-of-Order Execution
High-performance ARM cores rely on out-of-order execution and register renaming to extract instruction-level parallelism. Executing independent instructions while earlier ones wait for their operands reduces stall cycles, so more work is done per cycle. The extra scheduling hardware costs power and area, which is why efficiency-oriented cores such as the Cortex-A55 remain in-order.
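How executing independent instructions early avoids wasted cycles can be shown with a toy model. The sketch below assumes a single-issue core, instructions described only by a latency and a list of earlier instructions they depend on, and no limits on window size. The function name and the example program are illustrative.

```python
def run(instrs, out_of_order):
    """Simulate a single-issue core and return the cycle at which the last result is ready.

    instrs: list of (latency, [indices of earlier instructions this one depends on]).
    """
    done_at = {}                       # instruction index -> cycle its result is ready
    pending = list(range(len(instrs)))
    cycle = 0
    while pending:
        # An in-order core may only consider the oldest pending instruction.
        window = pending if out_of_order else pending[:1]
        for i in window:
            latency, deps = instrs[i]
            if all(d in done_at and done_at[d] <= cycle for d in deps):
                done_at[i] = cycle + latency
                pending.remove(i)
                break                  # single issue: at most one instruction per cycle
        cycle += 1
    return max(done_at.values(), default=0)


if __name__ == "__main__":
    # A 3-cycle load, an add that needs the loaded value, then two independent adds.
    program = [(3, []), (1, [0]), (1, []), (1, [])]
    print(run(program, out_of_order=False))  # 6
    print(run(program, out_of_order=True))   # 4
```

The out-of-order run fills the load's waiting time with the two independent instructions instead of stalling behind the dependent add.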
Silicon-on-Insulator (SOI) Fabrication
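Some ARM-based chips are manufactured on SOI processes, such as FD-SOI, which place a thin insulating oxide layer beneath the transistors to reduce leakage current. Lower leakage reduces power consumption, especially at high temperatures, when leakage increases. SOI is a property of the manufacturing process rather than of the ARM architecture, and most ARM SoCs are built on conventional bulk silicon processes.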
Lower Operating Voltages
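ARM processors are designed to operate at low supply voltages, often around or below 1V on modern process nodes. Because dynamic power scales with the square of the supply voltage, even a small reduction yields a large saving, and leakage power falls as well. However, lowering the voltage usually requires lowering the clock frequency too, since transistors switch more slowly at reduced voltage.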
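The link between voltage and active power follows from the standard CMOS switching-power approximation, P = a * C * V^2 * f. The sketch below evaluates it for two operating points. The capacitance, voltages, and frequencies are hypothetical values chosen only to show the trend.

```python
def dynamic_power_watts(capacitance_farads, voltage_volts, frequency_hz, activity=1.0):
    """CMOS switching-power approximation: P = activity * C * V^2 * f."""
    return activity * capacitance_farads * voltage_volts ** 2 * frequency_hz


if __name__ == "__main__":
    c = 1e-9  # hypothetical effective switched capacitance in farads
    high = dynamic_power_watts(c, 1.0, 2.0e9)
    low = dynamic_power_watts(c, 0.8, 1.5e9)  # lower voltage paired with a lower clock
    print(f"power ratio: {low / high:.2f}")
```

At an unchanged frequency, dropping from 1.0V to 0.8V alone cuts switching power to 64% of the original, and the reduced clock lowers it further.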
Fewer Transistors
The simpler architecture and smaller cores mean an ARM core generally needs fewer transistors than a large x86 core. Vendors rarely publish per-core transistor counts, and the widely quoted figures in the billions describe entire chips, including caches, graphics, and I/O, rather than individual cores. Fewer transistors mean less leakage and less switched capacitance, and therefore lower power.
ARM TrustZone Security
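ARM TrustZone technology partitions the system into a secure world for trusted applications and a normal world running the main software. It is primarily a security feature, not a power-saving one. Its contribution to efficiency is indirect: because the secure world runs on the same cores as normal software, sensitive operations such as key handling may not need a separate, always-powered security chip.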
Multicore Heterogeneity
ARM multicore systems often pair small, energy-efficient cores with larger, higher-performance cores, an arrangement Arm markets as big.LITTLE and, more recently, DynamIQ. The OS scheduler places each task on the core type that best matches its performance needs, and idle cores are powered down to save energy.
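Matching tasks to cores can be sketched as picking the lowest-power core that is still fast enough. The `Core` class, its capacity and power figures, and the `place_task` function below are all hypothetical. Real schedulers also weigh factors such as current load, migration cost, and thermal limits.

```python
from dataclasses import dataclass


@dataclass
class Core:
    name: str
    capacity: int        # maximum work units per second (hypothetical)
    power_watts: float   # power drawn while busy (hypothetical)


def place_task(required_capacity, cores):
    """Choose the lowest-power core whose capacity covers the task's demand.

    Returns None when no core is fast enough.
    """
    candidates = [c for c in cores if c.capacity >= required_capacity]
    if not candidates:
        return None
    return min(candidates, key=lambda c: c.power_watts)


if __name__ == "__main__":
    cores = [Core("little", 400, 0.3), Core("big", 1000, 1.5)]
    print(place_task(200, cores).name)  # little
    print(place_task(800, cores).name)  # big
```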
In summary, ARM processors are designed to be highly energy efficient. Key factors include a RISC instruction set that keeps decode logic simple, advanced power-saving techniques, optimized memory systems, small cores that are often built on leading-edge process nodes, customizable core designs, low operating voltages, and heterogeneous multicore designs that match each task to a suitable core.