ARM processors are generally known for being more power efficient than x86 processors from Intel and AMD. There are several architectural and design differences between ARM and x86 that contribute to ARM’s lower power consumption.
ARM’s RISC Architecture
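One of the most commonly cited reasons ARM uses less power is its RISC (Reduced Instruction Set Computing) architecture. RISC architectures use a smaller set of simpler, mostly fixed-length instructions, whereas the CISC (Complex Instruction Set Computing) architecture used in x86 has variable-length instructions that can combine memory access and arithmetic. Simpler instructions are easier to decode, so the processor front end needs fewer transistors and less power. A single x86 instruction can do more work, so ARM code often needs more instructions for the same task, but those instructions are cheap to decode and several can be issued per cycle. Modern x86 processors also break complex instructions into simpler internal micro-operations, so the remaining cost of CISC lies mainly in the decode stage.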
Out-of-Order Execution
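Many higher-performance ARM cores use out-of-order execution. This allows instructions to be rearranged and executed as soon as their inputs are available, without waiting for earlier, unrelated instructions to finish. Keeping execution units busy means more work is completed per clock cycle, which can improve energy efficiency at a given clock speed. The technique is not unique to ARM: x86 processors have used it since the mid-1990s, and the smallest ARM cores remain in-order because out-of-order logic itself costs power and area. Some older x86 designs relied more on raising clock speeds for performance, which consumed more power.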
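As a rough illustration of why reordering helps, the toy model below issues at most one instruction per cycle and compares in-order issue with out-of-order issue on the same four-instruction program. The `Instr` class, the register names, and the latencies are all hypothetical, and the model ignores real-world details such as register renaming and limited window sizes.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Instr:
    name: str
    dest: str        # register written (each register is written once here)
    srcs: tuple      # registers read
    latency: int     # cycles until the result is available

def cycles_to_finish(program, out_of_order):
    """Issue at most one instruction per cycle; return the cycle when the last result is ready."""
    ready_at = {}             # register -> cycle at which its value is ready
    pending = list(program)   # instructions not yet issued, in program order
    cycle = 0
    finish = 0
    while pending:
        # An in-order core may only consider the oldest pending instruction.
        window = list(pending) if out_of_order else pending[:1]
        for idx, ins in enumerate(window):
            # A source produced by an older, unissued instruction is not ready.
            older_dests = {p.dest for p in pending[:idx]}
            operands_ready = all(
                s not in older_dests and ready_at.get(s, 0) <= cycle
                for s in ins.srcs
            )
            if operands_ready:
                ready_at[ins.dest] = cycle + ins.latency
                finish = max(finish, ready_at[ins.dest])
                del pending[idx]
                break
        cycle += 1
    return finish

# Hypothetical program: a slow load, a dependent add, then an independent mul/sub pair.
PROGRAM = [
    Instr("load", "r1", (), 4),
    Instr("add", "r2", ("r1",), 1),
    Instr("mul", "r3", ("r8", "r9"), 3),
    Instr("sub", "r4", ("r3",), 1),
]

print(cycles_to_finish(PROGRAM, out_of_order=False))  # 9
print(cycles_to_finish(PROGRAM, out_of_order=True))   # 6
```

In this sketch the out-of-order core starts the independent multiply while the add is still waiting on the load, so the same work finishes in fewer cycles.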
Integrated Memory Controller
ARM-based SoCs have the memory controller built into the same chip as the CPU cores. This removes the need for a separate external memory controller chip and reduces both memory access latency and power consumption. x86 historically relied on a northbridge memory controller external to the CPU, but AMD moved the controller on-die with the Athlon 64 in 2003 and Intel followed with Nehalem in 2008, so this is no longer a difference between current designs.
Power Gating
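ARM-based designs make extensive use of power gating, which cuts the supply to sections of a processor when they are not in use. This eliminates most leakage power in the unused logic blocks, at the cost of some time and energy to power them back up. Many ARM SoCs implement fine-grained power gating that can disable small sub-blocks as well as whole cores. Modern x86 processors also power-gate idle cores, so the difference today is one of degree rather than kind.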
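Power gating only pays off when a block stays idle long enough for the leakage saved to exceed the energy spent switching it off and back on. A minimal sketch of that break-even reasoning follows; the function names and the 50 mW / 2 microjoule figures are illustrative assumptions, not measurements of any real chip.

```python
def break_even_idle_s(leakage_w, transition_j):
    """Idle time at which the leakage saved equals the energy cost of gating."""
    return transition_j / leakage_w

def gating_saves_energy(idle_s, leakage_w, transition_j):
    """True if gating the block for idle_s seconds uses less energy than leaving it powered."""
    energy_if_left_on = leakage_w * idle_s
    return transition_j < energy_if_left_on

# Hypothetical block: 50 mW of leakage, 2 microjoules to power down and back up.
LEAKAGE_W = 0.05
TRANSITION_J = 2e-6

print(break_even_idle_s(LEAKAGE_W, TRANSITION_J))           # about 40 microseconds
print(gating_saves_energy(100e-6, LEAKAGE_W, TRANSITION_J))  # True
print(gating_saves_energy(10e-6, LEAKAGE_W, TRANSITION_J))   # False
```

A power management controller applies a policy of this kind when deciding whether a predicted idle period justifies gating a block.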
Multicore and Heterogeneous Designs
ARM cores are small and power efficient by design. This makes it easier to integrate multiple ARM cores into a single chip, called a multicore System-on-Chip (SoC). For workloads that can be split across threads, several simpler cores are more power efficient than one complex high-performance core. ARM also pairs cores of different sizes and power levels on one chip (marketed as big.LITTLE and later DynamIQ), so that light tasks run on small, efficient cores and demanding tasks run on large, fast ones.
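The idea behind heterogeneous cores can be sketched as a simple choice: among the cores fast enough to meet a task's deadline, pick the one that spends the least energy. The `Core` class and the throughput and power figures below are hypothetical; real schedulers use far richer models.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Core:
    name: str
    ops_per_s: float   # sustained throughput
    power_w: float     # power drawn while running

def pick_core(cores, work_ops, deadline_s):
    """Return (core, energy_j) for the lowest-energy core that meets the deadline, or None."""
    best = None
    for core in cores:
        runtime_s = work_ops / core.ops_per_s
        if runtime_s > deadline_s:
            continue  # too slow for this task
        energy_j = core.power_w * runtime_s
        if best is None or energy_j < best[1]:
            best = (core, energy_j)
    return best

# Hypothetical cores: the big core is 3x faster but draws about 6.7x the power.
CORES = [Core("little", 1e9, 0.3), Core("big", 3e9, 2.0)]

print(pick_core(CORES, 1e9, deadline_s=1.0)[0].name)  # little
print(pick_core(CORES, 1e9, deadline_s=0.5)[0].name)  # big
print(pick_core(CORES, 1e9, deadline_s=0.1))          # None
```

With a relaxed deadline the small core wins on energy even though it takes longer; the big core is chosen only when the deadline demands it.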
Advanced Process Nodes
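ARM-based processors are designed by ARM’s licensees and manufactured by third-party foundries like TSMC and Samsung. In recent years these foundries have led in bringing smaller process nodes to volume production, and smaller transistors generally switch with less energy. Intel manufactures most of its x86 chips internally; it held the process lead for many years but has trailed the foundries on recent nodes. AMD’s x86 chips are also made at TSMC, so this advantage reflects manufacturing choices rather than the ARM architecture itself.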
Focus on Mobile Power Efficiency
ARM has focused on the mobile and embedded markets, which demand power efficiency for battery life. x86 historically targeted desktops, laptops, and servers, where performance took priority over power. As x86 has moved into thinner laptops and other mobile devices, power efficiency has become more critical for it as well.
Simpler Core Microarchitecture
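Many ARM cores, particularly those aimed at phones and embedded devices, are simpler and smaller than contemporary x86 cores in terms of pipeline stages, execution units, buffers, caches, and control logic. Less logic means less switching and less leakage, which translates to improved power efficiency. x86 cores carry additional complexity to decode variable-length instructions and to support decades of legacy x86 features.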
Clock Gating
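ARM designs make extensive use of clock gating to prevent the clock from reaching idle circuitry. Because dynamic power is proportional to switching activity, stopping the clock to an idle block removes most of that block’s dynamic power. Unlike power gating, clock gating leaves the block powered, so it keeps its state and can resume immediately, but its leakage power is not reduced.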
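The effect of clock gating can be estimated with the usual CMOS switching-power approximation, P = a * C * V^2 * f, where a is the activity factor. The sketch below compares a block that idles with its clock still toggling against the same block with its idle clock gated off. All parameter values are illustrative assumptions.

```python
def dynamic_power_w(activity, cap_f, vdd_v, freq_hz):
    """CMOS switching-power estimate: P = a * C * V^2 * f."""
    return activity * cap_f * vdd_v ** 2 * freq_hz

def average_power_w(busy_fraction, busy_activity, idle_activity,
                    cap_f, vdd_v, freq_hz, clock_gated):
    """Average dynamic power of a block that is busy only part of the time."""
    # With the clock gated, an idle block has (ideally) no switching activity.
    idle = 0.0 if clock_gated else idle_activity
    effective = busy_fraction * busy_activity + (1 - busy_fraction) * idle
    return dynamic_power_w(effective, cap_f, vdd_v, freq_hz)

# Hypothetical block: busy 25% of the time, 1 nF switched capacitance, 1 V, 1 GHz.
PARAMS = dict(busy_fraction=0.25, busy_activity=0.2, idle_activity=0.1,
              cap_f=1e-9, vdd_v=1.0, freq_hz=1e9)

ungated = average_power_w(clock_gated=False, **PARAMS)
gated = average_power_w(clock_gated=True, **PARAMS)
print(f"ungated: {ungated:.3f} W, gated: {gated:.3f} W")
```

In this example most of the ungated block's power goes to clocking idle logic, which is the portion gating removes.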
Voltage and Frequency Scaling
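ARM SoCs support dynamic voltage and frequency scaling (DVFS), which lowers the voltage and clock speed when full performance is not needed. Dynamic power scales with the square of the voltage multiplied by the frequency, so lowering both together reduces dynamic power roughly cubically. x86 processors also support DVFS (for example, Intel SpeedStep), though mobile ARM SoCs were designed around it from the start.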
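The roughly cubic relationship follows from the approximation P = C * V^2 * f under the simplifying assumption that voltage can be lowered in proportion to frequency. A short sketch of the arithmetic, with hypothetical function names:

```python
def relative_dynamic_power(freq_scale, voltage_scale=None):
    """Dynamic power relative to nominal, using P proportional to V^2 * f.

    If voltage_scale is omitted, voltage is assumed to scale with frequency.
    """
    if voltage_scale is None:
        voltage_scale = freq_scale
    return voltage_scale ** 2 * freq_scale

def relative_energy_per_task(freq_scale, voltage_scale=None):
    """Energy for a fixed amount of work: power times runtime, where runtime scales as 1/f."""
    return relative_dynamic_power(freq_scale, voltage_scale) / freq_scale

print(relative_dynamic_power(0.5))         # 0.125: half the speed, one eighth the power
print(relative_energy_per_task(0.5))       # 0.25: the task takes twice as long
print(relative_dynamic_power(0.5, 1.0))    # 0.5: frequency scaling alone is only linear
```

Real chips cannot lower voltage indefinitely, and leakage power does not follow this formula, so the cubic figure is best read as an upper bound on the savings.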
SIMD and Vector Processing
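ARM has added SIMD and vector processing extensions such as NEON and SVE to improve performance for media and math code while maintaining low power. Vector processing allows a single instruction to perform the same operation in parallel on every element of a data vector, so fewer instructions need to be fetched and decoded for the same work. x86 offers comparable extensions (SSE and AVX).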
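To make the instruction-count saving concrete, the sketch below models a 128-bit vector register (the width of a NEON register) and counts how many add instructions are needed for an array. Python itself does not execute SIMD instructions here; `vector_add` only models the lane-wise behavior, and all function names are hypothetical.

```python
import math

def lanes(vector_bits, element_bits):
    """Number of elements that fit in one vector register."""
    return vector_bits // element_bits

def vector_add(a, b):
    """Model of one SIMD add: every lane is added by a single instruction."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of lanes")
    return [x + y for x, y in zip(a, b)]

def add_instruction_count(n_elements, vector_bits=128, element_bits=32):
    """Vector add instructions needed to add two arrays of n_elements each."""
    return math.ceil(n_elements / lanes(vector_bits, element_bits))

print(lanes(128, 32))                              # 4 lanes of 32-bit values
print(vector_add([1, 2, 3, 4], [10, 20, 30, 40]))  # [11, 22, 33, 44]
print(add_instruction_count(1024))                 # 256, versus 1024 scalar adds
```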
Unified Coherency
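ARM-based SoCs keep caches coherent across cores using the protocols defined in ARM’s AMBA specifications (ACE and CHI), which let CPU cores and other on-chip blocks share one coherent interconnect. x86 processors use variants of the MESI protocol, such as MESIF on Intel and MOESI on AMD. Both families solve the same problem with broadly similar cache-line state machines, so coherency is unlikely to be a major source of ARM’s power advantage.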
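For reference, MESI tracks each cache line in one of four states: Modified, Exclusive, Shared, or Invalid. The sketch below shows the textbook state transitions for a single line in one cache. It is a simplified teaching model, not the implementation used by any particular ARM or x86 processor, and the `State`, `Event`, and `next_state` names are assumptions made for this example.

```python
from enum import Enum

class State(Enum):
    MODIFIED = "M"    # only copy, differs from memory
    EXCLUSIVE = "E"   # only copy, matches memory
    SHARED = "S"      # other caches may hold copies
    INVALID = "I"     # no valid copy

class Event(Enum):
    LOCAL_READ = 1
    LOCAL_WRITE = 2
    REMOTE_READ = 3    # another cache reads the line
    REMOTE_WRITE = 4   # another cache writes the line

def next_state(state, event, others_have_copy=False):
    """Textbook MESI transition for one cache line in one cache."""
    if event is Event.LOCAL_WRITE:
        return State.MODIFIED            # other copies are invalidated
    if event is Event.REMOTE_WRITE:
        return State.INVALID             # our copy becomes stale
    if event is Event.LOCAL_READ:
        if state is State.INVALID:       # miss: fetch the line
            return State.SHARED if others_have_copy else State.EXCLUSIVE
        return state                     # hit: no state change
    # REMOTE_READ: any valid copy is downgraded to SHARED
    # (a MODIFIED line is written back first).
    return State.INVALID if state is State.INVALID else State.SHARED

s = next_state(State.INVALID, Event.LOCAL_READ)   # EXCLUSIVE
s = next_state(s, Event.LOCAL_WRITE)              # MODIFIED
s = next_state(s, Event.REMOTE_READ)              # SHARED
s = next_state(s, Event.REMOTE_WRITE)             # INVALID
print(s.name)  # INVALID
```

Every transition that invalidates or downgrades a line implies messages on the interconnect, which is where coherency consumes power.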
TrustZone Security
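ARM TrustZone provides hardware-level security with little power overhead. It divides the processor into a secure world and a normal world, so trusted code runs isolated from untrusted apps on the same core without needing a separate security chip. x86 offers its own hardware security features, such as Intel SGX and AMD SEV, so any power advantage here is likely modest.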
Compression
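Some ARM-based SoCs use compression to reduce memory traffic and power consumption. For example, Arm Frame Buffer Compression (AFBC) compresses graphics buffers exchanged between the GPU, display, and video blocks. Cache compression applies the same idea to caches: values are stored in compressed form so that more data fits and fewer accesses go out to memory. x86 platforms use similar compression techniques in their graphics hardware.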
In summary, ARM’s power efficiency advantage comes from its RISC architecture, advanced power management, multicore SoC integration, leading manufacturing process nodes, a focus on mobile designs, and microarchitectural optimizations.