Little endian is a byte ordering in which the least significant byte of a multi-byte value is stored at the lowest memory address and the most significant byte at the highest. In other words, data is laid out in memory from the least significant byte to the most significant byte as addresses increase.
Byte Ordering in Memory
Word size varies by architecture; the examples here use a 4-byte (32-bit) word. When a 32-bit word is stored in memory, its bytes can be ordered in two ways:
- Little Endian – The least significant byte is stored at the lowest address, and significance increases with the address.
- Big Endian – The most significant byte is stored at the lowest address, and significance decreases with the address.
For example, let’s take the 32-bit hex word 0x12345678. In little endian architecture it would be stored as:

| Address | Value |
|---------|-------|
| 0x100   | 0x78  |
| 0x101   | 0x56  |
| 0x102   | 0x34  |
| 0x103   | 0x12  |
As you can see, the bytes are stored from least significant (0x78) to most significant (0x12).
In big endian architecture, the same 0x12345678 would be stored as:

| Address | Value |
|---------|-------|
| 0x100   | 0x12  |
| 0x101   | 0x34  |
| 0x102   | 0x56  |
| 0x103   | 0x78  |
Here the most significant byte 0x12 is stored first.
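The two layouts above can be observed directly on a real machine. The following is a minimal sketch, assuming 8-bit bytes and a host that is either purely little endian or purely big endian; the function names `host_is_little_endian` and `bytes_in_memory` are illustrative, not part of any standard library.

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 if the host stores the least significant byte at the
   lowest address (little endian), 0 otherwise. */
int host_is_little_endian(void) {
    uint32_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);   /* copy the byte at the lowest address */
    return first == 1;
}

/* Copies the in-memory bytes of value into out[0..3],
   lowest address first. */
void bytes_in_memory(uint32_t value, unsigned char out[4]) {
    memcpy(out, &value, sizeof value);
}
```

Calling `bytes_in_memory(0x12345678, out)` should fill `out` with 78 56 34 12 on a little endian host and 12 34 56 78 on a big endian host, matching the two tables.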
Endianness and Data Processing
The difference between little endian and big endian matters whenever multi-byte data is read from memory. Both kinds of CPU fetch the same range of addresses; what differs is how each byte is weighted. A little endian CPU treats the byte at the lowest address as the least significant, while a big endian CPU treats it as the most significant.
Let’s see an example. Suppose the 32-bit hex value 0x12345678 is stored at address 0x100 in each machine's native byte order. A little endian CPU will read it as:
- Read 0x78 from address 0x100
- Read 0x56 from address 0x101
- Read 0x34 from address 0x102
- Read 0x12 from address 0x103
It will then reconstruct the number as 0x12345678.
A big endian CPU will read it as:
- Read 0x12 from address 0x100
- Read 0x34 from address 0x101
- Read 0x56 from address 0x102
- Read 0x78 from address 0x103
And reconstruct the number as 0x12345678.
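So the two architectures hold the same value using different byte layouts, and each one reads its own layout back correctly. Problems arise only when bytes written under one convention are interpreted under the other: a big endian CPU reading the little endian layout 0x78 0x56 0x34 0x12 would see 0x78563412 instead of 0x12345678.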
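The reconstruction step can be sketched in C by combining the bytes with shifts, which makes the weighting of each address explicit. This is an illustrative sketch, not how a CPU is implemented; the helper names `read_le32` and `read_be32` are assumptions introduced here.

```c
#include <stdint.h>

/* Interprets p[0..3] as little endian: p[0] is the least significant byte. */
uint32_t read_le32(const unsigned char *p) {
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Interprets p[0..3] as big endian: p[0] is the most significant byte. */
uint32_t read_be32(const unsigned char *p) {
    return ((uint32_t)p[0] << 24)
         | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8)
         | (uint32_t)p[3];
}
```

Given the bytes 0x78, 0x56, 0x34, 0x12 in that order, `read_le32` returns 0x12345678 and `read_be32` returns 0x78563412. Because the functions use shifts rather than pointer casts, they give the same results on any host.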
Endianness and Multi-Byte Data Types
The difference between little endian and big endian is important for any multi-byte data type, not just 32-bit integers. This includes 16-bit shorts, 64-bit longs, 32-bit floats, 64-bit doubles, etc.
For example, consider the 64-bit long value 0x0011223344556677 stored in little endian format:

| Address | Value |
|---------|-------|
| 0x100   | 0x77  |
| 0x101   | 0x66  |
| 0x102   | 0x55  |
| 0x103   | 0x44  |
| 0x104   | 0x33  |
| 0x105   | 0x22  |
| 0x106   | 0x11  |
| 0x107   | 0x00  |

The bytes are ordered from least significant (0x77) to most significant (0x00). A little endian CPU treats the byte at the lowest address as the least significant when reconstructing the number 0x0011223344556677.
In big endian format, the bytes would be ordered as:

| Address | Value |
|---------|-------|
| 0x100   | 0x00  |
| 0x101   | 0x11  |
| 0x102   | 0x22  |
| 0x103   | 0x33  |
| 0x104   | 0x44  |
| 0x105   | 0x55  |
| 0x106   | 0x66  |
| 0x107   | 0x77  |

A big endian CPU treats the byte at the lowest address as the most significant, so it also reconstructs the number as 0x0011223344556677.
The same idea applies to any multi-byte data type – shorts, floats, doubles, etc. The CPU architecture determines which byte order is used to store and process the data.
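As a rough illustration of the 64-bit case, the sketch below writes a value into a byte buffer in each order. It assumes 8-bit bytes, and the names `store_le64` and `store_be64` are hypothetical helpers, not library functions.

```c
#include <stddef.h>
#include <stdint.h>

/* Writes value into out[0..7] with the least significant byte first. */
void store_le64(uint64_t value, unsigned char out[8]) {
    for (size_t i = 0; i < 8; i++) {
        out[i] = (unsigned char)(value >> (8 * i));
    }
}

/* Writes value into out[0..7] with the most significant byte first. */
void store_be64(uint64_t value, unsigned char out[8]) {
    for (size_t i = 0; i < 8; i++) {
        out[i] = (unsigned char)(value >> (8 * (7 - i)));
    }
}
```

For 0x0011223344556677, `store_le64` produces 77 66 55 44 33 22 11 00 and `store_be64` produces 00 11 22 33 44 55 66 77. The same loop with a smaller bound covers 16-bit and 32-bit values.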
Endianness Conversion
Sometimes data needs to be converted between little endian and big endian formats. This may be necessary when sending data over a network between computers with different endianness, or storing data on disk to be read by another system.
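Most languages and libraries provide functions for endianness conversion. For example, in C on a POSIX system:

```c
#include <arpa/inet.h>
#include <stdint.h>

int main(void) {
    uint32_t host_value = 0x12345678;
    uint32_t net_value = htonl(host_value);  /* big endian bytes 12 34 56 78; reads as 0x78563412 on a little endian host */
    uint32_t round_trip = ntohl(net_value);  /* back to 0x12345678 in host order */
    return round_trip == host_value ? 0 : 1;
}
```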
The htonl() and ntohl() functions convert 32-bit integers between host byte order and standard big endian network byte order.
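Similar functions exist for 16-bit values (htons()/ntohs()). There is no POSIX-standard 64-bit equivalent: some platforms provide htonll()/ntohll(), while others, such as Linux and the BSDs, offer htobe64()/be64toh() in <endian.h> or <sys/endian.h>. Care must be taken to use the proper byte swapping functions when converting between endian formats.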
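Where no library routine is available, the swap itself is straightforward to write by hand. A minimal, portable sketch follows; the names `swap16`, `swap32`, and `swap64` are illustrative rather than standard functions, and optimizing compilers will often reduce such code to a single byte-swap instruction.

```c
#include <stdint.h>

/* Reverses the two bytes of a 16-bit value. */
uint16_t swap16(uint16_t v) {
    return (uint16_t)((v >> 8) | (v << 8));
}

/* Reverses the four bytes of a 32-bit value. */
uint32_t swap32(uint32_t v) {
    return ((v & 0x000000FFu) << 24)
         | ((v & 0x0000FF00u) << 8)
         | ((v & 0x00FF0000u) >> 8)
         | ((v & 0xFF000000u) >> 24);
}

/* Reverses the eight bytes of a 64-bit value by swapping each half
   and exchanging their positions. */
uint64_t swap64(uint64_t v) {
    return ((uint64_t)swap32((uint32_t)v) << 32)
         | (uint64_t)swap32((uint32_t)(v >> 32));
}
```

Unlike htonl(), these functions always swap, regardless of the host's byte order, so the caller has to decide when a swap is actually needed.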
Endianness on ARM Processors
ARM processors are bi-endian in principle, but almost all ARM systems run in little endian mode. Endianness describes how a multi-byte value is laid out in memory, not in registers: when a 32-bit word is loaded, the whole value lands in a single register.
For example, with the bytes 0x78, 0x56, 0x34, 0x12 stored at addresses 0x100 to 0x103, a little endian 32-bit load (LDR) from 0x100 into R0 gives R0 = 0x12345678. The byte at the lowest address becomes the least significant byte of the register.
Byte order is also part of the platform's Application Binary Interface (ABI): code compiled for little endian cannot share in-memory data structures with code compiled for big endian. Common ARM platforms use the little endian variant, so compilers for them generate code expecting data in little endian order.
Some 32-bit ARM processors (ARMv6 and later) can switch data accesses to big endian using the SETEND instruction. When in big endian mode, the processor reads and writes data in big endian format. SETEND is deprecated in ARMv8 and is not part of the AArch64 instruction set, and big endian operation is uncommon in practice.
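Once a value is in a register it has no byte order of its own; shifts and masks act on the numeric value. The sketch below illustrates this with a hypothetical helper, `byte_of`, which returns the same result on little endian and big endian hosts alike.

```c
#include <assert.h>
#include <stdint.h>

/* Returns byte n of a register value, where n = 0 is the least
   significant byte and n = 3 is the most significant. */
unsigned byte_of(uint32_t reg, unsigned n) {
    assert(n < 4);
    return (reg >> (8 * n)) & 0xFFu;
}
```

For a register holding 0x12345678, `byte_of(reg, 0)` is 0x78 and `byte_of(reg, 3)` is 0x12. Endianness only becomes visible when the register is stored to memory or loaded from it.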
Advantages of Little Endian
There are several advantages to using little endian architecture:
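- Simpler addressing – A value's least significant byte always sits at its base address, so the same address can be read as an 8-, 16-, or 32-bit quantity and yield the same number whenever the value fits in the narrower width.
- Natural order for arithmetic – Addition and other carry-propagating operations proceed from LSB to MSB, which matches increasing addresses when multi-byte values are processed piece by piece.
- Cheap conversion to network order – Network byte order is big endian, so little endian hosts must swap bytes, but the swap is a single instruction on common CPUs (for example BSWAP on x86 and REV on ARM).
- Backward compatibility – A large body of existing software and file formats assumes little endian data, and newer little endian processors can read it without conversion.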
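The addressing advantage in the first point can be sketched with two small readers that interpret the first `width` bytes of a buffer under each convention. The names `read_le` and `read_be` are assumptions made for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reads the first width bytes at p as a little endian integer. */
uint64_t read_le(const unsigned char *p, size_t width) {
    assert(width >= 1 && width <= 8);
    uint64_t v = 0;
    for (size_t i = 0; i < width; i++) {
        v |= (uint64_t)p[i] << (8 * i);
    }
    return v;
}

/* Reads the first width bytes at p as a big endian integer. */
uint64_t read_be(const unsigned char *p, size_t width) {
    assert(width >= 1 && width <= 8);
    uint64_t v = 0;
    for (size_t i = 0; i < width; i++) {
        v = (v << 8) | p[i];
    }
    return v;
}
```

With the value 42 stored as the little endian bytes 2A 00 00 00, `read_le` returns 42 for widths 1, 2, and 4 at the same address. With the big endian bytes 00 00 00 2A, a 1-byte read at the base address returns 0, and the reader has to start at offset 3 to get 42.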
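Partly due to these advantages, little endian is the native format of x86 and x86-64 and the usual configuration for ARM and RISC-V. PowerPC and MIPS are bi-endian: both were historically run in big endian mode, although little endian configurations are now common. Big endian remains the native order of some architectures, such as IBM z/Architecture and SPARC, and of most network protocols.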
Disadvantages of Little Endian
There are a few disadvantages to little endian format as well:
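- Hex dumps look reversed – Numbers are written MSB first, but a memory dump of a little endian integer shows the LSB first, so 0x12345678 appears as 78 56 34 12. Character strings are unaffected because they are sequences of single bytes.
- Requires conversion for big endian peers – Data must be byte swapped when exchanged with big endian systems or written in big endian formats such as network byte order.
- Confusion with byte order – Endianness bugs occur when code assumes one byte order for data that was written in the other.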
However, these issues can usually be managed with proper handling at the interfaces between systems. Overall, little endian offers practical advantages that, together with the dominance of x86, likely contributed to its popularity over big endian format.