
Does ARM allow unaligned access?

Holly Lindsey
Last updated: September 13, 2023 6:52 am

The answer is yes: ARM processors allow unaligned memory accesses, with some caveats. Since ARMv6, most ARM cores can perform unaligned loads and stores in hardware, but doing so may cost more than an aligned access. Support also varies by architecture version and profile rather than by instruction set: ARMv5 and earlier cores and ARMv6-M cores such as the Cortex-M0/M0+ do not support unaligned halfword or word accesses, and some instructions (for example load/store multiple and exclusive accesses) always require aligned addresses.


What is aligned vs unaligned memory access?

In computing, aligned memory access refers to reading from or writing to memory addresses that are multiples of the data size. For example, accessing a 4-byte integer at an address that is a multiple of 4 would be aligned. Accessing that same 4-byte integer at an address offset by 1, 2, or 3 bytes would be unaligned.
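The definition above reduces to a simple address test. The following is a minimal sketch; the function name is_aligned is chosen here for illustration and is not part of any ARM library. It assumes the data size is a power of two, which holds for the usual 1-, 2-, 4-, and 8-byte types.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true when addr is a multiple of size.
 * size must be a power of two (1, 2, 4, 8, ...), which lets the
 * check use a mask instead of a division. */
bool is_aligned(uintptr_t addr, size_t size)
{
    assert(size != 0 && (size & (size - 1)) == 0);
    return (addr & (size - 1)) == 0;
}
```

For a 4-byte integer, is_aligned(0x1000, 4) is true, while the addresses 0x1001, 0x1002, and 0x1003 all fail the test.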

Processor architectures like x86 generally tolerate unaligned access with little penalty, but RISC architectures like ARM have traditionally preferred or required aligned access. A value that straddles two aligned memory words (or two cache lines) can require two separate memory accesses instead of one, which reduces efficiency.
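To make the cost concrete, the sketch below counts how many aligned units an access touches. It is a simplified model, not a description of any particular core: the function name and the bus_width parameter are assumptions for illustration, and real hardware may split accesses at bus-width or cache-line boundaries depending on the implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of aligned, bus_width-byte units touched by an access of
 * `size` bytes starting at `addr`. bus_width must be a power of two.
 * A result of 2 means the access straddles a boundary. */
size_t aligned_units_touched(uintptr_t addr, size_t size, size_t bus_width)
{
    assert(size > 0);
    assert(bus_width != 0 && (bus_width & (bus_width - 1)) == 0);

    uintptr_t first = addr / bus_width;              /* unit holding the first byte */
    uintptr_t last  = (addr + size - 1) / bus_width; /* unit holding the last byte  */
    return (size_t)(last - first + 1);
}
```

With a 4-byte unit, a 4-byte access at 0x1000 touches one unit, while the same access at 0x1001 touches two. With an 8-byte unit, the access at 0x1001 fits in a single unit again, which is one reason the penalty varies between cores.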

Unaligned access on ARM 32-bit architecture

From ARMv6 onward, the 32-bit ARM architecture allows unaligned loads and stores for halfword and word data in Normal memory. However, unaligned accesses can cost more than aligned accesses. When the data crosses a bus-width or cache-line boundary, the processor has to split the access into two transfers, which can roughly double the time for that access. The exact penalty depends on the core and the memory system.

The ARM Architecture Reference Manual notes that unaligned accesses should be avoided for performance reasons, but ARM does provide mechanisms to accomplish unaligned loads and stores when required. There are some restrictions. ARMv5 and earlier do not support unaligned word or halfword access (an unaligned LDR on those cores returns rotated data rather than the expected value). On ARMv6 and ARMv7-A/R, setting the SCTLR.A bit enables alignment checking, which makes unaligned accesses raise an alignment fault.

Unaligned loads

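In the 32-bit ARM instruction set on ARMv6 and later, the single-register load instructions behave as follows (byte loads cannot be misaligned, since every address is byte-aligned):

  • LDR – word loads, unaligned addresses supported
  • LDRB – byte loads, always aligned
  • LDRH – halfword loads, unaligned addresses supported
  • LDRSH – signed halfword loads, unaligned addresses supported
  • LDRSB – signed byte loads, always aligned

Load multiple (LDM), doubleword (LDRD), and exclusive (LDREX) loads still require aligned addresses.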

Unaligned stores

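The corresponding single-register store instructions behave the same way:

  • STR – word stores, unaligned addresses supported
  • STRB – byte stores, always aligned
  • STRH – halfword stores, unaligned addresses supported

Store multiple (STM), doubleword (STRD), and exclusive (STREX) stores still require aligned addresses.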
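Because byte accesses can never be misaligned, a wider value can always be assembled from, or split into, individual bytes. The sketch below illustrates this in C. The helper names load_u32_le and store_u32_le are hypothetical, and the code assumes the data is stored in little-endian byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reads a 32-bit little-endian value from any address using only
 * byte-sized accesses. */
uint32_t load_u32_le(const uint8_t *p)
{
    assert(p != NULL);
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Writes a 32-bit value to any address in little-endian order using
 * only byte-sized accesses. */
void store_u32_le(uint8_t *p, uint32_t v)
{
    assert(p != NULL);
    p[0] = (uint8_t)(v & 0xFFu);
    p[1] = (uint8_t)((v >> 8) & 0xFFu);
    p[2] = (uint8_t)((v >> 16) & 0xFFu);
    p[3] = (uint8_t)((v >> 24) & 0xFFu);
}
```

An optimizing compiler may merge the four byte accesses into a single word access when it knows the target supports unaligned words, so this form does not necessarily cost four instructions.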

Unaligned access on Thumb and Thumb-2

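Thumb is a compact instruction set with 16-bit encodings that improves code density. Thumb-2 extends Thumb with 32-bit instructions. Alignment behavior in Thumb and Thumb-2 code depends on the core the code runs on rather than on the instruction encoding:

  • On ARMv6-M cores (Cortex-M0/M0+), only byte accesses can use arbitrary addresses; unaligned halfword and word accesses are not supported
  • On ARMv7-M cores (Cortex-M3/M4), unaligned LDR, STR, LDRH, and STRH accesses are supported unless the UNALIGN_TRP bit in the Configuration and Control Register is set
  • On ARMv7-A and ARMv7-R cores, Thumb code follows the same alignment rules as ARM code

This means that ARMv7 cores handle unaligned halfword and word accesses in both ARM and Thumb code, while on a Cortex-M0/M0+ any unaligned halfword or word access generates a fault. On every profile, load/store multiple, LDRD/STRD, and exclusive accesses must be aligned.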

ARMv8 and unaligned accesses

The newer 64-bit ARMv8 architecture maintains support for unaligned accesses to Normal memory, but alignment faults may still occur in some situations, such as accesses to Device memory or when alignment checking is enabled through the SCTLR_ELx.A bit. ARM recommends avoiding unaligned accesses when possible for optimal performance.

Some key notes on unaligned accesses in ARMv8:

  • Unaligned accesses may carry a performance penalty, particularly when they cross a cache-line boundary
  • Exclusive and load-acquire/store-release accesses must be naturally aligned; unaligned ones generate an alignment fault
  • Some SIMD/NEON load and store forms have their own alignment requirements

Software handling of unaligned accesses

If unaligned accesses are required, software needs to handle them carefully. Compilers and hand-written assembly can deal with unaligned accesses on ARM in several ways:

  • Using the LDR and STR instructions described above for explicit unaligned loads/stores
  • Using compiler qualifiers and attributes such as __packed, __unaligned, or __attribute__((packed)) to mark unaligned data
  • Using memcpy to move unaligned data instead of direct access
  • Copying data to/from an aligned buffer before accessing
  • Issuing multiple aligned accesses and shifting/masking to simulate an unaligned access
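The memcpy approach from the list above can be sketched as follows. The helper names are hypothetical. The sketch relies only on standard C, so it works regardless of whether the target core supports unaligned word accesses; the value is read and written in the host's native byte order.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Loads a 32-bit value from an address with no alignment guarantee.
 * memcpy copies into a properly aligned local, so no misaligned
 * pointer is ever dereferenced. */
uint32_t load_u32_unaligned(const void *src)
{
    uint32_t value;
    assert(src != NULL);
    memcpy(&value, src, sizeof value);
    return value;
}

/* Stores a 32-bit value to an address with no alignment guarantee. */
void store_u32_unaligned(void *dst, uint32_t value)
{
    assert(dst != NULL);
    memcpy(dst, &value, sizeof value);
}
```

Compilers commonly replace a fixed-size memcpy like this with a single load or store when the target allows unaligned accesses, and with byte accesses when it does not, which is why this pattern is often preferred over casting a byte pointer to uint32_t *.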

In situations where alignment is not known at compile time, runtime checks may be needed to decide between aligned and unaligned access methods.
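A runtime check of this kind might look like the sketch below. The function sum_u32 is a hypothetical example: it sums 32-bit values from a buffer, taking a direct word-access path when the pointer is 4-byte aligned and a memcpy path otherwise. The aligned path assumes the buffer really does hold uint32_t objects.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Sums `count` 32-bit values starting at `data`, which may or may
 * not be 4-byte aligned. The sum wraps modulo 2^32. */
uint32_t sum_u32(const void *data, size_t count)
{
    uint32_t total = 0;
    assert(data != NULL || count == 0);

    if (((uintptr_t)data % sizeof(uint32_t)) == 0) {
        /* Aligned path: direct word accesses. */
        const uint32_t *words = (const uint32_t *)data;
        for (size_t i = 0; i < count; i++) {
            total += words[i];
        }
    } else {
        /* Unaligned path: copy each value into an aligned local. */
        const unsigned char *bytes = (const unsigned char *)data;
        for (size_t i = 0; i < count; i++) {
            uint32_t v;
            memcpy(&v, bytes + i * sizeof v, sizeof v);
            total += v;
        }
    }
    return total;
}
```

Whether the extra branch pays off depends on how often the data is actually aligned; when the memcpy path compiles to efficient code on the target, using it unconditionally is simpler.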

Performance implications

There are significant performance advantages to keeping data aligned where possible. Some benchmarks indicate unaligned 32-bit ARM accesses are 40-60% slower than aligned accesses, although the exact figure depends on the core and the memory system. Cores or configurations that trap unaligned accesses add the much larger cost of handling an alignment fault.

Modern ARM cores handle most unaligned accesses in hardware, which reduces the penalty, although load/store multiple instructions still require aligned addresses. Overall, maintaining alignment for performance-sensitive code is recommended. Unaligned accesses should be minimized and isolated from critical paths if possible.
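One way to keep data aligned is to order structure members so that each one falls on its natural boundary, and to let the compiler verify that layout. The sketch below uses a hypothetical sensor_sample structure; the member names are illustrative only. It uses the C11 static_assert macro from assert.h, so it assumes a C11 or later compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Members are ordered largest first so that each sits on a multiple
 * of its own size without any internal padding. */
struct sensor_sample {
    uint32_t timestamp; /* offset 0, 4-byte aligned */
    uint16_t channel;   /* offset 4, 2-byte aligned */
    uint8_t  flags;     /* offset 6 */
    uint8_t  reserved;  /* offset 7, keeps the total size at 8 */
};

/* Compile-time checks: the build fails if a later edit to the
 * structure breaks the natural alignment of a member. */
static_assert(offsetof(struct sensor_sample, timestamp) % sizeof(uint32_t) == 0,
              "timestamp must be 4-byte aligned");
static_assert(offsetof(struct sensor_sample, channel) % sizeof(uint16_t) == 0,
              "channel must be 2-byte aligned");
static_assert(sizeof(struct sensor_sample) == 8,
              "unexpected padding in sensor_sample");
```

Packing the same members in a different order, or marking the structure as packed, could place timestamp at an odd offset and force the compiler to fall back to slower access sequences.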

Conclusion

While ARM allows unaligned memory access on most modern cores, aligned access is strongly recommended for performance reasons. Unaligned accesses can carry a penalty, and not every instruction type or core supports them. Software that must touch unaligned data should mark it with compiler qualifiers or use byte-wise techniques such as memcpy, and should be prepared for alignment faults on cores or configurations that trap them. Critical code should avoid unaligned accesses on ARM where possible, but the capability is available if needed.
