How are the atomic functions implemented in case of ARM architecture?

Eileen David
Last updated: September 12, 2023 1:59 pm

Atomic functions on ARM let threads update shared data without race conditions. They are built on the exclusive access instructions that ARM processors provide. These instructions let a thread read a memory location, compute a new value, and write it back only if nothing else has written to that location in the meantime, so the whole read-modify-write appears atomic.

Contents
  • Load-Linked and Store-Conditional Instructions
  • ARM Atomic Instructions
  • Implementing Common Atomic Operations
  • Atomic Exchange
  • Atomic Compare And Swap
  • Atomic Add/Increment
  • Atomic Flags
  • Memory Barriers
  • Locks Using Exclusives
  • Compiler Atomic Builtins
  • Summary

Load-Linked and Store-Conditional Instructions

The exclusive access instructions used to implement atomics on ARM follow the Load-Linked (LL) / Store-Conditional (SC) model. LL loads a value from memory and marks the location as “linked” (on ARM, by tagging it in the exclusive monitor). SC stores a value only if no other thread or core has written to the linked location since the LL. Together they provide atomic read-modify-write semantics.

For example, an atomic increment in LL/SC pseudocode, with the address of the variable in R2:

    LL  R1, [R2]     // Load linked
    ADD R1, R1, #1   // Modify value
    SC  [R2], R1     // Store conditional

If no other thread wrote to [R2] between the LL and the SC, the store succeeds and the increment is atomic. The SC reports whether it succeeded or failed. On failure, the code simply retries the whole LL-modify-SC sequence until it succeeds.
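The retry pattern described above can be sketched in portable C. This is only an analogy, not the article's own code: C11 has no direct LL/SC primitive, so the sketch uses atomic_compare_exchange_weak, which checks the stored value rather than an exclusive monitor. The helper name atomic_double is hypothetical.

```c
#include <stdatomic.h>

/* Atomically doubles *p and returns the value it held before.
 * atomic_double is a hypothetical helper, not a library function. */
int atomic_double(atomic_int *p)
{
    int old = atomic_load(p);       /* plays the role of LL: snapshot the value */
    int desired;
    do {
        desired = old * 2;          /* modify a private copy */
        /* Plays the role of SC: the store happens only if *p still equals old.
         * On failure, old is refreshed with the current value and we retry. */
    } while (!atomic_compare_exchange_weak(p, &old, desired));
    return old;
}
```

The weak variant is allowed to fail spuriously, which is why it sits inside a loop, much like an SC that can fail even when no conflicting write occurred.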

ARM Atomic Instructions

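The ARMv6 architecture introduced LDREX and STREX, which are ARM's LL and SC instructions. They replace the older SWP instruction, which is deprecated from ARMv6 onward. They do not remove the need for a retry loop: software still loops until STREX reports success. In AArch64 the corresponding instructions are spelled LDXR and STXR.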

Key instructions include:

  • LDREX/STREX – Load/Store exclusive register
  • LDREXB/STREXB – Load/Store exclusive byte
  • LDREXH/STREXH – Load/Store exclusive halfword

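LDREX loads a word from memory into a register and tags the address in the exclusive monitor. STREX stores to that address only if the tag is still intact, that is, if no intervening store to the location took place. A basic atomic RMW sequence, with the address in R2, looks like this:

    LDREX R1, [R2]
    ADD   R1, R1, #1
    STREX R0, R1, [R2]

STREX writes its status to R0: 0 if the store succeeded, 1 if it failed. On failure, retry from the LDREX.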

Implementing Common Atomic Operations

Using the exclusive load/store instructions, various common atomic primitives can be implemented as short retry loops:

Atomic Exchange

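    loop:
        LDREX R0, [R2]        @ R0 = current value
        STREX R1, R3, [R2]    @ try to store the new value, R1 = status
        CMP   R1, #0
        BNE   loop            @ retry if the store failed

R2 holds the address and R3 the value to exchange with the current contents of [R2]. R1 is a scratch register for the STREX status. The loop loads the current value from [R2] into R0 exclusively, stores the new value from R3, and repeats if the store fails, so R0 ends up holding the old value. CMP/BNE is used rather than CBNZ because, in the Thumb instruction set, CBNZ can only branch forward.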
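In C, the same exchange is available as atomic_exchange. A minimal sketch of one typical use follows: collecting a set of pending event bits and clearing them in a single step. The names pending_events, post_event and take_events are hypothetical, chosen only for this illustration.

```c
#include <stdatomic.h>

/* Hypothetical bitmask of events waiting to be handled. */
static atomic_uint pending_events;

/* Marks event number `bit` as pending. */
void post_event(unsigned bit)
{
    atomic_fetch_or(&pending_events, 1u << bit);
}

/* Returns all pending events and resets the mask to 0 in one atomic step,
 * so an event posted concurrently is neither lost nor reported twice. */
unsigned take_events(void)
{
    return atomic_exchange(&pending_events, 0u);
}
```

On ARM cores without single-instruction atomics, a compiler would typically emit an exclusive load/store loop for both functions.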

Atomic Compare And Swap

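    loop:
        LDREX R0, [R2]        @ R0 = current value
        CMP   R0, R3          @ does it match the expected value?
        BNE   fail
        STREX R1, R4, [R2]    @ try to store the new value, R1 = status
        CMP   R1, #0
        BNE   loop            @ retry if the store failed
    fail:

R2 holds the address, R3 the expected value, and R4 the new value. The loop compares the current value in [R2] against the expected value and stores the new value only if they match. If the values differ, control falls out at fail with the observed value in R0 and memory unchanged.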
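A compare-and-swap in C might look like the sketch below, which uses atomic_compare_exchange_strong to perform a one-shot state transition. The state names and the try_start helper are assumptions made for illustration only.

```c
#include <stdatomic.h>
#include <stdbool.h>

enum { STATE_IDLE, STATE_RUNNING, STATE_DONE };   /* hypothetical states */

/* Tries to move *state from STATE_IDLE to STATE_RUNNING.
 * Returns true if this caller performed the transition.
 * *seen receives the value that was found in *state. */
bool try_start(atomic_int *state, int *seen)
{
    int expected = STATE_IDLE;
    bool ok = atomic_compare_exchange_strong(state, &expected, STATE_RUNNING);
    *seen = expected;   /* on failure, expected was overwritten with the actual value */
    return ok;
}
```

The strong variant does not fail spuriously, so no loop is needed when a single attempt is all that is wanted.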

Atomic Add/Increment

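    loop:
        LDREX R1, [R2]        @ R1 = current value
        ADD   R1, R1, #1
        STREX R3, R1, [R2]    @ try to store the result, R3 = status
        CMP   R3, #0
        BNE   loop            @ retry if the store failed

Atomically increments the word at [R2] using an LDREX/STREX retry loop.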
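The C equivalent is atomic_fetch_add, which returns the value held before the addition. As a small sketch, here is a unique-ID allocator; next_id and alloc_id are hypothetical names. It assumes callers only need distinct IDs and no ordering with other data, which is why relaxed ordering is used.

```c
#include <stdatomic.h>

/* Hypothetical counter holding the next ID to hand out. */
static atomic_uint next_id = 1;

/* Returns a distinct ID on every call, even when called from several threads.
 * Relaxed ordering is enough here because only the counter itself is shared. */
unsigned alloc_id(void)
{
    return atomic_fetch_add_explicit(&next_id, 1u, memory_order_relaxed);
}
```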

Atomic Flags

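    @ R1 = address of the flag byte, which is initially 0
    loop:
        LDREXB R2, [R1]        @ read the flag exclusively
        CMP    R2, #0
        BNE    loop            @ spin while it is set
        MOV    R2, #1
        STREXB R3, R2, [R1]    @ try to set it, R3 = status
        CMP    R3, #0
        BNE    loop            @ retry if the store failed

Spins until the byte at [R1] is 0, then atomically sets it to 1.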
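C11 exposes this kind of flag as atomic_flag. The sketch below shows a non-spinning use, where the first caller to set the flag performs a one-time action and later callers skip it. The names init_done, init_calls and init_once are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

static atomic_flag init_done = ATOMIC_FLAG_INIT;   /* starts clear */
static int init_calls;                              /* counts one-time setups */

/* Returns true for exactly one caller: the one that found the flag clear
 * and set it. Every later caller sees the flag already set and returns false. */
bool init_once(void)
{
    if (atomic_flag_test_and_set(&init_done))
        return false;       /* flag was already set by someone else */
    init_calls++;           /* stand-in for the real one-time setup work */
    return true;
}
```

Note that a caller receiving false only knows that setup has been claimed, not that it has finished; waiting for completion would need additional synchronization.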

Memory Barriers

The exclusive instructions make a single read-modify-write atomic, but they do not order it against other memory accesses. Memory barriers are needed for that ordering when data is shared across cores. Key barriers include:

  • DMB – Data memory barrier
  • DSB – Data synchronization barrier
  • ISB – Instruction synchronization barrier

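DMB ensures that memory accesses before the barrier are observed before memory accesses after it. DSB is stronger: it stalls execution until all earlier memory accesses have completed. ISB flushes the pipeline so that the instructions after it are fetched again.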

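For example, the strongest combination may be coded as:

    DMB ISH
    DSB SY
    ISB

This orders memory operations, waits for outstanding accesses to complete, and flushes the pipeline. It is heavier than most code needs: DSB SY already includes the ordering that DMB provides, and ISB matters mainly after changes that affect instruction fetch, such as system register updates or self-modifying code. For ordering ordinary data accesses between cores, a single DMB ISH is usually enough.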
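From C, barriers are usually requested through atomic_thread_fence rather than written by hand. The following sketch shows the classic message-passing pattern, assuming one producer and one consumer. The names payload, ready, publish and try_consume are hypothetical, and the exact barrier instruction emitted depends on the compiler and target (on ARMv7 a release or acquire fence typically becomes DMB ISH).

```c
#include <stdatomic.h>
#include <stdbool.h>

static int payload;          /* ordinary, non-atomic data */
static atomic_bool ready;    /* flag that announces the data */

/* Producer: write the data, then raise the flag. */
void publish(int value)
{
    payload = value;
    atomic_thread_fence(memory_order_release);   /* data write ordered before flag write */
    atomic_store_explicit(&ready, true, memory_order_relaxed);
}

/* Consumer: returns true and fills *out once the data has been published. */
bool try_consume(int *out)
{
    if (!atomic_load_explicit(&ready, memory_order_relaxed))
        return false;
    atomic_thread_fence(memory_order_acquire);   /* flag read ordered before data read */
    *out = payload;
    return true;
}
```

Without the two fences, a consumer on another core could observe ready as true and still read a stale payload.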

Locks Using Exclusives

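Exclusive instructions can also implement locks and mutexes. A simple spinlock, with the address of LOCKED in R1, can be:

    LOCKED: .byte 0            @ 0 = free, 1 = held

    lock:
        LDREXB R2, [R1]        @ read the lock byte exclusively
        CMP    R2, #0
        BNE    lock            @ spin while it is held
        MOV    R2, #1
        STREXB R3, R2, [R1]    @ try to take it, R3 = status
        CMP    R3, #0
        BNE    lock            @ retry if the store failed
        DMB    ISH             @ keep critical-section accesses after the acquire
        BX     LR

    unlock:
        DMB    ISH             @ complete critical-section accesses first
        MOV    R2, #0
        STRB   R2, [R1]        @ a plain store releases the lock
        BX     LR

lock spins until the lock byte is 0, then atomically sets it to 1 with an LDREXB/STREXB retry loop. unlock sets it back to 0 with a plain STRB: no exclusive store is needed, because only the holder writes the lock at that point, and a STREXB without a preceding LDREXB would not succeed anyway. The DMB after acquiring and before releasing keeps the memory accesses of the critical section inside it.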
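The same lock can be sketched in C11 with atomic_flag, letting the compiler pick the instructions and barriers. The struct and function names below are hypothetical; acquire ordering on lock and release ordering on unlock play the role of the barriers around the critical section.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct spinlock { atomic_flag held; };           /* hypothetical lock type */
#define SPINLOCK_INIT { ATOMIC_FLAG_INIT }       /* initializer: lock is free */

/* Single attempt: returns true if the lock was free and is now ours. */
bool spin_trylock(struct spinlock *l)
{
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

/* Spins until the lock has been acquired. */
void spin_lock(struct spinlock *l)
{
    while (!spin_trylock(l))
        ;   /* busy-wait; a real implementation might yield or use WFE here */
}

/* Releases the lock; writes made while holding it become visible
 * to the next thread that acquires it. */
void spin_unlock(struct spinlock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

A spinlock like this suits very short critical sections only; for longer ones, a blocking mutex from the operating system or RTOS is usually the better choice.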

Compiler Atomic Builtins

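Instead of hand-coded assembly, the C11 atomics in <stdatomic.h>, and the compiler builtins beneath them, provide a simpler way to use the architectural atomic support from C/C++ code:

    #include <stdatomic.h>

    atomic_int x;
    atomic_store(&x, 10);              // Atomic store
    int y = atomic_load(&x);           // Atomic load
    int z = atomic_fetch_add(&x, 2);   // Atomic RMW add

The read-modify-write operations are implemented using LDREX/STREX or equivalent instructions, and the compiler handles generation of the instruction sequences and retry loops. Atomic loads and stores of naturally aligned words generally compile to ordinary load and store instructions plus whatever barriers the requested memory order requires.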
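For comparison, here is the same store, load and add written with the __atomic builtins that GCC and Clang provide. These are compiler-specific rather than standard C, and the wrapper names set_shared, get_shared and add_shared are hypothetical.

```c
/* GCC/Clang-specific sketch: the __atomic builtins operate on ordinary
 * objects, so no _Atomic qualifier or header is required. */
static int shared;

/* Atomic store with release ordering. */
void set_shared(int v)
{
    __atomic_store_n(&shared, v, __ATOMIC_RELEASE);
}

/* Atomic load with acquire ordering. */
int get_shared(void)
{
    return __atomic_load_n(&shared, __ATOMIC_ACQUIRE);
}

/* Atomic read-modify-write add; returns the value before the addition. */
int add_shared(int delta)
{
    return __atomic_fetch_add(&shared, delta, __ATOMIC_SEQ_CST);
}
```

Portable code is generally better served by <stdatomic.h>; the builtins are mostly useful when the variable cannot be declared atomic, for example when it belongs to an existing data layout.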

Summary

ARM architectures provide exclusive access instructions such as LDREX/STREX to enable atomic read-modify-write operations. These are used to implement common atomic primitives such as exchange, CAS, and increment. Memory barriers order those operations against other memory accesses when data is shared between cores. Atomics are essential building blocks for locks and for lock-free concurrent algorithms and data structures on ARM platforms.
