On the ARM Cortex-M3, the way data is aligned in memory affects performance and, for some instructions, correctness. The Cortex-M3 is a 32-bit core and accesses data most efficiently when each object sits on its natural boundary: 2 bytes for halfwords and 4 bytes for words. Understanding these requirements and laying out data structures accordingly is an important part of writing efficient Cortex-M3 code.
Data Alignment Basics
Alignment refers to the position of data types in relation to memory addresses. A data type is said to be “n-byte aligned” if its memory address is a multiple of n bytes. For example, a 4-byte integer is 4-byte aligned if its address is a multiple of 4. When data types are not aligned to appropriate boundaries, the processor can incur performance penalties when accessing them.
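The definition above translates directly into a bit test on the address. The following is a minimal sketch; the helper name `is_aligned` is an assumption made for illustration and is not part of any toolchain header.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true when the address held in ptr is a multiple of n bytes.
   n must be a nonzero power of two (1, 2, 4, 8, ...), because the test
   relies on masking the low address bits. */
static bool is_aligned(const void *ptr, size_t n)
{
    return ((uintptr_t)ptr & (uintptr_t)(n - 1u)) == 0u;
}
```

For example, an address ending in 0x2 passes the check for n = 2 but fails it for n = 4.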
The ARM Cortex-M3 is a 32-bit processor: its general-purpose registers and data buses are 32 bits wide. It reads or writes a 32-bit (4-byte) value in a single bus transfer when the value is aligned to a 4-byte boundary. Single load and store instructions such as LDR, STR, LDRH, and STRH also work at unaligned addresses, but the hardware then splits the access into several smaller transfers, which costs extra cycles.
Alignment Requirements
Here are the natural alignments that the ARM procedure call standard (AAPCS) assigns to the main C data types on the Cortex-M3:
- char (1 byte) – No alignment requirements
- short (2 bytes) – 2-byte aligned
- int (4 bytes) – 4-byte aligned
- long (4 bytes) – 4-byte aligned
- float (4 bytes) – 4-byte aligned
- double (8 bytes) – 8-byte aligned
- pointers (4 bytes) – 4-byte aligned
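A structure takes the alignment of its most strictly aligned member, and an array takes the alignment of its element type. For example, a structure containing ints and floats is 4-byte aligned, and the compiler inserts padding between members where needed to keep each one on its own boundary.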
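To make the effect of member alignment on structure layout concrete, here is a small sketch. The struct names and members are hypothetical, and fixed-width types are used so the sizes do not depend on the target's `int` or `long`. The offsets in the comments assume the natural alignments listed above.

```c
#include <stddef.h>
#include <stdint.h>

/* Members in this order force the compiler to insert padding so that
   each member lands on its own natural boundary. */
struct sample_padded {
    uint8_t  flag;    /* offset 0, then 1 byte of padding  */
    uint16_t count;   /* offset 2                          */
    uint8_t  mode;    /* offset 4, then 3 bytes of padding */
    uint32_t value;   /* offset 8                          */
};                    /* sizeof is 12 */

/* The same members sorted largest first need no padding at all. */
struct sample_reordered {
    uint32_t value;   /* offset 0 */
    uint16_t count;   /* offset 4 */
    uint8_t  flag;    /* offset 6 */
    uint8_t  mode;    /* offset 7 */
};                    /* sizeof is 8 */

/* Bytes saved per object by reordering the members. */
static size_t padding_bytes_saved(void)
{
    return sizeof(struct sample_padded) - sizeof(struct sample_reordered);
}
```

Both structures are 4-byte aligned because their strictest member is a `uint32_t`; only the amount of padding differs.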
Implications of Unaligned Data
If data is not properly aligned, the Cortex-M3 must split a single load or store into several smaller bus transfers. This can significantly degrade performance. For example:
- Accessing a 4-byte int at an unaligned address takes two or three bus transfers instead of one, depending on the offset within the word
- Accessing an 8-byte double, which is read as two words, takes correspondingly more transfers when neither word is aligned
These extra memory accesses have three main performance penalties:
- Increased execution cycles – More clock cycles needed to fetch unaligned data
- Reduced bandwidth – Data bus cannot transfer full width on each access
- Increased power consumption – More memory accesses consume more power
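In addition, unaligned accesses can cause faults. On the Cortex-M3 itself, load/store multiple (LDM, STM) and doubleword (LDRD, STRD) instructions raise a UsageFault when the address is not word aligned, and setting the UNALIGN_TRP bit in the Configuration and Control Register makes every unaligned halfword or word access fault. Some other ARM cores do not support unaligned accesses at all. So proper alignment is critical for both efficiency and correct operation.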
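The cost of the extra transfers can be estimated with a simplified model. The sketch below assumes that an access is split greedily into naturally aligned pieces of 1, 2, or 4 bytes; the function name `bus_transfers` is hypothetical, and the result is an illustration of the splitting idea, not a cycle-accurate description of the bus interface.

```c
#include <stdint.h>

/* Simplified model (assumption): an access of `size` bytes starting at
   `addr` is split into naturally aligned pieces of 1, 2 or 4 bytes.
   Returns the number of pieces. */
static unsigned bus_transfers(uint32_t addr, unsigned size)
{
    unsigned transfers = 0;

    while (size > 0) {
        unsigned chunk = 4;

        /* Shrink the piece until it is aligned at addr and fits in
           the bytes that remain. A 1-byte piece always qualifies. */
        while ((addr % chunk) != 0 || chunk > size) {
            chunk /= 2;
        }

        addr += chunk;
        size -= chunk;
        transfers++;
    }
    return transfers;
}
```

Under this model a word at a 4-byte boundary needs one transfer, a word at a halfword boundary needs two, and a word at an odd address needs three (byte, halfword, byte).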
Enforcing Alignment
There are a few techniques to enforce proper alignment of data structures and variables on the Cortex M3:
- Use compiler-specific #pragma directives (for example #pragma pack) to control how specific structures are packed and aligned
- Align variables, structure members, or whole types using __attribute__((aligned(n)))
- Pad structures so that their size rounds up to an alignment boundary
- Round pointers into raw buffers up to the next aligned address before placing objects there
- Use memcpy to move data into aligned buffers before using it
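The last two techniques in the list can be sketched in a few lines. The helper names `align_up` and `load_u32` are assumptions made for this illustration; they are not part of any standard library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Round ptr up to the next multiple of n. n must be a power of two.
   The caller is responsible for making sure the rounded pointer still
   lies inside the original buffer. */
static void *align_up(void *ptr, size_t n)
{
    assert(n != 0 && (n & (n - 1)) == 0);
    uintptr_t addr = (uintptr_t)ptr;
    addr = (addr + (n - 1)) & ~(uintptr_t)(n - 1);
    return (void *)addr;
}

/* Read a 32-bit value from an address of any alignment. memcpy has
   defined behaviour for every address, so the source code never
   dereferences a misaligned uint32_t pointer. */
static uint32_t load_u32(const void *src)
{
    uint32_t value;
    memcpy(&value, src, sizeof value);
    return value;
}
```

A typical use would be parsing a received byte stream: `load_u32(packet + 1)` reads a field at an odd offset, while `align_up(pool, 4)` finds the first word boundary inside a raw memory pool.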
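The compiler and linker also provide command-line options that affect alignment. For example, the GNU toolchain supports -mstructure-size-boundary for structure sizes, -falign-functions for the alignment of function entry points (code rather than data), and -mno-unaligned-access to stop the compiler from generating unaligned halfword and word accesses itself.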
Examples
Here are examples of aligning some common data types in C code for the Cortex M3:
1. Aligning Individual Variables
    /* 4-byte align integer */
    int value __attribute__((aligned(4)));

    /* 8-byte align double */
    double real __attribute__((aligned(8)));
2. Aligning Structures
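    /* Natural layout: the compiler pads so that every member is aligned.
       Do not add __packed__ here; it removes the padding and drops the
       structure's alignment to 1 byte. */
    struct Data {
        char  c;    /* offset 0, then 1 byte of padding  */
        short s;    /* offset 2                          */
        char  c2;   /* offset 4, then 3 bytes of padding */
        int   i;    /* offset 8, 4-byte aligned          */
    };

    /* Explicitly 4-byte align the whole structure */
    struct __attribute__((aligned(4))) Data2 {
        char  c;
        short s;
        char  c2;
        int   i;    /* 4-byte aligned */
    };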
3. Aligning Arrays
    /* 4-byte align array of ints */
    int values[10] __attribute__((aligned(4)));

    /* Align char array to 4-byte boundary */
    char buffer[15] __attribute__((aligned(4)));
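The attribute syntax above is specific to GCC-compatible compilers. As a hedged alternative, the same requests can be written with the standard C11 `alignas` specifier from `<stdalign.h>`; the variable and struct names below are hypothetical.

```c
#include <stdalign.h>
#include <stdint.h>

/* C11 spelling of the alignment requests made with GCC attributes. */
static alignas(4) int32_t value32;
static alignas(8) double  real64;
static alignas(4) char    byte_buffer[15];

/* alignas on a member raises the alignment of the whole structure,
   and the structure size is rounded up to a multiple of it. */
struct frame {
    alignas(8) uint8_t payload[6];   /* offset 0 */
    uint16_t crc;                    /* offset 6 */
};                                   /* sizeof is 8, alignment is 8 */
```

Because `alignas` is part of the language, this form also works with compilers that do not understand `__attribute__`, provided they support C11.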
Checking Alignment
It can be useful to check alignment of variables and structures during development. Some ways to check alignment in C code are:
- Use the __alignof__ operator (or _Alignof in C11) to get the alignment of a type
- Print pointer addresses to check if aligned
- Use inline assembly to directly check alignment
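    #include <stdio.h>

    int x;

    /* Print alignment requirement of the type */
    printf("int alignment: %u\n", (unsigned)__alignof__(int));

    /* Print actual address; the low two bits are 0 when 4-byte aligned */
    printf("x address: %p\n", (void *)&x);

    /* Inline assembly: mask the low two address bits (0 means aligned) */
    unsigned low_bits;
    __asm__ volatile("and %0, %1, #3" : "=r"(low_bits) : "r"(&x));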
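Run-time printing only catches problems when the code executes. A complementary approach, sketched below with a hypothetical `sensor_record` type, is to state the layout assumptions as C11 `_Static_assert` checks so that the build fails as soon as an edit breaks them.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record type used only for this illustration. */
struct sensor_record {
    uint32_t timestamp;
    uint16_t reading;
    uint8_t  channel;
    uint8_t  flags;
};

/* Compilation stops with the given message if any assumption fails. */
_Static_assert(_Alignof(struct sensor_record) == 4,
               "sensor_record must be word aligned");
_Static_assert(offsetof(struct sensor_record, reading) % 2 == 0,
               "reading must be halfword aligned");
_Static_assert(sizeof(struct sensor_record) == 8,
               "sensor_record must not contain padding");
```

These checks cost nothing at run time, which makes them a reasonable fit for structures that are shared with hardware or sent over a link.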
Conclusion
Alignment of data structures and variables is an important optimization on the ARM Cortex-M3. Keeping each object on its natural boundary (2 bytes for halfwords, 4 bytes for words) avoids the extra bus transfers that unaligned accesses incur. Unaligned data can degrade performance and, with some instructions or configurations, cause faults. Techniques like compiler directives, padding, explicit alignment attributes, and manually aligned pointers help enforce proper alignment. Check alignment during development using the operators provided by the compiler and, where useful, inline assembly.