Syntax for inline assembly operands in GCC

The GNU Compiler Collection (GCC) provides an inline assembler that allows embedding assembly language code into C and C++ programs. This can be useful for writing time-critical code segments, accessing processor-specific instructions, or implementing code requiring direct hardware access.

Contents

Operand Syntax
Specifying Operand Constraints
Matching Constraints to Operands
Modifiers
Specifying Clobbers
Naming Operands
Putting It Together: Examples
Special Operand Names
Memory Operands
Offsettable Memory References
Operand Size and Type
Type Specifiers
Extended Asm Statements
Things to Keep in Mind
Conclusion

One of the key components of using GCC’s inline assembler is properly defining the operands that are passed between the assembly code and surrounding C/C++ code. This article provides an overview of the syntax and features for specifying inline assembly operands in GCC.

Operand Syntax

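Each operand in an extended asm statement is specified using the following syntax: [ [asm symbolic name] ] "constraint" (C expression)

Only the symbolic name is optional:

asm symbolic name – optional name, written in square brackets, used to refer to the operand from the assembly template
constraint – string literal describing where the operand may be placed (register, memory, immediate) and, for outputs, how it is accessed
C expression – the C variable or expression bound to the operand; an output must be an lvalue

Operands are grouped into colon-separated lists after the assembly template: outputs first, then inputs. Clobbers are not operands; they form a third list naming the registers and resources the assembly code modifies.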

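For example: "r" (input_var)

This defines an operand with the "r" (register) constraint bound to the C variable input_var. Placed in the input list, it asks GCC to load input_var into a general-purpose register before the assembly code runs. The constraint is a string literal, so its quotes are required, as are the parentheses around the C expression.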
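
As a minimal sketch of this syntax, the hypothetical function below binds one output and one input operand. It assumes an x86-64 target with AT&T assembler syntax and falls back to plain C elsewhere. The spelling __asm__ is used because it is also accepted under strict ISO modes such as -std=c11.

```c
/* Hypothetical helper: copies src into dst through registers. */
static int copy_via_register(int src)
{
    int dst;
#if defined(__GNUC__) && defined(__x86_64__)
    /* Each operand is written as "constraint" (C expression):
     *   "=r" (dst)  write-only output held in a register
     *   "r"  (src)  input held in a register
     * Outputs are numbered first, so %0 is dst and %1 is src. */
    __asm__ ("movl %1, %0" : "=r" (dst) : "r" (src));
#else
    dst = src; /* portable fallback for other targets */
#endif
    return dst;
}
```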

Specifying Operand Constraints

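The constraint specifies where the operand may live. It tells GCC which kinds of location (a register, memory, or an immediate constant) the assembly instruction accepts, so the compiler can place the C value there before the instruction runs and retrieve results afterwards.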

Some common constraint codes are:

r – general purpose register
g – any general register, memory or immediate integer operand
m – memory operand
i – immediate integer operand (a constant known at assembly or link time)
o – offsettable memory operand
X – any operand whatsoever

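Several constraint letters can be combined in one constraint string, in which case the operand may satisfy any of them. For example, “rm” matches either a register or a memory operand. Commas have a different meaning: they separate multi-alternative constraints, where the same-numbered alternative is chosen for every operand in the statement together.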
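
The sketch below shows a combined constraint in practice, assuming x86-64 with AT&T syntax (plain C elsewhere). The function name add_flexible is hypothetical. Because the x86 addl instruction accepts its source in either a register or memory, “rm” leaves the choice to GCC.

```c
/* Hypothetical helper: returns acc + x. */
static int add_flexible(int acc, int x)
{
#if defined(__GNUC__) && defined(__x86_64__)
    /* "rm": x may be supplied in a register or directly from memory.
     * "+r": acc is read and then overwritten with the sum.
     * "cc": addl changes the condition flags. */
    __asm__ ("addl %1, %0" : "+r" (acc) : "rm" (x) : "cc");
#else
    acc += x; /* portable fallback for other targets */
#endif
    return acc;
}
```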

Matching Constraints to Operands

GCC numbers operands in the order they are listed, starting from zero, with outputs first and inputs after them. The assembler template refers to them as %0, %1, and so on. Output operands represent values written by the assembly code; input operands are values it reads.

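For example: asm ("add %1, %2, %0" : "=r" (result) : "r" (input1), "r" (input2));

Here %0 is result, %1 is input1, and %2 is input2. GCC loads input1 and input2 into registers, the instruction adds them, and GCC copies the output register back into result. The mnemonic and operand order are illustrative (three operands, destination last); GCC does not parse the instructions in the template, so the real syntax depends on the target architecture.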
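
For a concrete three-operand instance of this numbering, the sketch below assumes x86-64 with AT&T syntax and uses leaq, which can add two registers into a third. The function name add3op is hypothetical, and other targets take a plain C fallback.

```c
/* Hypothetical helper: returns a + b using one three-operand instruction. */
static long long add3op(long long a, long long b)
{
    long long sum;
#if defined(__GNUC__) && defined(__x86_64__)
    /* Operand numbers follow list order, outputs first:
     *   %0 = sum (output), %1 = a, %2 = b (inputs).
     * leaq computes the address expression a + b without touching
     * the condition flags, so no "cc" clobber is listed. */
    __asm__ ("leaq (%1,%2), %0" : "=r" (sum) : "r" (a), "r" (b));
#else
    sum = a + b; /* portable fallback for other targets */
#endif
    return sum;
}
```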

Modifiers

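Modifier characters in a constraint string tell GCC how an operand is accessed:

= – operand is write-only (output); its previous value is discarded
+ – operand is both read and written
& – early clobber: the output is written before all inputs have been consumed, so it must not share a register with any input
% – operand is commutative with the operand that follows it
? – slightly disparage the constraint alternative in which it appears (multi-alternative constraints only)

Every output operand's constraint must begin with = or +.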
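
The difference between =, + and & can be seen in a small sketch, assuming x86-64 with AT&T syntax (plain C elsewhere). Both function names are hypothetical.

```c
/* "+r": x is read by incl and then overwritten with x + 1. */
static int increment(int x)
{
#if defined(__GNUC__) && defined(__x86_64__)
    __asm__ ("incl %0" : "+r" (x) : : "cc");
#else
    x += 1; /* portable fallback */
#endif
    return x;
}

/* "=&r": out is written by the first instruction, before a and b
 * have been read, so it must not share a register with either input. */
static int sum_plus_one(int a, int b)
{
    int out;
#if defined(__GNUC__) && defined(__x86_64__)
    __asm__ ("movl $1, %0\n\t"  /* out = 1 (inputs not yet consumed) */
             "addl %1, %0\n\t"  /* out += a */
             "addl %2, %0"      /* out += b */
             : "=&r" (out)
             : "r" (a), "r" (b)
             : "cc");
#else
    out = 1 + a + b; /* portable fallback */
#endif
    return out;
}
```

Without the & in sum_plus_one, GCC would be free to place out in the same register as a or b, and the first instruction would destroy that input.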

Specifying Clobbers

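The optional clobber list names any registers or resources that the inline assembly code overwrites (clobbers) beyond its declared outputs, separated by commas. For example: asm ("instr" : : : "r1", "r2", "cc");

This indicates that the assembly instruction “instr” modifies registers r1 and r2 and the condition flags (“cc”). GCC will not keep live values in clobbered registers across the statement. The special clobber “memory” tells GCC that the code reads or writes memory other than its listed operands.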
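
A register clobber is needed whenever the template uses a fixed register on its own. The sketch below assumes x86-64 with AT&T syntax, where a variable-count shift takes its count from %cl; shift_left is a hypothetical name, and other targets use plain C.

```c
/* Hypothetical helper: returns value << (count mod 32). */
static unsigned int shift_left(unsigned int value, unsigned int count)
{
#if defined(__GNUC__) && defined(__x86_64__)
    /* The template copies count into %ecx itself, so "ecx" is listed
     * as clobbered and GCC keeps both operands out of that register.
     * "cc" covers the flags changed by shll. Literal register names
     * need a doubled % in a template that has operands. */
    __asm__ ("movl %1, %%ecx\n\t"
             "shll %%cl, %0"
             : "+r" (value)
             : "r" (count)
             : "ecx", "cc");
#else
    value <<= (count & 31u); /* fallback; x86 also masks the count to 5 bits */
#endif
    return value;
}
```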

Naming Operands

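The symbolic operand name provides a way to reference the operand from within the assembly code. For example: asm ("mov %[result], #1" : [result] "=r" (myresult));

This allows the assembly code to write %[result] instead of a positional reference such as %0, which keeps the template readable and correct when operands are added or reordered. The symbolic name is independent of the C variable name; here result refers to myresult. The example uses ARM-style syntax for the immediate #1.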
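
The same idea in a form that can be compiled and checked, assuming x86-64 with AT&T syntax (plain C elsewhere); the function and operand names are hypothetical:

```c
/* Hypothetical helper: returns acc * factor. */
static int multiply_named(int acc, int factor)
{
#if defined(__GNUC__) && defined(__x86_64__)
    /* %[acc] and %[factor] are resolved by name, so the template
     * stays correct even if the operand lists are reordered. */
    __asm__ ("imull %[factor], %[acc]"
             : [acc] "+r" (acc)
             : [factor] "r" (factor)
             : "cc");
#else
    acc *= factor; /* portable fallback for other targets */
#endif
    return acc;
}
```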

Putting It Together: Examples

To tie the key concepts together, here are some examples of inline assembly operands in GCC. The mnemonics (dbl, add, mul, load) are placeholders for target-specific instructions:

    int input = 10, input1 = 3, input2 = 4;
    int result;
    int *address = &input;

    // Simple input operand
    asm ("dbl %[input]" : : [input] "r" (input));

    // Input and output operand
    asm ("add %[input], #1, %[result]" : [result] "=r" (result) : [input] "r" (input));

    // Multiple inputs, one output
    asm ("mul %[in1], %[in2], %[out]" : [out] "=r" (result) : [in1] "r" (input1), [in2] "r" (input2));

    // Early clobber operand
    asm ("load %[addr], %[out]" : [out] "=&r" (result) : [addr] "r" (address) : "memory");

In the first example, the “r” constraint places the input variable in a register for the assembly instruction to read. In the second, an output operand is added to receive the result. The third shows multiple inputs. The fourth marks its output as early clobber with “=&r”, so GCC will not assign it the same register as the addr input, and lists “memory” as a clobber because the instruction reads memory that is not declared as an operand.

Special Operand Names

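Inside the assembler template, operands and a few special sequences are written with a % prefix:

%0 – first operand in the statement (the first output, if there are any outputs)
%1, %2, etc. – subsequent operands in list order: remaining outputs first, then inputs
%[name] – operand declared with the symbolic name [name]
%% – a literal % character, for example before a register name on targets that use the % prefix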

For example: asm ("op %1, %2, %0" : "=r" (result) : "r" (input1), "r" (input2));

Because result is the only output, it is %0; input1 and input2 follow as %1 and %2 in the order they are listed.

Memory Operands

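Memory operands let the inline assembly read or write a C object in memory directly. They use the “m” constraint, and the C expression is the object itself, not its address.

For example: int value = 0; int result; asm ("ldr %[result], %[mem]" : [result] "=r" (result) : [mem] "m" (value));

GCC substitutes a complete addressing expression for %[mem], such as [r3] on ARM, and knows that the statement reads value. The alternative is to pass the address in a register with "r" (&value) and write [%[addr]] in the template; in that case GCC sees only a pointer, so a “memory” clobber (or an additional “m” input) is needed to tell it that the pointed-to memory is accessed.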
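
A minimal sketch of memory operands in both directions, assuming x86-64 with AT&T syntax (plain C elsewhere). The helper names load_int, store_int and roundtrip are hypothetical.

```c
/* "m" (*p): the pointed-to int is a memory input; GCC prints its address. */
static int load_int(const int *p)
{
    int out;
#if defined(__GNUC__) && defined(__x86_64__)
    __asm__ ("movl %[mem], %[out]" : [out] "=r" (out) : [mem] "m" (*p));
#else
    out = *p; /* portable fallback */
#endif
    return out;
}

/* "=m" (*p): the pointed-to int is a write-only memory output. */
static void store_int(int *p, int v)
{
#if defined(__GNUC__) && defined(__x86_64__)
    __asm__ ("movl %[val], %[mem]" : [mem] "=m" (*p) : [val] "r" (v));
#else
    *p = v; /* portable fallback */
#endif
}

/* Stores v into a local and reads it back through the two helpers. */
static int roundtrip(int v)
{
    int cell = 0;
    store_int(&cell, v);
    return load_int(&cell);
}
```

Because the memory locations are declared as operands, no “memory” clobber is required here.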

Offsettable Memory References

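Many targets can address memory as a base register plus a constant offset, as in the ARM instruction: ldrb r1, [r2, #8]

An offsettable memory reference is a memory operand whose address remains valid when a small constant offset is added to it. The “o” constraint requests such an operand: "o" (byte_pointer[8]), where byte_pointer is a pointer to unsigned char. (With a struct pointer, struct_pointer + 8 would advance by eight whole structures, not eight bytes.)

GCC then supplies an address, typically a base register plus a constant displacement, that the template can adjust by a further constant offset. As with “m”, the C expression is the object being accessed, not its address. Registers should be passed as operands rather than hard-coded in the template as in the ldrb line above; a hard-coded register that the code modifies must at least appear in the clobber list.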

Operand Size and Type

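The size and type of each operand come from the C expression bound to it. A 32-bit int and a 64-bit long long are different operands as far as register allocation is concerned, and on a 32-bit target a long long may need a register pair.

Constraints have no size suffixes. Some targets instead provide operand modifiers, written between the % and the operand number in the template, that print a register under a particular width. On x86, for example:

b – low byte register (8 bit), as in %b0
h – high byte register (8 bit), as in %h0
w – word register (16 bit), as in %w0
k – doubleword register (32 bit), as in %k0
q – quadword register (64 bit), as in %q0

For example: long long result; asm ("op %1, %2, %0" : "=r" (result) : "ri" (32), "ri" (32));

Here the literal 32 has type int, so both inputs are int-sized while the output is long long-sized. To pass a 64-bit input, give the expression a 64-bit type, for example 32LL or (long long) x.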
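
As an illustration of width handling on one target, x86 lets the template print a register operand under a narrower name through an operand modifier such as %b1 (low byte). The sketch below assumes x86-64 with AT&T syntax and uses plain C elsewhere; low_byte is a hypothetical name.

```c
/* Hypothetical helper: returns the lowest 8 bits of x, zero-extended. */
static unsigned int low_byte(unsigned int x)
{
    unsigned int out;
#if defined(__GNUC__) && defined(__x86_64__)
    /* x is a 32-bit operand because its C type is unsigned int.
     * %b1 prints the 8-bit name of the register holding x, and
     * movzbl zero-extends that byte into the 32-bit output. */
    __asm__ ("movzbl %b1, %0" : "=r" (out) : "r" (x));
#else
    out = x & 0xFFu; /* portable fallback for other targets */
#endif
    return out;
}
```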

Type Specifiers

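GCC has no constraints that select or convert the C type of an operand. The letters sometimes described as type specifiers are ordinary constraints for constant operands:

i – immediate integer operand, including symbolic constants whose value is known only at assembly or link time
n – immediate integer operand with a known numeric value
E – immediate floating-point operand, allowed only if the target's floating-point format matches the host's
F – immediate floating-point operand
X – any operand whatsoever

There is no generic u constraint; letters outside the common set have target-specific meanings.

These constraints only restrict what kind of operand is acceptable. GCC does not convert values because of a constraint; any conversion must be written in the C expression, for example with a cast.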
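
To show what an immediate constraint does in practice, the sketch below passes a compile-time constant with “i”, assuming x86-64 with AT&T syntax (plain C elsewhere). The names SCALE and add_const are hypothetical.

```c
enum { SCALE = 8 }; /* hypothetical compile-time constant */

/* Hypothetical helper: returns x + SCALE. */
static int add_const(int x)
{
#if defined(__GNUC__) && defined(__x86_64__)
    /* "i" requires a constant expression; GCC substitutes it into the
     * template as an immediate operand instead of loading it into a
     * register first. */
    __asm__ ("addl %1, %0" : "+r" (x) : "i" (SCALE) : "cc");
#else
    x += SCALE; /* portable fallback for other targets */
#endif
    return x;
}
```

Passing a run-time variable to an “i” operand is rejected at compile time, since no immediate value is available.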

Extended Asm Statements

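GCC distinguishes basic asm, which is just a template string with no operands, from extended asm, which adds the colon-separated output, input, and clobber lists used throughout this article: asm volatile ("add %[input], %[input], #1" : [input] "+r" (input));

The volatile qualifier tells GCC that the statement has side effects beyond its outputs, so the optimizer will not delete it when the outputs are unused or hoist it out of a loop. It does not by itself prevent surrounding code from being reordered around the statement; a “memory” clobber is needed to order memory accesses. Asm statements with no output operands are implicitly volatile.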
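
One common use of volatile together with the “memory” clobber is a compiler-level barrier. The sketch below uses an empty template, so it does not depend on the target architecture; all names are hypothetical. It constrains only the compiler, not the CPU, and is not a substitute for hardware fences or C11 atomics in multithreaded code.

```c
static int shared_data;
static int shared_flag;

/* Writes the data first, then the flag, with a compiler barrier between. */
static void publish(int v)
{
    shared_data = v;
    /* Empty template: no instruction is emitted. The "memory" clobber
     * stops the compiler from moving memory accesses across this point. */
    __asm__ volatile ("" : : : "memory");
    shared_flag = 1;
}

/* Publishes v and returns the stored value, or -1 if the flag is unset. */
static int published_value(int v)
{
    publish(v);
    return shared_flag ? shared_data : -1;
}
```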

Things to Keep in Mind

Here are some general guidelines when using inline assembly with GCC:

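Ensure constraint codes match the operand kinds that the assembly instructions accept.
Specify all clobbered registers, and use the “memory” clobber when memory outside the listed operands is accessed.
Give each operand a C expression of the width the instruction expects; the operand size follows the C type.
Give unique symbolic names to all operands.
Check the compiler documentation for the constraints and modifiers available on the target.
Inspect the generated assembly with the -S option, which stops after compilation and writes the assembly output, ideally at the optimization level used for the real build.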

Properly specifying constraints and operands is key to avoiding issues when mixing inline assembly with GCC generated code.

Conclusion

GCC’s inline assembler provides a flexible way to embed assembly code in C/C++ programs. The key syntax includes constraints for placing input/output operands, clobbers for declaring side effects, and symbolic names for referencing operands from assembly code. Constraint codes like “r”, “m”, and “i” specify register, memory, and immediate integer operands. Modifiers like “=”, “+”, and “&” describe how an operand is accessed. Operand sizes and types follow from the C expressions bound to the operands. Following these practices helps produce mixed C/C++ and assembly code that stays correct under optimization.