Relocation generation in assemblers
This post explores how GNU Assembler and LLVM integrated assembler generate relocations, an important step to generate a relocatable file. Relocations identify parts of instructions or data that cannot be fully determined during assembly because they depend on the final memory layout, which is only established at link time or load time. These are essentially placeholders that will be filled in (typically with absolute addresses or PC-relative offsets) during the linking process.
Relocation generation: the basics
Symbol references are the primary candidates for relocations. For instance, in the x86-64 instruction movl sym(%rip), %eax (GNU syntax), the assembler calculates the displacement between the program counter (PC) and sym. This distance affects the instruction's encoding and typically triggers a R_X86_64_PC32 relocation, unless sym is a local symbol defined within the current section.
Both the GNU assembler and LLVM integrated assembler utilize multiple passes during assembly, with several key phases relevant to relocation generation:
Parsing phase
During parsing, the assembler builds section fragments that contain instructions and other directives. It parses each instruction into its opcode (e.g., movl) and operands (e.g., sym(%rip), %eax). It identifies registers, immediate values (like 3 in movl $3, %eax), and expressions.
Expressions can be constants, symbol references (like sym), or unary and binary operators (-sym, sym0-sym1). Those unresolvable at parse time (potential relocation candidates) turn into "fixups". These often skip immediate operand range checks, as shown here:
% echo 'addi a0, a0, 2048' | llvm-mc -triple=riscv64
A fixup ties to a specific location (an offset within a fragment), with its value being the expression (which must eventually evaluate to a relocatable expression).
Meanwhile, the assembler tracks defined and referenced symbols, and for ELF, it tracks symbol bindings (STB_LOCAL, STB_GLOBAL, STB_WEAK) from directives like .globl, .weak, or the rarely used .local.
Section layout phase
After parsing, the assembler arranges each section by assigning precise offsets to its fragments: instructions, data, or other directives (e.g., .line, .uleb128). It calculates sizes and adjusts for alignment. This phase finalizes symbol offsets (e.g., start: at offset 0x10) while leaving external ones for the linker.
This phase, which employs a fixed-point iteration, is quite complex. I won't go into details, but you might find
Relocation decision phase
Then the assembler evaluates each fixup to determine if it can be resolved directly or requires a relocation entry. This process starts by attempting to convert fixups into relocatable expressions.
Evaluating relocatable expressions
In their most general form, relocatable expressions follow the pattern relocation_specifier(sym_a - sym_b + offset), where
- relocation_specifier: This may or may not be present. I will explain this concept later.
- sym_a is a symbol reference (the "addend")
- sym_b is an optional symbol reference (the "subtrahend")
- offset is a constant value
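The decomposition above can be sketched in Python. This is only an illustration, not how either assembler is implemented: the Sym, Const, BinOp, and Relocatable classes and the fold function are hypothetical names, and the relocation specifier is left out. The sketch folds an expression tree built from + and - into at most one added symbol, at most one subtracted symbol, and a constant.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Sym:
    name: str


@dataclass(frozen=True)
class Const:
    value: int


@dataclass(frozen=True)
class BinOp:
    op: str  # "+" or "-"
    lhs: object
    rhs: object


@dataclass(frozen=True)
class Relocatable:
    # Hypothetical mirror of sym_a - sym_b + offset.
    sym_a: Optional[str] = None
    sym_b: Optional[str] = None
    offset: int = 0


def fold(expr) -> Relocatable:
    """Fold an expression tree into sym_a - sym_b + offset, or raise."""
    if isinstance(expr, Const):
        return Relocatable(offset=expr.value)
    if isinstance(expr, Sym):
        return Relocatable(sym_a=expr.name)
    if isinstance(expr, BinOp) and expr.op in ("+", "-"):
        lhs, rhs = fold(expr.lhs), fold(expr.rhs)
        if expr.op == "-":
            # Negating the right side swaps its added and subtracted symbols.
            rhs = Relocatable(rhs.sym_b, rhs.sym_a, -rhs.offset)
        adds = [s for s in (lhs.sym_a, rhs.sym_a) if s is not None]
        subs = [s for s in (lhs.sym_b, rhs.sym_b) if s is not None]
        # A symbol that is both added and subtracted cancels out (a - a).
        for s in list(adds):
            if s in subs:
                adds.remove(s)
                subs.remove(s)
        if len(adds) > 1 or len(subs) > 1:
            raise ValueError("expression is not relocatable")
        return Relocatable(
            adds[0] if adds else None,
            subs[0] if subs else None,
            lhs.offset + rhs.offset,
        )
    raise ValueError("unsupported expression")
```

With this sketch, a-b+8 folds to Relocatable("a", "b", 8), while a+b raises because two added symbols do not fit the pattern.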
Most common cases involve only sym_a or offset (e.g., movl sym(%rip), %eax or movl $3, %eax). Only a few target architectures support the subtrahend term (sym_b). Notable exceptions include AVR and RISC-V, as explored in
Attempting to use unsupported expression forms will result in assembly errors:
% echo -e 'movl a+b, %eax\nmovl a-b, %eax' | clang -c -xassembler -
PC-relative fixups
PC-relative fixups compute their values as sym_a + offset - current_location. (I’ve skipped - sym_b, since no target I know permits a subtrahend here.)
When sym_a is a local symbol defined within the current section, these PC-relative fixups evaluate to constants. But if sym_a is a global or weak symbol in the same section, a relocation entry is generated. This ensures
Resolution Outcomes
The assembler's evaluation of fixups leads to one of three outcomes:
- Error: When the expression isn't supported.
- Resolved fixups: When the fixup evaluates to a constant, the assembler updates the relevant bits in the instruction directly. No relocation entry is needed.
  - There are target-specific exceptions that make the fixup unresolved. In AArch64 adrp x0, l0; l0:, the immediate might be either 0 or 1, dependent on the instruction address. In RISC-V, linker relaxation might make fixups unresolved.
- Unresolved fixups: When the fixup evaluates to a relocatable expression but not a constant, the assembler
  - Generates an appropriate relocation (offset, type, symbol, addend).
  - For targets that use RELA, usually zeros out the bits in the instruction field that will be modified by the linker.
  - For targets that use REL, leaves the addend in the instruction field.
  - If the referenced symbol is defined and local, and the relocation type is not in exceptions (gas tc_fix_adjustable), the relocation references the section symbol instead of the local symbol.
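The resolved and unresolved outcomes can be sketched as a small decision function. This is a simplified model under stated assumptions, not the logic of gas or LLVM: Symbol, Fixup, and resolve are hypothetical names, the PC is taken to be the fixup location itself (real targets may bias it, for example toward the end of the instruction), and the target-specific exceptions listed above are ignored.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Symbol:
    name: str
    section: Optional[str]  # None when the symbol is undefined
    offset: int = 0         # offset within its section, if defined
    binding: str = "local"  # "local", "global", or "weak"


@dataclass
class Fixup:
    section: str            # section containing the fixup
    offset: int             # location of the fixup within that section
    sym: Symbol
    addend: int = 0
    pc_relative: bool = False


def resolve(fixup: Fixup):
    """Return ("resolved", value) or ("reloc", symbol_name, addend)."""
    sym = fixup.sym
    is_local_def = sym.section is not None and sym.binding == "local"
    if fixup.pc_relative and is_local_def and sym.section == fixup.section:
        # Both ends live in the same section, so the distance is a constant.
        return ("resolved", sym.offset + fixup.addend - fixup.offset)
    if is_local_def:
        # Reference the section symbol and fold the symbol offset into the addend.
        return ("reloc", sym.section, sym.offset + fixup.addend)
    # Undefined, global, or weak: keep the symbol so the linker can bind it.
    return ("reloc", sym.name, fixup.addend)
```

For instance, a PC-relative fixup at offset 0x4 referencing a local label at offset 0x10 in the same section resolves to 0xc, while the same reference to a global symbol stays a relocation against that symbol.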
If you are interested in relocation representations in different object file formats, please check out my post
Examples in action
Branches
% echo -e 'call fun\njmp fun' | clang -c -xassembler - -o - | fob -dr -
Absolute and PC-relative symbol references
% echo -e 'movl a, %eax\nmovl a(%rip), %eax' | clang -c -xassembler - -o - | llvm-objdump -dr -
(a-.)(%rip) would probably be more semantically correct but is not adopted by GNU Assembler.
Relocation specifiers
Relocation specifiers guide the assembler on how to resolve and encode expressions into instructions. They specify details like:
- Whether to reference the symbol itself, its Procedure Linkage Table (PLT) entry, or its Global Offset Table (GOT) entry.
- Which part of a symbol's address to use (e.g., lower or upper bits).
- Whether to use an absolute address or a PC-relative one.
This concept appears across various architectures but with inconsistent terminology. The Arm architecture refers to elements like :lo12: and :lower16: as "relocation specifiers". IBM's AIX documentation also uses this term. Many GNU Binutils target documents simply call these "modifiers", while AVR documentation uses "relocatable expression modifiers".
Picking the right term was tricky. "Relocatable expression modifier" nails the idea of tweaking relocatable expressions but feels overly verbose. "Relocation modifier", though concise, suggests adjustments happen during the linker's relocation step rather than the assembler's expression evaluation. I landed on "relocation specifier" as the winner. It's clear, aligns with Arm and IBM’s usage, and fits the assembler's role seamlessly.
For example, RISC-V addi can be used with either an absolute address or a PC-relative address. Relocation specifiers %lo and %pcrel_lo could differentiate the two uses. Similarly, %hi, %pcrel_hi, and %got_pcrel_hi could differentiate the uses of lui and auipc.
# Position-dependent code (PDC) - absolute addressing
Why use %hi with lui if it's always paired? It's about clarity and explicitness. %hi ensures consistency with %lo and cleanly distinguishes it from %pcrel_hi. Since both lui and auipc share the U-type instruction format, tying relocation specifiers to formats rather than specific instructions is a smart, flexible design choice.
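To make the %hi and %lo pairing concrete, here is a small sketch of the usual split of a 32-bit absolute address into a 20-bit upper part (for lui) and a sign-extended 12-bit lower part (for addi). The function names hi20, lo12, and recombine are hypothetical, and the sketch assumes the address fits in 32 bits.

```python
def hi20(addr: int) -> int:
    """Upper 20 bits, rounded so that the sign-extended low part adds back correctly."""
    return ((addr + 0x800) >> 12) & 0xFFFFF


def lo12(addr: int) -> int:
    """Low 12 bits, interpreted as a signed 12-bit immediate."""
    lo = addr & 0xFFF
    return lo - 0x1000 if lo >= 0x800 else lo


def recombine(hi: int, lo: int) -> int:
    """What a lui/addi pair computes, truncated to 32 bits."""
    return ((hi << 12) + lo) & 0xFFFFFFFF


addr = 0x12345FFF
print(hex(hi20(addr)), lo12(addr))  # 0x12346 -1
```

The + 0x800 rounding is what compensates for addi sign-extending its immediate: when the low 12 bits are 0x800 or above, the upper part is bumped by one and the lower part becomes negative.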
Relocation specifier flavors
Assemblers use various syntaxes for relocation specifiers, reflecting architectural quirks and historical conventions. Below, we explore the main flavors, their usage across architectures, and some of their peculiarities.
expr@specifier
This is likely the most widespread syntax, adopted by many binutils targets, including ARC, C-SKY, Power, M68K, SuperH, SystemZ, and x86, among others. It's also used in Mach-O object files, e.g., adrp x8, _bar@GOTPAGE.
This suffix style puts the specifier after an @. It's intuitive: think sym@got. In PowerPC, operators can get elaborate, such as sym@toc@l(9). Here, @toc@l is a single, indivisible operator (not two separate @ pieces), indicating a TOC-relative reference with a low 16-bit extraction.
Parsing is loose: while both expr@specifier+expr and expr+expr@specifier are accepted (by many targets), conceptually it's just specifier(expr+expr). For example, x86 accepts sym@got+4 or sym+4@got, but don't misread: @got applies to sym+4, not just sym.
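The "specifier applies to the whole expression" reading can be illustrated with a toy normalizer. This is not how any assembler parses operands: split_specifier is a hypothetical helper that works on text with a regular expression, and it deliberately rejects more than one @ piece, so a PowerPC operator such as @toc@l is outside what this sketch handles.

```python
import re

# An "@" followed by an identifier, wherever it appears in the operand text.
_SPECIFIER = re.compile(r"@([A-Za-z_][A-Za-z0-9_]*)")


def split_specifier(text: str):
    """Return (specifier, expression) with the @specifier lifted out of the text."""
    found = _SPECIFIER.findall(text)
    if len(found) > 1:
        raise ValueError("this sketch handles at most one @specifier")
    expression = _SPECIFIER.sub("", text)
    return (found[0] if found else None, expression)


print(split_specifier("sym@got+4"))  # ('got', 'sym+4')
print(split_specifier("sym+4@got"))  # ('got', 'sym+4')
```

Both spellings normalize to the same pair, which matches the conceptual form got(sym+4).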
%specifier(expr)
MIPS, SPARC, RISC-V, and LoongArch favor this prefix style, wrapping the expression in parentheses for clarity. In MIPS, parentheses are optional, and operators can nest, like
# MIPS
Like expr@specifier, the specifier applies to the whole expression. Don't misinterpret %lo(3)+sym: it resolves as sym+3 with an R_MIPS_LO16 relocation.
# MIPS
expr(specifier)
A simpler suffix style, this is used by AArch32 for data directives. It's less common but straightforward, placing the operator in parentheses after the expression.
.word sym(gotoff)
:specifier:expr
AArch32 and AArch64 adopt this colon-framed prefix notation, avoiding the confusion that parentheses might introduce.
// AArch32
Applying this syntax to data directives, however, could create parsing ambiguity. In both GNU Assembler and LLVM, .word :plt:fun would be interpreted as .word: plt: fun, treating .word and plt as labels, rather than achieving the intended meaning.
Recommendation
For new architectures, I'd suggest adopting %specifier expr, and never using @specifier. The % symbol works seamlessly with data directives, and during operand parsing, the parser can simply peek at the first token to check for a relocation specifier.
(%specifier(...) resembles % expansion in GNU Assembler's altmacro mode.)
.altmacro
.macro m arg; .long \arg; .endm
.data; m %(1+2)
Inelegance
RISC-V favors %specifier(expr) but clings to call sym@plt for
AArch64 uses :specifier:expr, yet R_AARCH64_PLT32 (.word foo@plt - .) and PAuthABI (.quad (g + 7)@AUTH(ia,0)) cannot use : after data directives due to parsing ambiguity.
TLS symbols
When a symbol is defined in a section with the SHF_TLS flag (Thread-Local Storage), GNU assembler assigns it the type STT_TLS in the symbol table. For undefined TLS symbols, the process differs: GCC and Clang don’t emit explicit labels. Instead, assemblers identify these symbols through TLS-specific relocation specifiers in the code, deduce their thread-local nature, and set their type to STT_TLS accordingly.
// AArch64
Composed relocations
Most instructions trigger zero or one relocation, but some generate two. Often, one acts as a marker, paired with a standard relocation. For example:
- PPC64 bl __tls_get_addr(x@tlsgd) pairs a marker R_PPC64_TLSGD with R_PPC64_REL24
- PPC64's link-time GOT-indirect to PC-relative optimization (with Power10's prefixed instruction) generates a R_PPC64_PCREL_OPT relocation following a GOT relocation. https://reviews.llvm.org/D79864
- RISC-V linker relaxation uses R_RISCV_RELAX alongside another relocation, and R_RISCV_ADD*/R_RISCV_SUB* pairs.
- Mach-O scattered relocations for label differences.
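As an illustration of how two relocations at the same location cooperate, the sketch below models a RISC-V R_RISCV_ADD32/R_RISCV_SUB32 pair used for a label difference. Each relocation reads the current word, adds or subtracts the symbol value plus addend, and writes it back. The helper names apply_add32 and apply_sub32 and the symbol addresses are made up for the example, and a little-endian 32-bit field is assumed.

```python
import struct


def apply_add32(buf: bytearray, offset: int, sym_value: int, addend: int) -> None:
    """Model of R_RISCV_ADD32: word = word + S + A."""
    (word,) = struct.unpack_from("<I", buf, offset)
    struct.pack_into("<I", buf, offset, (word + sym_value + addend) & 0xFFFFFFFF)


def apply_sub32(buf: bytearray, offset: int, sym_value: int, addend: int) -> None:
    """Model of R_RISCV_SUB32: word = word - S - A."""
    (word,) = struct.unpack_from("<I", buf, offset)
    struct.pack_into("<I", buf, offset, (word - sym_value - addend) & 0xFFFFFFFF)


# .word end - start, with hypothetical final addresses chosen by the linker.
section = bytearray(4)
apply_add32(section, 0, sym_value=0x2010, addend=0)  # + end
apply_sub32(section, 0, sym_value=0x2000, addend=0)  # - start
print(struct.unpack_from("<I", section, 0)[0])  # 16
```

Because both records target the same offset, the final word holds the difference between the two symbols after relaxation has settled their addresses.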
These marker cases tie into "composed relocations", as outlined in the Generic ABI:
If multiple consecutive relocation records are applied to the same relocation location (r_offset), they are composed instead of being applied independently, as described above. By consecutive, we mean that the relocation records are contiguous within a single relocation section. By composed, we mean that the standard application described above is modified as follows:
In all but the last relocation operation of a composed sequence, the result of the relocation expression is retained, rather than having part extracted and placed in the relocated field. The result is retained at full pointer precision of the applicable ABI processor supplement.
In all but the first relocation operation of a composed sequence, the addend used is the retained result of the previous relocation operation, rather than that implied by the relocation type.
Note that a consequence of the above rules is that the location specified by a relocation type is relevant for the first element of a composed sequence (and then only for relocation records that do not contain an explicit addend field) and for the last element, where the location determines where the relocated value will be placed. For all other relocation operands in a composed sequence, the location specified is ignored.
An ABI processor supplement may specify individual relocation types that always stop a composition sequence, or always start a new one.
Implicit addends
ELF SHT_REL and Mach-O utilize implicit addends. TODO
- R_MIPS_HI16 (https://reviews.llvm.org/D101773)
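The difference between explicit and implicit addends can be sketched as follows. This is a simplified model, assuming the relocated field is a whole little-endian 32-bit word; real instruction fields are often narrower or split, which is where cases like R_MIPS_HI16 get complicated. The functions emit_rela, emit_rel, and read_implicit_addend are hypothetical names.

```python
import struct


def emit_rela(buf: bytearray, offset: int, symbol: str, addend: int) -> dict:
    """RELA style: the record carries the addend and the field is zeroed."""
    struct.pack_into("<i", buf, offset, 0)
    return {"offset": offset, "symbol": symbol, "addend": addend}


def emit_rel(buf: bytearray, offset: int, symbol: str, addend: int) -> dict:
    """REL style: the addend is stored in the field and the record has none."""
    struct.pack_into("<i", buf, offset, addend)
    return {"offset": offset, "symbol": symbol}


def read_implicit_addend(buf: bytearray, offset: int) -> int:
    """What a linker would read back for a REL record in this model."""
    return struct.unpack_from("<i", buf, offset)[0]
```

In this model, a consumer of the REL record has to decode the field to recover the addend, while a consumer of the RELA record can ignore the field contents entirely.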
GNU Assembler internals
GNU Assembler utilizes struct fix to represent both the fixup and the relocatable expression.
struct fix {
The relocation specifier is part of the instruction instead of part of struct fix. Targets have different internal representations of instructions.
1 |
// gas/config/tc-aarch64.c |
The 2002 message
In PPC, the result of @l
and @ha
can beeither signed or unsigned, determined by the instruction opcode.
In md_apply_fix
, TLS-related relocation specifiers callS_SET_THREAD_LOCAL (fixP->fx_addsy);
.
LLVM internals
LLVM integrated assembler encodes fixups and relocatable expressions separately.
class MCFixup {
It encodes relocatable expressions as MCValue, with:
- RefKind as an optional relocation specifier.
- SymA as an optional symbol reference (addend)
- SymB as an optional symbol reference (subtrahend)
- Cst as a constant value
This mirrors the relocatable expression concept, but RefKind
—RefKind
.)
AArch64 implements a clean approach to select the relocation type. It dispatches on the fixup kind (an operand within a specific instruction format), then refines it with the relocation specifier.
// AArch64ELFObjectWriter::getRelocType
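The two-level selection can be modeled with a nested lookup table. This is a sketch of the idea, not LLVM's code: the table holds only a handful of entries, the select_reloc function is hypothetical, and specifiers are written as plain strings (None for no specifier) rather than LLVM's enum values.

```python
# Outer key: fixup kind (operand slot in an instruction format).
# Inner key: relocation specifier, or None when the operand has none.
RELOC_TABLE = {
    "fixup_aarch64_add_imm12": {
        "lo12": "R_AARCH64_ADD_ABS_LO12_NC",
        "tprel_lo12": "R_AARCH64_TLSLE_ADD_TPREL_LO12",
    },
    "fixup_aarch64_pcrel_adrp_imm21": {
        None: "R_AARCH64_ADR_PREL_PG_HI21",
        "got": "R_AARCH64_ADR_GOT_PAGE",
    },
}


def select_reloc(fixup_kind: str, specifier) -> str:
    """Dispatch on the fixup kind first, then refine with the specifier."""
    by_specifier = RELOC_TABLE.get(fixup_kind)
    if by_specifier is None:
        raise ValueError(f"unsupported fixup kind: {fixup_kind}")
    if specifier not in by_specifier:
        raise ValueError(f"invalid specifier {specifier!r} for {fixup_kind}")
    return by_specifier[specifier]


print(select_reloc("fixup_aarch64_pcrel_adrp_imm21", "got"))  # R_AARCH64_ADR_GOT_PAGE
```

Keying on the fixup kind first means an invalid combination, such as a :got: specifier on an add immediate in this table, is rejected at the point where the instruction format is known.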
MCSymbolRefExpr issues
The expression structure follows a traditional object-oriented hierarchy:
MCExpr
MCSymbolRefExpr::VariantKind enumerates the relocation specifiers, but it's a poor fit:
- Other expressions, like MCConstantExpr (e.g., PPC 4@l) and MCBinaryExpr (e.g., PPC (a+1)@l), also need it.
- Semantics blur when folding expressions with @, which is unavoidable when @ can occur at any position within the full expression.
- The generic MCSymbolRefExpr lacks target-specific hooks, cluttering the interface with any target-specific logic.
Consider what happens with addition or subtraction:
MCBinaryExpr
Here, the specifier attaches only to the LHS, leaving the full result uncovered. This awkward design demands workarounds.
- Parsing a+4@got exposes clumsiness. After AsmParser::parseExpression processes a+4, it detects @got and retrofits it onto MCSymbolRefExpr(a), which feels hacked together.
- PowerPC's @l @ha optimization needs PPCAsmParser::extractModifierFromExpr and PPCAsmParser::applyModifierToExpr to convert a MCSymbolRefExpr to a PPCMCExpr.
- Many targets (e.g., X86) use MCValue::getAccessVariant to grab LHS's specifier, though MCValue::RefKind would be cleaner.
Worse, the leaky abstraction, with MCSymbolRefExpr accessed widely in backend code, introduces another problem: while MCBinaryExpr with a constant RHS mimics MCSymbolRefExpr semantically, code often handles only the latter.
MCTargetExpr encoding relocation specifiers
MCTargetExpr subclasses, as used by AArch64 and RISC-V, offer a cleaner approach to encode relocations. We should limit MCTargetExpr to top-level use to encode one single relocation and avoid its inclusion as a subexpression.
AArch64MCExpr
MCSymbolRefExpr::VariantKind as the legacy way to encode relocations should be completely removed (probably in a distant future as many cleanups are required).
Our long-term goal is to migrate MCValue to use MCSymbol pointers instead of MCSymbolRefExpr pointers.
// Current
AsmParser: expr@specifier
In LLVM's assembly parser library (LLVMMCParser), the parsing of expr@specifier was supported for all targets until I updated it to be
The PowerPC AsmParser (llvm/lib/Target/PowerPC/AsmParser/PPCAsmParser.cpp) parses an operand and then calls PPCAsmParser::extractSpecifier to extract the optional @ specifier. When the @ specifier is detected and removed, it generates a PPCMCExpr. This functionality is currently implemented for @l and @ha, and it would be beneficial to extend this to include all specifiers.
AsmPrinter
In llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp, AsmPrinter::lowerConstant outlines how LLVM handles the emission of a global variable initializer. When processing ConstantExpr elements, this function may generate data directives in the assembly code that involve differences between symbols.
One significant use case for this intricate code is clang++ -fexperimental-relative-c++-abi-vtables. This feature produces a PC-relative relocation that points to either the PLT (Procedure Linkage Table) entry of a function or the function symbol directly.