lld 22 ELF changes
For those unfamiliar, lld is the LLVM linker, supporting PE/COFF, ELF, Mach-O, and WebAssembly ports. These object file formats differ significantly, and each port must follow the conventions of the platform's system linker. As a result, the ports share limited code (diagnostics, memory allocation, etc.) and have largely separate reviewer groups.
With LLVM 22.1 releasing soon, I've added some notes to the
For the first time, I used an LLM agent (Claude Code) to help look through commits (git log release/21.x..release/22.x -- lld/ELF) and draft the release notes. Despite my request to only read lld/ELF changes, Claude Code also crafted notes for other ports, which I retained since their release notes had been quite sparse for several releases. Changes backported to the 21.x release are removed (git log --oneline llvmorg-22-init..llvmorg-21.1.8 -- lld).
I'll delve into some of the key changes.
- --print-gc-sections=<file> has been added to redirect garbage collection section listing to a file, avoiding contamination of stdout with other linker output. (#159706)
- A VersionNode lexer state has been added for better version script parsing. This brings the lexer behavior closer to GNU ld. (#174530)
- Unversioned undefined symbols now use version index 0, aligning with GNU ld 2.46 behavior. (#168189)
- .data.rel.ro.hot and .data.rel.ro.unlikely are now recognized as RELRO sections, allowing profile-guided static data partitioning. (#148920)
- DTLTO now supports archive members and bitcode members of thin archives. (#157043)
- For DTLTO, --thinlto-remote-compiler-prepend-arg=<arg> has been added to prepend an argument to the remote compiler's command line. (#162456)
- Balanced Partitioning (BP) section ordering now skips input sections with null data, and filters out section symbols. (#149265) (#151685)
- For AArch64, fixed a crash when using --fix-cortex-a53-843419 with synthetic sections and improved handling when patched code is far from the short jump. (#170495)
- For AArch64, added support for the R_AARCH64_FUNCINIT64 dynamic relocation type for relocating word-sized data using the return value of a function. (#156564)
- For AArch64, added support for the R_AARCH64_PATCHINST relocation type to support deactivation symbols. (#133534)
- For AArch64, added support for reading AArch64 Build Attributes and converting them into GNU Properties. (#147970)
- For ARM, fixed incorrect veneer generation for wraparound branches at the high end of the 32-bit address space branching to the low end. (#165263)
- For LoongArch, -r now synthesizes R_LARCH_ALIGN at input section start to preserve alignment information. (#153935)
- For LoongArch, added relocation types for LA32R/LA32S. (#172618) (#176312)
- For RISC-V, added infrastructure for handling vendor-specific relocations. (#159987)
- For RISC-V, added support for statically resolved vendor-specific relocations. (#169273)
- For RISC-V, -r now synthesizes R_RISCV_ALIGN at input section start to preserve alignment information during two-stage linking. (#151639) This is an interesting relocatable linking challenge for linker relaxation.
Besides me, Peter Smith (smithp35) and Jessica Clarke (jrtc27) have done a lot of reviews.
jrtc27 has done great work simplifying the dynamic relocation system, which is highly appreciated.
I should call out
Distributed ThinLTO
Distributed ThinLTO (DTLTO) enables distributing ThinLTO backend compilations to external systems (e.g., Incredibuild, distcc-like tools) during the link step. This feature was contributed by PlayStation, who had offered it as a proprietary technology before upstreaming.
Traditional distributed ThinLTO is implemented in build systems such as Bazel and Buck2. Bazel-style distribution (build system orchestrated) uses a multi-step workflow:
# Compile to bitcode (made parallel by build system)
The build system sees the index files from step 2 as outputs and schedules step 3 jobs across the build cluster. This requires a build system that handles dynamic dependencies: the outputs of step 2 determine the inputs to step 3.
DTLTO (linker orchestrated) integrates steps 2-4 into a single link invocation:
clang -flto=thin -c a.c b.c
LLD performs the thin-link internally, generates a JSON job description for each backend compilation, invokes the distributor process, waits for native objects, and links them. The distributor is responsible for farming out the compilations to remote machines.
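The linker-orchestrated flow above can be modeled in a few lines of C++. This is only a sketch under stated assumptions, not lld's implementation: BackendJob, thinLink, Distributor, and dtltoLink are hypothetical names, the file-name suffixes are illustrative, and the real job description is JSON whose schema is not reproduced here.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical model of one ThinLTO backend compilation.
struct BackendJob {
  std::string bitcode;      // input module, e.g. "a.o"
  std::string summaryIndex; // per-module index produced by the thin link
  std::string nativeObject; // output the linker waits for
};

// Stand-in for the thin link: one backend job per bitcode input.
// The suffixes are illustrative, not the names lld uses.
std::vector<BackendJob> thinLink(const std::vector<std::string> &inputs) {
  std::vector<BackendJob> jobs;
  for (const std::string &in : inputs)
    jobs.push_back({in, in + ".thinlto.bc", in + ".native.o"});
  return jobs;
}

// In DTLTO the distributor is an external process; here it is a callback
// that reports whether every job completed.
using Distributor = std::function<bool(const std::vector<BackendJob> &)>;

// Returns the native objects to hand to the final link, or an empty list
// if the distributor reported failure.
std::vector<std::string> dtltoLink(const std::vector<std::string> &inputs,
                                   const Distributor &distribute) {
  assert(distribute && "a distributor is required");
  std::vector<BackendJob> jobs = thinLink(inputs); // step 2: thin link
  if (!distribute(jobs))                           // step 3: backends
    return {};
  std::vector<std::string> objects;                // step 4: final link inputs
  for (const BackendJob &job : jobs)
    objects.push_back(job.nativeObject);
  return objects;
}
```

The point of the model is that the job list only exists after the thin link has run, which is why either the build system or the linker has to handle that dynamic step.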
DTLTO works with any build system but requires a separate distributor process that speaks its JSON protocol. DTLTO is essentially "ThinLTO distribution for projects that don't use Bazel".
Pointer Field Protection
R_AARCH64_PATCHINST is a static relocation type used with Pointer Field Protection (PFP), which leverages Armv8.3-A Pointer Authentication (PAC) to protect pointer fields in structs.
Consider the following C++ code:
struct cls {
With Pointer Field Protection enabled, the compiler generates PAC instructions to sign and authenticate pointers:
load:
Each PAC instruction is associated with an R_AARCH64_PATCHINST relocation referencing a deactivation symbol (the __pfp_ds_ prefix stands for "pointer field protection deactivation symbol"). By default, __pfp_ds__ZTS3cls.ptr is an undefined weak symbol in every relocatable file.
However, if the field's address escapes in any translation unit (e.g., someone takes &c->ptr), the compiler defines the deactivation symbol as an absolute symbol (ELF SHN_ABS). When the linker sees a defined deactivation symbol, it patches the PAC instruction to a NOP (R_AARCH64_PATCHINST acts as R_AARCH64_ABS64 when the referenced symbol is defined), disabling the protection for that field. This is necessary because external code could modify the pointer without signing it, which would cause authentication failures.
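To make "the field's address escapes" concrete, here is a minimal sketch. The struct is a hypothetical stand-in: only the type name cls and the field name ptr are taken from the symbol __pfp_ds__ZTS3cls.ptr, and whether a given type is eligible for protection is up to the compiler and is not modeled here.

```cpp
// Hypothetical stand-in for the struct discussed in the text.
struct cls {
  long *ptr;
};

// Ordinary accesses go through the field itself, so a PFP-aware compiler
// can sign the pointer on store and authenticate it on load.
long *load(cls *c) { return c->ptr; }
void store(cls *c, long *p) { c->ptr = p; }

// Escape: the address of the field leaves this translation unit's control.
// A caller can now write a raw, unsigned pointer through the returned
// long**, which is the situation that requires deactivating protection.
long **escape(cls *c) { return &c->ptr; }
```

A write such as `*escape(&c) = &y;` followed by `load(&c)` is the pattern that would fail authentication if the field stayed protected.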
The linker allows duplicate definitions of absolute symbols if the values are identical.
R_AARCH64_FUNCINIT64 is a related static relocation type that produces an R_AARCH64_IRELATIVE dynamic relocation (
PFP is AArch64-specific because it relies on Pointer Authentication (PAC), a hardware feature introduced in Armv8.3-A. PAC provides dedicated instructions (pacda, autda, etc.) that cryptographically sign pointers using keys stored in system registers. x86-64 lacks an equivalent mechanism: Intel CET provides shadow stacks and indirect branch tracking for control-flow integrity, but cannot sign arbitrary data pointers stored in memory.
Takeaways:
- Security features need linker support. This is because many features require aggregated information across all translation units. In this case, if any TU exposes a field's address, the linker disables protection for this field everywhere. The implementation is usually lightweight.
- Relocations can do more than fill in addresses: R_AARCH64_PATCHINST conditionally patches instructions to NOPs based on symbol resolution. This is a different paradigm from traditional "compute address, write it" relocations.
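The conditional patching described in this section can be sketched as follows. This is a model, not lld's code: DeactivationSymbol, mergeDefinition, and applyPatchInst are hypothetical names, the sketch assumes the symbol's absolute value is the replacement instruction encoding, and as a simplification it overwrites only the 4-byte instruction word.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// AArch64 NOP encoding.
constexpr uint32_t kNop = 0xd503201f;

// Hypothetical model of a deactivation symbol after symbol resolution:
// std::nullopt means "undefined weak", a value means "defined, absolute".
using DeactivationSymbol = std::optional<uint64_t>;

// Merge a definition coming from another relocatable file. Duplicate
// absolute definitions are accepted only when the values are identical.
bool mergeDefinition(DeactivationSymbol &resolved,
                     const DeactivationSymbol &incoming) {
  if (!incoming)
    return true; // an undefined weak reference changes nothing
  if (resolved && *resolved != *incoming)
    return false; // conflicting values: a duplicate-definition error
  resolved = incoming;
  return true;
}

// Apply a PATCHINST-like relocation at `offset`. If the symbol is still
// undefined, the PAC instruction is kept. Otherwise the instruction word
// is overwritten with the low 32 bits of the symbol value (little-endian).
void applyPatchInst(std::vector<uint8_t> &section, std::size_t offset,
                    const DeactivationSymbol &sym) {
  assert(offset + 4 <= section.size());
  if (!sym)
    return;
  uint32_t insn = static_cast<uint32_t>(*sym);
  for (int i = 0; i < 4; ++i)
    section[offset + i] = static_cast<uint8_t>(insn >> (8 * i));
}
```

Because the decision hangs on symbol resolution, a single translation unit that defines the symbol is enough to turn every relocated site in the link into a NOP.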
RISC-V vendor relocations
RISC-V's openness encourages vendors to add custom instructions. Qualcomm has the uC extensions for their microcontrollers; CHERIoT adds capability-based security.
The RISC-V psABI adopted the vendor relocation system:
Relocation 0: R_RISCV_VENDOR references symbol "QUALCOMM"
The R_RISCV_VENDOR marker identifies the vendor namespace via its symbol reference. The subsequent relocation uses a vendor-specific type number that only makes sense within that namespace. Different vendors can reuse the same type numbers without conflict.
In lld 22:
- Infrastructure for vendor relocations was added (#159987). The implementation folds vendor namespace information into the upper bits of RelType, allowing existing relocation processing code to work with minimal changes.
- Support for statically-resolved vendor relocations was added (#169273), including Qualcomm and Andes relocation types. The patch landed without involving the regular lld/ELF reviewer pool. For changes that set architectural precedents, broader consensus should be sought before merging. I've commented on this.
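A minimal sketch of the "fold the namespace into the upper bits" idea, assuming a layout where the low 8 bits hold the raw type number. The shift, the Vendor ids, the "EXAMPLE_VENDOR" symbol name, and the helper names are all hypothetical; lld's actual bit layout is not shown here. R_RISCV_VENDOR is 191 per the psABI, to the best of my knowledge.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

using RelType = uint32_t;

constexpr RelType R_RISCV_VENDOR = 191; // marker relocation type
constexpr unsigned kVendorShift = 8;    // hypothetical: raw type in low 8 bits

// Hypothetical namespace ids; None means "standard relocation".
enum class Vendor : uint32_t { None = 0, Qualcomm = 1, Example = 2 };

struct RawReloc {
  uint32_t type;      // type number as stored in the object file
  std::string symbol; // referenced symbol name
};

// Map the marker's symbol name to a namespace. A real linker would
// diagnose an unknown vendor; this sketch falls back to None.
Vendor lookupVendor(const std::string &name) {
  if (name == "QUALCOMM")
    return Vendor::Qualcomm;
  if (name == "EXAMPLE_VENDOR")
    return Vendor::Example;
  return Vendor::None;
}

// Combine namespace and raw type into one RelType value.
RelType fold(Vendor vendor, uint32_t rawType) {
  assert(rawType < (1u << kVendorShift));
  return (static_cast<RelType>(vendor) << kVendorShift) | rawType;
}

// Walk a relocation list. Each marker sets the namespace for the
// relocation that immediately follows it and is then consumed.
std::vector<RelType> foldVendorRelocs(const std::vector<RawReloc> &relocs) {
  std::vector<RelType> out;
  Vendor pending = Vendor::None;
  for (const RawReloc &r : relocs) {
    if (r.type == R_RISCV_VENDOR) {
      pending = lookupVendor(r.symbol);
      continue;
    }
    out.push_back(fold(pending, r.type));
    pending = Vendor::None;
  }
  return out;
}
```

With this encoding, two vendors that both use raw type 192 end up with distinct RelType values, while standard relocations keep their original numbers, so existing switch statements over RelType continue to work.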
The
There's a maintainability concern: accepting vendor-specific relocations into the core linker sets a precedent. RISC-V is uniquely fragmented compared to other LLVM backends: x86, AArch64, PowerPC, and others don't have nearly as many vendors adding custom instructions and relocations. This fragmentation is a direct consequence of RISC-V's open nature and extensibility, but it creates new challenges for upstream toolchain maintainers. Accumulated vendor-specific code could become a significant maintenance burden.
GNU ld compatibility
Large corporate users of lld/ELF don't care about GNU ld compatibility. They add features for their own use cases and move on. I diligently coordinate with binutils maintainers and file feature requests when appropriate. When lld implements a new option or behavior, I often file corresponding GNU ld feature requests to keep the tools aligned.
This coordination work is largely invisible but essential for the broader toolchain ecosystem. Users benefit when they can switch between linkers without surprises.
Link: lld 21 ELF changes



