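I used to have a bad habit: I would often find myself opening Twitter to check the news (FOMO comes easily now that AI technology changes by the day), to see what the people I follow had posted, and to check whether my own tweets had received likes or comments. At first I didn't see a problem, but as it happened more and more often, I realized it was having a clearly negative effect: I had trouble concentrating, and my thinking became fragmented. So I decided to break the habit.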
For those unfamiliar, lld is the LLVM linker, supporting PE/COFF, ELF, Mach-O, and WebAssembly ports. These object file formats differ significantly, and each port must follow the conventions of the platform's system linker. As a result, the ports share limited code (diagnostics, memory allocation, etc.) and have largely separate reviewer groups.
With LLVM 22.1 releasing soon, I've added some notes to https://github.com/llvm/llvm-project/blob/release/22.x/lld/docs/ReleaseNotes.rst as an lld/ELF maintainer. As usual, I've reviewed almost all the patches not authored by me.
For the first time, I used an LLM agent (Claude Code) to help look through commits (git log release/21.x..release/22.x -- lld/ELF) and draft the release notes. Despite my request to only read lld/ELF changes, Claude Code also crafted notes for other ports, which I retained since their release notes had been quite sparse for several releases. Changes backported to the 21.x release are removed (git log --oneline llvmorg-22-init..llvmorg-21.1.8 -- lld).
I'll delve into some of the key changes.
--print-gc-sections=<file> has been added to redirect garbage collection section listing to a file, avoiding contamination of stdout with other linker output. (#159706)
A VersionNode lexer state has been added for better version script parsing. This brings the lexer behavior closer to GNU ld. (#174530)
Unversioned undefined symbols now use version index 0, aligning with GNU ld 2.46 behavior. (#168189)
.data.rel.ro.hot and .data.rel.ro.unlikely are now recognized as RELRO sections, allowing profile-guided static data partitioning. (#148920)
DTLTO now supports archive members and bitcode members of thin archives. (#157043)
For DTLTO, --thinlto-remote-compiler-prepend-arg=<arg> has been added to prepend an argument to the remote compiler's command line. (#162456)
Balanced Partitioning (BP) section ordering now skips input sections with null data, and filters out section symbols. (#149265) (#151685)
For AArch64, fixed a crash when using --fix-cortex-a53-843419 with synthetic sections and improved handling when patched code is far from the short jump. (#170495)
For AArch64, added support for the R_AARCH64_FUNCINIT64 dynamic relocation type for relocating word-sized data using the return value of a function. (#156564)
For AArch64, added support for the R_AARCH64_PATCHINST relocation type to support deactivation symbols. (#133534)
For AArch64, added support for reading AArch64 Build Attributes and converting them into GNU Properties. (#147970)
For ARM, fixed incorrect veneer generation for wraparound branches at the high end of the 32-bit address space branching to the low end. (#165263)
For LoongArch, -r now synthesizes R_LARCH_ALIGN at input section start to preserve alignment information. (#153935)
For LoongArch, added relocation types for LA32R/LA32S. (#172618) (#176312)
For RISC-V, added infrastructure for handling vendor-specific relocations. (#159987)
For RISC-V, added support for statically resolved vendor-specific relocations. (#169273)
For RISC-V, -r now synthesizes R_RISCV_ALIGN at input section start to preserve alignment information during two-stage linking. (#151639) This is an interesting relocatable linking challenge for linker relaxation.
Besides me, Peter Smith (smithp35) and Jessica Clarke (jrtc27) have done a lot of reviews.
jrtc27 has done great work simplifying the dynamic relocation system, which is highly appreciated.
I should call out https://github.com/llvm/llvm-project/pull/172618: for this relatively large addition, the author and approver are from the same company and contributing to their architecture, and neither the author nor the approver is a regular lld contributor/reviewer. The author did not request review from regular reviewers and landed the patch just 3 minutes after their colleague's approval. I left a comment asking to keep the PR open for other maintainers to review.
Distributed ThinLTO
Distributed ThinLTO (DTLTO) enables distributing ThinLTO backend compilations to external systems (e.g., Incredibuild, distcc-like tools) during the link step. This feature was contributed by PlayStation, who had offered it as a proprietary technology before upstreaming.
The traditional distributed ThinLTO is implemented in Bazel and buck2. Bazel-style distribution (build system orchestrated) uses a multi-step workflow:
    # Compile to bitcode (made parallel by build system)
    clang -c -O2 -flto=thin a.c b.c
    # Thin link
    clang -flto=thin -fuse-ld=lld -Wl,--thinlto-index-only=a.rsp,--thinlto-emit-imports-files -Wl,--thinlto-prefix-replace=';lto/' a.o b.o
    # Backend compilation (distributed by build system) with dynamic dependencies
    clang -c -O2 -fthinlto-index=lto/a.o.thinlto.bc a.o -o lto/a.o
    clang -c -O2 -fthinlto-index=lto/b.o.thinlto.bc b.o -o lto/b.o
    # Final native link
    clang -fuse-ld=lld @a.rsp  # a.rsp contains lto/a.o and lto/b.o
The build system sees the index files from step 2 as outputs and schedules step 3 jobs across the build cluster. This requires a build system that handles dynamic dependencies: outputs of step 2 determine inputs to step 3.
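To make the dynamic dependency concrete, here is a minimal Python sketch of how a build system might derive the step 3 jobs once the thin link has finished. The function name `backend_jobs`, the job dictionary layout, and the sample import data are hypothetical; only the clang command line is taken from the workflow above. The imports are assumed to have been read from the files produced by --thinlto-emit-imports-files.

```python
def backend_jobs(objects, imports, prefix="lto/"):
    """Build one backend job per bitcode object from thin-link outputs.

    objects: bitcode files passed to the thin link, e.g. ["a.o", "b.o"].
    imports: maps each object to the modules it imports from, i.e. the
             parsed contents of the imports files (assumed already read).
    """
    jobs = []
    for obj in objects:
        index = f"{prefix}{obj}.thinlto.bc"  # per-module index from step 2
        native = f"{prefix}{obj}"            # native object listed in a.rsp
        jobs.append({
            "command": ["clang", "-c", "-O2", f"-fthinlto-index={index}",
                        obj, "-o", native],
            # Only known after the thin link has run: the dynamic dependency.
            "inputs": [obj, index, *imports.get(obj, [])],
            "output": native,
        })
    return jobs


# Hypothetical thin-link result: a.o imports a function from b.o.
for job in backend_jobs(["a.o", "b.o"], {"a.o": ["b.o"]}):
    print(" ".join(job["command"]), "# inputs:", job["inputs"])
```

The point of the sketch is that the `inputs` list cannot be written down before step 2 runs, which is why a build system without dynamic dependency support cannot schedule step 3 correctly.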
DTLTO (linker orchestrated) integrates steps 2-4 into a single link invocation:
LLD performs the thin-link internally, generates a JSON job description for each backend compilation, invokes the distributor process, waits for native objects, and links them. The distributor is responsible for farming out the compilations to remote machines.
DTLTO works with any build system but requires a separate distributor process that speaks its JSON protocol. DTLTO is essentially "ThinLTO distribution for projects that don't use Bazel".
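As a rough illustration of the distributor's role, the following Python sketch reads a job file and runs every job, with a local thread pool standing in for remote machines. The JSON layout used here (a top-level "jobs" list whose entries carry "args" and "outputs") is a simplified, hypothetical schema for illustration only, not the actual DTLTO format, and `run_jobs` is an invented name.

```python
import json
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_jobs(job_file, runner=subprocess.check_call, max_workers=4):
    """Run every job listed in job_file and return the expected outputs.

    Assumed (hypothetical) schema, NOT the real DTLTO format:
        {"jobs": [{"args": ["clang", ...], "outputs": ["lto/a.o"]}, ...]}
    """
    with open(job_file) as f:
        jobs = json.load(f)["jobs"]
    # A real distributor would ship each job to a remote machine; a thread
    # pool running local processes stands in for that here.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # list() forces completion and re-raises the first job failure.
        list(pool.map(lambda job: runner(job["args"]), jobs))
    return [out for job in jobs for out in job["outputs"]]
```

The linker only needs the distributor to exit successfully once all native objects exist; how the jobs are scheduled in between is entirely the distributor's business, which is what lets the same link command work with different distribution backends.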
Pointer Field Protection
R_AARCH64_PATCHINST is a static relocation type used with Pointer Field Protection (PFP), which leverages Armv8.3-A Pointer Authentication (PAC) to protect pointer fields in structs.
Consider the following C++ code:
    struct cls {
      ~cls();
      long *ptr;
    private:
      long *ptr2;
    };

    long *load(cls *c) { return c->ptr; }
    void store(cls *c, long *ptr) { c->ptr = ptr; }
With Pointer Field Protection enabled, the compiler generates PAC instructions to sign and authenticate pointers:
    load:
      ldr   x8, [x0]   // Load the PAC-signed pointer from c->ptr
      autda x8, x0     // Authenticate and strip the PAC, R_AARCH64_PATCHINST __pfp_ds__ZTS3cls.ptr
      mov   x0, x8
      ret

    store:
      pacda x1, x0     // Sign ptr using c as a discriminator, R_AARCH64_PATCHINST __pfp_ds__ZTS3cls.ptr
      str   x1, [x0]
      ret
Each PAC instruction is associated with an R_AARCH64_PATCHINST relocation referencing a deactivation symbol (the __pfp_ds_ prefix stands for "pointer field protection deactivation symbol"). By default, __pfp_ds__ZTS3cls.ptr is an undefined weak symbol in every relocatable file.
However, if the field's address escapes in any translation unit (e.g., someone takes &c->ptr), the compiler defines the deactivation symbol as an absolute symbol (ELF SHN_ABS). When the linker sees a defined deactivation symbol, it patches the PAC instruction to a NOP (R_AARCH64_PATCHINST acts as R_AARCH64_ABS64 when the referenced symbol is defined), disabling the protection for that field. This is necessary because external code could modify the pointer without signing it, which would cause authentication failures.
The linker allows duplicate definitions of absolute symbols if the values are identical.
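The resolution rule can be modeled in a few lines of Python. This is only a sketch under stated assumptions: the function names are hypothetical, the deactivation symbol's absolute value is assumed to be the AArch64 NOP encoding (0xd503201f), and the model looks at a single 32-bit instruction word without modeling the relocation's write width.

```python
NOP = 0xD503201F  # AArch64 NOP encoding


def resolve_deactivation(symbol, definitions):
    """Merge the per-file views of one deactivation symbol.

    definitions holds one entry per relocatable file: None for an
    undefined weak reference, or an int for an absolute (SHN_ABS) value.
    Returns None (protection stays on) or the agreed absolute value.
    """
    values = {v for v in definitions if v is not None}
    if len(values) > 1:
        # Duplicate absolute definitions are fine only if they agree.
        raise ValueError(f"duplicate symbol {symbol} with different values")
    return values.pop() if values else None


def apply_patchinst(insn, sym_value):
    """Model R_AARCH64_PATCHINST on one instruction word."""
    # Undefined weak symbol: keep the PAC instruction as emitted.
    # Defined symbol: overwrite the instruction with the symbol's value
    # (assumed here to be the NOP encoding).
    return insn if sym_value is None else sym_value


PAC_INSN = 0xDAC11808  # intended as "autda x8, x0"; the exact word is irrelevant to the model
# Files 1 and 3 never take &c->ptr; file 2 does and defines the symbol.
value = resolve_deactivation("__pfp_ds__ZTS3cls.ptr", [None, NOP, None])
print(hex(apply_patchinst(PAC_INSN, value)))  # prints 0xd503201f
```

A single defining file is enough to turn the instruction into a NOP in every file that references the symbol, which is exactly the whole-program aggregation that only the linker can perform.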
R_AARCH64_FUNCINIT64 is a related static relocation type that produces an R_AARCH64_IRELATIVE dynamic relocation (GNU indirect function). It initializes function pointers in static data at load time by calling a resolver function that returns the PAC-signed address.
PFP is AArch64-specific because it relies on Pointer Authentication (PAC), a hardware feature introduced in Armv8.3-A. PAC provides dedicated instructions (pacda, autda, etc.) that cryptographically sign pointers using keys stored in system registers. x86-64 lacks an equivalent mechanism: Intel CET provides shadow stacks and indirect branch tracking for control-flow integrity, but cannot sign arbitrary data pointers stored in memory.
Takeaways:
Security features need linker support. This is because many features require aggregated information across all translation units. In this case, if any TU exposes a field's address, the linker disables protection for this field everywhere. The implementation is usually lightweight.
Relocations can do more than fill in addresses: R_AARCH64_PATCHINST conditionally patches instructions to NOPs based on symbol resolution. This is a different paradigm from traditional "compute address, write it" relocations.
RISC-V vendor relocations
RISC-V's openness encourages vendors to add custom instructions. Qualcomm has the uC extensions for their microcontrollers; CHERIoT adds capability-based security.
The RISC-V psABI adopted the vendor relocation system:
The R_RISCV_VENDOR marker identifies the vendor namespace via its symbol reference. The subsequent relocation uses a vendor-specific type number that only makes sense within that namespace. Different vendors can reuse the same type numbers without conflict.
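A minimal Python sketch of the pairing logic, assuming relocations arrive as (offset, type, symbol name) tuples in section order. The function name and the sample relocation list are hypothetical; the numeric values assume R_RISCV_VENDOR is type 191 and the custom range is 192 to 255, per my reading of the psABI.

```python
R_RISCV_VENDOR = 191            # marker relocation (assumed psABI value)
CUSTOM_TYPES = range(192, 256)  # R_RISCV_CUSTOM192..R_RISCV_CUSTOM255


def pair_vendor_relocs(relocs):
    """Attach the vendor namespace to each vendor-specific relocation.

    relocs: list of (offset, type, symbol_name) in section order.
    Returns (offset, vendor, type, symbol_name) tuples, where vendor is
    None for standard relocation types. Marker relocations are consumed.
    """
    paired = []
    vendor = None
    for offset, rtype, sym in relocs:
        if rtype == R_RISCV_VENDOR:
            vendor = sym  # the marker's symbol names the namespace
            continue
        if rtype in CUSTOM_TYPES:
            if vendor is None:
                raise ValueError(f"custom relocation {rtype} without R_RISCV_VENDOR")
            paired.append((offset, vendor, rtype, sym))
        else:
            paired.append((offset, None, rtype, sym))
        vendor = None  # a marker applies only to the next relocation
    return paired


# Illustrative input: one standard relocation, then a marker plus a custom type.
sample = [(0, 26, "foo"), (4, R_RISCV_VENDOR, "QUALCOMM"), (4, 192, "bar")]
print(pair_vendor_relocs(sample))
```

Because the namespace travels with the type number, type 192 under one vendor symbol and type 192 under another end up as distinct entries.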
In lld 22:
Infrastructure for vendor relocations was added (#159987). The implementation folds vendor namespace information into the upper bits of RelType, allowing existing relocation processing code to work with minimal changes.
Support for statically-resolved vendor relocations was added (#169273), including Qualcomm and Andes relocation types. The patch landed without involving the regular lld/ELF reviewer pool. For changes that set architectural precedents, broader consensus should be sought before merging. I've commented on this.
The RISC-V toolchain conventions document the vendor relocation scheme.
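The idea of folding the vendor namespace into the upper bits of a relocation type value can be pictured with a small Python sketch. The bit layout, the vendor ids, and the function names below are hypothetical and chosen only for illustration; lld's actual encoding may differ.

```python
VENDOR_SHIFT = 8                     # hypothetical: ELF type numbers here fit in 8 bits
TYPE_MASK = (1 << VENDOR_SHIFT) - 1
VENDOR_IDS = {None: 0, "QUALCOMM": 1, "ANDES": 2}  # hypothetical ids; 0 = standard
VENDOR_NAMES = {ident: name for name, ident in VENDOR_IDS.items()}


def fold(vendor, elf_type):
    """Combine a vendor namespace and an ELF type number into one integer."""
    return (VENDOR_IDS[vendor] << VENDOR_SHIFT) | elf_type


def unfold(rel_type):
    """Split a folded value back into (vendor, ELF type number)."""
    return VENDOR_NAMES[rel_type >> VENDOR_SHIFT], rel_type & TYPE_MASK


# Standard types keep their plain value, so existing switch statements
# over relocation types continue to match them unchanged.
assert fold(None, 26) == 26
# The same vendor-specific number maps to distinct values per vendor.
assert fold("QUALCOMM", 192) != fold("ANDES", 192)
print(unfold(fold("ANDES", 192)))  # prints ('ANDES', 192)
```

With such an encoding, code that dispatches on a single integer relocation type needs no extra parameter for the namespace, which matches the "minimal changes" goal described above.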
There's a maintainability concern: accepting vendor-specific relocations into the core linker sets a precedent. RISC-V is uniquely fragmented compared to other LLVM backends: x86, AArch64, PowerPC, and others don't have nearly as many vendors adding custom instructions and relocations. This fragmentation is a direct consequence of RISC-V's open nature and extensibility, but it creates new challenges for upstream toolchain maintainers. Accumulated vendor-specific code could become a significant maintenance burden.
GNU ld compatibility
Large corporate users of lld/ELF don't care about GNU ld compatibility. They add features for their own use cases and move on. I diligently coordinate with binutils maintainers and file feature requests when appropriate. When lld implements a new option or behavior, I often file corresponding GNU ld feature requests to keep the tools aligned.
This coordination work is largely invisible but essential for the broader toolchain ecosystem. Users benefit when they can switch between linkers without surprises.
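That coding agents, with Claude Code as the leading example, are reshaping the software industry is now a foregone conclusion. Their usability has crossed a critical threshold, significantly lowering the marginal cost of producing code; Claude Code itself, for example, is now written entirely by Claude Code. Hands-on coding work that used to take a week may now shrink to half a day, and areas once avoided because of their high technical barrier are now within reach.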