LLVM integrated assembler: Improving sections and symbols
In my previous post,
Sections
Sections are named, contiguous blocks of code or data within an object file. They allow you to logically group related parts of your program. The assembler places code and data into these sections as it processes the source file.
class MCSection { ... };
In LLVM 20, the MCSection class used an enum called SectionVariant to differentiate between object file formats such as ELF, Mach-O, and COFF. This enum has been removed. The format-specific subclasses are used in contexts where the section type is known at compile time, such as in MCStreamer and MCObjectTargetWriter, so no discriminator is needed there. This change eliminates the need for runtime type information (RTTI) checks, simplifying the codebase and improving efficiency.
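To make the idea concrete, here is a minimal sketch of a hierarchy that carries no per-format tag. The names Section, ELFSection, and ELFWriter are hypothetical stand-ins invented for this illustration; they are not the LLVM classes, and the real code is considerably more involved.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-ins for illustration; these are not the LLVM classes.
class Section {
  std::string Name;

public:
  explicit Section(std::string N) : Name(std::move(N)) {}
  const std::string &getName() const { return Name; }
  // Note: no per-format discriminator enum is stored in the base class.
};

class ELFSection : public Section {
  unsigned Flags;

public:
  ELFSection(std::string N, unsigned F) : Section(std::move(N)), Flags(F) {}
  unsigned getFlags() const { return Flags; }
};

// A format-specific writer only ever receives sections created for its own
// format, so a static_cast is enough and no run-time tag is inspected.
class ELFWriter {
public:
  unsigned flagsOf(const Section &S) const {
    return static_cast<const ELFSection &>(S).getFlags();
  }
};
```

The cast is only safe because the caller guarantees the dynamic type. That is exactly the "known at compile time" situation described above.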
Additionally, the storage for fragments' fixups (adjustments to addresses and offsets) has been moved into the MCSection class.
Symbols
Symbols are names that represent memory addresses or values.
class MCSymbol { ... };
Similar to sections, the MCSymbol class also used a discriminator enum, SymbolKind, to distinguish between object file formats. This enum has also been removed.
Furthermore, the MCSymbol class had an enum Contents to specify the kind of symbol. This name was a bit confusing, so it has been renamed to enum Kind for clarity. The kinds are:
- regular symbol
- equated symbol
- common symbol
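The three kinds can be modeled roughly as follows. This is a simplified sketch and not the MCSymbol definition: the Symbol struct, its fields, and its setters are hypothetical, and an equated symbol is reduced to a constant here, whereas a real assembler binds it to an expression.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of a symbol with a three-way kind; not the LLVM class.
enum class Kind : uint8_t {
  Regular, // a label defined at some offset in a section
  Equated, // assigned a value, e.g. via `.set x, 4` in GNU-style assembly
  Common   // tentatively allocated, e.g. via `.comm buf, 16`
};

struct Symbol {
  Kind K = Kind::Regular;
  int64_t Value = 0;       // used only when K == Kind::Equated (assumption)
  uint64_t CommonSize = 0; // used only when K == Kind::Common (assumption)

  bool isEquated() const { return K == Kind::Equated; }
  bool isCommon() const { return K == Kind::Common; }

  void setEquated(int64_t V) {
    K = Kind::Equated;
    Value = V;
  }
  void setCommon(uint64_t Size) {
    K = Kind::Common;
    CommonSize = Size;
  }
};
```

A single small enum is enough here because the three kinds are mutually exclusive: a symbol is in exactly one of these states at any time.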
A special enumerator, SymContentsTargetCommon, which was used by AMDGPU for a specific type of common symbol, has also been removed. Instead, ELFObjectWriter now respects the symbol's section index (SHN_AMDGPU_LDS for this special AMDGPU symbol).
sizeof(MCSymbol) has been reduced to 24 bytes on 64-bit systems.
The previous blog post describes other related changes:
- The MCSymbol::IsUsed flag was a workaround for detecting a subset of invalid reassignments and has been removed.
- The MCSymbol::IsResolving flag has been added to detect cyclic dependencies of equated symbols.
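The general technique behind a "resolving" flag can be sketched as below. This is an assumption-laden toy model, not LLVM's evaluator: EquatedSym, SymTab, and resolve are hypothetical names, and each equated symbol is restricted to the form "Base + Addend".

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical model: each equated symbol is "Base + Addend", where an empty
// Base means an absolute value. This is not the LLVM data structure.
struct EquatedSym {
  std::string Base;
  long Addend;
  bool IsResolving; // true while this symbol is being evaluated
};

using SymTab = std::map<std::string, EquatedSym>;

// Stores the symbol's value in Out and returns true on success. Returns
// false if the name is undefined or the definition is cyclic.
bool resolve(SymTab &Tab, const std::string &Name, long &Out) {
  SymTab::iterator It = Tab.find(Name);
  if (It == Tab.end())
    return false; // undefined symbol
  EquatedSym &S = It->second;
  if (S.Base.empty()) {
    Out = S.Addend; // absolute value, nothing further to resolve
    return true;
  }
  if (S.IsResolving)
    return false; // re-entered while still resolving: cyclic definition
  S.IsResolving = true;
  long BaseVal = 0;
  bool Ok = resolve(Tab, S.Base, BaseVal);
  S.IsResolving = false; // always clear the flag on the way out
  if (!Ok)
    return false;
  Out = BaseVal + S.Addend;
  return true;
}
```

With a = 4 and b = a + 1, resolving b succeeds and yields 5. With x = y and y = x, the second visit to x finds the flag already set and the resolution fails instead of recursing forever.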