# [LLD][COFF] Add basic ARM64X dynamic relocations support (#118035)
**First file** (the chunk implementation):

```cpp
@@ -25,6 +25,7 @@
using namespace llvm;
using namespace llvm::object;
using namespace llvm::support;
using namespace llvm::support::endian;
using namespace llvm::COFF;
using llvm::support::ulittle32_t;

@@ -1147,4 +1148,81 @@ uint32_t ImportThunkChunkARM64EC::extendRanges() {
  return sizeof(arm64Thunk) - sizeof(uint32_t);
}

size_t Arm64XDynamicRelocEntry::getSize() const {
  switch (type) {
  case IMAGE_DVRT_ARM64X_FIXUP_TYPE_VALUE:
    return sizeof(uint16_t) + size; // A header and a payload.
  case IMAGE_DVRT_ARM64X_FIXUP_TYPE_DELTA:
  case IMAGE_DVRT_ARM64X_FIXUP_TYPE_ZEROFILL:
    llvm_unreachable("unsupported type");
  }
}

void Arm64XDynamicRelocEntry::writeTo(uint8_t *buf) const {
  auto out = reinterpret_cast<ulittle16_t *>(buf);
  *out = (offset & 0xfff) | (type << 12);

  switch (type) {
  case IMAGE_DVRT_ARM64X_FIXUP_TYPE_VALUE:
    *out |= ((bit_width(size) - 1) << 14); // Encode the size.
    switch (size) {
    case 2:
      out[1] = value;
      break;
    case 4:
      *reinterpret_cast<ulittle32_t *>(out + 1) = value;
      break;
    case 8:
      *reinterpret_cast<ulittle64_t *>(out + 1) = value;
      break;
    default:
      llvm_unreachable("invalid size");
    }
    break;
  case IMAGE_DVRT_ARM64X_FIXUP_TYPE_DELTA:
  case IMAGE_DVRT_ARM64X_FIXUP_TYPE_ZEROFILL:
    llvm_unreachable("unsupported type");
  }
}
```
```cpp
void DynamicRelocsChunk::finalize() {
  llvm::stable_sort(arm64xRelocs, [=](const Arm64XDynamicRelocEntry &a,
                                      const Arm64XDynamicRelocEntry &b) {
    return a.offset < b.offset;
  });

  size = sizeof(coff_dynamic_reloc_table) + sizeof(coff_dynamic_relocation64) +
         sizeof(coff_base_reloc_block_header);

  for (const Arm64XDynamicRelocEntry &entry : arm64xRelocs) {
    assert(!(entry.offset & ~0xfff)); // Not yet supported.
    size += entry.getSize();
  }

  size = alignTo(size, sizeof(uint32_t));
}
```

Review thread on the sorting:

> "This …" (comment truncated in the source)
>
> "The sorting itself won't change, but the size might once we allow offsets to chunks. Adding thunks could shift offsets between pages, which might add or remove …" (truncated)
>
> "Ok, sounds reasonable - no need to move the sorting out in that case; keeping it together probably makes more sense indeed than to micro-optimize such things at this stage."

Review thread on the `offset & ~0xfff` assertion:

> "So essentially, at the point of this patch, we can only do dynamic relocations within the first 4 KB of the linked image, i.e. in the basics of the image headers?"
>
> "Yes, adding support for other offsets is straightforward, but it would remain dead code until we actually emit them. Since it's only a few lines of code, I can add it if you think it's better. In general, testing is tricky without separate namespace support, and namespace support depends on ARM64X relocs. So, I'm trying to find a way to keep things incremental."
>
> "Ok, good, then I'm understanding it correctly. No need to add support for other offsets yet if it's not easy to test."
```cpp
void DynamicRelocsChunk::writeTo(uint8_t *buf) const {
  auto table = reinterpret_cast<coff_dynamic_reloc_table *>(buf);
  table->Version = 1;
  table->Size = sizeof(coff_dynamic_relocation64);
  buf += sizeof(*table);

  auto header = reinterpret_cast<coff_dynamic_relocation64 *>(buf);
  header->Symbol = IMAGE_DYNAMIC_RELOCATION_ARM64X;
  buf += sizeof(*header);

  auto pageHeader = reinterpret_cast<coff_base_reloc_block_header *>(buf);
  pageHeader->BlockSize = sizeof(*pageHeader);
  for (const Arm64XDynamicRelocEntry &entry : arm64xRelocs) {
    entry.writeTo(buf + pageHeader->BlockSize);
    pageHeader->BlockSize += entry.getSize();
  }
  pageHeader->BlockSize = alignTo(pageHeader->BlockSize, sizeof(uint32_t));

  header->BaseRelocSize = pageHeader->BlockSize;
  table->Size += header->BaseRelocSize;
  assert(size == sizeof(*table) + sizeof(*header) + header->BaseRelocSize);
}

} // namespace lld::coff
```
**Second file** (the PE writer):
```diff
@@ -79,6 +79,11 @@ static_assert(sizeof(dosProgram) % 8 == 0,
 
 static const int dosStubSize = sizeof(dos_header) + sizeof(dosProgram);
 static_assert(dosStubSize % 8 == 0, "DOSStub size must be multiple of 8");
+static const uint32_t coffHeaderOffset = dosStubSize + sizeof(PEMagic);
+static const uint32_t peHeaderOffset =
+    coffHeaderOffset + sizeof(coff_file_header);
+static const uint32_t dataDirOffset64 =
+    peHeaderOffset + sizeof(pe32plus_header);
 
 static const int numberOfDataDirectory = 16;
 
@@ -272,6 +277,7 @@ class Writer {
   OutputSection *findSection(StringRef name);
   void addBaserels();
   void addBaserelBlocks(std::vector<Baserel> &v);
+  void createDynamicRelocs();
 
   uint32_t getSizeOfInitializedData();
 
@@ -754,6 +760,8 @@ void Writer::run() {
   llvm::TimeTraceScope timeScope("Write PE");
   ScopedTimer t1(ctx.codeLayoutTimer);
 
+  if (ctx.config.machine == ARM64X)
+    ctx.dynamicRelocs = make<DynamicRelocsChunk>();
   createImportTables();
   createSections();
   appendImportThunks();
 
@@ -764,6 +772,7 @@ void Writer::run() {
   mergeSections();
   sortECChunks();
   appendECImportTables();
+  createDynamicRelocs();
   removeUnusedSections();
   finalizeAddresses();
   removeEmptySections();
 
@@ -1597,8 +1606,14 @@ void Writer::assignAddresses() {
 
   for (OutputSection *sec : ctx.outputSections) {
     llvm::TimeTraceScope timeScope("Section: ", sec->name);
-    if (sec == relocSec)
+    if (sec == relocSec) {
+      sec->chunks.clear();
       addBaserels();
+      if (ctx.dynamicRelocs) {
+        ctx.dynamicRelocs->finalize();
+        relocSec->addChunk(ctx.dynamicRelocs);
+      }
+    }
     uint64_t rawSize = 0, virtualSize = 0;
     sec->header.VirtualAddress = rva;
```

Review thread on the `assignAddresses` change:

> "The added code here, …" / "Ok, I see that …" / "I'm pondering if there are other ways of making this clearer to the reader - either moving the newly added code here into …" (comments truncated in the source)
>
> "Agreed, that would be cleaner and more correct with respect to …" (truncated)

```diff
@@ -1673,6 +1688,7 @@ template <typename PEHeaderTy> void Writer::writeHeader() {
   buf += sizeof(PEMagic);
 
   // Write COFF header
+  assert(coffHeaderOffset == buf - buffer->getBufferStart());
   auto *coff = reinterpret_cast<coff_file_header *>(buf);
   buf += sizeof(*coff);
   switch (config->machine) {
 
@@ -1705,6 +1721,7 @@ template <typename PEHeaderTy> void Writer::writeHeader() {
       sizeof(PEHeaderTy) + sizeof(data_directory) * numberOfDataDirectory;
 
   // Write PE header
+  assert(peHeaderOffset == buf - buffer->getBufferStart());
   auto *pe = reinterpret_cast<PEHeaderTy *>(buf);
   buf += sizeof(*pe);
   pe->Magic = config->is64() ? PE32Header::PE32_PLUS : PE32Header::PE32;
 
@@ -1770,6 +1787,8 @@ template <typename PEHeaderTy> void Writer::writeHeader() {
   pe->SizeOfInitializedData = getSizeOfInitializedData();
 
   // Write data directory
+  assert(!ctx.config.is64() ||
+         dataDirOffset64 == buf - buffer->getBufferStart());
   auto *dir = reinterpret_cast<data_directory *>(buf);
   buf += sizeof(*dir) * numberOfDataDirectory;
   if (edataStart) {
 
@@ -1799,9 +1818,12 @@ template <typename PEHeaderTy> void Writer::writeHeader() {
                                          exceptionTable.last->getSize() -
                                          exceptionTable.first->getRVA();
   }
-  if (relocSec->getVirtualSize()) {
+  size_t relocSize = relocSec->getVirtualSize();
+  if (ctx.dynamicRelocs)
+    relocSize -= ctx.dynamicRelocs->getSize();
+  if (relocSize) {
     dir[BASE_RELOCATION_TABLE].RelativeVirtualAddress = relocSec->getRVA();
-    dir[BASE_RELOCATION_TABLE].Size = relocSec->getVirtualSize();
+    dir[BASE_RELOCATION_TABLE].Size = relocSize;
   }
   if (Symbol *sym = ctx.symtab.findUnderscore("_tls_used")) {
     if (Defined *b = dyn_cast<Defined>(sym)) {
 
@@ -2523,7 +2545,6 @@ uint32_t Writer::getSizeOfInitializedData() {
 void Writer::addBaserels() {
   if (!ctx.config.relocatable)
     return;
-  relocSec->chunks.clear();
   std::vector<Baserel> v;
   for (OutputSection *sec : ctx.outputSections) {
     if (sec->header.Characteristics & IMAGE_SCN_MEM_DISCARDABLE)
 
@@ -2557,6 +2578,29 @@ void Writer::addBaserelBlocks(std::vector<Baserel> &v) {
     relocSec->addChunk(make<BaserelChunk>(page, &v[i], &v[0] + j));
 }
 
+void Writer::createDynamicRelocs() {
+  if (!ctx.dynamicRelocs)
+    return;
+
+  // Adjust the Machine field in the COFF header to AMD64.
+  ctx.dynamicRelocs->add(IMAGE_DVRT_ARM64X_FIXUP_TYPE_VALUE, sizeof(uint16_t),
+                         coffHeaderOffset + offsetof(coff_file_header, Machine),
+                         AMD64);
+
+  // Clear the load config directory.
+  // FIXME: Use the hybrid load config value instead.
+  ctx.dynamicRelocs->add(IMAGE_DVRT_ARM64X_FIXUP_TYPE_VALUE, sizeof(uint32_t),
+                         dataDirOffset64 +
+                             LOAD_CONFIG_TABLE * sizeof(data_directory) +
+                             offsetof(data_directory, RelativeVirtualAddress),
+                         0);
+  ctx.dynamicRelocs->add(IMAGE_DVRT_ARM64X_FIXUP_TYPE_VALUE, sizeof(uint32_t),
+                         dataDirOffset64 +
+                             LOAD_CONFIG_TABLE * sizeof(data_directory) +
+                             offsetof(data_directory, Size),
+                         0);
+}
```
```diff
 PartialSection *Writer::createPartialSection(StringRef name,
                                              uint32_t outChars) {
   PartialSection *&pSec = partialSections[{name, outChars}];
 
@@ -2660,6 +2704,18 @@ template <typename T> void Writer::prepareLoadConfig(T *loadConfig) {
     loadConfig->DependentLoadFlags = ctx.config.dependentLoadFlags;
   }
 
+  if (ctx.dynamicRelocs) {
+    IF_CONTAINS(DynamicValueRelocTableSection) {
+      loadConfig->DynamicValueRelocTableSection = relocSec->sectionIndex;
+      loadConfig->DynamicValueRelocTableOffset =
+          ctx.dynamicRelocs->getRVA() - relocSec->getRVA();
+    }
+    else {
+      warn("'_load_config_used' structure too small to include dynamic "
+           "relocations");
+    }
+  }
+
   if (ctx.config.guardCF == GuardCFLevel::Off)
     return;
   RETURN_IF_NOT_CONTAINS(GuardFlags)
```

Review thread on the load config fields:

> "I see that you're merging the … Are there other fields that we currently set in load config, which could benefit from …" (comment truncated in the source)
>
> "Yes, that's a correct observation. Using the existing helpers felt appropriate here. For ARM64X, we will also need to modify the … We should technically have such check for the existing …" (truncated)
**New test file:**
```
// REQUIRES: aarch64
// RUN: split-file %s %t.dir && cd %t.dir

// RUN: llvm-mc -filetype=obj -triple=aarch64-windows test.s -o test.obj
// RUN: llvm-mc -filetype=obj -triple=aarch64-windows loadconfig.s -o loadconfig.obj
// RUN: llvm-mc -filetype=obj -triple=aarch64-windows loadconfig-short.s -o loadconfig-short.obj

// RUN: lld-link -machine:arm64x -out:out.dll -dll -noentry loadconfig.obj test.obj
```

Review thread on the lld-link invocation:

> "So we're testing writing an arm64x image, but where we only have plain aarch64 chunks? And if changing to the arm64ec form at runtime, it changes the machine field and clears the load config, but nothing else?"
>
> "Yes, I think that's the simplest possible ARM64X relocs test (and testing more complex cases will require namespaces)."

```
// RUN: llvm-readobj --coff-load-config out.dll | FileCheck -check-prefix=DYNRELOCS %s
// DYNRELOCS: DynamicValueRelocTableOffset: 0xC
// DYNRELOCS-NEXT: DynamicValueRelocTableSection: 4
// DYNRELOCS: DynamicRelocations [
// DYNRELOCS-NEXT:   Version: 0x1
// DYNRELOCS-NEXT:   Arm64X [
// DYNRELOCS-NEXT:     Entry [
// DYNRELOCS-NEXT:       RVA: 0x7C
// DYNRELOCS-NEXT:       Type: VALUE
// DYNRELOCS-NEXT:       Size: 0x2
// DYNRELOCS-NEXT:       Value: 0x8664
// DYNRELOCS-NEXT:     ]
// DYNRELOCS-NEXT:     Entry [
// DYNRELOCS-NEXT:       RVA: 0x150
// DYNRELOCS-NEXT:       Type: VALUE
// DYNRELOCS-NEXT:       Size: 0x4
// DYNRELOCS-NEXT:       Value: 0x0
// DYNRELOCS-NEXT:     ]
// DYNRELOCS-NEXT:     Entry [
// DYNRELOCS-NEXT:       RVA: 0x154
// DYNRELOCS-NEXT:       Type: VALUE
// DYNRELOCS-NEXT:       Size: 0x4
// DYNRELOCS-NEXT:       Value: 0x0
// DYNRELOCS-NEXT:     ]
// DYNRELOCS-NEXT:   ]
// DYNRELOCS-NEXT: ]

// RUN: llvm-readobj --headers out.dll | FileCheck -check-prefix=HEADERS %s
// HEADERS: BaseRelocationTableRVA: 0x4000
// HEADERS-NEXT: BaseRelocationTableSize: 0xC
// HEADERS: LoadConfigTableRVA: 0x1000
// HEADERS-NEXT: LoadConfigTableSize: 0x140
// HEADERS: Name: .reloc (2E 72 65 6C 6F 63 00 00)
// HEADERS-NEXT: VirtualSize: 0x38

// RUN: lld-link -machine:arm64x -out:out-short.dll -dll -noentry loadconfig-short.obj 2>&1 | FileCheck --check-prefix=WARN-RELOC-SIZE %s
// WARN-RELOC-SIZE: lld-link: warning: '_load_config_used' structure too small to include dynamic relocations

#--- test.s
.data
sym:
// Emit a basereloc to make the loadconfig test more meaningful.
.xword sym

#--- loadconfig.s
.section .rdata,"dr"
.globl _load_config_used
.p2align 3, 0
_load_config_used:
.word 0x140
.fill 0x13c,1,0

#--- loadconfig-short.s
.section .rdata,"dr"
.globl _load_config_used
.p2align 3, 0
_load_config_used:
.word 0xe4
.fill 0xe0,1,0
```
A follow-up review thread on the fully covered switch:

> This causes warnings when building with GCC. This case is mentioned in the LLVM coding style guide, see https://llvm.org/docs/CodingStandards.html#don-t-use-default-labels-in-fully-covered-switches-over-enumerations - the default action to silence this warning with GCC is to add an `llvm_unreachable("")` after the switch; see e.g. b21de9b. (In this case, it feels a bit repetitive when some cases already end with `llvm_unreachable()` too, but they're different cases.)
>
> I created #122382 for a fix, sorry about that. The current `llvm_unreachable()` will be removed once ARM64X support is more complete. Those two reloc types are needed for the shared IAT.