[ICP] Add a few tunings to indirect-call-promotion #149892


Merged
merged 2 commits into from
Jul 24, 2025
14 changes: 12 additions & 2 deletions llvm/lib/Analysis/ProfileSummaryInfo.cpp
@@ -121,8 +121,18 @@ void ProfileSummaryInfo::computeThresholds() {
ProfileSummaryBuilder::getHotCountThreshold(DetailedSummary);
ColdCountThreshold =
ProfileSummaryBuilder::getColdCountThreshold(DetailedSummary);
assert(ColdCountThreshold <= HotCountThreshold &&
"Cold count threshold cannot exceed hot count threshold!");
// When the hot and cold thresholds are identical, we would classify
// a count value as both hot and cold since we are doing an inclusive check
// (see ::is{Hot|Cold}Count()). To avoid this undesirable overlap, ensure the
// thresholds are distinct.
if (HotCountThreshold == ColdCountThreshold) {
if (ColdCountThreshold > 0)

Contributor: Simplify it to just bump the hot count threshold?

Contributor Author: Probably not a good idea, as this will lead to tons of test changes and also change existing opt behaviors.

Contributor: I mean bump it when the cold and hot thresholds are equal -- that should not be frequent.

(*ColdCountThreshold)--;
else
(*HotCountThreshold)++;
}
assert(ColdCountThreshold < HotCountThreshold &&
"Cold count threshold should be less than hot count threshold!");
if (!hasPartialSampleProfile() || !ScalePartialSampleProfileWorkingSetSize) {
HasHugeWorkingSetSize =
HotEntry.NumCounts > ProfileSummaryHugeWorkingSetSizeThreshold;
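
The threshold adjustment in this hunk can be illustrated in isolation. The following is a minimal stand-alone sketch, not LLVM code: `separateThresholds`, `isHot`, and `isCold` are hypothetical names, and plain `uint64_t` values stand in for the optional thresholds held by `ProfileSummaryInfo`. It assumes the inclusive hot/cold checks that the comment in the hunk describes.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Hypothetical helper (not part of LLVM): returns {cold, hot} adjusted so
// that cold < hot, following the same branch structure as the hunk.
std::pair<uint64_t, uint64_t> separateThresholds(uint64_t Cold, uint64_t Hot) {
  assert(Cold <= Hot && "Cold count threshold cannot exceed hot count threshold!");
  if (Hot == Cold) {
    if (Cold > 0)
      --Cold; // Lower the cold threshold when there is room to do so.
    else
      ++Hot;  // Cold is already 0, so raise the hot threshold instead.
  }
  return {Cold, Hot};
}

// Inclusive checks, assumed to match the behavior the hunk's comment refers to.
bool isHot(uint64_t Count, uint64_t Hot) { return Count >= Hot; }
bool isCold(uint64_t Count, uint64_t Cold) { return Count <= Cold; }
```

With equal thresholds of 5, a count of 5 would satisfy both inclusive checks; after the adjustment the cold threshold becomes 4 and the count is hot only.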
193 changes: 129 additions & 64 deletions llvm/lib/Transforms/Instrumentation/IndirectCallPromotion.cpp
@@ -80,6 +80,27 @@ static cl::opt<unsigned>
ICPCSSkip("icp-csskip", cl::init(0), cl::Hidden,
cl::desc("Skip Callsite up to this number for this compilation"));

// ICP the candidate function even when only a declaration is present.
static cl::opt<bool> ICPAllowDecls(
"icp-allow-decls", cl::init(false), cl::Hidden,
cl::desc("Promote the target candidate even when the definition "
"is not available"));

// ICP hot candidate functions only. When set to false, non-cold functions
// (warm functions) can also be promoted.
static cl::opt<bool>
ICPAllowHotOnly("icp-allow-hot-only", cl::init(true), cl::Hidden,
cl::desc("Promote the target candidate only if it is a "
"hot function. Otherwise, warm functions can "
"also be promoted"));

// If one target cannot be ICP'd, proceed with the remaining targets instead
// of exiting the callsite.
static cl::opt<bool> ICPAllowCandidateSkip(
"icp-allow-candidate-skip", cl::init(false), cl::Hidden,
cl::desc("Continue with the remaining targets instead of exiting "
"when failing in a candidate"));

// Set if the pass is called in LTO optimization. The difference for LTO mode
// is the pass won't prefix the source module name to the internal linkage
// symbols.
@@ -330,6 +351,7 @@ class IndirectCallPromoter {
struct PromotionCandidate {
Function *const TargetFunction;
const uint64_t Count;
const uint32_t Index;

// The following fields only exist for promotion candidates with vtable
// information.
@@ -341,7 +363,8 @@ class IndirectCallPromoter {
VTableGUIDCountsMap VTableGUIDAndCounts;
SmallVector<Constant *> AddressPoints;

PromotionCandidate(Function *F, uint64_t C) : TargetFunction(F), Count(C) {}
PromotionCandidate(Function *F, uint64_t C, uint32_t I)
: TargetFunction(F), Count(C), Index(I) {}
};

// Check if the indirect-call call site should be promoted. Return the number
@@ -356,12 +379,10 @@ class IndirectCallPromoter {
// Promote a list of targets for one indirect-call callsite by comparing
// indirect callee with functions. Return true if there are IR
// transformations and false otherwise.
bool tryToPromoteWithFuncCmp(CallBase &CB, Instruction *VPtr,
ArrayRef<PromotionCandidate> Candidates,
uint64_t TotalCount,
ArrayRef<InstrProfValueData> ICallProfDataRef,
uint32_t NumCandidates,
VTableGUIDCountsMap &VTableGUIDCounts);
bool tryToPromoteWithFuncCmp(
CallBase &CB, Instruction *VPtr, ArrayRef<PromotionCandidate> Candidates,
uint64_t TotalCount, MutableArrayRef<InstrProfValueData> ICallProfDataRef,
uint32_t NumCandidates, VTableGUIDCountsMap &VTableGUIDCounts);

// Promote a list of targets for one indirect call by comparing vtables with
// functions. Return true if there are IR transformations and false
@@ -394,12 +415,15 @@ class IndirectCallPromoter {
Constant *getOrCreateVTableAddressPointVar(GlobalVariable *GV,
uint64_t AddressPointOffset);

void updateFuncValueProfiles(CallBase &CB, ArrayRef<InstrProfValueData> VDs,
void updateFuncValueProfiles(CallBase &CB,
MutableArrayRef<InstrProfValueData> VDs,
uint64_t Sum, uint32_t MaxMDCount);

void updateVPtrValueProfiles(Instruction *VPtr,
VTableGUIDCountsMap &VTableGUIDCounts);

bool isValidTarget(uint64_t, Function *, const CallBase &, uint64_t);

public:
IndirectCallPromoter(
Function &Func, Module &M, InstrProfSymtab *Symtab, bool SamplePGO,
@@ -419,6 +443,53 @@ class IndirectCallPromoter {

} // end anonymous namespace

bool IndirectCallPromoter::isValidTarget(uint64_t Target,
Function *TargetFunction,
const CallBase &CB, uint64_t Count) {
// Don't promote if the symbol is not defined in the module. This avoids
// creating a reference to a symbol that doesn't exist in the module.
// This can happen when we compile with a sample profile collected from
// one binary but used for another, which may have profiled targets that
// aren't used in the new binary. We might have a declaration initially in
// the case where the symbol is globally dead in the binary and removed by
// ThinLTO.
using namespace ore;
if (TargetFunction == nullptr) {
LLVM_DEBUG(dbgs() << " Not promote: Cannot find the target\n");
ORE.emit([&]() {
return OptimizationRemarkMissed(DEBUG_TYPE, "UnableToFindTarget", &CB)
<< "Cannot promote indirect call: target with md5sum "
<< NV("target md5sum", Target)
<< " not found (count=" << NV("Count", Count) << ")";
});
return false;
}
if (!ICPAllowDecls && TargetFunction->isDeclaration()) {
LLVM_DEBUG(dbgs() << " Not promote: target definition is not available\n");
ORE.emit([&]() {
return OptimizationRemarkMissed(DEBUG_TYPE, "NoTargetDef", &CB)
<< "Do not promote indirect call: target with md5sum "
<< NV("target md5sum", Target)
<< " definition not available (count=" << ore::NV("Count", Count)
<< ")";
});
return false;
}

const char *Reason = nullptr;
if (!isLegalToPromote(CB, TargetFunction, &Reason)) {

ORE.emit([&]() {
return OptimizationRemarkMissed(DEBUG_TYPE, "UnableToPromote", &CB)
<< "Cannot promote indirect call to "
<< NV("TargetFunction", TargetFunction)
<< " (count=" << NV("Count", Count) << "): " << Reason;
});
return false;
}
return true;
}

// Indirect-call promotion heuristic. The direct targets are sorted based on
// the count. Stop at the first target that is not promoted.
std::vector<IndirectCallPromoter::PromotionCandidate>
@@ -469,38 +540,15 @@ IndirectCallPromoter::getPromotionCandidatesForCallSite(
break;
}

// Don't promote if the symbol is not defined in the module. This avoids
// creating a reference to a symbol that doesn't exist in the module
// This can happen when we compile with a sample profile collected from
// one binary but used for another, which may have profiled targets that
// aren't used in the new binary. We might have a declaration initially in
// the case where the symbol is globally dead in the binary and removed by
// ThinLTO.
Function *TargetFunction = Symtab->getFunction(Target);
if (TargetFunction == nullptr || TargetFunction->isDeclaration()) {
LLVM_DEBUG(dbgs() << " Not promote: Cannot find the target\n");
ORE.emit([&]() {
return OptimizationRemarkMissed(DEBUG_TYPE, "UnableToFindTarget", &CB)
<< "Cannot promote indirect call: target with md5sum "
<< ore::NV("target md5sum", Target) << " not found";
});
break;
}

const char *Reason = nullptr;
if (!isLegalToPromote(CB, TargetFunction, &Reason)) {
using namespace ore;

ORE.emit([&]() {
return OptimizationRemarkMissed(DEBUG_TYPE, "UnableToPromote", &CB)
<< "Cannot promote indirect call to "
<< NV("TargetFunction", TargetFunction) << " with count of "
<< NV("Count", Count) << ": " << Reason;
});
break;
if (!isValidTarget(Target, TargetFunction, CB, Count)) {
if (ICPAllowCandidateSkip)
continue;
else
break;
}

Ret.push_back(PromotionCandidate(TargetFunction, Count));
Ret.push_back(PromotionCandidate(TargetFunction, Count, I));
TotalCount -= Count;
}
return Ret;
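
The effect of `-icp-allow-candidate-skip` on candidate selection, and the reason each candidate now records its position in the profile array, can be sketched without LLVM. This is a simplified illustration only: `ProfiledTarget`, `Candidate`, and `selectCandidates` are hypothetical, and the `Promotable` flag stands in for the result of `isValidTarget`. The other early exits in the real loop are omitted.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one <target, count> profile entry. Promotable
// models the outcome of the validity check for that target.
struct ProfiledTarget {
  uint64_t Count;
  bool Promotable;
};

// Hypothetical stand-in for PromotionCandidate: Index is the position of the
// entry in the original profile array.
struct Candidate {
  uint64_t Count;
  uint32_t Index;
};

std::vector<Candidate> selectCandidates(const std::vector<ProfiledTarget> &Targets,
                                        bool AllowCandidateSkip) {
  std::vector<Candidate> Ret;
  for (uint32_t I = 0; I < Targets.size(); ++I) {
    if (!Targets[I].Promotable) {
      if (AllowCandidateSkip)
        continue; // Try the remaining targets.
      break;      // Default behavior: stop at the first failing target.
    }
    Ret.push_back({Targets[I].Count, I});
  }
  return Ret;
}
```

Once targets can be skipped, the promoted candidates are no longer guaranteed to be a prefix of the profile array, which is presumably why later code addresses entries by `Index` instead of slicing off the first `NumPromoted` entries.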
@@ -642,7 +690,7 @@ CallBase &llvm::pgo::promoteIndirectCall(CallBase &CB, Function *DirectCallee,
// Promote indirect-call to conditional direct-call for one callsite.
bool IndirectCallPromoter::tryToPromoteWithFuncCmp(
CallBase &CB, Instruction *VPtr, ArrayRef<PromotionCandidate> Candidates,
uint64_t TotalCount, ArrayRef<InstrProfValueData> ICallProfDataRef,
uint64_t TotalCount, MutableArrayRef<InstrProfValueData> ICallProfDataRef,
uint32_t NumCandidates, VTableGUIDCountsMap &VTableGUIDCounts) {
uint32_t NumPromoted = 0;

@@ -655,6 +703,8 @@ bool IndirectCallPromoter::tryToPromoteWithFuncCmp(
NumOfPGOICallPromotion++;
NumPromoted++;

// Update the count and this entry will be erased later.
ICallProfDataRef[C.Index].Count = 0;
if (!EnableVTableProfileUse || C.VTableGUIDAndCounts.empty())
continue;

@@ -679,21 +729,33 @@ bool IndirectCallPromoter::tryToPromoteWithFuncCmp(
"Number of promoted functions should not be greater than the number "
"of values in profile metadata");

// Update value profiles on the indirect call.
updateFuncValueProfiles(CB, ICallProfDataRef.slice(NumPromoted), TotalCount,
NumCandidates);
updateFuncValueProfiles(CB, ICallProfDataRef, TotalCount, NumCandidates);
updateVPtrValueProfiles(VPtr, VTableGUIDCounts);
return true;
}

void IndirectCallPromoter::updateFuncValueProfiles(
CallBase &CB, ArrayRef<InstrProfValueData> CallVDs, uint64_t TotalCount,
uint32_t MaxMDCount) {
CallBase &CB, MutableArrayRef<InstrProfValueData> CallVDs,
uint64_t TotalCount, uint32_t MaxMDCount) {
// First clear the existing !prof.
CB.setMetadata(LLVMContext::MD_prof, nullptr);

// Sort value profiles by count in descending order.
llvm::stable_sort(CallVDs, [](const InstrProfValueData &LHS,
const InstrProfValueData &RHS) {
return LHS.Count > RHS.Count;
});
// Drop the <target-value, count> pair if count is zero.
ArrayRef<InstrProfValueData> VDs(
CallVDs.begin(),
llvm::upper_bound(CallVDs, 0U,
[](uint64_t Count, const InstrProfValueData &ProfData) {
return ProfData.Count <= Count;
}));

// Annotate the remaining value profiles if counter is not zero.
if (TotalCount != 0)
annotateValueSite(M, CB, CallVDs, TotalCount, IPVK_IndirectCallTarget,
annotateValueSite(M, CB, VDs, TotalCount, IPVK_IndirectCallTarget,
MaxMDCount);
}
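
The sort-then-truncate step in `updateFuncValueProfiles` can be tried outside LLVM with the standard library equivalents of `llvm::stable_sort` and `llvm::upper_bound`. This is a sketch under assumptions: `ValueData` and `dropZeroCounts` are hypothetical names, and the function returns a copy instead of building an `ArrayRef` over the sorted storage.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for InstrProfValueData.
struct ValueData {
  uint64_t Value;
  uint64_t Count;
};

// Sorts by count in descending order, then drops the zero-count tail.
std::vector<ValueData> dropZeroCounts(std::vector<ValueData> VDs) {
  std::stable_sort(VDs.begin(), VDs.end(),
                   [](const ValueData &LHS, const ValueData &RHS) {
                     return LHS.Count > RHS.Count;
                   });
  // In a descending range the zero-count entries form the tail. With this
  // comparator, upper_bound returns the first entry whose count is <= 0.
  auto FirstZero = std::upper_bound(
      VDs.begin(), VDs.end(), uint64_t{0},
      [](uint64_t Count, const ValueData &VD) { return VD.Count <= Count; });
  VDs.erase(FirstZero, VDs.end());
  return VDs;
}
```

Because promoted entries have their count set to 0 beforehand, they all land in the tail after sorting and are excluded from the re-annotated metadata, wherever they sat in the original array.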

@@ -726,7 +788,7 @@ bool IndirectCallPromoter::tryToPromoteWithVTableCmp(
uint64_t TotalFuncCount, uint32_t NumCandidates,
MutableArrayRef<InstrProfValueData> ICallProfDataRef,
VTableGUIDCountsMap &VTableGUIDCounts) {
SmallVector<uint64_t, 4> PromotedFuncCount;
SmallVector<std::pair<uint32_t, uint64_t>, 4> PromotedFuncCount;

for (const auto &Candidate : Candidates) {
for (auto &[GUID, Count] : Candidate.VTableGUIDAndCounts)
@@ -771,7 +833,7 @@ bool IndirectCallPromoter::tryToPromoteWithVTableCmp(
return Remark;
});

PromotedFuncCount.push_back(Candidate.Count);
PromotedFuncCount.push_back({Candidate.Index, Candidate.Count});

assert(TotalFuncCount >= Candidate.Count &&
"Within one prof metadata, total count is the sum of counts from "
@@ -792,22 +854,12 @@ bool IndirectCallPromoter::tryToPromoteWithVTableCmp(
// used to load multiple virtual functions. The vtable profiles need to be
// updated properly in that case (e.g., for each indirect call annotate both
// type profiles and function profiles in one !prof).
for (size_t I = 0; I < PromotedFuncCount.size(); I++)
ICallProfDataRef[I].Count -=
std::max(PromotedFuncCount[I], ICallProfDataRef[I].Count);
// Sort value profiles by count in descending order.
llvm::stable_sort(ICallProfDataRef, [](const InstrProfValueData &LHS,
const InstrProfValueData &RHS) {
return LHS.Count > RHS.Count;
});
// Drop the <target-value, count> pair if count is zero.
ArrayRef<InstrProfValueData> VDs(
ICallProfDataRef.begin(),
llvm::upper_bound(ICallProfDataRef, 0U,
[](uint64_t Count, const InstrProfValueData &ProfData) {
return ProfData.Count <= Count;
}));
updateFuncValueProfiles(CB, VDs, TotalFuncCount, NumCandidates);
for (size_t I = 0; I < PromotedFuncCount.size(); I++) {
uint32_t Index = PromotedFuncCount[I].first;
ICallProfDataRef[Index].Count -=
std::max(PromotedFuncCount[I].second, ICallProfDataRef[Index].Count);
}
updateFuncValueProfiles(CB, ICallProfDataRef, TotalFuncCount, NumCandidates);
updateVPtrValueProfiles(VPtr, VTableGUIDCounts);
return true;
}
@@ -822,9 +874,22 @@ bool IndirectCallPromoter::processFunction(ProfileSummaryInfo *PSI) {
uint64_t TotalCount;
auto ICallProfDataRef = ICallAnalysis.getPromotionCandidatesForInstruction(
CB, TotalCount, NumCandidates);
if (!NumCandidates ||
(PSI && PSI->hasProfileSummary() && !PSI->isHotCount(TotalCount)))
if (!NumCandidates)
continue;
if (PSI && PSI->hasProfileSummary()) {
// Don't promote cold candidates.
if (PSI->isColdCount(TotalCount)) {
LLVM_DEBUG(dbgs() << "Don't promote the cold candidate: TotalCount="
<< TotalCount << "\n");
continue;
}
// Only promote hot candidates if ICPAllowHotOnly is true.
if (ICPAllowHotOnly && !PSI->isHotCount(TotalCount)) {
LLVM_DEBUG(dbgs() << "Don't promote the non-hot candidate: TotalCount="
<< TotalCount << "\n");
continue;
}
}

auto PromotionCandidates = getPromotionCandidatesForCallSite(
*CB, ICallProfDataRef, TotalCount, NumCandidates);
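
The new gating in `processFunction` splits call sites into cold, warm, and hot by total count. The sketch below restates that decision as a pure function. It is an illustration, not the pass itself: `Thresholds` and `shouldConsiderCallSite` are hypothetical, and the inclusive comparisons are assumed to match `isColdCount` and `isHotCount`.

```cpp
#include <cstdint>

// Hypothetical holder for the two profile-summary thresholds.
struct Thresholds {
  uint64_t Cold;
  uint64_t Hot;
};

// Returns true if a call site with the given total count passes the gate.
bool shouldConsiderCallSite(uint64_t TotalCount, const Thresholds &T,
                            bool AllowHotOnly) {
  if (TotalCount <= T.Cold)
    return false; // Cold call sites are never promoted.
  if (AllowHotOnly && TotalCount < T.Hot)
    return false; // Warm call sites are rejected unless hot-only is disabled.
  return true;
}
```

With thresholds of 10 (cold) and 100 (hot), a call site with a total count of 50 is rejected under the default `-icp-allow-hot-only=true` and accepted when the option is set to false.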
1 change: 1 addition & 0 deletions llvm/test/ThinLTO/X86/memprof-icp.ll
@@ -229,6 +229,7 @@
; RUN: llvm-lto2 run %t/main.o %t/foo.o -enable-memprof-context-disambiguation \
; RUN: -import-instr-limit=0 \
; RUN: -memprof-require-definition-for-promotion \
; RUN: -icp-allow-decls=false \
; RUN: -enable-memprof-indirect-call-support=true \
; RUN: -supports-hot-cold-new \
; RUN: -r=%t/foo.o,_Z3fooR2B0j,plx \
6 changes: 3 additions & 3 deletions llvm/test/Transforms/PGOProfile/icp_mismatch_msg.ll
@@ -1,8 +1,8 @@
; RUN: opt < %s -passes=pgo-icall-prom -pass-remarks-missed=pgo-icall-prom -S 2>& 1 | FileCheck %s

; CHECK: remark: <unknown>:0:0: Cannot promote indirect call to func4 with count of 1234: The number of arguments mismatch
; CHECK: remark: <unknown>:0:0: Cannot promote indirect call: target with md5sum{{.*}} not found
; CHECK: remark: <unknown>:0:0: Cannot promote indirect call to func2 with count of 7890: Return type mismatch
; CHECK: remark: <unknown>:0:0: Cannot promote indirect call to func4 (count=1234): The number of arguments mismatch
; CHECK: remark: <unknown>:0:0: Cannot promote indirect call: target with md5sum {{.*}} not found (count=2345)
; CHECK: remark: <unknown>:0:0: Cannot promote indirect call to func2 (count=7890): Return type mismatch

target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"