-
Notifications
You must be signed in to change notification settings - Fork 14.5k
[ICP] Add a few tunings to indirect-call-promotion #149892
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
@llvm/pr-subscribers-llvm-transforms Author: None (xur-llvm) ChangesFull diff: https://github.com/llvm/llvm-project/pull/149892.diff 3 Files Affected:
diff --git a/llvm/lib/Analysis/ProfileSummaryInfo.cpp b/llvm/lib/Analysis/ProfileSummaryInfo.cpp
index e8d4e37a4eb7e..7f361f47576d3 100644
--- a/llvm/lib/Analysis/ProfileSummaryInfo.cpp
+++ b/llvm/lib/Analysis/ProfileSummaryInfo.cpp
@@ -121,8 +121,18 @@ void ProfileSummaryInfo::computeThresholds() {
ProfileSummaryBuilder::getHotCountThreshold(DetailedSummary);
ColdCountThreshold =
ProfileSummaryBuilder::getColdCountThreshold(DetailedSummary);
- assert(ColdCountThreshold <= HotCountThreshold &&
- "Cold count threshold cannot exceed hot count threshold!");
+ // When the hot and cold thresholds are identical, we would classify
+ // a count value as both hot and cold since we are doing an inclusitve check
+ // (see ::is{Hot|Cold}Count(). To avoid this undesirable overlap, ensure the
+ // thresholds are distinct.
+ if (HotCountThreshold == ColdCountThreshold) {
+ if (ColdCountThreshold > 0)
+ (*ColdCountThreshold)--;
+ else
+ (*HotCountThreshold)++;
+ }
+ assert(ColdCountThreshold < HotCountThreshold &&
+ "Cold count threshold should be less than hot count threshold!");
if (!hasPartialSampleProfile() || !ScalePartialSampleProfileWorkingSetSize) {
HasHugeWorkingSetSize =
HotEntry.NumCounts > ProfileSummaryHugeWorkingSetSizeThreshold;
diff --git a/llvm/lib/Transforms/Instrumentation/IndirectCallPromotion.cpp b/llvm/lib/Transforms/Instrumentation/IndirectCallPromotion.cpp
index 854db0ff6864c..eafac58f0580a 100644
--- a/llvm/lib/Transforms/Instrumentation/IndirectCallPromotion.cpp
+++ b/llvm/lib/Transforms/Instrumentation/IndirectCallPromotion.cpp
@@ -80,6 +80,18 @@ static cl::opt<unsigned>
ICPCSSkip("icp-csskip", cl::init(0), cl::Hidden,
cl::desc("Skip Callsite up to this number for this compilation"));
+// ICP the candidate function even when only a declaration is present.
+static cl::opt<bool>
+ ICPAllowDeclOnly("icp-allow-decl-only", cl::init(true), cl::Hidden,
+ cl::desc("Promote the target candidate even when the defintion "
+ " is not available"));
+
+// ICP all non-cold candidate functions. When it's false, only ICP hot functions.
+static cl::opt<bool>
+ ICPAllowWarmFunc("icp-allow-warm-func", cl::init(true), cl::Hidden,
+ cl::desc("Promote the target candidate even if it is not "
+ "hot"));
+
// Set if the pass is called in LTO optimization. The difference for LTO mode
// is the pass won't prefix the source module name to the internal linkage
// symbols.
@@ -477,7 +489,7 @@ IndirectCallPromoter::getPromotionCandidatesForCallSite(
// the case where the symbol is globally dead in the binary and removed by
// ThinLTO.
Function *TargetFunction = Symtab->getFunction(Target);
- if (TargetFunction == nullptr || TargetFunction->isDeclaration()) {
+ if (TargetFunction == nullptr) {
LLVM_DEBUG(dbgs() << " Not promote: Cannot find the target\n");
ORE.emit([&]() {
return OptimizationRemarkMissed(DEBUG_TYPE, "UnableToFindTarget", &CB)
@@ -486,6 +498,16 @@ IndirectCallPromoter::getPromotionCandidatesForCallSite(
});
break;
}
+ if (!ICPAllowDeclOnly && TargetFunction->isDeclaration()) {
+ LLVM_DEBUG(dbgs() << " Not promote: target definition is not available\n");
+ ORE.emit([&]() {
+ return OptimizationRemarkMissed(DEBUG_TYPE, "NoTargetDef", &CB)
+ << "Do not promote indirect call: target with md5sum "
+ << ore::NV("target md5sum", Target)
+ << " definition not available";
+ });
+ break;
+ }
const char *Reason = nullptr;
if (!isLegalToPromote(CB, TargetFunction, &Reason)) {
@@ -822,9 +844,18 @@ bool IndirectCallPromoter::processFunction(ProfileSummaryInfo *PSI) {
uint64_t TotalCount;
auto ICallProfDataRef = ICallAnalysis.getPromotionCandidatesForInstruction(
CB, TotalCount, NumCandidates);
- if (!NumCandidates ||
- (PSI && PSI->hasProfileSummary() && !PSI->isHotCount(TotalCount)))
+ if (!NumCandidates)
continue;
+ if (PSI && PSI->hasProfileSummary()) {
+ // Don't perform if ICPAllowWarmFunc is true AND the count is cold, OR
+ // ICPAllowWarmFunc is false AND the count is NOT hot.
+ if ((ICPAllowWarmFunc && PSI->isColdCount(TotalCount)) ||
+ (!ICPAllowWarmFunc && !PSI->isHotCount(TotalCount))) {
+ LLVM_DEBUG(dbgs() << " TotalCount=" << TotalCount
+ << " is not large enough.\n");
+ continue;
+ }
+ }
auto PromotionCandidates = getPromotionCandidatesForCallSite(
*CB, ICallProfDataRef, TotalCount, NumCandidates);
diff --git a/llvm/test/ThinLTO/X86/memprof-icp.ll b/llvm/test/ThinLTO/X86/memprof-icp.ll
index dbc532ee52828..87e5435a4229d 100644
--- a/llvm/test/ThinLTO/X86/memprof-icp.ll
+++ b/llvm/test/ThinLTO/X86/memprof-icp.ll
@@ -229,6 +229,7 @@
; RUN: llvm-lto2 run %t/main.o %t/foo.o -enable-memprof-context-disambiguation \
; RUN: -import-instr-limit=0 \
; RUN: -memprof-require-definition-for-promotion \
+; RUN: -icp-allow-decl-only=false \
; RUN: -enable-memprof-indirect-call-support=true \
; RUN: -supports-hot-cold-new \
; RUN: -r=%t/foo.o,_Z3fooR2B0j,plx \
|
@llvm/pr-subscribers-lto Author: None (xur-llvm) ChangesFull diff: https://github.com/llvm/llvm-project/pull/149892.diff 3 Files Affected:
diff --git a/llvm/lib/Analysis/ProfileSummaryInfo.cpp b/llvm/lib/Analysis/ProfileSummaryInfo.cpp
index e8d4e37a4eb7e..7f361f47576d3 100644
--- a/llvm/lib/Analysis/ProfileSummaryInfo.cpp
+++ b/llvm/lib/Analysis/ProfileSummaryInfo.cpp
@@ -121,8 +121,18 @@ void ProfileSummaryInfo::computeThresholds() {
ProfileSummaryBuilder::getHotCountThreshold(DetailedSummary);
ColdCountThreshold =
ProfileSummaryBuilder::getColdCountThreshold(DetailedSummary);
- assert(ColdCountThreshold <= HotCountThreshold &&
- "Cold count threshold cannot exceed hot count threshold!");
+ // When the hot and cold thresholds are identical, we would classify
+ // a count value as both hot and cold since we are doing an inclusitve check
+ // (see ::is{Hot|Cold}Count(). To avoid this undesirable overlap, ensure the
+ // thresholds are distinct.
+ if (HotCountThreshold == ColdCountThreshold) {
+ if (ColdCountThreshold > 0)
+ (*ColdCountThreshold)--;
+ else
+ (*HotCountThreshold)++;
+ }
+ assert(ColdCountThreshold < HotCountThreshold &&
+ "Cold count threshold should be less than hot count threshold!");
if (!hasPartialSampleProfile() || !ScalePartialSampleProfileWorkingSetSize) {
HasHugeWorkingSetSize =
HotEntry.NumCounts > ProfileSummaryHugeWorkingSetSizeThreshold;
diff --git a/llvm/lib/Transforms/Instrumentation/IndirectCallPromotion.cpp b/llvm/lib/Transforms/Instrumentation/IndirectCallPromotion.cpp
index 854db0ff6864c..eafac58f0580a 100644
--- a/llvm/lib/Transforms/Instrumentation/IndirectCallPromotion.cpp
+++ b/llvm/lib/Transforms/Instrumentation/IndirectCallPromotion.cpp
@@ -80,6 +80,18 @@ static cl::opt<unsigned>
ICPCSSkip("icp-csskip", cl::init(0), cl::Hidden,
cl::desc("Skip Callsite up to this number for this compilation"));
+// ICP the candidate function even when only a declaration is present.
+static cl::opt<bool>
+ ICPAllowDeclOnly("icp-allow-decl-only", cl::init(true), cl::Hidden,
+ cl::desc("Promote the target candidate even when the defintion "
+ " is not available"));
+
+// ICP all non-cold candidate functions. When it's false, only ICP hot functions.
+static cl::opt<bool>
+ ICPAllowWarmFunc("icp-allow-warm-func", cl::init(true), cl::Hidden,
+ cl::desc("Promote the target candidate even if it is not "
+ "hot"));
+
// Set if the pass is called in LTO optimization. The difference for LTO mode
// is the pass won't prefix the source module name to the internal linkage
// symbols.
@@ -477,7 +489,7 @@ IndirectCallPromoter::getPromotionCandidatesForCallSite(
// the case where the symbol is globally dead in the binary and removed by
// ThinLTO.
Function *TargetFunction = Symtab->getFunction(Target);
- if (TargetFunction == nullptr || TargetFunction->isDeclaration()) {
+ if (TargetFunction == nullptr) {
LLVM_DEBUG(dbgs() << " Not promote: Cannot find the target\n");
ORE.emit([&]() {
return OptimizationRemarkMissed(DEBUG_TYPE, "UnableToFindTarget", &CB)
@@ -486,6 +498,16 @@ IndirectCallPromoter::getPromotionCandidatesForCallSite(
});
break;
}
+ if (!ICPAllowDeclOnly && TargetFunction->isDeclaration()) {
+ LLVM_DEBUG(dbgs() << " Not promote: target definition is not available\n");
+ ORE.emit([&]() {
+ return OptimizationRemarkMissed(DEBUG_TYPE, "NoTargetDef", &CB)
+ << "Do not promote indirect call: target with md5sum "
+ << ore::NV("target md5sum", Target)
+ << " definition not available";
+ });
+ break;
+ }
const char *Reason = nullptr;
if (!isLegalToPromote(CB, TargetFunction, &Reason)) {
@@ -822,9 +844,18 @@ bool IndirectCallPromoter::processFunction(ProfileSummaryInfo *PSI) {
uint64_t TotalCount;
auto ICallProfDataRef = ICallAnalysis.getPromotionCandidatesForInstruction(
CB, TotalCount, NumCandidates);
- if (!NumCandidates ||
- (PSI && PSI->hasProfileSummary() && !PSI->isHotCount(TotalCount)))
+ if (!NumCandidates)
continue;
+ if (PSI && PSI->hasProfileSummary()) {
+ // Don't perform if ICPAllowWarmFunc is true AND the count is cold, OR
+ // ICPAllowWarmFunc is false AND the count is NOT hot.
+ if ((ICPAllowWarmFunc && PSI->isColdCount(TotalCount)) ||
+ (!ICPAllowWarmFunc && !PSI->isHotCount(TotalCount))) {
+ LLVM_DEBUG(dbgs() << " TotalCount=" << TotalCount
+ << " is not large enough.\n");
+ continue;
+ }
+ }
auto PromotionCandidates = getPromotionCandidatesForCallSite(
*CB, ICallProfDataRef, TotalCount, NumCandidates);
diff --git a/llvm/test/ThinLTO/X86/memprof-icp.ll b/llvm/test/ThinLTO/X86/memprof-icp.ll
index dbc532ee52828..87e5435a4229d 100644
--- a/llvm/test/ThinLTO/X86/memprof-icp.ll
+++ b/llvm/test/ThinLTO/X86/memprof-icp.ll
@@ -229,6 +229,7 @@
; RUN: llvm-lto2 run %t/main.o %t/foo.o -enable-memprof-context-disambiguation \
; RUN: -import-instr-limit=0 \
; RUN: -memprof-require-definition-for-promotion \
+; RUN: -icp-allow-decl-only=false \
; RUN: -enable-memprof-indirect-call-support=true \
; RUN: -supports-hot-cold-new \
; RUN: -r=%t/foo.o,_Z3fooR2B0j,plx \
|
@llvm/pr-subscribers-llvm-analysis Author: None (xur-llvm) ChangesFull diff: https://github.com/llvm/llvm-project/pull/149892.diff 3 Files Affected:
diff --git a/llvm/lib/Analysis/ProfileSummaryInfo.cpp b/llvm/lib/Analysis/ProfileSummaryInfo.cpp
index e8d4e37a4eb7e..7f361f47576d3 100644
--- a/llvm/lib/Analysis/ProfileSummaryInfo.cpp
+++ b/llvm/lib/Analysis/ProfileSummaryInfo.cpp
@@ -121,8 +121,18 @@ void ProfileSummaryInfo::computeThresholds() {
ProfileSummaryBuilder::getHotCountThreshold(DetailedSummary);
ColdCountThreshold =
ProfileSummaryBuilder::getColdCountThreshold(DetailedSummary);
- assert(ColdCountThreshold <= HotCountThreshold &&
- "Cold count threshold cannot exceed hot count threshold!");
+ // When the hot and cold thresholds are identical, we would classify
+ // a count value as both hot and cold since we are doing an inclusitve check
+ // (see ::is{Hot|Cold}Count(). To avoid this undesirable overlap, ensure the
+ // thresholds are distinct.
+ if (HotCountThreshold == ColdCountThreshold) {
+ if (ColdCountThreshold > 0)
+ (*ColdCountThreshold)--;
+ else
+ (*HotCountThreshold)++;
+ }
+ assert(ColdCountThreshold < HotCountThreshold &&
+ "Cold count threshold should be less than hot count threshold!");
if (!hasPartialSampleProfile() || !ScalePartialSampleProfileWorkingSetSize) {
HasHugeWorkingSetSize =
HotEntry.NumCounts > ProfileSummaryHugeWorkingSetSizeThreshold;
diff --git a/llvm/lib/Transforms/Instrumentation/IndirectCallPromotion.cpp b/llvm/lib/Transforms/Instrumentation/IndirectCallPromotion.cpp
index 854db0ff6864c..eafac58f0580a 100644
--- a/llvm/lib/Transforms/Instrumentation/IndirectCallPromotion.cpp
+++ b/llvm/lib/Transforms/Instrumentation/IndirectCallPromotion.cpp
@@ -80,6 +80,18 @@ static cl::opt<unsigned>
ICPCSSkip("icp-csskip", cl::init(0), cl::Hidden,
cl::desc("Skip Callsite up to this number for this compilation"));
+// ICP the candidate function even when only a declaration is present.
+static cl::opt<bool>
+ ICPAllowDeclOnly("icp-allow-decl-only", cl::init(true), cl::Hidden,
+ cl::desc("Promote the target candidate even when the defintion "
+ " is not available"));
+
+// ICP all non-cold candidate functions. When it's false, only ICP hot functions.
+static cl::opt<bool>
+ ICPAllowWarmFunc("icp-allow-warm-func", cl::init(true), cl::Hidden,
+ cl::desc("Promote the target candidate even if it is not "
+ "hot"));
+
// Set if the pass is called in LTO optimization. The difference for LTO mode
// is the pass won't prefix the source module name to the internal linkage
// symbols.
@@ -477,7 +489,7 @@ IndirectCallPromoter::getPromotionCandidatesForCallSite(
// the case where the symbol is globally dead in the binary and removed by
// ThinLTO.
Function *TargetFunction = Symtab->getFunction(Target);
- if (TargetFunction == nullptr || TargetFunction->isDeclaration()) {
+ if (TargetFunction == nullptr) {
LLVM_DEBUG(dbgs() << " Not promote: Cannot find the target\n");
ORE.emit([&]() {
return OptimizationRemarkMissed(DEBUG_TYPE, "UnableToFindTarget", &CB)
@@ -486,6 +498,16 @@ IndirectCallPromoter::getPromotionCandidatesForCallSite(
});
break;
}
+ if (!ICPAllowDeclOnly && TargetFunction->isDeclaration()) {
+ LLVM_DEBUG(dbgs() << " Not promote: target definition is not available\n");
+ ORE.emit([&]() {
+ return OptimizationRemarkMissed(DEBUG_TYPE, "NoTargetDef", &CB)
+ << "Do not promote indirect call: target with md5sum "
+ << ore::NV("target md5sum", Target)
+ << " definition not available";
+ });
+ break;
+ }
const char *Reason = nullptr;
if (!isLegalToPromote(CB, TargetFunction, &Reason)) {
@@ -822,9 +844,18 @@ bool IndirectCallPromoter::processFunction(ProfileSummaryInfo *PSI) {
uint64_t TotalCount;
auto ICallProfDataRef = ICallAnalysis.getPromotionCandidatesForInstruction(
CB, TotalCount, NumCandidates);
- if (!NumCandidates ||
- (PSI && PSI->hasProfileSummary() && !PSI->isHotCount(TotalCount)))
+ if (!NumCandidates)
continue;
+ if (PSI && PSI->hasProfileSummary()) {
+ // Don't perform if ICPAllowWarmFunc is true AND the count is cold, OR
+ // ICPAllowWarmFunc is false AND the count is NOT hot.
+ if ((ICPAllowWarmFunc && PSI->isColdCount(TotalCount)) ||
+ (!ICPAllowWarmFunc && !PSI->isHotCount(TotalCount))) {
+ LLVM_DEBUG(dbgs() << " TotalCount=" << TotalCount
+ << " is not large enough.\n");
+ continue;
+ }
+ }
auto PromotionCandidates = getPromotionCandidatesForCallSite(
*CB, ICallProfDataRef, TotalCount, NumCandidates);
diff --git a/llvm/test/ThinLTO/X86/memprof-icp.ll b/llvm/test/ThinLTO/X86/memprof-icp.ll
index dbc532ee52828..87e5435a4229d 100644
--- a/llvm/test/ThinLTO/X86/memprof-icp.ll
+++ b/llvm/test/ThinLTO/X86/memprof-icp.ll
@@ -229,6 +229,7 @@
; RUN: llvm-lto2 run %t/main.o %t/foo.o -enable-memprof-context-disambiguation \
; RUN: -import-instr-limit=0 \
; RUN: -memprof-require-definition-for-promotion \
+; RUN: -icp-allow-decl-only=false \
; RUN: -enable-memprof-indirect-call-support=true \
; RUN: -supports-hot-cold-new \
; RUN: -r=%t/foo.o,_Z3fooR2B0j,plx \
|
@llvm/pr-subscribers-pgo Author: None (xur-llvm) ChangesFull diff: https://github.com/llvm/llvm-project/pull/149892.diff 3 Files Affected:
diff --git a/llvm/lib/Analysis/ProfileSummaryInfo.cpp b/llvm/lib/Analysis/ProfileSummaryInfo.cpp
index e8d4e37a4eb7e..7f361f47576d3 100644
--- a/llvm/lib/Analysis/ProfileSummaryInfo.cpp
+++ b/llvm/lib/Analysis/ProfileSummaryInfo.cpp
@@ -121,8 +121,18 @@ void ProfileSummaryInfo::computeThresholds() {
ProfileSummaryBuilder::getHotCountThreshold(DetailedSummary);
ColdCountThreshold =
ProfileSummaryBuilder::getColdCountThreshold(DetailedSummary);
- assert(ColdCountThreshold <= HotCountThreshold &&
- "Cold count threshold cannot exceed hot count threshold!");
+ // When the hot and cold thresholds are identical, we would classify
+ // a count value as both hot and cold since we are doing an inclusitve check
+ // (see ::is{Hot|Cold}Count(). To avoid this undesirable overlap, ensure the
+ // thresholds are distinct.
+ if (HotCountThreshold == ColdCountThreshold) {
+ if (ColdCountThreshold > 0)
+ (*ColdCountThreshold)--;
+ else
+ (*HotCountThreshold)++;
+ }
+ assert(ColdCountThreshold < HotCountThreshold &&
+ "Cold count threshold should be less than hot count threshold!");
if (!hasPartialSampleProfile() || !ScalePartialSampleProfileWorkingSetSize) {
HasHugeWorkingSetSize =
HotEntry.NumCounts > ProfileSummaryHugeWorkingSetSizeThreshold;
diff --git a/llvm/lib/Transforms/Instrumentation/IndirectCallPromotion.cpp b/llvm/lib/Transforms/Instrumentation/IndirectCallPromotion.cpp
index 854db0ff6864c..eafac58f0580a 100644
--- a/llvm/lib/Transforms/Instrumentation/IndirectCallPromotion.cpp
+++ b/llvm/lib/Transforms/Instrumentation/IndirectCallPromotion.cpp
@@ -80,6 +80,18 @@ static cl::opt<unsigned>
ICPCSSkip("icp-csskip", cl::init(0), cl::Hidden,
cl::desc("Skip Callsite up to this number for this compilation"));
+// ICP the candidate function even when only a declaration is present.
+static cl::opt<bool>
+ ICPAllowDeclOnly("icp-allow-decl-only", cl::init(true), cl::Hidden,
+ cl::desc("Promote the target candidate even when the defintion "
+ " is not available"));
+
+// ICP all non-cold candidate functions. When it's false, only ICP hot functions.
+static cl::opt<bool>
+ ICPAllowWarmFunc("icp-allow-warm-func", cl::init(true), cl::Hidden,
+ cl::desc("Promote the target candidate even if it is not "
+ "hot"));
+
// Set if the pass is called in LTO optimization. The difference for LTO mode
// is the pass won't prefix the source module name to the internal linkage
// symbols.
@@ -477,7 +489,7 @@ IndirectCallPromoter::getPromotionCandidatesForCallSite(
// the case where the symbol is globally dead in the binary and removed by
// ThinLTO.
Function *TargetFunction = Symtab->getFunction(Target);
- if (TargetFunction == nullptr || TargetFunction->isDeclaration()) {
+ if (TargetFunction == nullptr) {
LLVM_DEBUG(dbgs() << " Not promote: Cannot find the target\n");
ORE.emit([&]() {
return OptimizationRemarkMissed(DEBUG_TYPE, "UnableToFindTarget", &CB)
@@ -486,6 +498,16 @@ IndirectCallPromoter::getPromotionCandidatesForCallSite(
});
break;
}
+ if (!ICPAllowDeclOnly && TargetFunction->isDeclaration()) {
+ LLVM_DEBUG(dbgs() << " Not promote: target definition is not available\n");
+ ORE.emit([&]() {
+ return OptimizationRemarkMissed(DEBUG_TYPE, "NoTargetDef", &CB)
+ << "Do not promote indirect call: target with md5sum "
+ << ore::NV("target md5sum", Target)
+ << " definition not available";
+ });
+ break;
+ }
const char *Reason = nullptr;
if (!isLegalToPromote(CB, TargetFunction, &Reason)) {
@@ -822,9 +844,18 @@ bool IndirectCallPromoter::processFunction(ProfileSummaryInfo *PSI) {
uint64_t TotalCount;
auto ICallProfDataRef = ICallAnalysis.getPromotionCandidatesForInstruction(
CB, TotalCount, NumCandidates);
- if (!NumCandidates ||
- (PSI && PSI->hasProfileSummary() && !PSI->isHotCount(TotalCount)))
+ if (!NumCandidates)
continue;
+ if (PSI && PSI->hasProfileSummary()) {
+ // Don't perform if ICPAllowWarmFunc is true AND the count is cold, OR
+ // ICPAllowWarmFunc is false AND the count is NOT hot.
+ if ((ICPAllowWarmFunc && PSI->isColdCount(TotalCount)) ||
+ (!ICPAllowWarmFunc && !PSI->isHotCount(TotalCount))) {
+ LLVM_DEBUG(dbgs() << " TotalCount=" << TotalCount
+ << " is not large enough.\n");
+ continue;
+ }
+ }
auto PromotionCandidates = getPromotionCandidatesForCallSite(
*CB, ICallProfDataRef, TotalCount, NumCandidates);
diff --git a/llvm/test/ThinLTO/X86/memprof-icp.ll b/llvm/test/ThinLTO/X86/memprof-icp.ll
index dbc532ee52828..87e5435a4229d 100644
--- a/llvm/test/ThinLTO/X86/memprof-icp.ll
+++ b/llvm/test/ThinLTO/X86/memprof-icp.ll
@@ -229,6 +229,7 @@
; RUN: llvm-lto2 run %t/main.o %t/foo.o -enable-memprof-context-disambiguation \
; RUN: -import-instr-limit=0 \
; RUN: -memprof-require-definition-for-promotion \
+; RUN: -icp-allow-decl-only=false \
; RUN: -enable-memprof-indirect-call-support=true \
; RUN: -supports-hot-cold-new \
; RUN: -r=%t/foo.o,_Z3fooR2B0j,plx \
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a description summarizing the changes. Also the changes need tests.
assert(ColdCountThreshold <= HotCountThreshold && | ||
"Cold count threshold cannot exceed hot count threshold!"); | ||
// When the hot and cold thresholds are identical, we would classify | ||
// a count value as both hot and cold since we are doing an inclusitve check |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/inclusitve/inclusive/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add test cases?
// (see ::is{Hot|Cold}Count(). To avoid this undesirable overlap, ensure the | ||
// thresholds are distinct. | ||
if (HotCountThreshold == ColdCountThreshold) { | ||
if (ColdCountThreshold > 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
simplify it to just bump the hot count threshold?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably not a good idea as this will lead tons of test changes. Also change existing opt behaviors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean bump it when the cold and hot thresholds equal -- that should not be frequent.
@@ -80,6 +80,18 @@ static cl::opt<unsigned> | |||
ICPCSSkip("icp-csskip", cl::init(0), cl::Hidden, | |||
cl::desc("Skip Callsite up to this number for this compilation")); | |||
|
|||
// ICP the candidate function even when only a declaration is present. | |||
static cl::opt<bool> | |||
ICPAllowDeclOnly("icp-allow-decl-only", cl::init(true), cl::Hidden, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should it be off initially to avoid behavior change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do want the behavior change, right?
continue; | ||
if (PSI && PSI->hasProfileSummary()) { | ||
// Don't perform if ICPAllowWarmFunc is true AND the count is cold, OR |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add early check: if (is_cold) continue;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
|
||
// ICP all non-cold candidate functions. When it's false, only ICP hot functions. | ||
static cl::opt<bool> | ||
ICPAllowWarmFunc("icp-allow-warm-func", cl::init(true), cl::Hidden, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
change it to 'icp-allow-hot-only' and set it to true by default.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
if (PSI && PSI->hasProfileSummary()) { | ||
// Don't perform if ICPAllowWarmFunc is true AND the count is cold, OR | ||
// ICPAllowWarmFunc is false AND the count is NOT hot. | ||
if ((ICPAllowWarmFunc && PSI->isColdCount(TotalCount)) || |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is cleaner to do this:
if (is_cold)
continue;
if (is_hot_only && !is_hot)
continue;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
If hot and cold thresholds are identical, both isColdCount() and isHotCount() can return true due to inclusive checks in these functions. This is undesirable. This patch ensures distinct hot and cold counter thresholds. An alternative fix is to make the check in isColdCount() exclusive. But that requires many test case changes.
884231d
to
66524b3
Compare
Integrated reviews from David and Teresa in the latest patchset. |
66524b3
to
b2ddd5c
Compare
@@ -80,6 +80,27 @@ static cl::opt<unsigned> | |||
ICPCSSkip("icp-csskip", cl::init(0), cl::Hidden, | |||
cl::desc("Skip Callsite up to this number for this compilation")); | |||
|
|||
// ICP the candidate function even when only a declaration is present. | |||
static cl::opt<bool> ICPAllowDeclOnly( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The naming can be confused as 'only allow decls (without definition)' while it means 'allow symbols with only declarations'. To avoid the confusion, rename it to just 'ICPAllowDecls'.
|
||
// If one target cannot be ICP'd, proceed with the remaining targets instead | ||
// of exiting the callsite | ||
static cl::opt<bool> ICPAllowAllocSkip( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ICPAllowCandidateSkip matches the meaning more.
; RUN: opt < %s -passes=pgo-icall-prom -icp-allow-decl-only=false -icp-allow-hot-only=true -icp-allow-skip=false -S -pass-remarks=pgo-icall-prom 2>&1 | FileCheck %s --check-prefix=REMARK1 | ||
; RUN: opt < %s -passes=pgo-icall-prom -icp-allow-decl-only=false -icp-allow-hot-only=false -icp-allow-skip=false -S -pass-remarks=pgo-icall-prom 2>&1 | FileCheck %s --check-prefixes=REMARK1 | ||
; RUN: opt < %s -passes=pgo-icall-prom -icp-allow-decl-only=false -icp-allow-hot-only=false -icp-allow-skip=true -S -pass-remarks=pgo-icall-prom 2>&1 | FileCheck %s --check-prefixes=REMARK1,REMARK3 | ||
; RUN: opt < %s -passes=pgo-icall-prom -icp-allow-hot-only=false -icp-allow-decl-only=true -icp-allow-skip=true -S -pass-remarks=pgo-icall-prom 2>&1 | FileCheck %s --check-prefixes=REMARK1,REMARK2,REMARK4,REMARK5 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: can you put the flags in the same order, so it is easier to compare the different RUN lines? I think they mostly are except for the final invocation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, when allow skip is true can you double check that the VP metadata is updated correctly? I.e. because unlike previously, we aren't always needing to remove the first N.
b2ddd5c
to
b1cbd3e
Compare
Updated based on David and Teresa reviews. |
Indirect-call promotion (ICP) has been adjusted with the following tunings. (1) Candidate functions can be now ICP'd even if only a declaration is present. (2) All non-cold candidate functions are now considered by ICP. Previously, only hot targets were considered. (3) If one target cannot be ICP'd, proceed with the remaining targets instead of exiting the callsite. In this patch, all tunings are disabled by default. They will be enabled in a subsequent patch.
b1cbd3e
to
d81f611
Compare
[ICP] Add a few tunings to indirect-call-promtion
Indirect-call promotion (ICP) has been adjusted with the following tunings.
(1) Candidate functions can be now ICP'd even if only a declaration is present.
(2) All non-cold candidate functions are now considered by ICP. Previously, only hot targets were considered.
(3) If one target cannot be ICP'd, proceed with the remaining targets instead of exiting the callsite.
This update hides all tunings under internal options and disables them by default. They'll be enabled in a later update. There'll also be another update to address the "not found" issue with indirect targets.