Skip to content

Commit 6d4f0e2

Browse files
authored
Merge pull request #1280 from rylev/artifact-comparison
Make the term 'artifact comparison' consistent
2 parents c7cce59 + 7379c8a commit 6d4f0e2

File tree

3 files changed

+48
-59
lines changed

3 files changed

+48
-59
lines changed

docs/glossary.md

Lines changed: 3 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -8,18 +8,14 @@ The following is a glossary of domain specific terminology. Although benchmarks
88
* **profile**: a [cargo profile](https://doc.rust-lang.org/cargo/reference/profiles.html). Note: the database uses "opt" whereas cargo uses "release".
99
* **scenario**: The scenario under which a user is compiling their code. Currently, this is the incremental cache state and an optional change in the source since last compilation (e.g., full incremental cache and a `println!` statement is added).
1010
* **metric**: a name of a quantifiable metric being measured (e.g., instruction count)
11-
* **artifact**: a specific version of rustc (usually a commit sha or some sort of human readable "tag" like 1.51.0)
11+
* **artifact**: a specific rustc binary labeled by some identifier tag (usually a commit sha or some sort of human readable id like "1.51.0" or "test")
1212
* **category**: a high-level group of benchmarks. Currently, there are three categories, primary (mostly real-world crates), secondary (mostly stress tests), and stable (old real-world crates, only used for the dashboard).
1313

1414
## Benchmarks
1515

1616
* **stress test benchmark**: a benchmark that is specifically designed to stress a certain part of the compiler. For example, [projection-caching](https://github.com/rust-lang/rustc-perf/tree/master/collector/benchmarks/projection-caching) stresses the compiler's projection caching mechanisms.
1717
* **real world benchmark**: a benchmark based on a real world crate. These are typically copied as-is from crates.io. For example, [serde](https://github.com/rust-lang/rustc-perf/tree/master/collector/benchmarks/serde-1.0.136) is a popular crate and the benchmark has not been altered from a release of serde on crates.io.
1818

19-
## Artifacts
20-
21-
* **artifact tag**: an identifier for a particular artifact (e.g., the tag "1.51.0" would presumably point to the 1.51.0 release of rustc).
22-
2319
## Testing
2420

2521
* **test case**: a combination of a benchmark, a profile, and a scenario.
@@ -33,7 +29,8 @@ The following is a glossary of domain specific terminology. Although benchmarks
3329

3430
## Analysis
3531

36-
* **test result comparison**: the delta between two test results for the same test case but (optionally) different artifacts. The [comparison page](https://perf.rust-lang.org/compare.html) lists all the test result comparisons as percentages between two runs.
32+
* **artifact comparisons**: the comparison of two artifacts. This is composed of many test result comparisons. The [comparison page](https://perf.rust-lang.org/compare.html) shows a single artifact comparison between two artifacts.
33+
* **test result comparison**: the delta between two test results for the same test case but different artifacts. The [comparison page](https://perf.rust-lang.org/compare.html) lists all the test result comparisons as percentages between two runs.
3734
* **significance threshold**: the threshold at which a test result comparison is considered "significant" (i.e., a real change in performance and not just noise). You can see how this is calculated [here](https://github.com/rust-lang/rustc-perf/blob/master/docs/comparison-analysis.md#what-makes-a-test-result-significant).
3835
* **significant test result comparison**: a test result comparison above the significance threshold. Significant test result comparisons can be thought of as being "statistically significant".
3936
* **relevant test result comparison**: a test result comparison can be significant but still not be relevant (i.e., worth paying attention to). Relevance is a factor of the test result comparison's significance and magnitude. Comparisons are considered relevant if they are significant and have at least a small magnitude .

site/src/comparison.rs

Lines changed: 41 additions & 49 deletions
Original file line numberDiff line numberDiff line change
@@ -111,7 +111,7 @@ pub async fn handle_compare(
111111
let benchmark_map = conn.get_benchmarks().await;
112112

113113
let comparisons = comparison
114-
.statistics
114+
.comparisons
115115
.into_iter()
116116
.map(|comparison| api::comparison::Comparison {
117117
benchmark: comparison.benchmark.to_string(),
@@ -145,7 +145,7 @@ pub async fn handle_compare(
145145

146146
async fn populate_report(
147147
ctxt: &SiteCtxt,
148-
comparison: &Comparison,
148+
comparison: &ArtifactComparison,
149149
report: &mut HashMap<Direction, Vec<String>>,
150150
) {
151151
let benchmark_map = ctxt.get_benchmark_category_map().await;
@@ -177,7 +177,7 @@ async fn populate_report(
177177
/// A summary of a given comparison
178178
///
179179
/// This summary only includes changes that are significant and relevant (as determined by a change's magnitude).
180-
pub struct ComparisonSummary {
180+
pub struct ArtifactComparisonSummary {
181181
/// Relevant comparisons ordered by magnitude from largest to smallest
182182
relevant_comparisons: Vec<TestResultComparison>,
183183
/// The cached number of comparisons that are improvements
@@ -186,13 +186,13 @@ pub struct ComparisonSummary {
186186
num_regressions: usize,
187187
}
188188

189-
impl ComparisonSummary {
190-
pub fn summarize_comparison(comparison: &Comparison) -> Self {
189+
impl ArtifactComparisonSummary {
190+
pub fn summarize(comparison: &ArtifactComparison) -> Self {
191191
let mut num_improvements = 0;
192192
let mut num_regressions = 0;
193193

194194
let mut comparisons = comparison
195-
.statistics
195+
.comparisons
196196
.iter()
197197
.filter(|c| c.is_relevant())
198198
.inspect(|c| {
@@ -213,7 +213,7 @@ impl ComparisonSummary {
213213
};
214214
comparisons.sort_by(cmp);
215215

216-
ComparisonSummary {
216+
ArtifactComparisonSummary {
217217
relevant_comparisons: comparisons,
218218
num_improvements,
219219
num_regressions,
@@ -334,14 +334,6 @@ impl ComparisonSummary {
334334
.filter(|c| c.is_regression())
335335
}
336336

337-
fn num_changes(&self) -> usize {
338-
self.relevant_comparisons.len()
339-
}
340-
341-
fn largest_change(&self) -> Option<&TestResultComparison> {
342-
self.relevant_comparisons.first()
343-
}
344-
345337
fn largest_improvement(&self) -> Option<&TestResultComparison> {
346338
self.relevant_comparisons
347339
.iter()
@@ -367,9 +359,9 @@ impl ComparisonSummary {
367359
}
368360

369361
async fn write_triage_summary(
370-
comparison: &Comparison,
371-
primary: &ComparisonSummary,
372-
secondary: &ComparisonSummary,
362+
comparison: &ArtifactComparison,
363+
primary: &ArtifactComparisonSummary,
364+
secondary: &ArtifactComparisonSummary,
373365
) -> String {
374366
use std::fmt::Write;
375367
let mut result = if let Some(pr) = comparison.b.pr {
@@ -393,8 +385,8 @@ async fn write_triage_summary(
393385

394386
/// Writes a Markdown table containing summary of relevant results.
395387
pub fn write_summary_table(
396-
primary: &ComparisonSummary,
397-
secondary: &ComparisonSummary,
388+
primary: &ArtifactComparisonSummary,
389+
secondary: &ArtifactComparisonSummary,
398390
with_footnotes: bool,
399391
result: &mut String,
400392
) {
@@ -512,7 +504,7 @@ pub async fn compare(
512504
end: Bound,
513505
stat: String,
514506
ctxt: &SiteCtxt,
515-
) -> Result<Option<Comparison>, BoxedError> {
507+
) -> Result<Option<ArtifactComparison>, BoxedError> {
516508
let master_commits = &ctxt.get_master_commits().commits;
517509

518510
compare_given_commits(start, end, stat, ctxt, master_commits).await
@@ -525,7 +517,7 @@ async fn compare_given_commits(
525517
stat: String,
526518
ctxt: &SiteCtxt,
527519
master_commits: &[collector::MasterCommit],
528-
) -> Result<Option<Comparison>, BoxedError> {
520+
) -> Result<Option<ArtifactComparison>, BoxedError> {
529521
let idx = ctxt.index.load();
530522
let a = ctxt
531523
.artifact_id_for_bound(start.clone(), true)
@@ -553,7 +545,7 @@ async fn compare_given_commits(
553545

554546
let mut historical_data =
555547
HistoricalDataMap::calculate(ctxt, a.clone(), master_commits, stat).await?;
556-
let statistics = statistics_for_a
548+
let comparisons = statistics_for_a
557549
.into_iter()
558550
.filter_map(|(test_case, a)| {
559551
statistics_for_b
@@ -574,10 +566,10 @@ async fn compare_given_commits(
574566
errors_in_b.remove(&name);
575567
}
576568

577-
Ok(Some(Comparison {
569+
Ok(Some(ArtifactComparison {
578570
a: ArtifactDescription::for_artifact(&*conn, a.clone(), master_commits).await,
579571
b: ArtifactDescription::for_artifact(&*conn, b.clone(), master_commits).await,
580-
statistics,
572+
comparisons,
581573
newly_failed_benchmarks: errors_in_b.into_iter().collect(),
582574
}))
583575
}
@@ -720,16 +712,16 @@ impl From<ArtifactDescription> for api::comparison::ArtifactDescription {
720712

721713
// A comparison of two artifacts
722714
#[derive(Clone)]
723-
pub struct Comparison {
715+
pub struct ArtifactComparison {
724716
pub a: ArtifactDescription,
725717
pub b: ArtifactDescription,
726-
/// Statistics based on test case
727-
pub statistics: HashSet<TestResultComparison>,
718+
/// Test result comparisons between the two artifacts
719+
pub comparisons: HashSet<TestResultComparison>,
728720
/// A map from benchmark name to an error which occured when building `b` but not `a`.
729721
pub newly_failed_benchmarks: HashMap<String, String>,
730722
}
731723

732-
impl Comparison {
724+
impl ArtifactComparison {
733725
/// Gets the previous commit before `a`
734726
pub fn prev(&self, master_commits: &[collector::MasterCommit]) -> Option<String> {
735727
prev_commit(&self.a.artifact, master_commits).map(|c| c.sha.clone())
@@ -758,13 +750,13 @@ impl Comparison {
758750
next_commit(&self.b.artifact, master_commits).map(|c| c.sha.clone())
759751
}
760752

761-
/// Splits comparison into primary and secondary summaries based on benchmark category
753+
/// Splits an artifact comparison into primary and secondary summaries based on benchmark category
762754
pub fn summarize_by_category(
763755
self,
764756
category_map: HashMap<Benchmark, Category>,
765-
) -> (ComparisonSummary, ComparisonSummary) {
757+
) -> (ArtifactComparisonSummary, ArtifactComparisonSummary) {
766758
let (primary, secondary) = self
767-
.statistics
759+
.comparisons
768760
.into_iter()
769761
.partition(|s| category_map.get(&s.benchmark()) == Some(&Category::Primary));
770762

@@ -773,21 +765,21 @@ impl Comparison {
773765
.into_iter()
774766
.partition(|(b, _)| category_map.get(&b.as_str().into()) == Some(&Category::Primary));
775767

776-
let primary = Comparison {
768+
let primary = ArtifactComparison {
777769
a: self.a.clone(),
778770
b: self.b.clone(),
779-
statistics: primary,
771+
comparisons: primary,
780772
newly_failed_benchmarks: primary_errors,
781773
};
782-
let secondary = Comparison {
774+
let secondary = ArtifactComparison {
783775
a: self.a,
784776
b: self.b,
785-
statistics: secondary,
777+
comparisons: secondary,
786778
newly_failed_benchmarks: secondary_errors,
787779
};
788780
(
789-
ComparisonSummary::summarize_comparison(&primary),
790-
ComparisonSummary::summarize_comparison(&secondary),
781+
ArtifactComparisonSummary::summarize(&primary),
782+
ArtifactComparisonSummary::summarize(&secondary),
791783
)
792784
}
793785
}
@@ -1166,7 +1158,7 @@ Triage done by **@???**.
11661158
Revision range: [{first_commit}..{last_commit}](https://perf.rust-lang.org/?start={first_commit}&end={last_commit}&absolute=false&stat=instructions%3Au)
11671159
11681160
{num_regressions} Regressions, {num_improvements} Improvements, {num_mixed} Mixed; ??? of them in rollups
1169-
{num_comparisons} comparisons made in total
1161+
{num_comparisons} artifact comparisons made in total
11701162
11711163
#### Regressions
11721164
@@ -1407,13 +1399,13 @@ mod tests {
14071399

14081400
// (category, before, after)
14091401
fn check_table(values: Vec<(Category, f64, f64)>, expected: &str) {
1410-
let mut primary_statistics = HashSet::new();
1411-
let mut secondary_statistics = HashSet::new();
1402+
let mut primary_comparisons = HashSet::new();
1403+
let mut secondary_comparisons = HashSet::new();
14121404
for (index, (category, before, after)) in values.into_iter().enumerate() {
14131405
let target = if category == Category::Primary {
1414-
&mut primary_statistics
1406+
&mut primary_comparisons
14151407
} else {
1416-
&mut secondary_statistics
1408+
&mut secondary_comparisons
14171409
};
14181410

14191411
target.insert(TestResultComparison {
@@ -1425,16 +1417,16 @@ mod tests {
14251417
});
14261418
}
14271419

1428-
let primary = create_summary(primary_statistics);
1429-
let secondary = create_summary(secondary_statistics);
1420+
let primary = create_summary(primary_comparisons);
1421+
let secondary = create_summary(secondary_comparisons);
14301422

14311423
let mut result = String::new();
14321424
write_summary_table(&primary, &secondary, true, &mut result);
14331425
assert_eq!(result, expected);
14341426
}
14351427

1436-
fn create_summary(statistics: HashSet<TestResultComparison>) -> ComparisonSummary {
1437-
let comparison = Comparison {
1428+
fn create_summary(comparisons: HashSet<TestResultComparison>) -> ArtifactComparisonSummary {
1429+
let comparison = ArtifactComparison {
14381430
a: ArtifactDescription {
14391431
artifact: ArtifactId::Tag("a".to_string()),
14401432
pr: None,
@@ -1447,9 +1439,9 @@ mod tests {
14471439
bootstrap: Default::default(),
14481440
bootstrap_total: 0,
14491441
},
1450-
statistics,
1442+
comparisons,
14511443
newly_failed_benchmarks: Default::default(),
14521444
};
1453-
ComparisonSummary::summarize_comparison(&comparison)
1445+
ArtifactComparisonSummary::summarize(&comparison)
14541446
}
14551447
}

site/src/github.rs

Lines changed: 4 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,5 @@
11
use crate::api::github::Issue;
2-
use crate::comparison::{write_summary_table, ComparisonSummary, Direction, Magnitude};
2+
use crate::comparison::{write_summary_table, ArtifactComparisonSummary, Direction, Magnitude};
33
use crate::load::{Config, SiteCtxt, TryCommit};
44

55
use anyhow::Context as _;
@@ -632,8 +632,8 @@ async fn summarize_run(ctxt: &SiteCtxt, commit: QueuedCommit, is_master_commit:
632632
}
633633

634634
fn next_steps(
635-
primary: ComparisonSummary,
636-
secondary: ComparisonSummary,
635+
primary: ArtifactComparisonSummary,
636+
secondary: ArtifactComparisonSummary,
637637
direction: Option<Direction>,
638638
is_master_commit: bool,
639639
) -> String {
@@ -705,7 +705,7 @@ compiler perf.{next_steps}
705705
)
706706
}
707707

708-
fn generate_short_summary(summary: &ComparisonSummary) -> String {
708+
fn generate_short_summary(summary: &ArtifactComparisonSummary) -> String {
709709
// Add an "s" to a word unless there's only one.
710710
fn ending(word: &'static str, count: usize) -> std::borrow::Cow<'static, str> {
711711
if count == 1 {

0 commit comments

Comments
 (0)