Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

VMR Validation Proposal #15354

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open

Conversation

mmitche
Copy link
Member

@mmitche mmitche commented Dec 19, 2024

To double check:


### Optional full-VMR validation is added for MSFT configurations

We will introduce the ability to trigger optional full VMR insertion validation on PRs. These will run a desired subset of the MSFT/source-only VMR builds and validation. We believe this is a net-net win overall for the product.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How will the results be used?
Let's assume a specific PR got these checks and they failed. Who is going to review these optional check results and what will be the next steps from there?


### Repositories retain their existing repo-specific validation but may tweak if necessary

Aside from source-build jobs, existing repository PR/CI validation will not be removed or altered. Existing testing setups can remain as-is. Most repositories build extremely similarly between current official builds and VMR component builds. However, we may identify areas where the PR testing setup is not representative of the VMR build setup and these differences allow product breaks to flow through. This is very similar to PR/Official Build differences today. When such gaps are identified for a repository, we will remedy like any other testing gap. This is the responsibility of the product teams.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is very similar to PR/Official Build differences today. When such gaps are identified for a repository, we will remedy like any other testing gap. This is the responsibility of the product teams.

Maybe this is the answer to my earlier question. Is it?


### VMR PR/CI builds a selected subset of verticals and scenarios

On insertion into a VMR, we will run a selected representative subset of scenarios. This representative subset may change over time as the product changes, however, a rough sketch may look like:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we will run a selected representative subset of scenarios.

How will these be defined? Or are these simply hand-picked legs from existing list?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should have said "How are these defined?" given the Completed status.

## Principles

- Repositories have invested heavily in repo-specific testing over the life of .NET. This validation must not be compromised.
- Validation failures that block PR merge should be actionable. This may mean that action needs to be taken in another repository. For example, a flow from runtime->aspnetcore with a bug can be dealt with in runtime.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For example, a flow from runtime->aspnetcore with a bug can be dealt with in runtime

What does a flow from runtime to aspnetcore mean? Does it mean a backflow of runtime packages from VMR to aspnetcore?


We will introduce the ability to trigger optional full VMR insertion validation on PRs. These will run a desired subset of the MSFT/source-only VMR builds and validation. We believe this is a net-net win overall for the product.

**Scheduling: Can be implemented now.**
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this mean there will be a similar darc-powered VMR sync happening like it happens now in SDK PRs?
Because I don't think the tooling is quite ready (but not far neither).

What happens now in SDK PRs is sort of a "paste" where we take the current SHA1 of the repo in the VMR, the SHA2 of the PR branch and we create patch SHA1_SHA2 in the repo and apply it onto the VMR (plus some patch operations, cloaking..).
But this will differ when we have the flat flow. The forward flow logic does something different based on what the last flow between the repos was and whether there is conflicting content.

The result is that the current way basically inserts/overwrites the repo contents in the VMR, the latter pushes new changes to the VMR but might be based off of an older commit of the target branch when conflicts with the tip prevent to apply the changes on top of it.

Which behaviour do we want for this insertion? I'd say the both can have cases where the VMR has some updates you won't be able to possibly include in your PR build. I'd say the latter is better because it will mimic more what will happen on a subsequent forward flow.


On insertion into a VMR, we will run a selected representative subset of scenarios. This representative subset may change over time as the product changes, however, a rough sketch may look like:
- Some set of source-only legs w/ scenario testing
- A full-stack matrix that sparsely covers at all OSs and architectures w/ scenario testing.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- A full-stack matrix that sparsely covers at all OSs and architectures w/ scenario testing.
- A full-stack matrix that sparsely covers all OSs and architectures w/ scenario testing.


### VMR PR/CI does not run repo-specific validation

It will be challenging to mimic the repo-specific testing setups in VMR PR/CI builds. In addition, the quantity of testing is unlikely to be practical for full VMR validation. Running repo unit tests in VMR PRs is not a goal of Unified Build.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would the following be valuable?

  • We have on-demand VMR PR pipelines - 1 per repo in the VMR - to enable additional validation when requested.
  • Each of these would be based on the same YAML file from the VMR and would contain an ID of a target repo pipeline (e.g. dotnet-runtime) which exists and runs in the original repo.
  • This VMR pipeline would trigger the target pipeline, pass to it the VMR SHA.
  • The repo pipeline would have a special checkout step that would run when the pipeline is triggered remotely like this
  • The checkout step replaces the checked out sources with the contents from the VMR and the build proceeds as usual.

Maybe it's a cheap way to get a lot of value?

## Current State

Broadly, the following validation currently takes place:
- Repositories that participate source build (i.e. this would exclude windowsdesktop, winforms, etc.) have a repo source-build job that runs on PRs and CI. This job performs an *approximation* of the source-only build that would take place in the VMR when the code reaches the SDK. This job **is** valuable at times. It is helpful in catching new prebuilts introduced by repository changes as well as some set of build differences that appear when `DotNetBuildSourceOnly == true`. This is especially important because the time betweeen a repo check-in and VMR/source-build insertion has historically been quite long. It is important to detect issues early. However, this job is also fragile. The same prebuilt detection will catch package restores that would be provided by VMR built package flow (e.g. runtime->roslyn) or previously source built artifacts, requiring costly analysis and baseline updates. Furthermore, the toolset used to may be different between the VMR and the isolated repository, leading to confusion, especially around target frameworks.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would also mention TFMs here. Most of the time what is targeted in the repo source-build leg is different from the TFM that is being built in the product source-build. This is especially true for repos that dual-insert.

Copy link
Member

@ViktorHofer ViktorHofer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is very well written. Thanks for spending your time on this.

@ericstj ericstj self-requested a review January 7, 2025 16:49
The purpose of this document is to lay out a strategy for:
- Validating code flow into the VMR from consituent repositories.
- Validating changes made directly in the VMR.
- Validating constituent repository changes so that they are less likely to break in the VMR.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we also be validating that the VMR build is fresh and that repos don't introduce new stale binaries (prebuilts rather than VMR build output)?

- A set of scenario tests run against the VMR output in the VMR builds in the SDK. These scenario tests are designed to cover a broad swath of functionality against a finished product, rather than a repository build layout.
- Teams run manual validation on staged artifacts.

## Proposed Validation
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There will be another very short leg running in the VMR validating which files are changed:
dotnet/arcade-services#2950

Not sure we want to call it out in this doc as well?

## Current State

Broadly, the following validation currently takes place:
- Repositories that participate source build (i.e. this would exclude windowsdesktop, winforms, etc.) have a repo source-build job that runs on PRs and CI. This job performs an *approximation* of the source-only build that would take place in the VMR when the code reaches the SDK. This job **is** valuable at times. It is helpful in catching new prebuilts introduced by repository changes as well as some set of build differences that appear when `DotNetBuildSourceOnly == true`. This is especially important because the time betweeen a repo check-in and VMR/source-build insertion has historically been quite long. It is important to detect issues early. However, this job is also fragile. The same prebuilt detection will catch package restores that would be provided by VMR built package flow (e.g. runtime->roslyn) or previously source built artifacts, requiring costly analysis and baseline updates. Furthermore, the toolset used to may be different between the VMR and the isolated repository, leading to confusion, especially around target frameworks.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- Repositories that participate source build (i.e. this would exclude windowsdesktop, winforms, etc.) have a repo source-build job that runs on PRs and CI. This job performs an *approximation* of the source-only build that would take place in the VMR when the code reaches the SDK. This job **is** valuable at times. It is helpful in catching new prebuilts introduced by repository changes as well as some set of build differences that appear when `DotNetBuildSourceOnly == true`. This is especially important because the time betweeen a repo check-in and VMR/source-build insertion has historically been quite long. It is important to detect issues early. However, this job is also fragile. The same prebuilt detection will catch package restores that would be provided by VMR built package flow (e.g. runtime->roslyn) or previously source built artifacts, requiring costly analysis and baseline updates. Furthermore, the toolset used to may be different between the VMR and the isolated repository, leading to confusion, especially around target frameworks.
- Repositories that participate in source build (i.e. this would exclude windowsdesktop, winforms, etc.) have a repo source-build job that runs on PRs and CI. This job performs an *approximation* of the source-only build that would take place in the VMR when the code reaches the SDK. This job **is** valuable at times. It is helpful in catching new prebuilts introduced by repository changes as well as some set of build differences that appear when `DotNetBuildSourceOnly == true`. This is especially important because the time betweeen a repo check-in and VMR/source-build insertion has historically been quite long. It is important to detect issues early. However, this job is also fragile. The same prebuilt detection will catch package restores that would be provided by VMR built package flow (e.g. runtime->roslyn) or previously source built artifacts, requiring costly analysis and baseline updates. Furthermore, the toolset used may be different between the VMR and the isolated repository, leading to confusion, especially around target frameworks.


VMR PR/CI will lean heavily on scenario testing for correctness validation. These scenario tests are designed to cover large swaths of product functionality, rather than focusing on a specific component. Where necessary, scenario test quality will be increased.

**Scheduling: NA, **
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something is messed up with the formatting here. In rendered MD in GH the asterisks are appearing. Possibly from the training comma?

I see work remaining here. The Scenario Tests are lacking in many areas today. I feel like this hasn't been taken serious enough yet by repo owners to identify adequate tests.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the closing ** can't have a space in front of it?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants