@Britel Britel commented Jun 6, 2025

No description provided.

Jiri Appl and others added 30 commits May 6, 2025 17:43
# 🔍 Description

VM servicing prints the last serial log line on failure to help diagnose issues. For Azure VM serial logs, however, that line was sometimes just a lone `"`. This PR fixes the filtering logic to remove these unhelpful lines.

# 🤔 Rationale

Improve diagnosability of pipeline failures.

Related work items: #12095
… getting logs with single...

# 🔍 Description

No longer need to publish the original log, as the issue has been resolved.

Related work items: #12095
# 🔍 Description

Build actual image clones instead of doing the hacky FS UUID replacement

Depends on !23032

----
#### AI description  (iteration 1)
#### PR Classification
Enhancement of end-to-end (e2e) testing infrastructure.

#### PR Summary
This pull request introduces a new build template for runtime images in e2e tests, improving the testing process by building clones. It also involves restructuring existing templates for better organization and clarity.
- Added a new file `/.pipelines/templates/stages/build_image/build-runtime.yml` to define parameters and stages for building runtime images.
- Updated `/.pipelines/templates/e2e-template.yml` to use the new `build-runtime.yml` template for building various Trident test images.
- Moved `/.pipelines/templates/stages/build_image/trident-testimg.yml` to `/.pipelines/templates/stages/build_image/build-installer.yml` without code changes.

Related work items: #12138
#### AI description  (iteration 1)
#### PR Classification
Bug fix: Resolve the parsing error for Prism history when partition size is null.

#### PR Summary
This pull request fixes the parsing issue by handling null partition sizes during offline initialization. The changes include:
- In `src/offline_init/mod.rs`, updating the partition size logic to use a match on an optional value and defaulting to `PartitionSize::Grow` when size is missing.
- In the test suite within `src/offline_init/mod.rs`, adding a test that validates the successful deserialization of Prism history JSON.
- In the `PrismPartition` structure, changing the type of `size` from `String` to `Option<String>` for more robust parsing.
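
The defaulting logic described above can be sketched as follows. This is a minimal illustration, not the code from `src/offline_init/mod.rs`: the `PartitionSize::Fixed` variant and both helper functions are hypothetical, and the size is assumed to be a plain byte count.

```rust
/// Hypothetical stand-in for the partition size type named in the PR.
#[derive(Debug, PartialEq)]
enum PartitionSize {
    /// Take up whatever space remains on the disk.
    Grow,
    /// A fixed size in bytes (assumed variant, for illustration only).
    Fixed(u64),
}

/// Hypothetical parser: accepts a plain byte count such as "1048576".
fn parse_size(raw: &str) -> Result<PartitionSize, String> {
    raw.trim()
        .parse::<u64>()
        .map(PartitionSize::Fixed)
        .map_err(|e| format!("invalid partition size '{raw}': {e}"))
}

/// Maps the optional size from the history JSON to a `PartitionSize`,
/// defaulting to `Grow` when the field is null or missing.
fn partition_size(size: Option<&str>) -> Result<PartitionSize, String> {
    match size {
        Some(raw) => parse_size(raw),
        None => Ok(PartitionSize::Grow),
    }
}
```

With this shape, `partition_size(None)` yields `PartitionSize::Grow` instead of a parsing error, which is the behavior the fix describes.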

Related work items: #12145
#### AI description  (iteration 1)
#### PR Classification
Bug fix addressing lsblk output parsing failures.

#### PR Summary
This PR fixes lsblk parsing errors by adding support for unknown partition table types.
- `/osutils/src/lsblk.rs`: Added an `Unknown` enum variant annotated with `#[serde(other)]` to gracefully handle unsupported partition table types.
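
For readers unfamiliar with `#[serde(other)]`: it makes serde deserialize any unrecognized string into the marked unit variant. The sketch below reproduces that fallback by hand, without serde, so that it compiles with the standard library alone. The enum name, the `Gpt` and `Dos` variants, and the function are assumptions for illustration and are not taken from `lsblk.rs`.

```rust
/// Illustrative partition table type; only `Unknown` is named in the PR.
#[derive(Debug, PartialEq)]
enum PartitionTableType {
    Gpt,
    Dos,
    /// Catch-all for any value this code does not recognize.
    Unknown,
}

/// Hand-written equivalent of a `#[serde(other)]` fallback:
/// known strings map to their variants, everything else to `Unknown`.
fn parse_pttype(raw: &str) -> PartitionTableType {
    match raw {
        "gpt" => PartitionTableType::Gpt,
        "dos" => PartitionTableType::Dos,
        _ => PartitionTableType::Unknown,
    }
}
```

The point of the catch-all arm is that an unexpected table type no longer aborts parsing of the whole lsblk output.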

Related work items: #12139
… space

# 🔍 Description

Clarifies the error message when a filesystem requires more space than the underlying block device, by including file system mount point and block device size.

----
#### AI description  (iteration 1)
#### PR Classification
Bug fix to correct the error message when the filesystem requires more space than available on the block device.

#### PR Summary
This pull request updates the error message to provide more detailed information about the filesystem and block device size mismatch.
- `trident_api/src/error.rs`: Modified the error message to include the mount point and provide a clearer comparison between filesystem size and block device size.
- `src/subsystems/storage/osimage.rs`: Updated the error handling to include the mount point and convert sizes to human-readable format for better clarity.
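
A rough sketch of what such a message could look like is shown below. The wording of the message, the function names, and the binary-unit formatting are all assumptions; the actual text in `trident_api/src/error.rs` may differ.

```rust
/// Formats a byte count with binary units (B, KiB, MiB, GiB, TiB).
fn human_readable(bytes: u64) -> String {
    const UNITS: [&str; 5] = ["B", "KiB", "MiB", "GiB", "TiB"];
    let mut value = bytes as f64;
    let mut unit = 0;
    // Divide by 1024 until the value fits the current unit.
    while value >= 1024.0 && unit < UNITS.len() - 1 {
        value /= 1024.0;
        unit += 1;
    }
    if unit == 0 {
        format!("{bytes} B")
    } else {
        format!("{value:.1} {}", UNITS[unit])
    }
}

/// Hypothetical error text that names the mount point and both sizes.
fn too_small_message(mount_point: &str, fs_bytes: u64, device_bytes: u64) -> String {
    format!(
        "Filesystem mounted at '{mount_point}' requires {} but the block device only provides {}",
        human_readable(fs_bytes),
        human_readable(device_bytes)
    )
}
```

Including the mount point lets the reader tell which filesystem in the host configuration is oversized without cross-referencing device IDs.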

Related work items: #12067
# 🔍 Description

COSI 1.1 revision to add bootloader metadata and resolve some things that annoy me.

----
#### AI description  (iteration 1)
#### PR Classification
Documentation update for the revision of the Composable OS Image (COSI) specification to version 1.1.

#### PR Summary
This pull request revises the COSI specification document to include new fields and objects related to bootloader configuration and filesystem metadata. It introduces version 1.1 of the specification with detailed descriptions of new components.
- `Composable-OS-Image.md`: Added `Bootloader` object and related enums and objects for bootloader configuration, including `BootloaderType`, `SystemDBoot`, and `SystemDBootEntry`.
- `Composable-OS-Image.md`: Updated metadata schema to include `bootloader` and `filesystems` fields, with `images` now deprecated as an alias for `filesystems`.
- `Composable-OS-Image.md`: Revised `VerityConfig` and `ImageFile` objects to include versioning information.
- `Composable-OS-Image.md`: Updated `OsPackage` object to include versioning information for fields.

Related work items: #12113
Return an `ExitKind` enum all the way to the main function before running reboot. This way all the spans will be closed and all state will be dropped before the system goes down

----
#### AI description  (iteration 1)
#### PR Classification
New feature: Implemented a mechanism to ensure the system unwinds to the main state before initiating a reboot.

#### PR Summary
This pull request introduces a new feature that ensures the system unwinds to the main state before starting a reboot, enhancing system stability and error handling.
- Modified `src/lib.rs` to introduce `ExitKind` enum for handling operations that require a reboot and updated error handling logic.
- Updated `src/main.rs` to handle the new `ExitKind` return type, ensuring proper reboot execution.
- Adjusted `src/engine/clean_install.rs` and `src/engine/update.rs` to return `ExitKind` for operations, indicating when a reboot is necessary.
- Enhanced the `Trident` implementation to manage host configuration updates and reboots effectively.
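
The unwinding pattern can be sketched roughly as below. Only the `ExitKind` name comes from the PR; the variant names, the `Span` guard, and both functions are hypothetical and exist only to show that scoped state is dropped before the reboot is triggered.

```rust
/// What the top-level caller should do once an operation has finished.
/// Variant names are assumptions; the PR only names the enum.
#[derive(Debug, PartialEq)]
enum ExitKind {
    Done,
    NeedsReboot,
}

/// Toy guard that stands in for a tracing span or any other scoped state.
struct Span(&'static str);

impl Drop for Span {
    fn drop(&mut self) {
        println!("closing span: {}", self.0);
    }
}

/// Performs the operation and reports whether a reboot is needed,
/// instead of rebooting from deep inside the call stack.
fn run_operation(reboot_required: bool) -> Result<ExitKind, String> {
    let _span = Span("update");
    // ... the actual work would happen here ...
    if reboot_required {
        Ok(ExitKind::NeedsReboot)
    } else {
        Ok(ExitKind::Done)
    }
    // `_span` is dropped when this function returns, before any reboot.
}

/// Called from `main`: by this point all guards have been dropped.
/// Returns the process exit code.
fn handle_exit(result: Result<ExitKind, String>, reboot: impl FnOnce()) -> i32 {
    match result {
        Ok(ExitKind::Done) => 0,
        Ok(ExitKind::NeedsReboot) => {
            reboot();
            0
        }
        Err(err) => {
            eprintln!("operation failed: {err}");
            1
        }
    }
}
```

In this sketch `main` would call something like `handle_exit(run_operation(true), || { /* trigger reboot */ })`, so the reboot closure only runs after `run_operation` has returned and its guards are gone.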

Related work items: #12151
# 🔍 Description

Medium-level diagram of how Trident works.

![image (2).png](https://dev.azure.com/mariner-org/2311650c-e79e-4301-b4d2-96543fdd84ff/_apis/git/repositories/895b6b3d-5077-488a-8001-ab6b5a14c1a3/pullRequests/23111/attachments/image%20%282%29.png)

----
#### AI description  (iteration 1)
#### PR Classification
This pull request adds a new visual install flow diagram along with supporting tooling to generate Trident architecture diagrams.

#### PR Summary
The changes introduce an SVG-based install flow diagram and implement a new diagram rendering module driven by YAML definitions, while extending the CLI and build configuration to support these features.
- **`docs/resources/trident-install.svg`**: Added a new SVG file illustrating the Trident Install Flow.
- **`docbuilder/src/trident_arch/`**: Introduced new files (`render.rs`, `nodes.rs`, `diagrams/install.yaml`, and `mod.rs`) to parse YAML and render architecture diagrams.
- **`docbuilder/src/main.rs`**: Extended the CLI with a new `TridentArch` command and related options for generating diagrams.
- **Build configuration**: Updated `Cargo.toml`, `Cargo.lock`, and `Makefile` to include dependencies (e.g., `svg`, `textwrap`) and targets for diagram generation.
- **`docs/Explanation/Install-Flow.md`**: Added documentation referencing the new install flow diagram.

Related work items: #12162
# 🔍 Description

`trident-cicd-for-azl-preview` is broken in non-installer image builds. It seems that the AZL version is not being passed along when downloading the base image or the RPMs.

----
#### AI description  (iteration 1)
#### PR Classification
Bug fix for build configuration in dev-azl based builds.

#### PR Summary
This pull request fixes the build pipeline configuration by updating the image version handling in the runtime build stage.
- `/.pipelines/templates/stages/build_image/build-runtime.yml`: Changes `baseimgVersion` to use a variable substitution (`$(baseimgVersion)`) and adds `baseimgAzureLinuxVersion` to retain the original Azure Linux version reference.

Related work items: #12138
# 🔍 Description

Add simple e2e usr verity tests. Will add AB and other features in smaller follow ups.

Depends on !22700

----
#### AI description  (iteration 1)
#### PR Classification
This pull request introduces a new feature by adding configurations for e2e usr-verity tests.

#### PR Summary
The changes add dedicated configuration files and update pipeline targets to support usr-verity testing during end-to-end runs.
- `e2e_tests/trident_configurations/usr-verity/trident-config.yaml`: Added a new configuration file specifying storage, filesystem, and OS parameters for usr-verity.
- `/.pipelines/trident-pr-e2e.yml`: Updated the branch reference for test images to target a usr-verity specific branch.
- `e2e_tests/target-configurations.yaml`: Extended multiple test groups by appending the `usr-verity` target.
- `e2e_tests/trident_configurations/usr-verity/test-selection.yaml`: Added a new file listing compatible test bases for usr-verity.

Related work items: #12114
# 🔍 Description

I was annoyed at having to pass around a logger everywhere, so now we do proper stderr/stdout capturing instead.

Officially supported:
- Anything that writes to stderr
- Anything that writes to stdout (e.g. the `fmt.PrintX` function family)
- logrus

Other log providers may or may not be supported depending on how they resolve stderr. Logrus, for example, stores a reference to `os.Stderr` on startup and always writes to that. We have to manually swap it from underneath to capture it, which requires deliberate action. Other loggers may require similar support.

Related work items: #12187
# 🔍 Description

DEPENDS ON: !23139

First step toward moving storm out of trident and making it an independent module. This also means tools and the test suite now exist in the same module and can easily share code.

No behavior changes, just moving files around and updating imports.

----
#### AI description  (iteration 1)
#### PR Classification
This pull request refactors storm’s testing and logging infrastructure to decouple it into an independent module while enhancing output capturing.

#### PR Summary
The changes introduce a new mechanism in the test runner for capturing and forwarding output, simplify test case logging, and reorganize the project structure for module independence.
- **`storm/internal/runner/runner.go`**: Added a new `captureOutput` function to redirect stdout/stderr and integrate live output forwarding (via the new watch flag) into test executions.
- **`storm/internal/testmgr/testcase.go`**: Removed the buffered logger in favor of a `collectedOutput` field and updated accessors to return captured output.
- **`storm/internal/cli/run`**: Updated helper and scenario commands to support a `watch` flag, passing it to `RegisterAndRunTests` for live output.
- **File Movements**: Moved multiple helpers, utilities, and scenario files from storm directories to the tools structure to establish module independence.
- **Dependency Updates**: Revised go.mod and go.sum entries, including module renaming from argus_toolkit to tridenttools and updating several dependency versions.

Related work items: #12194
We currently fail if there are any unrecognized entries in the netlaunch config, so this is needed before https://dev.azure.com/mariner-org/ECF/_git/argus-toolkit/pullrequest/23145 lands

Related work items: #12199
# 🔍 Description

Add a feature matrix to our docs landing page.

----
#### AI description  (iteration 1)
#### PR Classification
Documentation update to include a feature matrix for Trident.

#### PR Summary
This pull request adds a detailed feature matrix to the Trident documentation, outlining the capabilities and planned features for managing Azure Linux systems.
- `docs/Trident.md`: Added a comprehensive feature matrix table detailing various categories such as Runtime, Bootloader, Lifecycle, Integrity, Storage, OS Config, SELinux, Customization, and Development, along with their support status across different lifecycle stages (Install, VM-Init, Update).

Related work items: #11446
… boot times from systemd-analyze and publish as metrics

# 🔍 Description

Use storm to create new metric using `systemd-analyze` to capture kernel/initrd/userspace/etc times.

Related work items: #12142
# 🔍 Description

We continue to get notifications about pypi.org. Try adding `127.0.0.1 pypi.org` to /etc/hosts to keep any process from reaching pypi.org. Maybe this will help determine what is reaching out to pypi.org?

requires:
* (merged) platform-pipelines: https://dev.azure.com/mariner-org/ECF/_git/platform-pipelines/pullrequest/23152

related to:
* (merged) test-images: https://dev.azure.com/mariner-org/ECF/_git/test-images/pullrequest/23151h

validation: https://dev.azure.com/mariner-org/ECF/_build/results?buildId=811288&view=results

Related work items: #10524
# 🔍 Description

What is this PR about? feature/doc/engineering/bug?

# 🤔 Rationale

Why is this PR needed?

# 📝 Checks

- [ ] Check [dev-docs/manual-validation.md](/dev-docs/manual-validation.md)

# 📌 Follow-ups

TODO:

- #0000

# 🗒️ Notes

Fix 'make download-runtime-images'

Related work items: #12207
# 🔍 Description

Previously we were scp'ing the images onto the host, but we already have netlisten running anyway, so we can simplify and use that instead.

Related work items: #12213
# 🔍 Description

A PR last week introduced a bug where the metrics file path is specified incorrectly.

----
#### AI description  (iteration 1)
#### PR Classification
Bug fix addressing an incorrect variable reference in the metrics file path.

#### PR Summary
This pull request corrects the variable used for specifying the metrics file path in the baremetal testing pipeline configuration, ensuring that the correct environment variable is referenced.
- `.pipelines/templates/stages/testing_baremetal/baremetal-testing.yml`: Changed the metrics file parameter from `$(tridentSourceDirectory)` to `$(TRIDENT_SOURCE_DIR)`.

Related work items: #12214
…I+RAID

# 🔍 Description

Swap places between root-verity and usr-verity to test the latter more thoroughly and reduce the scope of root-verity to just a limited scenario in preparation for incoming changes to the feature.

Depends on: !23171

----
#### AI description  (iteration 1)
#### PR Classification
This PR implements extensive test configuration updates for usr-verity, replacing previous verity settings with new usr-verity definitions.

#### PR Summary
The pull request revises multiple e2e test configuration files to support usr-verity testing by updating storage IDs, RAID groups, mount points, and image URLs.
- `e2e_tests/trident_configurations/combined/trident-config.yaml`: Renamed storage and RAID identifiers from root/verity to usr-based names and updated the image URL to usrverity.cosi.
- `e2e_tests/trident_configurations/usr-verity-raid/trident-config.yaml`: Added a new configuration file with usr-verity RAID settings, including updated partition and RAID definitions.
- `e2e_tests/trident_configurations/usr-verity/trident-config.yaml`: Revised partition layouts, volume pairs, and verity settings to align with the usr-verity naming convention.
- `e2e_tests/target-configurations.yaml`: Adjusted test suite mappings to replace legacy verity and verity-raid entries with usr-verity and root-verity targets.
- Various test-selection and other configuration files: Updated internal parameters and image URLs from verity.cosi to usrverity.cosi for consistent usr-verity testing.

Related work items: #12233
…ocal VMs without using BMC

This adds storm utility code for interacting with libvirt VMs and changes netlaunch to use it, rather than interacting over an emulated BMC.

It avoids the boot order hacks that we previously relied on and will make netlaunch VM testing closer to how baremetal machines behave. In the future, we will have the option to configure secureboot or set other UEFI variables from outside the VM before it starts.

----
#### AI description  (iteration 1)
#### PR Classification
This PR implements a new feature to bypass the BMC emulator and have netlaunch interact directly with local VMs.

#### PR Summary
The pull request updates netlaunch to use local VMs via libvirt for testing, reducing reliance on the BMC emulator.
- `tools/cmd/netlaunch/main.go`: Introduced a conditional branch to use a local VM (via config.Netlaunch.LocalVmUuid) for HTTP boot setup and VM start; retains the BMC code path as fallback.
- `tools/storm/utils/libvirt.go`: Added a new file with utilities to initialize, set boot URI, start, and disconnect from a VM using libvirt.
- Updated dependency versions in `tools/go.sum` and `tools/go.mod` to include libvirt packages and newer versions for related modules.
- `/.pipelines/templates/stages/testing_vm/netlaunch-testing.yml`: Adjusted the testing command to run netlaunch with the appropriate privileges for libvirt.

Related work items: #12199
# 🔍 Description

Missed this in the last PR

----
#### AI description  (iteration 1)
#### PR Classification
This PR disables a failing test configuration by removing the memory-constraint-combined test from active runs.

#### PR Summary
The pull request addresses the broken usr-verity configuration issue by updating and then disabling the memory-constraint-combined test setup.
- In `e2e_tests/trident_configurations/memory-constraint-combined/trident-config.yaml`, RAID array IDs, volume pair mappings, and filesystem mounts are revised.
- In `e2e_tests/target-configurations.yaml`, the memory-constraint-combined test is commented out with a TODO for re-enabling, ensuring it does not execute during CI.

Related work items: #12241
…systems isn't present

Related work items: #12032
# 🔍 Description

Matching changes for !23217, SHOULD MERGE IMMEDIATELY AFTER THAT PR.

----
#### AI description  (iteration 1)
#### PR Classification
This pull request is a cleanup that removes obsolete `virt-deploy run` commands.

#### PR Summary
The changes eliminate redundant calls to `virt-deploy run` across pipeline configuration, documentation, and test setup, aligning with the "Cleanup virt-deploy" work item.
- `/.pipelines/templates/stages/testing_vm/netlaunch-testing.yml`: Removed the `./virt-deploy run` command during the VM creation stage.
- `/dev-docs/validating-container.md`: Deleted the `./virt-deploy run` command from the container validation example.
- `/functional_tests/test_setup.py`: Eliminated the redundant execution of `virt-deploy run` in the test setup.

Related work items: #12247
# 🔍 Description

First stab at a solution to generate test lists out of the critical path.

Not ideal, but this solution requires minimal changes; we will iterate on it in the future.

----
#### AI description  (iteration 1)
#### PR Classification
This pull request implements an engineering enhancement to pre-generate the test matrix in the virtual machine testing pipeline.

#### PR Summary
The PR introduces a new stage that generates the test list for virtual machine testing and updates subsequent steps to depend on this pre-generated data, streamlining test configuration retrieval.
- `/.pipelines/templates/stages/testing_vm/netlaunch-testing.yml`: Added the `DefineTests_VM_${{ parameters.runtimeEnv }}` stage that invokes the `get-tests.yml` template with appropriate parameters for VM testing.
- `/.pipelines/templates/stages/testing_vm/netlaunch-testing.yml`: Modified the deployment testing stage to depend on the new test definition stage and updated the matrix reference to use outputs from the pre-generated test list.
# 🔍 Description

When using UKI, create RAID Arrays with `homehost=any` so that the runtime OS will adopt the arrays as its own without caring for the homehost in the metadata.

Follow ups:
- #12276
- #12277

----
#### AI description  (iteration 1)
#### PR Classification
New feature to enable RAID array creation with UKI support.

#### PR Summary
This PR introduces a new function for RAID array creation that accepts a homehost parameter and updates the RAID initialization logic to support UKI.
- `osutils/src/mdadm.rs`: Added a `create_homehost` function and modified `create_inner` to pass a `--homehost` argument when provided.
- `src/engine/storage/raid.rs`: Updated the RAID creation routine to conditionally invoke `create_homehost` based on the UKI support flag and imported the corresponding constant.
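
As a rough illustration of passing `--homehost` only when requested, the sketch below builds the argument list for an `mdadm --create` call. The function name and parameters are hypothetical and do not mirror `create_inner`; only the `--homehost` argument and the `any` value come from the PR.

```rust
/// Builds the arguments for `mdadm --create`, adding `--homehost=<value>`
/// only when a homehost is supplied (hypothetical helper).
fn mdadm_create_args(
    array: &str,
    level: &str,
    devices: &[&str],
    homehost: Option<&str>,
) -> Vec<String> {
    let mut args = vec![
        "--create".to_string(),
        array.to_string(),
        format!("--level={level}"),
        format!("--raid-devices={}", devices.len()),
    ];
    // With UKI, the PR passes "any" so the runtime OS adopts the array
    // regardless of the homehost recorded in the metadata.
    if let Some(host) = homehost {
        args.push(format!("--homehost={host}"));
    }
    // Member devices go last.
    args.extend(devices.iter().map(|d| d.to_string()));
    args
}
```

A caller could then choose `Some("any")` when UKI support is enabled and `None` otherwise, leaving the non-UKI command line unchanged.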

Related work items: #12232
# 🔍 Description

Turns out that the test does not work on bare metal, so disabling it for now, until it is fixed, to stop it from failing the pipeline.

Work to re-enable:

-  #12291

Related work items: #12289
ayaegashi and others added 27 commits June 13, 2025 16:01
# 🔍 Description

Enable `enforcing` mode on all e2e tests that use usrverity.

Passing e2e tests: https://dev.azure.com/mariner-org/ECF/_build/results?buildId=835091&view=results

----
#### AI description  (iteration 1)
#### PR Classification
This PR implements a configuration update to enable enforcing SELinux mode for usrverity tests.

#### PR Summary
The pull request introduces SELinux configuration support and updates associated policy rules to enforce the new mode during usrverity tests.
- `src/subsystems/osconfig/mod.rs`: Added a code block to update SELinux settings when UKI support is enabled.
- `osutils/src/osmodifier.rs`: Extended the OSModifierConfig struct to include a SELinux configuration field.
- `selinux-policy-trident/trident.te`: Modified multiple SELinux policy rules (including new capabilities, file permissions, and type definitions) to support enforcing mode.
- `e2e_tests/trident_configurations/*/trident-config.yaml`: Updated test configuration files to set SELinux mode to enforcing.

Related work items: #12482
…ed to enforcing

# 🔍 Description

e2e test (currently running): https://dev.azure.com/mariner-org/ECF/_build/results?buildId=835447&view=results

----
#### AI description  (iteration 1)
#### PR Classification
Configuration update to enforce SELinux mode for the memory-constraint-combined test environment.

#### PR Summary
This PR updates configuration files to explicitly set SELinux to enforcing mode for memory-constraint-combined and includes it in the list of target configurations.
- `e2e_tests/trident_configurations/memory-constraint-combined/trident-config.yaml`: Added SELinux with `mode: enforcing`.
- `e2e_tests/target-configurations.yaml`: Appended `memory-constraint-combined` to the pullrequest configuration list.

Related work items: #12482
# 🔍 Description

Faster FTs :)

----
#### AI description  (iteration 1)
#### PR Classification
This pull request introduces performance improvements to the functional test framework by refactoring VM setup and pipeline configurations.

#### PR Summary
This PR streamlines functional tests to run faster by overhauling the VM creation process and updating related build and pipeline settings.
- **`functional_tests/test_setup.py`**: Refactored VM creation and online check logic with cloud-init configuration, added logging and timeout handling, and removed deprecated functions.
- **`functional_tests/custom/test_trident_e2e.py`**: Commented out several Trident tests to bypass unnecessary execution during functional test runs.
- **Pipeline templates (.pipelines/templates)**: Updated parameter names and artifact download steps to use a prebuilt FT image and adjust rerun logic.
- **`Makefile` and `functional_tests/conftest.py`**: Modified build targets to generate a new FT QCOW2 image and updated fixture types for improved configuration.

Related work items: #12544
… run in enforcing mode

# 🔍 Description

Necessary so `rerun` does not fail with !23518 .

pre2e run pulling test-images from feature branch where installers have SELinux in enforcing mode: https://dev.azure.com/mariner-org/ECF/_build/results?buildId=835951&view=results

----
#### AI description  (iteration 1)
#### PR Classification
Engineering change updating SELinux policy rules to enable the installer image to run in enforcing mode.

#### PR Summary
This PR adjusts SELinux permissions in the `trident.te` file to support enforcing mode operation.
- `selinux-policy-trident/trident.te`: Added `sys_module` to the allowed capabilities for `trident_t`.
- `selinux-policy-trident/trident.te`: Expanded `var_t:dir` permissions to include `relabelto` along with `mounton`.

Related work items: #12560
# 🔍 Description

`memory-constraint-combined`, `misc`, and `root-verity` now succeed when run in an installer environment that has SELinux in enforcing mode. Among other permissions, the policy was missing the ability to mount NTFS (`misc`) and to relabel various directories and files (`root-verity`).
Succeeding pipeline: https://dev.azure.com/mariner-org/ECF/_build/results?buildId=837611&view=results
![image.png](https://dev.azure.com/mariner-org/2311650c-e79e-4301-b4d2-96543fdd84ff/_apis/git/repositories/895b6b3d-5077-488a-8001-ab6b5a14c1a3/pullRequests/23530/attachments/image.png)

Related work items: #12567, #12569
# 🔍 Description

Fix some typos

Related work items: #12544
…n-empty directory error occurs

# 🔍 Description

Add existing contents in warn!, example output:

`Mount path: '/tmp/.tmpV5CnFt' already exists and is non-empty: /tmp/.tmpV5CnFt/temp_dir`
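
A minimal sketch of how such a message could be assembled is shown below. The function name and the choice to join entries with commas are assumptions; only the message format follows the example output above.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Returns a warning listing the directory's entries if it is non-empty,
/// or `None` if the directory is empty (hypothetical helper).
fn non_empty_warning(path: &Path) -> io::Result<Option<String>> {
    let mut entries = Vec::new();
    for entry in fs::read_dir(path)? {
        entries.push(entry?.path().display().to_string());
    }
    if entries.is_empty() {
        return Ok(None);
    }
    // Sort so the message is stable across runs.
    entries.sort();
    Ok(Some(format!(
        "Mount path: '{}' already exists and is non-empty: {}",
        path.display(),
        entries.join(", ")
    )))
}
```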

Related work items: #10330
# 🔍 Description

Migrate servicing tests from bash scripts to storm helper.

qemu, uki servicing tests validated: https://dev.azure.com/mariner-org/ECF/_build/results?buildId=829876&view=results
azure servicing tests validated: https://dev.azure.com/mariner-org/ECF/_build/results?buildId=833960&view=results

(Interestingly, the new e2e-pr-azure servicing tests run for 20-25 minutes rather than 35-40 minutes; about 10 minutes of the saving comes from a different replication mode for the Azure image.)

----
#### AI description  (iteration 1)
#### PR Classification
This pull request migrates the servicing tests from legacy Bash scripts to a new Storm framework-based test helper.

#### PR Summary
This PR introduces a new testing infrastructure for servicing tests by implementing utility functions and a helper in Storm, and updating pipeline definitions to use the new commands.
- **`tools/storm/utils/servicing.go`**: Added comprehensive utilities for VM deployment, update loops, log collection, image publishing, and cleanup across QEMU and Azure.
- **`tools/storm/helpers/servicing.go` & `tools/storm/helpers/init.go`**: Implemented a new `ServicingTestsHelper` that registers test cases (deploy-vm, check-deployment, update-loop, collect-logs, cleanup-vm, publish-sig-image) with Storm.
- **`/.pipelines/templates/stages/testing_servicing/testing-template.yml`**: Updated pipeline steps to call the new Storm helper with appropriate flags replacing old Bash scripts.
- **`/.pipelines/templates/MockOB.yml`**: Modified to ensure artifact publishing occurs in all conditions.

Related work items: #12590
# 🔍 Description

Quick convenience rule to download image.

Related work items: #12544
…logic after ESP contents can be accessed

# 🔍 Description

This PR refactors the logic inside `engine` and `subsystems` to create a separate `EspSubsystem` and run the encryption logic after the ESP contents have been created and can be accessed. This is required so that a `pcrlock` policy can be created during the staging of the clean install.

----
#### AI description  (iteration 1)
#### PR Classification
This PR refactors the codebase by isolating ESP logic into its own dedicated subsystem.

#### PR Summary
The pull request reorganizes ESP-related functionality by moving its implementation from the boot subsystem to a new ESP subsystem and updating module references accordingly.
- `src/engine/boot/esp.rs` moved to `src/subsystems/esp.rs` without code changes.
- `src/engine/boot/mod.rs` now omits direct ESP image deployment and removes the ESP module export.
- `src/engine/clean_install.rs` and `src/engine/mod.rs` update imports to reference the new `EspSubsystem`.
- Minor parameter renaming in `src/subsystems/storage/mod.rs` and module registration added in `src/subsystems/mod.rs`.

Related work items: #12593
# 🔍 Description

Ensure the verity hash is factored into size validation. In the existing size validation, check whether the filesystem is verity and compare the uncompressed hash file size to the configured hash device size.

----
#### AI description  (iteration 1)
#### PR Classification
Bug fix addressing an error in filesystem size validation for verity devices.

#### PR Summary
This PR updates the storage size calculation for verity devices to include both the data and hash device sizes, ensuring accurate filesystem validation.
- `trident_api/src/config/host/storage/storage_graph/graph.rs`: Modified the VerityDevice branch in block_device_size to compute and sum the sizes for both data and hash devices.
- `trident_api/src/config/host/storage/storage_graph/graph.rs`: Updated test assertions to validate that the verity device size equals the combined sizes of data and hash devices.
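
The summing rule can be sketched as follows. This is a simplified model, not the storage graph from `graph.rs`: the `BlockDevice` enum and its fields are hypothetical, and devices are nested directly rather than looked up by ID.

```rust
/// Simplified stand-in for a node in the storage graph (hypothetical shape).
enum BlockDevice {
    /// A partition whose size may be unknown (for example, one that grows).
    Partition { size: Option<u64> },
    /// A verity device backed by a data device and a hash device.
    Verity {
        data: Box<BlockDevice>,
        hash: Box<BlockDevice>,
    },
}

/// Returns the size in bytes, or `None` if any underlying size is unknown
/// or the sum would overflow.
fn block_device_size(device: &BlockDevice) -> Option<u64> {
    match device {
        BlockDevice::Partition { size } => *size,
        BlockDevice::Verity { data, hash } => {
            // A verity device accounts for both of its backing devices.
            let data_size = block_device_size(data)?;
            let hash_size = block_device_size(hash)?;
            data_size.checked_add(hash_size)
        }
    }
}
```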

Related work items: #12062
# 🔍 Description

Tolerate COSI download failures caused by temporary network glitches.

----
#### AI description  (iteration 1)
#### PR Classification
This PR introduces a new feature by adding a retry mechanism to the COSI download logic, improving the resiliency of HTTP requests.

#### PR Summary
The changes refactor the HTTP file requester to use a lambda function with a retry loop for both GET and HEAD operations, ensuring that transient HTTP failures are handled with incremental delays and proper logging.
- `src/osimage/cosi/reader.rs`: Modified the `reader` function to encapsulate HTTP requests in a lambda and route them through a new `retriable_request_sender` that retries failed requests.
- `src/osimage/cosi/reader.rs`: Implemented the `retriable_request_sender` function with a configurable retry count (set to 10), incremental sleep intervals, and warning logs for failed attempts.
- `src/osimage/cosi/reader.rs`: Updated the `HttpFile` struct and its initialization to include and pass the retry count for improved download robustness.
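
A retry loop of this kind can be sketched as below. This is not the `retriable_request_sender` from `reader.rs`: the function name, signature, and the linear delay formula are assumptions, and the request is modeled as a plain closure so the sketch needs no HTTP library.

```rust
use std::fmt::Display;
use std::thread::sleep;
use std::time::Duration;

/// Calls `request` until it succeeds or `max_attempts` is reached, sleeping
/// `base_delay * attempt` between tries. The request is always made at
/// least once, and the last error is returned if every attempt fails.
fn retry_with_backoff<T, E: Display>(
    max_attempts: u32,
    base_delay: Duration,
    mut request: impl FnMut() -> Result<T, E>,
) -> Result<T, E> {
    let mut attempt = 1;
    loop {
        match request() {
            Ok(value) => return Ok(value),
            Err(err) if attempt < max_attempts => {
                // Warn and wait a little longer after each failure.
                eprintln!("attempt {attempt}/{max_attempts} failed: {err}; retrying");
                sleep(base_delay * attempt);
                attempt += 1;
            }
            Err(err) => return Err(err),
        }
    }
}
```

With a retry count of 10, as in the PR, a caller would pass `10` as `max_attempts` and wrap the GET or HEAD request in the closure.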

Related work items: #9518, #11001
# 🔍 Description

Adding swap mounts to the storage graph as their own pseudo-fs non-device variant.

Related work items: #12600
# 🔍 Description

This PR removes the repeated instance of `get_first_backing_partition()` and moves the function to be a method of `EngineContext`.

----
#### AI description  (iteration 1)
#### PR Classification
Code refactoring to centralize the logic for retrieving the first backing partition.

#### PR Summary
This pull request consolidates the functionality of `get_first_backing_partition` by removing its standalone implementations and integrating it into the `EngineContext` structure. The changes simplify the codebase and ensure a single source of truth for partition retrieval.
- In `src/engine/storage/encryption.rs`, the standalone `get_first_backing_partition` function is removed and its calls are updated to use the `EngineContext` method.
- In `src/subsystems/storage/encryption.rs`, the duplicate function and its tests are eliminated, with calls redirected to the `EngineContext` method.
- In `src/engine/context/mod.rs`, a new public `get_first_backing_partition` method is added along with corresponding tests.
<!-- GitOpsUserAgent=GitOps.Apps.Server.pullrequestcopilot -->

Related work items: #12608
# 🔍 Description

Our Prism CI/CD pipeline has been failing since the introduction of 1.1. The OsPackages list in 1.1 contains 'noarch' packages, which Trident's definition of OsPackages does not support (it uses SystemArchitecture, which only has amd64 and arm64).

Validation: https://dev.azure.com/mariner-org/ECF/_build/results?buildId=840708&view=results

Related work items: #12614
# 🔍 Description

Prism is listing packages like this:

    {
      "name": "gpg-pubkey",
      "version": "3135ce90",
      "release": "5e6fda74",
      "arch": "(none)"
    },

Validated with the Prism dev branch (microsoft/azure-linux-image-tools#277): https://dev.azure.com/mariner-org/ECF/_build/results?buildId=841924&view=results

----
#### AI description  (iteration 1)
#### PR Classification
This pull request introduces a new feature by extending the OS package architecture support through the addition of a '(none)' variant.

#### PR Summary
The changes enable the system to handle OS packages with an architecture value of "(none)" in line with the new COSI metadata requirements.
- `sysdefs/src/arch.rs`: Added a new `None` variant to the `PackageArchitecture` enum with `#[serde(rename = "(none)")]`.
- `src/osimage/cosi/metadata.rs`: Introduced tests to validate deserialization of OS package JSON objects containing the "(none)" architecture.
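
To make the string-to-variant mapping concrete, here is a small std-only sketch. It deliberately swaps the `#[serde(rename = "(none)")]` attribute for a hand-written `FromStr` implementation so that it compiles without external crates. The enum is reduced to two variants, and the `"noarch"` spelling is an assumption based on the earlier commit message; the real `PackageArchitecture` in `sysdefs/src/arch.rs` has more variants and is deserialized by serde.

```rust
use std::str::FromStr;

/// Reduced, hypothetical stand-in for the real `PackageArchitecture` enum.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum PackageArchitecture {
    /// Architecture-independent packages (spelling assumed to be "noarch").
    Noarch,
    /// Packages such as `gpg-pubkey` that report "(none)" as their architecture.
    None,
}

impl FromStr for PackageArchitecture {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "noarch" => Ok(Self::Noarch),
            // Mirrors what the serde rename achieves: the literal "(none)"
            // maps to a dedicated variant instead of failing to parse.
            "(none)" => Ok(Self::None),
            other => Err(format!("unknown package architecture: {other}")),
        }
    }
}
```

With this mapping, the `"arch": "(none)"` value from the Prism listing above parses to `PackageArchitecture::None`, while unrecognized strings still produce an error.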
<!-- GitOpsUserAgent=GitOps.Apps.Server.pullrequestcopilot -->

Related work items: #12614
This is preparation for being able to determine UKI vs. non-UKI images based on the COSI metadata

----
#### AI description  (iteration 1)
#### PR Classification
This pull request is a code refactoring focused on centralizing and standardizing the UKI image detection logic.

#### PR Summary
The changes introduce a new centralized method, `is_uki_image`, within the engine context to replace direct checks of the UKI flag across the codebase. This refactoring enhances consistency and error handling regarding UKI image identification.
- `src/engine/context/mod.rs`: Added a new `is_uki_image` function and a corresponding `is_uki` field to centralize UKI flag checks.
- `src/subsystems/esp.rs`, `src/engine/boot/mod.rs`, and other subsystem files: Replaced direct internal parameter flag checks with calls to `is_uki_image()`.
- `src/engine/update.rs` and `src/engine/clean_install.rs`: Updated context construction to properly propagate the UKI flag.
- `osutils/src/efivar.rs`: Introduced a new helper function `current_var_set` to support related boot configuration checks.
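
The centralization described in the first bullet can be pictured roughly as follows. This is only a sketch: the summary confirms the `is_uki` field and the `is_uki_image` method names, but the field type (`Option<bool>`), the return type, and the error message are assumptions, and the real `EngineContext` has many more fields.

```rust
/// Reduced, hypothetical stand-in for the engine context.
#[derive(Debug, Default)]
struct EngineContext {
    /// `None` until the UKI status has been determined (assumed representation).
    is_uki: Option<bool>,
}

impl EngineContext {
    /// Single place callers ask whether the image is a UKI image,
    /// instead of each subsystem reading a flag directly.
    fn is_uki_image(&self) -> Result<bool, String> {
        self.is_uki
            .ok_or_else(|| "UKI status has not been determined".to_string())
    }
}
```

Routing every check through one method means a later change in how the flag is derived, for example from COSI metadata, would only touch the context construction.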
<!-- GitOpsUserAgent=GitOps.Apps.Server.pullrequestcopilot -->

Related work items: #12161
# 🔍 Description

Publish Trident RPMs to a dev feed for easier consumption from Steamboat.

This is a copy of an earlier PR (https://dev.azure.com/mariner-org/ECF/_git/trident/pullrequest/23590), recreated because the original included several files whose only changes were whitespace.

----
#### AI description  (iteration 1)
#### PR Classification
This PR introduces a new pipeline feature to automatically publish RPM packages from the PR pipeline.

#### PR Summary
The pull request adds a new YAML template for RPM publishing and updates existing pipeline configurations to enable conditional publishing during PR builds.
- `/.pipelines/templates/stages/trident_rpms/publish-dev.yml`: Added a new template that checks if an RPM version already exists in the feed and, if not, copies RPMs to a staging directory and publishes them.
- `/.pipelines/templates/stages/trident_rpms/build-source.yml`: Introduced the `publishToDevFeed` parameter and conditionally includes the RPM publishing template.
- `/.pipelines/templates/stages/trident_rpms/trident-stage.yml`: Configured the pipeline to set `publishToDevFeed` to true for PR stages, enabling RPM publishing.
<!-- GitOpsUserAgent=GitOps.Apps.Server.pullrequestcopilot -->

Related work items: #12637
… stopping cleanup from running

# 🔍 Description

The servicing tests have a task for cleaning up Azure resources in case the tests fail or time out. That task is erroring out because a variable used to determine whether cleanup is needed is unbound (https://dev.azure.com/mariner-org/ECF/_build/results?buildId=838833&view=logs&j=004ade76-5a0e-5a2d-af0c-1845bd9783cc&t=6db98abd-232b-510f-d24d-4e378b1d9353).

----
#### AI description  (iteration 1)
#### PR Classification
Bug fix.

#### PR Summary
This pull request fixes an issue in the Azure servicing tests where an unbound variable error was preventing the cleanup process from running.
- `.pipelines/templates/stages/testing_servicing/testing-template.yml`: modified the bash command by removing the `-u` flag (changed from `set -eux` to `set -ex`) to prevent the script from failing on unbound variables.
<!-- GitOpsUserAgent=GitOps.Apps.Server.pullrequestcopilot -->

Related work items: #12644
…have a timeout

# 🔍 Description

The scale tests are timing out because of a 20-minute task timeout I added when migrating the servicing tests to storm. This PR removes that timeout.

----
#### AI description  (iteration 1)
#### PR Classification
This pull request is an engineering configuration update that removes an unnecessary timeout setting.

#### PR Summary
The change eliminates the explicit 20-minute timeout in the pipeline template for the loop-update.sh task, ensuring that the task runs without an imposed time limit.
- `/.pipelines/templates/stages/testing_servicing/testing-template.yml`: Removed the `timeoutInMinutes: 20` line from the servicing test step.
<!-- GitOpsUserAgent=GitOps.Apps.Server.pullrequestcopilot -->

Related work items: #12643
…teamboat

# 🔍 Description

This PR makes certain permissions optional, since Steamboat does not install all of the same SELinux modules that Trident does. (For example, some RAID permissions are made optional). In addition, this PR adds some permissions such that Steamboat can run Trident successfully from the `ci_unconfined_t` domain.

Steamboat validation: https://dev.azure.com/mariner-org/mariner/_build/results?buildId=845278&view=results

pr-e2e test: https://dev.azure.com/mariner-org/ECF/_build/results?buildId=845239&view=results

![image.png](https://dev.azure.com/mariner-org/2311650c-e79e-4301-b4d2-96543fdd84ff/_apis/git/repositories/895b6b3d-5077-488a-8001-ab6b5a14c1a3/pullRequests/23583/attachments/image.png)

# 🗒️ Notes

make optional

----
#### AI description  (iteration 1)
#### PR Classification
This pull request implements an engineering change to make the `typeattribute` optional in the SELinux policy.

#### PR Summary
The changes refactor the SELinux policy in `selinux-policy-trident/trident.te` by wrapping the `typeattribute` rule within an `optional_policy` block and removing its redundant declaration from the require block.
- `selinux-policy-trident/trident.te`: Replaces the direct `typeattribute` rule with an `optional_policy` block that includes the required type and attribute declarations.
- `selinux-policy-trident/trident.te`: Deletes the redundant `attribute can_change_object_identity;` line from the require section.
<!-- GitOpsUserAgent=GitOps.Apps.Server.pullrequestcopilot -->

Related work items: #12599
# 🔍 Description

SELinux can be in `enforcing` mode with usr-verity, so clarify that the error message applies only to root-verity.

Related work items: #12579
… + SELinux

# 🔍 Description

Explain that root-verity + SELinux only works with UKI images.

Related work items: #12579
# 🔍 Description

Make the helpers work with Steamboat.
…ainer tests

# 🔍 Description

We are currently publishing before the baremetal (bm) container tests run. Make the publish stage depend on the bm container tests.

----
#### AI description  (iteration 1)
#### PR Classification
This pull request is a pipeline configuration update to add a dependency on baremetal container tests for the publish stage.

#### PR Summary
The changes update the pipeline definition to ensure that publishing now depends on baremetal container tests.
- `/.pipelines/templates/stages/publishing/publish.yml`: Added the `BaremetalDeploymentTesting_container` stage to the publish workflow.
<!-- GitOpsUserAgent=GitOps.Apps.Server.pullrequestcopilot -->

Related work items: #12676
@Britel Britel requested a review from a team as a code owner July 4, 2025 03:20
@Britel Britel closed this Jul 4, 2025
@Britel Britel deleted the Britel-code-ql branch July 4, 2025 03:23