refactor(tools): Improve patch parser logic and add unit tests by sahsagar-google · Pull Request #375 · google/oracle-toolkit

sahsagar-google · 2025-10-22T19:20:10Z

Context
The tools/gen_patch_metadata.py script is critical for maintainers.

This PR makes the following changes:

Refactors gen_patch_metadata.py so that parsing no longer depends on a single README title and no longer aborts on ambiguous patches.

Adds a new unit test script, test_patch_parser.py, to validate this new logic against our existing patch definitions.

Updates tools/README.md to document the new script and its more helpful output.

Summary of Changes
1. gen_patch_metadata.py (Major Refactor)
This script was almost completely rewritten to improve parsing reliability and provide better maintainer guidance.

Smarter Parsing (parse_patch):

Old: Relied only on <title> tags in README.html. It would fail if a README was missing or the title was ambiguous.

New: The parse_patch function is now split into helpers that:

Read PatchSearch.xml first to definitively get the base release, patch release, and patch abstract.

Find all numeric subdirectories within the patch.

Analyze the content of both README.html and README.txt for keywords (like ojvm, database, gi, etc.) to identify the component type.
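
A minimal sketch of the second and third helper steps, for illustration only. It is not the code in this PR: the function names, the assumed archive layout (`<patchnum>/<subdir>/...`), and the exact keyword sets are assumptions. Reading PatchSearch.xml is left out because its schema is not shown here.

```python
import re


def find_numeric_subdirs(names, patchnum):
    """Return the all-digit directories directly under <patchnum>/.

    `names` is a list of archive member names, e.g. ZipFile.namelist().
    The <patchnum>/<subdir>/... layout is an assumption of this sketch.
    """
    prefix = f"{patchnum}/"
    subdirs = set()
    for name in names:
        if not name.startswith(prefix):
            continue
        parts = name[len(prefix):].split("/")
        # A plain file such as <patchnum>/README.html has only one part.
        if len(parts) > 1 and parts[0].isdigit():
            subdirs.add(parts[0])
    return subdirs


def classify_readme(text):
    """Label README text as 'OJVM', 'Other', or None when no keyword matches."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    if "ojvm" in words:
        return "OJVM"
    if words & {"database", "gi"}:  # hypothetical keyword set
        return "Other"
    return None
```

Matching whole words rather than substrings keeps a short keyword like gi from firing on unrelated words that merely contain those letters.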

Ambiguity Handling:

Old: Would crash with an AssertionError if it couldn't find a GI or OJVM component.

New: If the README analysis is ambiguous, the script now logs an ERROR, makes an educated "guess," and proceeds. This allows the script to run to completion and is critical for the unit test to function.
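
The log-and-guess behavior could look roughly like the sketch below. The function name, the label values, and the tie-breaking rule (sorted order) are all hypothetical; the PR's actual rule for guessing is not described in this thread.

```python
import logging


def assign_components(labels):
    """Map two subdirs to the 'other' and 'ojvm' roles.

    `labels` maps subdir name -> 'OJVM', 'Other', or None (hypothetical shape).
    """
    if len(labels) != 2:
        raise ValueError("expected exactly two subdirectories")
    first, second = sorted(labels)
    pairs = ((first, second), (second, first))
    # A clear OJVM match on one side decides both roles.
    for subdir, rest in pairs:
        if labels[subdir] == "OJVM" and labels[rest] != "OJVM":
            return {"ojvm": subdir, "other": rest}
    # Otherwise a clear 'Other' match decides, and the remainder is OJVM.
    for subdir, rest in pairs:
        if labels[subdir] == "Other" and labels[rest] != "Other":
            return {"other": subdir, "ojvm": rest}
    # Ambiguous: report it, then guess deterministically instead of aborting.
    logging.error("README analysis is ambiguous for %s and %s; GUESSING", first, second)
    return {"other": first, "ojvm": second}
```

Returning a guess instead of raising is what lets a caller, such as a test loop over many patches, keep going after one ambiguous patch.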

Improved Output:

Old: Printed a single, rigid YAML block assuming a RU and RU_Combo pairing.

New: The script now prints the patch's abstract (for context) and then provides four candidate YAML snippets: a GI RU, DB_RU, RU_Combo, and DB_OJVM_RU. The maintainer uses the abstract to select the correct YAML definitions.
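
As an illustration of the four-candidate output, here is a sketch that renders one YAML flow-mapping line per candidate. The field order and per-category values are copied from the sample output posted later in this thread; the function names and the shape of the `meta` dict are assumptions, and booleans are written in lower case to match the later commit that lower-cases them.

```python
def quoted(value):
    return f'"{value}"'


def yaml_bool(value):
    return "true" if value else "false"


# (section, category, component, prereq_check, method, upgrade)
CANDIDATES = [
    ("gi_patches", "RU", "other", False, "opatchauto apply", False),
    ("db_patches", "DB_RU", "other", True, "opatch apply", True),
    ("rdbms_patches", "RU_Combo", "ojvm", True, "opatch apply", True),
    ("rdbms_patches", "DB_OJVM_RU", "ojvm", True, "opatch apply", True),
]


def render_candidates(meta):
    """Return a list of (section, yaml_line) pairs, one per candidate."""
    lines = []
    for section, category, component, prereq, method, upgrade in CANDIDATES:
        fields = [
            ("category", quoted(category)),
            ("base", quoted(meta["base"])),
            ("release", quoted(meta["release"])),
            ("patchnum", quoted(meta["patchnum"])),
            ("patchfile", quoted(meta["patchfile"])),
            # `component` selects which parsed subdir this entry points at.
            ("patch_subdir", quoted("/" + meta[component])),
            ("prereq_check", yaml_bool(prereq)),
            ("method", quoted(method)),
            ("ocm", "false"),
            ("upgrade", yaml_bool(upgrade)),
            ("md5sum", quoted(meta["md5sum"])),
        ]
        body = ", ".join(f"{key}: {value}" for key, value in fields)
        lines.append((section, "- { " + body + " }"))
    return lines
```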

OPatch Download:

Logic was extracted into its own download_opatch function.

It now attempts to find the correct OPatch version for the specific database release (e.g., "19c") instead of using a hardcoded version.

2. test_patch_parser.py (New File)
This new unit test script validates the new parsing logic in gen_patch_metadata.py.

It loads all patch definitions from the production gi_patches.yml and rdbms_patches.yml.

For every 2-component combo patch, it:

Downloads the patch .zip from the gcp-oracle-software GCS bucket.

Calls the new gen_patch_metadata.parse_patch function.

Asserts that the parsed base_release and patch_release match the YAML.

Uses assertSetEqual to compare the set of subdirs. This is key, as it allows the test to pass even if the parser's "guess" (OJVM vs. Other) is different from the YAML, as long as the two correct directories were found.

It includes a skip-list for known obsolete/unavailable 12.1.0.2 patches (per team feedback) so the test suite can run to completion.
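
The comparison described above can be sketched with the standard unittest module. This is a stand-in, not test_patch_parser.py itself: the class name, the skip-list contents, and the hard-coded `yaml_entries` and `parsed` values are hypothetical and replace the real YAML loading, GCS download, and parse_patch call.

```python
import unittest

SKIP_BASES = {"12.1.0.2.0"}  # hypothetical skip-list key


class ComboPatchCase(unittest.TestCase):
    # Stand-ins for one combo patch's two YAML entries.
    yaml_entries = [
        {"base": "19.3.0.0.0", "release": "19.29.0.0.251021", "patch_subdir": "/38298204"},
        {"base": "19.3.0.0.0", "release": "19.29.0.0.251021", "patch_subdir": "/38194382"},
    ]
    # Stand-in for the parser result, with the two roles deliberately swapped.
    parsed = {
        "base": "19.3.0.0.0",
        "release": "19.29.0.0.251021",
        "other": "38194382",
        "ojvm": "38298204",
    }

    def test_matches_yaml(self):
        first = self.yaml_entries[0]
        if first["base"] in SKIP_BASES:
            self.skipTest("patch is on the skip-list")
        self.assertEqual(self.parsed["base"], first["base"])
        self.assertEqual(self.parsed["release"], first["release"])
        expected = {e["patch_subdir"].lstrip("/") for e in self.yaml_entries}
        # Set comparison ignores which subdir was labelled OJVM vs. Other.
        self.assertSetEqual({self.parsed["other"], self.parsed["ojvm"]}, expected)
```

Saved as a module, this runs with `python3 -m unittest <module>`. The case passes even though the roles are swapped, which is the trade-off the set comparison accepts: it checks that both directories were found, not that each was labelled correctly.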

3. tools/README.md (Updated)
Updated the gen_patch_metadata.py sample output to show its new, more informative block (with the abstract and multiple YAML options).

Added a new, comprehensive section for test_patch_parser.py, detailing:

How the test works.

Step-by-step instructions for installing dependencies (pip install ..., gcloud auth ...) and running the test.

A guide to "Understanding the Test Output," explaining why Skipping... (for 21c patches) and ambiguous... GUESSING... messages are normal and expected.

google-oss-prow · 2025-10-22T19:20:15Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: sahsagar-google
Once this PR has been reviewed and has the lgtm label, please assign mfielding for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Details

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment


sahsagar-google · 2025-10-22T20:22:49Z

Here is a sample output from a run:

python3 gen_patch_metadata.py --patch 38273558 --mosuser ***@google.com
MOS Password: 
INFO:root:Using local copy of patch file p38273558_190000_Linux-x86-64.zip
INFO:root:Abstract: COMBO OF OJVM COMPONENT 19.29.0.0.251021 + GI RU 19.29.0.0.251021
INFO:root:Found numeric subdirectories: {'38298204', '38194382'}
INFO:root:Assigned 'Other' subdir based on clear keywords: 38298204
INFO:root:Assigned remaining subdir as OJVM: 38194382
INFO:root:Found release = 19.29.0.0.251021 base = 19.3.0.0.0 Other subdir = 38298204 OJVM subdir = 38194382
INFO:root:Downloading OPatch (Patch 6880880)
INFO:root:Target OPatch file: p6880880_190000_Linux-x86-64.zip
INFO:root:Using local copy of OPatch file p6880880_190000_Linux-x86-64.zip

Please copy the following files to your GCS bucket: p38273558_190000_Linux-x86-64.zip p6880880_190000_Linux-x86-64.zip

Add the following to the appropriate sections of roles/common/defaults/main.yml:

# IMPORTANT: Review the patch abstract to make your selections.
# Abstract was: COMBO OF OJVM COMPONENT 19.29.0.0.251021 + GI RU 19.29.0.0.251021

# --- SELECTION 1: Choose the NON-OJVM component (GI or DB) ---
# --- This component is in subdir: /38298204 ---

# 1A: If this is a GI Patch (RU), uncomment this block for gi_patches:
#   gi_patches:
#   - { category: "RU", base: "19.3.0.0.0", release: "19.29.0.0.251021", patchnum: "38273558", patchfile: "p38273558_190000_Linux-x86-64.zip", patch_subdir: "/38298204", prereq_check: FALSE, method: "opatchauto apply", ocm: FALSE, upgrade: FALSE, md5sum: "i09rWxlrLpdp4m4tGWTXWA==" }

# 1B: If this is an RDBMS Patch (DB_RU), uncomment this block for db_patches:
#   db_patches:
#   - { category: "DB_RU", base: "19.3.0.0.0", release: "19.29.0.0.251021", patchnum: "38273558", patchfile: "p38273558_190000_Linux-x86-64.zip", patch_subdir: "/38298204", prereq_check: TRUE, method: "opatch apply", ocm: FALSE, upgrade: TRUE, md5sum: "i09rWxlrLpdp4m4tGWTXWA==" }

# --- SELECTION 2: Choose the OJVM component ---
# --- This component is in subdir: /38194382 ---

# 2A: If this is an OJVM package from a GI Combo (RU_Combo), uncomment this block for rdbms_patches:
#   rdbms_patches:
#   - { category: "RU_Combo", base: "19.3.0.0.0", release: "19.29.0.0.251021", patchnum: "38273558", patchfile: "p38273558_190000_Linux-x86-64.zip", patch_subdir: "/38194382", prereq_check: TRUE, method: "opatch apply", ocm: FALSE, upgrade: TRUE, md5sum: "i09rWxlrLpdp4m4tGWTXWA==" }

# 2B: If this is an OJVM + DB RU Update patch (DB_OJVM_RU), uncomment this block for rdbms_patches:
#   rdbms_patches:
#   - { category: "DB_OJVM_RU", base: "19.3.0.0.0", release: "19.29.0.0.251021", patchnum: "38273558", patchfile: "p38273558_190000_Linux-x86-64.zip", patch_subdir: "/38194382", prereq_check: TRUE, method: "opatch apply", ocm: FALSE, upgrade: TRUE, md5sum: "i09rWxlrLpdp4m4tGWTXWA==" }
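
The md5sum values in the sample output are 24 characters ending in `==`, which is consistent with a base64-encoded MD5 digest, the form GCS reports for objects. Assuming that is the encoding, a value can be reproduced locally with a sketch like this one (the function name is hypothetical):

```python
import base64
import hashlib


def md5_base64(path, chunk_size=1024 * 1024):
    """Return the base64-encoded MD5 digest of the file at `path`."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read in chunks so large patch archives are not loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return base64.b64encode(digest.digest()).decode("ascii")
```

Comparing this value for the local .zip against the md5sum in the generated YAML is a quick check that the file copied to the bucket is the one the entry describes.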

mfielding

This change almost doubles the size of the file and I have trouble understanding what it's doing. Can the change be made while maintaining the existing structure?

Alternately, the whole thing could be reimplemented, but I'd hope to see unit tests to prove it works under the various if cases. Maybe actually modify the files in that case?

(Separately, we'll need all the patch categories, but I think you're already working on that)

sahsagar-google · 2025-10-22T21:20:37Z

@mfielding do you think the sample output below covers patch categories that work for us?
#375 (comment)


mfielding · 2025-10-27T19:18:20Z

/retest

mfielding · 2025-10-27T19:18:46Z

/test oracle-toolkit-install-data-guard-on-gcp

Refactor(gen_patch_params): Generalize OJVM combo patch parsing

9741403

sahsagar-google self-assigned this Oct 22, 2025

google-oss-prow bot added the size/L label Oct 22, 2025

sahsagar-google requested a review from mfielding October 22, 2025 19:20

sahsagar-google added 2 commits October 22, 2025 20:13

Fixing indentation and the output statements to make them more acceptable

e7ef060

Fixing indentations of main and parse_patcher

f4c039d

mfielding reviewed Oct 22, 2025


Reimplementing patch parser, unit tests, README

bbe7793

google-oss-prow bot added size/XXL and removed size/L labels Oct 23, 2025

github-advanced-security bot found potential problems Oct 23, 2025


sahsagar-google changed the title ~~Refactor(gen_patch_params): Generalize OJVM combo patch parsing~~ refactor(tools): Improve patch parser logic and add unit tests Oct 23, 2025

Update gen_patch_metadata.py

7bf07db

Fixing url regex and lower-casing boolean values in the output

sahsagar-google wants to merge 5 commits into master from patch_changes
