refactor(tools): Improve patch parser logic and add unit tests #375

Open
sahsagar-google wants to merge 5 commits into master from patch_changes

@sahsagar-google sahsagar-google commented Oct 22, 2025

Context
The tools/gen_patch_metadata.py script is critical for maintainers.

This PR makes the following changes:

Refactoring gen_patch_metadata.py so that parsing no longer depends on README titles alone and no longer aborts on ambiguous patches.

Adding a new unit test script, test_patch_parser.py, to validate this new logic against our existing patch definitions.

Updating the tools/README.md to document the new script and its more helpful output.

Summary of Changes
1. gen_patch_metadata.py (Major Refactor)
This script was almost completely rewritten to improve parsing reliability and provide better maintainer guidance.

Smarter Parsing (parse_patch):

Old: Relied only on <title> tags in README.html. It would fail if a README was missing or the title was ambiguous.

New: The parse_patch function is now split into helpers that:

Read PatchSearch.xml first to definitively get the base release, patch release, and patch abstract.

Find all numeric subdirectories within the patch.

Analyze the content of both README.html and README.txt for keywords (like ojvm, database, gi, etc.) to identify the component type.
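The "find all numeric subdirectories" step could look roughly like the sketch below. It assumes the zip has a top-level directory named after the patch number with the component directories directly beneath it. `find_numeric_subdirs` is a hypothetical name, not necessarily the helper used in this PR.

```python
def find_numeric_subdirs(zip_names, patchnum):
    """Return the all-digit directory names directly under <patchnum>/.

    `zip_names` is a list of member paths, e.g. from ZipFile.namelist().
    The <patchnum>/<subdir>/... layout is an assumption for illustration.
    """
    subdirs = set()
    for name in zip_names:
        parts = name.split("/")
        # Keep only entries of the form <patchnum>/<digits>/...
        if len(parts) >= 2 and parts[0] == patchnum and parts[1].isdigit():
            subdirs.add(parts[1])
    return subdirs
```

Returning a set matches the "Found numeric subdirectories: {...}" line in the sample output later in this conversation.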

Ambiguity Handling:

Old: Would crash with an AssertionError if it couldn't find a GI or OJVM component.

New: If the README analysis is ambiguous, the script now logs an ERROR, makes an educated "guess," and proceeds. This allows the script to run to completion and is critical for the unit test to function.
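A minimal sketch of the "log an ERROR, guess, and proceed" behaviour, assuming a two-component combo patch and a single "ojvm" keyword. The function name, the keyword check, and the tie-break (sorted order) are illustrative assumptions. The actual helper may weigh more keywords.

```python
import logging


def assign_subdirs(readmes):
    """Split two patch subdirs into (other, ojvm) using README text.

    `readmes` maps subdir name -> concatenated README text (hypothetical shape).
    """
    if len(readmes) != 2:
        raise ValueError("expected exactly two subdirs, got %d" % len(readmes))
    first, second = sorted(readmes)
    hits = [s for s in (first, second) if "ojvm" in readmes[s].lower()]
    if len(hits) == 1:
        # Clear case: exactly one README mentions OJVM.
        ojvm = hits[0]
        other = second if ojvm == first else first
        return other, ojvm
    # Ambiguous: neither or both READMEs mention OJVM. Log and guess
    # instead of asserting, so the caller can still run to completion.
    logging.error(
        "ambiguous READMEs for %s and %s; GUESSING %s is OJVM",
        first, second, second,
    )
    return first, second
```

Either way the function returns both directories, which is what lets a set-based comparison in the unit test succeed even when the guess is wrong.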

Improved Output:

Old: Printed a single, rigid YAML block assuming a RU and RU_Combo pairing.

New: The script now prints the patch's abstract (for context) and then provides all four candidate YAML snippets (RU for gi_patches, DB_RU, RU_Combo, and DB_OJVM_RU). The maintainer uses the abstract to select the correct YAML definitions.
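For illustration, one way to render a single snippet line is sketched below. The field names and the per-category values for prereq_check, method, and upgrade are copied from the sample output later in this conversation. The function name is hypothetical, and the lower-case booleans follow the later "lower-casing boolean values" commit on this PR.

```python
def yaml_entry(category, base, release, patchnum, patchfile, subdir, md5sum):
    """Format one patch definition as a single-line YAML flow mapping."""
    # In the sample output only the GI "RU" entry uses opatchauto.
    is_gi = category == "RU"
    fields = [
        ("category", f'"{category}"'),
        ("base", f'"{base}"'),
        ("release", f'"{release}"'),
        ("patchnum", f'"{patchnum}"'),
        ("patchfile", f'"{patchfile}"'),
        ("patch_subdir", f'"/{subdir}"'),
        ("prereq_check", "false" if is_gi else "true"),
        ("method", '"opatchauto apply"' if is_gi else '"opatch apply"'),
        ("ocm", "false"),
        ("upgrade", "false" if is_gi else "true"),
        ("md5sum", f'"{md5sum}"'),
    ]
    return "- { " + ", ".join(f"{key}: {value}" for key, value in fields) + " }"
```

Calling it once per category (RU, DB_RU, RU_Combo, DB_OJVM_RU) with the matching subdir would produce the four candidate lines.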

OPatch Download:

Logic was extracted into its own download_opatch function.

It now attempts to find the correct OPatch version for the specific database release (e.g., "19c") instead of using a hardcoded version.
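As a rough sketch of release-specific selection, the target file name can be derived from the base release. The pattern below generalises from the single OPatch file name in the sample output (p6880880_190000_Linux-x86-64.zip for a 19.x base). Applying it to other releases is an assumption, and `opatch_filename` is a hypothetical helper, not the PR's download_opatch.

```python
def opatch_filename(base_release, platform="Linux-x86-64"):
    """Guess the OPatch (patch 6880880) zip name for a base release string."""
    major = int(base_release.split(".")[0])
    if major < 18:
        # Older releases use a different version field; not covered here.
        raise ValueError(f"no naming rule assumed for release {base_release}")
    # Assumed pattern: <major>0000, e.g. 19 -> 190000.
    return f"p6880880_{major}0000_{platform}.zip"
```

For example, `opatch_filename("19.3.0.0.0")` yields the file name that appears in the sample log.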

2. test_patch_parser.py (New File)
This new unit test script validates the new parsing logic in gen_patch_metadata.py.

It loads all patch definitions from the production gi_patches.yml and rdbms_patches.yml.

For every 2-component combo patch, it:

Downloads the patch .zip from the gcp-oracle-software GCS bucket.

Calls the new gen_patch_metadata.parse_patch function.

Asserts that the parsed base_release and patch_release match the YAML.

Compares the subdirs as a set with assertSetEqual. This is key, as it allows the test to pass even if the parser's "guess" (OJVM vs. Other) is different from the YAML, as long as the two correct directories were found.

It includes a skip-list for known obsolete/unavailable 12.1.0.2 patches (per team feedback) so the test suite can run to completion.
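The set comparison and skip-list can be sketched with plain unittest as below. The patch numbers in SKIPPED_PATCHNUMS and the `check` helper are placeholders for illustration. The real test downloads each patch from GCS and calls parse_patch, which is omitted here.

```python
import unittest

# Hypothetical skip list; the real entries are not shown in this description.
SKIPPED_PATCHNUMS = {"00000001"}


def subdir_set(*subdirs):
    """Normalise subdir names to the '/NNN' form used in the YAML."""
    return {"/" + s.lstrip("/") for s in subdirs}


class ComboPatchSubdirTest(unittest.TestCase):
    def check(self, patchnum, yaml_subdirs, parsed_other, parsed_ojvm):
        if patchnum in SKIPPED_PATCHNUMS:
            self.skipTest(f"Skipping {patchnum}: on the skip list")
        # Order-insensitive: a swapped OJVM/Other guess still passes.
        self.assertSetEqual(
            subdir_set(*yaml_subdirs),
            subdir_set(parsed_other, parsed_ojvm),
        )

    def test_swapped_roles_still_pass(self):
        self.check("38273558", ["/38298204", "/38194382"],
                   "38194382", "38298204")

    def test_skip_listed_patch(self):
        self.check("00000001", ["/1", "/2"], "3", "4")
```

Saved as a module, this runs with `python3 -m unittest`. The skipped case is reported as a skip rather than a failure, so the suite still completes.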

3. tools/README.md (Updated)
Updated the gen_patch_metadata.py sample output to show its new, more informative block (with the abstract and multiple YAML options).

Added a new, comprehensive section for test_patch_parser.py, detailing:

How the test works.

Step-by-step instructions for installing dependencies (pip install ..., gcloud auth ...) and running the test.

A guide to "Understanding the Test Output," explaining why Skipping... (for 21c patches) and ambiguous... GUESSING... messages are normal and expected.

@sahsagar-google sahsagar-google self-assigned this Oct 22, 2025
@google-oss-prow

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: sahsagar-google
Once this PR has been reviewed and has the lgtm label, please assign mfielding for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@sahsagar-google
Collaborator Author

Here is a sample output from a run:

python3 gen_patch_metadata.py --patch 38273558 --mosuser ***@google.com
MOS Password: 
INFO:root:Using local copy of patch file p38273558_190000_Linux-x86-64.zip
INFO:root:Abstract: COMBO OF OJVM COMPONENT 19.29.0.0.251021 + GI RU 19.29.0.0.251021
INFO:root:Found numeric subdirectories: {'38298204', '38194382'}
INFO:root:Assigned 'Other' subdir based on clear keywords: 38298204
INFO:root:Assigned remaining subdir as OJVM: 38194382
INFO:root:Found release = 19.29.0.0.251021 base = 19.3.0.0.0 Other subdir = 38298204 OJVM subdir = 38194382
INFO:root:Downloading OPatch (Patch 6880880)
INFO:root:Target OPatch file: p6880880_190000_Linux-x86-64.zip
INFO:root:Using local copy of OPatch file p6880880_190000_Linux-x86-64.zip

Please copy the following files to your GCS bucket: p38273558_190000_Linux-x86-64.zip p6880880_190000_Linux-x86-64.zip

Add the following to the appropriate sections of roles/common/defaults/main.yml:

# IMPORTANT: Review the patch abstract to make your selections.
# Abstract was: COMBO OF OJVM COMPONENT 19.29.0.0.251021 + GI RU 19.29.0.0.251021

# --- SELECTION 1: Choose the NON-OJVM component (GI or DB) ---
# --- This component is in subdir: /38298204 ---

# 1A: If this is a GI Patch (RU), uncomment this block for gi_patches:
#   gi_patches:
#   - { category: "RU", base: "19.3.0.0.0", release: "19.29.0.0.251021", patchnum: "38273558", patchfile: "p38273558_190000_Linux-x86-64.zip", patch_subdir: "/38298204", prereq_check: FALSE, method: "opatchauto apply", ocm: FALSE, upgrade: FALSE, md5sum: "i09rWxlrLpdp4m4tGWTXWA==" }

# 1B: If this is an RDBMS Patch (DB_RU), uncomment this block for db_patches:
#   db_patches:
#   - { category: "DB_RU", base: "19.3.0.0.0", release: "19.29.0.0.251021", patchnum: "38273558", patchfile: "p38273558_190000_Linux-x86-64.zip", patch_subdir: "/38298204", prereq_check: TRUE, method: "opatch apply", ocm: FALSE, upgrade: TRUE, md5sum: "i09rWxlrLpdp4m4tGWTXWA==" }

# --- SELECTION 2: Choose the OJVM component ---
# --- This component is in subdir: /38194382 ---

# 2A: If this is an OJVM package from a GI Combo (RU_Combo), uncomment this block for rdbms_patches:
#   rdbms_patches:
#   - { category: "RU_Combo", base: "19.3.0.0.0", release: "19.29.0.0.251021", patchnum: "38273558", patchfile: "p38273558_190000_Linux-x86-64.zip", patch_subdir: "/38194382", prereq_check: TRUE, method: "opatch apply", ocm: FALSE, upgrade: TRUE, md5sum: "i09rWxlrLpdp4m4tGWTXWA==" }

# 2B: If this is an OJVM + DB RU Update patch (DB_OJVM_RU), uncomment this block for rdbms_patches:
#   rdbms_patches:
#   - { category: "DB_OJVM_RU", base: "19.3.0.0.0", release: "19.29.0.0.251021", patchnum: "38273558", patchfile: "p38273558_190000_Linux-x86-64.zip", patch_subdir: "/38194382", prereq_check: TRUE, method: "opatch apply", ocm: FALSE, upgrade: TRUE, md5sum: "i09rWxlrLpdp4m4tGWTXWA==" }
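The md5sum values in the snippets above are 24-character base64 strings, which is consistent with a base64-encoded 16-byte MD5 digest. Assuming that is how they are produced, a value for a local zip could be computed as in this sketch (`md5_base64` is a hypothetical helper name):

```python
import base64
import hashlib


def md5_base64(path, chunk_size=1 << 20):
    """Return the base64-encoded MD5 digest of the file at `path`."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so large patch zips are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return base64.b64encode(digest.digest()).decode("ascii")
```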

Member

@mfielding mfielding left a comment

This change almost doubles the size of the file and I have trouble understanding what it's doing. Can the change be made while maintaining the existing structure?

Alternately, the whole thing could be reimplemented, but I'd hope to see unit tests to prove it works under the various if cases. Maybe actually modify the files in that case?

(Separately, we'll need all the patch categories, but I think you're already working on that)

@sahsagar-google
Collaborator Author

This change almost doubles the size of the file and I have trouble understanding what it's doing. Can the change be made while maintaining the existing structure?

Alternately, the whole thing could be reimplemented, but I'd hope to see unit tests to prove it works under the various if cases. Maybe actually modify the files in that case?

(Separately, we'll need all the patch categories, but I think you're already working on that)

@mfielding do you think the sample output below covers patch categories that work for us?
#375 (comment)

@google-oss-prow google-oss-prow bot added size/XXL and removed size/L labels Oct 23, 2025
@sahsagar-google sahsagar-google changed the title from "Refactor(gen_patch_params): Generalize OJVM combo patch parsing" to "refactor(tools): Improve patch parser logic and add unit tests" Oct 23, 2025
Fixing url regex and lower-casing boolean values in the output
@mfielding
Member

/retest

@mfielding
Member

/test oracle-toolkit-install-data-guard-on-gcp
