Improve extension upgrade regression test (addendum to #2364) #2377

Merged
MuhammadTahaNaveed merged 1 commit into apache:master from jrgemignani:update_upgrade_regression_tests on Apr 15, 2026

@jrgemignani
Contributor

Note: This PR was created with AI tools and a human.

This is an addendum to PR #2364 with three improvements.

Makefile:

  • Replace awk-based synthetic version (minor+1) with an _upgrade_test suffix (e.g., 1.7.0 -> 1.7.0_upgrade_test). The awk approach produced numeric versions like 1.8.0 that could collide with real future upgrade scripts, and the ::int[] cast in the SQL version lookup fails on non-numeric version strings. The _upgrade_test suffix avoids both issues and is unambiguously synthetic.
  • Extend the generated cleanup script to also remove repo-root copies of the synthetic files and to self-delete, preventing stale artifacts from accumulating across repeated test runs.

Regression test (regress/sql/age_upgrade.sql):

  • Simplify version lookup to directly select the _upgrade_test version via LIKE '%_upgrade_test' instead of picking the highest non-default version with string_to_array(version, '.')::int[] DESC. The old approach would fail with a cast error on the _upgrade_test suffix and was unnecessarily indirect — the test knows exactly what synthetic version the Makefile installed.
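
A minimal Python sketch of why a numeric sort key breaks on the suffixed version while a suffix match does not. The version list and the helper names `numeric_key` and `pick_synthetic` are hypothetical; the real test does this in SQL against the versions PostgreSQL reports for the extension.

```python
# Hypothetical list of installable versions; the real test reads them from PostgreSQL.
versions = ["1.7.0", "1.7.0_upgrade_test"]


def numeric_key(version):
    # Mirrors the idea of string_to_array(version, '.')::int[]:
    # every dot-separated part must parse as an integer.
    return [int(part) for part in version.split(".")]


def pick_synthetic(available):
    # Mirrors the idea of LIKE '%_upgrade_test': select on the suffix only.
    matches = [v for v in available if v.endswith("_upgrade_test")]
    return matches[0] if matches else None


try:
    sorted(versions, key=numeric_key, reverse=True)
    numeric_ok = True
except ValueError:
    # "0_upgrade_test" is not a valid integer, so the numeric ordering fails.
    numeric_ok = False

synthetic = pick_synthetic(versions)
print(numeric_ok, synthetic)
```

One difference from the SQL version: in `LIKE`, an unescaped `_` matches any single character, so the pattern is slightly looser than the exact `endswith` check used here.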

modified: Makefile
modified: regress/expected/age_upgrade.out
modified: regress/sql/age_upgrade.sql


Copilot AI left a comment


Pull request overview

This PR refines the extension upgrade regression test infrastructure introduced in #2364 by switching to a clearly synthetic “next” version suffix and updating the regression test’s version selection and cleanup behavior.

Changes:

  • Change synthetic “next” extension versioning from minor+1 to a _upgrade_test suffix to avoid collisions with real future versions.
  • Update the upgrade regression test to select the synthetic version via pattern matching instead of numeric array casting.
  • Extend the generated cleanup script to remove additional synthetic artifacts and self-delete.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| Makefile | Generates _upgrade_test synthetic version and expands cleanup script to remove more artifacts. |
| regress/sql/age_upgrade.sql | Updates lookup logic for selecting the synthetic upgrade target version. |
| regress/expected/age_upgrade.out | Adjusts expected output to match the updated version-lookup query. |


Comment thread regress/sql/age_upgrade.sql
Comment thread Makefile

Co-authored-by: Claude <noreply@anthropic.com>
@jrgemignani jrgemignani force-pushed the update_upgrade_regression_tests branch from 1fb1be3 to e6536e1 Compare April 14, 2026 20:56
@MuhammadTahaNaveed MuhammadTahaNaveed merged commit 1847644 into apache:master Apr 15, 2026
6 checks passed
@jrgemignani jrgemignani mentioned this pull request Apr 21, 2026
jrgemignani added a commit to jrgemignani/age that referenced this pull request Apr 22, 2026
The age_upgrade regression test (added in apache#2364, improved in apache#2377, apache#2397)
was designed to validate the upgrade template (age--<VER>--y.y.y.sql) by
creating graph data before the upgrade and verifying it survived afterward.
This approach had two fundamental problems:

1. It did not detect incomplete upgrade templates. The test verified that
   graph data (vertices, edges, checksums, GIN indexes) survived ALTER
   EXTENSION UPDATE, but never checked whether new SQL objects (functions,
   views, relations, indexes, types, operators, casts, constraints) were
   actually created by the template. A developer could add a new function
   to sql/ and sql_files, forget to add it to the upgrade template, and
   all tests would pass — the function existed via the fresh CREATE
   EXTENSION install that ran before the upgrade test, but would be
   missing for users who upgraded via ALTER EXTENSION UPDATE.

2. The data-integrity checks relied on cypher queries (MATCH/RETURN) within
   the same backend session after DROP EXTENSION + CREATE EXTENSION. This
   caused intermittent failures on some PostgreSQL versions where AGE's
   internal type cache (agtype OID) was not properly refreshed after the
   extension was dropped and recreated, resulting in 'type with OID 0
   does not exist' errors. The data-integrity aspect was also redundant —
   ALTER EXTENSION UPDATE runs DDL statements and does not touch heap data,
   so data survival is guaranteed by PostgreSQL and not a meaningful test.

The fix replaces the entire test with a comprehensive catalog comparison:

  1. Snapshot the ag_catalog schema from the fresh install across seven
     PostgreSQL system catalogs:
       - pg_proc: functions, aggregates, procedures (name, args, and
         properties: volatility, strictness, kind, return type, setof)
       - pg_class: tables, views, sequences, indexes (name, kind)
       - pg_type: types (name, type category)
       - pg_operator: operators (name, left/right operand types)
       - pg_cast: casts involving AGE types (source, target, context)
       - pg_opclass: operator classes (name, access method)
       - pg_constraint: constraints (name, type, table, referenced table)
  2. DROP EXTENSION, CREATE EXTENSION at the synthetic initial version,
     then ALTER EXTENSION UPDATE to the current version via the stamped
     upgrade template.
  3. Snapshot the catalog again after upgrade.
  4. Compare: any object present in the fresh snapshot but missing after
     upgrade means the template is incomplete. Any object present after
     upgrade but not in the fresh snapshot means the template creates
     something unexpected. Function properties (volatility, strictness,
     prokind, return type) are also compared for functions that exist in
     both — catching cases where a CREATE OR REPLACE in the template
     changes a function's signature or behavior.
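
The comparison in step 4 amounts to two set differences. A minimal Python sketch, assuming each snapshot has already been reduced to a set of identity strings; the entries and the helper `compare_snapshots` are illustrative, not taken from the test itself, which does this with SQL catalog queries.

```python
# Hypothetical snapshots: one identity string per catalog object.
fresh = {
    "function:ag_catalog.f(agtype)",
    "function:ag_catalog.g(agtype)",
    "table:ag_catalog.ag_graph",
}
upgraded = {
    "function:ag_catalog.f(agtype)",
    "function:ag_catalog.h()",
    "table:ag_catalog.ag_graph",
}


def compare_snapshots(fresh, upgraded):
    # In fresh but not after upgrade: the template forgot to create it.
    missing = sorted(fresh - upgraded)
    # After upgrade but not in fresh: the template created something unexpected.
    extra = sorted(upgraded - fresh)
    return missing, extra


missing, extra = compare_snapshots(fresh, upgraded)
for name in missing:
    print("MISSING", name)
for name in extra:
    print("EXTRA", name)
```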

Additional improvements from code review feedback:

  - Graph cleanup in Step 1 uses a DO block with PERFORM and suppressed
    NOTICEs to produce deterministic output regardless of prior test state.
  - The pg_class snapshot includes indexes (relkind 'i') in addition to
    tables, views, and sequences.
  - Diagnostic output includes relkind/typtype suffixes for actionable diffs.
  - Summary uses boolean equality checks (funcs_match, rels_match, etc.)
    instead of absolute counts, so the expected output does not need
    updating when new objects are added to AGE. Developers who correctly
    add objects to both sql/ and the template will never need to modify
    this test or its expected output.
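
A short Python sketch of why a boolean summary keeps the expected output stable where absolute counts would not. The key names `funcs_match` and `rels_match` come from the commit message; the data and the helper `summarize` are hypothetical.

```python
def summarize(fresh, upgraded):
    # One boolean per category: does the upgraded catalog equal the fresh one?
    return {
        f"{category}_match": fresh[category] == upgraded.get(category, set())
        for category in sorted(fresh)
    }


fresh = {"funcs": {"f()", "g()"}, "rels": {"ag_graph"}}
upgraded = {"funcs": {"f()", "g()"}, "rels": {"ag_graph"}}
before = summarize(fresh, upgraded)

# A developer adds a function to both sql/ and the upgrade template:
fresh["funcs"].add("new_func()")
upgraded["funcs"].add("new_func()")
after = summarize(fresh, upgraded)

# The same change with the template entry forgotten:
upgraded["funcs"].discard("new_func()")
broken = summarize(fresh, upgraded)

print(before, after, broken)
```

The summary is identical before and after the correct change, so the expected output needs no edit; only the forgotten-template case flips `funcs_match` to false.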

This approach:
  - Catches the actual failure mode: incomplete upgrade templates.
  - Covers all SQL object categories: functions (including aggregates),
    relations, types, operators, casts, operator classes, and constraints.
  - Detects property changes on existing functions (volatility, strictness,
    kind, return type changes).
  - Uses only plain SQL catalog queries — no cypher, no .so cache issues.
  - Works reliably across all PostgreSQL versions.
  - Reports the exact missing/extra/changed object in the diff output.
  - Is maintenance-free: no expected output changes needed when AGE grows.

Makefile: updated step 5 comment to reflect catalog comparison approach.

All 33 regression tests pass.

Co-authored-by: Claude <noreply@anthropic.com>

modified:   Makefile
modified:   regress/expected/age_upgrade.out
modified:   regress/sql/age_upgrade.sql
jrgemignani added a commit to jrgemignani/age that referenced this pull request Apr 25, 2026
jrgemignani added a commit to jrgemignani/age that referenced this pull request Apr 27, 2026
MuhammadTahaNaveed pushed a commit that referenced this pull request Apr 30, 2026
…on (#2403)

muhammadshoaib pushed a commit that referenced this pull request May 5, 2026
Fix upgrade test: allow function removal and detect more deficiencies.

The age_upgrade regression test (added in #2364, refined in #2377, #2397,
[...]) compares a fresh install and a synthetic-initial -> current upgrade.
Three gaps surfaced in practice:

1. Function removal forced permanent C stubs.
   The synthetic '_initial' install is built from a fixed historical
   commit. CREATE EXTENSION resolves every CREATE FUNCTION ... AS
   '$libdir/age', '<symbol>' via dlsym at install time when
   check_function_bodies is on (the default). If a developer retires a
   C entry point in HEAD's age.so, step 10 aborts with "could not find
   function ... in file age.so" -- even though the immediately-following
   ALTER EXTENSION UPDATE would DROP that SQL declaration. The only way
   to keep the test green was to leave a permanent error-raising stub
   in age.so, and to remember to add a DROP to the upgrade template.

2. Modifications were under-detected.
   The function-property-change query did not compare probin or prosrc,
   so a C function whose symbol was renamed in the upgrade template, or
   a SQL/plpgsql function whose body changed in either path, slipped
   through.

3. Extension membership was not checked.
   A template that CREATEs an object but never ALTER EXTENSION ADDs it
   leaves a row in pg_proc/pg_class but no pg_depend deptype='e' link.
   pg_dump --extension would diverge, but the existing per-catalog diff
   queries all returned 0 rows.
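
Gap 2 can be illustrated with a small Python sketch: a property diff that skips probin and prosrc reports nothing when only the C symbol differs. For C-language functions, pg_proc.prosrc holds the link symbol and pg_proc.probin the library file. The record type `FuncProps`, the helper `changed`, and the sample values are hypothetical.

```python
from typing import NamedTuple


class FuncProps(NamedTuple):
    volatility: str
    strict: bool
    kind: str
    rettype: str
    probin: str
    prosrc: str


OLD_FIELDS = ("volatility", "strict", "kind", "rettype")
NEW_FIELDS = OLD_FIELDS + ("probin", "prosrc")


def changed(fresh, upgraded, fields):
    # For functions present in both snapshots, list the fields that differ.
    report = []
    for sig in sorted(fresh.keys() & upgraded.keys()):
        diffs = [f for f in fields
                 if getattr(fresh[sig], f) != getattr(upgraded[sig], f)]
        if diffs:
            report.append((sig, diffs))
    return report


# Same signature and properties, but the upgrade path points at another symbol.
fresh = {"f(agtype)": FuncProps("i", True, "f", "agtype", "$libdir/age", "f_v2")}
upgraded = {"f(agtype)": FuncProps("i", True, "f", "agtype", "$libdir/age", "f_v1")}

old_report = changed(fresh, upgraded, OLD_FIELDS)
new_report = changed(fresh, upgraded, NEW_FIELDS)
print(old_report, new_report)
```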

Changes (regress/sql/age_upgrade.sql + regress/expected/age_upgrade.out):

* Step 10 wraps the synthetic CREATE EXTENSION in
  SET check_function_bodies = off; ... RESET check_function_bodies;
  Symbol resolution is deferred to call time. Step 11's ALTER EXTENSION
  UPDATE then DROPs any retired functions before any plan can call them.
  Step 35's fresh CREATE EXTENSION runs at the GUC default, so HEAD's
  sql/ <-> HEAD's age.so consistency is still enforced on the production
  install path.

* Steps 2 and 13 add probin and prosrc to the function snapshot.
  Step 21 reports probin and prosrc divergences alongside the existing
  property-change columns.

* Steps 7b and 18b add an extension-membership snapshot from
  pg_depend deptype='e' filtered to the AGE extension OID. Every member
  is labeled by stable identity (regprocedure, regtype, regoperator,
  opfname+strategy+types, etc.), never by raw OID, so OID drift between
  fresh and upgrade installs cannot produce false positives. Steps 33a
  and 33b report MISSING / EXTRA members. Step 34 adds extmembers_match
  to the summary row.

* Section-header step ranges updated to include the new sub-steps.
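
A minimal Python sketch of why the membership snapshot is keyed by stable identity rather than raw OID. The OIDs and identity strings are made up for illustration; the real snapshot is built in SQL from pg_depend.

```python
# (oid, identity) pairs for extension members; OIDs are assigned per install,
# so the same object gets a different OID on the fresh and upgrade paths.
fresh_members = [(16401, "function f(agtype)"), (16402, "type agtype")]
upgraded_members = [(24601, "function f(agtype)"), (24602, "type agtype")]

# Comparing raw OIDs flags every member, although nothing is actually missing.
missing_by_oid = ({oid for oid, _ in fresh_members}
                  - {oid for oid, _ in upgraded_members})

# Comparing identities reports only real differences.
missing_by_identity = ({ident for _, ident in fresh_members}
                       - {ident for _, ident in upgraded_members})

print(sorted(missing_by_oid), sorted(missing_by_identity))
```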

The change is fully self-contained: only regress/sql/age_upgrade.sql and
regress/expected/age_upgrade.out are modified. No production C, SQL,
build, or test files are touched. All 34 regression tests pass on a
clean tree.

Mutation-tested with 8 cases against the unmutated tree: baseline pass;
remove-function-with-DROP pass (no stub needed); remove-function-forget-
DROP fail; add-function-with-CREATE pass; add-function-forget-CREATE
fail; volatility-change-no-template fail; volatility-change-with-CREATE-
OR-REPLACE pass; C-symbol-rename-no-template fail. All eight expected
outcomes observed.

All 34 regression tests pass.

Co-authored-by: Claude <noreply@anthropic.com>

modified:   regress/expected/age_upgrade.out
modified:   regress/sql/age_upgrade.sql

3 participants