Skip to content

Refactor PostgresHook and associated runtime tests#66893

Open
SameerMesiah97 wants to merge 2 commits into
apache:mainfrom
SameerMesiah97:PostgresHook-Refactor
Open

Refactor PostgresHook and associated runtime tests#66893
SameerMesiah97 wants to merge 2 commits into
apache:mainfrom
SameerMesiah97:PostgresHook-Refactor

Conversation

@SameerMesiah97
Copy link
Copy Markdown
Contributor

Description

This change refactors psycopg-specific connection handling in PostgresHook and consolidates duplicated runtime test coverage shared between psycopg2 and psycopg3 implementations.

Connection setup responsibilities previously embedded directly inside get_conn have been extracted into dedicated helper methods. _get_cursor_config now centralizes cursor configuration handling, while _create_connection encapsulates driver-specific connection initialization, including adapter registration and notice handler setup.

The accompanying test suite has also been reorganized to reduce duplicated behavioral coverage. Shared runtime tests have been moved into _BasePostgresHookRuntimeTests, allowing psycopg2- and psycopg3-specific test classes to focus only on behavior that differs between implementations.

Rationale

PostgresHook.get_conn previously contained duplicated psycopg2- and psycopg3-specific connection setup logic. Maintaining separate inline code paths for cursor configuration, adapter registration, and notice handler initialization increases maintenance overhead and makes future changes harder to validate consistently across both drivers.

Extracting _get_cursor_config and _create_connection simplifies the control flow in get_conn, reduces branching complexity, and makes psycopg-specific behavior easier to reason about and extend independently.

The test suite had a similar issue, with substantial overlap between psycopg2 and psycopg3 runtime behavioral tests. Consolidating identical coverage into _BasePostgresHookRuntimeTests improves readability, reduces repeated CI execution for equivalent scenarios, and keeps implementation-specific tests scoped to behavior that actually differs.

Tests

  • Consolidated duplicated runtime behavioral tests into _BasePostgresHookRuntimeTests.
  • Updated psycopg2 and psycopg3 runtime test classes to inherit shared coverage.
  • Preserved existing psycopg2-specific lineage, execute_batch, and notice handling tests.
  • Preserved existing psycopg3-specific copy_expert and notice handling coverage.
  • Updated and reorganized existing tests where appropriate.

Backwards Compatibility

This change only refactors test code and does not modify production behavior or public APIs.

@eladkal eladkal requested review from Dev-iL and dabla May 14, 2026 05:00
@Dev-iL Dev-iL added the all versions If set, the CI build will be forced to use all versions of Python/K8S/DBs label May 14, 2026
Copy link
Copy Markdown
Collaborator

@Dev-iL Dev-iL left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you - this is a much-needed refactor I never got around to doing. Mostly LGTM with two inquiries/concerns.

conn_args["cursor_factory"] = self._get_cursor(raw_cursor)
if raw_cursor:
key, value = self._get_cursor_config(raw_cursor)
conn_args[key] = value
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like the PPG2 branch's handling of raw_cursor was kept:

raw_cursor = conn.extra_dejson.get("cursor")  # PPG3, default None
raw_cursor = conn.extra_dejson.get("cursor", False)  # PPG2, default False

Although both are falsy - are you sure this is safe?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe keeping the truthy check i.e. if raw_cursor should preserve existing behavior. Before the refactoring raw_cursor in the PPG2 path would default to False is no raw_cursor is found and the truthy check would prevent the cursor from being evaluated. The only difference now is that None is kept as None, which fails the truthy check. In other words, I see no need to change this.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm alright. I hope there was no good reason for that difference existing in the first place.

Comment on lines +252 to +254
return cast("CompatConnection", connection)

return cast("CompatConnection", ppg2_connect(**conn_args))
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seeing how you're already refactoring this, is there a way to avoid the casts? Do you think we still need CompatConnection?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have removed the cast the from both.

Sameer Mesiah added 2 commits May 14, 2026 19:32
…sycopg3. Consolidate duplicated insert/upsert and dialect tests into a shared base class while preserving version-specific behavior and lineage coverage.
@SameerMesiah97 SameerMesiah97 force-pushed the PostgresHook-Refactor branch from 6e2a98e to 2561091 Compare May 14, 2026 18:32
@SameerMesiah97 SameerMesiah97 requested a review from Dev-iL May 14, 2026 19:26
@SameerMesiah97
Copy link
Copy Markdown
Contributor Author

@Dev-iL

I have addressed your feedback.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

all versions If set, the CI build will be forced to use all versions of Python/K8S/DBs area:providers provider:postgres

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants