Skip to content

fix: (CDK) (Manifest Migration) - fix the request_body_json / request_body_data > request_body migration #521

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Apr 30, 2025

Conversation

bazarnov
Copy link
Contributor

@bazarnov bazarnov commented Apr 29, 2025

What

After this change merged, there is a need to adjust the actual schema and the routing logic expected by Form-Based UI.

How

  • add several new component type options to use within the new HttpRequester.request_body field:
    • RequestBodyUrlEncodedForm --> to encode as body_data
    • RequestBodyGraphQL + RequestBodyGraphQlQuery (as a value) --> to encode as body_json
    • RequestBodyJsonObject --> to encode as body_json
    • RequestBodyPlainText --> to encode as body_json
  • adjust the existing unit_tests to align with the new component types
  • adjust migration logic in airbyte_cdk.manifest_migrations.migrations.http_requester_request_body_json_data_to_request_body.py to correctly resolve the previous structure to use the new one instead.
  • correct the existing migration test to cover the cases.

Summary by CodeRabbit

  • New Features

    • Enhanced support for specifying HTTP request bodies with distinct types: plain text, URL-encoded form, JSON object, and GraphQL, enabling more precise and flexible API request configurations.
  • Bug Fixes

    • Improved migration logic to accurately convert legacy request body formats to the new, explicit request body types.
  • Tests

    • Expanded and updated test coverage to validate the new request body type handling and migration behavior.

@bazarnov bazarnov marked this pull request as ready for review April 30, 2025 09:01
Copy link
Contributor

coderabbitai bot commented Apr 30, 2025

📝 Walkthrough

Walkthrough

This set of changes refactors and expands the handling of HTTP request body types in the declarative component schema, migration logic, and related code. The migration for unifying request_body_json and request_body_data into a single request_body key now assigns more specific types based on the value structure, such as plain text, URL-encoded form, JSON object, or GraphQL. The schema and Pydantic models are updated to reflect these distinct types, and the request options provider and its tests are adjusted to handle and validate the new request body classes. Test fixtures are enhanced to cover the new migration logic and data structures.

Changes

File(s) Change Summary
airbyte_cdk/manifest_migrations/migrations/http_requester_request_body_json_data_to_request_body.py Refactored migration logic to use helper methods for granular type assignment when migrating request_body_json and request_body_data to a unified request_body key.
airbyte_cdk/manifest_migrations/migrations/registry.yaml Incremented migration version from 6.48.2 to 6.48.3.
airbyte_cdk/sources/declarative/declarative_component_schema.yaml Removed generic RequestBody definition; added four explicit request body types (PlainText, UrlEncodedForm, JsonObject, GraphQL); updated request_body property to accept these types.
airbyte_cdk/sources/declarative/models/declarative_component_schema.py Removed generic RequestBody class; added specific request body classes; updated HttpRequester model to accept a union of the new types.
airbyte_cdk/sources/declarative/requesters/request_options/interpolated_request_options_provider.py Updated type annotations and logic to handle multiple specific request body types; added error handling for unsupported types.
unit_tests/manifest_migrations/conftest.py Updated fixtures for migration version; added and adjusted test cases to cover new request body types and migration logic.
unit_tests/sources/declarative/requesters/request_options/test_interpolated_request_options_provider.py Updated tests to use new request body classes as inputs; added test for GraphQL request body; adjusted imports and parameters accordingly.

Sequence Diagram(s)

sequenceDiagram
    participant Manifest as Manifest
    participant Migration as HttpRequesterRequestBodyJsonDataToRequestBody
    participant Helper as Helper Methods

    Manifest->>Migration: Call migrate(manifest)
    Migration->>Manifest: Check for request_body_json or request_body_data
    alt request_body_json exists
        Migration->>Helper: _migrate_body_json(manifest, key)
        Helper->>Manifest: Assign request_body with type based on value (PlainText, GraphQL, JsonObject)
        Helper->>Manifest: Remove request_body_json key
    else request_body_data exists
        Migration->>Helper: _migrate_body_data(manifest, key)
        Helper->>Manifest: Assign request_body with type UrlEncodedForm
        Helper->>Manifest: Remove request_body_data key
    end
    Migration->>Manifest: Validate migration
Loading

Possibly related PRs

  • airbytehq/airbyte-python-cdk#485: Introduces the initial framework and class for manifest migration, which this PR directly builds upon and refines with granular request body type handling.

Suggested labels

declarative-component-schema

Suggested reviewers

  • lmossman
  • maxi297

Would it make sense to add a few more test cases for edge-case GraphQL payloads, just to be extra sure the new migration logic covers all scenarios—wdyt?


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 427875b and 976ea28.

📒 Files selected for processing (2)
  • airbyte_cdk/sources/declarative/declarative_component_schema.yaml (2 hunks)
  • airbyte_cdk/sources/declarative/models/declarative_component_schema.py (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • airbyte_cdk/sources/declarative/declarative_component_schema.yaml
  • airbyte_cdk/sources/declarative/models/declarative_component_schema.py
⏰ Context from checks skipped due to timeout of 90000ms (8)
  • GitHub Check: Check: 'source-pokeapi' (skip=false)
  • GitHub Check: Check: 'source-amplitude' (skip=false)
  • GitHub Check: Check: 'source-shopify' (skip=false)
  • GitHub Check: Check: 'source-hardcoded-records' (skip=false)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: SDM Docker Image Build
  • GitHub Check: Pytest (Fast)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
✨ Finishing Touches
  • 📝 Generate Docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

‼️ IMPORTANT
Auto-reply has been disabled for this repository in the CodeRabbit settings. The CodeRabbit bot will not respond to your replies unless it is explicitly tagged.

  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (5)
airbyte_cdk/manifest_migrations/migrations/http_requester_request_body_json_data_to_request_body.py (1)

44-61: New helper method for JSON body migration

This new method determines the appropriate type for JSON bodies based on their structure:

  • Strings are treated as RequestBodyPlainText
  • Dictionaries with a query key are treated as RequestBodyGraphQL
  • Other dictionaries are treated as RequestBodyJsonObject

One consideration: when checking for GraphQL queries, it only looks for the existence of a query key. Would it be more robust to also check the structure or type of the query value to avoid false positives? wdyt?

airbyte_cdk/sources/declarative/declarative_component_schema.yaml (4)

4079-4092: Add examples for plain-text payload and unify naming
The RequestBodyPlainText schema is clear, but it lacks an examples section. Could we add a snippet like:

examples:
  - type: RequestBodyPlainText
    value: "some plain text"

Also, since the old examples use RequestBodyData, should we rename either the definition or the examples to keep them aligned? wdyt?


4093-4106: Clarify URL-encoded form payload shape with examples
For RequestBodyUrlEncodedForm, it might help to include an example object, e.g.:

examples:
  - type: RequestBodyUrlEncodedForm
    value:
      field1: "a"
      field2: "b"

Would that make the intended structure clearer? wdyt?


4108-4120: Consider simplifying RequestBodyJsonObject name or adding examples
RequestBodyJsonObject is explicit but slightly verbose compared to the previous RequestBodyJson naming. Would you consider renaming it to RequestBodyJson? If we keep the current name, could we at least add an example like:

examples:
  - type: RequestBodyJsonObject
    value:
      key: "value"
      nested:
        subkey: 123

wdyt?


4122-4134: Unify GraphQL naming and add usage example
The GraphQL schema uses RequestBodyGraphQL and references RequestBodyGraphQlQuery (mixed casing). Would renaming the query definition to RequestBodyGraphQLQuery improve clarity? Also, could we include an example under RequestBodyGraphQL such as:

examples:
  - type: RequestBodyGraphQL
    value:
      query:
        query: "{ users { id name } }"

wdyt?

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a1dd40b and 427875b.

📒 Files selected for processing (7)
  • airbyte_cdk/manifest_migrations/migrations/http_requester_request_body_json_data_to_request_body.py (1 hunks)
  • airbyte_cdk/manifest_migrations/migrations/registry.yaml (1 hunks)
  • airbyte_cdk/sources/declarative/declarative_component_schema.yaml (2 hunks)
  • airbyte_cdk/sources/declarative/models/declarative_component_schema.py (3 hunks)
  • airbyte_cdk/sources/declarative/requesters/request_options/interpolated_request_options_provider.py (3 hunks)
  • unit_tests/manifest_migrations/conftest.py (20 hunks)
  • unit_tests/sources/declarative/requesters/request_options/test_interpolated_request_options_provider.py (3 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (2)
airbyte_cdk/sources/declarative/requesters/request_options/interpolated_request_options_provider.py (2)
airbyte_cdk/sources/declarative/models/declarative_component_schema.py (4)
  • RequestBodyGraphQL (1926-1928)
  • RequestBodyJsonObject (1512-1514)
  • RequestBodyPlainText (1502-1504)
  • RequestBodyUrlEncodedForm (1507-1509)
airbyte_cdk/sources/streams/http/http.py (2)
  • request_body_data (212-227)
  • request_body_json (229-240)
airbyte_cdk/manifest_migrations/migrations/http_requester_request_body_json_data_to_request_body.py (3)
airbyte_cdk/manifest_migrations/manifest_migration.py (1)
  • validate (68-73)
airbyte_cdk/manifest_migrations/migrations/http_requester_path_to_url.py (1)
  • validate (49-57)
airbyte_cdk/manifest_migrations/migrations/http_requester_url_base_to_url.py (1)
  • validate (33-41)
⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (Fast)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
🔇 Additional comments (41)
airbyte_cdk/manifest_migrations/migrations/registry.yaml (1)

6-6: Version bump for the new migration looks good!

This change increments the version from 6.48.2 to 6.48.3, corresponding to the enhanced HTTP request body migration logic updates. The version bump is necessary to track the migration changes.

unit_tests/sources/declarative/requesters/request_options/test_interpolated_request_options_provider.py (20)

7-13: Updated imports to support the new request body types

The imports now include all the necessary request body type models to support the refactored approach to handling HTTP request bodies. These align with the model changes in the declarative component schema.


146-149: Test now uses the typed RequestBodyJsonObject instead of raw dict

This test case has been updated to use the new typed request body class. The test still validates the same functionality but now uses the proper model structure.


153-157: Test updated with typed model correctly preserving test context

The test has been refactored to use the RequestBodyJsonObject while maintaining the original test's meaning and assertions.


161-165: Test now properly uses typed request body

Good update to use the structured RequestBodyJsonObject with appropriate type annotation.


169-171: Test updated correctly to use RequestBodyJsonObject

This refactoring preserves the test's functionality while adapting it to the new type system.


176-179: Updated to use the structured RequestBodyJsonObject

The test continues to validate the interpolation of keys but now uses the proper request body model structure.


182-213: Tests properly updated to use typed request body models

All these test cases have been correctly migrated to use the explicit RequestBodyJsonObject model instead of raw dictionaries, preserving the same test assertions and validation logic.


216-219: Test now uses RequestBodyPlainText for string content

Good choice to use RequestBodyPlainText for the test case that involves a JSON string literal. This matches the expected behavior where string literals are interpreted as plain text.


224-226: Nested object case properly updated

This test case for nested objects has been correctly migrated to use the typed model.


231-234: Test for interpolated keys in nested objects updated correctly

The test has been refactored to use the typed request body model while preserving the test's assertions.


237-246: New test case for GraphQL queries - great addition!

This new test case ensures that GraphQL queries are properly handled and transformed into the expected format with a query field. It's a good addition to cover the new GraphQL support.


251-251: Added return type annotation for consistency

Added type hint for the function return type, which improves code readability and type safety. This is a good practice!


254-254: Updated parameter to use the unified request_body field

The provider constructor now takes a request_body parameter instead of separate JSON and data parameters, which aligns with the new unified approach.


307-310: Test updated to use RequestBodyUrlEncodedForm

The test for form data has been updated to use the appropriate RequestBodyUrlEncodedForm model, preserving the test's validation logic.


315-318: Form test correctly updated to use typed model

This test case for URL-encoded form data with stream slice dependencies has been properly updated.


323-326: Config-dependent form data test updated correctly

The test case has been migrated to use the typed form data model appropriately.


329-336: Empty dict test case updated for form data

The test for default empty dictionary behavior has been correctly migrated to use an empty value object with the proper type.


339-344: Form data interpolation test properly updated

The test with interpolated keys has been refactored to use the typed model, preserving the original test's assertions.


351-351: Added return type annotation for consistency

Type hint added to match the similar change made to the JSON test function. Consistent typing improves code quality.


354-354: Updated parameter to use the unified request_body field

The provider constructor now uses the unified request_body parameter, matching the pattern used in the JSON tests.

airbyte_cdk/sources/declarative/requesters/request_options/interpolated_request_options_provider.py (3)

9-14: Updated imports to include all request body type models

The imports now include all the specific request body types that will be handled by the provider, improving type safety and code clarity.


47-54: Enhanced type annotation with explicit request body types

The type annotation for request_body now explicitly lists all supported request body types, which makes the code more maintainable and enables better IDE autocompletion and type checking.


96-107: Improved request body resolution with explicit type handling

The _resolve_request_body method now handles each request body type explicitly and raises errors for unsupported types. The new implementation:

  1. Handles URL-encoded forms (mapping to request_body_data)
  2. Handles GraphQL queries with a special structure
  3. Handles both JSON objects and plain text
  4. Includes proper error handling for unsupported types

This is a good improvement that aligns with the changes to the declarative component schema.

airbyte_cdk/manifest_migrations/migrations/http_requester_request_body_json_data_to_request_body.py (4)

36-36: Delegated JSON body migration to a helper method

The migration logic now delegates to a specialized helper method for JSON bodies. This improves code organization and makes the migration more maintainable.


38-38: Delegated form data migration to a helper method

Similar to the JSON case, form data migration is now handled by a specialized helper method, making the code more maintainable.


62-67: New helper method for form data migration

This method handles the migration of form data to the new typed structure, using RequestBodyUrlEncodedForm as the type. Simple and clear implementation.


68-76: Shared helper for value migration with good reuse

This shared helper method sets the new key with the appropriate type and value, and removes the old key. This reduces code duplication and makes the migration logic more maintainable.

airbyte_cdk/sources/declarative/models/declarative_component_schema.py (6)

1502-1505: New class RequestBodyPlainText looks good!

This new model for plain text request bodies with string values is a nice clean implementation. It effectively replaces part of the functionality of the deprecated generic RequestBody.


1507-1510: RequestBodyUrlEncodedForm implementation looks solid

This model for URL-encoded form data properly uses a dictionary of string key-value pairs, which matches how form data is typically structured.


1512-1515: RequestBodyJsonObject implementation is well-defined

Using Dict[str, Any] for the value field is appropriate here as JSON objects can contain values of various types.


1517-1521: GraphQL query model with flexible configuration

The extra configuration with extra = Extra.allow makes sense for GraphQL queries which can have varying structures. This allows for flexibility in the query format.


1926-1929: RequestBodyGraphQL wrapper implementation looks good

This wrapper model correctly encapsulates the GraphQL query as its value, providing a clean type-safe approach to handling GraphQL requests.


2328-2336: Good implementation of the unified request_body field

The Union type correctly includes all the new request body types, providing type safety while maintaining flexibility. The deprecation of the old fields with clear migration messages is a good practice.

Should we consider adding validation to ensure that when users migrate from the deprecated request_body_json and request_body_data fields, they correctly choose the appropriate new type? Maybe through documentation or examples? Just a thought, wdyt?

unit_tests/manifest_migrations/conftest.py (7)

200-200: Version number updates to 6.48.3

Good consistency in updating all version references to match the new version.

Also applies to: 497-497, 503-503, 515-515, 835-835, 1198-1198, 1204-1204, 1210-1210


641-645: Dictionary-based request_body_data test case

Nice test case for migrating a dictionary-based request_body_data with interpolation. This will ensure the migration correctly handles complex URL-encoded form data including configuration values.


668-668: String-based request_body_data test case

Good test case for migrating a string-based request_body_data. This covers an important use case where form data is provided as a pre-formatted string.


740-745: GraphQL request_body_json test case

Excellent addition of a GraphQL query test case. This ensures the migration correctly identifies and converts GraphQL queries to the new typed format.

I'm curious about how the migration determines whether a JSON object is a GraphQL query - is it just checking for the presence of a "query" field? This could potentially match non-GraphQL payloads that happen to have a "query" field. Might be worth considering a more robust detection mechanism if possible, wdyt?


850-856: Expected migration results with appropriate types

The expected migration results correctly use the new typed request body models:

  • RequestBodyUrlEncodedForm for form data (both dict and string formats)
  • RequestBodyJsonObject for JSON objects
  • RequestBodyPlainText for string JSON content
  • RequestBodyGraphQL for GraphQL queries

This comprehensive coverage ensures the migration logic handles all relevant cases.

Also applies to: 883-885, 912-918, 945-947, 974-981


999-999: Empty requesters after migration

Good to see that the migration properly handles the requester_A reference by excluding request body fields at the requester definition level and only including them at the stream level where they're actually used.


1195-1215: Well-structured metadata for applied migrations

The metadata section properly records all applied migrations with consistent versioning. This history will be valuable for debugging and understanding the evolution of manifests.

@bazarnov bazarnov merged commit a5273a5 into main Apr 30, 2025
26 of 28 checks passed
@bazarnov bazarnov deleted the baz/cdk/fix-migration-to-request-body branch April 30, 2025 10:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants