-
Notifications
You must be signed in to change notification settings - Fork 29
feat(default-error-handler): Add max_time
parameter to DefaultErrorHandler
#648
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
👋 Greetings, Airbyte Team Member!Here are some helpful tips and reminders for your convenience. Testing This CDK VersionYou can test this version of the CDK using the following: # Run the CLI from this branch:
uvx 'git+https://github.com/airbytehq/airbyte-python-cdk.git@tolik0/default-error-handler/add-max-time-parameter#egg=airbyte-python-cdk[dev]' --help
# Update a connector to use the CDK from this branch ref:
cd airbyte-integrations/connectors/source-example
poe use-cdk-branch tolik0/default-error-handler/add-max-time-parameter Helpful ResourcesPR Slash CommandsAirbyte Maintainers can execute the following slash commands on your PR:
|
📝 WalkthroughWalkthroughA new Changes
✨ Finishing Touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🔭 Outside diff range comments (1)
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (1)
2268-2275
:max_time
is Optional here butDefaultErrorHandler
expectsint
– mypy is red-flagging
model.max_time
is declared asOptional[int]
in the generated model, but theDefaultErrorHandler
constructor still types this argument as plainint
. PassingNone
therefore violates the signature and breaks the linters (see pipeline error).Would it make sense to coerce the value before the call (e.g. fall back to the same default the schema uses) or, alternatively, amend the
DefaultErrorHandler
signature to acceptOptional[int]
? For the quick fix on this call-site, something like the diff below keeps static typing happy while preserving behaviour – wdyt?- max_time=model.max_time, + max_time=model.max_time if model.max_time is not None else 600,(Replace
600
withDefaultErrorHandler.DEFAULT_MAX_TIME
if such a constant exists.)
🧹 Nitpick comments (3)
airbyte_cdk/sources/declarative/requesters/error_handlers/default_error_handler.py (2)
89-95
: Makemax_time
optional for symmetry withmax_retries
, wdyt?
max_retries
is declared asOptional[int]
, suggesting callers may omit it.
max_time
is declared as a plainint
, yet it also has a sensible default.
Declaring it asOptional[int] = 60 * 10
(and reflecting that in the docstring) would keep the public surface consistent and avoid mypy/IDE hints that the argument is mandatory.
101-104
: Is the private field_max_time
still needed or could we rely onmax_time
directly?
max_time
and_max_time
currently hold identical default values, but the latter is never read inside this class.
Duplicating state can drift over time and complicate maintenance. Would dropping_max_time
(or turning it into a cached/derived property) simplify things? wdyt?airbyte_cdk/sources/declarative/declarative_component_schema.yaml (1)
1739-1743
: Consider allowing interpolated values formax_time
, similar to other numeric settings, wdyt?Most numeric knobs in the schema (e.g.,
backoff_time_in_seconds
at line 469 andfactor
at line 1834) accept either aninteger
or astring
with interpolation context so connector authors can surface the value fromconfig
.
Would doing the same here improve consistency and flexibility?max_time: title: Max cumulative retry wait time (seconds) description: The maximum total time (in seconds) spent waiting between retries, that is the sum of all backoff intervals, before giving up and failing. - type: integer + anyOf: + - type: integer # literal seconds + - type: string # "{{ config['max_retry_time'] }}" + interpolation_context: + - config default: 600This keeps the default numeric but unlocks templating without a breaking change.
Happy to hear your thoughts!
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
airbyte_cdk/sources/declarative/declarative_component_schema.yaml
(1 hunks)airbyte_cdk/sources/declarative/models/declarative_component_schema.py
(1 hunks)airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py
(1 hunks)airbyte_cdk/sources/declarative/requesters/error_handlers/default_error_handler.py
(1 hunks)
🧰 Additional context used
🧠 Learnings (3)
airbyte_cdk/sources/declarative/models/declarative_component_schema.py (2)
undefined
<retrieved_learning>
Learnt from: pnilan
PR: airbytehq/airbyte-python-cdk#0
File: :0-0
Timestamp: 2024-12-11T16:34:46.319Z
Learning: In the airbytehq/airbyte-python-cdk repository, the declarative_component_schema.py
file is auto-generated from declarative_component_schema.yaml
and should be ignored in the recommended reviewing order.
</retrieved_learning>
<retrieved_learning>
Learnt from: ChristoGrab
PR: #58
File: airbyte_cdk/sources/declarative/yaml_declarative_source.py:0-0
Timestamp: 2024-11-18T23:40:06.391Z
Learning: When modifying the YamlDeclarativeSource
class in airbyte_cdk/sources/declarative/yaml_declarative_source.py
, avoid introducing breaking changes like altering method signatures within the scope of unrelated PRs. Such changes should be addressed separately to minimize impact on existing implementations.
</retrieved_learning>
airbyte_cdk/sources/declarative/declarative_component_schema.yaml (1)
undefined
<retrieved_learning>
Learnt from: ChristoGrab
PR: #58
File: airbyte_cdk/sources/declarative/yaml_declarative_source.py:0-0
Timestamp: 2024-11-18T23:40:06.391Z
Learning: When modifying the YamlDeclarativeSource
class in airbyte_cdk/sources/declarative/yaml_declarative_source.py
, avoid introducing breaking changes like altering method signatures within the scope of unrelated PRs. Such changes should be addressed separately to minimize impact on existing implementations.
</retrieved_learning>
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (2)
undefined
<retrieved_learning>
Learnt from: aaronsteers
PR: #174
File: airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py:1093-1102
Timestamp: 2025-01-14T00:20:32.310Z
Learning: In the airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py
file, the strict module name checks in _get_class_from_fully_qualified_class_name
(requiring module_name
to be "components" and module_name_full
to be "source_declarative_manifest.components") are intentionally designed to provide early, clear feedback when class declarations won't be found later in execution. These restrictions may be loosened in the future if the requirements for class definition locations change.
</retrieved_learning>
<retrieved_learning>
Learnt from: ChristoGrab
PR: #58
File: airbyte_cdk/sources/declarative/yaml_declarative_source.py:0-0
Timestamp: 2024-11-18T23:40:06.391Z
Learning: When modifying the YamlDeclarativeSource
class in airbyte_cdk/sources/declarative/yaml_declarative_source.py
, avoid introducing breaking changes like altering method signatures within the scope of unrelated PRs. Such changes should be addressed separately to minimize impact on existing implementations.
</retrieved_learning>
🧬 Code Graph Analysis (2)
airbyte_cdk/sources/declarative/models/declarative_component_schema.py (3)
airbyte_cdk/sources/streams/http/http.py (1)
max_time
(149-153)airbyte_cdk/sources/declarative/requesters/error_handlers/composite_error_handler.py (1)
max_time
(56-57)airbyte_cdk/sources/streams/http/error_handlers/http_status_error_handler.py (1)
max_time
(45-46)
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (4)
airbyte_cdk/sources/streams/http/http.py (1)
max_time
(149-153)airbyte_cdk/sources/declarative/requesters/error_handlers/composite_error_handler.py (1)
max_time
(56-57)airbyte_cdk/sources/streams/http/error_handlers/error_handler.py (1)
max_time
(26-30)airbyte_cdk/sources/streams/http/error_handlers/http_status_error_handler.py (1)
max_time
(45-46)
🪛 GitHub Actions: Linters
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py
[error] 2270-2270: mypy: Argument "max_time" to "DefaultErrorHandler" has incompatible type "int | None"; expected "int" [arg-type]
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
- GitHub Check: Check: source-shopify
- GitHub Check: Check: source-intercom
- GitHub Check: Check: destination-motherduck
- GitHub Check: Check: source-pokeapi
- GitHub Check: Check: source-hardcoded-records
- GitHub Check: Pytest (Fast)
- GitHub Check: SDM Docker Image Build
- GitHub Check: Pytest (All, Python 3.10, Ubuntu)
- GitHub Check: Pytest (All, Python 3.11, Ubuntu)
- GitHub Check: Analyze (python)
🔇 Additional comments (2)
airbyte_cdk/sources/declarative/requesters/error_handlers/default_error_handler.py (1)
135-149
:max_time
is defined but never enforced – missing guard inbackoff_time
?The new parameter promises “maximum cumulative wait time across retries”, yet
backoff_time()
only delegates to the strategies and returns their value; cumulative elapsed time is never tracked.Could you confirm whether the parent
ErrorHandler
or the caller layer caps the total wait?
If not, perhaps we need bookkeeping similar to the existing_last_request_to_attempt_count
, e.g.:@@ for backoff_strategy in self.backoff_strategies: backoff = backoff_strategy.backoff_time( response_or_exception=response_or_exception, attempt_count=attempt_count ) if backoff: - return backoff + if self._max_time is None: + return backoff + self._elapsed_time += backoff + if self._elapsed_time > self._max_time: + return None # give up + return backoff(or any equivalent approach).
Let me know if this lives elsewhere; otherwise adding the guard would keep the runtime behaviour aligned with the new config knob. wdyt?airbyte_cdk/sources/declarative/models/declarative_component_schema.py (1)
1939-1943
: LGTM! The newmax_time
field looks well-implemented.The field definition follows the same pattern as the existing
max_retries
field above it, with a reasonable default of 600 seconds that matches the timeout I see in the HTTP stream implementation. The description clearly explains that this represents cumulative wait time across all retry attempts, which complements the existing retry count limit nicely. Since this is auto-generated from the YAML schema, the implementation looks mechanically correct - wdyt?
Resolves: https://github.com/airbytehq/oncall/issues/7994
Summary by CodeRabbit
New Features
Documentation