-
Notifications
You must be signed in to change notification settings - Fork 30
fix: make properties from endpoint cache thread safe #840
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
👋 Greetings, Airbyte Team Member!Here are some helpful tips and reminders for your convenience. Testing This CDK VersionYou can test this version of the CDK using the following: # Run the CLI from this branch:
uvx 'git+https://github.com/airbytehq/airbyte-python-cdk.git@maxi297/fix_properties_from_endpoint_cache#egg=airbyte-python-cdk[dev]' --help
# Update a connector to use the CDK from this branch ref:
cd airbyte-integrations/connectors/source-example
poe use-cdk-branch maxi297/fix_properties_from_endpoint_cacheHelpful ResourcesPR Slash CommandsAirbyte Maintainers can execute the following slash commands on your PR:
|
|
/autofix
|
📝 WalkthroughWalkthroughAdds thread-safety to the Changes
Sequence Diagram(s)sequenceDiagram
autonumber
actor T1 as Thread 1
actor T2 as Thread 2
participant C as PropertiesFromEndpoint
par Concurrent Access
T1->>C: Get cached properties (check 1)
T2->>C: Get cached properties (check 1)
and
Note over T1,T2: Both see cache uninitialized
end
par Lock Acquisition
T1->>C: Acquire lock
T2->>C: Wait for lock
and
T1->>C: Check cache again (check 2)
T1->>C: Populate cache
T1->>C: Release lock
end
T2->>C: Acquire lock
T2->>C: Check cache (already populated)
T2->>C: Return cached value
T2->>C: Release lock
Note over C: Double-checked locking<br/>prevents race conditions
Estimated code review effort🎯 2 (Simple) | ⏱️ ~8 minutes
Possibly related PRs
Suggested reviewers
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
airbyte_cdk/sources/declarative/requesters/query_properties/properties_from_endpoint.py (1)
28-28: Good lock initialization, though Lock might suffice?The
RLock(reentrant lock) correctly enables thread safety. Since there's no recursive acquisition pattern inget_properties_from_endpoint, would a simplethreading.Lock()work just as well with slightly less overhead, or do you foresee a need for reentrancy? wdyt?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
airbyte_cdk/sources/declarative/requesters/query_properties/properties_from_endpoint.py(2 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
airbyte_cdk/sources/declarative/requesters/query_properties/properties_from_endpoint.py (2)
airbyte_cdk/sources/declarative/interpolation/interpolated_string.py (1)
InterpolatedString(13-79)airbyte_cdk/sources/declarative/retrievers/retriever.py (1)
Retriever(14-58)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (13)
- GitHub Check: Check: source-intercom
- GitHub Check: Check: source-pokeapi
- GitHub Check: Check: destination-motherduck
- GitHub Check: Check: source-shopify
- GitHub Check: Check: source-hardcoded-records
- GitHub Check: Pytest (All, Python 3.11, Ubuntu)
- GitHub Check: Pytest (Fast)
- GitHub Check: Pytest (All, Python 3.13, Ubuntu)
- GitHub Check: Pytest (All, Python 3.12, Ubuntu)
- GitHub Check: Pytest (All, Python 3.10, Ubuntu)
- GitHub Check: SDM Docker Image Build
- GitHub Check: Manifest Server Docker Image Build
- GitHub Check: Analyze (python)
🔇 Additional comments (2)
airbyte_cdk/sources/declarative/requesters/query_properties/properties_from_endpoint.py (2)
2-10: LGTM! Clean import additions.The
threadingimport and import reorganization support the thread-safety changes well.
34-44: Code is thread-safe as implemented.Good news: your double-checked locking pattern already handles the concern. Here's why:
_get_propertyis thread-safe: it only reads fromselfand performs read-only operations on the parameter.read_recordsis protected: While Airbyte CDK'sread_recordsisn't guaranteed to be thread-safe, your lock ensures only one thread calls it during initialization. Subsequent calls simply return the cached result without re-acquiring the lock—no synchronization needed there.- Immutable initialization parameter:
records_schema={}andstream_slice=Noneare safe read-only inputs.The pattern correctly follows the "one instance per thread or explicit synchronization" guidance that the CDK recommends. Nice implementation!
PyTest Results (Full)3 820 tests 3 808 ✅ 10m 44s ⏱️ Results for commit b0001b8. |
What
Given that multiple threads for the same stream starts at the same time, we do load the properties from endpoint cache multiple times
Summary by CodeRabbit