-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Zep PostgresDatasource returns a list of batches. #6341
Conversation
✅ Deploy Preview for niobium-lead-7998 ready!
To edit notification comments on pull requests, go to your Netlify site settings. |
👇 Click on the image for a new way to code review
Legend |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will go back and review it in more detail. But I don't see any issues based on the config work that I'm doing.
def _valid_batch_request_options(self, options: BatchRequestOptions) -> bool: | ||
return set(options.keys()).issubset( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Edit
On closer look, I don't think my suggestion below makes sense given how open batch_request_options_template
is.
Leaving the comment up in-case it inspires someone to try this technique elsewhere.
This may not be that useful. But we might be to do 2 things to improve the type narrowing here.
- Define a more specific
PostgresBatchOptions
TypedDict
- Make this
_valid_batch_request_options
method aTypeGuard
for thePostgresBatchOptions
type.
I'm not totally sure how a TypedDict
works with extra keys though.
from typing_extensions import TypedDict, TypeGuard
class PostgresBatchOptions(TypedDict):
foo: str
bar: int
def _valid_batch_request_options(self, options: BatchRequestOptions) -> TypeGaurd[PostgresBatchOptions]:
return set(options.keys()).issubset(
set(self.batch_request_options_template().keys())
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Extra arguments are not supported in TypedDicts
. Here's the github issue about it with some interesting discussion: python/mypy#4617
Originally I had implemented BatchOptions
using generics and had it more strongly typed. I abandoned that after getting some UX feedback and have reverted it to this untyped dict which I'm validated at runtime for a particular DataAsset. To help with the pain of "what can I put here" I've made the batch_request_options_template
method. I would be nice to make better checks in static analysis but I'm not sure how to do that with the current requirements.
I am not familiar with TypeGuard
s and will look into that.
default_option_values.append( | ||
self.column_splitter.param_defaults[option] | ||
) | ||
for option_values in itertools.product(*default_option_values): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I need to use itertools
more 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was going to implement this but then did some searching. Happy to have found it!
* develop: [MAINTENANCE] Update column_reflection_fallback to also use schema name for Trino (#6350) [MAINTENANCE] Misc cleanup of GX Cloud helpers (#6352) [MAINTENANCE] Pin `mypy` to `0.990` (#6361) Subject: Support to include ID/PK in validation result for each row t… (#5876) [MAINTENANCE] Fix computed metrics type hint in ExecutionEngine.resolve_metrics() method (#6347) [MAINTENANCE] Refactor `usage_stats_opt_out` method in DataContext (#5339) Zep PostgresDatasource returns a list of batches. (#6341) [DOCS] Add `yarn snippet-check` command (#6351) [MAINTENANCE] Refactor out `termcolor` dependency (#6348) [MAINTENANCE] Move Cloud-specific enums to `cloud_constants.py` (#6349)
Previously
get_batch_list_from_batch_request
always sent out a list of 1 item since the batch request had to fully specify a batch. The PR allows one to partial specify a batch request and get a list of batches that match that criteria. Note, along any dimension, one can set it exactly or leave it unset (eg, you can see all months in a year). Functionality to filter on a dimension (eg, just the last 3 months of a year) is coming in a later PR.As part of this work I removed the nascent namespacing support for batch parameters. These were unused and not fully thought out making this work harder than necessary. We can add namespacing back in a better way when we figure out the use case better.
Thank you for submitting!