-
Notifications
You must be signed in to change notification settings - Fork 111
[PECOBLR-330] Support for complex params #559
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Thanks for your contribution! To satisfy the DCO policy in our contributing guide every commit message must include a sign-off message. One or more of your commits is missing this message. You can reword previous commit messages with an interactive rebase ( |
Thanks for your contribution! To satisfy the DCO policy in our contributing guide every commit message must include a sign-off message. One or more of your commits is missing this message. You can reword previous commit messages with an interactive rebase ( |
Thanks for your contribution! To satisfy the DCO policy in our contributing guide every commit message must include a sign-off message. One or more of your commits is missing this message. You can reword previous commit messages with an interactive rebase ( |
Thanks for your contribution! To satisfy the DCO policy in our contributing guide every commit message must include a sign-off message. One or more of your commits is missing this message. You can reword previous commit messages with an interactive rebase ( |
Thanks for your contribution! To satisfy the DCO policy in our contributing guide every commit message must include a sign-off message. One or more of your commits is missing this message. You can reword previous commit messages with an interactive rebase ( |
Thanks for your contribution! To satisfy the DCO policy in our contributing guide every commit message must include a sign-off message. One or more of your commits is missing this message. You can reword previous commit messages with an interactive rebase ( |
3026b6f
to
37c89b8
Compare
Thanks for your contribution! To satisfy the DCO policy in our contributing guide every commit message must include a sign-off message. One or more of your commits is missing this message. You can reword previous commit messages with an interactive rebase ( |
Thanks for your contribution! To satisfy the DCO policy in our contributing guide every commit message must include a sign-off message. One or more of your commits is missing this message. You can reword previous commit messages with an interactive rebase ( |
Thanks for your contribution! To satisfy the DCO policy in our contributing guide every commit message must include a sign-off message. One or more of your commits is missing this message. You can reword previous commit messages with an interactive rebase ( |
Thanks for your contribution! To satisfy the DCO policy in our contributing guide every commit message must include a sign-off message. One or more of your commits is missing this message. You can reword previous commit messages with an interactive rebase ( |
Thanks for your contribution! To satisfy the DCO policy in our contributing guide every commit message must include a sign-off message. One or more of your commits is missing this message. You can reword previous commit messages with an interactive rebase ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR adds support for complex parameters by introducing handling for ARRAY and MAP types in parameter binding and query escaping. Key changes include:
- New tests for ARRAY and MAP parameters and recursive complex type handling.
- Updates to parameter conversion functions to support new complex types.
- Adjustments in SQL escaping functions and client queries to use ARRAY(…) and MAP(…) syntax.
Reviewed Changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 1 comment.
Show a summary per file
File | Description |
---|---|
tests/unit/test_thrift_backend.py | Minor formatting updates for readability in test helpers. |
tests/unit/test_parameters.py | Added tests for ARRAY and MAP parameters and updated parameter enums. |
tests/unit/test_param_escaper.py | Updated SQL escaping for sequences and mappings to new ARRAY/MAP syntax. |
tests/unit/test_client.py | Updated expected query outputs to match new ARRAY syntax. |
tests/e2e/test_parameterized_queries.py | Extended inline and native parameter tests with complex types and qualified table names. |
tests/e2e/test_complex_types.py | Added tests for new complex type fields in tables. |
src/databricks/sql/utils.py | Modified SQL escape functions and adjusted time format to include timezone. |
src/databricks/sql/parameters/native.py | Introduced ArrayParameter and MapParameter with corresponding TSpark conversions. |
src/databricks/sql/parameters/init.py | Exported new parameter types. |
src/databricks/sql/client.py | Minor addition to support new TSpark parameters. |
Comments suppressed due to low confidence (1)
src/databricks/sql/utils.py:432
- The updated TIME_FORMAT now includes a timezone ('%z'), which could lead to formatting issues if datetime objects do not provide timezone data. Ensure that datetime values passed to escape_datetime have the correct timezone info or are adjusted accordingly.
_TIME_FORMAT = "%H:%M:%S.%f %z"
INSERT_QUERY = f"INSERT INTO pysql_e2e_inline_param_test_table (`{target_column}`) VALUES (%(p)s)" | ||
SELECT_QUERY = f"SELECT {target_column} `col` FROM pysql_e2e_inline_param_test_table LIMIT 1" | ||
DELETE_QUERY = "DELETE FROM pysql_e2e_inline_param_test_table" | ||
INSERT_QUERY = f"INSERT INTO ___________________first.jprakash.pysql_e2e_inline_param_test_table (`{target_column}`) VALUES (%(p)s)" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider externalizing the hard-coded fully-qualified table name in the inline parameter roundtrip tests to improve test portability and reduce maintenance burden when schema or account names change.
Copilot uses AI. Check for mistakes.
tsp.ordinal = False | ||
elif not named: | ||
tsp.ordinal = True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isn't ordinal an int field? Should we set it to parameter index here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, in this case we don't need to set the index. ordinal is only supported when the params have ?
, so it auto fills the variables in the order they appear
The `name` argument to this function would be `my_param`. | ||
""" | ||
self.name = name | ||
self.value = [dbsql_parameter_from_primitive(val) for val in value] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are we able to handle nested types like array<map<int,int>> this way?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, it handles the recursive cases. Tests are also added to support this
return DecimalParameter(value=value, name=name) | ||
elif isinstance(value, dict): | ||
return MapParameter(value=value, name=name) | ||
elif isinstance(value, Sequence) and not isinstance(value, str): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Range
is also a Sequence in python, are we allowing that also as parameter?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, we don't want users to pass lazy types like Range
. They are expected to pass initialized values like list
or tuples
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's better to add and not isinstance(value, Range)
@@ -429,7 +429,7 @@ def user_friendly_error_message(self, no_retry_reason, attempt, elapsed): | |||
# Taken from PyHive | |||
class ParamEscaper: | |||
_DATE_FORMAT = "%Y-%m-%d" | |||
_TIME_FORMAT = "%H:%M:%S.%f" | |||
_TIME_FORMAT = "%H:%M:%S.%f %z" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Timezone will not be there for TIMESTAMP_NTZ param.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, would like to know how we've accounted/tested for NTZ
INSERT_QUERY = f"INSERT INTO pysql_e2e_inline_param_test_table (`{target_column}`) VALUES (%(p)s)" | ||
SELECT_QUERY = f"SELECT {target_column} `col` FROM pysql_e2e_inline_param_test_table LIMIT 1" | ||
DELETE_QUERY = "DELETE FROM pysql_e2e_inline_param_test_table" | ||
INSERT_QUERY = f"INSERT INTO ___________________first.jprakash.pysql_e2e_inline_param_test_table (`{target_column}`) VALUES (%(p)s)" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why are we using this ______first catalog and jprakash namespace? can you move into a team catalog and schema?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
@@ -429,7 +429,7 @@ def user_friendly_error_message(self, no_retry_reason, attempt, elapsed): | |||
# Taken from PyHive | |||
class ParamEscaper: | |||
_DATE_FORMAT = "%Y-%m-%d" | |||
_TIME_FORMAT = "%H:%M:%S.%f" | |||
_TIME_FORMAT = "%H:%M:%S.%f %z" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, would like to know how we've accounted/tested for NTZ
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this not an opt-in feature, can't user disable this in favor of existing behavior?
Description
This PR introduces comprehensive support for complex parameter types (
ARRAY
andMAP
) in the Databricks SQL Python client.Key Changes
ArrayParameter
andMapParameter
dbsql_parameter_from_primitive
is updated to recognize and construct these parameter types, including for nested complex values.TSparkParamValueArgs
which is necessary to be passed as arguements for complex typesTesting
e2e
ARRAY<ARRAY<STRING>>, ARRAY<MAP<STRING,INTEGER>>, MAP<STRING,ARRAY<STRING>>
unit
Note to reviewer
Please ignore the failing tests for the time being. Some server side changes needs to go in for the tests to pass