
Commit 93e207e

Upgrade mypy (#406)
* Upgrade mypy

This commit removes the flag (and `cd` step) from f53aa37, which we added to get mypy to treat namespaces correctly. That was apparently a bug in mypy, or behavior they decided to change; to get the new behavior, we must upgrade mypy. (This also allows us to remove a couple of `# type: ignore` comments that are no longer needed.)

This commit changes the version of mypy and runs `poetry lock`. It also conforms the whitespace of files in this project to the expectations of various tools and standards (namely: removing trailing whitespace, as expected by git, and enforcing one and only one newline at the end of each file, as expected by unix and GitHub).

It also uses https://github.com/hauntsaninja/no_implicit_optional to automatically upgrade the codebase due to a change in mypy behavior. For a similar reason, it also fixes a few new type (or other) errors:

* "Return type 'Retry' of 'new' incompatible with return type 'DatabricksRetryPolicy' in supertype 'Retry'"
* databricks/sql/auth/retry.py:225: error: object has no attribute update [attr-defined]
* /test_param_escaper.py:31: DeprecationWarning: invalid escape sequence \) [as it happens, I think it was also wrong for the string not to be raw, because I'm pretty sure it wants all of its backslashed single-quotes to appear literally with the backslashes, which wasn't happening until now]
* ValueError: numpy.dtype size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject [this is a numpy version issue, which I fixed by being stricter about the numpy version]

Signed-off-by: wyattscarpenter <[email protected]>

* Incorporate suggestion.

I decided the most expedient way of dealing with this type error was just adding the type ignore comment back in, but with an `[attr-defined]` specifier this time. Otherwise I would have to restructure the code or figure out the proper types for a TypedDict for the dict, and I don't think that's worth it at the moment.

Signed-off-by: wyattscarpenter <[email protected]>
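The no_implicit_optional change mentioned above can be shown in miniature. This is a hypothetical function, not code from this repo: older mypy silently treated a `None` default as implying `Optional`, while newer mypy requires it to be spelled out.

```python
from typing import Optional

# Hypothetical example. Under old mypy, `timeout: int = None` was
# implicitly treated as Optional[int]; new mypy rejects that, so the
# Optional must be explicit (which no_implicit_optional automates):
def connect(host: str, timeout: Optional[int] = None) -> str:
    if timeout is None:
        timeout = 30  # fall back to a default when no timeout was given
    return f"{host}:{timeout}"

print(connect("example.host"))  # example.host:30
```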
1 parent f53aa37 commit 93e207e
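The `invalid escape sequence \)` warning fixed in test_param_escaper.py comes from Python treating a backslash before `)` in a non-raw string as an (invalid) escape. A minimal illustration, unrelated to the actual test file's contents:

```python
# In a normal string literal, a backslash before ")" is an invalid escape
# sequence: Python 3 emits a DeprecationWarning (and it is slated to become
# an error). A raw string keeps the backslash literally, which is what a
# regex or escaper test usually wants:
pattern = r"\('some', 'values'\)"   # raw string: backslashes preserved
doubled = "\\('some', 'values'\\)"  # equivalent spelling without a raw string
assert pattern == doubled
print(pattern)
```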

32 files changed: +272 −267 lines
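The `[attr-defined]` specifier mentioned in the commit message narrows a `# type: ignore` to one mypy error code. A hypothetical sketch, not the connector's actual retry.py code:

```python
# Sketch of why `object has no attribute "update" [attr-defined]` appears
# and how a scoped ignore suppresses only that error code; a bare
# `# type: ignore` would hide every error on the line.
def merge_settings(base: object, extra: dict) -> dict:
    # mypy: error: "object" has no attribute "update"  [attr-defined]
    base.update(extra)  # type: ignore[attr-defined]
    return base  # type: ignore[return-value]

print(merge_settings({"a": 1}, {"b": 2}))  # {'a': 1, 'b': 2}
```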

.github/pull_request_template.md

+3-3
@@ -1,7 +1,7 @@
 <!-- We welcome contributions. All patches must include a sign-off. Please see CONTRIBUTING.md for details -->
 
 
-## What type of PR is this?
+## What type of PR is this?
 <!-- Check all that apply, delete what doesn't apply. -->
 
 - [ ] Refactor
@@ -13,8 +13,8 @@
 
 ## How is this tested?
 
-- [ ] Unit tests
-- [ ] E2E Tests
+- [ ] Unit tests
+- [ ] E2E Tests
 - [ ] Manually
 - [ ] N/A
 

.github/workflows/code-quality-checks.yml

+1-2
@@ -161,6 +161,5 @@ jobs:
       #----------------------------------------------
       - name: Mypy
         run: |
-          cd src # Need to be in the actual databricks/ folder or mypy does the wrong thing.
           mkdir .mypy_cache # Workaround for bad error message "error: --install-types failed (no mypy cache directory)"; see https://github.com/python/mypy/issues/10768#issuecomment-2178450153
-          poetry run mypy --config-file ../pyproject.toml --install-types --non-interactive --namespace-packages databricks
+          poetry run mypy --install-types --non-interactive src

.github/workflows/publish.yml

+1-1
@@ -61,4 +61,4 @@ jobs:
       - name: Build and publish to pypi
         uses: JRubics/[email protected]
         with:
-          pypi_token: ${{ secrets.PROD_PYPI_TOKEN }}
+          pypi_token: ${{ secrets.PROD_PYPI_TOKEN }}

.gitignore

+1-1
@@ -207,4 +207,4 @@ build/
 .vscode
 
 # don't commit authentication info to source control
-test.env
+test.env

CONTRIBUTING.md

+2-3
@@ -74,7 +74,7 @@ If you set your `user.name` and `user.email` git configs, you can sign your comm
 This project uses [Poetry](https://python-poetry.org/) for dependency management, tests, and linting.
 
 1. Clone this repository
-2. Run `poetry install`
+2. Run `poetry install`
 
 ### Run tests
 
@@ -167,5 +167,4 @@ Modify the dependency specification (syntax can be found [here](https://python-p
 - `poetry update`
 - `rm poetry.lock && poetry install`
 
-Sometimes `poetry update` can freeze or run forever. Deleting the `poetry.lock` file and calling `poetry install` is guaranteed to update everything but is usually _slower_ than `poetry update` **if `poetry update` works at all**.
-
+Sometimes `poetry update` can freeze or run forever. Deleting the `poetry.lock` file and calling `poetry install` is guaranteed to update everything but is usually _slower_ than `poetry update` **if `poetry update` works at all**.

docs/parameters.md

+6-6
@@ -43,7 +43,7 @@ SELECT * FROM table WHERE field = %(value)s
 
 ## Python Syntax
 
-This connector follows the [PEP-249 interface](https://peps.python.org/pep-0249/#id20). The expected structure of the parameter collection follows the paramstyle of the variables in your query.
+This connector follows the [PEP-249 interface](https://peps.python.org/pep-0249/#id20). The expected structure of the parameter collection follows the paramstyle of the variables in your query.
 
 ### `named` paramstyle Usage Example
 
@@ -85,7 +85,7 @@ The result of the above two examples is identical.
 
 Databricks Runtime expects variable markers to use either `named` or `qmark` paramstyles. Historically, this connector used `pyformat` which Databricks Runtime does not support. So to assist customers transitioning their codebases from `pyformat` to `named`, we can dynamically rewrite the variable markers before sending the query to Databricks. This happens only when `use_inline_params=False`.
 
-This dynamic rewrite will be deprecated in a future release. New queries should be written using the `named` paramstyle instead. And users should update their client code to replace `pyformat` markers with `named` markers.
+This dynamic rewrite will be deprecated in a future release. New queries should be written using the `named` paramstyle instead. And users should update their client code to replace `pyformat` markers with `named` markers.
 
 For example:
 
@@ -106,7 +106,7 @@ SELECT field1, field2, :param1 FROM table WHERE field4 = :param2
 
 Under the covers, parameter values are annotated with a valid Databricks SQL type. As shown in the examples above, this connector accepts primitive Python types like `int`, `str`, and `Decimal`. When this happens, the connector infers the corresponding Databricks SQL type (e.g. `INT`, `STRING`, `DECIMAL`) automatically. This means that the parameters passed to `cursor.execute()` are always wrapped in a `TDbsqlParameter` subtype prior to execution.
 
-Automatic inference is sufficient for most usages. But you can bypass the inference by explicitly setting the Databricks SQL type in your client code. All supported Databricks SQL types have `TDbsqlParameter` implementations which you can import from `databricks.sql.parameters`.
+Automatic inference is sufficient for most usages. But you can bypass the inference by explicitly setting the Databricks SQL type in your client code. All supported Databricks SQL types have `TDbsqlParameter` implementations which you can import from `databricks.sql.parameters`.
 
 `TDbsqlParameter` objects must always be passed within a list. Either paramstyle (`:named` or `?`) may be used. However, if your query uses the `named` paramstyle, all `TDbsqlParameter` objects must be provided a `name` when they are constructed.
 
@@ -158,7 +158,7 @@ Rendering parameters inline is supported on all versions of DBR since these quer
 
 ## SQL Syntax
 
-Variables in your SQL query can look like `%(param)s` or like `%s`.
+Variables in your SQL query can look like `%(param)s` or like `%s`.
 
 #### Example
 
@@ -172,7 +172,7 @@ SELECT * FROM table WHERE field = %s
 
 ## Python Syntax
 
-This connector follows the [PEP-249 interface](https://peps.python.org/pep-0249/#id20). The expected structure of the parameter collection follows the paramstyle of the variables in your query.
+This connector follows the [PEP-249 interface](https://peps.python.org/pep-0249/#id20). The expected structure of the parameter collection follows the paramstyle of the variables in your query.
 
 ### `pyformat` paramstyle Usage Example
 
@@ -210,7 +210,7 @@ with sql.connect(..., use_inline_params=True) as conn:
 
 The result of the above two examples is identical.
 
-**Note**: `%s` is not compliant with PEP-249 and only works due to the specific implementation of our inline renderer.
+**Note**: `%s` is not compliant with PEP-249 and only works due to the specific implementation of our inline renderer.
 
 **Note:** This `%s` syntax overlaps with valid SQL syntax around the usage of `LIKE` DML. For example if your query includes a clause like `WHERE field LIKE '%sequence'`, the parameter inlining function will raise an exception because this string appears to include an inline marker but none is provided. This means that in connector versions below 3.0.0 it has been impossible to execute a query that included both parameters and LIKE wildcards. When `use_inline_params=False`, we will pass `%s` occurrences along to the database, allowing it to be used as expected in `LIKE` statements.
 
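The dynamic `pyformat`-to-`named` rewrite described in this file can be sketched roughly as follows. This is an illustrative regex, not the connector's actual implementation:

```python
import re

def pyformat_to_named(query: str) -> str:
    """Rewrite %(name)s markers as :name markers (illustrative sketch only)."""
    return re.sub(r"%\((\w+)\)s", r":\1", query)

print(pyformat_to_named("SELECT * FROM table WHERE field = %(value)s"))
# SELECT * FROM table WHERE field = :value
```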

examples/README.md

+3-3
@@ -7,7 +7,7 @@ We provide example scripts so you can see the connector in action for basic usag
 - DATABRICKS_TOKEN
 
 Follow the quick start in our [README](../README.md) to install `databricks-sql-connector` and see
-how to find the hostname, http path, and access token. Note that for the OAuth examples below a
+how to find the hostname, http path, and access token. Note that for the OAuth examples below a
 personal access token is not needed.
 
 
@@ -38,7 +38,7 @@ To run all of these examples you can clone the entire repository to your disk. O
 - **`set_user_agent.py`** shows how to customize the user agent header used for Thrift commands. In
   this example the string `ExamplePartnerTag` will be added to the user agent on every request.
 - **`staging_ingestion.py`** shows how the connector handles Databricks' experimental staging ingestion commands `GET`, `PUT`, and `REMOVE`.
-- **`sqlalchemy.py`** shows a basic example of connecting to Databricks with [SQLAlchemy 2.0](https://www.sqlalchemy.org/).
+- **`sqlalchemy.py`** shows a basic example of connecting to Databricks with [SQLAlchemy 2.0](https://www.sqlalchemy.org/).
 - **`custom_cred_provider.py`** shows how to pass a custom credential provider to bypass connector authentication. Please install databricks-sdk prior to running this example.
 - **`v3_retries_query_execute.py`** shows how to enable v3 retries in connector version 2.9.x including how to enable retries for non-default retry cases.
-- **`parameters.py`** shows how to use parameters in native and inline modes.
+- **`parameters.py`** shows how to use parameters in native and inline modes.

examples/insert_data.py

+1-1
@@ -18,4 +18,4 @@
 result = cursor.fetchall()
 
 for row in result:
-    print(row)
+    print(row)

examples/persistent_oauth.py

+2-2
@@ -23,10 +23,10 @@
 class SampleOAuthPersistence(OAuthPersistence):
     def persist(self, hostname: str, oauth_token: OAuthToken):
         """To be implemented by the end user to persist in the preferred storage medium.
-
+
         OAuthToken has two properties:
         1. OAuthToken.access_token
-        2. OAuthToken.refresh_token
+        2. OAuthToken.refresh_token
 
         Both should be persisted.
         """

examples/query_cancel.py

+5-5
@@ -19,13 +19,13 @@ def execute_really_long_query():
     print("It looks like this query was cancelled.")
 
 exec_thread = threading.Thread(target=execute_really_long_query)
-
+
 print("\n Beginning to execute long query")
 exec_thread.start()
-
+
 # Make sure the query has started before cancelling
 print("\n Waiting 15 seconds before canceling", end="", flush=True)
-
+
 seconds_waited = 0
 while seconds_waited < 15:
     seconds_waited += 1
@@ -34,15 +34,15 @@ def execute_really_long_query():
 
 print("\n Cancelling the cursor's operation. This can take a few seconds.")
 cursor.cancel()
-
+
 print("\n Now checking the cursor status:")
 exec_thread.join(5)
 
 assert not exec_thread.is_alive()
 print("\n The previous command was successfully canceled")
 
 print("\n Now reusing the cursor to run a separate query.")
-
+
 # We can still execute a new command on the cursor
 cursor.execute("SELECT * FROM range(3)")
 

examples/query_execute.py

+1-1
@@ -10,4 +10,4 @@
 result = cursor.fetchall()
 
 for row in result:
-    print(row)
+    print(row)

examples/sqlalchemy.py

+12-12
@@ -8,7 +8,7 @@
 
 Our dialect implements the majority of SQLAlchemy 2.0's API. Because of the extent of SQLAlchemy's
 capabilities it isn't feasible to provide examples of every usage in a single script, so we only
-provide a basic one here. Learn more about usage in README.sqlalchemy.md in this repo.
+provide a basic one here. Learn more about usage in README.sqlalchemy.md in this repo.
 """
 
 # fmt: off
@@ -89,17 +89,17 @@ class SampleObject(Base):
 
 # Output SQL is:
 # CREATE TABLE pysql_sqlalchemy_example_table (
-#   bigint_col BIGINT NOT NULL,
-#   string_col STRING,
-#   tinyint_col SMALLINT,
-#   int_col INT,
-#   numeric_col DECIMAL(10, 2),
-#   boolean_col BOOLEAN,
-#   date_col DATE,
-#   datetime_col TIMESTAMP,
-#   datetime_col_ntz TIMESTAMP_NTZ,
-#   time_col STRING,
-#   uuid_col STRING,
+#   bigint_col BIGINT NOT NULL,
+#   string_col STRING,
+#   tinyint_col SMALLINT,
+#   int_col INT,
+#   numeric_col DECIMAL(10, 2),
+#   boolean_col BOOLEAN,
+#   date_col DATE,
+#   datetime_col TIMESTAMP,
+#   datetime_col_ntz TIMESTAMP_NTZ,
+#   time_col STRING,
+#   uuid_col STRING,
 # PRIMARY KEY (bigint_col)
 # ) USING DELTA
 

examples/staging_ingestion.py

+1-1
@@ -24,7 +24,7 @@
 
 Additionally, the connection can only manipulate files within the cloud storage location of the authenticated user.
 
-To run this script:
+To run this script:
 
 1. Set the INGESTION_USER constant to the account email address of the authenticated user
 2. Set the FILEPATH constant to the path of a file that will be uploaded (this example assumes it's a CSV file)

examples/v3_retries_query_execute.py

+2-2
@@ -5,7 +5,7 @@
 # This flag will be deprecated in databricks-sql-connector~=3.0.0 as it will become the default.
 #
 # The new retry behaviour is defined in src/databricks/sql/auth/retry.py
-#
+#
 # The new retry behaviour allows users to force the connector to automatically retry requests that fail with codes
 # that are not retried by default (in most cases only codes 429 and 503 are retried by default). Additional HTTP
 # codes to retry are specified as a list passed to `_retry_dangerous_codes`.
@@ -16,7 +16,7 @@
 # the SQL gateway / load balancer. So there is no risk that retrying the request would result in a doubled
 # (or tripled etc) command execution. These codes are always accompanied by a Retry-After header, which we honour.
 #
-# However, if your use-case emits idempotent queries such as SELECT statements, it can be helpful to retry
+# However, if your use-case emits idempotent queries such as SELECT statements, it can be helpful to retry
 # for 502 (Bad Gateway) codes etc. In these cases, there is a possibility that the initial command _did_ reach
 # Databricks compute and retrying it could result in additional executions. Retrying under these conditions uses
 # an exponential back-off since a Retry-After header is not present.
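The exponential back-off mentioned in these comments (used when no Retry-After header is present) can be sketched like this. The base delay and cap here are illustrative values, not the connector's actual parameters:

```python
def backoff_delays(attempts: int, base: float = 1.0, cap: float = 60.0) -> list:
    """Delay before each retry doubles from `base`, clamped at `cap` (sketch)."""
    return [min(cap, base * (2 ** n)) for n in range(attempts)]

print(backoff_delays(5))  # [1.0, 2.0, 4.0, 8.0, 16.0]
```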
