Commit 1a0575a

Complete Fetch Phase (EXTERNAL_LINKS disposition and ARROW format) (#598)
* large query results
* remove un-necessary changes covered by #588
* simplify test module
* logging -> debug level
* change table name in log
* remove un-necessary changes
* remove un-necessary backend changes
* remove un-needed GetChunksResponse
* remove un-needed GetChunksResponse (only relevant in Fetch phase)
* reduce code duplication in response parsing
* reduce code duplication
* more clear docstrings
* introduce strongly typed ChunkInfo
* remove is_volume_operation from response
* add is_volume_op and more ResultData fields
* add test scripts
* Revert "Merge branch 'sea-migration' into exec-models-sea" This reverts commit 8bd12d8, reversing changes made to 030edf8.
* Revert "Merge branch 'exec-models-sea' into exec-phase-sea" This reverts commit be1997e, reversing changes made to 37813ba.
* change logging level
* remove un-necessary changes
* remove excess changes
* remove excess changes
* remove _get_schema_bytes (for now)
* redundant comments
* remove fetch phase methods
* reduce code repetition + introduce gaps after multi line pydocs
* remove unused imports
* move description extraction to helper func
* formatting (black)
* add more unit tests
* streamline unit tests
* test getting the list of allowed configurations
* reduce diff
* reduce diff
* house constants in enums for readability and immutability
* add note on hybrid disposition
* [squashed from cloudfetch-sea] introduce external links + arrow functionality
* reduce responsibility of Queue
* reduce repetition in arrow table creation
* reduce redundant code in CloudFetchQueue
* move chunk link progression to separate func
* remove redundant log
* improve logging
* remove reliance on schema_bytes in SEA
* remove redundant note on arrow_schema_bytes
* use more fetch methods
* remove redundant schema_bytes from parent constructor
* only call get_chunk_link with non null chunk index
* align SeaResultSet structure with ThriftResultSet
* remove _fill_result_buffer from SeaResultSet
* reduce code repetition
* align SeaResultSet with ext-links-sea
* remove redundant methods
* update unit tests
* remove accidental venv changes
* pre-fetch next chunk link on processing current
* reduce nesting
* line break after multi line pydoc
* re-introduce schema_bytes for better abstraction (likely temporary)
* add fetchmany_arrow and fetchall_arrow
* remove accidental changes in sea backend tests
* remove irrelevant changes
* remove un-necessary test changes
* remove un-necessary changes in thrift backend tests
* remove unimplemented methods test
* remove unimplemented method tests
* modify example scripts to include fetch calls
* add GetChunksResponse
* remove changes to sea test
* re-introduce accidentally removed description extraction method
* fix type errors (ssl_options, CHUNK_PATH_WITH_ID..., etc.)
* access ssl_options through connection
* DEBUG level
* remove explicit multi chunk test
* move cloud fetch queues back into utils.py
* remove excess docstrings
* move ThriftCloudFetchQueue above SeaCloudFetchQueue
* fix sea connector tests
* correct patch module path in cloud fetch queue tests
* remove unimplemented methods test
* correct add_link docstring
* remove invalid import
* better align queries with JDBC impl
* line breaks after multi-line PRs
* remove unused imports
* fix: introduce ExecuteResponse import
* remove unimplemented metadata methods test, un-necessary imports
* introduce unit tests for metadata methods
* remove verbosity in ResultSetFilter docstring
* remove un-necessary info in ResultSetFilter docstring
* remove explicit type checking, string literals around forward annotations
* house SQL commands in constants
* convert complex types to string if not _use_arrow_native_complex_types
* introduce unit tests for altered functionality
* Revert "Merge branch 'fetch-json-inline' into ext-links-sea" This reverts commit dabba55, reversing changes made to dd7dc6a.
* reduce verbosity of ResultSetFilter docstring
* remove unused imports
* Revert "Merge branch 'fetch-json-inline' into ext-links-sea" This reverts commit 3a999c0, reversing changes made to a1f9b9c.
* Revert "reduce verbosity of ResultSetFilter docstring" This reverts commit a1f9b9c.
* Reapply "Merge branch 'fetch-json-inline' into ext-links-sea" This reverts commit 48ad7b3.
* Revert "Merge branch 'fetch-json-inline' into ext-links-sea" This reverts commit dabba55, reversing changes made to dd7dc6a.
* remove un-necessary filters changes
* remove un-necessary backend changes
* remove constants changes
* remove changes in filters tests
* remove unit test backend and JSON queue changes
* remove changes in sea result set testing
* Revert "remove changes in sea result set testing" This reverts commit d210ccd.
* Revert "remove unit test backend and JSON queue changes" This reverts commit f6c5950.
* Revert "remove changes in filters tests" This reverts commit f3f795a.
* Revert "remove constants changes" This reverts commit 802d045.
* Revert "remove un-necessary backend changes" This reverts commit 20822e4.
* Revert "remove un-necessary filters changes" This reverts commit 5e75fb5.
* remove unused imports
* working version
* adopt _wait_until_command_done
* introduce metadata commands
* use new backend structure
* constrain backend diff
* remove changes to filters
* make _parse methods in models internal
* reduce changes in unit tests
* run small queries with SEA during integration tests
* run some tests for sea
* allow empty schema bytes for alignment with SEA
* pass is_vl_op to Sea backend ExecuteResponse
* remove catalog requirement in get_tables
* move filters.py to SEA utils
* ensure SeaResultSet
* prevent circular imports
* remove unused imports
* remove cast, throw error if not SeaResultSet
* pass param as TSparkParameterValue
* remove failing test (temp)
* remove SeaResultSet type assertion
* change errors to align with spec, instead of arbitrary ValueError
* make SEA backend methods return SeaResultSet
* use spec-aligned Exceptions in SEA backend
* remove defensive row type check
* raise ProgrammingError for invalid id
* make is_volume_operation strict bool
* remove complex types code
* Revert "remove complex types code" This reverts commit 138359d.
* introduce type conversion for primitive types for JSON + INLINE
* remove SEA running on metadata queries (known failures)
* remove un-necessary docstrings
* align expected types with databricks sdk
* link rest api reference to validate types
* remove test_catalogs_returns_arrow_table test (metadata commands not expected to pass)
* fix fetchall_arrow and fetchmany_arrow
* remove thrift aligned test_cancel_during_execute from SEA tests
* remove un-necessary changes in example scripts
* remove un-necessary changes in example scripts
* _convert_json_table -> _create_json_table
* remove accidentally removed test
* remove new unit tests (to be re-added based on new arch)
* remove changes in sea_result_set functionality (to be re-added)
* introduce more integration tests
* remove SEA tests in parameterized queries
* remove partial parameter fix changes
* remove un-necessary timestamp tests (pass with minor disparity)
* slightly stronger typing of _convert_json_types
* stronger typing of json utility funcs
* stronger typing of fetch*_json
* remove unused helper methods in SqlType
* line breaks after multi line pydocs, remove excess logs
* line breaks after multi line pydocs, reduce diff of redundant changes
* reduce diff of redundant changes
* mandate ResultData in SeaResultSet constructor
* remove complex type conversion
* correct fetch*_arrow
* recover old sea tests
* move queue and result set into SEA specific dir
* pass ssl_options into CloudFetchQueue
* reduce diff
* remove redundant conversion.py
* fix type issues
* ValueError not ProgrammingError
* reduce diff
* introduce SEA cloudfetch e2e tests
* allow empty cloudfetch result
* add unit tests for CloudFetchQueue and SeaResultSet
* skip pyarrow dependent tests
* simplify download process: no pre-fetching
* correct class name in logs
* align with old impl
* align next_n_rows with prev impl
* align remaining_rows with prev impl
* remove un-necessary Optional params
* remove un-necessary changes in thrift field if tests
* remove unused imports
* run large queries
* move link fetching immediately before table creation so link expiry is not an issue
* formatting (black)
* fix types
* fix param type in unit tests
* correct param extraction
* remove common constructor for databricks client abc
* make SEA Http Client instance a private member
* make GetChunksResponse model more robust
* add link to doc of GetChunk response model
* pass result_data instead of "initial links" into SeaCloudFetchQueue
* move download_manager init into parent CloudFetchQueue
* raise ServerOperationError for no 0th chunk
* unused imports
* return None in case of empty response
* ensure table is empty on no initial links
* iterate over chunk indexes instead of link
* stronger typing
* remove string literals around type defs
* introduce DownloadManager import
* return None for immediate out of bounds

---------

Signed-off-by: varun-edachali-dbx <[email protected]>
Co-authored-by: jayant <[email protected]>
1 parent 922c448 commit 1a0575a

17 files changed, +1232 -338 lines changed

src/databricks/sql/backend/sea/backend.py

Lines changed: 53 additions & 18 deletions
@@ -5,7 +5,7 @@
 import re
 from typing import Any, Dict, Tuple, List, Optional, Union, TYPE_CHECKING, Set

-from databricks.sql.backend.sea.models.base import ResultManifest
+from databricks.sql.backend.sea.models.base import ExternalLink, ResultManifest
 from databricks.sql.backend.sea.utils.constants import (
     ALLOWED_SESSION_CONF_TO_DEFAULT_VALUES_MAP,
     ResultFormat,
@@ -28,7 +28,7 @@
     BackendType,
     ExecuteResponse,
 )
-from databricks.sql.exc import DatabaseError, ProgrammingError, ServerOperationError
+from databricks.sql.exc import DatabaseError, ServerOperationError
 from databricks.sql.backend.sea.utils.http_client import SeaHttpClient
 from databricks.sql.types import SSLOptions

@@ -44,6 +44,7 @@
     GetStatementResponse,
     CreateSessionResponse,
 )
+from databricks.sql.backend.sea.models.responses import GetChunksResponse

 logger = logging.getLogger(__name__)

@@ -88,6 +89,7 @@ class SeaDatabricksClient(DatabricksClient):
     STATEMENT_PATH = BASE_PATH + "statements"
     STATEMENT_PATH_WITH_ID = STATEMENT_PATH + "/{}"
     CANCEL_STATEMENT_PATH_WITH_ID = STATEMENT_PATH + "/{}/cancel"
+    CHUNK_PATH_WITH_ID_AND_INDEX = STATEMENT_PATH + "/{}/result/chunks/{}"

     # SEA constants
     POLL_INTERVAL_SECONDS = 0.2
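
The new `CHUNK_PATH_WITH_ID_AND_INDEX` constant is a two-slot format template: the statement ID fills the first slot and the chunk index the second. A minimal sketch of how it expands, assuming `BASE_PATH` is `/api/2.0/sql/` (the hunk does not show its value) and using a made-up statement ID:

```python
# Assumed value: the diff only shows STATEMENT_PATH being derived from BASE_PATH.
BASE_PATH = "/api/2.0/sql/"
STATEMENT_PATH = BASE_PATH + "statements"
CHUNK_PATH_WITH_ID_AND_INDEX = STATEMENT_PATH + "/{}/result/chunks/{}"


def chunk_path(statement_id: str, chunk_index: int) -> str:
    """Fill both slots of the template: statement ID first, chunk index second."""
    return CHUNK_PATH_WITH_ID_AND_INDEX.format(statement_id, chunk_index)


# "stmt-123" is a hypothetical statement ID, used for illustration only.
example_path = chunk_path("stmt-123", 2)
```

Because the template is positional, swapping the two arguments would still produce a syntactically valid path, so callers need to keep the order straight.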
@@ -123,18 +125,22 @@ def __init__(
         )

         self._max_download_threads = kwargs.get("max_download_threads", 10)
+        self._ssl_options = ssl_options
+        self._use_arrow_native_complex_types = kwargs.get(
+            "_use_arrow_native_complex_types", True
+        )

         # Extract warehouse ID from http_path
         self.warehouse_id = self._extract_warehouse_id(http_path)

         # Initialize HTTP client
-        self.http_client = SeaHttpClient(
+        self._http_client = SeaHttpClient(
             server_hostname=server_hostname,
             port=port,
             http_path=http_path,
             http_headers=http_headers,
             auth_provider=auth_provider,
-            ssl_options=ssl_options,
+            ssl_options=self._ssl_options,
             **kwargs,
         )

@@ -173,7 +179,7 @@ def _extract_warehouse_id(self, http_path: str) -> str:
             f"Note: SEA only works for warehouses."
         )
         logger.error(error_message)
-        raise ProgrammingError(error_message)
+        raise ValueError(error_message)

     @property
     def max_download_threads(self) -> int:
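
The hunk above switches the failure mode of `_extract_warehouse_id` from `ProgrammingError` to `ValueError`. A rough sketch of that behaviour follows; the two regular expressions are an assumption, since only the tail of the method is visible in the diff, and `extract_warehouse_id` is a free-standing illustrative function rather than the committed method:

```python
import re

# Assumed patterns: the real method's regexes are not shown in this hunk.
_WAREHOUSE_PATTERNS = (
    re.compile(r".*/warehouses/(.+)"),
    re.compile(r".*/endpoints/(.+)"),
)


def extract_warehouse_id(http_path: str) -> str:
    """Return the trailing ID of a warehouse/endpoint path, or raise ValueError."""
    for pattern in _WAREHOUSE_PATTERNS:
        match = pattern.match(http_path)
        if match:
            return match.group(1)
    # Mirrors the diff: an unparseable path is reported as a plain ValueError.
    raise ValueError(
        f"Could not extract warehouse ID from http_path: {http_path}. "
        "Note: SEA only works for warehouses."
    )
```

Callers that previously caught `ProgrammingError` around client construction would need to catch `ValueError` instead.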
@@ -220,7 +226,7 @@ def open_session(
             schema=schema,
         )

-        response = self.http_client._make_request(
+        response = self._http_client._make_request(
             method="POST", path=self.SESSION_PATH, data=request_data.to_dict()
         )

@@ -245,7 +251,7 @@ def close_session(self, session_id: SessionId) -> None:
             session_id: The session identifier returned by open_session()

         Raises:
-            ProgrammingError: If the session ID is invalid
+            ValueError: If the session ID is invalid
             OperationalError: If there's an error closing the session
         """

@@ -260,7 +266,7 @@ def close_session(self, session_id: SessionId) -> None:
             session_id=sea_session_id,
         )

-        self.http_client._make_request(
+        self._http_client._make_request(
             method="DELETE",
             path=self.SESSION_PATH_WITH_ID.format(sea_session_id),
             data=request_data.to_dict(),
@@ -342,7 +348,7 @@ def _results_message_to_execute_response(

         # Check for compression
         lz4_compressed = (
-            response.manifest.result_compression == ResultCompression.LZ4_FRAME
+            response.manifest.result_compression == ResultCompression.LZ4_FRAME.value
         )

         execute_response = ExecuteResponse(
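
The `.value` fix in the compression check matters whenever the manifest field holds a raw string and the enum is a plain `Enum` (both are assumptions here; neither definition appears in the diff). In that case comparing the string to the enum member is always `False`, so LZ4 results would be treated as uncompressed. A small sketch with a stand-in enum:

```python
from enum import Enum


class ResultCompression(Enum):
    """Hypothetical stand-in for the SEA constants enum (plain Enum, no str mixin)."""

    LZ4_FRAME = "LZ4_FRAME"
    NONE = None


# Assumed shape: the manifest carries the compression codec as a raw string.
manifest_result_compression = "LZ4_FRAME"

# Pre-fix comparison: a str never equals a plain Enum member.
compared_to_member = manifest_result_compression == ResultCompression.LZ4_FRAME

# Post-fix comparison: compare against the member's underlying value.
compared_to_value = manifest_result_compression == ResultCompression.LZ4_FRAME.value
```

Had the enum been declared with a `str` mixin, both comparisons would agree; comparing against `.value` works in either case.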
@@ -424,7 +430,7 @@ def execute_command(
             enforce_embedded_schema_correctness: Whether to enforce schema correctness

         Returns:
-            ResultSet: A SeaResultSet instance for the executed command
+            SeaResultSet: A SeaResultSet instance for the executed command
         """

         if session_id.backend_type != BackendType.SEA:
@@ -471,7 +477,7 @@ def execute_command(
             result_compression=result_compression,
         )

-        response_data = self.http_client._make_request(
+        response_data = self._http_client._make_request(
             method="POST", path=self.STATEMENT_PATH, data=request.to_dict()
         )
         response = ExecuteStatementResponse.from_dict(response_data)
@@ -505,7 +511,7 @@ def cancel_command(self, command_id: CommandId) -> None:
             command_id: Command identifier to cancel

         Raises:
-            ProgrammingError: If the command ID is invalid
+            ValueError: If the command ID is invalid
         """

         if command_id.backend_type != BackendType.SEA:
@@ -516,7 +522,7 @@ def cancel_command(self, command_id: CommandId) -> None:
             raise ValueError("Not a valid SEA command ID")

         request = CancelStatementRequest(statement_id=sea_statement_id)
-        self.http_client._make_request(
+        self._http_client._make_request(
             method="POST",
             path=self.CANCEL_STATEMENT_PATH_WITH_ID.format(sea_statement_id),
             data=request.to_dict(),
@@ -530,7 +536,7 @@ def close_command(self, command_id: CommandId) -> None:
             command_id: Command identifier to close

         Raises:
-            ProgrammingError: If the command ID is invalid
+            ValueError: If the command ID is invalid
         """

         if command_id.backend_type != BackendType.SEA:
@@ -541,7 +547,7 @@ def close_command(self, command_id: CommandId) -> None:
             raise ValueError("Not a valid SEA command ID")

         request = CloseStatementRequest(statement_id=sea_statement_id)
-        self.http_client._make_request(
+        self._http_client._make_request(
             method="DELETE",
             path=self.STATEMENT_PATH_WITH_ID.format(sea_statement_id),
             data=request.to_dict(),
@@ -558,7 +564,7 @@ def get_query_state(self, command_id: CommandId) -> CommandState:
             CommandState: The current state of the command

         Raises:
-            ProgrammingError: If the command ID is invalid
+            ValueError: If the command ID is invalid
         """

         if command_id.backend_type != BackendType.SEA:
@@ -569,7 +575,7 @@ def get_query_state(self, command_id: CommandId) -> CommandState:
             raise ValueError("Not a valid SEA command ID")

         request = GetStatementRequest(statement_id=sea_statement_id)
-        response_data = self.http_client._make_request(
+        response_data = self._http_client._make_request(
             method="GET",
             path=self.STATEMENT_PATH_WITH_ID.format(sea_statement_id),
             data=request.to_dict(),
@@ -609,7 +615,7 @@ def get_execution_result(
         request = GetStatementRequest(statement_id=sea_statement_id)

         # Get the statement result
-        response_data = self.http_client._make_request(
+        response_data = self._http_client._make_request(
             method="GET",
             path=self.STATEMENT_PATH_WITH_ID.format(sea_statement_id),
             data=request.to_dict(),
@@ -631,6 +637,35 @@ def get_execution_result(
             arraysize=cursor.arraysize,
         )

+    def get_chunk_link(self, statement_id: str, chunk_index: int) -> ExternalLink:
+        """
+        Get links for chunks starting from the specified index.
+        Args:
+            statement_id: The statement ID
+            chunk_index: The starting chunk index
+        Returns:
+            ExternalLink: External link for the chunk
+        """
+
+        response_data = self._http_client._make_request(
+            method="GET",
+            path=self.CHUNK_PATH_WITH_ID_AND_INDEX.format(statement_id, chunk_index),
+        )
+        response = GetChunksResponse.from_dict(response_data)
+
+        links = response.external_links or []
+        link = next((l for l in links if l.chunk_index == chunk_index), None)
+        if not link:
+            raise ServerOperationError(
+                f"No link found for chunk index {chunk_index}",
+                {
+                    "operation-id": statement_id,
+                    "diagnostic-info": None,
+                },
+            )
+
+        return link
+
     # == Metadata Operations ==

     def get_catalogs(
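
Although its docstring speaks of "links for chunks starting from the specified index", `get_chunk_link` returns exactly one `ExternalLink`: the entry whose `chunk_index` matches the request, or an error if the response contains none. The selection step can be sketched in isolation as below. `ExternalLink` here is a simplified stand-in, `select_chunk_link` is a hypothetical helper, and `LookupError` replaces the `ServerOperationError` raised in the commit so the sketch stays self-contained:

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ExternalLink:
    """Simplified, hypothetical stand-in for the SEA ExternalLink model."""

    chunk_index: int
    external_link: str


def select_chunk_link(
    links: Optional[List[ExternalLink]], chunk_index: int
) -> ExternalLink:
    """Pick the link for one chunk index; a missing or empty list is an error."""
    candidates = links or []
    link = next(
        (item for item in candidates if item.chunk_index == chunk_index), None
    )
    if link is None:
        # The commit raises ServerOperationError with the statement ID attached.
        raise LookupError(f"No link found for chunk index {chunk_index}")
    return link
```

Matching on `chunk_index` rather than taking the first element keeps the caller correct even if the server returns several links in one response.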

src/databricks/sql/backend/sea/models/__init__.py

Lines changed: 2 additions & 0 deletions
@@ -27,6 +27,7 @@
     ExecuteStatementResponse,
     GetStatementResponse,
     CreateSessionResponse,
+    GetChunksResponse,
 )

 __all__ = [
@@ -49,4 +50,5 @@
     "ExecuteStatementResponse",
     "GetStatementResponse",
     "CreateSessionResponse",
+    "GetChunksResponse",
 ]

src/databricks/sql/backend/sea/models/responses.py

Lines changed: 35 additions & 1 deletion
@@ -4,7 +4,7 @@
 These models define the structures used in SEA API responses.
 """

-from typing import Dict, Any
+from typing import Dict, Any, List, Optional
 from dataclasses import dataclass

 from databricks.sql.backend.types import CommandState
@@ -154,3 +154,37 @@ class CreateSessionResponse:
     def from_dict(cls, data: Dict[str, Any]) -> "CreateSessionResponse":
         """Create a CreateSessionResponse from a dictionary."""
         return cls(session_id=data.get("session_id", ""))
+
+
+@dataclass
+class GetChunksResponse:
+    """
+    Response from getting chunks for a statement.
+
+    The response model can be found in the docs, here:
+    https://docs.databricks.com/api/workspace/statementexecution/getstatementresultchunkn
+    """
+
+    data: Optional[List[List[Any]]] = None
+    external_links: Optional[List[ExternalLink]] = None
+    byte_count: Optional[int] = None
+    chunk_index: Optional[int] = None
+    next_chunk_index: Optional[int] = None
+    next_chunk_internal_link: Optional[str] = None
+    row_count: Optional[int] = None
+    row_offset: Optional[int] = None
+
+    @classmethod
+    def from_dict(cls, data: Dict[str, Any]) -> "GetChunksResponse":
+        """Create a GetChunksResponse from a dictionary."""
+        result = _parse_result({"result": data})
+        return cls(
+            data=result.data,
+            external_links=result.external_links,
+            byte_count=result.byte_count,
+            chunk_index=result.chunk_index,
+            next_chunk_index=result.next_chunk_index,
+            next_chunk_internal_link=result.next_chunk_internal_link,
+            row_count=result.row_count,
+            row_offset=result.row_offset,
+        )
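
`GetChunksResponse.from_dict` reuses the existing `_parse_result` helper by wrapping the bare chunk payload as `{"result": data}`, which suggests the helper expects the same envelope as a full statement response. The sketch below illustrates that wrapping trick with hypothetical stand-ins (`ParsedResult`, `parse_result`, `parse_chunk_response`); the real helper and its model are not shown in this diff and carry more fields:

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class ParsedResult:
    """Hypothetical, trimmed-down stand-in for the parsed result model."""

    external_links: Optional[List[Dict[str, Any]]] = None
    next_chunk_index: Optional[int] = None
    row_count: Optional[int] = None


def parse_result(payload: Dict[str, Any]) -> ParsedResult:
    """Hypothetical parser that, like a statement response, expects a "result" key."""
    result = payload.get("result", {})
    return ParsedResult(
        external_links=result.get("external_links"),
        next_chunk_index=result.get("next_chunk_index"),
        row_count=result.get("row_count"),
    )


def parse_chunk_response(data: Dict[str, Any]) -> ParsedResult:
    """Wrap the bare chunk payload so the statement-level parser can be reused."""
    return parse_result({"result": data})
```

The benefit of this design is a single parsing path for both statement and chunk responses; the cost is that the chunk model silently depends on the envelope shape the helper expects.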
