Commit d8d68bf

Expose LSN Header Information (#539)
# Expose LSN Header Information in API Responses

## Overview

This PR implements exposure of LSN (Log Sequence Number) header information from Pinecone API responses through a new `_response_info` attribute on response objects. This enables faster test suite execution by using LSN-based freshness checks instead of polling `describe_index_stats()`.

## Motivation

Integration tests currently rely on polling `describe_index_stats()` to verify data freshness, which is slow and inefficient. The Pinecone API includes LSN headers in responses that can be used to determine data freshness more efficiently:

- `x-pinecone-request-lsn`: Committed LSN from write operations (upsert, delete)
- `x-pinecone-max-indexed-lsn`: Reconciled LSN from read operations (query)

By extracting and exposing these headers, tests can use LSN-based polling to reduce test execution time significantly. Testing so far suggests this will cut the time needed to run the db data plane integration tests by half or more.

## Changes

### Core Implementation

#### Response Info Module

- Created `pinecone/utils/response_info.py` with:
  - `ResponseInfo` TypedDict for structured response metadata
  - `extract_response_info()` function to extract and normalize raw headers
  - Fields: `raw_headers` (dictionary of all response headers normalized to lowercase)
  - Case-insensitive header matching
  - LSN extraction is handled by test utilities (`lsn_utils`) rather than in `ResponseInfo`

#### REST API Client Integration

- Updated `api_client.py` and `asyncio_api_client.py` to automatically attach `_response_info` to db data plane response objects
- Always attaches `_response_info` to ensure `raw_headers` are always available, even when LSN fields are not present

#### gRPC Integration

- Updated `grpc_runner.py` to capture initial metadata from gRPC calls
- Modified all parser functions in `grpc/utils.py` to accept optional `initial_metadata` parameter
- Updated `index_grpc.py` to pass initial metadata to parser functions
- Updated `future.py` to extract initial metadata from gRPC futures

#### Response Dataclasses

- Created `QueryResponse` and `UpsertResponse` dataclasses in `pinecone/db_data/dataclasses/`
- Added `_response_info` field to `FetchResponse`, `FetchByMetadataResponse`, `QueryResponse`, and `UpsertResponse`
- All response dataclasses inherit from `DictLike` for dictionary-style access
- `_response_info` is a required field (always present) with default `{"raw_headers": {}}`

#### Index Classes

- Updated `index.py` and `index_asyncio.py` to:
  - Convert OpenAPI responses to dataclasses with `_response_info` attached
  - Handle `async_req=True` with `ApplyResult` wrapper for proper dataclass conversion
  - Extract `_response_info` from `upsert_records()` responses

### Test Infrastructure

#### LSN Utilities

- Created `tests/integration/helpers/lsn_utils.py` with helper functions for extracting LSN values
- Created compatibility shim `pinecone/utils/lsn_utils.py` (deprecated)

#### Polling Helpers

- Updated `poll_until_lsn_reconciled()` to use query operations for LSN-based freshness checks
- Added `poll_until_lsn_reconciled_async()` for async tests
- Falls back to old polling methods when LSN not available

#### Integration Test Updates

- Updated multiple integration tests to use LSN-based polling:
  - `test_query.py`, `test_upsert_dense.py`, `test_search_and_upsert_records.py`
  - `test_fetch.py`, `test_fetch_by_metadata.py`, `test_upsert_hybrid.py`
  - `test_query_namespaces.py`, `seed.py`
  - Async versions: `test_query.py` (async)
- Added assertions to verify `_response_info` is present when expected

### Documentation

- Created `docs/maintainers/lsn-headers-discovery.md` documenting discovered headers
- Created `scripts/inspect_lsn_headers.py` for header discovery

## Usage Examples

### Accessing Response Info

The `_response_info` attribute is always available on all Index response objects:

```python
from pinecone import Pinecone

pc = Pinecone(api_key="your-api-key")
index = pc.Index("my-index")

# Upsert operation - get committed LSN
upsert_response = index.upsert(
    vectors=[("id1", [0.1, 0.2, 0.3]), ("id2", [0.4, 0.5, 0.6])]
)

# Access raw headers (always present, contains all response headers)
raw_headers = upsert_response._response_info.get("raw_headers")
print(f"Raw headers: {raw_headers}")
# Example output: Raw headers: {
#     'x-pinecone-request-lsn': '12345',
#     'x-pinecone-api-version': '2025-10',
#     'content-type': 'application/json',
#     'server': 'envoy',
#     ...
# }

# Extract LSN from raw headers using test utilities (for testing/polling)
from tests.integration.helpers.lsn_utils import extract_lsn_committed

lsn_committed = extract_lsn_committed(raw_headers)
print(f"Committed LSN: {lsn_committed}")
# Example output: Committed LSN: 12345

# Query operation
query_response = index.query(
    vector=[0.1, 0.2, 0.3],
    top_k=10
)

# Access raw headers
raw_headers = query_response._response_info.get("raw_headers")
print(f"Raw headers: {raw_headers}")
# Example output: Raw headers: {
#     'x-pinecone-max-indexed-lsn': '12345',
#     'x-pinecone-api-version': '2025-10',
#     'content-type': 'application/json',
#     ...
# }

# Extract LSN from raw headers using test utilities
from tests.integration.helpers.lsn_utils import extract_lsn_reconciled

lsn_reconciled = extract_lsn_reconciled(raw_headers)
print(f"Reconciled LSN: {lsn_reconciled}")
# Example output: Reconciled LSN: 12345

# Fetch operation - response info always available
fetch_response = index.fetch(ids=["id1", "id2"])
print(f"Response info: {fetch_response._response_info}")
# Example output:
# Response info: {
#     'raw_headers': {
#         'x-pinecone-max-indexed-lsn': '12345',
#         'x-pinecone-api-version': '2025-10',
#         'content-type': 'application/json',
#         ...
#     }
# }
```

### Dictionary-Style Access

All response dataclasses inherit from `DictLike`, enabling dictionary-style access:

```python
query_response = index.query(vector=[...], top_k=10)

# Attribute access (existing)
matches = query_response.matches

# Dictionary-style access (new)
matches = query_response["matches"]

# Response info access
response_info = query_response._response_info
# Example: {'raw_headers': {'x-pinecone-max-indexed-lsn': '12345', 'x-pinecone-api-version': '2025-10', 'content-type': 'application/json', ...}}
```

## Technical Details

### Response Info Flow

1. **REST API**:
   - HTTP headers → `api_client.py` extracts → attaches `_response_info` to OpenAPI model → Index classes convert to dataclasses
2. **gRPC**:
   - Initial metadata → `grpc_runner.py` captures → parser functions extract → attach `_response_info` to response objects

### Backward Compatibility

- All existing method signatures remain unchanged
- `_response_info` is always present on response objects (required field)
- `raw_headers` in `_response_info` always contains response headers (may be empty dict if no headers)
- Test utilities (`poll_until_lsn_reconciled`, `poll_until_lsn_reconciled_async`) accept `_response_info` directly and extract LSN internally
- Response objects maintain all existing attributes and behavior

### Type Safety

- Added proper type hints for `_response_info` fields
- Updated return type annotations to reflect dataclass usage
- Added `type: ignore` comments where necessary (e.g., `ApplyResult` wrapping)

### Dataclass Enhancements

- All response dataclasses now inherit from `DictLike` for dictionary-style access
- `QueryResponse` and `UpsertResponse` are new dataclasses replacing OpenAPI models
- `_response_info` field: `ResponseInfo = field(default_factory=lambda: cast(ResponseInfo, {"raw_headers": {}}), repr=True, compare=False)`
  - Always present (required field)
  - `repr=True` for all response dataclasses to aid debugging
  - `raw_headers` always contains response headers (may be empty dict)
  - `ResponseInfo` only contains `raw_headers`

## Testing

### Unit Tests

- ✅ All gRPC upsert tests pass (32/32)
- ✅ All unit tests pass (340+ passed)
- ✅ Created unit tests for `extract_response_info()` function
- ✅ Created unit tests for LSN utility functions

### Integration Tests

- ✅ Updated integration tests to use LSN-based polling
- ✅ 38 integration tests pass
- ✅ LSN-based polling working correctly (faster test execution)
- ✅ `_response_info` assertions added to verify LSN data is present

## Breaking Changes

**None** - This is a backward-compatible enhancement.

### Response Type Changes

- `QueryResponse` and `UpsertResponse` are now dataclasses instead of OpenAPI models
  - **Impact**: Minimal - dataclasses are compatible for attribute access and dictionary-style access (via `DictLike`)
  - **Mitigation**: Public API exports remain the same (`from pinecone import QueryResponse, UpsertResponse`)
  - **Note**: If users were doing `isinstance()` checks against OpenAPI models, they should still work when importing from `pinecone`

### New Attribute

- `_response_info` is added to all Index response objects (`QueryResponse`, `UpsertResponse`, `FetchResponse`, `FetchByMetadataResponse`)
  - **Impact**: Minimal - it's a required attribute with underscore prefix (indicates internal use)
  - **Mitigation**: Underscore prefix indicates it's not part of the public API contract
  - **Note**: `_response_info` is always present and contains `raw_headers`.

### Compatibility Notes

- All response dataclasses inherit from `DictLike`, enabling dictionary-style access (`response['matches']`)
- Attribute access remains unchanged (`response.matches`, `response.namespace`, etc.)
- OpenAPI-specific methods like `to_dict()` were not part of the public API

## Related Issues

- Enables faster test suite execution through LSN-based polling
- Provides foundation for future LSN-based features
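
The LSN-based freshness check described in the description above can be illustrated with a minimal sketch. This is not the code in `lsn_utils.py` or `poll_until_lsn_reconciled()`. The function names `normalize_headers`, `extract_lsn`, and `wait_until_reconciled` are hypothetical. The sketch also assumes that LSN header values parse as integers and that data is fresh once the reconciled LSN is at least the committed LSN; the description implies both but does not state either.

```python
import time
from typing import Callable, Mapping, Optional

# Header names taken from the PR description.
LSN_COMMITTED_HEADER = "x-pinecone-request-lsn"
LSN_RECONCILED_HEADER = "x-pinecone-max-indexed-lsn"


def normalize_headers(headers: Mapping[str, str]) -> dict:
    """Lowercase header names so lookups are case-insensitive."""
    return {str(name).lower(): str(value) for name, value in headers.items()}


def extract_lsn(raw_headers: Mapping[str, str], header_name: str) -> Optional[int]:
    """Return the LSN stored under header_name, or None if absent or malformed.

    Assumption: LSN values are decimal integers, as in the example output.
    """
    value = normalize_headers(raw_headers).get(header_name)
    if value is None:
        return None
    try:
        return int(value)
    except ValueError:
        return None


def wait_until_reconciled(
    target_lsn: int,
    fetch_headers: Callable[[], Mapping[str, str]],
    timeout_s: float = 30.0,
    interval_s: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until the reconciled LSN reaches target_lsn or the timeout expires.

    fetch_headers is a hypothetical callable that performs a read (for
    example a query) and returns that response's raw headers.
    Assumption: reconciled >= committed means the write is visible.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        reconciled = extract_lsn(fetch_headers(), LSN_RECONCILED_HEADER)
        if reconciled is not None and reconciled >= target_lsn:
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)
```

In a test, `target_lsn` would come from the upsert response's `raw_headers`, and `fetch_headers` could wrap something like `lambda: index.query(...)._response_info["raw_headers"]`. A caller that receives `None` from `extract_lsn` on the write side would fall back to the older polling approach, in line with the fallback behavior the description mentions.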
1 parent c27d3c2 commit d8d68bf

91 files changed: +2229 -1010 lines changed

.github/actions/project-create/action.yml

Lines changed: 1 addition & 1 deletion
@@ -40,7 +40,7 @@ runs:
     - name: Set up Python
       uses: actions/setup-python@v5
       with:
-        python-version: 3.9
+        python-version: '3.10'

     - name: Install deps
       shell: bash

.github/actions/project-delete/action.yml

Lines changed: 1 addition & 1 deletion
@@ -28,7 +28,7 @@ runs:
     - name: Set up Python
       uses: actions/setup-python@v5
       with:
-        python-version: 3.9
+        python-version: '3.10'

     - name: Install deps
       shell: bash

.github/actions/run-integration-test/action.yaml

Lines changed: 1 addition & 1 deletion
@@ -33,7 +33,7 @@ runs:
     - name: Run tests
       id: run-tests
       shell: bash
-      run: poetry run pytest tests/integration/${{ inputs.test_suite }} --retries 2 --retry-delay 35 -s -vv --log-cli-level=DEBUG
+      run: poetry run pytest tests/integration/${{ inputs.test_suite }} --retries 2 --retry-delay 35 -s -vv --log-cli-level=DEBUG --durations=20
       env:
         PINECONE_API_KEY: ${{ steps.decrypt-api-key.outputs.decrypted_secret }}
         PINECONE_ADDITIONAL_HEADERS: ${{ inputs.PINECONE_ADDITIONAL_HEADERS }}

.github/actions/setup-poetry/action.yml

Lines changed: 1 addition & 1 deletion
@@ -20,7 +20,7 @@ inputs:
   python_version:
     description: 'Python version to use'
     required: true
-    default: '3.9'
+    default: '3.10'

 runs:
   using: 'composite'

.github/actions/test-dependency-asyncio-rest/action.yaml

Lines changed: 1 addition & 1 deletion
@@ -15,7 +15,7 @@ inputs:
   python_version:
     description: 'The version of Python to use'
     required: false
-    default: '3.9'
+    default: '3.10'
   aiohttp_version:
     description: 'The version of aiohttp to install'
     required: true

.github/actions/test-dependency-grpc/action.yaml

Lines changed: 1 addition & 1 deletion
@@ -15,7 +15,7 @@ inputs:
   python_version:
     description: 'The version of Python to use'
     required: false
-    default: '3.9'
+    default: '3.10'
   grpcio_version:
     description: 'The version of grpcio to install'
     required: true

.github/actions/test-dependency-rest/action.yaml

Lines changed: 1 addition & 1 deletion
@@ -15,7 +15,7 @@ inputs:
   python_version:
     description: 'The version of Python to use'
     required: false
-    default: '3.9'
+    default: '3.10'
   urllib3_version:
     description: 'The version of urllib3 to install'
     required: true

.github/workflows/on-merge.yaml

Lines changed: 3 additions & 3 deletions
@@ -35,7 +35,7 @@ jobs:
     uses: './.github/workflows/testing-unit.yaml'
     secrets: inherit
     with:
-      python_versions_json: '["3.9", "3.13"]'
+      python_versions_json: '["3.10", "3.13"]'

   create-project:
     uses: './.github/workflows/project-setup.yaml'
@@ -51,7 +51,7 @@ jobs:
       - create-project
     with:
       encrypted_project_api_key: ${{ needs.create-project.outputs.encrypted_project_api_key }}
-      python_versions_json: '["3.9", "3.13"]'
+      python_versions_json: '["3.10", "3.13"]'
   dependency-tests:
     uses: './.github/workflows/testing-dependency.yaml'
     secrets: inherit
@@ -85,7 +85,7 @@ jobs:
     runs-on: ubuntu-latest
     strategy:
       matrix:
-        python-version: [3.9, 3.13]
+        python-version: ['3.10', '3.13']
     steps:
       - uses: actions/checkout@v4
       - name: Setup Poetry

.github/workflows/on-pr.yaml

Lines changed: 2 additions & 2 deletions
@@ -38,7 +38,7 @@ jobs:
     uses: './.github/workflows/testing-unit.yaml'
     secrets: inherit
     with:
-      python_versions_json: '["3.9"]'
+      python_versions_json: '["3.10"]'

   determine-test-suites:
     name: Determine test suites
@@ -112,7 +112,7 @@ jobs:
       - determine-test-suites
     with:
       encrypted_project_api_key: ${{ needs.create-project.outputs.encrypted_project_api_key }}
-      python_versions_json: '["3.13", "3.9"]'
+      python_versions_json: '["3.10"]'
       rest_sync_suites_json: ${{ needs.determine-test-suites.outputs.rest_sync_suites || '' }}
       rest_asyncio_suites_json: ${{ needs.determine-test-suites.outputs.rest_asyncio_suites || '' }}
       grpc_sync_suites_json: ${{ needs.determine-test-suites.outputs.grpc_sync_suites || '' }}

.github/workflows/project-cleanup.yaml

Lines changed: 1 addition & 1 deletion
@@ -20,7 +20,7 @@ jobs:
       - uses: actions/checkout@v4
       - uses: ./.github/actions/setup-poetry
         with:
-          python_version: 3.9
+          python_version: '3.10'
       - uses: ./.github/actions/project-delete
         with:
           FERNET_ENCRYPTION_KEY: '${{ secrets.FERNET_ENCRYPTION_KEY }}'
