Commit c95c4b0

Typing improvements (#444)
## Problem

The more type hints we have in our package, the better people will be able to understand how to use it.

## Solution

Much of the untyped code in the package is derived from old generated code that we have extracted. So there's a fair amount of refactoring in this diff to try to break out some smaller classes and functions with seams that you can start to analyze and type; when everything is just big mutable state blobs, it is quite hard to reason about.

Along the way I uncovered that bulk import features were in a broken state because some of those operation ids were modified since the last release and these functions are not well-covered with automated tests. This sort of thing really highlights why we need better type coverage in the package.

## Type of Change

- [x] Bug fix (non-breaking change which fixes an issue)
- [x] None of the above: Refactoring to improve type safety
1 parent 980ac3b commit c95c4b0

143 files changed: +3405 -3497 lines changed

codegen/apis

Submodule apis updated from 63e97dc to eb79d8e

codegen/build-oas.sh

Lines changed: 3 additions & 14 deletions
@@ -34,7 +34,7 @@ update_templates_repo() {
   echo "Updating templates repo"
   pushd codegen/python-oas-templates
   git fetch
-  git checkout main
+  git checkout jhamon/core-sdk
   git pull
   popd
 }
@@ -119,19 +119,6 @@ remove_shared_classes() {
       rm "${source_directory}/${file}.py"
     done
   done
-
-  # Adjust import paths in every file
-  find "${destination}" -name "*.py" | while IFS= read -r file; do
-    sed -i '' "s/from \.\.model_utils/from pinecone\.openapi_support\.model_utils/g" "$file"
-
-    for module in "${modules[@]}"; do
-      sed -i '' "s/from pinecone\.$py_module_name\.openapi\.$module import rest/from pinecone\.openapi_support import rest/g" "$file"
-
-      for sharedFile in "${sharedFiles[@]}"; do
-        sed -i '' "s/from pinecone\.$py_module_name\.openapi\.$module\.$sharedFile/from pinecone\.openapi_support/g" "$file"
-      done
-    done
-  done
 }
 
 # Generated Python code attempts to internally map OpenAPI fields that begin
@@ -203,3 +190,5 @@ remove_shared_classes
 
 # Format generated files
 poetry run ruff format "${destination}"
+
+rm -rf "$build_dir"

mypy.ini

Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
+[mypy]
+; pretty = True
+; disallow_untyped_calls = True
+; check_untyped_defs = True
+; disallow_untyped_defs = True
+; warn_return_any = True
+; warn_unused_configs = True
+
+# Per-module options:
+
+; [mypy-mycode.foo.*]
+; disallow_untyped_defs = True
+
+[mypy-google.api.*]
+ignore_missing_imports = True
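
Most of the stricter options in this new `mypy.ini` are still commented out. As a rough illustration of what enabling `disallow_untyped_defs` would ask of the codebase, here is a small sketch; the function names and the host string are hypothetical and do not come from the package.

```python
from typing import Dict


def untyped_lookup(table, key):
    # No annotations: with `disallow_untyped_defs = True`, mypy reports this definition.
    return table[key]


def typed_lookup(table: Dict[str, str], key: str) -> str:
    # Fully annotated: mypy accepts the definition and can check its callers.
    return table[key]


# Hypothetical data, used only to show that both functions behave the same at runtime.
hosts = {"my-index": "https://my-index.example.test"}
assert untyped_lookup(hosts, "my-index") == typed_lookup(hosts, "my-index")
```

Both functions behave identically at runtime; the difference only shows up when the type checker runs.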

pinecone/config/config.py

Lines changed: 12 additions & 3 deletions
@@ -4,8 +4,17 @@
 from pinecone.exceptions.exceptions import PineconeConfigurationError
 from pinecone.config.openapi import OpenApiConfigFactory
 from pinecone.openapi_support.configuration import Configuration as OpenApiConfiguration
-from pinecone.utils import normalize_host
-from pinecone.utils.constants import SOURCE_TAG
+
+
+# Duplicated this util to help resolve circular imports
+def normalize_host(host: Optional[str]) -> str:
+    if host is None:
+        return ""
+    if host.startswith("https://"):
+        return host
+    if host.startswith("http://"):
+        return host
+    return "https://" + host
 
 
 class Config(NamedTuple):
@@ -50,7 +59,7 @@ def build(
         api_key = api_key or kwargs.pop("api_key", None) or os.getenv("PINECONE_API_KEY")
         host = host or kwargs.pop("host", None)
         host = normalize_host(host)
-        source_tag = kwargs.pop(SOURCE_TAG, None)
+        source_tag = kwargs.pop("source_tag", None)
 
         if not api_key:
             raise PineconeConfigurationError("You haven't specified an Api-Key.")
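
To make the behaviour of the duplicated `normalize_host` helper concrete, here is a standalone sketch with the same logic as the added lines, with the two scheme checks folded into one condition. The host names in the examples are made up for illustration.

```python
from typing import Optional


def normalize_host(host: Optional[str]) -> str:
    """Return the host with a scheme, or "" when no host is given."""
    if host is None:
        return ""
    if host.startswith("https://") or host.startswith("http://"):
        # An explicit scheme is kept as-is, so plain http still works (e.g. local testing).
        return host
    return "https://" + host


# Hypothetical inputs and the values the function returns for them.
examples = {
    None: "",
    "my-index.example.test": "https://my-index.example.test",
    "http://localhost:5080": "http://localhost:5080",
}
for raw, expected in examples.items():
    assert normalize_host(raw) == expected
```

Because `None` maps to the empty string, callers can treat `""` as "no host was provided" without a separate `None` check.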

pinecone/control/index_host_store.py

Lines changed: 3 additions & 1 deletion
@@ -18,7 +18,9 @@ def __call__(cls, *args, **kwargs):
 
 
 class IndexHostStore(metaclass=SingletonMeta):
-    def __init__(self):
+    _indexHosts: Dict[str, str]
+
+    def __init__(self) -> None:
         self._indexHosts = {}
 
     def _key(self, config: Config, index_name: str) -> str:
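
The class-level annotation matters because an empty `{}` literal on its own gives a type checker nothing to infer key and value types from. Below is a minimal sketch of the same pattern; `HostStore` and its methods are hypothetical simplifications (no singleton metaclass), not the package's actual class.

```python
from typing import Dict


class HostStore:
    # Declaring the attribute type here tells a type checker what the
    # empty dict assigned in __init__ is meant to hold.
    _hosts: Dict[str, str]

    def __init__(self) -> None:
        self._hosts = {}

    def set_host(self, key: str, host: str) -> None:
        self._hosts[key] = host

    def get_host(self, key: str) -> str:
        return self._hosts[key]


store = HostStore()
store.set_host("api-key:my-index", "https://my-index.example.test")
```

With the annotation in place, a call such as `store.set_host("k", 123)` is something a type checker can flag before the code runs.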

pinecone/control/pinecone.py

Lines changed: 13 additions & 9 deletions
@@ -1,6 +1,7 @@
 import time
 import logging
 from typing import Optional, Dict, Union
+from multiprocessing import cpu_count
 
 from .index_host_store import IndexHostStore
 from .pinecone_interface import PineconeDBControlInterface
@@ -60,7 +61,7 @@ def __init__(
         ssl_verify: Optional[bool] = None,
         config: Optional[Config] = None,
         additional_headers: Optional[Dict[str, str]] = {},
-        pool_threads: Optional[int] = 1,
+        pool_threads: Optional[int] = None,
         **kwargs,
     ):
         if config:
@@ -86,7 +87,11 @@ def __init__(
             )
 
         self.openapi_config = ConfigBuilder.build_openapi_config(self.config, **kwargs)
-        self.pool_threads = pool_threads
+
+        if pool_threads is None:
+            self.pool_threads = 5 * cpu_count()
+        else:
+            self.pool_threads = pool_threads
 
         self._inference = None  # Lazy initialization
 
@@ -102,7 +107,9 @@ def __init__(
         self.index_host_store = IndexHostStore()
         """ @private """
 
-        self.load_plugins()
+        self.load_plugins(
+            config=self.config, openapi_config=self.openapi_config, pool_threads=self.pool_threads
+        )
 
     @property
     def inference(self):
@@ -164,7 +171,7 @@ def create_index_for_model(
     def __poll_describe_index_until_ready(self, name: str, timeout: Optional[int] = None):
         description = None
 
-        def is_ready():
+        def is_ready() -> bool:
             nonlocal description
             description = self.describe_index(name=name)
             return description.status.ready
@@ -203,17 +210,14 @@ def delete_index(self, name: str, timeout: Optional[int] = None):
         self.index_api.delete_index(name)
         self.index_host_store.delete_host(self.config, name)
 
-        def get_remaining():
-            return name in self.list_indexes().names()
-
         if timeout == -1:
             return
 
         if timeout is None:
-            while get_remaining():
+            while self.has_index(name):
                 time.sleep(5)
         else:
-            while get_remaining() and timeout >= 0:
+            while self.has_index(name) and timeout >= 0:
                 time.sleep(5)
                 timeout -= 5
         if timeout and timeout < 0:
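
The new `pool_threads` default can be read as a small pure function. The sketch below pulls that branch out on its own; `resolve_pool_threads` is a hypothetical name used for illustration and is not a function in the package.

```python
from multiprocessing import cpu_count
from typing import Optional


def resolve_pool_threads(pool_threads: Optional[int] = None) -> int:
    """Fall back to 5 threads per CPU when the caller gives no value."""
    if pool_threads is None:
        return 5 * cpu_count()
    # Any explicit value is kept as given; only None triggers the default.
    return pool_threads


default_threads = resolve_pool_threads()
explicit_threads = resolve_pool_threads(3)
```

Checking `is None` rather than truthiness means the result is always an `int`, which is easier to annotate than the previous `Optional[int]` attribute.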

pinecone/control/pinecone_asyncio.py

Lines changed: 20 additions & 59 deletions
@@ -2,14 +2,12 @@
 import asyncio
 from typing import Optional, Dict, Union
 
-from .index_host_store import IndexHostStore
-
-from pinecone.config import PineconeConfig, Config, ConfigBuilder
+from pinecone.config import PineconeConfig, ConfigBuilder
 
 from pinecone.core.openapi.db_control.api.manage_indexes_api import AsyncioManageIndexesApi
 from pinecone.openapi_support import AsyncioApiClient
 
-from pinecone.utils import normalize_host, setup_openapi_client
+from pinecone.utils import normalize_host, setup_async_openapi_client
 from pinecone.core.openapi.db_control import API_VERSION
 from pinecone.models import (
     ServerlessSpec,
@@ -56,52 +54,31 @@ def __init__(
         proxy_headers: Optional[Dict[str, str]] = None,
         ssl_ca_certs: Optional[str] = None,
         ssl_verify: Optional[bool] = None,
-        config: Optional[Config] = None,
         additional_headers: Optional[Dict[str, str]] = {},
         **kwargs,
     ):
-        if config:
-            if not isinstance(config, Config):
-                raise TypeError("config must be of type pinecone.config.Config")
-            else:
-                self.config = config
-        else:
-            self.config = PineconeConfig.build(
-                api_key=api_key,
-                host=host,
-                additional_headers=additional_headers,
-                proxy_url=proxy_url,
-                proxy_headers=proxy_headers,
-                ssl_ca_certs=ssl_ca_certs,
-                ssl_verify=ssl_verify,
-                **kwargs,
-            )
-
-        if kwargs.get("openapi_config", None):
-            raise Exception(
-                "Passing openapi_config is no longer supported. Please pass settings such as proxy_url, proxy_headers, ssl_ca_certs, and ssl_verify directly to the Pinecone constructor as keyword arguments. See the README at https://github.com/pinecone-io/pinecone-python-client for examples."
-            )
-
+        self.config = PineconeConfig.build(
+            api_key=api_key,
+            host=host,
+            additional_headers=additional_headers,
+            proxy_url=proxy_url,
+            proxy_headers=proxy_headers,
+            ssl_ca_certs=ssl_ca_certs,
+            ssl_verify=ssl_verify,
+            **kwargs,
+        )
         self.openapi_config = ConfigBuilder.build_openapi_config(self.config, **kwargs)
 
-        # I am pretty sure these aren't used, but need to pass a
-        # value for now to satisfy the method signatures
-        self.pool_threads = 1
-
         self._inference = None  # Lazy initialization
 
-        self.index_api = setup_openapi_client(
+        self.index_api = setup_async_openapi_client(
             api_client_klass=AsyncioApiClient,
             api_klass=AsyncioManageIndexesApi,
             config=self.config,
             openapi_config=self.openapi_config,
-            pool_threads=self.pool_threads,
             api_version=API_VERSION,
         )
 
-        self.index_host_store = IndexHostStore()
-        """ @private """
-
     async def __aenter__(self):
         return self
 
@@ -171,7 +148,7 @@ async def create_index_for_model(
     async def __poll_describe_index_until_ready(self, name: str, timeout: Optional[int] = None):
         description = None
 
-        async def is_ready():
+        async def is_ready() -> bool:
             nonlocal description
             description = await self.describe_index(name=name)
             return description.status.ready
@@ -208,20 +185,15 @@ async def is_ready():
 
     async def delete_index(self, name: str, timeout: Optional[int] = None):
         await self.index_api.delete_index(name)
-        self.index_host_store.delete_host(self.config, name)
-
-        async def get_remaining():
-            available_indexes = await self.list_indexes()
-            return name in available_indexes.names()
 
         if timeout == -1:
             return
 
         if timeout is None:
-            while await get_remaining():
+            while await self.has_index(name):
                 await asyncio.sleep(5)
         else:
-            while await get_remaining() and timeout >= 0:
+            while await self.has_index(name) and timeout >= 0:
                 await asyncio.sleep(5)
                 timeout -= 5
         if timeout and timeout < 0:
@@ -239,9 +211,6 @@ async def list_indexes(self) -> IndexList:
 
     async def describe_index(self, name: str) -> IndexModel:
         description = await self.index_api.describe_index(name)
-        host = description.host
-        self.index_host_store.set_host(self.config, name, host)
-
         return IndexModel(description)
 
     async def has_index(self, name: str) -> bool:
@@ -284,25 +253,17 @@ async def delete_collection(self, name: str):
     async def describe_collection(self, name: str):
         return await self.index_api.describe_collection(name).to_dict()
 
-    def Index(self, name: str = "", host: str = "", **kwargs):
-        if name == "" and host == "":
-            raise ValueError("Either name or host must be specified")
-
-        pt = kwargs.pop("pool_threads", None) or self.pool_threads
+    def Index(self, host: str, **kwargs):
         api_key = self.config.api_key
         openapi_config = self.openapi_config
 
-        if host != "":
-            # Use host url if it is provided
-            index_host = normalize_host(host)
-        else:
-            # Otherwise, get host url from describe_index using the index name
-            index_host = self.index_host_store.get_host(self.index_api, self.config, name)
+        index_host = normalize_host(host)
+        if index_host == "":
+            raise ValueError("host must be specified")
 
         return _AsyncioIndex(
             host=index_host,
             api_key=api_key,
-            pool_threads=pt,
             openapi_config=openapi_config,
             source_tag=self.config.source_tag,
             **kwargs,
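
The `delete_index` change replaces the local `get_remaining` closure with polling on `has_index`. A standalone sketch of that polling pattern follows, assuming a caller-supplied async predicate. `wait_until_deleted` and `FakeControlPlane` are hypothetical names for illustration, the `timeout == -1` early return from the diff is left out for brevity, and the sketch returns a boolean on timeout where the real method's handling is not shown in this hunk.

```python
import asyncio
from typing import Awaitable, Callable, Optional


async def wait_until_deleted(
    has_index: Callable[[str], Awaitable[bool]],
    name: str,
    timeout: Optional[float] = None,
    interval: float = 5.0,
) -> bool:
    """Poll until has_index(name) is False; return False if the time budget runs out."""
    remaining = timeout
    while await has_index(name):
        if remaining is not None and remaining < 0:
            return False
        await asyncio.sleep(interval)
        if remaining is not None:
            remaining -= interval
    return True


class FakeControlPlane:
    """Hypothetical stand-in that reports the index as present a fixed number of times."""

    def __init__(self, polls_until_gone: int) -> None:
        self.polls_left = polls_until_gone

    async def has_index(self, name: str) -> bool:
        self.polls_left -= 1
        return self.polls_left >= 0


# The fake reports the index twice, then reports it gone, well within the budget.
gone = asyncio.run(
    wait_until_deleted(FakeControlPlane(2).has_index, "demo", timeout=1, interval=0.01)
)
```

Passing the predicate in as an argument keeps the loop independent of any client class, which is the kind of seam the commit message describes as easier to type and test.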

pinecone/control/pinecone_interface.py

Lines changed: 0 additions & 5 deletions
@@ -3,8 +3,6 @@
 from typing import Optional, Dict, Union
 
 
-from pinecone.config import Config
-
 from pinecone.core.openapi.db_control.api.manage_indexes_api import ManageIndexesApi
 
 
@@ -39,7 +37,6 @@ def __init__(
         proxy_headers: Optional[Dict[str, str]] = None,
         ssl_ca_certs: Optional[str] = None,
         ssl_verify: Optional[bool] = None,
-        config: Optional[Config] = None,
         additional_headers: Optional[Dict[str, str]] = {},
         pool_threads: Optional[int] = 1,
         index_api: Optional[ManageIndexesApi] = None,
@@ -61,8 +58,6 @@ def __init__(
         :type ssl_ca_certs: str, optional
         :param ssl_verify: SSL verification is performed by default, but can be disabled using the boolean flag. Default: `True`
         :type ssl_verify: bool, optional
-        :param config: A `pinecone.config.Config` object. If passed, the `api_key` and `host` parameters will be ignored.
-        :type config: pinecone.config.Config, optional
         :param additional_headers: Additional headers to pass to the API. Default: `{}`
         :type additional_headers: Dict[str, str], optional
         :param pool_threads: The number of threads to use for the connection pool. Default: `1`

pinecone/control/pinecone_interface_asyncio.py

Lines changed: 1 addition & 1 deletion
@@ -529,7 +529,7 @@ async def describe_collection(self, name: str):
         pass
 
     @abstractmethod
-    async def Index(self, name: str = "", host: str = "", **kwargs):
+    async def Index(self, host: str, **kwargs):
         """
         Target an index for data operations.
 

pinecone/core/grpc/protos/db_data_2025_01_pb2_grpc.py

Lines changed: 1 addition & 1 deletion
@@ -10,7 +10,7 @@ class VectorServiceStub(object):
     This service could also be called a `gRPC` service or a `REST`-like api.
     """
 
-    def __init__(self, channel):
+    def __init__(self, channel) -> None:
         """Constructor.
 
         Args:

pinecone/core/openapi/db_control/__init__.py

Lines changed: 4 additions & 3 deletions
@@ -5,18 +5,19 @@
 
 Pinecone is a vector database that makes it easy to search and retrieve billions of high-dimensional vectors. # noqa: E501
 
+This file is @generated using OpenAPI.
+
 The version of the OpenAPI document: 2025-01
 
-Generated by: https://openapi-generator.tech
 """
 
 __version__ = "1.0.0"
 
 # import ApiClient
-from pinecone.openapi_support import ApiClient
+from pinecone.openapi_support.api_client import ApiClient
 
 # import Configuration
-from pinecone.openapi_support import Configuration
+from pinecone.openapi_support.configuration import Configuration
 
 # import exceptions
 from pinecone.openapi_support.exceptions import PineconeException
