Commit 3780924

[Chore] Automatically cleanup old resources each night (#400)
## Problem

Sometimes when test cleanup steps fail, indexes and collections get left behind.

## Solution

Create a nightly job to clean up leftover indexes and collections. Inspect the name of each index and collection to see whether it is more than 24 hours old before deleting it. This should prevent deleting resources out from underneath any tests that may be running at the same time as the delete job.

## Type of Change

- [x] Infrastructure change (CI configs, etc)
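The name-based check described in the Solution can be sketched roughly as follows. This is a minimal illustration, assuming resource names embed a creation date as `-YYYYMMDD-` (the pattern the script in this commit searches for). The sample names and the injectable `now` parameter are hypothetical and not part of the commit.

```python
import re
from datetime import datetime, timedelta

# Same -YYYYMMDD- shape that the commit's script looks for in resource names.
DATE_PATTERN = re.compile(r"-(\d{8})-")


def parse_date(resource_name):
    """Return the date embedded in the name, or None if there is none."""
    match = DATE_PATTERN.search(resource_name)
    if match is None:
        return None
    return datetime.strptime(match.group(1), "%Y%m%d")


def is_resource_old(resource_name, now, max_age=timedelta(hours=24)):
    """True only when the name carries a date stamp older than max_age."""
    stamped = parse_date(resource_name)
    if stamped is None:
        return False  # undated names are never treated as old
    return now - stamped > max_age


# Hypothetical names and a fixed clock, for illustration only.
now = datetime(2024, 5, 3, 22, 5)
for name in ["test-20240501-abc", "test-20240503-abc", "manual-index"]:
    print(name, is_resource_old(name, now))
```

Because the stamp has day granularity, age is measured from midnight of the stamped day, so a resource created late in the day crosses the 24-hour threshold sooner than its true age would suggest.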
1 parent c13a249 commit 3780924

3 files changed, +82 -4 lines changed

Lines changed: 17 additions & 0 deletions
@@ -0,0 +1,17 @@
+name: 'Cleanup All Indexes/Collections (Nightly)'
+
+on:
+  schedule:
+    - cron: '5 22 * * *' # 5 minutes after 10pm UTC, every day
+
+jobs:
+  cleanup-all:
+    name: Cleanup all indexes/collections
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+      - name: Cleanup all
+        uses: ./.github/actions/cleanup-all
+        with:
+          PINECONE_API_KEY: ${{ secrets.PINECONE_API_KEY }}
+          DELETE_ALL: false

.github/workflows/cleanup.yaml

Lines changed: 2 additions & 1 deletion
@@ -12,4 +12,5 @@ jobs:
       - name: Cleanup all
         uses: ./.github/actions/cleanup-all
         with:
-          PINECONE_API_KEY: ${{ secrets.PINECONE_API_KEY }}
+          PINECONE_API_KEY: ${{ secrets.PINECONE_API_KEY }}
+          DELETE_ALL: true

scripts/cleanup-all.py

Lines changed: 63 additions & 3 deletions
@@ -1,10 +1,10 @@
 import os
+import re
 from pinecone import Pinecone
+from datetime import datetime, timedelta
 
 
-def main():
-    pc = Pinecone(api_key=os.environ.get("PINECONE_API_KEY", None))
-
+def delete_everything(pc):
     for collection in pc.list_collections().names():
         try:
             print("Deleting collection: " + collection)
@@ -22,5 +22,65 @@ def main():
             pass
 
 
+def parse_date(resource_name):
+    match = re.search(r"-\d{8}-", resource_name)
+    if match:
+        date_string = match.group(0).strip("-")
+        return datetime.strptime(date_string, "%Y%m%d")
+    else:
+        return None
+
+
+def is_resource_old(resource_name):
+    print(f"Checking resource name: {resource_name}")
+    resource_datetime = parse_date(resource_name)
+    if resource_datetime is None:
+        return False
+    current_time = datetime.now()
+
+    # Calculate the difference
+    time_difference = current_time - resource_datetime
+
+    # Check if the time difference is greater than 24 hours
+    print(f"Resource timestamp: {resource_datetime}")
+    print(f"Time difference: {time_difference}")
+    return time_difference > timedelta(hours=24)
+
+
+def delete_old(pc):
+    for collection in pc.list_collections().names():
+        if is_resource_old(collection):
+            try:
+                print("Deleting collection: " + collection)
+                pc.delete_collection(collection)
+            except Exception as e:
+                print("Failed to delete collection: " + collection + " " + str(e))
+                pass
+        else:
+            print("Skipping collection, not old enough: " + collection)
+
+    for index in pc.list_indexes().names():
+        if is_resource_old(index):
+            try:
+                print("Deleting index: " + index)
+                pc.delete_index(index)
+            except Exception as e:
+                print("Failed to delete index: " + index + " " + str(e))
+                pass
+        else:
+            print("Skipping index, not old enough: " + index)
+
+
+def main():
+    pc = Pinecone(api_key=os.environ.get("PINECONE_API_KEY", None))
+
+    if os.environ.get("DELETE_ALL", None) == "true":
+        print("Deleting everything")
+        delete_everything(pc)
+    else:
+        print("Deleting old resources")
+        delete_old(pc)
+
+
 if __name__ == "__main__":
     main()
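
As `main()` in the diff above shows, the script deletes everything only when the `DELETE_ALL` environment variable is exactly the string `true`; any other value, or no value at all, selects the age-based path. A small sketch of that dispatch rule, using a hypothetical `select_mode` helper that is not part of the commit:

```python
def select_mode(env):
    """Mirror the commit's check: only the literal string "true" deletes everything."""
    if env.get("DELETE_ALL") == "true":
        return "delete_everything"
    return "delete_old"


# Hypothetical environments, for illustration only.
for env in ({"DELETE_ALL": "true"}, {"DELETE_ALL": "false"}, {"DELETE_ALL": "True"}, {}):
    print(env, "->", select_mode(env))
```

The comparison is case-sensitive, so a value such as `True` falls through to the more conservative age-based cleanup.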
