This guide will walk you through creating a DigitalOcean Spaces object store to host the dataset for the Kubernetes AI Agent, extracting the dataset contents, and uploading the prepared data.
To begin, you'll need to create a DigitalOcean Spaces object store where the dataset will be stored. Follow these steps:
- Log in to your DigitalOcean account.
- Navigate to the Spaces creation page.
- Click Create Bucket and configure the following:
  - Select a datacenter region closest to your target audience.
  - Choose a unique name for your Space (e.g., kubernetes-agent-dataset).
  - Leave the default settings for permissions unless you require public access.
- Once the Space is created, note the endpoint URL for later use.
For more details, refer to the Spaces Quickstart Guide.
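As a rough aid for noting the endpoint, the sketch below builds the two URLs you will usually see for a Space. It assumes the common `<region>.digitaloceanspaces.com` pattern; the region slug `nyc3` and the helper names `spaces_endpoint` and `space_url` are illustrative placeholders, so check the values against what the control panel shows for your Space.

```shell
# Regional endpoint, typically what S3-compatible tools ask for.
# Assumption: the endpoint follows https://<region>.digitaloceanspaces.com
spaces_endpoint() {
  echo "https://$1.digitaloceanspaces.com"
}

# URL of one specific Space: the Space name is prefixed to the regional host.
space_url() {
  echo "https://$1.$2.digitaloceanspaces.com"
}

# Hypothetical values; replace with your own Space name and region slug.
spaces_endpoint nyc3
space_url kubernetes-agent-dataset nyc3
```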
The dataset for the Kubernetes AI Agent is available in the GitHub repository. Extract the contents of the dataset archive as follows:
- If you haven't done this already, clone the workshop repository:
    git clone https://github.com/do-community/genai-agent-workshop.git
- Navigate to the dataset folder:
    cd genai-agent-workshop/kubernetes-walkthrough
- Extract the contents of the ZIP file:
    unzip kubernetes-agent.zip -d kubernetes-agent-dataset
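To confirm the extraction worked, a quick check along these lines may help. It is a minimal sketch that assumes the archive unpacks into `website` and `hardway` folders directly under `kubernetes-agent-dataset`; the helper name `count_markdown` is made up for this example.

```shell
# Print how many markdown files each expected dataset folder contains.
count_markdown() {
  dataset_dir="$1"
  for folder in website hardway; do
    if [ -d "${dataset_dir}/${folder}" ]; then
      # tr strips the padding some wc implementations add around the number
      count=$(find "${dataset_dir}/${folder}" -type f -name "*.md" | wc -l | tr -d '[:space:]')
      echo "${folder}: ${count} markdown files"
    else
      echo "${folder}: missing"
    fi
  done
}

# Assumed location of the extracted dataset (relative to the current folder).
count_markdown kubernetes-agent-dataset
```

If a folder is reported as missing, the archive may have unpacked into a nested folder; adjust the path passed to the function accordingly.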
This note shows how we prepared the dataset used in step 1.2 above. It is for your information only: you can skip it if you are using the zip file from genai-agent-workshop/kubernetes-walkthrough.
This dataset was prepared by combining files from the following repositories:
- https://github.com/kelseyhightower/kubernetes-the-hard-way
- https://github.com/kubernetes/website
From these repositories, only the markdown (.md) files were retained. Below is an example of how this process was carried out:
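# Clone the required repositories into the dataset folder
# (assumption: the hardway and website folder names are inferred from the cleanup commands below)
mkdir -p kubernetes-agent
git clone https://github.com/kelseyhightower/kubernetes-the-hard-way.git kubernetes-agent/hardway
git clone https://github.com/kubernetes/website.git kubernetes-agent/website
# Retain only markdown files and delete other contents (this also removes the .git files)
cd kubernetes-agent
find ./hardway -type f ! -name "*.md" -delete
find ./website -type f ! -name "*.md" -delete
# Delete empty folders
find ./hardway -type d -empty -delete
find ./website -type d -empty -delete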
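Because `-delete` is irreversible, it can be worth previewing what the filter matches first. The sketch below reuses the same `find` expression with `-print` in place of `-delete`; the helper name `preview_cleanup` is hypothetical and not part of the original preparation steps.

```shell
# List the files that the cleanup would remove, without deleting anything.
preview_cleanup() {
  find "$1" -type f ! -name "*.md" -print
}

# Assumed folder names, matching the cleanup commands above.
# Nothing is printed for a folder that holds only markdown files.
for folder in ./hardway ./website; do
  if [ -d "$folder" ]; then
    preview_cleanup "$folder"
  fi
done
```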
Next, upload the dataset to your Spaces bucket. There are two methods for doing this.
For the first method, navigate to your kubernetes-agent-dataset folder, then drag and drop the following folders into your Spaces bucket (this creates the folders and uploads the files):
- website
- hardway
Verify that the folders have been uploaded, then move on to the next section.
Alternatively, use the Create button to create two folders in your bucket:
- website
- hardway
For each folder you create, click the Upload button and upload the contents of the matching folder from the dataset.
Your dataset is now ready to be accessed by the Kubernetes AI Agent! Verify that the folders have been uploaded, then move on to the next section.
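If you prefer the command line to the control panel, the same upload could be sketched with an S3-compatible client. The example below assumes the AWS CLI is installed and configured with your Spaces access keys. The Space name, the `nyc3` region, and the helper name `upload_dataset` are placeholders. By default it only prints the commands, so you can review them before running anything.

```shell
# Hypothetical values; replace with your own Space name and region slug.
SPACE_NAME="kubernetes-agent-dataset"
REGION="nyc3"
ENDPOINT_URL="https://${REGION}.digitaloceanspaces.com"

# Sync each dataset folder to a folder of the same name in the Space.
# With DRY_RUN=1 (the default) the commands are printed, not executed.
upload_dataset() {
  dataset_dir="$1"
  for folder in website hardway; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo aws s3 sync "${dataset_dir}/${folder}" "s3://${SPACE_NAME}/${folder}" --endpoint-url "${ENDPOINT_URL}"
    else
      aws s3 sync "${dataset_dir}/${folder}" "s3://${SPACE_NAME}/${folder}" --endpoint-url "${ENDPOINT_URL}"
    fi
  done
}

upload_dataset kubernetes-agent-dataset
```

Set `DRY_RUN=0` before calling `upload_dataset` to perform the actual upload once the printed commands look right.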
→ Next Up: Creating Your Knowledge base on DigitalOcean Spaces