Step 1: Creating Your Dataset on DigitalOcean Spaces

This guide walks you through three tasks: creating a DigitalOcean Spaces object store to host the dataset for the Kubernetes AI Agent, extracting the dataset contents, and uploading the prepared data.

1.1 Create a Spaces Object Store

To begin, you'll need to create a DigitalOcean Spaces object store where the dataset will be stored. Follow these steps:

1. Log in to your DigitalOcean account.
2. Navigate to the Spaces creation page.
3. Click Create Bucket and configure the following:
   - Select the datacenter region closest to your target audience.
   - Choose a unique name for your Space (e.g., kubernetes-agent-dataset).
   - Leave the default permission settings unless you require public access.
4. Once the Space is created, note its endpoint URL for later use.
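The endpoint URL noted in the last step normally follows a predictable pattern, so it can be reconstructed from the region and the Space name. The sketch below is illustrative only: the helper names `spaces_endpoint` and `spaces_bucket_url` are hypothetical, and `nyc3` is an assumed region slug, so substitute the region you actually selected.

```shell
# Hypothetical helper: build the regional Spaces endpoint for a region slug.
# Assumes the standard <region>.digitaloceanspaces.com pattern.
spaces_endpoint() {
  region="$1"
  printf 'https://%s.digitaloceanspaces.com\n' "$region"
}

# Hypothetical helper: build the bucket-specific URL for a Space.
spaces_bucket_url() {
  bucket="$1"
  region="$2"
  printf 'https://%s.%s.digitaloceanspaces.com\n' "$bucket" "$region"
}

# Example values: nyc3 is an assumption; the Space name is the example from the steps above.
spaces_endpoint nyc3
spaces_bucket_url kubernetes-agent-dataset nyc3
```

Always compare the result with the endpoint shown in the control panel, which remains the authoritative value.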

For more details, refer to the Spaces Quickstart Guide.

1.2 Extract the Dataset Contents

The dataset for the Kubernetes AI Agent is available in the GitHub repository. Extract the contents of the dataset archive as follows:

If you haven't done this already, clone the workshop repository:

git clone https://github.com/do-community/genai-agent-workshop.git

Navigate to the dataset folder:

cd genai-agent-workshop/kubernetes-walkthrough

Extract the contents of the ZIP file:

unzip kubernetes-agent.zip -d kubernetes-agent-dataset
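Before uploading, it can help to confirm that the extraction produced the expected content. The following is a minimal sketch, assuming the `website` and `hardway` folders sit directly inside `kubernetes-agent-dataset` (the archive layout may differ, so adjust the paths if needed). The helper name `count_markdown_files` is hypothetical.

```shell
# Count the Markdown files under a directory (prints 0 if the directory is missing).
count_markdown_files() {
  dir="$1"
  if [ ! -d "$dir" ]; then
    echo 0
    return 0
  fi
  # tr strips the padding that some wc implementations add around the count
  find "$dir" -type f -name '*.md' | wc -l | tr -d ' '
}

# Report a count for each of the two folders the upload step expects.
for folder in website hardway; do
  echo "$folder: $(count_markdown_files "kubernetes-agent-dataset/$folder") markdown files"
done
```

A count of 0 for either folder suggests the archive was extracted to a different location or has a different internal layout.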

(Optional) More Details on the Dataset Creation

This note describes how the dataset extracted in step 1.2 was prepared. It is for information only: you can skip it if you are using the ZIP file in genai-agent-workshop/kubernetes-walkthrough.

This dataset was prepared by combining files from the following repositories:

Kubernetes the Hard Way
Kubernetes Website

From these repositories, only the markdown (.md) files were retained. Below is an example of how this process was carried out:

# Clone the required repositories into the folder names used below
mkdir kubernetes-agent
cd kubernetes-agent
git clone https://github.com/kelseyhightower/kubernetes-the-hard-way.git hardway
git clone https://github.com/kubernetes/website.git website

# Retain only markdown files and delete other contents (including .git metadata)
find ./hardway -type f ! -name "*.md" -delete
find ./website -type f ! -name "*.md" -delete

# Delete empty folders
find ./hardway -type d -empty -delete
find ./website -type d -empty -delete
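Because `-delete` is irreversible, a preview of what the cleanup would remove can be useful. The sketch below applies the same filter as the commands above but prints the matches instead of deleting them. The helper name `list_non_markdown` is hypothetical, and the loop assumes it is run from the folder that contains `hardway` and `website`.

```shell
# Print every file under a directory that is NOT a Markdown file.
# Same filter as the cleanup commands, with -print in place of -delete.
list_non_markdown() {
  find "$1" -type f ! -name '*.md' -print
}

# Preview only when the folders exist, so the snippet is safe to run anywhere.
for folder in ./hardway ./website; do
  if [ -d "$folder" ]; then
    list_non_markdown "$folder"
  fi
done
```

After the cleanup has run, the same helper should print nothing for either folder.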

1.3 Upload the Dataset to the Spaces Object Store

We need to upload the dataset to the Spaces bucket. There are two ways to do this:

(Recommended) Easy Method

Navigate to your kubernetes-agent-dataset folder, then drag and drop the following folders into your Spaces bucket (this creates the folders and uploads their files):

website
hardway

Verify that the folders have been uploaded, then move on to the next section.

(Not Recommended) The Hard Method

Using the Create button, create two folders in your bucket:

website
hardway

For each folder you create, click the Upload button and upload the contents of the matching folder from the dataset.

Your dataset is now ready to be accessed by the Kubernetes AI Agent! Verify that the folders have been uploaded, then move on to the next section.
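As a scripted alternative to the two control panel methods, the upload can also be sketched from the command line, since Spaces exposes an S3-compatible API. This is not part of the workshop's own instructions. It assumes the AWS CLI is installed and configured with a Spaces access key and secret, that the Space is named `kubernetes-agent-dataset`, and that it lives in the `nyc3` region; all three are placeholders to replace. The `upload_folder` helper and the `DRY_RUN` switch are hypothetical names used for illustration.

```shell
# Assumed values: replace with your own Space name, region, and local dataset path.
SPACE_NAME="kubernetes-agent-dataset"
SPACE_REGION="nyc3"
DATASET_DIR="kubernetes-agent-dataset"
ENDPOINT="https://$SPACE_REGION.digitaloceanspaces.com"

# 1 prints the commands only; set DRY_RUN=0 to perform the upload.
DRY_RUN="${DRY_RUN:-1}"

# Sync one dataset folder into a folder of the same name in the Space.
upload_folder() {
  folder="$1"
  if [ "$DRY_RUN" = "1" ]; then
    echo "aws s3 sync $DATASET_DIR/$folder s3://$SPACE_NAME/$folder --endpoint-url $ENDPOINT"
  else
    aws s3 sync "$DATASET_DIR/$folder" "s3://$SPACE_NAME/$folder" --endpoint-url "$ENDPOINT"
  fi
}

# Upload (or preview) the two folders named in the steps above.
for folder in website hardway; do
  upload_folder "$folder"
done
```

The dry run is the default so the commands can be reviewed first. Whichever method you use, check in the control panel that both folders appear in the bucket.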

Next Steps...

→ Next Up: Creating Your Knowledge base on DigitalOcean Spaces
