Trains an XGBoost model end to end for Customer Frequency prediction, orchestrated with Kubeflow on Kubernetes.
- 📝 Table of Contents
- 🧐 About
- 🏁 Getting Started
- 🔧 Running the Training Pipeline
- 🎈 Monitoring
- 🚀 Deployment
- ⛏️ Built Using
- ✍️ Authors
Components in the pipeline include (see the sketch after this list):
- Data preprocessing
- Data Splitting
- Hyperparameter Optimisation and Cross Validation
- Model Saving (Local or Cloud)
- Model testing on new data
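The snippet below is a minimal sketch of how components like these can be wired together with the KFP v2 SDK. The component bodies, the `data_url` parameter, and the `frequency` target column are illustrative assumptions, not the project's actual code.

```python
# Hypothetical sketch of the training pipeline structure (KFP v2 SDK).
from kfp import compiler, dsl
from kfp.dsl import Dataset, Input, Model, Output


@dsl.component(base_image="python:3.10", packages_to_install=["pandas"])
def preprocess(data_url: str, clean: Output[Dataset]):
    # Load the raw CSV and apply basic cleaning (placeholder logic).
    import pandas as pd

    pd.read_csv(data_url).dropna().to_csv(clean.path, index=False)


@dsl.component(base_image="python:3.10", packages_to_install=["pandas", "scikit-learn"])
def split(clean: Input[Dataset], train: Output[Dataset], test: Output[Dataset]):
    # Split the cleaned data into train and test sets.
    import pandas as pd
    from sklearn.model_selection import train_test_split

    df = pd.read_csv(clean.path)
    train_df, test_df = train_test_split(df, test_size=0.2, random_state=42)
    train_df.to_csv(train.path, index=False)
    test_df.to_csv(test.path, index=False)


@dsl.component(base_image="python:3.10",
               packages_to_install=["pandas", "scikit-learn", "xgboost"])
def train_with_hpo(train: Input[Dataset], model: Output[Model]):
    # Tune XGBoost hyperparameters with cross-validated random search,
    # then save the best model (locally here; a cloud upload could go in its place).
    import os

    import pandas as pd
    from sklearn.model_selection import RandomizedSearchCV
    from xgboost import XGBRegressor

    df = pd.read_csv(train.path)
    X, y = df.drop(columns=["frequency"]), df["frequency"]  # assumed target column
    search = RandomizedSearchCV(
        XGBRegressor(),
        param_distributions={
            "max_depth": [3, 5, 7],
            "n_estimators": [100, 300],
            "learning_rate": [0.05, 0.1],
        },
        n_iter=5,
        cv=5,
        scoring="neg_root_mean_squared_error",
    )
    search.fit(X, y)
    os.makedirs(model.path, exist_ok=True)
    search.best_estimator_.save_model(os.path.join(model.path, "model.json"))


@dsl.pipeline(name="customer-frequency-training")
def training_pipeline(data_url: str):
    prep = preprocess(data_url=data_url)
    splits = split(clean=prep.outputs["clean"])
    train_with_hpo(train=splits.outputs["train"])
    # A testing component evaluating on splits.outputs["test"] follows the same pattern.


if __name__ == "__main__":
    # Compile to a YAML spec that can be submitted to Kubeflow Pipelines.
    compiler.Compiler().compile(training_pipeline, "pipeline.yaml")
```

Keeping each step as its own containerised component is what lets Kubeflow schedule, cache, and retry the steps independently.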
You will need a Kubernetes cluster with Kubeflow set up before doing anything.
Make sure you have Poetry installed. If you are on Arch Linux, you can install it by running:
yay -S python-poetry
This is how I set up Kubeflow with Minikube.
Note that I have `kubectl` aliased to `minikube kubectl --`.
minikube start --cpus 4 --memory 4000 --disk-size=10g
curl -s "https://raw.githubusercontent.com/kubernetes-sigs/kustomize/master/hack/install_kustomize.sh" | bash
git clone [email protected]:kubeflow/manifests.git
cd manifests
while ! kustomize build example | awk '!/well-defined/' | kubectl apply -f -; do echo "Retrying to apply resources"; sleep 10; done
kubectl get pods -A
# To make sure its working
# Give all the pods some time to get running
kubectl port-forward svc/istio-ingressgateway -n istio-system 8080:80
To install all dependencies (I highly recommend using a virtual environment), run:
poetry install
poetry run python train_pipeline_execute.py --host http://your_kubeflow_host_url
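For reference, a submission script such as `train_pipeline_execute.py` typically boils down to a few KFP client calls. The sketch below is an assumption about its shape (the `pipeline` module, `training_pipeline` import, and dataset URL are placeholders), not the script's actual contents.

```python
# Hypothetical sketch of a pipeline submission script using the KFP SDK.
import argparse

import kfp

from pipeline import training_pipeline  # placeholder import; use the real pipeline module

parser = argparse.ArgumentParser()
parser.add_argument("--host", required=True,
                    help="Kubeflow Pipelines endpoint, e.g. http://localhost:8080")
args = parser.parse_args()

client = kfp.Client(host=args.host)
client.create_run_from_pipeline_func(
    training_pipeline,
    arguments={"data_url": "https://example.com/customer_data.csv"},  # placeholder URL
)
```

If your Kubeflow deployment sits behind Dex authentication, `kfp.Client` will also need session credentials; the plain `--host` form above matches the port-forwarded setup shown earlier.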
This project uses Aim for monitoring model performance.
After the pipeline completes, run `aim up` in your terminal.
This will open a monitoring dashboard in your browser.
The dashboard will look something like this:
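If you want to log extra metrics from your own components, Aim's Python SDK can be used roughly as shown below. This is only a sketch; the experiment name, hyperparameters, and metric values are placeholders, not results from this project.

```python
# Hypothetical example of tracking metrics with Aim from a training step.
from aim import Run

run = Run(experiment="customer-frequency")  # placeholder experiment name
run["hparams"] = {"max_depth": 5, "n_estimators": 300, "learning_rate": 0.1}

for step, rmse in enumerate([1.9, 1.4, 1.1]):  # dummy values for illustration
    run.track(rmse, name="rmse", step=step, context={"subset": "val"})
```

`aim up` serves the `.aim` repository created in the working directory where runs were logged.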
A Gradio frontend has been built and can be deployed anywhere using the `inference.py` script.
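For orientation, an inference app of this kind usually amounts to loading the saved model and wrapping a predict function in a Gradio interface. The sketch below is an assumption about what `inference.py` might look like; the feature names and model path are placeholders.

```python
# Hypothetical sketch of a Gradio inference app for the trained XGBoost model.
import gradio as gr
import pandas as pd
from xgboost import XGBRegressor

model = XGBRegressor()
model.load_model("model.json")  # placeholder path to the saved model


def predict(recency: float, monetary: float, tenure: float) -> float:
    # Feature names must match the columns the model was trained on.
    features = pd.DataFrame([{"recency": recency, "monetary": monetary, "tenure": tenure}])
    return float(model.predict(features)[0])


demo = gr.Interface(
    fn=predict,
    inputs=[gr.Number(label="Recency"), gr.Number(label="Monetary"), gr.Number(label="Tenure")],
    outputs=gr.Number(label="Predicted purchase frequency"),
)

if __name__ == "__main__":
    demo.launch()
```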
- Python
- XGBoost
- Aim - Monitoring and Tracking
- Kubeflow - Distributed computing for training and deployment
- Gradio - API