viraj-s15/KubeflowAccuPred

Customer Frequency Prediction Pipeline


Trains an XGBoost model end to end for customer frequency prediction, running on Kubernetes with Kubeflow.

🧐 About

Components in the pipeline include:

  • Data preprocessing
  • Data Splitting
  • Hyperparameter Optimisation and Cross Validation
  • Model Saving (Local or Cloud)
  • Model testing on new data
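The splitting and cross-validated hyperparameter search steps listed above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the repository's implementation: the function names (`train_test_split`, `k_fold_indices`, `grid_search_cv`) are hypothetical, and a trivial "shrunken mean" predictor stands in for XGBoost so that the example runs with the standard library alone.

```python
import random
from itertools import product
from statistics import mean


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle the rows and hold out `test_fraction` of them for testing."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    cut = len(shuffled) - n_test
    return shuffled[:cut], shuffled[cut:]


def k_fold_indices(n_rows, k):
    """Yield (train_idx, valid_idx) pairs; every row is validated exactly once."""
    folds = [list(range(i, n_rows, k)) for i in range(k)]
    for i, valid_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, valid_idx


def grid_search_cv(rows, param_grid, fit, score, k=3):
    """Return (best_params, best_score) for the lowest mean validation error."""
    best_params, best_score = None, float("inf")
    names = sorted(param_grid)
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        fold_scores = []
        for train_idx, valid_idx in k_fold_indices(len(rows), k):
            model = fit([rows[i] for i in train_idx], **params)
            fold_scores.append(score(model, [rows[i] for i in valid_idx]))
        avg = mean(fold_scores)
        if avg < best_score:
            best_params, best_score = params, avg
    return best_params, best_score


# Stand-in model (an assumption, NOT XGBoost): predict shrink * mean(target).
def fit_shrunken_mean(train_rows, shrink):
    return shrink * mean(y for _, y in train_rows)


def mean_squared_error(prediction, valid_rows):
    return mean((y - prediction) ** 2 for _, y in valid_rows)


# Toy data: (feature, target) pairs with a constant target.
rows = [(x, 10.0) for x in range(10)]
train_rows, test_rows = train_test_split(rows)
best_params, best_score = grid_search_cv(
    train_rows, {"shrink": [0.5, 0.8, 1.0]}, fit_shrunken_mean, mean_squared_error
)
```

In a real pipeline the `fit` and `score` callables would wrap an XGBoost model and an appropriate metric, and the held-out `test_rows` would be used only for the final "model testing on new data" step.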

🏁 Getting Started

You will need a Kubernetes (k8s) cluster with Kubeflow installed before running anything.

Prerequisites

Make sure you have Poetry installed. On Arch Linux you can install it by running:

yay -S python-poetry 

This is how I set up Kubeflow with Minikube.

Note that I have kubectl aliased to minikube kubectl --, so the kubectl commands below run through Minikube.

minikube start --cpus 4 --memory 4000 --disk-size=10g

curl -s "https://raw.githubusercontent.com/kubernetes-sigs/kustomize/master/hack/install_kustomize.sh"  | bash

git clone git@github.com:kubeflow/manifests.git

cd manifests

while ! kustomize build example | awk '!/well-defined/' | kubectl apply -f -; do echo "Retrying to apply resources"; sleep 10; done

kubectl get pods -A
# Check that everything is working
# Give all the pods some time to reach the Running state

kubectl port-forward svc/istio-ingressgateway -n istio-system 8080:80
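Scanning the `kubectl get pods -A` output by eye gets tedious while the pods come up. As a rough sketch, the helper below parses that output and lists the pods that are not yet ready. The function names are hypothetical and the pod names in the usage example are made up. It assumes the default column order (NAMESPACE, NAME, READY, STATUS, ...), and that `kubectl` is a real executable on `PATH`, because a shell alias is not visible to `subprocess`.

```python
import subprocess


def not_ready_pods(kubectl_output):
    """Return (namespace, name, status) for pods that are neither ready nor completed."""
    pending = []
    for line in kubectl_output.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # blank or malformed line
        namespace, name, ready, status = fields[:4]
        if namespace == "NAMESPACE":
            continue  # header row
        ready_count, _, total = ready.partition("/")
        if status == "Completed":
            continue  # finished jobs are fine
        if status == "Running" and ready_count == total:
            continue  # all containers in the pod are ready
        pending.append((namespace, name, status))
    return pending


def current_not_ready_pods():
    """Run kubectl and report pods that still need time (requires cluster access)."""
    result = subprocess.run(
        ["kubectl", "get", "pods", "-A", "--no-headers"],
        check=True,
        capture_output=True,
        text=True,
    )
    return not_ready_pods(result.stdout)


# Usage with made-up sample output (pod names are hypothetical):
sample = (
    "NAMESPACE NAME READY STATUS RESTARTS AGE\n"
    "kubeflow ml-pipeline-abc 1/1 Running 0 5m\n"
    "kubeflow katib-xyz 0/1 ContainerCreating 0 5m\n"
    "istio-system istiod-1 1/2 Running 1 (2m ago) 5m\n"
)
waiting = not_ready_pods(sample)
```

An empty list from `current_not_ready_pods()` suggests the cluster is ready for the port-forward step.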

Installing

I highly recommend installing the dependencies inside a virtual environment. To install them all, run:

poetry install

🔧 Running the Training Pipeline

poetry run python train_pipeline_execute.py --host http://your_kubeflow_host_url
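For orientation, a script that accepts `--host` and submits a pipeline could look roughly like the sketch below. This is an assumption about the general shape of such a script, not the actual contents of train_pipeline_execute.py. `parse_args` and `submit` are hypothetical names, while `kfp.Client(host=...)` and `create_run_from_pipeline_func` come from the Kubeflow Pipelines SDK. Authentication, which a full Kubeflow install may require, is omitted.

```python
import argparse


def parse_args(argv=None):
    """Parse the command line; --host is the Kubeflow Pipelines endpoint URL."""
    parser = argparse.ArgumentParser(
        description="Submit the training pipeline to Kubeflow."
    )
    parser.add_argument(
        "--host", required=True, help="Base URL of the Kubeflow host"
    )
    return parser.parse_args(argv)


def submit(pipeline_func, host):
    """Submit a pipeline function as a one-off run (needs the kfp package)."""
    import kfp  # imported lazily so argument parsing works without the SDK

    client = kfp.Client(host=host)
    # An empty arguments dict means the pipeline's default parameters are used.
    return client.create_run_from_pipeline_func(pipeline_func, arguments={})


# Usage: parse an example command line (the URL matches the port-forward above).
args = parse_args(["--host", "http://localhost:8080"])
```

With the port-forward from the setup steps active, `http://localhost:8080` is the likely value for `--host` on a local Minikube install.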

🎈 Monitoring

This project uses Aim for monitoring model performance. Run aim up in your terminal after the completion of the pipeline. This will open a dashboard in your browser for monitoring.

It will look something like this:

[Screenshot: swappy-20231130_123149]

[Screenshot: swappy-20231130_123119]
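For metrics to show up in the Aim dashboard, they have to be tracked during training. The sketch below shows one plausible way to do that with Aim's `Run.track` method. It is an illustration, not the repository's tracking code: `log_history`, `new_aim_run`, the metric name `rmse`, and the experiment name are all hypothetical.

```python
def log_history(run, history, subset="validation"):
    """Track every metric in `history` (one dict per step) on an Aim-style run."""
    for step, metrics in enumerate(history):
        for name, value in metrics.items():
            run.track(value, name=name, step=step, context={"subset": subset})


def new_aim_run(experiment="customer-frequency"):
    """Create a real Aim run (needs the aim package; experiment name is made up)."""
    from aim import Run  # imported lazily so log_history can be used with any tracker

    return Run(experiment=experiment)


# Usage with a tiny recorder standing in for aim.Run:
class RecordingRun:
    def __init__(self):
        self.calls = []

    def track(self, value, name=None, step=None, context=None):
        self.calls.append((name, step, value))


recorder = RecordingRun()
log_history(recorder, [{"rmse": 1.5}, {"rmse": 1.2}])
```

Passing the object returned by `new_aim_run()` instead of the recorder writes the same values to the local Aim repository, which is what `aim up` reads.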

🚀 Deployment

A Gradio frontend has been built. It can be deployed anywhere using the inference.py script.
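As a rough idea of what a Gradio frontend for a trained regressor involves, here is a minimal sketch. It does not reproduce inference.py: `make_predict_fn` and `build_demo` are hypothetical names, the feature names are placeholders, and `model` is assumed to be any object with a scikit-learn style `predict` method.

```python
def make_predict_fn(model, feature_names):
    """Wrap `model.predict` so it accepts one numeric argument per feature."""

    def predict(*values):
        if len(values) != len(feature_names):
            raise ValueError(f"expected {len(feature_names)} values, got {len(values)}")
        row = [float(v) for v in values]
        return float(model.predict([row])[0])

    return predict


def build_demo(model, feature_names):
    """Build a Gradio interface with one number input per feature (needs gradio)."""
    import gradio as gr  # imported lazily so the wrapper can be tested without it

    return gr.Interface(
        fn=make_predict_fn(model, feature_names),
        inputs=[gr.Number(label=name) for name in feature_names],
        outputs=gr.Number(label="Predicted frequency"),
    )


# Usage with a stub model that sums its inputs (placeholder feature names):
class SumModel:
    def predict(self, rows):
        return [sum(row) for row in rows]


predict = make_predict_fn(SumModel(), ["feature_a", "feature_b"])
prediction = predict(1, 2)
```

Calling `build_demo(model, feature_names).launch()` would then serve the interface locally.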

⛏️ Built Using

  • Python
  • XGBoost
  • Aim - Monitoring and Tracking
  • Kubeflow - Distributed computing for training and deployment
  • Gradio - API

✍️ Authors
