An end-to-end Machine Learning Ops (MLOps) project. The architecture can be found here:
The Feature Store & Prediction Store were implemented in Azure using a pyodbc connector:
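A minimal sketch of how a store table could be read through pyodbc against Azure SQL. The server, database, credentials, and table name below are placeholders, not the project's actual configuration:

```python
def build_conn_str(server: str, database: str, user: str, password: str) -> str:
    """Build an ODBC connection string for Azure SQL Database."""
    return (
        "DRIVER={ODBC Driver 18 for SQL Server};"
        f"SERVER={server};DATABASE={database};"
        f"UID={user};PWD={password};Encrypt=yes;"
    )


def fetch_sample(conn_str: str, table: str = "features"):
    """Open a connection and return the first five rows of `table`."""
    import pyodbc  # imported lazily so this module loads without the ODBC driver

    with pyodbc.connect(conn_str) as conn:
        cursor = conn.cursor()
        cursor.execute(f"SELECT TOP 5 * FROM {table}")
        return cursor.fetchall()
```

Running this requires the Microsoft ODBC Driver for SQL Server to be installed locally in addition to the `pyodbc` package.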
You can find the dataset here.
For storage reasons, the dataset is not included in the repository. Additionally, the data has been split in two: half for ingestion and half for training.
To run this project, follow these steps:
- Clone this repository from GitHub onto your local machine.
- Create a folder called `data` at the root of the repository, e.g. the path should be `FlightPricePrediction/data`.
- Download the CSV file `Clean_Dataset.csv` from the Kaggle link above into the `data` folder.
- Making sure your terminal's working directory is the repository root (not a subfolder), run the data-splitting script `0.0-splitting-data.py` under `/model/industralized-scripts`.
- The script will automatically split the data into ingestion and training sets in their respective folders, `/airflow-data/` and `/model/raw_data/`.
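The split step above can be sketched as follows. The real logic lives in `0.0-splitting-data.py`; the function here is only an illustration of a half-and-half CSV split, and the paths passed to it are hypothetical:

```python
import csv
from pathlib import Path


def split_csv(src: Path, first_out: Path, second_out: Path) -> None:
    """Split a CSV into two halves, keeping the header row in both files."""
    with src.open(newline="") as f:
        rows = list(csv.reader(f))
    header, body = rows[0], rows[1:]
    mid = len(body) // 2  # first half for ingestion, second half for training
    for out, chunk in ((first_out, body[:mid]), (second_out, body[mid:])):
        out.parent.mkdir(parents=True, exist_ok=True)
        with out.open("w", newline="") as f:
            writer = csv.writer(f)
            writer.writerow(header)
            writer.writerows(chunk)
```

Keeping the header in both output files lets each downstream stage read its half independently.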
After cloning this repository, you can run the Streamlit app locally as follows:

- Open your terminal at the root of this repository.
- Install all the packages in `requirements.txt` by running the following command:
  `pip install -r requirements.txt`
- Run the following command:
  `streamlit run frontend/Make_Predictions.py`
- Copy the URL printed in your terminal and paste it into your browser.
- To serve the API, run:
  `uvicorn Fastapi_endpoints_copy:app --port 5000`