From 829451807a366599fcd02974a01d6ade071cf332 Mon Sep 17 00:00:00 2001
From: Aditya Sharma <adityashrm21@gmail.com>
Date: Sat, 30 Mar 2019 22:30:04 -0700
Subject: [PATCH] checking readme instructions

---
 README.md | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/README.md b/README.md
index 06e1727..4ad3f07 100644
--- a/README.md
+++ b/README.md
@@ -1,4 +1,4 @@
-# Adult Income Prediction using Flask app on Heroku
+# Adult Income Prediction
 
 Follow the steps provided below to reproduce the whole project.
 
@@ -25,7 +25,7 @@ Now you should have everything installed that we need.
 
 ### Data format before cleaning
 
-This information is directly copied from the [UCI datasets repository for adult dataset](https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.names)
+This information is directly copied from the [UCI datasets repository for adult dataset](https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.names).
 
 - income: >50K, <=50K.
 - age: continuous.
@@ -79,7 +79,7 @@ Execute the `main.py` script which will train a model on the cleaned data and ex
 ```bash
 python3 incomePrediction/main.py
 ```
-I chose the `LogisticRegression` classifier from scikit-learn to get predictions (The test accuracy obtained is quite well ~ 85%). Cross-validation is done to choose the important hyperparameter (`C`) to control the degree of regularization. The script can be modified to use and tune any classifier available in `scikit-learn`. Both the training and test accuracies are comparable and hence, there seems to be no overfitting. I chose to go with Logistic Regression because it is a simple linear classifier whose results are interpretable and this is what I would expect from a model on such a dataset where the predictor-response relationship seems to be important in the analysis. I also tried building and tuning a RandomForest classifier and there was a 1 percent increase in the accuracies which is not much higher and therefore, a simpler model is a better choice.
+I chose the `LogisticRegression` classifier from scikit-learn to get predictions (the test accuracy obtained is good, about 85%). Cross-validation is used to choose the key hyperparameter (`C`), which controls the degree of regularization. The script can be modified to use and tune any classifier available in `scikit-learn`. The training and test accuracies are comparable, so there seems to be no overfitting. I chose Logistic Regression because it is a simple linear classifier whose results are interpretable, which is what I would expect from a model on a dataset where the predictor-response relationship matters to the analysis. I also tried building and tuning a RandomForest classifier, which increased the accuracies by only 1%, so the simpler model is the better choice.
 
 ### Deploy the model on Heroku
 
@@ -103,4 +103,4 @@ I have used the `pytest` library to test the [Util class](https://github.com/adi
 pytest incomePrediction/tests/
 ```
 
-Due to shortage on time, I could not cover all kinds of tests but I did set up a basic test infrastructure which could be extended to test the remaining code (unit, integration and e2e tests).
+As of now, the tests are not exhaustive, but I did set up a basic test infrastructure.