checking readme instructions

adityashrm21 · Mar 31, 2019
commit 8294518 · 1 parent 7b2adca
Showing 1 changed file with 4 additions and 4 deletions.
diff --git a/README.md b/README.md
@@ -1,4 +1,4 @@
-# Adult Income Prediction using Flask app on Heroku
+# Adult Income Prediction
 
 Follow the steps provided below to reproduce the whole project.
 
@@ -25,7 +25,7 @@ Now you should have everything installed that we need.
 
 ### Data format before cleaning
 
-This information is directly copied from the [UCI datasets repository for adult dataset](https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.names)
+This information is directly copied from the [UCI datasets repository for adult dataset](https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.names).
 
 - income: >50K, <=50K.
 - age: continuous.
@@ -79,7 +79,7 @@ Execute the `main.py` script which will train a model on the cleaned data and ex
 ```bash
 python3 incomePrediction/main.py
 ```
-I chose the `LogisticRegression` classifier from scikit-learn to get predictions (The test accuracy obtained is quite well ~ 85%). Cross-validation is done to choose the important hyperparameter (`C`) to control the degree of regularization. The script can be modified to use and tune any classifier available in `scikit-learn`. Both the training and test accuracies are comparable and hence, there seems to be no overfitting. I chose to go with Logistic Regression because it is a simple linear classifier whose results are interpretable and this is what I would expect from a model on such a dataset where the predictor-response relationship seems to be important in the analysis. I also tried building and tuning a RandomForest classifier and there was a 1 percent increase in the accuracies which is not much higher and therefore, a simpler model is a better choice.
+I chose the `LogisticRegression` classifier from scikit-learn to get predictions (the test accuracy obtained is quite good, ~85%). Cross-validation is used to choose the main hyperparameter (`C`), which controls the degree of regularization. The script can be modified to use and tune any classifier available in `scikit-learn`. The training and test accuracies are comparable, so there seems to be no overfitting. I chose Logistic Regression because it is a simple linear classifier whose results are interpretable, which is what I would expect from a model on a dataset where the predictor-response relationship is important to the analysis. I also tried building and tuning a RandomForest classifier; it gave only a 1% increase in accuracy, which is not enough to justify the added complexity, so the simpler model is the better choice.
 
 ### Deploy the model on Heroku
 
@@ -103,4 +103,4 @@ I have used the `pytest` library to test the [Util class](https://github.com/adi
 pytest incomePrediction/tests/
 ```
 
-Due to shortage on time, I could not cover all kinds of tests but I did set up a basic test infrastructure which could be extended to test the remaining code (unit, integration and e2e tests).
+As of now, the tests section is not exhaustive but I did set up a basic test infrastructure.
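
The model-selection step described in the README hunk above (cross-validating `C` for `LogisticRegression`) could look roughly like the sketch below. It is not taken from `incomePrediction/main.py`: the synthetic data from `make_classification`, the grid of `C` values, and the 5-fold setting are assumptions for illustration only.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the cleaned adult data (assumption: the real
# features and labels come from the project's own cleaning step).
X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Smaller C means stronger regularization; these grid values are illustrative.
param_grid = {"C": [0.01, 0.1, 1.0, 10.0]}
search = GridSearchCV(LogisticRegression(max_iter=1000), param_grid, cv=5)
search.fit(X_train, y_train)  # refits the best model on all training data

best_C = search.best_params_["C"]
train_acc = search.score(X_train, y_train)
test_acc = search.score(X_test, y_test)

# Comparable train and test accuracies suggest little overfitting.
print(f"best C: {best_C}")
print(f"train accuracy: {train_acc:.3f}, test accuracy: {test_acc:.3f}")
```

Swapping `LogisticRegression` and `param_grid` for another estimator and its hyperparameters would follow the same pattern, which matches the README's note that the script can be adapted to other `scikit-learn` classifiers.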