Logistic Regression Shape Classifier

I created a simple triangle/circle classifier from scratch using logistic regression to learn how neural newtworks work. This code runs on Python 3 and the latest (2019) version of imported modules.

To run notebook directly from terminal:

runipy shapeC.ipynb

If runipy is not installed:

pip install runipy

How this works, as simply as possible:

Input:

Our input data, x (which is a collection of 200px by 200px images of either triangles or circles) is flattened. This is our training data.

Our training data also has labels, 0 if it is a circle and 1 if it is a triangle. We will call this collection of labels y.

Forward Propogation:

x is multiplied by our weights, w, and a bias b is added to give us z.

Our weights and bias is initialised to 0.01 and 0 respectfully.

z = xw + b

Activation function:

z is then pushed through the activation function (basically takes the value and classifies it as either 0 or 1) to produce the prediction of y, y_pred, given z. The activation function used here is a sigmoid function.

y_pred = sigmoid(z)*

Loss function:

We now compare y (the actual labels of our images) to y_pred.

We calculate the mean squared error of all y_pred and y. The result is the cost, which shows how well our model is currently performing with the given weights and bias. *Note that y and y_pred are vectors.

Back Propogation:

We update our weights and bias, by calculating:

a = the dot product of x and (y_pred - y).T times scaling factor (which we specify) b = sum of (y_pred - y)

new_weight = old_weight - learning_rate times (a) new_bias = old_bias* - learning_rate times (b)

And now we do forward propogation again, but now with our new weights and bias. The process can be repeated untill an satisfying cost is reached.

After the model has been built, we test it using test data (data which the model has never seen before) and evaluate its preformance.

Results:

The model achieves a cost of 0 after 20 iterations. This means not one image was classified incorrectly after only 20 iterations. So the model has a accuracy of 100%.

This is however, not a good thing. This is what is called over fitting. Taking a closer look at the data shows that there is too little variation: Almost all cirlcle images look identical; they all occur in the exact center of the image, are about the same size, and are filled in every case. The triangles differ slightly more but still contain way too little variation.

The reason why overfitting is bad, is that given a circle that does not conform to the training data, (i.e. not filled, smaller or not centered) it will could be misclassified.

Going Forward

The purpose of this code was to learn how a neural network, logistic logression in particlar, work and not make an amazing shape classifier. I learned much about this topic and I would like to learn more. Some ideas I had were to create a model that is able to find waldo, with a nice UI, and training a model able to identify captha's.

The data is from [https://www.kaggle.com/smeschke/four-shapes] (I followed a tutorial on how to build a neural network, which I for the life of me cannot seem to locate, I apologize)

Name		Name	Last commit message	Last commit date
Latest commit History 11 Commits
data		data
.gitignore		.gitignore
readme.md		readme.md
shapeC.ipynb		shapeC.ipynb

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Logistic Regression Shape Classifier

To run notebook directly from terminal:

Input:

Forward Propogation:

Activation function:

Loss function:

Back Propogation:

Results:

Going Forward

About

Releases

Packages

Languages

hen3kr8/ShapeClassifier

Folders and files

Latest commit

History

Repository files navigation

Logistic Regression Shape Classifier

To run notebook directly from terminal:

Input:

Forward Propogation:

Activation function:

Loss function:

Back Propogation:

Results:

Going Forward

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages