mapattacker
diff --git a/.DS_Store b/.DS_Store
6 KB
diff --git a/Makefile b/Makefile
+20
diff --git a/_build/.DS_Store b/_build/.DS_Store
6 KB
diff --git a/_build/doctrees/association.doctree b/_build/doctrees/association.doctree
4.28 KB
diff --git a/_build/doctrees/decomposition.doctree b/_build/doctrees/decomposition.doctree
2.43 KB
diff --git a/_build/doctrees/difference.doctree b/_build/doctrees/difference.doctree
4.49 KB
diff --git a/_build/doctrees/environment.pickle b/_build/doctrees/environment.pickle
19.7 KB
diff --git a/_build/doctrees/forecasting.doctree b/_build/doctrees/forecasting.doctree
2.38 KB
diff --git a/_build/doctrees/general.doctree b/_build/doctrees/general.doctree
20.6 KB
diff --git a/_build/doctrees/index.doctree b/_build/doctrees/index.doctree
3.86 KB
diff --git a/_build/doctrees/supervised.doctree b/_build/doctrees/supervised.doctree
5.99 KB
diff --git a/_build/doctrees/unsupervised.doctree b/_build/doctrees/unsupervised.doctree
2.85 KB
diff --git a/_build/html/.buildinfo b/_build/html/.buildinfo
+4
diff --git a/_build/html/.nojekyll b/_build/html/.nojekyll
diff --git a/_build/html/_images/bias-variance.png b/_build/html/_images/bias-variance.png
531 KB
diff --git a/_build/html/_sources/association.rst.txt b/_build/html/_sources/association.rst.txt
+16
diff --git a/_build/html/_sources/decomposition.rst.txt b/_build/html/_sources/decomposition.rst.txt
+2
diff --git a/_build/html/_sources/difference.rst.txt b/_build/html/_sources/difference.rst.txt
+20
diff --git a/_build/html/_sources/forecasting.rst.txt b/_build/html/_sources/forecasting.rst.txt
+2
diff --git a/_build/html/_sources/general.rst.txt b/_build/html/_sources/general.rst.txt
+95
diff --git a/_build/html/_sources/index.rst.txt b/_build/html/_sources/index.rst.txt
+24
diff --git a/_build/html/_sources/supervised.rst.txt b/_build/html/_sources/supervised.rst.txt
+36
diff --git a/_build/html/_sources/unsupervised.rst.txt b/_build/html/_sources/unsupervised.rst.txt
+8
diff --git a/_build/html/_static/ajax-loader.gif b/_build/html/_static/ajax-loader.gif
673 Bytes
@@ -0,0 +1,20 @@
+# Minimal makefile for Sphinx documentation
+#
+
+# You can set these variables from the command line.
+SPHINXOPTS    =
+SPHINXBUILD   = sphinx-build
+SPHINXPROJ    = DataScience
+SOURCEDIR     = .
+BUILDDIR      = _build
+
+# Put it first so that "make" without argument is like "make help".
+help:
+	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
+
+.PHONY: help Makefile
+
+# Catch-all target: route all unknown targets to Sphinx using the new
+# "make mode" option.  $(O) is meant as a shortcut for $(SPHINXOPTS).
+%: Makefile
+	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
@@ -0,0 +1,4 @@
+# Sphinx build info version 1
+# This file hashes the configuration used when building these files. When it is not found, a full rebuild will be done.
+config: 05ff58d1be54252f0a6748edeeab048c
+tags: 645f666f9bcd5a90fca523b33c5a78b7
@@ -0,0 +1,16 @@
+Tests of Association
+=====================
+
+Pearson's Correlation
+---------------------
+
+X, Explanatory: ``Continuous``
+Y, Response: ``Continuous``
+Type: ``Parametric``
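As a minimal sketch of how Pearson's correlation coefficient can be computed for two continuous variables, the helper below works from the deviations of each value from its mean. The function name `pearson_r` and the sample values are assumptions made for illustration; they do not come from the documentation itself.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-deviations and sums of squared deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


hours = [1, 2, 3, 4, 5]          # hypothetical explanatory values
scores = [52, 55, 61, 64, 70]    # hypothetical response values
r = pearson_r(hours, scores)
print(round(r, 3))
```

A value close to 1 or -1 indicates a strong linear relationship, while a value near 0 indicates a weak one.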
+
+
+Spearman's Rank Correlation
+---------------------------
+X, Explanatory: ``Continuous``
+Y, Response: ``Continuous``
+Type: ``Non-Parametric``
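Spearman's rank correlation can be sketched as Pearson's correlation applied to the ranks of the data rather than to the raw values. The helper names (`average_ranks`, `pearson_r`, `spearman_rho`) and the sample data below are hypothetical, chosen only to illustrate the idea.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson's r computed on the ranks of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))


x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]   # monotonic but not linear (hypothetical data)
rho = spearman_rho(x, y)
print(rho)
```

Because only the ordering matters, any strictly increasing relationship gives a coefficient of 1 even when the relationship is not linear.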
@@ -0,0 +1,2 @@
+Time Series Decomposition
+=========================
@@ -0,0 +1,20 @@
+Tests of Difference
+===================
+
+Chi-Square Test
+---------------
+X, Explanatory: ``Categorical``
+Y, Response: ``Categorical``
+Type: ``Non-Parametric``
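For two categorical variables, the chi-square statistic compares the observed counts in a contingency table with the counts expected if the variables were independent. The sketch below assumes Pearson's chi-square statistic without a continuity correction; the function name `chi_square_statistic` and the counts are hypothetical.

```python
def chi_square_statistic(table):
    """Return (chi-square statistic, degrees of freedom) for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if rows and columns were independent
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, dof


observed = [[10, 20], [20, 10]]   # hypothetical counts
chi2, dof = chi_square_statistic(observed)
print(round(chi2, 3), dof)
```

The statistic is then compared against the chi-square distribution with the returned degrees of freedom to obtain a p-value.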
+
+
+Student's T-Test
+----------------
+Type: ``Parametric``
+
+
+ANOVA
+-----
+Type: ``Parametric``
+
+Analysis of Variance (ANOVA).
@@ -0,0 +1,2 @@
+Forecasting
+===========
@@ -0,0 +1,95 @@
+General Notes
+=============
+
+Variables
+---------
+``x`` = independent variable = explanatory = predictor
+
+``y`` = dependent variable = response = target
+
+
+Data Types
+----------
+The type of data is essential as it determines what kind of tests can be applied to it.
+
+``Continuous:`` Also known as quantitative. Can take any value within a range, so the number of possible values is unlimited.
+
+``Categorical:`` Also known as qualitative. Takes one of a fixed number of values, or *categories*.
+
+
+Bias-Variance Tradeoff
+-----------------------
+The best predictive algorithm is one that has good *Generalization Ability*.
+With that, it will be able to give accurate predictions on new and previously unseen data.
+
+*High Bias* goes together with *Underfitting* the model. It usually results from erroneous assumptions, which cause the model to be too general.
+
+*High Variance* goes together with *Overfitting* the model: it predicts the training dataset very accurately, but not new, unseen datasets.
+This is because it fits even the slightest noise in the training dataset.
+
+The model with the highest accuracy on unseen data is the middle ground between the two.
+    
+.. figure:: ./images/bias-variance.png
+    :scale: 25 %
+    :align: center
+  
+    from Andrew Ng's lecture
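The tradeoff can be illustrated with a toy comparison, offered here as a sketch under stated assumptions rather than a general result. The data are hypothetical (`y = 2x` with alternating +1/-1 noise), a constant mean predictor stands in for an underfit model, a 1-nearest-neighbour lookup stands in for an overfit model, and a least-squares line plays the middle ground.

```python
def mse(predict, xs, ys):
    """Mean squared error of a prediction function over a dataset."""
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


train_x = [0, 1, 2, 3, 4, 5]
train_y = [1, 1, 5, 5, 9, 9]          # y = 2x with alternating +1/-1 noise
test_x = [0.4, 1.4, 2.4, 3.4, 4.4]
test_y = [2 * x for x in test_x]      # noise-free unseen data

mean_x = sum(train_x) / len(train_x)
mean_y = sum(train_y) / len(train_y)


def underfit(x):
    """High bias: ignores x and always predicts the training mean."""
    return mean_y


def overfit(x):
    """High variance: memorises the training set (1-nearest neighbour)."""
    nearest = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return train_y[nearest]


# Least-squares line fitted to the training data
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(train_x, train_y))
         / sum((x - mean_x) ** 2 for x in train_x))
intercept = mean_y - slope * mean_x


def balanced(x):
    """Middle ground: a straight line fitted by least squares."""
    return intercept + slope * x


for name, model in [("underfit", underfit), ("overfit", overfit), ("balanced", balanced)]:
    print(name, round(mse(model, train_x, train_y), 3), round(mse(model, test_x, test_y), 3))
```

In this toy setup the memorising model has zero error on the training data yet a larger error than the fitted line on the unseen data, while the constant model is poor on both.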
+
+Steps to Build a Predictive Model
+--------------------------------------------
+Train Test Split
+*****************
+Split the dataset into *Train* and *Test* datasets.
+By default, sklearn's ``train_test_split`` randomly assigns 75% of the rows to train and 25% to test.
+
+.. code:: Python
+
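+  from sklearn.model_selection import train_test_split
+  # Hold out 25% of the rows for testing (this is also the default)
+  train_predictor, test_predictor, train_target, test_target = train_test_split(
+      predictor, target, test_size=0.25)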
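To make the idea behind a random split concrete, here is a pure-Python sketch. It is a hypothetical stand-in (`simple_train_test_split`, with an assumed `seed` parameter and made-up data), not sklearn's implementation: it shuffles the row indices once and slices them so that predictor rows and targets stay paired.

```python
import math
import random


def simple_train_test_split(predictor, target, test_size=0.25, seed=0):
    """Shuffle row indices, then slice them into test and train portions."""
    indices = list(range(len(predictor)))
    random.Random(seed).shuffle(indices)
    n_test = math.ceil(test_size * len(indices))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    # Index both sequences with the same indices so rows stay aligned
    train_predictor = [predictor[i] for i in train_idx]
    test_predictor = [predictor[i] for i in test_idx]
    train_target = [target[i] for i in train_idx]
    test_target = [target[i] for i in test_idx]
    return train_predictor, test_predictor, train_target, test_target


predictor = [[i, i * 2] for i in range(100)]   # hypothetical feature rows
target = [i % 3 for i in range(100)]           # hypothetical class labels
train_predictor, test_predictor, train_target, test_target = simple_train_test_split(
    predictor, target)
print(len(train_predictor), len(test_predictor))
```

Fixing the seed makes the split reproducible, which helps when comparing models on the same held-out rows.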
+
+Create Model
+************
+Choose a model and set its parameters (if any).
+
+.. code:: Python
+
+  from sklearn.tree import DecisionTreeClassifier
+  clf = DecisionTreeClassifier()
+
+
+Fit Model
+************
+Fit the model using the training dataset.
+
+.. code:: Python
+
+  model = clf.fit(train_predictor, train_target)
+
+>>> print(model)
+DecisionTreeClassifier(class_weight=None, criterion='gini', max_depth=None,
+            max_features=None, max_leaf_nodes=None, min_samples_leaf=1,
+            min_samples_split=2, min_weight_fraction_leaf=0.0,
+            presort=False, random_state=None, splitter='best')
+
+Test Model
+**********
+Test the model by predicting the class of unseen data from the testing dataset.
+
+.. code:: Python
+
+  predictions = model.predict(test_predictor)
+
+
+Score Model
+***********
+Use a confusion matrix to compare the actual classes (rows) with the predicted classes (columns)...
+
+>>> import sklearn.metrics
+>>> print(sklearn.metrics.confusion_matrix(test_target, predictions))
+[[14  0  0]
+ [ 0 13  0]
+ [ 0  1 10]]
+
+...and the accuracy score, expressed as a percentage, to obtain the predictive accuracy.
+
+>>> print('{:.10f} %'.format(sklearn.metrics.accuracy_score(test_target, predictions) * 100))
+97.3684210526 %
+
@@ -0,0 +1,24 @@
+.. Data Science documentation master file, created by
+   sphinx-quickstart on Tue Jun 27 22:55:47 2017.
+   You can adapt this file completely to your liking, but it should at least
+   contain the root `toctree` directive.
+
+Data Science in Python
+========================================
+This documentation summarises various statistics and machine learning techniques in Python.
+
+  
+
+
+.. toctree::
+   :maxdepth: 2
+   :caption: Contents
+   :numbered:
+   
+   general
+   difference
+   association
+   supervised
+   unsupervised
+   decomposition
+   forecasting
@@ -0,0 +1,36 @@
+Supervised Learning
+===================
+
+Classification
+--------------
+
+K Nearest Neighbours (KNN)
+**************************
+
+Decision Tree
+**************************
+
+Random Forest
+**************************
+
+Logistic Regression
+**************************
+
+Support Vector Machine
+***********************
+
+
+Regression
+----------
+
+Ordinary Least Squares (OLS) Regression
+***************************************
+The best-fit line ``ŷ = a + bx`` is drawn using the ordinary least squares method, i.e., it is the line that minimises the total area of the squares whose side lengths are the vertical distances from each (x, y) point to the regression line.
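For a single explanatory variable, the intercept ``a`` and slope ``b`` that minimise the sum of squared vertical distances have a closed form. The sketch below assumes that simple one-variable case; the function name `ols_fit` and the sample points are hypothetical.

```python
def ols_fit(x, y):
    """Return (a, b) for the least-squares line y_hat = a + b * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope: covariance of x and y divided by the variance of x
    b = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
         / sum((xi - mean_x) ** 2 for xi in x))
    # Intercept: the fitted line passes through the point of means
    a = mean_y - b * mean_x
    return a, b


x = [0, 1, 2, 3]
y = [1, 3, 5, 7]      # hypothetical points lying exactly on y = 1 + 2x
a, b = ols_fit(x, y)
print(a, b)
```

When the points lie exactly on a line, the fit recovers that line and the total squared distance is zero.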
+
+
+Ridge Regression
+****************
+
+Lasso Regression
+****************
+
@@ -0,0 +1,8 @@
+Unsupervised Learning
+=====================
+
+Clustering
+----------
+
+K-Means
+**************************