This repository contains Jupyter notebooks where you can work on DS10 exercises.
Variables and values: The first notebook explains how to use Jupyter and introduces variables, values, and numerical computation.
Click here to run this notebook on Colab
Times and places: This notebook shows how to represent times, dates, and locations in Python, and uses the GeoPandas library to plot points on a map.
Click here to run this notebook on Colab
Lists and Arrays: This notebook presents lists and NumPy arrays. It discusses absolute, relative, and percent errors, and ways to summarize them.
Click here to run this notebook on Colab
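As a rough preview of the error measures named above, here is a minimal sketch. The data are made up, plain Python lists stand in for NumPy arrays, and the definitions (relative error as absolute error divided by the actual value) are one common convention that may differ from the notebook's.

```python
# Hypothetical data: actual values and the estimates being evaluated.
actual = [100.0, 200.0, 400.0]
estimate = [110.0, 190.0, 400.0]

# Absolute error: size of the difference, ignoring its sign.
abs_errors = [abs(e - a) for e, a in zip(estimate, actual)]

# Relative error: absolute error as a fraction of the actual value
# (an assumed convention; other definitions exist).
rel_errors = [ae / a for ae, a in zip(abs_errors, actual)]

# Percent error: relative error expressed as a percentage.
percent_errors = [100 * r for r in rel_errors]

# One way to summarize the errors: take their means.
mean_abs_error = sum(abs_errors) / len(abs_errors)
mean_percent_error = sum(percent_errors) / len(percent_errors)

print(abs_errors, percent_errors)
print(mean_abs_error, mean_percent_error)
```

With NumPy arrays, the same arithmetic can be written elementwise without the list comprehensions.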
Loops and Files: This notebook presents the for loop and the if statement; then it uses them to speed-read War and Peace and count the words.
Click here to run this notebook on Colab
Dictionaries: This notebook presents one of the most powerful features of Python, dictionaries, and uses them to count the unique words in a text and their frequencies.
Click here to run this notebook on Colab
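For a sense of the word-counting idea, here is a small sketch using a dictionary. The sample sentence is invented for illustration, and the notebook itself may tokenize and clean the text differently.

```python
# Hypothetical sample text; the notebook works with a full book instead.
text = "the quick brown fox jumps over the lazy dog the end"

# Map each word to the number of times it appears.
counts = {}
for word in text.split():
    # get() returns 0 for a word we have not seen yet.
    counts[word] = counts.get(word, 0) + 1

# The number of keys is the number of unique words.
num_unique = len(counts)

print(num_unique)
print(counts["the"])
```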
Plotting: This notebook introduces a plotting library, Matplotlib, and uses it to generate a few common data visualizations and one less common one, a Zipf plot.
Click here to run this notebook on Colab
DataFrames: This notebook presents DataFrames, which are used to represent tables of data. As an example, it uses data from the National Survey of Family Growth to find the average weight of babies in the U.S.
Click here to run this notebook on Colab
Distributions: This notebook explains what a distribution is and presents three ways to represent one: a PMF, a CDF, or a PDF. It also shows how to compare one distribution to another or to a mathematical model.
Click here to run this notebook on Colab
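To illustrate two of these representations, the sketch below builds a PMF and a CDF for a tiny made-up sample using only the standard library. The notebook may rely on a dedicated library instead, so treat the variable names and approach here as assumptions.

```python
from collections import Counter

# Hypothetical sample of discrete values.
values = [1, 2, 2, 3, 3, 3]
n = len(values)

# PMF: maps each value to the fraction of the sample equal to it.
pmf = {x: c / n for x, c in sorted(Counter(values).items())}

# CDF: maps each value to the fraction of the sample
# less than or equal to it (running total of the PMF).
cdf = {}
total = 0.0
for x, p in pmf.items():
    total += p
    cdf[x] = total

print(pmf)
print(cdf)
```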
Relationships: This notebook explores relationships between variables using scatter plots, violin plots, and box plots. It quantifies the strength of a relationship using the correlation coefficient and uses simple regression to estimate the slope of a line.
Click here to run this notebook on Colab
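As a sketch of the two quantities mentioned above, the following computes Pearson's correlation coefficient and the least-squares slope by hand for a small invented dataset. The notebook likely uses library functions for this; the explicit formulas here are only meant to show what those functions compute.

```python
import math

# Hypothetical paired observations.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

mean_x = sum(xs) / len(xs)
mean_y = sum(ys) / len(ys)

# Sums of squared deviations and of cross products.
sxx = sum((x - mean_x) ** 2 for x in xs)
syy = sum((y - mean_y) ** 2 for y in ys)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

# Pearson correlation coefficient: strength of the linear relationship.
r = sxy / math.sqrt(sxx * syy)

# Simple (one-variable) least-squares regression line.
slope = sxy / sxx
intercept = mean_y - slope * mean_x

print(r, slope, intercept)
```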
Regression: This notebook presents multiple regression and uses it to explore the relationship between age, education, and income. It uses visualization to interpret multivariate models. It also presents binary variables and logistic regression.