README.org
#+AUTHOR: Derek Devnich
* COMMENT Notes
** Post-workshop notes
1. Put the first command on the board for people who come in late.
2. Talk about environments after functions.
3. Update the z-score example to use NumPy calculations instead of pandas calculations. There was a versioning issue: several people in the room were running pandas 1.2.x or 1.4.x instead of 2.x.x.
4. Send out documentation for Anaconda environments and how to activate them.
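
Since the versioning issue came from mixed pandas releases in the room, a quick check at the start of the session may help catch it early. This is only a minimal sketch; the =major_version= helper is hypothetical and not part of the workshop materials.

#+BEGIN_SRC python
import numpy as np
import pandas as pd


def major_version(version_string):
    """Return the leading integer of a version string such as '2.1.4'."""
    return int(version_string.split(".")[0])


# Report what each attendee is actually running.
print("pandas:", pd.__version__)
print("numpy: ", np.__version__)

# Flag the pandas 1.x installs that caused trouble in the room.
if major_version(pd.__version__) < 2:
    print("pandas is older than 2.x; consider updating the environment.")
#+END_SRC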
** Info from John Gallagher
NB: This drops down to NumPy for the entire sequence, obviating any Pandas issues.
Just wanted to share the code snippet you asked for:

#+BEGIN_SRC python
    # ...
    # Calculate the mean z-score for each country and append it as a new column
    data['zmean'] = cell_values.T.mean()
    # Return the data frame
    return data
#+END_SRC

Here I used the .select_dtypes() method in Pandas. The documentation can be found here: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.select_dtypes.html
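
For reference, here is one way the full calculation might look when it is done in NumPy after selecting the numeric columns. This is a sketch, not John's original function: the name =add_mean_zscore=, the sample data, the choice to standardize each column, and the use of the population standard deviation (NumPy's default, =ddof=0=) are all assumptions.

#+BEGIN_SRC python
import numpy as np
import pandas as pd


def add_mean_zscore(data):
    """Append a 'zmean' column: each row's mean z-score across numeric columns."""
    # Keep only the numeric columns, then drop down to a plain NumPy array.
    cell_values = data.select_dtypes(include="number").to_numpy(dtype=float)
    # Standardize each column (assumption: population standard deviation, ddof=0).
    zscores = (cell_values - cell_values.mean(axis=0)) / cell_values.std(axis=0)
    # Average the z-scores across columns, giving one value per row.
    result = data.copy()
    result["zmean"] = zscores.mean(axis=1)
    return result


# Hypothetical sample data standing in for the workshop's country table.
sample = pd.DataFrame(
    {"country": ["A", "B", "C"], "y1": [1.0, 2.0, 3.0], "y2": [10.0, 20.0, 30.0]}
)
print(add_mean_zscore(sample))
#+END_SRC

Because the arithmetic runs on the array returned by =to_numpy()=, it should behave the same way under pandas 1.x and 2.x.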
I've really appreciated your teaching style and flexibility in class.
** TODO Institutional Effectiveness training
*** Derek's notes
1. Long to wide
2. Data cleanup
3. Joins
4. Database access
5. Quick graphs with seaborn plus fiddling with matplotlib
6. Tour of statistical libraries
7. Formatted strings for reports?
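
As a possible starting point for items 1, 3, and 7, the sketch below reshapes long data to wide, joins a second table, and prints a formatted report line. The tables and column names are hypothetical placeholders, not real institutional data.

#+BEGIN_SRC python
import pandas as pd

# Hypothetical long-format data: one row per (student, term) pair.
long_df = pd.DataFrame(
    {
        "student": ["s1", "s1", "s2", "s2"],
        "term": ["fall", "spring", "fall", "spring"],
        "units": [12, 15, 9, 14],
    }
)

# Long to wide: one row per student, one column per term.
wide_df = long_df.pivot(index="student", columns="term", values="units").reset_index()
wide_df.columns.name = None

# Join: attach a second (hypothetical) table on the shared key.
majors = pd.DataFrame({"student": ["s1", "s2"], "major": ["Physics", "History"]})
merged = wide_df.merge(majors, on="student", how="left")

# Formatted strings for a report line.
for row in merged.itertuples(index=False):
    print(f"{row.student}: {row.major}, {row.fall + row.spring} units total")
#+END_SRC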
*** Corinne's notes
1. +Installation:+
   - +Python - https://www.python.org/+
   - +Anaconda - https://www.anaconda.com/+
2. +Notebooks:+
   - +Jupyter+
3. Commonly used functions
   1. Loading libraries (examples):
      1. import pandas as pd
      2. import numpy as np
      3. Other stats libraries to load and when (advanced stats, etc.)
   2. Loading a dataset
      1. Local data
         1. Load specific variables, filter rows based on condition...
         2. Joins -- outer joins, left/right joins
         3. Long vs. wide data
      2. Database query -- querying a view (Oracle, Snowflake)
   3. Basic data recoding/labeling
      1. Creating bins, labels
      2. String data
   4. Descriptive statistics / exploratory data analysis
      1. describe() /shows quick summary stats/
      2. Crosstabs -- basic X by Y table (e.g. class level by gender)
      3. Others
   5. Charts *(Add UCM colors to customize)*
      1. Scatterplot
      2. Histogram
      3. Line
      4. Columns
   6. Basic inferential statistics
      1. Correlations
      2. Chi-square
      3. t-test
      4. ANOVA
   7. Advanced stats (what libraries to use; where to find documentation)
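
A few of the items above (bins and labels, =describe()=, crosstabs, correlations) can be sketched in one short pandas example. The student records and column names below are made up for illustration; real data would come from a local file or a database view.

#+BEGIN_SRC python
import pandas as pd

# Hypothetical student records; column names are illustrative only.
students = pd.DataFrame(
    {
        "class_level": ["Freshman", "Senior", "Freshman", "Senior", "Junior"],
        "gender": ["F", "M", "M", "F", "F"],
        "units": [12, 16, 9, 15, 14],
        "gpa": [3.1, 3.6, 2.8, 3.9, 3.4],
    }
)

# Basic recoding: bin a numeric column and label the bins.
students["load"] = pd.cut(
    students["units"], bins=[0, 11, 20], labels=["part-time", "full-time"]
)

# Descriptive statistics: quick summary of the numeric columns.
summary = students.describe()

# Crosstab: basic X by Y table (class level by gender).
level_by_gender = pd.crosstab(students["class_level"], students["gender"])

# Correlation between two numeric columns (Pearson by default).
units_gpa_corr = students["units"].corr(students["gpa"])

print(summary)
print(level_by_gender)
print(f"units vs. gpa correlation: {units_gpa_corr:.2f}")
#+END_SRC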