Commit 3a3fe66

committed Jun 18, 2019
finished pandas-advanced teaching, added pics, updated readmes
1 parent 870a918 commit 3a3fe66

7 files changed, +4024 -3008 lines changed

Diff for: 02-Simulation/02 SciPy Basics.ipynb

+8 -1
@@ -664,6 +664,13 @@
  " ax[i].imshow(p, cmap=\"hot\")"
 ]
 },
+{
+"cell_type": "code",
+"execution_count": null,
+"metadata": {},
+"outputs": [],
+"source": []
+},
 {
 "cell_type": "markdown",
 "metadata": {},
@@ -1000,7 +1007,7 @@
 },
 {
 "cell_type": "code",
-"execution_count": 1,
+"execution_count": null,
 "metadata": {},
 "outputs": [],
 "source": [

Diff for: 02-Simulation/README.md

+4
@@ -78,6 +78,10 @@ It is recommended that only the packages you need from SciPy are imported, given
 >>> b = linalg.solve(A, x)
 ```
 
+In addition, `scipy` provides a suite of basic image processing functions for adjusting and manipulating images of varying sizes.
+
+![Image not found](cats.svg)
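As a rough illustration of the image processing mentioned in the added README text, here is a minimal sketch. It assumes the README has the `scipy.ndimage` submodule in mind; the synthetic image and the variable names are made up for this example and do not come from the notebook.

```python
import numpy as np
from scipy import ndimage

# Synthetic 64x64 grayscale image (an assumption for illustration):
# a bright square on a dark background.
image = np.zeros((64, 64))
image[16:48, 16:48] = 1.0

# Resample the image to half its size in each dimension.
half = ndimage.zoom(image, 0.5)

# Smooth the image with a Gaussian kernel.
blurred = ndimage.gaussian_filter(image, sigma=2)

# Rotate by 45 degrees while keeping the original array shape.
rotated = ndimage.rotate(image, 45, reshape=False)

print(half.shape, blurred.shape, rotated.shape)
```

Passing `reshape=False` to `ndimage.rotate` keeps the output the same size as the input, which is convenient when comparing images side by side.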
+
 To get a solid grasp of this package, you'll have to work through the notebook!
 
 ## Dask arrays

Diff for: ‎02-Simulation/cats.svg

+577

Diff for: ‎03-Data/03 Pandas Advanced*.ipynb

+1,519
Large diffs are not rendered by default.

Diff for: ‎03-Data/03 Pandas Advanced.ipynb

-3,007
This file was deleted.

Diff for: 03-Data/README.md

+55
@@ -79,7 +79,62 @@ The second section expands on the knowledge of `pandas` by giving the user access
 - Reshaping
 - Pivot Tables
 
+One of the advantages of Pandas is the ease and flexibility with which MultiIndexes can be created and used:
 
+```python
+>>> index=[("California",2000), ("California",2010),
+...        ("New York",2000), ("New York",2010)]
+>>> population=[33871648, 37253956, 18976457, 19378102]
+>>> pop=pd.Series(population,index=index)
+>>> pop
+(California, 2000) 33871648
+(California, 2010) 37253956
+(New York, 2000) 18976457
+(New York, 2010) 19378102
+```
+
+From here we can slice the series using this multi-index:
+
+```python
+>>> pop[("California",2010):("New York",2010)]
+(California, 2010) 37253956
+(New York, 2000) 18976457
+(New York, 2010) 19378102
+```
+
+Or we can manipulate this multi-index directly by creating a `pandas.MultiIndex` object:
+
+```python
+>>> pd.MultiIndex.from_tuples(index)
+MultiIndex(levels=[['California', 'New York'], [2000, 2010]],
+           labels=[[0, 0, 1, 1], [0, 1, 0, 1]])
+```
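To show what an explicit `pandas.MultiIndex` makes possible, here is a minimal sketch that reuses the population figures from the README. The level names `state` and `year` are assumptions added for illustration, and the exact `MultiIndex` repr differs between pandas versions.

```python
import pandas as pd

index = [("California", 2000), ("California", 2010),
         ("New York", 2000), ("New York", 2010)]
population = [33871648, 37253956, 18976457, 19378102]

# Build the MultiIndex explicitly; the level names are illustrative.
mi = pd.MultiIndex.from_tuples(index, names=["state", "year"])
pop = pd.Series(population, index=mi)

# Partial indexing on the outer level returns a Series keyed by year.
california = pop.loc["California"]

# Take a cross-section: one year across all states.
pop_2010 = pop.xs(2010, level="year")

# Pivot the inner level into columns to get a state-by-year table.
table = pop.unstack(level="year")
print(table)
```

With a true `MultiIndex`, selections by state or by year no longer need full tuples, which is the main practical gain over an index of plain tuples.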
+
+## Pandas Advanced
+
+The third and final Pandas section touches on some of the most complex areas of the library, including:
+- Vectorized string operations
+- Categorical types
+- Time-Series
+- Pipes and Method Chaining
+- Sparsity
+- High-performance Pandas
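A few of the topics listed above can be sketched briefly. The data and the helper `add_total` below are hypothetical and are not taken from the notebook; they only show the general shape of each feature.

```python
import pandas as pd

# Vectorized string operations: .str methods skip missing values.
names = pd.Series(["alice smith", "BOB JONES", None])
clean = names.str.title()

# Categorical types: repeated labels are stored as integer codes.
sizes = pd.Series(["small", "large", "small"]).astype("category")
codes = sizes.cat.codes

# Pipes and method chaining: pass the frame through a helper, then filter.
def add_total(df):
    """Hypothetical helper that adds a column summing 'a' and 'b'."""
    return df.assign(total=df["a"] + df["b"])

df = pd.DataFrame({"a": [1, 2], "b": [10, 20]})
result = df.pipe(add_total).query("total > 15")
print(result)
```

Using `pipe` keeps custom steps inside a single chain, so the transformation reads top to bottom rather than as nested function calls.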
+
+For example, dates and times are an integral part of the Pandas package, particularly for indexing time-series data:
+
+```python
+>>> dates = pd.DatetimeIndex(["2014-07-04","2014-08-04","2015-07-04","2015-08-04"])
+>>> data = pd.Series([0, 1, 4, 2], index=dates)
+>>> data
+2014-07-04 0
+2014-08-04 1
+2015-07-04 4
+2015-08-04 2
+```
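Once a series carries a `DatetimeIndex`, it can be selected by date strings. The following minimal sketch rebuilds the series from the README and shows a few selections; the variable names `in_2014`, `window` and `yearly_total` are made up for this example.

```python
import pandas as pd

dates = pd.DatetimeIndex(["2014-07-04", "2014-08-04",
                          "2015-07-04", "2015-08-04"])
data = pd.Series([0, 1, 4, 2], index=dates)

# Partial string indexing: every entry that falls in 2014.
in_2014 = data.loc["2014"]

# Date-string slices include both endpoints.
window = data.loc["2014-08-04":"2015-07-04"]

# Aggregate by calendar year using the index's .year attribute.
yearly_total = data.groupby(data.index.year).sum()
print(yearly_total)
```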
+
+Since Pandas was built with financial modelling in mind, we go into particular depth on time-series data analysis and stock price modelling.
+
+![Image not found](images/goog_stock.svg)
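The kind of time-series analysis described here might look like the sketch below. The prices are synthetic values invented for illustration, not the Google stock data shown in the figure, and the notebook may take a different approach.

```python
import pandas as pd

# Synthetic closing prices on ten business days (not real market data).
days = pd.date_range("2019-01-01", periods=10, freq="B")
close = pd.Series([100.0, 102.0, 101.0, 105.0, 107.0,
                   106.0, 110.0, 108.0, 111.0, 115.0], index=days)

# Day-over-day simple returns; the first entry has no predecessor.
returns = close.pct_change()

# Five-day rolling mean, a common smoothing step for price series.
rolling_mean = close.rolling(window=5).mean()

# Weekly average price, using weekly bins that end on Sunday.
weekly_mean = close.resample("W").mean()
print(weekly_mean)
```

The rolling mean is undefined until the window is full, so its first four entries are `NaN`.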
 
 ***
 

Diff for: ‎03-Data/images/goog_stock.svg

+1,861
