Commit

Merge pull request #4 from pedropark99/truncate-cells
Add new scripts for truncating chunk results
pedropark99 authored Apr 4, 2023
2 parents fdc42af + fb05eb0 commit ef4f17c
Showing 15 changed files with 8,853 additions and 349 deletions.
4 changes: 3 additions & 1 deletion .gitignore
@@ -15,4 +15,6 @@ metastore_db
Chapters/metastore_db

Chapters/*.html
Chapters/*/*
Chapters/*/*

Scripts/__pycache__/
17 changes: 1 addition & 16 deletions Chapters/04-dataframes.qmd
@@ -63,29 +63,14 @@ Like any python class, the `DataFrame` class comes with multiple methods that ar

As an example, the code below lists all the methods available in this `DataFrame` class. First, I create a Spark DataFrame with `spark.range(5)` and store it in the object `df5`. After that, I use the `dir()` function to show all the methods that I can use through this `df5` object:

```{python}
#| include: false
import sys
import os
sys.path.append(os.path.abspath("./../Scripts/"))
from print_big_list import print_big_list
```


```{python}
#| eval: false
#| eval: true
df5 = spark.range(5)
available_methods = dir(df5)
print(available_methods)
```

```{python}
#| echo: false
df5 = spark.range(5)
available_methods = dir(df5)
print_big_list(available_methods)
```


All the methods present in this `DataFrame` class are commonly referred to as the *DataFrame API of Spark*. Remember, this is the most important API of Spark, because most of your Spark applications will rely heavily on it to compose your data transformations and data flows [@chambers2018].
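
The list returned by `dir()` also contains private names (those starting with an underscore) and attributes that are not methods, so it can be long. As a minimal sketch of one way to narrow it down, the snippet below keeps only the public, callable names. Note that `public_methods()` is a hypothetical helper written for this illustration (it is not part of Spark), and a plain Python list stands in for `df5` so that the code runs without a Spark session:

```python
def public_methods(obj):
    """Return the names of the public, callable attributes of `obj`."""
    return [
        name
        for name in dir(obj)
        # Skip private/dunder names, and keep only attributes that can be called
        if not name.startswith("_") and callable(getattr(obj, name))
    ]


# Assumption: a plain list is used as a stand-in for the `df5` DataFrame
sample = [3, 1, 2]
methods = public_methods(sample)
print(methods)
```

In principle, the same helper could be called as `public_methods(df5)`. Keep in mind that `getattr()` evaluates properties while checking them, so this approach touches every attribute of the object once.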

