Add Google Analytics tracking #12

Merged
merged 2 commits into from
Oct 2, 2024
2 changes: 2 additions & 0 deletions .gitignore
@@ -4,6 +4,8 @@
.Ruserdata
Introd-pyspark.pdf

python-env/

/.quarto/
./site_libs/

3 changes: 2 additions & 1 deletion _quarto.yml
@@ -8,6 +8,7 @@ book:
title: "Introduction to `pyspark`"
author: "Pedro Duarte Faria"
date: "today"
google-analytics: "G-G42L33VM26"
navbar:
background: "#164e80"
left:
@@ -47,4 +48,4 @@ format:
include-in-header: font_config.tex
docx:
toc: true
lang: en-US
lang: en-US
151 changes: 99 additions & 52 deletions docs/Chapters/02-python.html

Large diffs are not rendered by default.

101 changes: 74 additions & 27 deletions docs/Chapters/03-spark.html
@@ -2,12 +2,12 @@
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en"><head>

<meta charset="utf-8">
<meta name="generator" content="quarto-1.4.532">
<meta name="generator" content="quarto-1.5.56">

<meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes">


<title>Introduction to pyspark - 2&nbsp; Introducing Apache Spark</title>
<title>2&nbsp; Introducing Apache Spark – Introduction to `pyspark`</title>
<style>
code{white-space: pre-wrap;}
span.smallcaps{font-variant: small-caps;}
@@ -33,7 +33,7 @@
}
@media print {
pre > code.sourceCode { white-space: pre-wrap; }
pre > code.sourceCode > span { text-indent: -5em; padding-left: 5em; }
pre > code.sourceCode > span { display: inline-block; text-indent: -5em; padding-left: 5em; }
}
pre.numberSource code
{ counter-reset: source-line 0; }
@@ -122,8 +122,17 @@
"search-label": "Search"
}
}</script>
<script async="" src="https://www.googletagmanager.com/gtag/js?id=G-G42L33VM26"></script>

<script src="https://polyfill.io/v3/polyfill.min.js?features=es6"></script>
<script type="text/javascript">

window.dataLayer = window.dataLayer || [];
function gtag(){dataLayer.push(arguments);}
gtag('js', new Date());
gtag('config', 'G-G42L33VM26', { 'anonymize_ip': true});
</script>

<script src="https://cdnjs.cloudflare.com/polyfill/v3/polyfill.min.js?features=es6"></script>
<script src="https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-chtml-full.js" type="text/javascript"></script>

<script type="text/javascript">
@@ -167,7 +176,7 @@
</a>
</div>
<div id="quarto-search" class="" title="Search"></div>
<button class="navbar-toggler" type="button" data-bs-toggle="collapse" data-bs-target="#navbarCollapse" aria-controls="navbarCollapse" aria-expanded="false" aria-label="Toggle navigation" onclick="if (window.quartoToggleHeadroom) { window.quartoToggleHeadroom(); }">
<button class="navbar-toggler" type="button" data-bs-toggle="collapse" data-bs-target="#navbarCollapse" aria-controls="navbarCollapse" role="menu" aria-expanded="false" aria-label="Toggle navigation" onclick="if (window.quartoToggleHeadroom) { window.quartoToggleHeadroom(); }">
<span class="navbar-toggler-icon"></span>
</button>
<div class="collapse navbar-collapse" id="navbarCollapse">
@@ -185,17 +194,17 @@
</li>
</ul>
</div> <!-- /navcollapse -->
<div class="quarto-navbar-tools">
<div class="quarto-navbar-tools">
</div>
</div> <!-- /container-fluid -->
</nav>
<nav class="quarto-secondary-nav">
<div class="container-fluid d-flex">
<button type="button" class="quarto-btn-toggle btn" data-bs-toggle="collapse" data-bs-target=".quarto-sidebar-collapse-item" aria-controls="quarto-sidebar" aria-expanded="false" aria-label="Toggle sidebar navigation" onclick="if (window.quartoToggleHeadroom) { window.quartoToggleHeadroom(); }">
<button type="button" class="quarto-btn-toggle btn" data-bs-toggle="collapse" role="button" data-bs-target=".quarto-sidebar-collapse-item" aria-controls="quarto-sidebar" aria-expanded="false" aria-label="Toggle sidebar navigation" onclick="if (window.quartoToggleHeadroom) { window.quartoToggleHeadroom(); }">
<i class="bi bi-layout-text-sidebar-reverse"></i>
</button>
<nav class="quarto-page-breadcrumbs" aria-label="breadcrumb"><ol class="breadcrumb"><li class="breadcrumb-item"><a href="../Chapters/03-spark.html"><span class="chapter-number">2</span>&nbsp; <span class="chapter-title">Introducing Apache Spark</span></a></li></ol></nav>
<a class="flex-grow-1" role="button" data-bs-toggle="collapse" data-bs-target=".quarto-sidebar-collapse-item" aria-controls="quarto-sidebar" aria-expanded="false" aria-label="Toggle sidebar navigation" onclick="if (window.quartoToggleHeadroom) { window.quartoToggleHeadroom(); }">
<a class="flex-grow-1" role="navigation" data-bs-toggle="collapse" data-bs-target=".quarto-sidebar-collapse-item" aria-controls="quarto-sidebar" aria-expanded="false" aria-label="Toggle sidebar navigation" onclick="if (window.quartoToggleHeadroom) { window.quartoToggleHeadroom(); }">
</a>
</div>
</nav>
@@ -233,7 +242,7 @@
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../Chapters/04-columns.html" class="sidebar-item-text sidebar-link">
<span class="menu-text"><span class="chapter-number">4</span>&nbsp; <span class="chapter-title">Introducing the `Column` class</span></span></a>
<span class="menu-text"><span class="chapter-number">4</span>&nbsp; <span class="chapter-title">Introducing the <code>Column</code> class</span></span></a>
</div>
</li>
<li class="sidebar-item">
@@ -251,7 +260,7 @@
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../Chapters/06-dataframes-sql.html" class="sidebar-item-text sidebar-link">
<span class="menu-text"><span class="chapter-number">7</span>&nbsp; <span class="chapter-title">Working with SQL in `pyspark`</span></span></a>
<span class="menu-text"><span class="chapter-number">7</span>&nbsp; <span class="chapter-title">Working with SQL in <code>pyspark</code></span></span></a>
</div>
</li>
<li class="sidebar-item">
@@ -367,7 +376,7 @@ <h2 data-number="2.2" class="anchored" data-anchor-id="spark-application"><span
<p>Every time a Spark application starts, the driver process has to communicate with the cluster manager to acquire workers to perform the necessary tasks. In other words, the cluster manager decides whether Spark can use some of the resources (i.e.&nbsp;some of the machines) of the cluster. If the cluster manager allows Spark to use the nodes it needs, the driver program breaks the application into many small tasks and assigns these tasks to the worker nodes.</p>
<p>The executor processes are the processes that run within each of the worker nodes. Each executor process is composed of a set of tasks, and the worker node is responsible for executing the tasks assigned to it by the driver program. After executing these tasks, the worker node sends the results back to the driver node (or the driver program). If needed, the worker nodes can communicate with each other while performing their tasks.</p>
<p>This structure is represented in <a href="#fig-spark-application" class="quarto-xref">Figure&nbsp;<span>2.1</span></a>:</p>
<div id="fig-spark-application" class="quarto-figure quarto-figure-center quarto-float anchored" data-fig-align="center">
<div id="fig-spark-application" class="quarto-float quarto-figure quarto-figure-center anchored" data-fig-align="center">
<figure class="quarto-float quarto-float-fig figure">
<div aria-describedby="fig-spark-application-caption-0ceaefa1-69ba-4598-a22c-09a6ac19f8ca">
<img src="../Figures/spark-application.png" class="img-fluid quarto-figure quarto-figure-center figure-img">
@@ -571,18 +580,7 @@ <h3 data-number="2.6.2" class="anchored" data-anchor-id="main-python-classes"><s
}
return false;
}
const clipboard = new window.ClipboardJS('.code-copy-button', {
text: function(trigger) {
const codeEl = trigger.previousElementSibling.cloneNode(true);
for (const childEl of codeEl.children) {
if (isCodeAnnotation(childEl)) {
childEl.remove();
}
}
return codeEl.innerText;
}
});
clipboard.on('success', function(e) {
const onCopySuccess = function(e) {
// button target
const button = e.trigger;
// don't keep focus
@@ -614,7 +612,47 @@ <h3 data-number="2.6.2" class="anchored" data-anchor-id="main-python-classes"><s
}, 1000);
// clear code selection
e.clearSelection();
}
const getTextToCopy = function(trigger) {
const codeEl = trigger.previousElementSibling.cloneNode(true);
for (const childEl of codeEl.children) {
if (isCodeAnnotation(childEl)) {
childEl.remove();
}
}
return codeEl.innerText;
}
const clipboard = new window.ClipboardJS('.code-copy-button:not([data-in-quarto-modal])', {
text: getTextToCopy
});
clipboard.on('success', onCopySuccess);
if (window.document.getElementById('quarto-embedded-source-code-modal')) {
// For code content inside modals, clipBoardJS needs to be initialized with a container option
// TODO: Check when it could be a function (https://github.com/zenorocha/clipboard.js/issues/860)
const clipboardModal = new window.ClipboardJS('.code-copy-button[data-in-quarto-modal]', {
text: getTextToCopy,
container: window.document.getElementById('quarto-embedded-source-code-modal')
});
clipboardModal.on('success', onCopySuccess);
}
var localhostRegex = new RegExp(/^(?:http|https):\/\/localhost\:?[0-9]*\//);
var mailtoRegex = new RegExp(/^mailto:/);
var filterRegex = new RegExp('/' + window.location.host + '/');
var isInternal = (href) => {
return filterRegex.test(href) || localhostRegex.test(href) || mailtoRegex.test(href);
}
// Inspect non-navigation links and adorn them if external
var links = window.document.querySelectorAll('a[href]:not(.nav-link):not(.navbar-brand):not(.toc-action):not(.sidebar-link):not(.sidebar-item-toggle):not(.pagination-link):not(.no-external):not([aria-hidden]):not(.dropdown-item):not(.quarto-navigation-tool):not(.about-link)');
for (var i=0; i<links.length; i++) {
const link = links[i];
if (!isInternal(link.href)) {
// undo the damage that might have been done by quarto-nav.js in the case of
// links that we want to consider external
if (link.dataset.originalHref !== undefined) {
link.href = link.dataset.originalHref;
}
}
}
function tippyHover(el, contentFn, onTriggerFn, onUntriggerFn) {
const config = {
allowHTML: true,
@@ -649,7 +687,11 @@ <h3 data-number="2.6.2" class="anchored" data-anchor-id="main-python-classes"><s
try { href = new URL(href).hash; } catch {}
const id = href.replace(/^#\/?/, "");
const note = window.document.getElementById(id);
return note.innerHTML;
if (note) {
return note.innerHTML;
} else {
return "";
}
});
}
const xrefs = window.document.querySelectorAll('a.quarto-xref');
@@ -697,7 +739,12 @@ <h3 data-number="2.6.2" class="anchored" data-anchor-id="main-python-classes"><s
if (window.Quarto?.typesetMath) {
window.Quarto.typesetMath(note);
}
return note.innerHTML;
// TODO in 1.5, we should make sure this works without a callout special case
if (note.classList.contains("callout")) {
return note.outerHTML;
} else {
return note.innerHTML;
}
}
}
for (var i=0; i<xrefs.length; i++) {
@@ -920,12 +967,12 @@ <h3 data-number="2.6.2" class="anchored" data-anchor-id="main-python-classes"><s
</script>
<nav class="page-navigation">
<div class="nav-page nav-page-previous">
<a href="../Chapters/02-python.html" class="pagination-link aria-label=" &lt;span="" concepts="" of="" python&lt;="" span&gt;"="">
<a href="../Chapters/02-python.html" class="pagination-link" aria-label="Key concepts of python">
<i class="bi bi-arrow-left-short"></i> <span class="nav-page-text"><span class="chapter-number">1</span>&nbsp; <span class="chapter-title">Key concepts of python</span></span>
</a>
</div>
<div class="nav-page nav-page-next">
<a href="../Chapters/04-dataframes.html" class="pagination-link" aria-label="<span class='chapter-number'>3</span>&nbsp; <span class='chapter-title'>Introducing Spark DataFrames</span>">
<a href="../Chapters/04-dataframes.html" class="pagination-link" aria-label="Introducing Spark DataFrames">
<span class="nav-page-text"><span class="chapter-number">3</span>&nbsp; <span class="chapter-title">Introducing Spark DataFrames</span></span> <i class="bi bi-arrow-right-short"></i>
</a>
</div>