Commit f7bf3de

Start of chapter 13
1 parent becff8d commit f7bf3de

8 files changed

+1472
-4
lines changed

Diff for: 12_browser.txt

+1-3
@@ -1,8 +1,6 @@
11
:chap_num: 12
22
:prev_link: 11_language
3-
:next_link: 13_FIXME
4-
:frame_button: true
5-
:load_files: []
3+
:next_link: 13_dom
64

75
= JavaScript and the Browser =
86

Diff for: 13_dom.txt

+265
@@ -0,0 +1,265 @@
1+
:chap_num: 13
2+
:prev_link: 12_browser
3+
:next_link: 14_FIXME
4+
5+
= The Document Object Model =
6+
7+
A JavaScript program running in the browser is locked up in its
8+
sandbox, unable to interact with the rest of the system. But it is not
9+
alone. The web page itself, the document that the browser is
10+
displaying, is in there as well.
11+
12+
Interacting with this document, in order to enhance it, make it
13+
interactive, or turn it into a full-blown application, is what
14+
JavaScript was invented for.
15+
16+
== Document structure ==
17+
18+
An HTML document can be visualized as a nested set of boxes. Tags like
19+
`<body>` and `</body>` enclose other tags, which in turn contain other
20+
tags (or text).
21+
22+
[sandbox="homepage"]
23+
[source,text/html]
24+
----
25+
<!doctype html>
26+
<html>
27+
  <head>
28+
    <title>My home page</title>
29+
  </head>
30+
  <body>
31+
    <h1>My home page</h1>
32+
    <p>Hello, I am Marijn and this is my home page.</p>
33+
    <p>I also wrote a book! Read it
34+
      <a href="http://eloquentjavascript.net">here</a>.</p>
35+
  </body>
36+
</html>
37+
----
38+
39+
This example page has the following structure:
40+
41+
image::img/html-boxes.svg[alt="HTML document as nested boxes"]
42+
43+
The data structure the browser uses to represent the document follows
44+
this shape. For each box, there is an object, which we can interact
45+
with to find out things like what HTML tag it represents, and which
46+
boxes and text it contains. This representation is called the
47+
_Document Object Model_, DOM for short.
48+
49+
The global variable `document` gives us access to these objects. Its
50+
`documentElement` property refers to the object representing the
51+
`<html>` tag. It also provides properties `head` and `body`, holding
52+
the objects for those elements. The body, the actual visual part of
53+
the document, is usually the element we want to work with.
54+
55+
== Trees ==
56+
57+
Think back to the syntax trees from Chapter 11 for a moment. Their
58+
structure is strikingly similar to the structure of a browser's
59+
document. Each “node” may refer to sub-nodes, children, which are
60+
themselves nodes. This shape is typical of nested structures where the
61+
same kind of element can be repeated inside existing elements.
62+
63+
We call a data structure a _tree_ when it has a branching structure,
64+
contains no cycles (a node may not contain itself, directly or
65+
indirectly), and has a single, well-defined “root”.
66+
67+
Trees come up a lot in computer science. Apart from representing
68+
recursive structures like the programs from Chapter 11 and HTML
69+
documents, they are also often used to maintain sorted sets of data,
70+
because elements can often be found or inserted more efficiently in a
71+
sorted tree than in a sorted flat array.
72+
73+
A typical tree has different kinds of nodes. The syntax tree had
74+
variables, values, and application nodes, where applications always
75+
had children, and variables and values were _leaves_, nodes without
76+
children.
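
To make the distinction between inner nodes and leaves concrete, here is a minimal sketch that uses plain objects rather than syntax trees or DOM nodes. The `label` and `children` property names and the `countLeaves` function are assumptions made up for this illustration.

```javascript
// A tiny tree of plain objects: inner nodes have children, leaves
// have an empty children array. There is a single root and no cycles.
var tree = {label: "root", children: [
  {label: "a", children: []},
  {label: "b", children: [
    {label: "c", children: []},
    {label: "d", children: []}
  ]}
]};

// Count the leaves (nodes without children) in the tree under node,
// counting node itself when it is a leaf.
function countLeaves(node) {
  if (node.children.length == 0) return 1;
  var total = 0;
  node.children.forEach(function(child) {
    total += countLeaves(child);
  });
  return total;
}

console.log(countLeaves(tree));
// → 3
```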
77+
78+
The same goes for the DOM. Nodes for regular elements (the
79+
representation of a tag in the document) make up the structure of the
80+
document. These can (but need not) have child nodes. An example of
81+
such a node is `document.body`. Some of these children can be leaf
82+
nodes, such as pieces of text or comments (which are written between
83+
`<!--` and `-->` in HTML).
84+
85+
Each DOM node object has a `nodeType` property, which contains a
86+
number code that identifies the type of node. Regular elements have the
87+
value 1 (which is also defined as the constant property
88+
`document.ELEMENT_NODE`). Text nodes, representing a section of plain
89+
(non-tag) text in the document, get type 3 (`document.TEXT_NODE`).
90+
Comments get type 8 (`document.COMMENT_NODE`).
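
As a rough sketch of how these codes might be used, the hypothetical helper `nodeKind` below turns a `nodeType` number into a readable name. The helper name and its fallback string are assumptions for illustration, and the plain objects at the end only stand in for real DOM nodes so that the snippet also runs outside a browser.

```javascript
// The numeric codes given in the text. In a browser they are also
// available as document.ELEMENT_NODE, document.TEXT_NODE, and
// document.COMMENT_NODE.
var ELEMENT_NODE = 1, TEXT_NODE = 3, COMMENT_NODE = 8;

// Hypothetical helper: map a node's type code to a readable name.
function nodeKind(node) {
  switch (node.nodeType) {
    case ELEMENT_NODE: return "element";
    case TEXT_NODE: return "text";
    case COMMENT_NODE: return "comment";
    default: return "other (" + node.nodeType + ")";
  }
}

// Plain objects standing in for DOM nodes.
console.log(nodeKind({nodeType: 1}));
// → element
console.log(nodeKind({nodeType: 8}));
// → comment
```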
91+
92+
So another way to visualize our document tree is:
93+
94+
image::img/html-tree.svg[alt="HTML document as a tree"]
95+
96+
The leaves are text nodes, and the arrows indicate
97+
parent-child relationships between nodes.
98+
99+
== The standard ==
100+
101+
Using cryptic number codes to represent node types is not a very
102+
JavaScript-like thing to do. Further on in this chapter, we'll see
103+
that other parts of the DOM interface also feel rather cumbersome and
104+
alien. The reason for this is that the DOM wasn't designed for just
105+
JavaScript, but rather tries to define a language-neutral interface
106+
that can be used in other systems as well, and not just for HTML, but
107+
also for XML, which is a generic data format with an HTML-like syntax.
108+
109+
This is unfortunate. Standards are often useful. But in this case, the
110+
advantage (cross-language consistency) isn't all that powerful. And
111+
the downside, having an interface that's not well integrated with the
112+
language, is rather serious.
113+
114+
As an example of such poor integration, consider the `childNodes`
115+
property that element nodes in the DOM have. This property holds an
116+
array-like object, with a `length` property and properties labeled by
117+
numbers (`0`, `1`) to access the child nodes. But it is an instance of
118+
the `NodeList` type, not a real array, so it does not have methods
119+
like `slice` and `map`.
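
One common workaround, sketched here under the assumption that the object has numbered properties and a `length`, is to copy it into a real array first. The `toArray` name is made up for this example, and `fakeChildNodes` is a plain object standing in for a `NodeList`.

```javascript
// Convert any array-like object (numbered properties plus length)
// into a real array, so that array methods become available.
function toArray(arrayLike) {
  return Array.prototype.slice.call(arrayLike);
}

// A plain object standing in for a NodeList of two text nodes.
var fakeChildNodes = {
  0: {nodeType: 3, nodeValue: "Hello"},
  1: {nodeType: 3, nodeValue: "world"},
  length: 2
};

var values = toArray(fakeChildNodes).map(function(node) {
  return node.nodeValue;
});
console.log(values.join(" "));
// → Hello world
```

In a browser, the same call should work on `document.body.childNodes`.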
120+
121+
Then there are issues that are simply the result of old-fashioned
122+
design. For example, there is no way to create a new node and
123+
immediately add children or attributes to it. Instead, you have to
124+
first create it, then add the children one by one, and set the
125+
attributes one by one. Code that interacts heavily with the DOM tends
126+
to get very long, repetitive, and ugly.
127+
128+
But of course, JavaScript allows us to create our own abstractions. It
129+
is easy to write some helper functions that allow you to express the
130+
operations you are performing in a clearer and shorter way. In fact,
131+
many libraries intended for browser programming come with such
132+
functions.
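
Such a helper might look roughly like the sketch below. The name `elt` and its argument order are assumptions, not an established API. It relies on the standard DOM methods `createElement`, `createTextNode`, `setAttribute`, and `appendChild`, which the text has not introduced at this point. The optional `doc` parameter is an addition of this sketch that allows trying the helper with a stand-in object outside a browser.

```javascript
// Hypothetical helper: create an element, set its attributes, and
// append its children in one call. String children become text nodes.
// doc defaults to the global document when it is not given.
function elt(tagName, attributes, children, doc) {
  doc = doc || document;
  var node = doc.createElement(tagName);
  for (var name in attributes) {
    if (Object.prototype.hasOwnProperty.call(attributes, name))
      node.setAttribute(name, attributes[name]);
  }
  (children || []).forEach(function(child) {
    if (typeof child == "string")
      child = doc.createTextNode(child);
    node.appendChild(child);
  });
  return node;
}

// In a browser, this builds and adds a paragraph containing a link.
if (typeof document != "undefined") {
  var bookLink = elt("a", {href: "http://eloquentjavascript.net"}, ["here"]);
  document.body.appendChild(elt("p", {}, ["Read it ", bookLink, "."]));
}
```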
133+
134+
== Moving through the tree ==
135+
136+
DOM nodes contain a wealth of links to other, nearby nodes. The
137+
following diagram tries to illustrate these.
138+
139+
image::img/html-links.svg[alt="Links between DOM nodes"]
140+
141+
Every node has a `parentNode` property, pointing to the node it is
142+
part of. The diagram only shows one of each link type. Every element
143+
node (node type 1) has a `childNodes` property that contains a
144+
pseudo-array with its children. Those represent the fundamental
145+
structure of the tree.
146+
147+
In addition, there are a number of convenience links. The `firstChild`
148+
and `lastChild` properties point to the first and last child node,
149+
or have the value null for nodes without children. Similarly,
150+
`previousSibling` and `nextSibling` point to adjacent nodes, nodes
151+
with the same parent that appear immediately before or after the node
152+
itself. For a first child, `previousSibling` will be null, and for a
153+
last child, `nextSibling` is null.
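
These links are enough to walk through a node's children without touching `childNodes` at all. The following is a minimal sketch: `childrenViaSiblings` is a hypothetical name, and the hand-linked objects are stand-ins for real DOM nodes.

```javascript
// Collect the direct children of a node by starting at firstChild
// and following nextSibling until it is null.
function childrenViaSiblings(node) {
  var result = [];
  for (var child = node.firstChild; child; child = child.nextSibling)
    result.push(child);
  return result;
}

// Stand-in nodes, linked by hand the way the DOM links real ones.
var secondItem = {label: "second", nextSibling: null};
var firstItem = {label: "first", nextSibling: secondItem};
secondItem.previousSibling = firstItem;
firstItem.previousSibling = null;
var list = {firstChild: firstItem, lastChild: secondItem};

var labels = childrenViaSiblings(list).map(function(node) {
  return node.label;
});
console.log(labels.join(", "));
// → first, second
```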
154+
155+
When dealing with a data structure like this, whose structure repeats
156+
itself as we go deeper, recursive functions are often useful. The one
157+
below scans a document for text nodes containing a given string, and
158+
returns true when it has found one.
159+
160+
[sandbox="homepage"]
161+
[source,javascript]
162+
----
163+
function talksAbout(node, string) {
164+
  if (node.nodeType == 1) {
165+
    for (var i = 0; i < node.childNodes.length; i++) {
166+
      if (talksAbout(node.childNodes[i], string))
167+
        return true;
168+
    }
169+
    return false;
170+
  } else if (node.nodeType == 3) {
171+
    return node.nodeValue.indexOf(string) > -1;
172+
  }
173+
}
174+
175+
console.log(talksAbout(document.body, "book"));
176+
// → true
177+
----
178+
179+
The `nodeValue` property of a text node refers to the string of text
180+
that it represents.
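
The same recursive shape can gather text instead of searching it. This is only a sketch: `collectText` is a hypothetical function, and the nested plain objects mimic the last paragraph of the example page rather than being real DOM nodes.

```javascript
// Concatenate the nodeValue of every text node at or below a node.
function collectText(node) {
  if (node.nodeType == 3) return node.nodeValue;
  var text = "";
  var children = node.childNodes || [];
  for (var i = 0; i < children.length; i++)
    text += collectText(children[i]);
  return text;
}

// Stand-in for <p>Read it <a>here</a>.</p>
var paragraph = {nodeType: 1, childNodes: [
  {nodeType: 3, nodeValue: "Read it "},
  {nodeType: 1, childNodes: [{nodeType: 3, nodeValue: "here"}]},
  {nodeType: 3, nodeValue: "."}
]};

console.log(collectText(paragraph));
// → Read it here.
```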
181+
182+
== Finding elements ==
183+
184+
Navigating these links to parents, children, and siblings is
185+
occasionally useful, for example in the function above, which blindly
186+
runs through the whole document. But usually, tying assumptions about
187+
the precise structure of your document into your program is a bad
188+
idea, since you might want to change that structure later. Another
189+
complicating factor is that text nodes are created even for the
190+
whitespace (newlines and spaces) between nodes. The example document's
191+
body tag does not have just three children (`<h1>` and two `<p>` elements),
192+
but actually has 7 (those three, plus the space before, after, and
193+
between them).
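
A small filter on `nodeType` separates the elements from the whitespace. In the sketch below, `elementChildren` is a hypothetical helper and `fakeBody` is a plain-object stand-in that mirrors the seven children described above.

```javascript
// Return only the element children (nodeType 1) of a node, skipping
// whitespace text nodes and comments.
function elementChildren(node) {
  var result = [];
  for (var i = 0; i < node.childNodes.length; i++) {
    if (node.childNodes[i].nodeType == 1)
      result.push(node.childNodes[i]);
  }
  return result;
}

// Stand-in for the example body: whitespace text around each element.
function space() { return {nodeType: 3, nodeValue: "\n    "}; }
function element(tag) { return {nodeType: 1, nodeName: tag}; }
var fakeBody = {nodeType: 1, nodeName: "BODY", childNodes: [
  space(), element("H1"), space(), element("P"), space(), element("P"), space()
]};

console.log(fakeBody.childNodes.length);
// → 7
console.log(elementChildren(fakeBody).length);
// → 3
```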
194+
195+
So if we want to get the `href` attribute of the link in that
196+
document, we don't want to say something horrible like “get the second
197+
child of the sixth child of the document body”. It'd be better if we
198+
could say “get the first link in the document”. And we can.
199+
200+
[sandbox="homepage"]
201+
[source,javascript]
202+
----
203+
var link = document.body.getElementsByTagName("a")[0];
204+
console.log(link.href);
205+
----
206+
207+
All element nodes have a `getElementsByTagName` method that retrieves
208+
a pseudo-array of all elements with the given tag name that exist
209+
inside of that element (even if they are wrapped in other nodes).
210+
211+
To find a specific _single_ node, you can give it an `id` attribute,
212+
and use `document.getElementById` instead.
213+
214+
[source,text/html]
215+
----
216+
<!doctype html>
217+
218+
<p>My ostrich Gertrude:</p>
219+
<p><img id="image" src="img/ostrich.png"></p>
220+
221+
<script>
222+
var ostrich = document.getElementById("image");
223+
console.log(ostrich.src);
224+
</script>
225+
----
226+
227+
A third, similar method is `getElementsByClassName`, which, like
228+
`getElementsByTagName`, searches through the contents of an element
229+
node, and retrieves all elements that have the given string in their
230+
`class` attribute.
231+
232+
There are no singular `getElementByTagName` or `getElementByClassName`
233+
methods; only `getElementById` returns a single node. To get just the
234+
first match from the other two, read index `0` of the returned
235+
pseudo-array, which gives undefined when nothing matched.
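
A small helper can pick the first match out of any of these pseudo-arrays. The sketch below is illustrative only: `firstMatch` is a hypothetical name, not a DOM method, and returning null for an empty result is a choice made for this example.

```javascript
// Hypothetical helper: first entry of an array-like, or null if empty.
function firstMatch(collection) {
  return collection.length > 0 ? collection[0] : null;
}

// Plain objects standing in for the pseudo-arrays the DOM returns.
console.log(firstMatch({length: 0}));
// → null
console.log(firstMatch({0: "first", 1: "second", length: 2}));
// → first

// In a browser, it can wrap the methods described above.
if (typeof document != "undefined" && document.body) {
  var firstLink = firstMatch(document.body.getElementsByTagName("a"));
  console.log(firstLink ? firstLink.href : "no links");
}
```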
236+
237+
== Changing the document ==
238+
239+
Almost everything about the DOM data structure can be changed. Element
240+
nodes have a number of methods for changing their content. The
241+
`removeChild` method removes the given child node from the document.
242+
To add a child, we can use `appendChild`, which puts it at the end of
243+
the list of children, or `insertBefore`, which inserts the node given as
244+
first argument before the node given as second argument.
245+
246+
[source,text/html]
247+
----
248+
<!doctype html>
249+
250+
<p>One</p>
251+
<p>Two</p>
252+
<p>Three</p>
253+
254+
<script>
255+
var paragraphs = document.body.getElementsByTagName("p");
256+
document.body.insertBefore(paragraphs[2], paragraphs[0]);
257+
</script>
258+
----
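
These methods combine easily into small utilities. The two helpers below are a sketch with made-up names, `moveToFront` and `removeAllChildren`. They use only `insertBefore`, `removeChild`, and the `firstChild` link, and assume the standard DOM behavior that passing null as the second argument to `insertBefore` adds the node at the end.

```javascript
// Hypothetical helper: make a node the first child of a parent by
// inserting it before the parent's current first child.
function moveToFront(parentNode, node) {
  parentNode.insertBefore(node, parentNode.firstChild);
}

// Hypothetical helper: remove every child of a node, one at a time.
function removeAllChildren(parentNode) {
  while (parentNode.firstChild)
    parentNode.removeChild(parentNode.firstChild);
}

// In a browser, this moves the last paragraph of the page to the top.
if (typeof document != "undefined" && document.body) {
  var allParagraphs = document.body.getElementsByTagName("p");
  if (allParagraphs.length > 0)
    moveToFront(document.body, allParagraphs[allParagraphs.length - 1]);
}
```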
259+
260+
261+
262+
263+
264+
Exercises:
265+
- implement getElementsByTagName

Diff for: Makefile

+1-1
@@ -1,7 +1,7 @@
11
all: html tex
22

33
CHAPTERS := 00_intro 01_values 02_program_structure 03_functions 04_data 05_higher_order 06_object \
4-
07_elife 08_error 09_regexp 10_modules 11_language 12_browser
4+
07_elife 08_error 09_regexp 10_modules 11_language 12_browser 13_dom
55

66
html: $(foreach CHAP,$(CHAPTERS),html/$(CHAP).html)
77

Diff for: html/index.html

+1
@@ -54,6 +54,7 @@ <h3>(Part 1: Language)</h3>
5454
<h3>(Part 2: Browser)</h3>
5555
<a href="12_browser.html">JavaScript and the Browser</a>
5656
</li>
57+
<li><a href="13_dom.html">The Document Object Model</a>
5758
</ol>
5859

5960
</article>
