Commit eaae4aa

document sample datasets
closes #307
1 parent 56551bb commit eaae4aa

4 files changed: +43 -340 lines changed

docs/lib/d3.md

Lines changed: 8 additions & 2 deletions
@@ -94,10 +94,16 @@ function drag(simulation) {
 }
 ```
 
-The data is loaded as a JSON file:
+The graph is one of Observable’s [sample datasets](datasets):
 
 ```js echo
-const data = FileAttachment("miserables.json").json();
+const data = miserables;
+```
+
+Alternatively, it could be referenced as a FileAttachment:
+
+```js echo run-false
+// const data = FileAttachment("graph.json").json();
 ```
 
 We recommend using [Observable Plot](plot) if you want to create simple charts from your data; but for more complex or bespoke needs, including interactivity and animation, you will most probably want to use D3.

docs/lib/datasets.md

Lines changed: 33 additions & 0 deletions
@@ -0,0 +1,33 @@
+# Sample datasets
+
+When you want to demonstrate a feature, practice your [Plot](plot)-fu and learn to wrangle data —or to create a minimal reproducible example for a bug report—, it is useful to have a few common datasets at hand. The following symbols default to well-known datasets:
+
+- **aapl** — A time series of Apple stock. [Yahoo! Finance](https://finance.yahoo.com/lookup)
+- **alphabet** — Relative frequencies of letters in English. [_Cryptographical Mathematics_ by Robert Edward Lewand](http://cs.wellesley.edu/~fturbak/codman/letterfreq.html)
+- **cars** - [1983 ASA Data Exposition](http://lib.stat.cmu.edu/datasets/)
+- **citywages**[The Upshot](https://www.nytimes.com/2019/12/02/upshot/wealth-poverty-divide-american-cities.html)
+- **diamonds** — ggplot2 “diamonds” dataset (carat and price columns only)
+[source](https://github.com/tidyverse/ggplot2/blob/master/data-raw/diamonds.csv)
+- **flare**[Flare visualization toolkit package hierarchy](https://observablehq.com/@d3/treemap)
+- **industries** — U.S. Bureau of Labor Statistics
+- **miserables** Character interactions in the chapters of “Les Miserables”, [Donald Knuth, Stanford Graph Base](https://www-cs-faculty.stanford.edu/~knuth/sgb.html)
+- **olympians** [Matt Riggott/IOC](https://www.flother.is/2017/olympic-games-data/)
+- **penguins**[Dr. Kristen Gorman](https://github.com/allisonhorst/palmerpenguins)
+- **pizza** — A synthetic dataset. [Observable](https://observablehq.com/@observablehq/pizza-paradise-data)
+- **weather**[NOAA/Vega](https://github.com/vega/vega-datasets/blob/next/SOURCES.md#weathercsv)
+
+For example, the line below creates a chart for the time series of AAPL closing prices over a span of five years:
+
+```js echo
+Plot.lineY(aapl, {x: "Date", y: "Close"}).plot({grid: true})
+```
+
+---
+
+**What about performance?** Isn’t it slow to load all these datasets on every page? Thanks to static analysis, a dataset isn’t loaded unless you reference it. Referenced datasets are then served over the internet.
+
+**Doesn’t this pollute the namespace?** These symbols are just default values. You are free to redefine them as you wish in your page.
+
+```js echo
+const cars = ["Lightning McQueen", "Mater", "Sally Carrera", "Doc Hudson", "Ramone", "Luigi", "Guido", "Fillmore", "Flo", "Sarge"];
+```
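
Note on the two claims in docs/lib/datasets.md (a dataset loads only when referenced, and a page-level definition replaces the default): the behavior could be sketched roughly as below. This is an illustration under loud assumptions, not Observable's implementation. The names `sampleDatasets`, `findDeclared`, `findReferenced`, and `resolveDatasets` are hypothetical, the loaders return empty placeholders instead of real data, and the regex scan is a crude stand-in for parsing the page's code properly.

```javascript
// Hypothetical sketch only: none of these names come from the commit.

// Records which placeholder loaders actually ran.
const loaded = [];

// Each default symbol maps to a lazy loader; nothing runs until it is called.
const sampleDatasets = {
  aapl: () => { loaded.push("aapl"); return []; },             // placeholder, no real rows
  cars: () => { loaded.push("cars"); return []; },             // placeholder, no real rows
  miserables: () => { loaded.push("miserables"); return {}; }  // placeholder, no real graph
};

// Names the page declares itself (const/let/var/function).
function findDeclared(source) {
  const declared = new Set();
  const pattern = /\b(?:const|let|var|function)\s+([A-Za-z_$][\w$]*)/g;
  for (const match of source.matchAll(pattern)) declared.add(match[1]);
  return declared;
}

// Every identifier-like token outside single- or double-quoted strings.
// Crude on purpose: it also picks up keywords and object keys.
function findReferenced(source) {
  const withoutStrings = source.replace(/(["'])(?:\\.|(?!\1).)*\1/g, "");
  return new Set(withoutStrings.match(/[A-Za-z_$][\w$]*/g) || []);
}

// Load a default dataset only if the code references its name
// and does not declare that name itself.
function resolveDatasets(source) {
  const declared = findDeclared(source);
  const referenced = findReferenced(source);
  const resolved = {};
  for (const name of Object.keys(sampleDatasets)) {
    if (referenced.has(name) && !declared.has(name)) {
      resolved[name] = sampleDatasets[name]();
    }
  }
  return resolved;
}

// The chart cell references aapl, so only that loader runs.
const chartCell = 'Plot.lineY(aapl, {x: "Date", y: "Close"}).plot({grid: true})';
console.log(Object.keys(resolveDatasets(chartCell)));

// The page defines its own cars, so the default is never loaded.
const overrideCell = 'const cars = ["Lightning McQueen", "Mater"];';
console.log(Object.keys(resolveDatasets(overrideCell)));
```

In this sketch, the first call resolves only `aapl` and the second resolves nothing, because the page's own `cars` declaration takes precedence over the default symbol.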
