| 1 | +# Sample datasets |
| 2 | + |
| 3 | +When you want to demonstrate a feature, practice your [Plot](plot)-fu, learn to wrangle data, or create a minimal reproducible example for a bug report, it is useful to have a few common datasets at hand. The following symbols default to well-known datasets: |
| 4 | + |
| 5 | +- **aapl** — A time series of Apple stock. [Yahoo! Finance](https://finance.yahoo.com/lookup) |
| 6 | +- **alphabet** — Relative frequencies of letters in English. [_Cryptological Mathematics_ by Robert Edward Lewand](http://cs.wellesley.edu/~fturbak/codman/letterfreq.html) |
| 7 | +- **cars** — [1983 ASA Data Exposition](http://lib.stat.cmu.edu/datasets/) |
| 8 | +- **citywages** — [The Upshot](https://www.nytimes.com/2019/12/02/upshot/wealth-poverty-divide-american-cities.html) |
| 9 | +- **diamonds** — ggplot2 “diamonds” dataset (carat and price columns only) |
| 10 | + [source](https://github.com/tidyverse/ggplot2/blob/master/data-raw/diamonds.csv) |
| 11 | +- **flare** — [Flare visualization toolkit package hierarchy](https://observablehq.com/@d3/treemap) |
| 12 | +- **industries** — U.S. Bureau of Labor Statistics |
| 13 | +- **miserables** — Character interactions in the chapters of “Les Misérables”. [Donald Knuth, Stanford GraphBase](https://www-cs-faculty.stanford.edu/~knuth/sgb.html) |
| 14 | +- **olympians** — [Matt Riggott/IOC](https://www.flother.is/2017/olympic-games-data/) |
| 15 | +- **penguins** — [Dr. Kristen Gorman](https://github.com/allisonhorst/palmerpenguins) |
| 16 | +- **pizza** — A synthetic dataset. [Observable](https://observablehq.com/@observablehq/pizza-paradise-data) |
| 17 | +- **weather** — [NOAA/Vega](https://github.com/vega/vega-datasets/blob/next/SOURCES.md#weathercsv) |
| 18 | + |
| 19 | +For example, the line below creates a chart for the time series of AAPL closing prices over a span of five years: |
| 20 | + |
| 21 | +```js echo |
| 22 | +Plot.lineY(aapl, {x: "Date", y: "Close"}).plot({grid: true}) |
| 23 | +``` |
| 24 | + |
| 25 | +--- |
| 26 | + |
| 27 | +**What about performance?** Isn’t it slow to load all these datasets on every page? No: thanks to static analysis, a dataset isn’t loaded unless your page references it. Only the referenced datasets are then fetched over the network. |
| 28 | + |
| 29 | +**Doesn’t this pollute the namespace?** These symbols are just default values. You are free to redefine them on your page, for example: |
| 30 | + |
| 31 | +```js echo |
| 32 | +const cars = ["Lightning McQueen", "Mater", "Sally Carrera", "Doc Hudson", "Ramone", "Luigi", "Guido", "Fillmore", "Flo", "Sarge"]; |
| 33 | +``` |
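
To make the two answers above concrete, here is a toy model of the idea: scan a page's source, keep the sample datasets it references, and skip any that the page redefines. This is only a sketch under loud assumptions. The `datasetsToLoad` helper and the regular-expression matching are hypothetical illustrations, not how the static analysis is actually implemented; a real analysis would parse the code rather than pattern-match it.

```js
// Toy model only: NOT the actual implementation of the static analysis.
// Names of the sample datasets listed on this page.
const SAMPLE_DATASETS = [
  "aapl", "alphabet", "cars", "citywages", "diamonds", "flare",
  "industries", "miserables", "olympians", "penguins", "pizza", "weather"
];

// Hypothetical helper: return the sample datasets a page would need to load,
// i.e. those referenced in the source but not redefined by the page itself.
// Regex matching is a crude stand-in for real parsing.
function datasetsToLoad(source) {
  return SAMPLE_DATASETS.filter((name) => {
    const referenced = new RegExp(`\\b${name}\\b`).test(source);
    const redefined = new RegExp(`\\b(?:const|let|var)\\s+${name}\\b`).test(source);
    return referenced && !redefined;
  });
}

// A page that charts aapl and redefines cars, as in the examples above.
const page = `
Plot.lineY(aapl, {x: "Date", y: "Close"}).plot({grid: true})
const cars = ["Lightning McQueen", "Mater"];
`;

console.log(datasetsToLoad(page)); // logs an array containing only "aapl"
```

In this model, `aapl` is the only dataset that needs loading: `cars` is shadowed by the page's own definition, and the remaining datasets are never referenced.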