Commit b7434f3

Polish documentation and vignettes
1 parent 0f2a1b5 commit b7434f3

9 files changed, +50 -31 lines changed

DESCRIPTION

+2-1
@@ -8,7 +8,8 @@ Authors@R: c(
 person("Hadley", "Wickham", email = "[email protected]", role = c("aut", "ctb"), comment = c(ORCID = "0000-0003-4757-117X")),
 person("Niladri Roy", "Chowdhury", email = "[email protected]", role = c("aut", "ctb")),
 person("Di", "Cook", email = "[email protected]", role = c("aut", "cre"), comment = c(ORCID = "0000-0002-3813-7155")),
-person("Heike", "Hofmann", email = "[email protected]", role = c("aut", "ctb"), comment = c(ORCID = "0000-0001-6216-5183"))
+person("Heike", "Hofmann", email = "[email protected]", role = c("aut", "ctb"), comment = c(ORCID = "0000-0001-6216-5183")),
+person("Måns", "Thulin", email = "[email protected]", role = c("aut", "ctb"))
 )
 Maintainer: Di Cook <[email protected]>
 License: GPL (>= 2)

R/method-model.r

+1-1
@@ -9,7 +9,7 @@
 #' 'rotate', 'perm', 'pboot' and 'boot' are defined by \code{\link{resid_rotate}},
 #' \code{\link{resid_perm}}, \code{\link{resid_pboot}} and \code{\link{resid_boot}}
 #' respectively
-#' @param additional whether to compute additional meaures: standardized
+#' @param additional whether to compute additional measures: standardized
 #' residuals and leverage
 #' @param ... other arguments passed onto \code{method}.
 #' @return a function that given \code{data} generates a null data set.

R/quick_plots.R

+20-12
@@ -7,7 +7,7 @@
 #' descriptions of these plots.
 #' In the lineup protocol the plot of the real data is embedded amongst a field of
 #' plots of data generated to be consistent with some null hypothesis.
-#' If the observe can pick the real data as different from the others, this
+#' If the observer can pick the real data as different from the others, this
 #' lends weight to the statistical significance of the structure in the plot.
 #' The protocol is described in Buja et al. (2009).
 #'
@@ -25,9 +25,11 @@
 #' and high leverage, which are likely to have a strong influence on
 #' the model fit.
 #'
-#' Generate n - 1 null datasets and randomly position the true data. If you
-#' pick the real data as being noticeably different, then you have formally
-#' established that it is different to with p-value 1/n.
+#' 19 null datasets are plotted together with the true data (randomly
+#' positioned). If you pick the real data as being noticeably different, then
+#' you have formally established that it is different, with p-value 0.05.
+#' Run the \code{decrypt} message printed in the R Console to see which
+#' plot represents the true data.
 #'
 #' If the null hypothesis in the type 1 plot is violated, consider using
 #' a different model. If the null hypotheses in the type 2 or 3 plots
@@ -127,13 +129,16 @@ lineup_residuals <- function(model, type = 1, method = "rotate", color_points =
 #' \code{dist} argument.
 #' In the lineup protocol the plot of the real data is embedded amongst a field of
 #' plots of data generated to be consistent with some null hypothesis.
-#' If the observe can pick the real data as different from the others, this
+#' If the observer can pick the real data as different from the others, this
 #' lends weight to the statistical significance of the structure in the plot.
 #' The protocol is described in Buja et al. (2009).
 #'
-#' @details #' Generate n - 1 null datasets and randomly position the true data. If you
-#' pick the real data as being noticeably different, then you have formally
-#' established that it is different to with p-value 1/n.
+#' @details 19 null datasets are plotted together with the true data (randomly
+#' positioned). If you pick the real data as being noticeably different, then
+#' you have formally established that it is different, with p-value 0.05.
+#'
+#' Run the \code{decrypt} message printed in the R Console to see which
+#' plot represents the true data.
 #'
 #' @param data a data frame.
 #' @param variable the name of the variable that should be plotted.
@@ -209,13 +214,16 @@ lineup_histograms <- function(data, variable, dist = NULL, params = NULL, color_
 #' data follows the distribution specified by the \code{dist} argument.
 #' In the lineup protocol the plot of the real data is embedded amongst a field of
 #' plots of data generated to be consistent with some null hypothesis.
-#' If the observe can pick the real data as different from the others, this
+#' If the observer can pick the real data as different from the others, this
 #' lends weight to the statistical significance of the structure in the plot.
 #' The protocol is described in Buja et al. (2009).
 #'
-#' @details Generate n - 1 null datasets and randomly position the true data. If you
-#' pick the real data as being noticeably different, then you have formally
-#' established that it is different to with p-value 1/n.
+#' @details 19 null datasets are plotted together with the true data (randomly
+#' positioned). If you pick the real data as being noticeably different, then
+#' you have formally established that it is different, with p-value 0.05.
+#'
+#' Run the \code{decrypt} message printed in the R Console to see which
+#' plot represents the true data.
 #'
 #' @param data a data frame.
 #' @param variable the name of the variable that should be plotted.
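
The p-value of 0.05 quoted in the revised documentation follows from the lineup size: with 19 null plots and one plot of the true data, an observer who cannot tell the plots apart picks the true one with probability 1/20. A minimal base-R sketch of that arithmetic follows. It assumes, purely for illustration, an observer who guesses uniformly at random; the variable names are hypothetical and nothing here calls nullabor.

```r
# Sketch only: simulate an observer who cannot distinguish the plots,
# i.e. who picks a panel uniformly at random (an assumption standing in
# for the null hypothesis).
set.seed(2024)

n_panels <- 20      # 19 null plots + 1 plot of the true data
n_trials <- 10000   # number of simulated lineups

true_pos <- sample(n_panels, n_trials, replace = TRUE)  # position of the true data
guess    <- sample(n_panels, n_trials, replace = TRUE)  # observer's pick

p_theory <- 1 / n_panels             # chance of a correct pick under the null
p_sim    <- mean(guess == true_pos)  # simulated frequency of a correct pick

print(c(theory = p_theory, simulated = p_sim))
```

The simulated frequency should sit close to the theoretical value, which is why a single correct identification in a 20-panel lineup is reported with p-value 0.05.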

man/lineup_histograms.Rd

+7-4
Generated file; diff not rendered by default.

man/lineup_qq.Rd

+7-4
Generated file; diff not rendered by default.

man/lineup_residuals.Rd

+6-4
Generated file; diff not rendered by default.

man/null_lm.Rd

+1-1
Generated file; diff not rendered by default.

vignettes/nullabor-distributions.Rmd

+3-3
@@ -63,7 +63,7 @@ lineup_histograms(tips, "total_bill", dist = "gamma")
 ### Specifying distribution parameters
 In some cases, we need (or want) to specify the entire distribution, and not just the family. We then provide the distribution parameters, using the standard format for the distribution (i.e. the same used by `r*`, `d*`, `p*`, and `q*` functions, where `*` is the distribution name).
 
-Let's say that we want to test whether a dataset comes from a uniform $U(0,1)$ distribution. First, we generate two example variables. `x1` is $U(0,1)$, but `x2` is not.
+As an example, let's say that we want to test whether a dataset comes from a uniform $U(0,1)$ distribution. First, we generate two example variables. `x1` is $U(0,1)$, but `x2` is not.
 
 ```{r message=FALSE}
 example_data <- data.frame(x1 = runif(100, 0, 1),
@@ -84,9 +84,9 @@ lineup_histograms(example_data, "x2", dist = "uniform", params = list(min = 0, m
 
 
 ## Using Q-Q plots
-An alternative to histograms is to use quantile-quantile plots, in which the theoretical quantiles of the distribution are compared to the empirical quantiles from the data. Under the null hypothesis, the points should lie along the reference line. However, some deviations in the tails are usually expected. A lineup plot is useful to see how much points can deviate from the reference line under the null hypothesis.
+An alternative to histograms is to use quantile-quantile plots, in which the theoretical quantiles of the distribution are compared to the empirical quantiles from the (standardized) data. Under the null hypothesis, the points should lie along the reference line. However, some deviations in the tails are usually expected. A lineup plot is useful to see how much points can deviate from the reference line under the null hypothesis.
 
-To create a Q-Q lineup plot, use `lineup_qq`:
+To create a Q-Q lineup plot using the normal distribution as the null distribution, use `lineup_qq` as follows:
 
 ```{r message=FALSE}
 lineup_qq(tips, "total_bill", dist = "normal")
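
As a rough illustration of the comparison described in the revised Q-Q paragraph, the base-R sketch below computes theoretical normal quantiles and the empirical quantiles of standardized data. It uses simulated data and hypothetical variable names rather than the `tips` data, and it is not a description of how `lineup_qq` works internally.

```r
# Sketch only: a single Q-Q comparison against the standard normal,
# using simulated data in place of the vignette's dataset.
set.seed(1)
x <- rgamma(200, shape = 2, rate = 0.1)   # simulated right-skewed data (hypothetical)

z <- as.numeric(scale(x))                 # standardize: mean 0, sd 1
empirical   <- sort(z)                    # empirical quantiles
theoretical <- qnorm(ppoints(length(z)))  # matching standard normal quantiles

plot(theoretical, empirical,
     xlab = "Theoretical quantiles", ylab = "Empirical quantiles")
abline(0, 1)                              # reference line expected under the null
```

A lineup repeats this kind of panel for data simulated under the null distribution, which gives a visual sense of how far the tails may drift from the reference line by chance alone.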

vignettes/nullabor-regression.Rmd

+3-1
@@ -31,7 +31,7 @@ x <- lm(tip ~ total_bill, data = tips)
 
 The `lineup_residuals` function can now be used to generate four types of residual lineup plots.
 
-The first residual plot shows the residuals versus the fitted values. It is used to test the hypothesis that the response variable is a linear combination of the predictors:
+The first residual plot shows the residuals versus the fitted values. It is used to test the hypothesis that the response variable is a linear combination of the predictors. If you can spot the true data in the plot, you can formally reject the null hypothesis with p-value 0.05 (Buja et al., 2009; Li et al., 2024). After running the code below, run the `decrypt` message (e.g. `decrypt("XSKz 5xQx Vd Z3jVQV3d ww")`) printed in the R Console to see which dataset is the true data.
 
 ```{r message=FALSE}
 lineup_residuals(x, type = 1)
@@ -74,4 +74,6 @@ References
 
 Buja, A., Cook, D., Hofmann, H., Lawrence, M., Lee, E.-K., Swayne, D. F, Wickham, H. (2009) Statistical Inference for Exploratory Data Analysis and Model Diagnostics, Royal Society Philosophical Transactions A, 367:4361--4383, DOI: 10.1098/rsta.2009.0120.
 
+Li, W., Cook, D., Tanaka, E., & VanderPlas, S. (2024). A plot is worth a thousand tests: Assessing residual diagnostics with the lineup protocol. Journal of Computational and Graphical Statistics, 1-19.
+
 Thulin, M. (2024) _Modern Statistics with R_. Boca Raton: CRC Press. ISBN 9781032512440. [https://www.modernstatisticswithr.com/](https://www.modernstatisticswithr.com/)
