---
title: "Causal Inference: <br> *The Mixtape*"
subtitle: "<i>Potential Outcomes</i>"
output: 
  learnr::tutorial:
    css: css/style.css
    highlight: "kate"
runtime: shiny_prerendered
---

## Welcome

This tutorial accompanies the **Potential Outcomes** chapter of Scott Cunningham's book, [Causal Inference: The Mixtape](https://mixtape.scunning.com/).


### Packages needed

The first thing you need to do is install a few packages to make sure everything runs:

```{r, eval = FALSE}
install.packages("tidyverse")
install.packages("cli")
install.packages("haven")
install.packages("rmarkdown")
install.packages("learnr")
install.packages("estimatr")
install.packages("stargazer")
```

### Load

```{r load, warning=FALSE, message=FALSE}
library(learnr)
library(haven)
library(tidyverse)
library(estimatr)
library(stargazer)

# 10 minute code time limit
options(tutorial.exercise.timelimit = 600)

# read_data function
read_data <- function(df) {
    full_path <- paste0("https://raw.github.com/scunning1975/mixtape/master/", df)
    return(haven::read_dta(full_path))
}
```


## Yule

```{r yule, exercise=TRUE, echo=FALSE}

yule <- read_data("yule.dta") %>%
    lm(paup ~ outrelief + old + pop, .)

stargazer(yule, type = "text")
```


#### Questions

- How do you interpret the coefficient on `outrelief` given it's a percentage change regressed onto a percentage?
- Draw a DAG representing what must be true for Yule's estimate of the effect of `outrelief` on pauper growth rates to be causal.
- Yule concluded that public assistance (`outrelief`) increased pauper growth rates. How convinced are you that all backdoor paths between pauperism and out-relief are blocked once you control for two covariates in a cross-sectional database for all of England? Could there be unobserved determinants of both poverty and public assistance?
- If public assistance causes pauper growth rates, but pauper growth rates also cause public assistance, why won't Yule's regression capture the causal effect of `outrelief` on pauper growth rates? Explain the concept of reverse causality using Yule's data.
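
To make the reverse-causality question concrete, here is a minimal sketch on simulated data. It does not use Yule's data: the coefficients `beta` and `gamma`, the sample size, and the unit-variance shocks are all hypothetical values chosen for illustration. When the outcome feeds back into the regressor, the OLS slope settles away from the structural effect.

```r
# Hypothetical simultaneity simulation (made-up numbers, not Yule's data)
set.seed(1)
n <- 10000
beta <- 0.5   # assumed structural effect of relief on pauperism
gamma <- 0.8  # assumed feedback from pauperism to relief
u <- rnorm(n) # shock to pauperism
v <- rnorm(n) # shock to relief

# Reduced form of the two-equation system
#   paup   = beta  * relief + u
#   relief = gamma * paup   + v
relief <- (gamma * u + v) / (1 - beta * gamma)
paup <- beta * relief + u

# OLS slope of pauperism on relief
ols <- coef(lm(paup ~ relief))[["relief"]]

# Large-sample value of the OLS slope with unit-variance shocks
plim <- beta + gamma * (1 - beta * gamma) / (1 + gamma^2)

c(true = beta, ols = ols, plim = plim)
```

Because `relief` is correlated with the pauperism shock `u`, the regression attributes part of the feedback to the effect of relief, so the slope does not recover `beta` no matter how large the sample is.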


## Independence Assumption

```{r independence, exercise=TRUE, echo=FALSE}

gap <- function() {
    sdo <- tibble(
        y1 = c(7, 5, 5, 7, 4, 10, 1, 5, 3, 9),
        y0 = c(1, 6, 1, 8, 2, 1, 10, 6, 7, 8),
        random = rnorm(10)
    ) %>%
        arrange(random) %>%
        mutate(
            d = c(rep(1, 5), rep(0, 5)),
            y = d * y1 + (1 - d) * y0
        ) %>%
        pull(y)

    sdo <- mean(sdo[1:5] - sdo[6:10])

    return(sdo)
}

sim <- replicate(1000, gap())
mean(sim)
```
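
As a benchmark for the simulated average, the true average treatment effect can be computed directly from the two potential-outcome columns used in `gap()`. This is a small sketch; the names `delta` and `ate` are introduced here for illustration only.

```r
# Potential outcomes copied from the simulation above
y1 <- c(7, 5, 5, 7, 4, 10, 1, 5, 3, 9)
y0 <- c(1, 6, 1, 8, 2, 1, 10, 6, 7, 8)

# Unit-level treatment effects and their average (the true ATE)
delta <- y1 - y0
ate <- mean(delta)
ate
```

Under random assignment, the mean of the simulated differences in `sim` should sit close to this value, while any single draw of `gap()` can be far from it.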

#### Questions

- The requirement that treatment be independent of potential outcomes means that a person's choice must be unrelated to what they expect to gain or lose from it. Give an example where this is likely true. What does independence imply about human decision-making?
- All of the behavioral sciences, including economics, suggest that independence is unlikely to hold outside of an experiment. What is special about an experiment that makes independence hold? What is special about behavior outside an experiment that makes it unlikely to hold?
- What does the utility-maximization decision rule from economics imply for our ability to claim that treatment is assigned independently of potential outcomes?


## Fisher Randomization

```{r randomization_1, exercise=TRUE, echo=FALSE}
correct <- tibble(
    cup = c(1:8),
    guess = c(1:4, rep(0, 4))
)

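# as.data.frame() names the matrix columns V1..V4, avoiding tibble's
# deprecation warning for unnamed matrix columns
combo <- as_tibble(as.data.frame(t(combn(correct$cup, 4)))) %>%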
    transmute(
        cup_1 = V1, cup_2 = V2,
        cup_3 = V3, cup_4 = V4
    ) %>%
    mutate(permutation = 1:70) %>%
    crossing(., correct) %>%
    arrange(permutation, cup) %>%
    mutate(
        correct = case_when(
            cup_1 == 1 & cup_2 == 2 & cup_3 == 3 & cup_4 == 4 ~ 1,
            TRUE ~ 0
        )
    )

sum(combo$correct == 1)
p_value <- sum(combo$correct == 1) / nrow(combo)

p_value
```


#### Questions

- Using the above simulation, what is the probability that Dr. Bristol selected the correct four cups completely by chance?
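
As an analytical cross-check on the simulation, the count of 70 equally likely selections comes from choosing 4 cups out of 8, and exactly one of those selections is fully correct:

$$
\binom{8}{4} = \frac{8!}{4!\,4!} = 70, \qquad
\Pr(\text{all four correct by chance}) = \frac{1}{70} \approx 0.014
$$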


## Randomization Inference

### Fisher Sharp Null

```{r ri, exercise=TRUE, echo=FALSE}

ri <- read_data("ri.dta") %>%
    mutate(id = c(1:8))

treated <- c(1:4)

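# Enumerate every way to pick 4 treated units out of 8; base indexing is used
# because `%$%` comes from magrittr, which is not loaded in this tutorial
combo <- as_tibble(as.data.frame(t(combn(ri$id, 4)))) %>%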
    transmute(
        treated1 = V1, treated2 = V2,
        treated3 = V3, treated4 = V4
    ) %>%
    mutate(permutation = 1:70) %>%
    crossing(., ri) %>%
    arrange(permutation, name) %>%
    mutate(d = case_when(
        id == treated1 | id == treated2 |
            id == treated3 | id == treated4 ~ 1,
        TRUE ~ 0
    ))

te1 <- combo %>%
    group_by(permutation) %>%
    filter(d == 1) %>%
    summarize(te1 = mean(y, na.rm = TRUE))

te0 <- combo %>%
    group_by(permutation) %>%
    filter(d == 0) %>%
    summarize(te0 = mean(y, na.rm = TRUE))

n <- nrow(inner_join(te1, te0, by = "permutation"))

p_value <- inner_join(te1, te0, by = "permutation") %>%
    mutate(ate = te1 - te0) %>%
    dplyr::select(permutation, ate) %>%
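    # rank from largest to smallest so rank / n is the share of assignments
    # with an estimate at least as large as the observed one
    arrange(desc(ate)) %>%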
    mutate(rank = 1:nrow(.)) %>%
    filter(permutation == 1) %>%
    pull(rank) / n

p_value
```
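
One conventional way to write what the ranking step approximates is the one-sided $p$-value below. It assumes that larger values of the difference in means count as more extreme; the notation $\hat{\delta}_\omega$ for the estimate under assignment $\omega$ is introduced here for illustration.

$$
H_0:\; Y_i^1 = Y_i^0 \ \text{for all } i, \qquad
p = \frac{1}{\binom{8}{4}} \sum_{\omega=1}^{70} \mathbf{1}\left\{ \hat{\delta}_\omega \ge \hat{\delta}_{\text{obs}} \right\}
$$

Under the sharp null every unit's missing potential outcome is known, so $\hat{\delta}_\omega$ can be computed for all 70 assignments. A rank-based calculation can differ slightly from this formula when several assignments tie with the observed estimate.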


#### Questions

- Given where the observed estimate ranks in the placebo distribution, can we reject Fisher's sharp null?


### KS Test


```{r ks, exercise=TRUE, echo=FALSE}
library(stats)

tb <- tibble(
    d = c(rep(0, 20), rep(1, 20)),
    y = c(
        0.22, -0.87, -2.39, -1.79, 0.37, -1.54,
        1.28, -0.31, -0.74, 1.72,
        0.38, -0.17, -0.62, -1.10, 0.30,
        0.15, 2.30, 0.19, -0.50, -0.9,
        -5.13, -2.19, 2.43, -3.83, 0.5,
        -3.25, 4.32, 1.63, 5.18, -0.43,
        7.11, 4.87, -3.10, -5.81, 3.76,
        6.31, 2.58, 0.07, 5.76, 3.50
    )
)

kdensity_d1 <- tb %>%
    filter(d == 1) %>%
    pull(y)
kdensity_d1 <- density(kdensity_d1)

kdensity_d0 <- tb %>%
    filter(d == 0) %>%
    pull(y)
kdensity_d0 <- density(kdensity_d0)

kdensity_d0 <- tibble(x = kdensity_d0$x, y = kdensity_d0$y, d = 0)
kdensity_d1 <- tibble(x = kdensity_d1$x, y = kdensity_d1$y, d = 1)

# stack the two density curves (no join keys are needed)
kdensity <- bind_rows(kdensity_d1, kdensity_d0)
kdensity$d <- as_factor(kdensity$d)

ggplot(kdensity) +
    geom_point(size = 0.3, aes(x, y, color = d)) +
    xlim(-7, 8) +
    labs(title = "Kolmogorov-Smirnov Test") +
    scale_color_discrete(labels = c("Control", "Treatment"))
```
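
The chunk above plots the two estimated densities but does not compute the test itself. A minimal sketch of the two-sample test using base R's `ks.test()` follows; the vectors `control` and `treated` are the same 40 values, split by `d`, and the object name `ks` is illustrative.

```r
# Same outcomes as the chunk above, split by treatment status
control <- c(
    0.22, -0.87, -2.39, -1.79, 0.37, -1.54, 1.28, -0.31, -0.74, 1.72,
    0.38, -0.17, -0.62, -1.10, 0.30, 0.15, 2.30, 0.19, -0.50, -0.9
)
treated <- c(
    -5.13, -2.19, 2.43, -3.83, 0.5, -3.25, 4.32, 1.63, 5.18, -0.43,
    7.11, 4.87, -3.10, -5.81, 3.76, 6.31, 2.58, 0.07, 5.76, 3.50
)

# Two-sample KS test: the statistic D is the largest vertical gap
# between the two empirical CDFs
ks <- ks.test(treated, control)
ks$statistic
ks$p.value
```

Because the statistic compares entire distributions, it can pick up a difference in spread even when the two group means are similar.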


## Approximate $p$-values

```{r thornton_ri, exercise=TRUE, echo=FALSE}

hiv <- read_data("thornton_hiv.dta")


# creating the permutations

tb <- NULL

permuteHIV <- function(df, random = TRUE) {
    tb <- df
    first_half <- ceiling(nrow(tb) / 2)
    second_half <- nrow(tb) - first_half

    if (random == TRUE) {
        tb <- tb %>%
            sample_frac(1) %>%
            mutate(any = c(rep(1, first_half), rep(0, second_half)))
    }

    te1 <- tb %>%
        filter(any == 1) %>%
        pull(got) %>%
        mean(na.rm = TRUE)

    te0 <- tb %>%
        filter(any == 0) %>%
        pull(got) %>%
        mean(na.rm = TRUE)

    ate <- te1 - te0

    return(ate)
}

permuteHIV(hiv, random = FALSE)

iterations <- 1000

permutation <- tibble(
    iteration = 1:iterations,
    # iteration 1 is the observed assignment; the rest are random reshuffles
    ate = c(
        permuteHIV(hiv, random = FALSE),
        map_dbl(seq(iterations - 1), ~ permuteHIV(hiv, random = TRUE))
    )
)

# calculating the p-value

permutation <- permutation %>%
    arrange(-ate) %>%
    mutate(rank = seq(iterations))

p_value <- permutation %>%
    filter(iteration == 1) %>%
    pull(rank) / iterations

p_value
```
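
Because only a random subset of assignments is drawn, the result is an estimate of the exact $p$-value. A rough sketch of its sampling error, treating the $K$ draws as independent (an approximation, since the first row here is the observed assignment rather than a random draw):

$$
\hat{p} = \frac{1}{K} \sum_{k=1}^{K} \mathbf{1}\left\{ \hat{\delta}_k \ge \hat{\delta}_{\text{obs}} \right\}, \qquad
\text{SE}(\hat{p}) \approx \sqrt{\frac{p\,(1 - p)}{K}}
$$

With $K = 1000$ and a true $p$ near $0.05$, for example, the standard error is about $0.007$, so raising `iterations` tightens the approximation.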


#### Questions 

- How does the sharp null of no treatment effect for any unit, which randomization inference tests, differ from a null of no average treatment effect?
- How likely is it that Thornton's results were due to random chance?