Synthetic_Control.Rmd

---
title: "Causal Inference: <br> *The Mixtape*"
subtitle: "<it>Synthetic Control</it>"
output: 
  learnr::tutorial:
    css: css/style.css
    highlight: "kate"
runtime: shiny_prerendered
---

## Welcome

This is material for the **Synthetic Control** chapter in Scott Cunningham's book, [Causal Inference: The Mixtape.](https://mixtape.scunning.com/)

### Packages needed

The first thing you need to do is install a few packages to make sure everything runs:

```{r, eval = FALSE}
install.packages("tidyverse")
install.packages("cli")
install.packages("haven")
install.packages("rmarkdown")
install.packages("learnr")
install.packages("haven")
install.packages("stargazer")

# This chapter only
install.packages("robustbase")
install.packages("Synth")
install.packages("devtools")
devtools::install_github("bcastanho/SCtools")
```

### Load

```{r load, warning=FALSE, message=FALSE}
library(learnr)
library(haven)
library(tidyverse)
library(stargazer)

# This chapter only
library(Synth)
library(SCtools)

# 20 minute code time limit
options(tutorial.exercise.timelimit = 1200)

# read_data function
read_data <- function(df) {
  full_path <- paste0("https://raw.github.com/scunning1975/mixtape/master/", df)
  return(haven::read_dta(full_path))
}
```


## Prison Construction and Black Male Incarceration

```{r texas, exercise=TRUE, echo=FALSE}

texas <- read_data("texas.dta") %>%
  as.data.frame(.)

dataprep_out <- dataprep(
  foo = texas,
  predictors = c("poverty", "income"),
  predictors.op = "mean",
  time.predictors.prior = 1985:1993,
  special.predictors = list(
    list("bmprison", c(1988, 1990:1992), "mean"),
    list("alcohol", 1990, "mean"),
    list("aidscapita", 1990:1991, "mean"),
    list("black", 1990:1992, "mean"),
    list("perc1519", 1990, "mean")),
  dependent = "bmprison",
  unit.variable = "statefip",
  unit.names.variable = "state",
  time.variable = "year",
  treatment.identifier = 48,
  controls.identifier = c(1,2,4:6,8:13,15:42,44:47,49:51,53:56),
  time.optimize.ssr = 1985:1993,
  time.plot = 1985:2000
)

synth_out <- synth(data.prep.obj = dataprep_out)

cli::cli_h1("Path Plot")
path.plot(synth_out, dataprep_out)

cli::cli_h1("Gap Plot")
gaps.plot(synth_out, dataprep_out)


cli::cli_h1("Placebos")
placebos <- generate.placebos(dataprep_out, synth_out, Sigf.ipop = 3)

plot_placebos(placebos)

mspe.plot(placebos, discard.extreme = TRUE, mspe.limit = 1, plot.hist = TRUE)

```

#### Questions
- In your own words, what do you think the identifying assumptions are for synthetic control to be consistent? 
- What role, if any, does parallel trends play in synthetic control?
- Who is the unit with the largest ratio of post to pre RMSPE?  
- Compare the unit with the largest post to pre RMSPE estimated effect to the Texas effect.  How do the weights compare?  How do the size of the effects compare?  How do the ``signs`` of the effects compare?
- Can you improve on my fit by experimenting with different combinations? Do so and report your analysis.
- Report results from a variety of different specifications.  How robust does the prison effect appear to be?