---
title: "Causal Inference: <br> *The Mixtape*"
subtitle: "<it>Synthetic Control</it>"
output:
  learnr::tutorial:
    css: css/style.css
    highlight: "kate"
runtime: shiny_prerendered
---
## Welcome
This is material for the **Synthetic Control** chapter in Scott Cunningham's book, [Causal Inference: The Mixtape](https://mixtape.scunning.com/).
### Packages needed
Before running anything, install the packages this tutorial depends on:
```{r, eval = FALSE}
install.packages("tidyverse")
install.packages("cli")
install.packages("haven")
install.packages("rmarkdown")
install.packages("learnr")
install.packages("stargazer")
# This chapter only
install.packages("robustbase")
install.packages("Synth")
install.packages("devtools")
devtools::install_github("bcastanho/SCtools")
```
### Load
```{r load, warning=FALSE, message=FALSE}
library(learnr)
library(haven)
library(tidyverse)
library(stargazer)
# This chapter only
library(Synth)
library(SCtools)
# 20 minute code time limit
options(tutorial.exercise.timelimit = 1200)
# read_data function: load a Stata .dta file from the mixtape GitHub repository by file name
read_data <- function(df) {
  full_path <- paste0("https://raw.github.com/scunning1975/mixtape/master/", df)
  return(haven::read_dta(full_path))
}
```
## Prison Construction and Black Male Incarceration
```{r texas, exercise=TRUE, echo=FALSE}
# Load the Texas prison data; dataprep() expects a plain data.frame
texas <- read_data("texas.dta") %>%
  as.data.frame(.)

# Set up predictors, outcome, treated unit (Texas, FIPS 48), and donor pool
dataprep_out <- dataprep(
  foo = texas,
  predictors = c("poverty", "income"),
  predictors.op = "mean",
  time.predictors.prior = 1985:1993,
  special.predictors = list(
    list("bmprison", c(1988, 1990:1992), "mean"),
    list("alcohol", 1990, "mean"),
    list("aidscapita", 1990:1991, "mean"),
    list("black", 1990:1992, "mean"),
    list("perc1519", 1990, "mean")),
  dependent = "bmprison",
  unit.variable = "statefip",
  unit.names.variable = "state",
  time.variable = "year",
  treatment.identifier = 48,
  controls.identifier = c(1,2,4:6,8:13,15:42,44:47,49:51,53:56),
  time.optimize.ssr = 1985:1993,
  time.plot = 1985:2000
)

# Estimate the donor weights
synth_out <- synth(data.prep.obj = dataprep_out)

# Treated and synthetic outcome paths
cli::cli_h1("Path Plot")
path.plot(synth_out, dataprep_out)

# Difference between the treated and synthetic paths
cli::cli_h1("Gap Plot")
gaps.plot(synth_out, dataprep_out)

# Placebo runs: treat each donor unit in turn as if it were the treated unit
cli::cli_h1("Placebos")
placebos <- generate.placebos(dataprep_out, synth_out, Sigf.ipop = 3)
plot_placebos(placebos)
mspe.plot(placebos, discard.extreme = TRUE, mspe.limit = 1, plot.hist = TRUE)
```
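The path and gap plots above rest on one piece of arithmetic: the synthetic unit's outcome is a weighted average of the donor units' outcomes, and the gap is the treated outcome minus that average. The sketch below shows this in base R. The matrix `Y0`, the vector `Y1`, and the weights `w` are made-up toy values chosen for illustration, not output from the Texas model.

```r
# Toy illustration only: every number here is made up, not an estimate from texas.dta.
# Rows are years, columns are donor (control) units.
Y0 <- matrix(
  c(10, 20, 30,
    12, 22, 33,
    14, 24, 36),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("1991", "1992", "1993"), c("A", "B", "C"))
)

# Outcome of the (hypothetical) treated unit over the same years.
Y1 <- c("1991" = 18, "1992" = 20, "1993" = 30)

# Donor weights: non-negative and summing to one.
w <- c(A = 0.5, B = 0.25, C = 0.25)
stopifnot(all(w >= 0), isTRUE(all.equal(sum(w), 1)))

# Synthetic path: weighted average of donor outcomes in each year.
synthetic <- drop(Y0 %*% w)

# Gap: treated outcome minus synthetic outcome in each year.
gap <- Y1 - synthetic

print(cbind(treated = Y1, synthetic = synthetic, gap = gap))
```

With the fitted objects from the exercise, the analogous calculation should be `dataprep_out$Y1plot - dataprep_out$Y0plot %*% synth_out$solution.w`, assuming the `Synth` return values are named as in that package's documentation.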
#### Questions
- In your own words, what do you think the identifying assumptions are for synthetic control to be consistent?
- What role, if any, does parallel trends play in synthetic control?
- Which unit has the largest ratio of post- to pre-treatment RMSPE?
- Compare the estimated effect for the unit with the largest post/pre RMSPE ratio to the Texas effect. How do the weights compare? How do the sizes of the effects compare? How do the "signs" of the effects compare?
- Can you improve on my fit by experimenting with different combinations of predictors? Do so and report your analysis.
- Report results from a variety of different specifications. How robust does the prison effect appear to be?
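
The RMSPE questions above turn on a simple calculation: for each unit, take the root mean squared gap between actual and synthetic outcomes before treatment and after treatment, then divide the second by the first. Below is a minimal base-R sketch of that ratio and of a placebo-based p-value. The data frame `gaps`, its unit names, the years, and `treatment_year` are all hypothetical values invented for this example; they are not results from the exercise.

```r
# Toy illustration only: hypothetical gaps (actual minus synthetic) for three units.
# Unit names, years, and values are made up for this sketch.
gaps <- data.frame(
  unit = rep(c("Treated", "PlaceboA", "PlaceboB"), each = 4),
  year = rep(c(1991, 1992, 1994, 1995), times = 3),
  gap  = c(1, -1, 6,  8,
           2, -2, 2, -2,
           3,  4, 3,  4)
)
treatment_year <- 1993  # hypothetical treatment date

# Root mean squared prediction error of a vector of gaps.
rmspe <- function(x) sqrt(mean(x^2))

# Split the gaps into pre- and post-treatment periods, by unit.
is_pre <- gaps$year < treatment_year
pre  <- vapply(split(gaps$gap[is_pre],  gaps$unit[is_pre]),  rmspe, numeric(1))
post <- vapply(split(gaps$gap[!is_pre], gaps$unit[!is_pre]), rmspe, numeric(1))

# Post/pre ratio per unit, largest first.
ratio <- sort(post[names(pre)] / pre, decreasing = TRUE)
print(ratio)

# Share of units whose ratio is at least as large as the treated unit's.
p_value <- mean(ratio >= ratio["Treated"])
print(p_value)
```

Ranking units by the ratio of MSPEs instead of RMSPEs gives the same ordering, since the square root is monotone. With only a few donor units the smallest attainable p-value is one over the number of units, so the size of the donor pool limits how strong the placebo evidence can be.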