gladishd committed May 22, 2018
1 parent b05047d commit 69a5cdd
Showing 3 changed files with 16 additions and 11 deletions.
Binary file added Case-Study-3-CodeSup.pdf
Binary file not shown.
20 changes: 11 additions & 9 deletions Case-Study-3-CodeSup.rmd
@@ -92,33 +92,33 @@ summary(stp.inter.fwd)
# and between unionized status and party preference,
# I have created some plots of gender, region, and union.
-plot(allEffects(glm.base), rows = 1, cols = 3, type = "link",
+plot(allEffects(stp.inter.fwd), rows = 2, cols = 3, type = "link",
ylab = "Log(Odds of Democratic Party Support)")
```

As we can see from the plots, males have lower odds of supporting the Democratic party.

-Additionally, people in North Carolina and the Southern region have lower odds of supporting the Democratic party.
+Although people in North Carolina and the Southern region have lower odds of supporting the Democratic party, our model has been simplified such that these do not matter.

Those who are not in unions also have lower odds of supporting the Democratic party.

For further analysis of the probability that any given individual supports the Democrats, we can use the following code:

```{r, message = F, warning = F}
-plot(Effect(c("gender", "region", "union"), glm.base), multiline = TRUE, type = "response", ylab = "Probability(Democrat)")
+plot(Effect(c("gender", "union"), stp.inter.fwd), multiline = TRUE, type = "response", ylab = "Probability(Democrat)")
```

-This code allows us to more clearly see that Support of the Democratic Party tends to come from people who are in regions NE and W, who are in unions, and who are female.
+This code allows us to more clearly see that Support of the Democratic Party tends to come from people who are in unions and who are female.

More specifically, Unionization seems to have the largest effect on support, followed by Gender and then Region.
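
The two plot calls above differ only in scale: `type = "link"` draws log-odds and `type = "response"` draws probabilities, and the inverse logit connects the two. A minimal sketch of that conversion follows. The coefficient values and the names `b0`, `b_male`, and `b_nonunion` are assumptions made up for illustration, not estimates from `stp.inter.fwd`.

```r
# Hypothetical coefficients on the log-odds scale (not fitted values).
b0         <-  0.40   # baseline: female, union member
b_male     <- -0.30   # shift for males
b_nonunion <- -0.60   # shift for non-union members

# Linear predictor (log-odds) for each gender/union combination.
grid <- expand.grid(male = c(0, 1), nonunion = c(0, 1))
grid$log_odds <- b0 + b_male * grid$male + b_nonunion * grid$nonunion

# plogis() is the inverse logit: exp(x) / (1 + exp(x)).
grid$prob <- plogis(grid$log_odds)
print(grid)
```

Because the inverse logit is monotone, the groups are ordered the same way on both scales, but equal gaps in log-odds do not translate into equal gaps in probability.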

NOW, we need to assess the significance of these effects regardless of time.

```{r, message = F, warning = F}
-for (i in c(2, 3, 4, 5, 6)) {
-coefficient <- coef(glm.base)[i]
-standardError <- sqrt(vcov(glm.base)[i,i])
+for (i in c(2, 3, 4, 5, 6, 7, 8)) {
+coefficient <- coef(stp.inter.fwd)[i]
+standardError <- sqrt(vcov(stp.inter.fwd)[i,i])
waldStat <- (coefficient / standardError)^2
print(1-pchisq(waldStat, df = 1))
}
@@ -136,9 +136,11 @@ anova(gender_only, glm.base, test = "Chisq")

shows that we can reject the notion that the other coefficients are not necessary.
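
The Wald loop and the `anova()` comparison above can be tried end to end on simulated data. The sketch below does not use the case-study data: the data frame `d`, its columns, and the model names `full` and `reduced` are assumptions made up for illustration. It checks that the squared z statistic, referred to a chi-square distribution with 1 degree of freedom, gives the same p-values that `summary()` reports, and then runs a likelihood-ratio test between two nested models.

```r
set.seed(1)

# Simulated stand-in data (hypothetical; not the case-study data set).
n <- 500
d <- data.frame(
  gender = factor(sample(c("F", "M"), n, replace = TRUE)),
  union  = factor(sample(c("yes", "no"), n, replace = TRUE))
)
eta <- 0.4 - 0.3 * (d$gender == "M") - 0.6 * (d$union == "no")
d$dem <- rbinom(n, size = 1, prob = plogis(eta))

full    <- glm(dem ~ gender + union, family = binomial, data = d)
reduced <- glm(dem ~ gender,         family = binomial, data = d)

# Wald test for every non-intercept coefficient, vectorized.
est    <- coef(full)[-1]
se     <- sqrt(diag(vcov(full)))[-1]
wald   <- (est / se)^2
p_wald <- pchisq(wald, df = 1, lower.tail = FALSE)

# These match the two-sided z-test p-values in the summary table.
p_summary <- summary(full)$coefficients[-1, "Pr(>|z|)"]
print(cbind(p_wald, p_summary))

# Likelihood-ratio test: does adding union improve on gender alone?
print(anova(reduced, full, test = "Chisq"))
```

Using `lower.tail = FALSE` is equivalent to `1 - pchisq(...)` but avoids rounding very small p-values to zero.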

+```{r, message = F, warning = F}
+plot(glm.base$residuals)
+```



The residuals plot shows that our model generally fits the data.
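
One caveat on the residual plot: the `$residuals` component of a `glm` object holds the working residuals from the final IWLS iteration, and for a binary response an index plot of them can be hard to read. A hedged alternative is to look at deviance residuals together with a binned comparison of observed and fitted proportions. The sketch uses simulated data; `d`, `fit`, and the choice of ten bins are assumptions, not part of the case study.

```r
set.seed(2)

# Simulated stand-in data (hypothetical).
n <- 1000
d <- data.frame(x = rnorm(n))
d$y <- rbinom(n, size = 1, prob = plogis(-0.2 + 0.8 * d$x))
fit <- glm(y ~ x, family = binomial, data = d)

# Deviance residuals, rather than the working residuals in fit$residuals.
dev_res <- residuals(fit, type = "deviance")
plot(fitted(fit), dev_res,
     xlab = "Fitted probability", ylab = "Deviance residual")

# Binned check: within each decile of fitted probability, compare the
# mean fitted value with the observed proportion of successes.
bins <- cut(fitted(fit),
            breaks = quantile(fitted(fit), probs = seq(0, 1, 0.1)),
            include.lowest = TRUE)
binned <- data.frame(
  mean_fitted = as.numeric(tapply(fitted(fit), bins, mean)),
  observed    = as.numeric(tapply(d$y, bins, mean))
)
print(round(binned, 3))
```

If the model fits reasonably well, the two columns of `binned` should track each other closely across the bins.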



7 changes: 5 additions & 2 deletions Case-Study-3-WriteUp.rmd
@@ -31,9 +31,12 @@ colnames(m) <- c(" ", "Estimate", "Standard error", "z value", "P-value")
pander(m, caption = "Important Coefficients of our Logistic Regression Model")
```

-The following set of scatterplots represent what is essentially the relationship between our estimated model and the data for the years 1980 and 2000.
+The following set of plots represent what is essentially the relationship between our estimated model and the data for the years 1980 and 2000.

+```{r, message = F, warning = F, echo = F}
+library(ggformula)
+```

_____

\ \ \ \ \ \ \ Through exploratory data analysis of significance and association, we found that interaction variables gave us a closer fit to the data.

