|
| 1 | +--- |
| 2 | +title: "BDCD Business Analytics Lab exercise <br> SOLUTION" |
| 3 | +author: "Prof Di Cook, with Stephanie Kobakian, Stuart Lee, Mitch O'Hara-Wild" |
| 4 | +date: "Econometrics & Bus Stat, Monash, Clayton campus, 20/4/2017" |
| 5 | +output: |
| 6 | + tufte::tufte_html: default |
| 7 | + tufte::tufte_handout: |
| 8 | + citation_package: natbib |
| 9 | + latex_engine: xelatex |
| 10 | +--- |
| 11 | + |
| 12 | +```{r setup, include=FALSE} |
| 13 | +knitr::opts_chunk$set( |
| 14 | + echo = FALSE, |
| 15 | + message = FALSE, |
| 16 | + warning = FALSE, |
| 17 | + error = FALSE) |
| 18 | +``` |
| 19 | + |
| 20 | + |
| 21 | +## Exercise 1: Background work |
| 22 | + |
| 23 | +Find the web site for the competition, to determine the answers for these questions. |
| 24 | + |
| 25 | +1. What team won the competition in 2017, and 2018? Who were the runners-up in each year? |
| 26 | + *2017 - Adelaide Crows (runners-up: Brisbane Lions), 2018 - Western Bulldogs (runners-up: Brisbane Lions)* |
| 27 | +2. What players were awarded best and fairest in each year? |
| 28 | + *2017 - Erin Phillips, 2018 - Emma Kearney* |
| 29 | + |
| 30 | + |
| 31 | +```{r fig.margin=TRUE, fig.width=3.5, fig.height=7} |
| 32 | +library(tidyverse) |
| 33 | +players <- read_csv("AFLW/data/players.csv") |
| 34 | +ggplot(players, aes(x=Kicks, y=Disp_eff)) + |
| 35 | + geom_point() + |
| 36 | + facet_wrap(~Year, ncol=1) + |
| 37 | + theme(aspect.ratio=1) + |
| 38 | + xlab("Av Kicks/Game") + |
| 39 | + ylab("% Dispatch Efficiency") |
| 40 | +``` |
| 41 | + |
| 42 | +## Exercise 2: Explore the data |
| 43 | + |
| 44 | +1. We have written a web app, using R. To access it: |
| 45 | +- Open the R project "AFLW" |
| 46 | +- Open the file "app.R" |
| 47 | +- Click "Run App" |
| 48 | +2. |
| 49 | + a. Which player had the highest average kicks in 2017? 2018? |
| 50 | + *2017 - Erin Phillips, 2018 - Emma Kearney* |
| 51 | + b. Is the player who has the most kicks also good at dispatching the ball? Name a few of the players who managed to dispatch 100% of the time, and also some who never managed to dispatch the ball. |
| 52 | + *They are about average in dispatch efficiency.*<br> |
| 53 | + *100% of dispatches: Kim Mickle, Natalie Plane, Sarah Lampard*<br> |
| 54 | + *0%: Taryn Priestly, Romy Timmins, Lou Wotton, Hannah Dunn, Ruby Blair, Louise Stephenson* |
| 55 | + c. What was the highest average number of goals by a player in each season? Who achieved this? Is this different from the player with the most kicks? How? |
| 56 | + *2017 - 2.0, Darcy Vescio, 2018 - 1.6, Jess Wuetschner*<br> |
| 57 | + *Yes, these players tended to be only average in the number of kicks made.* |
| 58 | + |
| 59 | + |
| 60 | +```{r fig.margin=TRUE, fig.width=3.5, fig.height=3.5} |
| 61 | +players <- players %>% mutate(GWS = ifelse(Club == "GWS", "yes", "no")) |
| 62 | +players_sel <- players %>% filter(Year == "2018") %>% |
| 63 | + filter(Player %in% c("Alicia Eva", "Jessica Dal Pos")) |
| 64 | +ggplot(filter(players, Year == "2018"), aes(x=Kicks, y=Disp_eff, colour=GWS)) + |
| 65 | + geom_point() + |
| 66 | + theme(aspect.ratio=1, legend.position = "none") + |
| 67 | + xlab("Av Kicks/Game") + |
| 68 | + ylab("% Dispatch Efficiency") + |
| 69 | + geom_label(data=players_sel, aes(label = Player), |
| 70 | + vjust="top", hjust="right", alpha=0.6) + |
| 71 | + scale_colour_brewer(palette="Dark2") |
| 72 | +ggplot(players, aes(x=Kicks, y=Goals)) + |
| 73 | + geom_point() + |
| 74 | + theme(aspect.ratio=1, legend.position = "none") + |
| 75 | + xlab("Av Kicks/Game") + |
| 76 | + ylab("Goals") + facet_wrap(~Year, ncol=1) |
| 77 | +``` |
| 78 | + |
| 79 | +## Exercise 3: Checking the news |
| 80 | + |
| 81 | +1. The news article [AFL's edict to women players: make the game more entertaining](https://www.theaustralian.com.au/sport/afl/afls-edict-to-women-players-make-the-game-more-entertaining/news-story/44e9b896759ca76edf166b38499d149d) suggests that the organisers felt the games were too low scoring in the first season. |
| 82 | + a. Has this changed in season 2? |
| 83 | + *There is not a lot of change. It does look like a few more players have slightly higher averages.* |
| 84 | + b. Make a plot of goals vs kicks to help to answer this question. |
| 85 | + |
| 86 | +2. [AFLW best and fairest: AFL rejects claim that votes were awarded to wrong player](https://www.foxsports.com.au/afl/womens-afl/aflw-best-and-fairest-afl-rejects-claim-that-votes-were-awarded-to-wrong-player/news-story/c0ae1205ccfecc7481dcf4edc45ed692) |
| 87 | + a. What two players are mentioned in the article? *Jessica Dal Pos, Alicia Eva* |
| 88 | + b. Colour the team discussed in the article. |
| 89 | + c. Does the data suggest that if one of these players got votes, it was a mistake? And that they should have gone to the other player? If so, why? *Yes, Alicia Eva's statistics were far better than Jessica Dal Pos's, which suggests the umpires got the two players mixed up.* |
| 90 | + |
| 91 | +```{r fig.margin=TRUE, fig.width=3.5, fig.height=7} |
| 92 | +players_sub17 <- players %>% |
| 93 | + filter(Year == "2017") |
| 94 | +players_sub18 <- players %>% |
| 95 | + filter(Year == "2018") |
| 96 | +players_sub_mat17 <- as.matrix(players_sub17[,4:12]) |
| 97 | +players_sub_mat17 <- apply(players_sub_mat17, 2, scale) |
| 98 | +players_sub_mat18 <- as.matrix(players_sub18[,4:12]) |
| 99 | +players_sub_mat18 <- apply(players_sub_mat18, 2, scale) |
| 100 | +players_mds17 <- cmdscale(dist(players_sub_mat17), k=2) |
| 101 | +players_mds18 <- cmdscale(dist(players_sub_mat18), k=2) |
| 102 | +players_mds_df17 <- as_tibble(players_mds17) |
| 103 | +players_mds_df17$Player <- players$Player[players$Year == "2017"] |
| 104 | +players_mds_df17$Year <- "2017" |
| 105 | +players_mds_df18 <- as_tibble(players_mds18) |
| 106 | +players_mds_df18$Player <- players$Player[players$Year == "2018"] |
| 107 | +players_mds_df18$Year <- "2018" |
| 108 | +players_mds_df <- bind_rows(players_mds_df17, players_mds_df18) |
| 109 | +p1 <- players_mds_df %>% filter(Year == "2017", Player == "Erin Phillips") |
| 110 | +p2 <- players_mds_df %>% filter(Year == "2018", Player == "Emma Kearney") |
| 111 | +pl <- bind_rows(p1, p2) |
| 112 | +ggplot(players_mds_df, aes(x=V1, y=V2, label=Player)) + |
| 113 | + geom_point() + facet_wrap(~Year, scales="free", ncol=1) + |
| 114 | + geom_point(data=pl, colour = "orange", size=2) + |
| 115 | + geom_label(data=pl, aes(label = Player), |
| 116 | + vjust="top", hjust="left", alpha=0.6) |
| 117 | +``` |
| 118 | + |
| 119 | +## Exercise 4: Similarity of players |
| 120 | + |
| 121 | +- Change the variables being combined to make the low-dimensional picture. Does the plot change? Explain why this happens based on the formulae given in the lecture notes. *The distance between players changes based on which statistics are used.* |
| 122 | +- Colour the points by team. Is there much difference between teams? If so, or if not, what does this mean? *Not so much. This suggests that the teams are quite balanced and similar. The purpose is to have close matches, and there is little competition between clubs to outbid each other for the best players.* |
| 123 | +- Find the most valuable player awardee in each year, and look at the nearest neighbours of those players. Were they also nominees for the award? |
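
The first and third points above can be illustrated with a small sketch in base R. It assumes the distance is Euclidean distance on standardised statistics, as in the MDS code used for the plot in this section. The five players and their values are made up for illustration and are not taken from `players.csv`; `player_dist()` and `nearest()` are hypothetical helper names.

```r
# Toy data: hypothetical per-game averages for five made-up players
# (illustrative values only, NOT taken from players.csv).
toy <- data.frame(
  Player    = c("A", "B", "C", "D", "E"),
  Kicks     = c(12, 11, 4, 5, 8),
  Handballs = c(3, 9, 4, 8, 6),
  Goals     = c(0.2, 0.1, 1.5, 1.4, 0.6)
)

# Euclidean distances between players, computed on standardised variables.
player_dist <- function(data, vars) {
  m <- scale(as.matrix(data[, vars]))  # mean 0, sd 1 for each statistic
  rownames(m) <- data$Player
  as.matrix(dist(m))
}

# Name of the closest other player to `who` in distance matrix `d`.
nearest <- function(d, who) {
  others <- d[who, colnames(d) != who]
  names(others)[which.min(others)]
}

d_kg  <- player_dist(toy, c("Kicks", "Goals"))
d_all <- player_dist(toy, c("Kicks", "Handballs", "Goals"))

nearest(d_kg, "A")   # "B": closest when only Kicks and Goals are used
nearest(d_all, "A")  # "E": closest once Handballs is added

# Two-dimensional MDS layout built from the three-variable distances.
mds_all <- cmdscale(as.dist(d_all), k = 2)
```

Player A's nearest neighbour changes when Handballs is added, because every variable contributes a squared difference to the distance. The same reasoning explains why the MDS picture moves when the variable selection changes.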
| 124 | + |
| 125 | +```{r fig.margin=TRUE, fig.width=3.5, fig.height=7} |
| 126 | +library(forcats) |
| 127 | +ggplot(players, aes(x=fct_relevel(Club, c("COLL", "CARL", "ADEL", "MELB", "WB", "BL", "FRE", "GWS")), y=Kicks)) + |
| 128 | + geom_boxplot() + |
| 129 | + facet_wrap(~Year, ncol=1) + xlab("Club") |
| 130 | +``` |
| 131 | + |
| 132 | +## Exercise 5: Your turn to code |
| 133 | + |
| 134 | +1. Easy tasks: |
| 135 | + a. Change the title from "Exploring the AFLW statistics" to "Women's AFL Statistics Exploration" |
| 136 | + b. Change the label of tab 1 from "Data" to "Basic plots" |
| 137 | +2. Medium tasks: |
| 138 | + a. Change the tab 2 title to "Boxplots" |
| 139 | + b. Remove the "X" variable input menu |
| 140 | + c. Change the type of plot on tab 1 to be a side-by-side boxplot of the statistic by Club. |
| 141 | +3. Difficult: Make a new app to study the similarity of the teams. The steps to do this are: |
| 142 | + a. Copy the entire code directory for the "AFLW" app, and rename the copy. |
| 143 | + b. Read in the teams data, and pass this into the analysis functions. |
| 144 | + c. Change quite a few things: the list of variables to use for the MDS, name of the tab, label by Club instead of Player, .... |
| 145 | + |
| 146 | +With your new app, using ONLY this set of statistics (Kicks, Handballs, Dispatch efficiency, Frees for and against, Goals and Behinds), which teams are most similar, based on being close together in the MDS plot? Which are very different from each other? |
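
One way to check what the plot shows is to compute the team distances directly, outside the app. The sketch below is a rough outline only: random numbers stand in for the real teams file, the club labels are placeholders, and every column name other than `Kicks`, `Disp_eff` and `Goals` is an assumption about how the teams data is laid out.

```r
# Placeholder team-level data: random numbers stand in for the real teams file.
# Column names other than Kicks, Disp_eff and Goals are assumptions.
set.seed(2018)
teams_toy <- data.frame(
  Club       = paste("Club", LETTERS[1:8]),
  Kicks      = rnorm(8, 110, 10),
  Handballs  = rnorm(8, 60, 8),
  Disp_eff   = rnorm(8, 60, 5),
  Frees_for  = rnorm(8, 18, 3),
  Frees_agst = rnorm(8, 18, 3),
  Goals      = rnorm(8, 5, 1.5),
  Behinds    = rnorm(8, 5, 1.5)
)

stats <- c("Kicks", "Handballs", "Disp_eff", "Frees_for",
           "Frees_agst", "Goals", "Behinds")

# Standardise each statistic, compute distances, and project to two dimensions.
team_mat <- scale(as.matrix(teams_toy[, stats]))
rownames(team_mat) <- teams_toy$Club
team_dist <- dist(team_mat)
team_mds <- cmdscale(team_dist, k = 2)

# Most similar and most different pairs, read from the full distance matrix.
d <- as.matrix(team_dist)
d[upper.tri(d, diag = TRUE)] <- NA  # keep each pair once, drop self-distances
closest  <- which(d == min(d, na.rm = TRUE), arr.ind = TRUE)
farthest <- which(d == max(d, na.rm = TRUE), arr.ind = TRUE)
closest_pair  <- rownames(d)[closest[1, ]]
farthest_pair <- rownames(d)[farthest[1, ]]
```

The two-dimensional MDS plot only approximates these distances, so a pair that looks close in the plot is worth confirming against `d` before reporting it.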
| 147 | + |
| 148 | +## Turn in |
| 149 | + |
| 150 | +Each group needs to provide to the instructor: |
| 151 | + |
| 152 | +1. A document with answers to each of the questions |
| 153 | +2. A copy of your new app code |
| 154 | +3. A presentation slide with a plot of the data, and a sentence about what you've learned about women's AFL. (HINT: This part will have the most points for judging the best team, so be creative and informative.) |
| 155 | + |