Commit f476f82

committed
final versions of all materials on womens AFL
1 parent 8924b55 commit f476f82

File tree

14 files changed

+2439
-470
lines changed

AFLW/data/players.csv

+470-470
Large diffs are not rendered by default.

AFLW/lab_exercises/lab.Rmd

+149
@@ -0,0 +1,149 @@
1+
---
2+
title: "BDCD Business Analytics Lab exercise"
3+
author: "Prof Di Cook, with Steph Kobakian, Stuart Lee, Mitch O'Hara-Wild"
4+
date: "Econometrics & Bus Stat, Monash, Clayton campus, 5/4/2018"
5+
output:
6+
tufte::tufte_html: default
7+
tufte::tufte_handout:
8+
citation_package: natbib
9+
latex_engine: xelatex
10+
---
11+
12+
```{r setup, include=FALSE}
13+
knitr::opts_chunk$set(
14+
echo = FALSE,
15+
message = FALSE,
16+
warning = FALSE,
17+
error = FALSE)
18+
```
19+
20+
## Motivation
21+
22+
Business analytics involves mathematics and computing. This lab exercise has a little of both. For the coding part, you will want to copy, pull apart and put together again.
23+
24+
Sports provide a lot of statistics about players and teams. In this exercise we are going to look at the statistics collected for the Women's Australian Rules Football league (AFLW) over its first two seasons, 2017 and 2018. We have stats for players and for teams. These are per-game averages across all the games: Kicks, Handballs, Dispatch efficiency (%), Marks, Frees_Agst, Goals, Behinds, Goal_assists, Time_On_Ground.
25+
26+
Materials for the workshop can be downloaded from [https://github.com/BDCD18/BusAn](https://github.com/BDCD18/BusAn).
27+
28+
Data collected from [http://www.afl.com.au/womens/matches/stats](http://www.afl.com.au/womens/matches/stats).
29+
30+
## Exercise 1: Background work
31+
32+
Find the web site for the competition, to determine the answers for these questions.
33+
34+
1. What team won the competition in 2017, and 2018? Who were the runners-up in each year?
35+
2. What players were awarded best and fairest in each year?
36+
37+
```{r fig.margin=TRUE, fig.width=3.5, fig.height=7}
38+
library(tidyverse)
39+
players <- read_csv("AFLW/data/players.csv")
40+
ggplot(players, aes(x=Kicks, y=Disp_eff)) +
41+
geom_point() +
42+
facet_wrap(~Year, ncol=1) +
43+
theme(aspect.ratio=1) +
44+
xlab("Av Kicks/Game") +
45+
ylab("% Dispatch Efficiency")
46+
```
47+
48+
## Exercise 2: Explore the data
49+
50+
1. We have written a web app, using R, that you can access by
51+
- open the R project "AFLW"
52+
- open the file "app.R"
53+
- Click "Run App"
54+
2.
55+
a. Which player had the highest average kicks in 2017? 2018?
56+
b. Is the player who has the most kicks also good at dispatching the ball? Name a few of the players who managed to dispatch 100% of the time, and also the ones who never managed to dispatch the ball.
57+
c. What was the highest average number of goals by a player in both seasons? Who achieved this? Is this different from the player with the most kicks? How?
58+
59+
60+
```{r fig.margin=TRUE, fig.width=3.5, fig.height=3.5}
61+
players <- players %>% mutate(GWS = ifelse(Club == "GWS", "yes", "no"))
62+
players_sel <- players %>% filter(Year == "2018") %>%
63+
filter(Player %in% c("Alicia Eva", "Jessica Dal Pos"))
64+
ggplot(filter(players, Year == "2018"), aes(x=Kicks, y=Disp_eff, colour=GWS)) +
65+
geom_point() +
66+
theme(aspect.ratio=1, legend.position = "none") +
67+
xlab("Av Kicks/Game") +
68+
ylab("% Dispatch Efficiency") +
69+
geom_label(data=players_sel, aes(label = Player),
70+
vjust="top", hjust="right", alpha=0.6) +
71+
scale_colour_brewer(palette="Dark2")
72+
```
73+
74+
## Exercise 3: Checking the news
75+
76+
1. The news article [AFL's edict to women players: make the game more entertaining](https://www.theaustralian.com.au/sport/afl/afls-edict-to-women-players-make-the-game-more-entertaining/news-story/44e9b896759ca76edf166b38499d149d) suggests that the organisers felt the games were too low scoring in the first season.
77+
a. Has this changed in season 2?
78+
b. Make a plot of goals vs kicks to help to answer this question.
79+
80+
2. [AFLW best and fairest: AFL rejects claim that votes were awarded to wrong player](https://www.foxsports.com.au/afl/womens-afl/aflw-best-and-fairest-afl-rejects-claim-that-votes-were-awarded-to-wrong-player/news-story/c0ae1205ccfecc7481dcf4edc45ed692)
81+
a. What two players are mentioned in the article?
82+
b. Colour the team discussed in the article.
83+
c. Does the data suggest that if one of these players got votes, it was a mistake? And that they should have gone to the other player? If so, why?
84+
85+
```{r fig.margin=TRUE, fig.width=3.5, fig.height=7}
86+
players_sub17 <- players %>%
87+
filter(Year == "2017")
88+
players_sub18 <- players %>%
89+
filter(Year == "2018")
90+
players_sub_mat17 <- as.matrix(players_sub17[,4:12])
91+
players_sub_mat17 <- apply(players_sub_mat17, 2, scale)
92+
players_sub_mat18 <- as.matrix(players_sub18[,4:12])
93+
players_sub_mat18 <- apply(players_sub_mat18, 2, scale)
94+
players_mds17 <- cmdscale(dist(players_sub_mat17), k=2)
95+
players_mds18 <- cmdscale(dist(players_sub_mat18), k=2)
96+
players_mds_df17 <- as_tibble(players_mds17)
97+
players_mds_df17$Player <- players$Player[players$Year == "2017"]
98+
players_mds_df17$Year <- "2017"
99+
players_mds_df18 <- as_tibble(players_mds18)
100+
players_mds_df18$Player <- players$Player[players$Year == "2018"]
101+
players_mds_df18$Year <- "2018"
102+
players_mds_df <- bind_rows(players_mds_df17, players_mds_df18)
103+
p1 <- players_mds_df %>% filter(Year == "2017", Player == "Erin Phillips")
104+
p2 <- players_mds_df %>% filter(Year == "2018", Player == "Emma Kearney")
105+
pl <- bind_rows(p1, p2)
106+
ggplot(players_mds_df, aes(x=V1, y=V2, label=Player)) +
107+
geom_point() + facet_wrap(~Year, scales="free", ncol=1) +
108+
geom_point(data=pl, colour = "orange", size=2) +
109+
geom_label(data=pl, aes(label = Player),
110+
vjust="top", hjust="left", alpha=0.6)
111+
```
112+
113+
## Exercise 4: Similarity of players
114+
115+
- Change the variables being combined to make the low-dimensional picture. Does the plot change? Explain why this happens based on the formulae given in the lecture notes.
116+
- Colour the points by team. Is there much difference between teams? If so, or if not, what does this mean?
117+
- Find the most valuable player awardee in each year, and look at the nearest neighbours of those players. Were they also nominees for the award?
118+
119+
```{r fig.margin=TRUE, fig.width=3.5, fig.height=7}
120+
library(forcats)
121+
ggplot(players, aes(x=fct_relevel(Club, c("COLL", "CARL", "ADEL", "MELB", "WB", "BL", "FRE", "GWS")), y=Kicks)) +
122+
geom_boxplot() +
123+
facet_wrap(~Year, ncol=1) + xlab("Club")
124+
```
125+
126+
## Exercise 5: Your turn to code
127+
128+
1. Easy tasks:
129+
a. Change the title from "Exploring the AFLW statistics" to "Women's AFL Statistics Exploration"
130+
b. Change the label of tab 1 from "Data" to "Basic plots"
131+
2. Medium task:
132+
a. Change the tab 2 title to "Boxplots"
133+
b. Remove the "X" variable input menu
134+
c. Change the type of plot on tab 1 to be a side-by-side boxplot of the statistic by Club.
135+
3. Difficult: Make a new app to study the similarity of the teams. The steps to do this are:
136+
a. Copy the entire directory containing the code for the "AFLW" app, and rename the copy.
137+
b. Read in the teams data, and pass this into the analysis functions.
138+
c. Change quite a few things: the list of variables to use for the MDS, name of the tab, label by Club instead of Player, ....
139+
140+
With your new app, using ONLY this set of statistics (Kicks, Handballs, Dispatch efficiency, Frees for and against, Goals and Behinds), which teams are most similar, based on being close together in the MDS plot? Which are very different from each other?
141+
142+
## Turn in
143+
144+
Each group needs to provide to the instructor:
145+
146+
1. A document with answers to each of the questions
147+
2. A copy of your app code
148+
3. A presentation slide with a plot of the data, and a sentence about what you've learned about women's AFL. (HINT: This part will have the most points for judging the best team, so be creative and informative.)
149+

AFLW/lab_exercises/lab.html

+177
Large diffs are not rendered by default.

AFLW/lab_exercises/lab.pdf

113 KB
Binary file not shown.

AFLW/lab_exercises/lab_sol.Rmd

+155
@@ -0,0 +1,155 @@
1+
---
2+
title: "BDCD Business Analytics Lab exercise <br> SOLUTION"
3+
author: "Prof Di Cook, with Stephanie Kobakian, Stuart Lee, Mitch O'Hara-Wild"
4+
date: "Econometrics & Bus Stat, Monash, Clayton campus, 20/4/2017"
5+
output:
6+
tufte::tufte_html: default
7+
tufte::tufte_handout:
8+
citation_package: natbib
9+
latex_engine: xelatex
10+
---
11+
12+
```{r setup, include=FALSE}
13+
knitr::opts_chunk$set(
14+
echo = FALSE,
15+
message = FALSE,
16+
warning = FALSE,
17+
error = FALSE)
18+
```
19+
20+
21+
## Exercise 1: Background work
22+
23+
Find the web site for the competition, to determine the answers for these questions.
24+
25+
1. What team won the competition in 2017, and 2018? Who were the runners-up in each year?
26+
*Winners: 2017 - Adelaide Crows, 2018 - Western Bulldogs. Runners-up: Brisbane Lions in both years.*
27+
2. What players were awarded best and fairest in each year?
28+
*2017 - Erin Phillips, 2018 - Emma Kearney*
29+
30+
31+
```{r fig.margin=TRUE, fig.width=3.5, fig.height=7}
32+
library(tidyverse)
33+
players <- read_csv("AFLW/data/players.csv")
34+
ggplot(players, aes(x=Kicks, y=Disp_eff)) +
35+
geom_point() +
36+
facet_wrap(~Year, ncol=1) +
37+
theme(aspect.ratio=1) +
38+
xlab("Av Kicks/Game") +
39+
ylab("% Dispatch Efficiency")
40+
```
41+
42+
## Exercise 2: Explore the data
43+
44+
1. We have written a web app, using R, that you can access by
45+
- open the R project "AFLW"
46+
- open the file "app.R"
47+
- Click "Run App"
48+
2.
49+
a. Which player had the highest average kicks in 2017? 2018?
50+
*2017 - Erin Phillips, 2018 - Emma Kearney*
51+
b. Is the player who has the most kicks also good at dispatching the ball? Name a few of the players who managed to dispatch 100% of the time, and also the ones who never managed to dispatch the ball.
52+
*They are about average in dispatch efficiency*<br>
53+
*100% of dispatches: Kim Mickle, Natalie Plane, Sarah Lampard*<br>
54+
*0%: Taryn Priestly, Romy Timmins, Lou Wotton, Hannah Dunn, Ruby Blair, Louise Stephenson*
55+
c. What was the highest average number of goals by a player in both seasons? Who achieved this? Is this different from the player with the most kicks? How?
56+
*2017 - 2.0, Darcy Vescio, 2018 - 1.6, Jess Wuetschner*<br>
57+
*Yes, the top goal scorers tended to be only average in the number of kicks made.*
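
The answers to questions 2a and 2c can also be checked directly in code rather than read off the app. The following is a minimal sketch, assuming a data frame with `Player`, `Year`, `Kicks` and `Goals` columns like `players`; the helper `top_by_year()` and the toy data are hypothetical and are not part of the workshop materials.

```r
library(dplyr)

# Toy data with made-up players and values, for illustration only.
# In the lab, the `players` data frame read from players.csv would be used.
toy_players <- tibble::tibble(
  Player = c("Player A", "Player B", "Player C", "Player D"),
  Year   = c(2017, 2017, 2018, 2018),
  Kicks  = c(12.5, 8.0, 9.1, 14.2),
  Goals  = c(0.3, 1.9, 1.4, 0.1)
)

# Hypothetical helper: return the row(s) with the largest value of `stat`
# within each year. Ties are kept, so more than one row per year is possible.
top_by_year <- function(data, stat) {
  data %>%
    group_by(Year) %>%
    slice_max({{ stat }}, n = 1) %>%
    ungroup()
}

top_kicks <- top_by_year(toy_players, Kicks)
top_goals <- top_by_year(toy_players, Goals)
```

Comparing `top_kicks` with `top_goals` shows whether the leading kicker and the leading goal scorer are the same player in a given season.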
58+
59+
60+
```{r fig.margin=TRUE, fig.width=3.5, fig.height=3.5}
61+
players <- players %>% mutate(GWS = ifelse(Club == "GWS", "yes", "no"))
62+
players_sel <- players %>% filter(Year == "2018") %>%
63+
filter(Player %in% c("Alicia Eva", "Jessica Dal Pos"))
64+
ggplot(filter(players, Year == "2018"), aes(x=Kicks, y=Disp_eff, colour=GWS)) +
65+
geom_point() +
66+
theme(aspect.ratio=1, legend.position = "none") +
67+
xlab("Av Kicks/Game") +
68+
ylab("% Dispatch Efficiency") +
69+
geom_label(data=players_sel, aes(label = Player),
70+
vjust="top", hjust="right", alpha=0.6) +
71+
scale_colour_brewer(palette="Dark2")
72+
ggplot(players, aes(x=Kicks, y=Goals)) +
73+
geom_point() +
74+
theme(aspect.ratio=1, legend.position = "none") +
75+
xlab("Av Kicks/Game") +
76+
ylab("Goals") + facet_wrap(~Year, ncol=1)
77+
```
78+
79+
## Exercise 3: Checking the news
80+
81+
1. The news article [AFL's edict to women players: make the game more entertaining](https://www.theaustralian.com.au/sport/afl/afls-edict-to-women-players-make-the-game-more-entertaining/news-story/44e9b896759ca76edf166b38499d149d) suggests that the organisers felt the games were too low scoring in the first season.
82+
a. Has this changed in season 2?
83+
*There is not a lot of change. It does look like a few more players have slightly higher averages.*
84+
b. Make a plot of goals vs kicks to help to answer this question.
85+
86+
2. [AFLW best and fairest: AFL rejects claim that votes were awarded to wrong player](https://www.foxsports.com.au/afl/womens-afl/aflw-best-and-fairest-afl-rejects-claim-that-votes-were-awarded-to-wrong-player/news-story/c0ae1205ccfecc7481dcf4edc45ed692)
87+
a. What two players are mentioned in the article? *Jessica Dal Pos, Alicia Eva*
88+
b. Colour the team discussed in the article.
89+
c. Does the data suggest that if one of these players got votes, it was a mistake? And that they should have gone to the other player? If so, why? *Yes, Alicia Eva's statistics were far better than Jessica Dal Pos's, which suggests the umpires got the two players mixed up.*
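
For question 1a, a numerical summary can back up what the goals vs kicks plot shows. This is a sketch only, assuming `Year` and `Goals` columns as in `players`; the function `goals_by_year()` and the toy values are hypothetical. Note that these are player averages, so they indicate scoring by individuals rather than total match scores.

```r
library(dplyr)

# Toy data with made-up values, for illustration only.
toy_players <- tibble::tibble(
  Player = c("Player A", "Player B", "Player C", "Player D"),
  Year   = c(2017, 2017, 2018, 2018),
  Goals  = c(0.2, 1.0, 0.4, 1.2)
)

# Hypothetical helper: summarise average goals per game by season.
goals_by_year <- function(data) {
  data %>%
    group_by(Year) %>%
    summarise(
      mean_goals   = mean(Goals, na.rm = TRUE),
      median_goals = median(Goals, na.rm = TRUE),
      max_goals    = max(Goals, na.rm = TRUE),
      .groups = "drop"
    )
}

goal_summary <- goals_by_year(toy_players)
```

If the season means and medians are close, that would agree with the answer above that there is not a lot of change between seasons.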
90+
91+
```{r fig.margin=TRUE, fig.width=3.5, fig.height=7}
92+
players_sub17 <- players %>%
93+
filter(Year == "2017")
94+
players_sub18 <- players %>%
95+
filter(Year == "2018")
96+
players_sub_mat17 <- as.matrix(players_sub17[,4:12])
97+
players_sub_mat17 <- apply(players_sub_mat17, 2, scale)
98+
players_sub_mat18 <- as.matrix(players_sub18[,4:12])
99+
players_sub_mat18 <- apply(players_sub_mat18, 2, scale)
100+
players_mds17 <- cmdscale(dist(players_sub_mat17), k=2)
101+
players_mds18 <- cmdscale(dist(players_sub_mat18), k=2)
102+
players_mds_df17 <- as_tibble(players_mds17)
103+
players_mds_df17$Player <- players$Player[players$Year == "2017"]
104+
players_mds_df17$Year <- "2017"
105+
players_mds_df18 <- as_tibble(players_mds18)
106+
players_mds_df18$Player <- players$Player[players$Year == "2018"]
107+
players_mds_df18$Year <- "2018"
108+
players_mds_df <- bind_rows(players_mds_df17, players_mds_df18)
109+
p1 <- players_mds_df %>% filter(Year == "2017", Player == "Erin Phillips")
110+
p2 <- players_mds_df %>% filter(Year == "2018", Player == "Emma Kearney")
111+
pl <- bind_rows(p1, p2)
112+
ggplot(players_mds_df, aes(x=V1, y=V2, label=Player)) +
113+
geom_point() + facet_wrap(~Year, scales="free", ncol=1) +
114+
geom_point(data=pl, colour = "orange", size=2) +
115+
geom_label(data=pl, aes(label = Player),
116+
vjust="top", hjust="left", alpha=0.6)
117+
```
118+
119+
## Exercise 4: Similarity of players
120+
121+
- Change the variables being combined to make the low-dimensional picture. Does the plot change? Explain why this happens based on the formulae given in the lecture notes. *The distance between players changes based on which statistics are used.*
122+
- Colour the points by team. Is there much difference between teams? If so, or if not, what does this mean? *Not so much. This suggests that the teams are quite balanced and similar. The purpose is to have close matches, and there is little scope for clubs to outbid each other for the best players.*
123+
- Find the most valuable player awardee in each year, and look at the nearest neighbours of those players. Were they also nominees for the award?
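
The first and third bullets can be illustrated with a small sketch. It assumes, as in the chunk above, that players are compared by Euclidean distance on standardised statistics and that classical MDS (`cmdscale`) gives the two-dimensional picture; the lecture-note formulae are not reproduced here. The helper functions and the toy data are hypothetical.

```r
# Toy data: made-up players and values, for illustration only.
toy_players <- data.frame(
  Player    = c("Player A", "Player B", "Player C", "Player D", "Player E"),
  Kicks     = c(12, 11, 5, 6, 9),
  Handballs = c(4, 5, 9, 8, 6),
  Goals     = c(0.1, 1.5, 0.2, 1.4, 0.8)
)

# Standardise the chosen columns, then compute Euclidean distances
# between every pair of players.
player_dist <- function(data, vars) {
  x <- scale(as.matrix(data[, vars, drop = FALSE]))
  rownames(x) <- data$Player
  dist(x)
}

# Classical MDS coordinates in two dimensions (needs at least two variables).
player_mds <- function(data, vars) {
  cmdscale(player_dist(data, vars), k = 2)
}

# Names of the n players closest to `player` in the standardised space.
nearest_neighbours <- function(data, vars, player, n = 2) {
  d <- as.matrix(player_dist(data, vars))[player, ]
  d <- d[names(d) != player]
  names(sort(d))[seq_len(n)]
}

coords   <- player_mds(toy_players, c("Kicks", "Handballs", "Goals"))
nn_all   <- nearest_neighbours(toy_players, c("Kicks", "Handballs", "Goals"),
                               "Player A", n = 1)
nn_goals <- nearest_neighbours(toy_players, "Goals", "Player A", n = 1)
```

On the toy data, `nn_all` and `nn_goals` name different players. The distance is a sum of squared differences over the chosen variables, so dropping or adding a variable changes which players count as close, and the MDS plot changes with it.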
124+
125+
```{r fig.margin=TRUE, fig.width=3.5, fig.height=7}
126+
library(forcats)
127+
ggplot(players, aes(x=fct_relevel(Club, c("COLL", "CARL", "ADEL", "MELB", "WB", "BL", "FRE", "GWS")), y=Kicks)) +
128+
geom_boxplot() +
129+
facet_wrap(~Year, ncol=1) + xlab("Club")
130+
```
131+
132+
## Exercise 5: Your turn to code
133+
134+
1. Easy tasks:
135+
a. Change the title from "Exploring the AFLW statistics" to "Women's AFL Statistics Exploration"
136+
b. Change the label of tab 1 from "Data" to "Basic plots"
137+
2. Medium task:
138+
a. Change the tab 2 title to "Boxplots"
139+
b. Remove the "X" variable input menu
140+
c. Change the type of plot on tab 1 to be a side-by-side boxplot of the statistic by Club.
141+
3. Difficult: Make a new app to study the similarity of the teams. The steps to do this are:
142+
a. Copy the entire directory containing the code for the "AFLW" app, and rename the copy.
143+
b. Read in the teams data, and pass this into the analysis functions.
144+
c. Change quite a few things: the list of variables to use for the MDS, name of the tab, label by Club instead of Player, ....
145+
146+
With your new app, using ONLY this set of statistics (Kicks, Handballs, Dispatch efficiency, Frees for and against, Goals and Behinds), which teams are most similar, based on being close together in the MDS plot? Which are very different from each other?
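
The easy and medium tasks touch three parts of a Shiny app: the title, the tab labels, and the plotting code. The skeleton below is a hedged sketch of where those pieces usually live. It is not the workshop's `app.R`; the input and output ids, the helper `club_boxplot()`, and the toy data are all assumptions made for illustration.

```r
library(shiny)
library(ggplot2)

# Toy data: made-up clubs and values, for illustration only.
toy_players <- data.frame(
  Club  = rep(c("Club 1", "Club 2"), each = 3),
  Kicks = c(5, 7, 9, 6, 8, 12),
  Goals = c(0.1, 0.4, 0.9, 0.2, 0.3, 1.1)
)

stat_choices <- c("Kicks", "Goals")

# Hypothetical helper: side-by-side boxplots of one statistic by club.
club_boxplot <- function(data, stat) {
  ggplot(data, aes(x = Club, y = .data[[stat]])) +
    geom_boxplot() +
    ylab(stat)
}

ui <- fluidPage(
  titlePanel("Women's AFL Statistics Exploration"),   # app title (task 1a)
  tabsetPanel(
    tabPanel(
      "Basic plots",                                  # tab label (task 1b)
      # Only a "Y" statistic menu is kept; there is no "X" menu (task 2b).
      selectInput("y", "Statistic", choices = stat_choices),
      plotOutput("boxplot")
    )
  )
)

server <- function(input, output) {
  # Redraw the boxplot whenever the selected statistic changes (task 2c).
  output$boxplot <- renderPlot(club_boxplot(toy_players, input$y))
}

# Build the app object without launching it.
app <- shinyApp(ui, server)
```

In an interactive R session, `shiny::runApp(app)` would launch the sketch. The same layout carries over to the difficult task by swapping the data frame and the plotting helper.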
147+
148+
## Turn in
149+
150+
Each group needs to provide to the instructor:
151+
152+
1. A document with answers to each of the questions
153+
2. A copy of your new app code
154+
3. A presentation slide with a plot of the data, and a sentence about what you've learned about women's AFL. (HINT: This part will have the most points for judging the best team, so be creative and informative.)
155+

AFLW/lab_exercises/lab_sol.html

+175
Large diffs are not rendered by default.
