diff --git a/Presentations-GeorgiaRS/202309/3-descriptive-statistics.Rmd b/Presentations-GeorgiaRS/202309/3-descriptive-statistics.Rmd index 19f3982..fda74e2 100644 --- a/Presentations-GeorgiaRS/202309/3-descriptive-statistics.Rmd +++ b/Presentations-GeorgiaRS/202309/3-descriptive-statistics.Rmd @@ -62,28 +62,47 @@ xaringanExtra::use_logo( # Table of contents // სარჩევი -- ADD +- [Introduction](#intro) +- [Piping](#piping) +- [Quick summary statistics](#quick-summary-stats) +- [Customized summary statistics](#customized-summary-stats) +- [Exporting tables](#exporting-tables) +- [Customizing table outputs](#customizing-table-outputs) +- [Wrapping up](#wrapping-up) --- -# Introduction +class: inverse, center, middle +name: intro + +# Introduction // გაცნობა + +

+ +--- + +# Introduction // გაცნობა - We learned yesterday how to conduct statistical programming and export the results in `.csv` files - However, sometime we might need more refined tables than simple (and ugly) CSVs -ADD IMAGE, DATA WORK PIPELINE AND IN OUTPUTS THERE'S A CSV AND A NICE EXCEL +```{r echo = FALSE, out.width="95%"} +knitr::include_graphics("img/session3/data-work-descriptive-stats.png") +``` --- -# Introduction +# Introduction // გაცნობა - That's what today's session is about, along with an explanation of the pipes (`%>%`) -SAME IMAGE +```{r echo = FALSE, out.width="95%"} +knitr::include_graphics("img/session3/data-work-descriptive-stats.png") +``` --- -# Introduction +# Introduction // გაცნობა ## Exercise 1a: Getting the libraries for today's session @@ -96,46 +115,64 @@ install.packages("modelsummary") install.packages("huxtable") ``` -ADD IMAGE OF CONSOLE INSTALLING +```{r echo = FALSE, out.width="55%"} +knitr::include_graphics("img/session3/install.png") +``` --- -# Introduction +# Introduction // გაცნობა ## Exercise 1b: Download and load the data we'll use -1. Go to https://www.osf.io/XXXXX and download the file +.pull-left[ +1. Go to https://osf.io/z8snr and download the file -1. In RStudio, go to `File` > `Import Dataset` > `From Text (base)` and select the file `XXXXX.csv` +1. In RStudio, go to `File` > `Import Dataset` > `From Text (base)` and select the file `small_business_2019_all.csv` + If you don't know where the file is, remember to check in your `Downloads` folder 1. Select `Import` +] -1. Go to https://www.osf.io/XXX, download the file, and repeat steps 2-3 - -ADD SCREENSHOT OF IMPORT BOX +.pull-right[ +```{r echo = FALSE, out.width="85%"} +knitr::include_graphics("img/session3/import.png") +``` +] --- -# Introduction +# Introduction // გაცნობა -You should have two dataframes loaded in the environment after this. +You should have one dataframe loaded in the environment after this. 
-ADD IMAGE +```{r echo = FALSE, out.width="90%"} +knitr::include_graphics("img/session3/environment.png") +``` --- -# Introduction +# Introduction // გაცნობა ## Recap: always know your data! +- This data is similar to the one we used before +- Every row is one business in one tax period (month) +- `modified_id` is a business identifier +- We also have information about the region, firm age, monthly income, VAT liability +- There is one more variable we didn't see before: `group` contains the group the firm was assigned to in a random experiment + +```{r echo = FALSE, out.width="40%"} +knitr::include_graphics("img/session3/df.png") +``` + --- class: inverse, center, middle name: piping -# Piping // +# Piping

@@ -236,7 +273,6 @@ There are several important details to notice here: --- - # Piping .pull-left[ @@ -297,8 +333,6 @@ df_tbilisi_50 <- filter(small_business_2019, ``` ] -There are several important details to notice here: - 3.- Notice that the functions `arrange()` and `filter()` used after the pipes now have only **one argument instead of two**. This is because when using pipes the first argument is implied to be result of the function before the pipes --- @@ -333,7 +367,7 @@ df_tbilisi_50 <- filter(small_business_2019, ] .pull-right[ -```{r echo = FALSE, out.width="75%"} +```{r echo = FALSE, out.width="85%"} knitr::include_graphics("img/session3/env-pipes.png") ``` ] @@ -369,7 +403,30 @@ df_tbilisi_50 <- # Piping -ADD SOMETHING ON GOOD CODE AND CODE CLARITY? AND ELEGANT? +.pull-left[ +```{r eval=FALSE} +# Previous solution +df_tbilisi_50 <- filter(small_business_2019, + region == "Tbilisi") %>% + arrange(-income) %>% + filter(row_number() <= 50) +``` +] + +.pull-right[ +```{r eval=FALSE} +# The same with better spacing +df_tbilisi_50 <- + small_business_2019 %>% + filter(region == "Tbilisi") %>% + arrange(-income) %>% + filter(row_number() <= 50) +``` +] + +- Good code is both correct (it does what it's supposed to) and easy to understand + +- Piping is **instrumental for writing good code in R** --- @@ -400,7 +457,7 @@ knitr::include_graphics("img/session3/pipes-joke.png") class: inverse, center, middle name: quick-summary-stats -# Quick summary statistics // +# Quick summary statistics // სწრაფი შემაჯამებელი სტატისტიკა

@@ -428,11 +485,13 @@ We learned yesterday how to produce dataframes with results and export them. - We chose a combination of both because together they export a large range of output types and allow fine-grained customization of outputs -ADD IMAGE, FROM UGLY CSV TO NICE EXCEL WITH LIBRARY BADGES +```{r echo = FALSE, out.width="95%"} +knitr::include_graphics("img/session3/csv-to-excel.png") +``` --- -# Quick summary statistics // +# Quick summary statistics We'll start by introducing the function `datasummary_skim()` from `modelsummary` @@ -459,28 +518,32 @@ datasummary_skim( --- -# Quick summary statistics // +# Quick summary statistics ## Exercise 3: Calculate quick summary statistics -1. Use `datasummary_skim()` to create a descriptive statistics table for `XXXX` +1. Load `modelsummary` with `library(modelsummary)` -```{r eval = FALSE} -datasummary_skim(XXXXX) +1. Use `datasummary_skim()` to create a descriptive statistics table for `small_business_2019_all` + +```{r echo=FALSE} +small_business_2019_all <- read.csv("data/small_business_2019_all.csv") +``` + +```{r eval=FALSE} +datasummary_skim(small_business_2019_all) ``` --- -# Quick summary statistics // +# Quick summary statistics You should be seeing this result in the lower right panel of RStudio. 
-```{r eval = FALSE} -datasummary_skim(XXXXX) +```{r echo=FALSE} +datasummary_skim(small_business_2019_all) ``` -DROP EVAL=FALSE HERE SO RESULT IS DISPLAYED - --- # Quick summary statistics @@ -489,11 +552,17 @@ DROP EVAL=FALSE HERE SO RESULT IS DISPLAYED - To summarize categorical variables, use the argument `type = "categorical"` -```{r eval = FALSE} -datasummary_skim(XXXXX, type = "categorical") +```{r eval=FALSE} +datasummary_skim(small_business_2019_all, type = "categorical") ``` -DROP EVAL=FALSE HERE SO RESULT IS DISPLAYED +--- + +# Quick summary statistics + +```{r echo=FALSE} +datasummary_skim(small_business_2019_all, type = "categorical") +``` --- @@ -501,12 +570,10 @@ DROP EVAL=FALSE HERE SO RESULT IS DISPLAYED - `datasummary_skim()` is convenient because it's fast, easy, and shows a lot of information -```{r eval = FALSE} -datasummary_skim(XXXXX) +```{r echo = FALSE} +datasummary_skim(small_business_2019_all) ``` -DROP EVAL=FALSE HERE SO RESULT IS DISPLAYED - - But what if we wanted to customize what to show? that's when we use `datasummary()` instead, also from the library `modelsummary` --- @@ -514,13 +581,13 @@ DROP EVAL=FALSE HERE SO RESULT IS DISPLAYED class: inverse, center, middle name: customized-summary-stats -# Customized summary statistics // +# Customized summary statistics // მორგებული შემაჯამებელი სტატისტიკა

--- -# Customized summary statistics // +# Customized summary statistics `datasummary()` is very similar to `data_summary_skim()`. The only difference is that it requires a **formula argument**. @@ -543,42 +610,40 @@ datasummary( --- -# Customized summary statistics // +# Customized summary statistics ## Exercise 4: -Create a summary statistics table showing the nuber of observations, mean, standard deviation, minimum, and maximum for variables XXX, XXX, XXX, and XXX of the dataframe `XXX` +Create a summary statistics table showing the number of observations, mean, standard deviation, minimum, and maximum for variables `age`, `income`, and `vat_liability` of the dataframe `small_business_2019_all` 1. Use `datasummary()` for this: ```{r eval=FALSE} datasummary( - XX + XX + XX ~ N + Mean + SD + Min + Max, - XXXXX + age + income + vat_liability ~ N + Mean + SD + Min + Max, + small_business_2019_all ) ``` --- -# Customized summary statistics // +# Customized summary statistics -```{r eval=FALSE} +```{r echo=FALSE} datasummary( - XX + XX + XX ~ N + Mean + SD + Min + Max, - XXXXX + age + income + vat_liability ~ N + Mean + SD + Min + Max, + small_business_2019_all ) ``` -REMOVE EVAL= FALSE O SHOW RESULTS - --- -# Customized summary statistics // +# Customized summary statistics ```{r eval=FALSE} datasummary( - XX + XX + XX ~ N + Mean + SD + Min + Max, # this is the formula - XXXXXX # this is the data + age + income + vat_liability ~ N + Mean + SD + Min + Max, # this is the formula + small_business_2019_all # this is the data ) ``` @@ -591,12 +656,12 @@ Some notes: --- -# Customized summary statistics // +# Customized summary statistics ```{r eval=FALSE} datasummary( - XX + XX + XX ~ N + Mean + SD + Min + Max, # this is the formula - XXXXXX # this is the data + age + income + vat_liability ~ N + Mean + SD + Min + Max, # this is the formula + small_business_2019_all # this is the data ) ``` @@ -615,20 +680,20 @@ In this exercise we used the statistics N (number of
observations), mean, SD (st class: inverse, center, middle name: exporting-tables -# Exporting tables // +# Exporting tables // მაგიდების ექსპორტი

--- -# Exporting tables // +# Exporting tables // მაგიდების ექსპორტი -Remember that both `datasummary_skim()` and `datasummary()` had an optional argument named *output*? We can use it to specify a file path for an output file. +Remember that both `datasummary_skim()` and `datasummary()` have an optional argument named *output*? We can use it to specify a file path for an output file. For example: ```{r eval=FALSE} -datasummary_skim(XXXXX, +datasummary_skim(small_business_2019_all, output = "quick_stats.docx") ``` @@ -636,7 +701,7 @@ Will export the result to the `Documents` folder (in Windows) in a Word file nam --- -# Exporting tables // +# Exporting tables // მაგიდების ექსპორტი The file type of the output is dictated by the file extension. For example: @@ -652,7 +717,7 @@ Noticed that we're missing Excel? --- -# Exporting tables // +# Exporting tables // მაგიდების ექსპორტი ## That's because the functions of `modelsummary` can't export to Excel @@ -664,38 +729,52 @@ Noticed that we're missing Excel? --- -# Exporting tables // +# Exporting tables // მაგიდების ექსპორტი ## Exercise 5: Export a table to Excel +1. Load `huxtable` with `library(huxtable)` + 1. 
Run the following code to export the result of `datasummary_skim()` to Excel: ```{r eval=FALSE} # Store the table in a new object -stats_table <- datasummary_skim(XXXXX, output = "huxtable") +stats_table <- datasummary_skim(small_business_2019_all, output = "huxtable") # Export this new object to Excel with quick_xlsx() -quick_xlsx(stats_table, "quick_stats.xlsx") +quick_xlsx(stats_table, file = "quick_stats.xlsx") ``` --- -# Exporting tables // +# Exporting tables // მაგიდების ექსპორტი Now the result will show in your `Documents` folder -ADD SCREENSHOT +```{r echo = FALSE, out.width="65%"} +knitr::include_graphics("img/session3/quick-stats-output.png") +``` --- -# Exporting tables // +# Exporting tables // მაგიდების ექსპორტი + +And you can open it with Excel for further customization if you want + +```{r echo = FALSE, out.width="65%"} +knitr::include_graphics("img/session3/quick-stats-excel.png") +``` + +--- + +# Exporting tables // მაგიდების ექსპორტი ```{r eval=FALSE} # Store the table in a new object -stats_table <- datasummary_skim(XXXXX, output = "huxtable") +stats_table <- datasummary_skim(small_business_2019_all, output = "huxtable") # Export this new object to Excel with quick_xlsx() -quick_xlsx(stats_table, "quick_stats.xlsx") +quick_xlsx(stats_table, file = "quick_stats.xlsx") ``` Some comments about this code: @@ -709,15 +788,15 @@ Some comments about this code: class: inverse, center, middle name: customizing-table-outputs -# Customizing table outputs // +# Customizing table outputs // ცხრილის შედეგების მორგება

--- -# Customizing table outputs // +# Customizing table outputs -The code belows shows how the table `stats_table` can be formatted: +The code below shows how the table `stats_table` can be formatted: .pull-left[ ```{r eval=FALSE} @@ -738,7 +817,9 @@ stats_table %>% .pull-right[ .small[ -```{r, eval = F, message = F} +```{r echo=FALSE, message=FALSE, warning=FALSE} +stats_table <- datasummary_skim(small_business_2019_all, output = "huxtable") + # Format table stats_table %>% set_header_rows(1, TRUE) %>% # Use first row as table header @@ -752,12 +833,12 @@ stats_table %>% --- -# Customizing table outputs // +# Customizing table outputs ## Exercise 6: Export a customized table to Excel .pull-left[ -1. Customize `stats_table` in a new object called `stats_table_custom` +1.- Customize `stats_table` in a new object called `stats_table_custom` ```{r eval = FALSE} stats_table_custom <- stats_table %>% @@ -775,28 +856,27 @@ stats_table_custom <- stats_table %>% ] .pull-right[ -2. Export `stats_table_custom` to a file named `stats-custom.xlsx` with `quick_xlsx()` +2.- Export `stats_table_custom` to a file named `stats-custom.xlsx` with `quick_xlsx()` ```{r eval=FALSE} quick_xlsx( stats_table_custom, - "stats-custom.xlsx" + file = "stats-custom.xlsx" ) -) ``` ] --- -# Customizing table outputs // - -SCREENSHOT OF OUTPUT IN DOCUMENTS FOLDER +# Customizing table outputs -ALSO SCREENSHOT OF TABLE OPENED IN EXCEL +```{r echo = FALSE, out.width="55%"} +knitr::include_graphics("img/session3/stats-custom.png") +``` --- -# Customizing table outputs // +# Customizing table outputs Notice that here in the first part of the exercise we stored the result in a new object @@ -814,86 +894,110 @@ This is the object that we export later with `quick_xslx()` ```{r eval=FALSE} quick_xlsx( stats_table_custom, - "stats-custom.xlsx" + file = "stats-custom.xlsx" ) -) ``` --- -# Customizing table outputs // +# Customizing table outputs .pull-left[ -## Before +**Before:** + +```{r echo = FALSE, 
out.width="95%"} +knitr::include_graphics("img/session3/quick-stats-excel.png") +``` ] .pull-right[ -## After +**After:** +```{r echo = FALSE, out.width="95%"} +knitr::include_graphics("img/session3/stats-custom.png") +``` ] --- -# Customizing table outputs // +# Customizing table outputs -We used `theme_basic()` to give a minimalist, basic theme to the table. Other available themes are: +We used `theme_basic()` to give a minimalistic, basic theme to the table. Other available themes are: -ADD IMAGE +```{r echo = FALSE, out.width="75%"} +knitr::include_graphics("img/session3/themes.png") +``` --- class: inverse, center, middle name: wrapping-up -# Wrapping up // +# Wrapping up // შეფუთვა

--- -# Wrapping up // +# Wrapping up // შეფუთვა ## Save your work! Click the floppy disk to save the script you wrote in this session. -ADD IMAGE +```{r echo = FALSE, out.width="55%"} +knitr::include_graphics("img/session3/save.png") +``` --- -# Wrapping up // +# Wrapping up // შეფუთვა ## What else is available? - This was a short overview of how `modelsummary` and `huxtable` work together to produce professional-looking table outputs in R -- Other formatting options are: +- Other formatting options (all from `huxtable`) are: - + Export in new Excel tabs instead of new files - + Change variable and row names - + Change the width of columns or height of rows - + Color cells - + Use text formatting like bold or italics - -- This is explained in the libraries documentation: +| Formatting | Command | +|------------|---------| +|Export in new Excel tabs instead of new files|`as_Workbook()`| +|Change row names|`add_rownames()`| +|Change column names|`add_colnames()`| +|Cells in bold|`set_bold()`| +|Cells in italics|`set_italic()`| +|Cell font size|`set_font_size()`| +|Cell color|`set_background_color()`| + +--- + +# Wrapping up // შეფუთვა + +## What else is available? + +More of this is explained in the libraries' documentation: + `modelsummary` documentation: https://modelsummary.com/index.html + `huxtable` documentation: https://hughjonesd.github.io/huxtable/ --- -# Wrapping up // +# Wrapping up // შეფუთვა ## This session -ADD IMAGE OF PIPELINE ENDING IN EXCEL EXPORTED OUTPUT +```{r echo = FALSE, out.width="95%"} +knitr::include_graphics("img/session3/data-work-descriptive-stats.png") +``` --- -# Wrapping up // +# Wrapping up // შეფუთვა ## Next session (last one) -ADD IMAGE OF PIPELINE ENDING IN VISUALIZATIONS +```{r echo = FALSE, out.width="95%"} +knitr::include_graphics("img/session4/data-work-data-vis.png") +``` --- @@ -901,4 +1005,4 @@ class: inverse, center, middle # Thanks! // მადლობა! // ¡Gracias! // Obrigado! -

\ No newline at end of file +

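Reviewer note on the piping slides above: the statement that functions after a pipe take one argument fewer, because the first argument is implied, can be checked with a small sketch. It assumes `dplyr` is installed and uses a hypothetical toy data frame (`toy`) in place of `small_business_2019`; only the column names `region` and `income` mirror the deck, and the values are made up for illustration.

```r
library(dplyr)

# Hypothetical toy data standing in for small_business_2019 (values are invented)
toy <- data.frame(
  region = c("Tbilisi", "Tbilisi", "Imereti", "Tbilisi"),
  income = c(100, 300, 250, 200)
)

# Without pipes: every function receives the dataframe as its first argument
step1 <- filter(toy, region == "Tbilisi")
step2 <- arrange(step1, -income)
no_pipes <- filter(step2, row_number() <= 2)

# With pipes: the dataframe is passed along implicitly,
# so each function is written with one argument fewer
with_pipes <-
  toy %>%
  filter(region == "Tbilisi") %>%
  arrange(-income) %>%
  filter(row_number() <= 2)

# Compare the two results
identical(no_pipes, with_pipes)
```

Both versions keep the two highest-income rows for Tbilisi; the piped version avoids the intermediate objects `step1` and `step2`, which is the clarity argument the slide makes.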
diff --git a/Presentations-GeorgiaRS/202309/3-descriptive-statistics.html b/Presentations-GeorgiaRS/202309/3-descriptive-statistics.html index 143f82e..354d08b 100644 --- a/Presentations-GeorgiaRS/202309/3-descriptive-statistics.html +++ b/Presentations-GeorgiaRS/202309/3-descriptive-statistics.html @@ -16,6 +16,8 @@ + + @@ -53,28 +55,43 @@ # Table of contents // სარჩევი -- ADD +- [Introduction](#intro) +- [Piping](#piping) +- [Quick summary statistics](#quick-summary-stats) +- [Customized summary statistics](#customized-summary-stats) +- [Exporting tables](#exporting-tables) +- [Customizing table outputs](#customizing-table-outputs) +- [Wrapping up](#wrapping-up) --- -# Introduction +class: inverse, center, middle +name: intro + +# Introduction // გაცნობა + +<html><div style='float:left'></div><hr color='#D38C28' size=1px width=1100px></html> + +--- + +# Introduction // გაცნობა - We learned yesterday how to conduct statistical programming and export the results in `.csv` files - However, sometime we might need more refined tables than simple (and ugly) CSVs -ADD IMAGE, DATA WORK PIPELINE AND IN OUTPUTS THERE'S A CSV AND A NICE EXCEL +<img src="img/session3/data-work-descriptive-stats.png" width="95%" style="display: block; margin: auto;" /> --- -# Introduction +# Introduction // გაცნობა - That's what today's session is about, along with an explanation of the pipes (`%>%`) -SAME IMAGE +<img src="img/session3/data-work-descriptive-stats.png" width="95%" style="display: block; margin: auto;" /> --- -# Introduction +# Introduction // გაცნობა ## Exercise 1a: Getting the libraries for today's session @@ -88,46 +105,56 @@ install.packages("huxtable") ``` -ADD IMAGE OF CONSOLE INSTALLING +<img src="img/session3/install.png" width="55%" style="display: block; margin: auto;" /> --- -# Introduction +# Introduction // გაცნობა ## Exercise 1b: Download and load the data we'll use -1. Go to https://www.osf.io/XXXXX and download the file +.pull-left[ +1. 
Go to https://osf.io/z8snr and download the file -1. In RStudio, go to `File` > `Import Dataset` > `From Text (base)` and select the file `XXXXX.csv` +1. In RStudio, go to `File` > `Import Dataset` > `From Text (base)` and select the file `small_business_2019_all.csv` + If you don't know where the file is, remember to check in your `Downloads` folder 1. Select `Import` +] -1. Go to https://www.osf.io/XXX, download the file, and repeat steps 2-3 - -ADD SCREENSHOT OF IMPORT BOX +.pull-right[ +<img src="img/session3/import.png" width="85%" style="display: block; margin: auto;" /> +] --- -# Introduction +# Introduction // გაცნობა -You should have two dataframes loaded in the environment after this. +You should have one dataframe loaded in the environment after this. -ADD IMAGE +<img src="img/session3/environment.png" width="90%" style="display: block; margin: auto;" /> --- -# Introduction +# Introduction // გაცნობა ## Recap: always know your data! +- This data is similar to the one we used before +- Every row is one business in one tax period (month) +- `modified_id` is a business identifier +- We also have information about the region, firm age, monthly income, VAT liability +- There is one more variable we didn't see before: `group` contains the group the firm was assigned to in a random experiment + +<img src="img/session3/df.png" width="40%" style="display: block; margin: auto;" /> + --- class: inverse, center, middle name: piping -# Piping // +# Piping <html><div style='float:left'></div><hr color='#D38C28' size=1px width=1100px></html> @@ -231,7 +258,6 @@ --- - # Piping .pull-left[ @@ -296,8 +322,6 @@ ``` ] -There are several important details to notice here: - 3.- Notice that the functions `arrange()` and `filter()` used after the pipes now have only **one argument instead of two**. 
This is because when using pipes the first argument is implied to be result of the function before the pipes --- @@ -334,7 +358,7 @@ ] .pull-right[ -<img src="img/session3/env-pipes.png" width="75%" style="display: block; margin: auto;" /> +<img src="img/session3/env-pipes.png" width="85%" style="display: block; margin: auto;" /> ] --- @@ -370,7 +394,32 @@ # Piping -ADD SOMETHING ON GOOD CODE AND CODE CLARITY? AND ELEGANT? +.pull-left[ + +```r +# Previous solution +df_tbilisi_50 <- filter(small_business_2019, + region == "Tbilisi") %>% + arrange(-income) %>% + filter(row_number() <= 50) +``` +] + +.pull-right[ + +```r +# The same with better spacing +df_tbilisi_50 <- + small_business_2019 %>% + filter(region == "Tbilisi") %>% + arrange(-income) %>% + filter(row_number() <= 50) +``` +] + +- Good code is both correct (it does what it's supposed to) and easy to understand + +- Piping is **instrumental for writing good code in R** --- @@ -399,7 +448,7 @@ class: inverse, center, middle name: quick-summary-stats -# Quick summary statistics // +# Quick summary statistics // სწრაფი შემაჯამებელი სტატისტიკა <html><div style='float:left'></div><hr color='#D38C28' size=1px width=1100px></html> @@ -427,11 +476,11 @@ - We chose a combination of both because together they export a large range of output types and allow fine-grained customization of outputs -ADD IMAGE, FROM UGLY CSV TO NICE EXCEL WITH LIBRARY BADGES +<img src="img/session3/csv-to-excel.png" width="95%" style="display: block; margin: auto;" /> --- -# Quick summary statistics // +# Quick summary statistics We'll start by introducing the function `datasummary_skim()` from `modelsummary` @@ -459,29 +508,164 @@ --- -# Quick summary statistics // +# Quick summary statistics ## Exercise 3: Calculate quick summary statistics -1. Use `datasummary_skim()` to create a descriptive statistics table for `XXXX` +1. Load `modelsummary` with `library(modelsummary)` + +1. 
Use `datasummary_skim()` to create a descriptive statistics table for `small_business_2019_all` + + ```r -datasummary_skim(XXXXX) +datasummary_skim(small_business_2019_all) ``` --- -# Quick summary statistics // +# Quick summary statistics You should be seeing this result in the lower right panel of RStudio. - -```r -datasummary_skim(XXXXX) -``` - -DROP EVAL=FALSE HERE SO RESULT IS DISPLAYED +<table class="table" style="width: auto !important; margin-left: auto; margin-right: auto;"> + <thead> + <tr> + <th style="text-align:left;"> </th> + <th style="text-align:right;"> Unique (#) </th> + <th style="text-align:right;"> Missing (%) </th> + <th style="text-align:right;"> Mean </th> + <th style="text-align:right;"> SD </th> + <th style="text-align:right;"> Min </th> + <th style="text-align:right;"> Median </th> + <th style="text-align:right;"> Max </th> + <th style="text-align:right;"> </th> + </tr> + </thead> +<tbody> + <tr> + <td style="text-align:left;"> modified_id </td> + <td style="text-align:right;"> 984 </td> + <td style="text-align:right;"> 0 </td> + <td style="text-align:right;"> 5448915.1 </td> + <td style="text-align:right;"> 3758602.4 </td> + <td style="text-align:right;"> 19832.0 </td> + <td style="text-align:right;"> 5008712.0 </td> + <td style="text-align:right;"> 12296912.0 </td> + <td style="text-align:right;"> <svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" class="svglite" width="48.00pt" height="12.00pt" viewBox="0 0 48.00 12.00"><defs><style type="text/css"> + .svglite line, .svglite polyline, .svglite polygon, .svglite path, .svglite rect, .svglite circle { + fill: none; + stroke: #000000; + stroke-linecap: round; + stroke-linejoin: round; + stroke-miterlimit: 10.00; + } + .svglite text { + white-space: pre; + } + </style></defs><rect width="100%" height="100%" style="stroke: none; fill: none;"></rect><defs><clipPath id="cpMC4wMHw0OC4wMHwwLjAwfDEyLjAw"><rect x="0.00" y="0.00" width="48.00" 
height="12.00"></rect></clipPath></defs><g clip-path="url(#cpMC4wMHw0OC4wMHwwLjAwfDEyLjAw)"> +</g><defs><clipPath id="cpMC4wMHw0OC4wMHwyLjg4fDEyLjAw"><rect x="0.00" y="2.88" width="48.00" height="9.12"></rect></clipPath></defs><g clip-path="url(#cpMC4wMHw0OC4wMHwyLjg4fDEyLjAw)"><rect x="1.71" y="3.22" width="3.62" height="8.44" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="5.33" y="7.10" width="3.62" height="4.56" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="8.95" y="7.52" width="3.62" height="4.14" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="12.57" y="7.52" width="3.62" height="4.14" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="16.19" y="6.84" width="3.62" height="4.83" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="19.81" y="7.05" width="3.62" height="4.62" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="23.43" y="7.94" width="3.62" height="3.72" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="27.05" y="9.04" width="3.62" height="2.62" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="30.67" y="9.72" width="3.62" height="1.94" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="34.29" y="9.20" width="3.62" height="2.47" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="37.91" y="4.95" width="3.62" height="6.71" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="41.53" y="8.04" width="3.62" height="3.62" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="45.15" y="11.03" width="3.62" height="0.63" style="stroke-width: 0.38; fill: #000000;"></rect></g></svg> +</td> + </tr> + <tr> + <td style="text-align:left;"> taxperiod </td> + <td style="text-align:right;"> 12 </td> + <td style="text-align:right;"> 0 </td> + <td style="text-align:right;"> 201906.7 </td> + <td style="text-align:right;"> 3.4 </td> + <td style="text-align:right;"> 201901.0 </td> + <td style="text-align:right;"> 201907.0 </td> + <td style="text-align:right;"> 201912.0 
</td> + <td style="text-align:right;"> <svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" class="svglite" width="48.00pt" height="12.00pt" viewBox="0 0 48.00 12.00"><defs><style type="text/css"> + .svglite line, .svglite polyline, .svglite polygon, .svglite path, .svglite rect, .svglite circle { + fill: none; + stroke: #000000; + stroke-linecap: round; + stroke-linejoin: round; + stroke-miterlimit: 10.00; + } + .svglite text { + white-space: pre; + } + </style></defs><rect width="100%" height="100%" style="stroke: none; fill: none;"></rect><defs><clipPath id="cpMC4wMHw0OC4wMHwwLjAwfDEyLjAw"><rect x="0.00" y="0.00" width="48.00" height="12.00"></rect></clipPath></defs><g clip-path="url(#cpMC4wMHw0OC4wMHwwLjAwfDEyLjAw)"> +</g><defs><clipPath id="cpMC4wMHw0OC4wMHwyLjg4fDEyLjAw"><rect x="0.00" y="2.88" width="48.00" height="9.12"></rect></clipPath></defs><g clip-path="url(#cpMC4wMHw0OC4wMHwyLjg4fDEyLjAw)"><rect x="1.78" y="3.22" width="4.04" height="8.44" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="5.82" y="7.44" width="4.04" height="4.22" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="9.86" y="8.19" width="4.04" height="3.47" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="13.90" y="6.64" width="4.04" height="5.02" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="17.94" y="7.17" width="4.04" height="4.49" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="21.98" y="8.03" width="4.04" height="3.63" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="26.02" y="6.85" width="4.04" height="4.81" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="30.06" y="7.12" width="4.04" height="4.54" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="34.10" y="6.00" width="4.04" height="5.67" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="38.14" y="6.91" width="4.04" height="4.76" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="42.18" y="7.28" 
width="4.04" height="4.38" style="stroke-width: 0.38; fill: #000000;"></rect></g></svg> +</td> + </tr> + <tr> + <td style="text-align:left;"> age </td> + <td style="text-align:right;"> 30 </td> + <td style="text-align:right;"> 0 </td> + <td style="text-align:right;"> 14.0 </td> + <td style="text-align:right;"> 8.4 </td> + <td style="text-align:right;"> 1.0 </td> + <td style="text-align:right;"> 13.0 </td> + <td style="text-align:right;"> 30.0 </td> + <td style="text-align:right;"> <svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" class="svglite" width="48.00pt" height="12.00pt" viewBox="0 0 48.00 12.00"><defs><style type="text/css"> + .svglite line, .svglite polyline, .svglite polygon, .svglite path, .svglite rect, .svglite circle { + fill: none; + stroke: #000000; + stroke-linecap: round; + stroke-linejoin: round; + stroke-miterlimit: 10.00; + } + .svglite text { + white-space: pre; + } + </style></defs><rect width="100%" height="100%" style="stroke: none; fill: none;"></rect><defs><clipPath id="cpMC4wMHw0OC4wMHwwLjAwfDEyLjAw"><rect x="0.00" y="0.00" width="48.00" height="12.00"></rect></clipPath></defs><g clip-path="url(#cpMC4wMHw0OC4wMHwwLjAwfDEyLjAw)"> +</g><defs><clipPath id="cpMC4wMHw0OC4wMHwyLjg4fDEyLjAw"><rect x="0.00" y="2.88" width="48.00" height="9.12"></rect></clipPath></defs><g clip-path="url(#cpMC4wMHw0OC4wMHwyLjg4fDEyLjAw)"><rect x="0.25" y="6.61" width="3.07" height="5.05" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="3.31" y="5.27" width="3.07" height="6.39" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="6.38" y="3.22" width="3.07" height="8.44" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="9.44" y="5.82" width="3.07" height="5.84" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="12.51" y="5.43" width="3.07" height="6.23" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="15.57" y="5.98" width="3.07" height="5.68" style="stroke-width: 0.38; fill: 
#000000;"></rect><rect x="18.64" y="6.69" width="3.07" height="4.97" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="21.70" y="6.06" width="3.07" height="5.60" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="24.77" y="6.53" width="3.07" height="5.13" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="27.83" y="7.72" width="3.07" height="3.95" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="30.90" y="7.08" width="3.07" height="4.58" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="33.96" y="6.77" width="3.07" height="4.89" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="37.03" y="6.61" width="3.07" height="5.05" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="40.09" y="7.24" width="3.07" height="4.42" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="43.16" y="8.98" width="3.07" height="2.68" style="stroke-width: 0.38; fill: #000000;"></rect></g></svg> +</td> + </tr> + <tr> + <td style="text-align:left;"> income </td> + <td style="text-align:right;"> 721 </td> + <td style="text-align:right;"> 0 </td> + <td style="text-align:right;"> 3283.9 </td> + <td style="text-align:right;"> 8242.4 </td> + <td style="text-align:right;"> 0.0 </td> + <td style="text-align:right;"> 906.8 </td> + <td style="text-align:right;"> 139394.5 </td> + <td style="text-align:right;"> <svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" class="svglite" width="48.00pt" height="12.00pt" viewBox="0 0 48.00 12.00"><defs><style type="text/css"> + .svglite line, .svglite polyline, .svglite polygon, .svglite path, .svglite rect, .svglite circle { + fill: none; + stroke: #000000; + stroke-linecap: round; + stroke-linejoin: round; + stroke-miterlimit: 10.00; + } + .svglite text { + white-space: pre; + } + </style></defs><rect width="100%" height="100%" style="stroke: none; fill: none;"></rect><defs><clipPath id="cpMC4wMHw0OC4wMHwwLjAwfDEyLjAw"><rect x="0.00" y="0.00" width="48.00" 
height="12.00"></rect></clipPath></defs><g clip-path="url(#cpMC4wMHw0OC4wMHwwLjAwfDEyLjAw)"> +</g><defs><clipPath id="cpMC4wMHw0OC4wMHwyLjg4fDEyLjAw"><rect x="0.00" y="2.88" width="48.00" height="9.12"></rect></clipPath></defs><g clip-path="url(#cpMC4wMHw0OC4wMHwyLjg4fDEyLjAw)"><rect x="1.78" y="3.22" width="3.19" height="8.44" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="4.97" y="11.32" width="3.19" height="0.34" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="8.15" y="11.52" width="3.19" height="0.15" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="11.34" y="11.60" width="3.19" height="0.063" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="14.53" y="11.64" width="3.19" height="0.027" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="17.72" y="11.65" width="3.19" height="0.0091" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="20.91" y="11.65" width="3.19" height="0.0091" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="24.10" y="11.65" width="3.19" height="0.0091" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="27.28" y="11.65" width="3.19" height="0.0091" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="30.47" y="11.66" width="3.19" height="0.00" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="33.66" y="11.66" width="3.19" height="0.00" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="36.85" y="11.66" width="3.19" height="0.00" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="40.04" y="11.66" width="3.19" height="0.00" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="43.23" y="11.65" width="3.19" height="0.0091" style="stroke-width: 0.38; fill: #000000;"></rect></g></svg> +</td> + </tr> + <tr> + <td style="text-align:left;"> vat_liability </td> + <td style="text-align:right;"> 721 </td> + <td style="text-align:right;"> 0 </td> + <td style="text-align:right;"> 591.1 </td> + <td style="text-align:right;"> 1483.6 </td> + <td 
style="text-align:right;"> 0.0 </td> + <td style="text-align:right;"> 163.2 </td> + <td style="text-align:right;"> 25091.0 </td> + <td style="text-align:right;"> <svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" class="svglite" width="48.00pt" height="12.00pt" viewBox="0 0 48.00 12.00"><defs><style type="text/css"> + .svglite line, .svglite polyline, .svglite polygon, .svglite path, .svglite rect, .svglite circle { + fill: none; + stroke: #000000; + stroke-linecap: round; + stroke-linejoin: round; + stroke-miterlimit: 10.00; + } + .svglite text { + white-space: pre; + } + </style></defs><rect width="100%" height="100%" style="stroke: none; fill: none;"></rect><defs><clipPath id="cpMC4wMHw0OC4wMHwwLjAwfDEyLjAw"><rect x="0.00" y="0.00" width="48.00" height="12.00"></rect></clipPath></defs><g clip-path="url(#cpMC4wMHw0OC4wMHwwLjAwfDEyLjAw)"> +</g><defs><clipPath id="cpMC4wMHw0OC4wMHwyLjg4fDEyLjAw"><rect x="0.00" y="2.88" width="48.00" height="9.12"></rect></clipPath></defs><g clip-path="url(#cpMC4wMHw0OC4wMHwyLjg4fDEyLjAw)"><rect x="1.78" y="3.22" width="3.54" height="8.44" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="5.32" y="11.38" width="3.54" height="0.29" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="8.86" y="11.56" width="3.54" height="0.099" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="12.41" y="11.60" width="3.54" height="0.063" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="15.95" y="11.64" width="3.54" height="0.018" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="19.49" y="11.65" width="3.54" height="0.0090" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="23.03" y="11.64" width="3.54" height="0.018" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="26.58" y="11.65" width="3.54" height="0.0090" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="30.12" y="11.66" width="3.54" height="0.00" style="stroke-width: 0.38; fill: 
#000000;"></rect><rect x="33.66" y="11.66" width="3.54" height="0.00" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="37.20" y="11.66" width="3.54" height="0.00" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="40.75" y="11.66" width="3.54" height="0.00" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="44.29" y="11.65" width="3.54" height="0.0090" style="stroke-width: 0.38; fill: #000000;"></rect></g></svg> +</td> + </tr> +</tbody> +</table> --- @@ -493,23 +677,241 @@ ```r -datasummary_skim(XXXXX, type = "categorical") +datasummary_skim(small_business_2019_all, type = "categorical") ``` -DROP EVAL=FALSE HERE SO RESULT IS DISPLAYED - --- # Quick summary statistics -- `datasummary_skim()` is convenient because it's fast, easy, and shows a lot of information +<table class="table" style="width: auto !important; margin-left: auto; margin-right: auto;"> + <thead> + <tr> + <th style="text-align:left;"> </th> + <th style="text-align:left;"> </th> + <th style="text-align:right;"> N </th> + <th style="text-align:right;"> % </th> + </tr> + </thead> +<tbody> + <tr> + <td style="text-align:left;"> region </td> + <td style="text-align:left;"> Guria </td> + <td style="text-align:right;"> 259 </td> + <td style="text-align:right;"> 25.9 </td> + </tr> + <tr> + <td style="text-align:left;"> </td> + <td style="text-align:left;"> ImereTI-Racha-Lechkhum-kv.SvaneTi </td> + <td style="text-align:right;"> 37 </td> + <td style="text-align:right;"> 3.7 </td> + </tr> + <tr> + <td style="text-align:left;"> </td> + <td style="text-align:left;"> KaxeTi </td> + <td style="text-align:right;"> 270 </td> + <td style="text-align:right;"> 27.0 </td> + </tr> + <tr> + <td style="text-align:left;"> </td> + <td style="text-align:left;"> Kvemo KarTli </td> + <td style="text-align:right;"> 9 </td> + <td style="text-align:right;"> 0.9 </td> + </tr> + <tr> + <td style="text-align:left;"> </td> + <td style="text-align:left;"> Samegrelo-Z.SvaneTi </td> + <td 
style="text-align:right;"> 28 </td> + <td style="text-align:right;"> 2.8 </td> + </tr> + <tr> + <td style="text-align:left;"> </td> + <td style="text-align:left;"> Samtskhe-Javakheti </td> + <td style="text-align:right;"> 7 </td> + <td style="text-align:right;"> 0.7 </td> + </tr> + <tr> + <td style="text-align:left;"> </td> + <td style="text-align:left;"> Shida KarTli </td> + <td style="text-align:right;"> 17 </td> + <td style="text-align:right;"> 1.7 </td> + </tr> + <tr> + <td style="text-align:left;"> </td> + <td style="text-align:left;"> Tbilisi </td> + <td style="text-align:right;"> 373 </td> + <td style="text-align:right;"> 37.3 </td> + </tr> + <tr> + <td style="text-align:left;"> group </td> + <td style="text-align:left;"> No notifications to be sent </td> + <td style="text-align:right;"> 286 </td> + <td style="text-align:right;"> 28.6 </td> + </tr> + <tr> + <td style="text-align:left;"> </td> + <td style="text-align:left;"> Notification sent on Day 13 and Day 15 </td> + <td style="text-align:right;"> 226 </td> + <td style="text-align:right;"> 22.6 </td> + </tr> + <tr> + <td style="text-align:left;"> </td> + <td style="text-align:left;"> Notification sent only on Day 13 </td> + <td style="text-align:right;"> 247 </td> + <td style="text-align:right;"> 24.7 </td> + </tr> + <tr> + <td style="text-align:left;"> </td> + <td style="text-align:left;"> Notification sent only on Day 15 </td> + <td style="text-align:right;"> 241 </td> + <td style="text-align:right;"> 24.1 </td> + </tr> +</tbody> +</table> +--- -```r -datasummary_skim(XXXXX) -``` +# Quick summary statistics + +- `datasummary_skim()` is convenient because it's fast, easy, and shows a lot of information -DROP EVAL=FALSE HERE SO RESULT IS DISPLAYED +<table class="table" style="width: auto !important; margin-left: auto; margin-right: auto;"> + <thead> + <tr> + <th style="text-align:left;"> </th> + <th style="text-align:right;"> Unique (#) </th> + <th style="text-align:right;"> Missing (%) </th> + <th 
style="text-align:right;"> Mean </th> + <th style="text-align:right;"> SD </th> + <th style="text-align:right;"> Min </th> + <th style="text-align:right;"> Median </th> + <th style="text-align:right;"> Max </th> + <th style="text-align:right;"> </th> + </tr> + </thead> +<tbody> + <tr> + <td style="text-align:left;"> modified_id </td> + <td style="text-align:right;"> 984 </td> + <td style="text-align:right;"> 0 </td> + <td style="text-align:right;"> 5448915.1 </td> + <td style="text-align:right;"> 3758602.4 </td> + <td style="text-align:right;"> 19832.0 </td> + <td style="text-align:right;"> 5008712.0 </td> + <td style="text-align:right;"> 12296912.0 </td> + <td style="text-align:right;"> <svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" class="svglite" width="48.00pt" height="12.00pt" viewBox="0 0 48.00 12.00"><defs><style type="text/css"> + .svglite line, .svglite polyline, .svglite polygon, .svglite path, .svglite rect, .svglite circle { + fill: none; + stroke: #000000; + stroke-linecap: round; + stroke-linejoin: round; + stroke-miterlimit: 10.00; + } + .svglite text { + white-space: pre; + } + </style></defs><rect width="100%" height="100%" style="stroke: none; fill: none;"></rect><defs><clipPath id="cpMC4wMHw0OC4wMHwwLjAwfDEyLjAw"><rect x="0.00" y="0.00" width="48.00" height="12.00"></rect></clipPath></defs><g clip-path="url(#cpMC4wMHw0OC4wMHwwLjAwfDEyLjAw)"> +</g><defs><clipPath id="cpMC4wMHw0OC4wMHwyLjg4fDEyLjAw"><rect x="0.00" y="2.88" width="48.00" height="9.12"></rect></clipPath></defs><g clip-path="url(#cpMC4wMHw0OC4wMHwyLjg4fDEyLjAw)"><rect x="1.71" y="3.22" width="3.62" height="8.44" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="5.33" y="7.10" width="3.62" height="4.56" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="8.95" y="7.52" width="3.62" height="4.14" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="12.57" y="7.52" width="3.62" height="4.14" style="stroke-width: 0.38; fill: 
#000000;"></rect><rect x="16.19" y="6.84" width="3.62" height="4.83" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="19.81" y="7.05" width="3.62" height="4.62" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="23.43" y="7.94" width="3.62" height="3.72" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="27.05" y="9.04" width="3.62" height="2.62" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="30.67" y="9.72" width="3.62" height="1.94" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="34.29" y="9.20" width="3.62" height="2.47" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="37.91" y="4.95" width="3.62" height="6.71" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="41.53" y="8.04" width="3.62" height="3.62" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="45.15" y="11.03" width="3.62" height="0.63" style="stroke-width: 0.38; fill: #000000;"></rect></g></svg> +</td> + </tr> + <tr> + <td style="text-align:left;"> taxperiod </td> + <td style="text-align:right;"> 12 </td> + <td style="text-align:right;"> 0 </td> + <td style="text-align:right;"> 201906.7 </td> + <td style="text-align:right;"> 3.4 </td> + <td style="text-align:right;"> 201901.0 </td> + <td style="text-align:right;"> 201907.0 </td> + <td style="text-align:right;"> 201912.0 </td> + <td style="text-align:right;"> <svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" class="svglite" width="48.00pt" height="12.00pt" viewBox="0 0 48.00 12.00"><defs><style type="text/css"> + .svglite line, .svglite polyline, .svglite polygon, .svglite path, .svglite rect, .svglite circle { + fill: none; + stroke: #000000; + stroke-linecap: round; + stroke-linejoin: round; + stroke-miterlimit: 10.00; + } + .svglite text { + white-space: pre; + } + </style></defs><rect width="100%" height="100%" style="stroke: none; fill: none;"></rect><defs><clipPath id="cpMC4wMHw0OC4wMHwwLjAwfDEyLjAw"><rect x="0.00" y="0.00" width="48.00" 
height="12.00"></rect></clipPath></defs><g clip-path="url(#cpMC4wMHw0OC4wMHwwLjAwfDEyLjAw)"> +</g><defs><clipPath id="cpMC4wMHw0OC4wMHwyLjg4fDEyLjAw"><rect x="0.00" y="2.88" width="48.00" height="9.12"></rect></clipPath></defs><g clip-path="url(#cpMC4wMHw0OC4wMHwyLjg4fDEyLjAw)"><rect x="1.78" y="3.22" width="4.04" height="8.44" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="5.82" y="7.44" width="4.04" height="4.22" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="9.86" y="8.19" width="4.04" height="3.47" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="13.90" y="6.64" width="4.04" height="5.02" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="17.94" y="7.17" width="4.04" height="4.49" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="21.98" y="8.03" width="4.04" height="3.63" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="26.02" y="6.85" width="4.04" height="4.81" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="30.06" y="7.12" width="4.04" height="4.54" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="34.10" y="6.00" width="4.04" height="5.67" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="38.14" y="6.91" width="4.04" height="4.76" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="42.18" y="7.28" width="4.04" height="4.38" style="stroke-width: 0.38; fill: #000000;"></rect></g></svg> +</td> + </tr> + <tr> + <td style="text-align:left;"> age </td> + <td style="text-align:right;"> 30 </td> + <td style="text-align:right;"> 0 </td> + <td style="text-align:right;"> 14.0 </td> + <td style="text-align:right;"> 8.4 </td> + <td style="text-align:right;"> 1.0 </td> + <td style="text-align:right;"> 13.0 </td> + <td style="text-align:right;"> 30.0 </td> + <td style="text-align:right;"> <svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" class="svglite" width="48.00pt" height="12.00pt" viewBox="0 0 48.00 12.00"><defs><style type="text/css"> + 
.svglite line, .svglite polyline, .svglite polygon, .svglite path, .svglite rect, .svglite circle { + fill: none; + stroke: #000000; + stroke-linecap: round; + stroke-linejoin: round; + stroke-miterlimit: 10.00; + } + .svglite text { + white-space: pre; + } + </style></defs><rect width="100%" height="100%" style="stroke: none; fill: none;"></rect><defs><clipPath id="cpMC4wMHw0OC4wMHwwLjAwfDEyLjAw"><rect x="0.00" y="0.00" width="48.00" height="12.00"></rect></clipPath></defs><g clip-path="url(#cpMC4wMHw0OC4wMHwwLjAwfDEyLjAw)"> +</g><defs><clipPath id="cpMC4wMHw0OC4wMHwyLjg4fDEyLjAw"><rect x="0.00" y="2.88" width="48.00" height="9.12"></rect></clipPath></defs><g clip-path="url(#cpMC4wMHw0OC4wMHwyLjg4fDEyLjAw)"><rect x="0.25" y="6.61" width="3.07" height="5.05" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="3.31" y="5.27" width="3.07" height="6.39" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="6.38" y="3.22" width="3.07" height="8.44" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="9.44" y="5.82" width="3.07" height="5.84" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="12.51" y="5.43" width="3.07" height="6.23" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="15.57" y="5.98" width="3.07" height="5.68" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="18.64" y="6.69" width="3.07" height="4.97" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="21.70" y="6.06" width="3.07" height="5.60" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="24.77" y="6.53" width="3.07" height="5.13" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="27.83" y="7.72" width="3.07" height="3.95" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="30.90" y="7.08" width="3.07" height="4.58" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="33.96" y="6.77" width="3.07" height="4.89" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="37.03" y="6.61" width="3.07" height="5.05" 
style="stroke-width: 0.38; fill: #000000;"></rect><rect x="40.09" y="7.24" width="3.07" height="4.42" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="43.16" y="8.98" width="3.07" height="2.68" style="stroke-width: 0.38; fill: #000000;"></rect></g></svg> +</td> + </tr> + <tr> + <td style="text-align:left;"> income </td> + <td style="text-align:right;"> 721 </td> + <td style="text-align:right;"> 0 </td> + <td style="text-align:right;"> 3283.9 </td> + <td style="text-align:right;"> 8242.4 </td> + <td style="text-align:right;"> 0.0 </td> + <td style="text-align:right;"> 906.8 </td> + <td style="text-align:right;"> 139394.5 </td> + <td style="text-align:right;"> <svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" class="svglite" width="48.00pt" height="12.00pt" viewBox="0 0 48.00 12.00"><defs><style type="text/css"> + .svglite line, .svglite polyline, .svglite polygon, .svglite path, .svglite rect, .svglite circle { + fill: none; + stroke: #000000; + stroke-linecap: round; + stroke-linejoin: round; + stroke-miterlimit: 10.00; + } + .svglite text { + white-space: pre; + } + </style></defs><rect width="100%" height="100%" style="stroke: none; fill: none;"></rect><defs><clipPath id="cpMC4wMHw0OC4wMHwwLjAwfDEyLjAw"><rect x="0.00" y="0.00" width="48.00" height="12.00"></rect></clipPath></defs><g clip-path="url(#cpMC4wMHw0OC4wMHwwLjAwfDEyLjAw)"> +</g><defs><clipPath id="cpMC4wMHw0OC4wMHwyLjg4fDEyLjAw"><rect x="0.00" y="2.88" width="48.00" height="9.12"></rect></clipPath></defs><g clip-path="url(#cpMC4wMHw0OC4wMHwyLjg4fDEyLjAw)"><rect x="1.78" y="3.22" width="3.19" height="8.44" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="4.97" y="11.32" width="3.19" height="0.34" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="8.15" y="11.52" width="3.19" height="0.15" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="11.34" y="11.60" width="3.19" height="0.063" style="stroke-width: 0.38; fill: #000000;"></rect><rect 
x="14.53" y="11.64" width="3.19" height="0.027" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="17.72" y="11.65" width="3.19" height="0.0091" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="20.91" y="11.65" width="3.19" height="0.0091" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="24.10" y="11.65" width="3.19" height="0.0091" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="27.28" y="11.65" width="3.19" height="0.0091" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="30.47" y="11.66" width="3.19" height="0.00" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="33.66" y="11.66" width="3.19" height="0.00" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="36.85" y="11.66" width="3.19" height="0.00" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="40.04" y="11.66" width="3.19" height="0.00" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="43.23" y="11.65" width="3.19" height="0.0091" style="stroke-width: 0.38; fill: #000000;"></rect></g></svg> +</td> + </tr> + <tr> + <td style="text-align:left;"> vat_liability </td> + <td style="text-align:right;"> 721 </td> + <td style="text-align:right;"> 0 </td> + <td style="text-align:right;"> 591.1 </td> + <td style="text-align:right;"> 1483.6 </td> + <td style="text-align:right;"> 0.0 </td> + <td style="text-align:right;"> 163.2 </td> + <td style="text-align:right;"> 25091.0 </td> + <td style="text-align:right;"> <svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" class="svglite" width="48.00pt" height="12.00pt" viewBox="0 0 48.00 12.00"><defs><style type="text/css"> + .svglite line, .svglite polyline, .svglite polygon, .svglite path, .svglite rect, .svglite circle { + fill: none; + stroke: #000000; + stroke-linecap: round; + stroke-linejoin: round; + stroke-miterlimit: 10.00; + } + .svglite text { + white-space: pre; + } + </style></defs><rect width="100%" height="100%" style="stroke: none; fill: 
none;"></rect><defs><clipPath id="cpMC4wMHw0OC4wMHwwLjAwfDEyLjAw"><rect x="0.00" y="0.00" width="48.00" height="12.00"></rect></clipPath></defs><g clip-path="url(#cpMC4wMHw0OC4wMHwwLjAwfDEyLjAw)"> +</g><defs><clipPath id="cpMC4wMHw0OC4wMHwyLjg4fDEyLjAw"><rect x="0.00" y="2.88" width="48.00" height="9.12"></rect></clipPath></defs><g clip-path="url(#cpMC4wMHw0OC4wMHwyLjg4fDEyLjAw)"><rect x="1.78" y="3.22" width="3.54" height="8.44" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="5.32" y="11.38" width="3.54" height="0.29" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="8.86" y="11.56" width="3.54" height="0.099" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="12.41" y="11.60" width="3.54" height="0.063" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="15.95" y="11.64" width="3.54" height="0.018" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="19.49" y="11.65" width="3.54" height="0.0090" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="23.03" y="11.64" width="3.54" height="0.018" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="26.58" y="11.65" width="3.54" height="0.0090" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="30.12" y="11.66" width="3.54" height="0.00" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="33.66" y="11.66" width="3.54" height="0.00" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="37.20" y="11.66" width="3.54" height="0.00" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="40.75" y="11.66" width="3.54" height="0.00" style="stroke-width: 0.38; fill: #000000;"></rect><rect x="44.29" y="11.65" width="3.54" height="0.0090" style="stroke-width: 0.38; fill: #000000;"></rect></g></svg> +</td> + </tr> +</tbody> +</table> - But what if we wanted to customize what to show? 
That's when we use `datasummary()` instead, also from the library `modelsummary` @@ -518,13 +920,13 @@ class: inverse, center, middle name: customized-summary-stats -# Customized summary statistics // +# Customized summary statistics // მორგებული შემაჯამებელი სტატისტიკა <html><div style='float:left'></div><hr color='#D38C28' size=1px width=1100px></html> --- -# Customized summary statistics // +# Customized summary statistics `datasummary()` is very similar to `datasummary_skim()`. The only difference is that it requires a **formula argument**. @@ -548,45 +950,74 @@ --- -# Customized summary statistics // +# Customized summary statistics ## Exercise 4: -Create a summary statistics table showing the nuber of observations, mean, standard deviation, minimum, and maximum for variables XXX, XXX, XXX, and XXX of the dataframe `XXX` +Create a summary statistics table showing the number of observations, mean, standard deviation, minimum, and maximum for variables `age`, `income`, and `vat_liability` of the dataframe `small_business_2019_all` 1. 
Use `datasummary()` for this: ```r datasummary( - XX + XX + XX ~ N + Mean + SD + Min + Max, - XXXXX + age + income + vat_liability ~ N + Mean + SD + Min + Max, + small_business_2019_all ) ``` --- -# Customized summary statistics // - - -```r -datasummary( - XX + XX + XX ~ N + Mean + SD + Min + Max, - XXXXX -) -``` - -REMOVE EVAL= FALSE O SHOW RESULTS +# Customized summary statistics + +<table class="table" style="width: auto !important; margin-left: auto; margin-right: auto;"> + <thead> + <tr> + <th style="text-align:left;"> </th> + <th style="text-align:right;"> N </th> + <th style="text-align:right;"> Mean </th> + <th style="text-align:right;"> SD </th> + <th style="text-align:right;"> Min </th> + <th style="text-align:right;"> Max </th> + </tr> + </thead> +<tbody> + <tr> + <td style="text-align:left;"> age </td> + <td style="text-align:right;"> 1000 </td> + <td style="text-align:right;"> 14.00 </td> + <td style="text-align:right;"> 8.37 </td> + <td style="text-align:right;"> 1.00 </td> + <td style="text-align:right;"> 30.00 </td> + </tr> + <tr> + <td style="text-align:left;"> income </td> + <td style="text-align:right;"> 1000 </td> + <td style="text-align:right;"> 3283.87 </td> + <td style="text-align:right;"> 8242.45 </td> + <td style="text-align:right;"> 0.00 </td> + <td style="text-align:right;"> 139394.52 </td> + </tr> + <tr> + <td style="text-align:left;"> vat_liability </td> + <td style="text-align:right;"> 1000 </td> + <td style="text-align:right;"> 591.10 </td> + <td style="text-align:right;"> 1483.64 </td> + <td style="text-align:right;"> 0.00 </td> + <td style="text-align:right;"> 25091.01 </td> + </tr> +</tbody> +</table> --- -# Customized summary statistics // +# Customized summary statistics ```r datasummary( - XX + XX + XX ~ N + Mean + SD + Min + Max, # this is the formula - XXXXXX # this is the data + age + income + vat_liability ~ N + Mean + SD + Min + Max, # this is the formula + small_business_2019_all # this is the data ) ``` @@ -599,13 
+1030,13 @@ --- -# Customized summary statistics // +# Customized summary statistics ```r datasummary( - XX + XX + XX ~ N + Mean + SD + Min + Max, # this is the formula - XXXXXX # this is the data + age + income + vat_liability ~ N + Mean + SD + Min + Max, # this is the formula + small_business_2019_all # this is the data ) ``` @@ -624,21 +1055,21 @@ class: inverse, center, middle name: exporting-tables -# Exporting tables // +# Exporting tables // მაგიდების ექსპორტი <html><div style='float:left'></div><hr color='#D38C28' size=1px width=1100px></html> --- -# Exporting tables // +# Exporting tables // მაგიდების ექსპორტი -Remember that both `datasummary_skim()` and `datasummary()` had an optional argument named *output*? We can use it to specify a file path for an output file. +Remember that both `datasummary_skim()` and `datasummary()` have an optional argument named *output*? We can use it to specify a file path for an output file. For example: ```r -datasummary_skim(XXXXX, +datasummary_skim(small_business_2019_all, output = "quick_stats.docx") ``` @@ -646,7 +1077,7 @@ --- -# Exporting tables // +# Exporting tables // მაგიდების ექსპორტი The file type of the output is dictated by the file extension. For example: @@ -662,7 +1093,7 @@ --- -# Exporting tables // +# Exporting tables // მაგიდების ექსპორტი ## That's because the functions of `modelsummary` can't export to Excel @@ -674,40 +1105,50 @@ --- -# Exporting tables // +# Exporting tables // მაგიდების ექსპორტი ## Exercise 5: Export a table to Excel +1. Load `huxtable` with `library(huxtable)` + 1. 
Run the following code to export the result of `datasummary_skim()` to Excel: ```r # Store the table in a new object -stats_table <- datasummary_skim(XXXXX, output = "huxtable") +stats_table <- datasummary_skim(small_business_2019_all, output = "huxtable") # Export this new object to Excel with quick_xlsx() -quick_xlsx(stats_table, "quick_stats.xlsx") +quick_xlsx(stats_table, file = "quick_stats.xlsx") ``` --- -# Exporting tables // +# Exporting tables // მაგიდების ექსპორტი Now the result will show in your `Documents` folder -ADD SCREENSHOT +<img src="img/session3/quick-stats-output.png" width="65%" style="display: block; margin: auto;" /> + +--- + +# Exporting tables // მაგიდების ექსპორტი + +And you can open it with Excel for further customization if you want + +<img src="img/session3/quick-stats-excel.png" width="65%" style="display: block; margin: auto;" /> --- -# Exporting tables // +# Exporting tables // მაგიდების ექსპორტი ```r # Store the table in a new object -stats_table <- datasummary_skim(XXXXX, output = "huxtable") +stats_table <- datasummary_skim(small_business_2019_all, output = "huxtable") # Export this new object to Excel with quick_xlsx() -quick_xlsx(stats_table, "quick_stats.xlsx") +quick_xlsx(stats_table, file = "quick_stats.xlsx") ``` Some comments about this code: @@ -721,15 +1162,15 @@ class: inverse, center, middle name: customizing-table-outputs -# Customizing table outputs // +# Customizing table outputs // ცხრილის შედეგების მორგება <html><div style='float:left'></div><hr color='#D38C28' size=1px width=1100px></html> --- -# Customizing table outputs // +# Customizing table outputs -The code belows shows how the table `stats_table` can be formatted: +The code below shows how the table `stats_table` can be formatted: .pull-left[ @@ -751,27 +1192,32 @@ .pull-right[ .small[ + + + + + + + + + + + + + +
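| | Unique (#) | Missing (%) | Mean | SD | Min | Median | Max |
|---|---|---|---|---|---|---|---|
| modified_id | 984 | 0 | 5448915 | 3758602 | 19832 | 5008712 | 12296912 |
| taxperiod | 12 | 0 | 201907 | 3 | 201901 | 201907 | 201912 |
| age | 30 | 0 | 14 | 8 | 1 | 13 | 30 |
| income | 721 | 0 | 3284 | 8242 | 0 | 907 | 139395 |
| vat_liability | 721 | 0 | 591 | 1484 | 0 | 163 | 25091 |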
-```r -# Format table - stats_table %>% - set_header_rows(1, TRUE) %>% # Use first row as table header - set_header_cols(1, TRUE) %>% # Use first column as row header - set_number_format(everywhere, 2:ncol(.), "%9.0f") %>% # Don't round large numbers - set_align(1, everywhere, "center") %>% # Centralize cells in first row - theme_basic() # Set a theme for quick formatting -``` ] ] --- -# Customizing table outputs // +# Customizing table outputs ## Exercise 6: Export a customized table to Excel .pull-left[ -1. Customize `stats_table` in a new object called `stats_table_custom` +1.- Customize `stats_table` in a new object called `stats_table_custom` ```r @@ -790,29 +1236,26 @@ ] .pull-right[ -2. Export `stats_table_custom` to a file named `stats-custom.xlsx` with `quick_xlsx()` +2.- Export `stats_table_custom` to a file named `stats-custom.xlsx` with `quick_xlsx()` ```r quick_xlsx( stats_table_custom, - "stats-custom.xlsx" + file = "stats-custom.xlsx" ) -) ``` ] --- -# Customizing table outputs // - -SCREENSHOT OF OUTPUT IN DOCUMENTS FOLDER +# Customizing table outputs -ALSO SCREENSHOT OF TABLE OPENED IN EXCEL +<img src="img/session3/stats-custom.png" width="55%" style="display: block; margin: auto;" /> --- -# Customizing table outputs // +# Customizing table outputs Notice that here in the first part of the exercise we stored the result in a new object @@ -832,86 +1275,98 @@ ```r quick_xlsx( stats_table_custom, - "stats-custom.xlsx" + file = "stats-custom.xlsx" ) -) ``` --- -# Customizing table outputs // +# Customizing table outputs .pull-left[ -## Before +**Before:** + +<img src="img/session3/quick-stats-excel.png" width="95%" style="display: block; margin: auto;" /> ] .pull-right[ -## After +**After:** +<img src="img/session3/stats-custom.png" width="95%" style="display: block; margin: auto;" /> ] --- -# Customizing table outputs // +# Customizing table outputs -We used `theme_basic()` to give a minimalist, basic theme to the table. 
Other available themes are: +We used `theme_basic()` to give a minimalistic, basic theme to the table. Other available themes are: -ADD IMAGE +<img src="img/session3/themes.png" width="75%" style="display: block; margin: auto;" /> --- class: inverse, center, middle name: wrapping-up -# Wrapping up // +# Wrapping up // შეფუთვა <html><div style='float:left'></div><hr color='#D38C28' size=1px width=1100px></html> --- -# Wrapping up // +# Wrapping up // შეფუთვა ## Save your work! Click the floppy disk to save the script you wrote in this session. -ADD IMAGE +<img src="img/session3/save.png" width="55%" style="display: block; margin: auto;" /> --- -# Wrapping up // +# Wrapping up // შეფუთვა ## What else is available? - This was a short overview of how `modelsummary` and `huxtable` work together to produce professional-looking table outputs in R -- Other formatting options are: +- Other formatting options are: (all from `huxtable`) - + Export in new Excel tabs instead of new files - + Change variable and row names - + Change the width of columns or height of rows - + Color cells - + Use text formatting like bold or italics - -- This is explained in the libraries documentation: +| Formatting | Command | +|------------|---------| +|Export in new Excel tabs instead of new files|`as_Workbook()`| +|Change row names|`add_rownames()`| +|Change column names|`add_colnames()`| +|Cells in bold|`set_bold()`| +|Cells in italics|`set_italic()`| +|Cell font size|`font_size()`| +|Cell color|`background_color()`| + +--- + +# Wrapping up // შეფუთვა + +## What else is available? 
+ +More of this is explained in the libraries documentation: + `modelsummary` documentation: https://modelsummary.com/index.html + `huxtable` documentation: https://hughjonesd.github.io/huxtable/ --- -# Wrapping up // +# Wrapping up // შეფუთვა ## This session -ADD IMAGE OF PIPELINE ENDING IN EXCEL EXPORTED OUTPUT +<img src="img/session3/data-work-descriptive-stats.png" width="95%" style="display: block; margin: auto;" /> --- -# Wrapping up // +# Wrapping up // შეფუთვა ## Next session (last one) -ADD IMAGE OF PIPELINE ENDING IN VISUALIZATIONS +<img src="img/session4/data-work-data-vis.png" width="95%" style="display: block; margin: auto;" /> --- diff --git a/Presentations-GeorgiaRS/202309/exercises-session2.R b/Presentations-GeorgiaRS/202309/exercises-session2.R index 7870324..51e10e7 100644 --- a/Presentations-GeorgiaRS/202309/exercises-session2.R +++ b/Presentations-GeorgiaRS/202309/exercises-session2.R @@ -1,3 +1,7 @@ +# Data +small_business_2019 <- read.csv("data/small_business_2019.csv") +small_business_2019_age <- read.csv("data/small_business_2019_age.csv") + # Exercise 3 library(dplyr) diff --git a/Presentations-GeorgiaRS/202309/exercises-session3.R b/Presentations-GeorgiaRS/202309/exercises-session3.R new file mode 100644 index 0000000..74a1d32 --- /dev/null +++ b/Presentations-GeorgiaRS/202309/exercises-session3.R @@ -0,0 +1,40 @@ +# Data +small_business_2019_all <- read.csv("data/small_business_2019_all.csv") +View(small_business_2019_all) + +# Exercise 1 +#install.packages("modelsummary") +#install.packages("huxtable") + +# Exercise 2 +df_tbilisi_50 <- filter(small_business_2019, + region == "Tbilisi") %>% + arrange(-income) %>% + filter(row_number() <= 50) + +# Exercise 3 +library(modelsummary) +datasummary_skim(small_business_2019_all) + +# Exercise 4 +datasummary( + age + income + vat_liability ~ N + Mean + SD + Min + Max, + small_business_2019_all +) + +# Exercise 5 +library(huxtable) +stats_table <- datasummary_skim(small_business_2019_all, output = 
"huxtable") +quick_xlsx(stats_table, file = "quick_stats.xlsx") + +# Exercise 6 +stats_table_custom <- stats_table %>% + set_header_rows(1, TRUE) %>% + set_header_cols(1, TRUE) %>% + set_number_format(everywhere, 2:ncol(.), "%9.0f") %>% + set_align(1, everywhere, "center") %>% + theme_basic() +quick_xlsx( + stats_table_custom, + file = "stats-custom.xlsx" +) \ No newline at end of file diff --git a/Presentations-GeorgiaRS/202309/exercises-session4.R b/Presentations-GeorgiaRS/202309/exercises-session4.R new file mode 100644 index 0000000..8d0ddf7 --- /dev/null +++ b/Presentations-GeorgiaRS/202309/exercises-session4.R @@ -0,0 +1,125 @@ +# Data +small_business_2019_all <- read.csv("data/small_business_2019_all.csv") + +# Exercise 1a +library(ggplot2) +ggplot(small_business_2019_all) + + aes(x = taxperiod, + y = vat_liability) + + geom_col() + + labs(title = "Total VAT liability of small businesses in 2019 by month") + +# Exercise 1b +ggplot(small_business_2019_all) + + aes(x = taxperiod, + y = vat_liability) + + geom_col() + + labs(title = "Total VAT liability of small businesses in 2019 by month", + # x-axis title + x = "Month", + # y-axis title + y = "Georgian Lari") + + # telling R not to break the x-axis + scale_x_continuous(breaks = 201901:201912) + + # centering plot title + theme(plot.title = element_text(hjust = 0.5)) + +# Exercise 1c +ggsave("vat_liability_small_2019.png", + width = 20, + height = 10, + units = "cm") + +# Exercise 2a +df_group_month <- small_business_2019_all %>% + select(group, taxperiod, vat_liability) %>% + group_by(group, taxperiod) %>% + summarize(total = sum(vat_liability)) + +# Exercise 2b +ggplot(df_group_month) + + aes(x = taxperiod, + y = total) + + geom_line(aes(color = group)) + + labs(title = "Total VAT liability of small businesses in 2019 by experiment group", + x = "Month", + y = "Georgian Lari") + + scale_x_continuous(breaks = 201901:201912) + + theme(plot.title = element_text(hjust = 0.5)) + +# Exercise 2c + 
ggplot(df_group_month) + + aes(x = taxperiod, + y = total) + + geom_line(aes(color = group)) + + labs(title = "Total VAT liability of small businesses in 2019 by experiment group", + x = "Month", + y = "Georgian Lari") + + scale_x_continuous(breaks = 201901:201912) + + theme(legend.text = element_text(size = 7), + axis.text.x = element_text(size = 6)) + +# Exercise 2d + ggsave("vat_liability_small_2019_by_group.png", + width = 20, + height = 10, + units = "cm") + +# Exercise 3a +df_month <- small_business_2019_all %>% + select(taxperiod, vat_liability) %>% + group_by(taxperiod) %>% + summarize(total = sum(vat_liability)) + +# Exercise 3b +ggplot(df_month) + + aes(x = taxperiod, + y = total) + + geom_col() + + geom_text(aes(label = total), + position = position_dodge(width = 1), + vjust = -0.5, + size = 3) + + labs(title = "Total VAT liability of small businesses in 2019 by month", + x = "Month", + y = "Georgian Lari") + + scale_x_continuous(breaks = 201901:201912) + + theme(plot.title = element_text(hjust = 0.5)) + +# Exercise 3c + ggplot(df_month) + + aes(x = taxperiod, + y = total) + + geom_col() + + geom_text(aes(label = round(total)), + position = position_dodge(width = 1), + vjust = -0.5, + size = 3) + + labs(title = "Total VAT liability of small businesses in 2019 by month", + x = "Month", + y = "Georgian Lari") + + scale_x_continuous(breaks = 201901:201912) + + theme(plot.title = element_text(hjust = 0.5)) + +# Saving +ggsave("vat_liability_small_2019_text.png", + width = 20, + height = 10, + units = "cm") + +# Exercise 4 + ggplot(small_business_2019_all) + + aes(x = age, + y = vat_liability) + + geom_point() + + labs(title = "VAT liability versus age for small businesses in 2019", + x = "Age of firm (years)", + y = "VAT liability") + + theme(plot.title = element_text(hjust = 0.5)) + +# Saving + ggsave("scatter_age_vat.png", + width = 20, + height = 10, + units = "cm") + \ No newline at end of file diff --git 
a/Presentations-GeorgiaRS/202309/img/session3/csv-to-excel.png b/Presentations-GeorgiaRS/202309/img/session3/csv-to-excel.png new file mode 100644 index 0000000..11a4812 Binary files /dev/null and b/Presentations-GeorgiaRS/202309/img/session3/csv-to-excel.png differ diff --git a/Presentations-GeorgiaRS/202309/img/session3/data-work-descriptive-stats.png b/Presentations-GeorgiaRS/202309/img/session3/data-work-descriptive-stats.png new file mode 100644 index 0000000..4fa40cb Binary files /dev/null and b/Presentations-GeorgiaRS/202309/img/session3/data-work-descriptive-stats.png differ diff --git a/Presentations-GeorgiaRS/202309/img/session3/df.png b/Presentations-GeorgiaRS/202309/img/session3/df.png new file mode 100644 index 0000000..4f69d14 Binary files /dev/null and b/Presentations-GeorgiaRS/202309/img/session3/df.png differ diff --git a/Presentations-GeorgiaRS/202309/img/session3/environment.png b/Presentations-GeorgiaRS/202309/img/session3/environment.png new file mode 100644 index 0000000..cf2589c Binary files /dev/null and b/Presentations-GeorgiaRS/202309/img/session3/environment.png differ diff --git a/Presentations-GeorgiaRS/202309/img/session3/import.png b/Presentations-GeorgiaRS/202309/img/session3/import.png new file mode 100644 index 0000000..e081439 Binary files /dev/null and b/Presentations-GeorgiaRS/202309/img/session3/import.png differ diff --git a/Presentations-GeorgiaRS/202309/img/session3/install.png b/Presentations-GeorgiaRS/202309/img/session3/install.png new file mode 100644 index 0000000..f5aae5f Binary files /dev/null and b/Presentations-GeorgiaRS/202309/img/session3/install.png differ diff --git a/Presentations-GeorgiaRS/202309/img/session3/quick-stats-excel.png b/Presentations-GeorgiaRS/202309/img/session3/quick-stats-excel.png new file mode 100644 index 0000000..d5e0a15 Binary files /dev/null and b/Presentations-GeorgiaRS/202309/img/session3/quick-stats-excel.png differ diff --git 
a/Presentations-GeorgiaRS/202309/img/session3/quick-stats-output.png b/Presentations-GeorgiaRS/202309/img/session3/quick-stats-output.png new file mode 100644 index 0000000..5cf3ebd Binary files /dev/null and b/Presentations-GeorgiaRS/202309/img/session3/quick-stats-output.png differ diff --git a/Presentations-GeorgiaRS/202309/img/session3/save.png b/Presentations-GeorgiaRS/202309/img/session3/save.png new file mode 100644 index 0000000..dd2a8bc Binary files /dev/null and b/Presentations-GeorgiaRS/202309/img/session3/save.png differ diff --git a/Presentations-GeorgiaRS/202309/img/session3/stats-custom.png b/Presentations-GeorgiaRS/202309/img/session3/stats-custom.png new file mode 100644 index 0000000..fc78a50 Binary files /dev/null and b/Presentations-GeorgiaRS/202309/img/session3/stats-custom.png differ diff --git a/Presentations-GeorgiaRS/202309/img/session3/themes.png b/Presentations-GeorgiaRS/202309/img/session3/themes.png new file mode 100644 index 0000000..b0e3ef8 Binary files /dev/null and b/Presentations-GeorgiaRS/202309/img/session3/themes.png differ diff --git a/Presentations-GeorgiaRS/202309/libs/kePrint/kePrint.js b/Presentations-GeorgiaRS/202309/libs/kePrint/kePrint.js new file mode 100644 index 0000000..e6fbbfc --- /dev/null +++ b/Presentations-GeorgiaRS/202309/libs/kePrint/kePrint.js @@ -0,0 +1,8 @@ +$(document).ready(function(){ + if (typeof $('[data-toggle="tooltip"]').tooltip === 'function') { + $('[data-toggle="tooltip"]').tooltip(); + } + if (typeof $('[data-toggle="popover"]').popover === 'function') { + $('[data-toggle="popover"]').popover(); + } +}); diff --git a/Presentations-GeorgiaRS/202309/libs/lightable/lightable.css b/Presentations-GeorgiaRS/202309/libs/lightable/lightable.css new file mode 100644 index 0000000..3be3be9 --- /dev/null +++ b/Presentations-GeorgiaRS/202309/libs/lightable/lightable.css @@ -0,0 +1,272 @@ +/*! 
+ * lightable v0.0.1 + * Copyright 2020 Hao Zhu + * Licensed under MIT (https://github.com/haozhu233/kableExtra/blob/master/LICENSE) + */ + +.lightable-minimal { + border-collapse: separate; + border-spacing: 16px 1px; + width: 100%; + margin-bottom: 10px; +} + +.lightable-minimal td { + margin-left: 5px; + margin-right: 5px; +} + +.lightable-minimal th { + margin-left: 5px; + margin-right: 5px; +} + +.lightable-minimal thead tr:last-child th { + border-bottom: 2px solid #00000050; + empty-cells: hide; + +} + +.lightable-minimal tbody tr:first-child td { + padding-top: 0.5em; +} + +.lightable-minimal.lightable-hover tbody tr:hover { + background-color: #f5f5f5; +} + +.lightable-minimal.lightable-striped tbody tr:nth-child(even) { + background-color: #f5f5f5; +} + +.lightable-classic { + border-top: 0.16em solid #111111; + border-bottom: 0.16em solid #111111; + width: 100%; + margin-bottom: 10px; + margin: 10px 5px; +} + +.lightable-classic tfoot tr td { + border: 0; +} + +.lightable-classic tfoot tr:first-child td { + border-top: 0.14em solid #111111; +} + +.lightable-classic caption { + color: #222222; +} + +.lightable-classic td { + padding-left: 5px; + padding-right: 5px; + color: #222222; +} + +.lightable-classic th { + padding-left: 5px; + padding-right: 5px; + font-weight: normal; + color: #222222; +} + +.lightable-classic thead tr:last-child th { + border-bottom: 0.10em solid #111111; +} + +.lightable-classic.lightable-hover tbody tr:hover { + background-color: #F9EEC1; +} + +.lightable-classic.lightable-striped tbody tr:nth-child(even) { + background-color: #f5f5f5; +} + +.lightable-classic-2 { + border-top: 3px double #111111; + border-bottom: 3px double #111111; + width: 100%; + margin-bottom: 10px; +} + +.lightable-classic-2 tfoot tr td { + border: 0; +} + +.lightable-classic-2 tfoot tr:first-child td { + border-top: 3px double #111111; +} + +.lightable-classic-2 caption { + color: #222222; +} + +.lightable-classic-2 td { + padding-left: 5px; + 
padding-right: 5px; + color: #222222; +} + +.lightable-classic-2 th { + padding-left: 5px; + padding-right: 5px; + font-weight: normal; + color: #222222; +} + +.lightable-classic-2 tbody tr:last-child td { + border-bottom: 3px double #111111; +} + +.lightable-classic-2 thead tr:last-child th { + border-bottom: 1px solid #111111; +} + +.lightable-classic-2.lightable-hover tbody tr:hover { + background-color: #F9EEC1; +} + +.lightable-classic-2.lightable-striped tbody tr:nth-child(even) { + background-color: #f5f5f5; +} + +.lightable-material { + min-width: 100%; + white-space: nowrap; + table-layout: fixed; + font-family: Roboto, sans-serif; + border: 1px solid #EEE; + border-collapse: collapse; + margin-bottom: 10px; +} + +.lightable-material tfoot tr td { + border: 0; +} + +.lightable-material tfoot tr:first-child td { + border-top: 1px solid #EEE; +} + +.lightable-material th { + height: 56px; + padding-left: 16px; + padding-right: 16px; +} + +.lightable-material td { + height: 52px; + padding-left: 16px; + padding-right: 16px; + border-top: 1px solid #eeeeee; +} + +.lightable-material.lightable-hover tbody tr:hover { + background-color: #f5f5f5; +} + +.lightable-material.lightable-striped tbody tr:nth-child(even) { + background-color: #f5f5f5; +} + +.lightable-material.lightable-striped tbody td { + border: 0; +} + +.lightable-material.lightable-striped thead tr:last-child th { + border-bottom: 1px solid #ddd; +} + +.lightable-material-dark { + min-width: 100%; + white-space: nowrap; + table-layout: fixed; + font-family: Roboto, sans-serif; + border: 1px solid #FFFFFF12; + border-collapse: collapse; + margin-bottom: 10px; + background-color: #363640; +} + +.lightable-material-dark tfoot tr td { + border: 0; +} + +.lightable-material-dark tfoot tr:first-child td { + border-top: 1px solid #FFFFFF12; +} + +.lightable-material-dark th { + height: 56px; + padding-left: 16px; + padding-right: 16px; + color: #FFFFFF60; +} + +.lightable-material-dark td { + height: 
52px; + padding-left: 16px; + padding-right: 16px; + color: #FFFFFF; + border-top: 1px solid #FFFFFF12; +} + +.lightable-material-dark.lightable-hover tbody tr:hover { + background-color: #FFFFFF12; +} + +.lightable-material-dark.lightable-striped tbody tr:nth-child(even) { + background-color: #FFFFFF12; +} + +.lightable-material-dark.lightable-striped tbody td { + border: 0; +} + +.lightable-material-dark.lightable-striped thead tr:last-child th { + border-bottom: 1px solid #FFFFFF12; +} + +.lightable-paper { + width: 100%; + margin-bottom: 10px; + color: #444; +} + +.lightable-paper tfoot tr td { + border: 0; +} + +.lightable-paper tfoot tr:first-child td { + border-top: 1px solid #00000020; +} + +.lightable-paper thead tr:last-child th { + color: #666; + vertical-align: bottom; + border-bottom: 1px solid #00000020; + line-height: 1.15em; + padding: 10px 5px; +} + +.lightable-paper td { + vertical-align: middle; + border-bottom: 1px solid #00000010; + line-height: 1.15em; + padding: 7px 5px; +} + +.lightable-paper.lightable-hover tbody tr:hover { + background-color: #F9EEC1; +} + +.lightable-paper.lightable-striped tbody tr:nth-child(even) { + background-color: #00000008; +} + +.lightable-paper.lightable-striped tbody td { + border: 0; +} +