-
Notifications
You must be signed in to change notification settings - Fork 2
/
Copy pathREADME.qmd
152 lines (115 loc) · 5.01 KB
/
README.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
---
format:
gfm:
preview-mode: raw
execute:
echo: false
---
<!-- README.md is generated from README.qmd. Please edit that file -->
# Ethiopian Political YouTube Channels Daily Statistics
This repository contains a Python script that collects daily statistics,
such as view count, subscriber count, and video count, from a selected
list of Ethiopian YouTube channels that have political proclivities or
heavily discuss Ethiopian politics.
The script `main.py` has been running since `January 01, 2023`, see
[.`/data/channels-data_01-2023.csv`](./data/channels-data_01-2023.csv).
The repo also includes R scripts for generating charts based on the
collected data. You can use this repo to collect daily statistics of
other YouTube channels, see the [Usage](#usage) section below.
## Table of Contents
- [Overview](#overview)
- [Requirements](#requirements)
- [Installation](#installation)
- [Usage](#usage)
- [Visualizing Data](#visualizing-data)
- [Contributing](#contributing)
## Overview {#overview}
The purpose of this project is to gather daily insights and analyze the
impact of Ethiopian political YouTube channels on their audience. By
collecting and processing data such as view count, subscriber count, and
video count, we can better understand the reach and influence of these
channels in the Ethiopian political landscape, and identify which
channels are more popular.
<details>
<summary>View a snapshot of stats</summary>
```{r}
#| label: example-snapshot-stats
file_path = dir("data/", "channels-data_\\d{4}-\\d{2}[.]csv", full.names = TRUE) |>
sort(decreasing = TRUE) |>
head(1)
stats = read.csv(file_path)
stats$retrievedAt = as.POSIXct(stats$retrievedAt, "utc")
stats$id = with(stats, sprintf("[%s](https://youtube.com/channel/%s)", snippet.title, id)
)
latest_time = stats$retrievedAt[1]
stats[stats$retrievedAt == latest_time, -grep("retrievedAt|snippet\\.title", names(stats), ignore.case = TRUE)] |>
knitr::kable(caption = sprintf("Snapshot of stats as of %s", latest_time))
```
</details>
## Requirements {#requirements}
- Python 3.6 or higher
- [google-api-python-client](https://github.com/googleapis/google-api-python-client)
package
- R dependencies: `data.table` and `ggplot2`
- [Quarto](https://quarto.org) (for rendering chart.qmd)
## Installation {#installation}
1. Clone this repository:
``` bash
git clone https://github.com/eyayaw/ethio-youtube-channels.git
```
2. Change the directory to the cloned repository:
``` bash
cd ethio-youtube-channels
```
3. Install the required Python package:
``` bash
pip install -r requirements.txt
```
# Usage {#usage}
1. To use the script, you'll need to obtain a Google API key. Follow
the instructions in the [Google API Console
documentation](https://developers.google.com/youtube/registering_an_application)
to create a project and obtain an API key. Please read the [YouTube
Data API getting
started](https://developers.google.com/youtube/v3/getting-started).
2. Set the API key as an environment variable named
`YOUTUBE_DATA_API_V3_KEY` :
`export YOUTUBE_DATA_API_V3_KEY=your_api_key`
3. Update the channels-list.csv file with the list of YouTube channel
IDs that you want to collect daily statistics about/from. Add one
channel ID per row.
4. Run the script: `python main.py`
5. The script will generate a CSV file named
`channels-data_MM-YYYY.csv` (where MM is the month and YYYY is the
year, generated automatically as the script runs) in the `data`
folder containing the daily statistics for each channel in the
`channels-list.csv` file. Additionally, "static" information of
the channels given in `channels-list.csv` will be written to
`data/channels-info.csv`, and will be updated when new channels are
added to the list.
# Visualizing Data {#visualizing-data}
1. Install the required R packages by running the following command in
the R console: `install.packages(c("ggplot2", "data.table"))`
2. Open the `make-plots.R` script and modify the input file name to
match the latest CSV file in the data folder.
3. Run the `make-plots.R` script to generate an interactive chart and
save it as chart.html.
4. Install Quarto and use it to render the `analysis.qmd` file.
# Contributing {#contributing}
I highly welcome contributions to this project. If you want to add new
features, fix bugs, or improve documentation, please submit a pull
request or open an issue.
------------------------------------------------------------------------
> # Inspired by
>
> 1. [Corey Schafer](https://github.com/CoreyMSchafer)'s [Python
> YouTube API
> tutorials](https://www.youtube.com/@coreyms/search?query=python%20youtube%20api%20tutorial),
> and
> 2. [Patrik Loeber](https://github.com/patrickloeber)'s [YouTube Data
> API
> tutorials](https://www.python-engineer.com/posts/youtube-data-api-01)
```{r}
#| eval: false
knitr::include_graphics(sprintf("figs/channels-plot_%s.png", substr(latest_time, 1, 19)))
```