Skip to content

aledurso22/Applied-Data-Analysis-Project---copy

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

1 Commit
 
 
 
 

Repository files navigation

Is YouTube Riding the Mental Health Wave?

Data Story Website

Authors : Shaochen Bai, Liyu Zhang, Meichen Guo, Alessio Desogus & Alessandro D'Urso

NOTE: we are using plotly to have interactive plots which cannot be displayed on milestone_P3.ipynb. But you can find the results in our data story listed below.

Read our full data story here.

Table of Contents 📋

  1. Abstract
  2. Research Questions
  3. Proposed Additional Datasets
  4. Methods
  5. Proposed Timeline
  6. Organization Within the Team - Work Repartition
  7. Sources

Abstract 📝

  • Deadlines coming! Have you ever felt stressful, anxious or even depressed?

  • Have you ever posted your pressure on YouTube, or simply seek guidance from videos talking about mental problems on the platform?

  • Do you know there are growing amount of discussions about mental health on Youtube?

  • What are they like? How many talk about them? How they influence today's channel content?

Follow our analysis and find out! Centered on the topic of mental health on YouTube, our main goal is to look for the trend, track their contents and see the channels response toward it.

Research Questions ❓

Here are the specific research questions we plan to address :

  • Is mental health a trend on YouTube?
  • Which topics predominate in the mental health category?
  • What is the common sentiment when people talk about mental health, and what does that reveil?
  • Can we see an increase (or decrease) in performance (subscribers, views, likes, ...) for channels that speak about mental health?
  • Did channels that were not speaking about mental health start more often to speak about it?

Proposed Additional Datasets 📊

  • Our analysis is well-supported by the existing (huge) dataset, and we find it sufficiently comprehensive for our research questions. Thus, we will not incorporate additional datasets.

Methods 🔍

Methods and techniques divided by each dataset that we intend to use for our analysis:

Video Filtering: The method is based on snowball keyword matching. First, we design a comprehensive keyword list regarding mental health and retrieve videos whose text fields match the words. Then we iteratively look at the retrieved result and append new words to the list that we found useful. A more detailed description of the method is explained in the notebook.

Comparing trends: To compare trends between different topics, we would like to first select a set of socially important topics representative in their respective categories, which is climate change and gender inequality.

Predominant subtopics in mental health: We first categorize the keywords into several subtopic categories [General, Lonely, Depress, Stress, Suicide, Trauma, Disorder], and people into several groups [Man, Woman, Teenager, Senior] and see their trend and number.

In this section, our primary objectives are the followings :

Identify Gain in Views and Subscribers for Mental Health Channels: By analyzing the time-series data, our goal is to detect if mental health video uploading signify increased views and subscribers growth within mental health-related channels. This exploration provides valuable insights into emerging trends and their dynamic behaviors.

Select the most symbolic channel: from the analysis of before, we try to find channels that mostly reflect the abovementioned gainings

Identify an increasing number of uploads of mental health video: if really thre is a gain in views when uploading a mental helath video, then, singularly, do channels upload more of it

Given considerations of project feasibility, including both the size and type of data required, the utilization of this dataset is omitted.

Proposed Timeline & Internal Milestones (IM) 📅

Task Start Week End Week
First Analysis & Results (IM1) Week 8 Week 9
Comparative Evaluation Week 9 Week 10
Second Analysis & Results (IM2) Week 10 Week 13
Web Site Implementation Week 10 Week 13
Final Check & Submission (IM3) Week 13 Week 14

Organization Within the Team - Work Repartition 🤝

Given the existence of three entirely independent datasets, distinct teams have been designated for each dataset: the Video Metadata, Channel Metadata, and Time-series Data. Concurrently, parallel efforts are dedicated to exploring the data story, website implementation, and the comprehensive contextual framework of the analysis.

Team Member Work Done
Liyu Zhang Channel metadata analysis, sentiment analysis, visualizations for data story, website development.
Meichen Guo Channel metadata analysis, textual context for data story development.
Alessio Desogus Time-series analysis, website creation & development, textual context for data story and README.
Alessandro D'Urso Time-series analysis, visualizations for data story, textual context for data story.
Shaochen Bai Video metadata analysis, pre-proccessing data, visualizations for data story, website development.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published