Welcome to the public repository for the DBL Data Challenge project! This README serves as a guide to help you understand our project, its purpose, and the insights it provides into airline data. Our team, Group 16, was responsible for analyzing data from the social media platform Twitter for our client, Lufthansa. We compared Lufthansa's Twitter data with that of their competitor, American Airlines. The goal of our analysis was to gain insight into customer sentiment, preferences, and the overall performance of the two airlines on social media. This project was developed as part of the Data Science program at the Eindhoven University of Technology.
Technologies used:
- Python
- MongoDB
Before you proceed, ensure that you have the following installed on your local machine:
- Git: a version control system for tracking changes in computer files and coordinating work on those files among multiple people.
- Python: the language this project is built with; ensure you have version 3.x installed.
- pip: a package installer for Python. It is usually installed alongside Python.
To get started with our project, we have provided an installation guide that outlines the required dependencies and the steps for setting up the environment. Following these instructions will help ensure a smooth setup process and avoid potential compatibility issues.
- Clone the repository to your machine with the `git clone` command; use `git pull` to fetch the latest updates.
- Set up a local environment with the `venv` module: `python3 -m venv env`. Activate it with `.\env\Scripts\activate` on Windows or `source env/bin/activate` on macOS.
- Download the airline files that have been filtered in MongoDB (there is no code for this step; the queries used in MongoDB Compass are provided).
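For a Unix-like shell, the setup steps above can be sketched as follows. The repository URL is not given in this README, so substitute your own; the Windows activation command differs as noted.

```shell
# Minimal setup sketch for macOS/Linux.
# First clone the repository: git clone <repository-url>   (URL not shown in this README)
python3 -m venv env        # create a virtual environment in ./env
. env/bin/activate         # activate it for the current shell (Windows: .\env\Scripts\activate)
python --version           # should report Python 3.x
```

After activation, any `pip install` and `python` invocations run inside the isolated environment rather than against the system interpreter.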
* MongoDB Queries - one CSV file for each airline
* DATA CLEANING.py
* JSON load.py
* Plots-Extras.py
* extra_task.py
* data_cleaning.py
* json_load.py
* Download the airline files before running these tasks
* cleaning-csv.py (airline cleaning and prep for conversation extraction)
* conversation-extract.py (use the path of each cleaned airline file and the corresponding airline ID in the functions)
* MeanSentiment.py
* TextBlob-testing.py
* Vader-testing.py
* extra 1 pres 2.py
* Response Time Sentiment ExtraSprint2.py
* sentiment analysis.py
* statistics_convo.py
* extras 1 pres 2.py
* sentiment analysis on conversations_S3_t1.py
* sprint 3_task2.py
* one sided convo extra.py
* reply words polina demo.py
* sentiment flight related tweets.py
* sentiment over reply count.py
* First Response Sentiment by Client.py (Preps conversation file for response time vs sentiment graphs)
* Sent vs Response Time Lufthansa First Reply.py
* Covid Times, response and sentiment.py
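Several of the scripts above depend on the conversation-extraction step, which pairs customer tweets with airline replies. As a rough illustration of how such grouping can work, here is a minimal, self-contained sketch; the field names (`id`, `in_reply_to_status_id`) follow Twitter's API, but the actual logic in `conversation-extract.py` may differ.

```python
from collections import defaultdict

def extract_conversations(tweets):
    """Group tweets into threads by following in_reply_to links.

    `tweets` is a list of dicts with at least the Twitter API fields
    `id` and `in_reply_to_status_id`. Returns a dict mapping each root
    tweet's id to its thread, ordered root-first.
    """
    by_id = {t["id"]: t for t in tweets}
    children = defaultdict(list)
    roots = []
    for t in tweets:
        parent = t.get("in_reply_to_status_id")
        if parent in by_id:
            children[parent].append(t)
        else:
            # Reply target missing from the dataset (or no reply at all):
            # treat this tweet as the start of a conversation.
            roots.append(t)

    def walk(tweet):
        thread = [tweet]
        for child in children[tweet["id"]]:
            thread.extend(walk(child))
        return thread

    return {root["id"]: walk(root) for root in roots}

# Tiny demo: one customer tweet and one airline reply form one thread.
demo = [
    {"id": 1, "in_reply_to_status_id": None, "text": "Delayed again @lufthansa"},
    {"id": 2, "in_reply_to_status_id": 1, "text": "Sorry to hear that!"},
]
convos = extract_conversations(demo)
```

Running the demo yields a single conversation rooted at tweet 1 containing both tweets, which is the shape the downstream sentiment and response-time scripts would consume.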