fds-final-project

This project is part of the Foundation of Data Science course at Sapienza University (2024). It applies machine learning (ML) techniques to sentiment analysis of poems written in the Libyan dialect. The dataset consists of Libyan poems annotated as positive or negative, and three classifiers were used for the analysis: Support Vector Machine (SVM), Naïve Bayes (NB), and Logistic Regression (LR). Preprocessing covers cleaning, tokenization, normalization, stopword removal, stemming, and lemmatization; features are extracted with TF-IDF and N-grams (unigrams, bigrams, and trigrams). Six experiments were conducted to evaluate how the different preprocessing and feature extraction methods affect classifier performance. SVM performed best, reaching an accuracy of 74.63% when unigrams and trigrams were used. The project files include the preprocessing, model training, and evaluation scripts, and the final report, which documents the methodology, results, and comparisons with previous studies.
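As a rough illustration of the feature extraction step described above, the following dependency-free sketch builds unigram and trigram terms and weights them with a textbook TF-IDF formula (term frequency times `log(N / df)`). It is not the notebook's code: the function names `ngrams`, `extract_terms`, and `tfidf` are hypothetical, the English sentences are placeholders rather than samples from the private dataset, and a library vectorizer would typically use a smoothed IDF variant instead.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list, joined by spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def extract_terms(text, ns=(1, 3)):
    """Whitespace-tokenize text and collect n-grams for each size in ns.

    The default ns=(1, 3) means unigrams plus trigrams (an assumption
    mirroring the best-performing setting mentioned in the text).
    """
    tokens = text.split()
    terms = []
    for n in ns:
        terms.extend(ngrams(tokens, n))
    return terms


def tfidf(docs, ns=(1, 3)):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    term_lists = [extract_terms(doc, ns) for doc in docs]
    n_docs = len(docs)

    # Document frequency: in how many documents each term occurs.
    df = Counter()
    for terms in term_lists:
        df.update(set(terms))

    vectors = []
    for terms in term_lists:
        counts = Counter(terms)
        total = len(terms) or 1  # guard against empty documents
        vectors.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return vectors


# Placeholder documents (hypothetical, not taken from the dataset).
docs = ["the night is long and cold", "the morning is bright and warm"]
vectors = tfidf(docs)
```

Terms that occur in every document (here `the`, `is`, `and`) receive a weight of zero, while terms unique to one poem are weighted up. The resulting dictionaries would then be turned into numeric vectors and passed to a classifier such as SVM, NB, or LR.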

Note

The dataset used for this project is private. We can't upload it to a public repo without the author's authorization.

Paper link: https://arxiv.org/abs/2109.07203

Some Results

The results highlight the impact of various preprocessing strategies on sentiment classification accuracy. The table below summarizes the accuracy for different experiments:

| Experiment | Logistic Regression | Naïve Bayes | SVM |
| --- | --- | --- | --- |
| Lemmatization with Stopword Removal | 0.72 | 0.68 | 0.74 |
| Lemmatization without Stopword Removal | 0.70 | 0.65 | 0.71 |
| Stemming with Stopword Removal | 0.73 | 0.69 | 0.75 |
| Stemming without Stopword Removal | 0.71 | 0.67 | 0.72 |
| Stemming and Lemmatization with Stopwords | 0.74 | 0.70 | 0.76 |
| Stemming and Lemmatization without Stopwords | 0.72 | 0.66 | 0.73 |
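
To compare the classifiers across the six experiments, the accuracies in the table can be loaded into a small dictionary and summarized. This is only a convenience sketch: the variable names (`results`, `best_per_experiment`, `mean_accuracy`) are hypothetical, and the numbers are copied from the table as-is.

```python
# Accuracy per experiment, copied from the results table above.
results = {
    "Lemmatization with Stopword Removal": {"LR": 0.72, "NB": 0.68, "SVM": 0.74},
    "Lemmatization without Stopword Removal": {"LR": 0.70, "NB": 0.65, "SVM": 0.71},
    "Stemming with Stopword Removal": {"LR": 0.73, "NB": 0.69, "SVM": 0.75},
    "Stemming without Stopword Removal": {"LR": 0.71, "NB": 0.67, "SVM": 0.72},
    "Stemming and Lemmatization with Stopwords": {"LR": 0.74, "NB": 0.70, "SVM": 0.76},
    "Stemming and Lemmatization without Stopwords": {"LR": 0.72, "NB": 0.66, "SVM": 0.73},
}

# Best-scoring classifier in each experiment.
best_per_experiment = {
    experiment: max(scores, key=scores.get)
    for experiment, scores in results.items()
}

# Mean accuracy of each classifier over all six experiments.
classifiers = ["LR", "NB", "SVM"]
mean_accuracy = {
    clf: sum(scores[clf] for scores in results.values()) / len(results)
    for clf in classifiers
}

for experiment, clf in best_per_experiment.items():
    print(f"{experiment}: best = {clf}")
for clf, mean in mean_accuracy.items():
    print(f"{clf}: mean accuracy = {mean:.3f}")
```

On these numbers SVM has the highest accuracy in every row, followed by Logistic Regression and then Naïve Bayes.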
