News Classification and Sentiment Analysis
In this notebook, we analyze a news dataset taken from this fake-news detection tutorial (https://data-flair.training/blogs/advanced-python-project-detecting-fake-news/).
tl;dr:
Libraries used:
sklearn, specifically PassiveAggressiveClassifier, among others
re
The VADER lexicon
and several standard libraries
Key Takeaways:
We trained a PassiveAggressiveClassifier and achieved an accuracy of about 94% (with precision and recall of about 93%)
Performed sentiment analysis and saw that FAKE articles were usually the most positive or most negative articles in the dataset
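One way to check the sentiment takeaway above is to rank articles by the absolute value of their sentiment score and count which labels show up among the most extreme ones. The sketch below is illustrative only: the scores are made-up numbers, not results from the dataset, and the helper name labels_of_most_extreme is hypothetical. In the notebook, each score would come from VADER, for example the "compound" value returned by NLTK's SentimentIntensityAnalyzer.

```python
from collections import Counter

# Hypothetical (label, compound score) pairs, invented for illustration.
# In practice the score would come from VADER, e.g. NLTK's
# SentimentIntensityAnalyzer().polarity_scores(text)["compound"],
# which ranges from -1 (most negative) to +1 (most positive).
scored_articles = [
    ("FAKE", 0.98),
    ("REAL", 0.12),
    ("FAKE", -0.95),
    ("REAL", -0.30),
    ("FAKE", 0.91),
    ("REAL", 0.45),
    ("REAL", -0.88),
    ("FAKE", -0.10),
]


def labels_of_most_extreme(articles, k):
    """Count labels among the k articles with the largest |compound| score."""
    ranked = sorted(articles, key=lambda pair: abs(pair[1]), reverse=True)
    return Counter(label for label, _ in ranked[:k])


# Look at the four most strongly positive or negative articles.
extreme_counts = labels_of_most_extreme(scored_articles, k=4)
print(extreme_counts)
```

Ranking by absolute value treats strongly positive and strongly negative articles the same way, which matches the "most positive or most negative" framing.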
Things I Learned:
Fake news is really scary and often seems real
How much the choice of model impacts accuracy
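To make the last point concrete, here is a minimal sketch of comparing two classifiers on the same features. It rests on several assumptions that are not stated in this summary: TF-IDF features, MultinomialNB as the comparison model, the max_iter and random_state settings, and a tiny invented toy corpus standing in for the real dataset. The scores it prints say nothing about the 94% figure above.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import PassiveAggressiveClassifier
from sklearn.metrics import accuracy_score, precision_score, recall_score
from sklearn.naive_bayes import MultinomialNB

# Invented toy headlines, for illustration only (not from the dataset).
train_texts = [
    "Shocking miracle cure doctors hate revealed",
    "Aliens secretly control the government, insider claims",
    "You will not believe this unbelievable celebrity secret",
    "City council approves budget for road repairs",
    "Central bank holds interest rates steady this quarter",
    "Researchers publish study on regional rainfall trends",
]
train_labels = ["FAKE", "FAKE", "FAKE", "REAL", "REAL", "REAL"]
test_texts = [
    "Insider reveals shocking secret miracle",
    "Council publishes budget study",
]
test_labels = ["FAKE", "REAL"]

# Assumed feature extraction: TF-IDF with English stop words removed.
vectorizer = TfidfVectorizer(stop_words="english")
X_train = vectorizer.fit_transform(train_texts)
X_test = vectorizer.transform(test_texts)

# Two candidate models trained on identical features.
models = {
    "PassiveAggressiveClassifier": PassiveAggressiveClassifier(
        max_iter=50, random_state=0
    ),
    "MultinomialNB": MultinomialNB(),
}

results = {}
for name, model in models.items():
    model.fit(X_train, train_labels)
    predicted = model.predict(X_test)
    # Precision and recall are computed with FAKE as the positive class.
    results[name] = {
        "accuracy": accuracy_score(test_labels, predicted),
        "precision": precision_score(
            test_labels, predicted, pos_label="FAKE", zero_division=0
        ),
        "recall": recall_score(
            test_labels, predicted, pos_label="FAKE", zero_division=0
        ),
    }

for name, scores in results.items():
    print(name, scores)
```

Holding the vectorizer and the train/test split fixed means any difference in the scores comes from the model alone. On the real dataset, a proper held-out split (for example via train_test_split) would replace the toy lists.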