Headline Module Commit
asiffarhankhan committed Aug 25, 2019
1 parent e7e9f53 commit 2e57784
Showing 16 changed files with 220 additions and 215 deletions.
2 changes: 1 addition & 1 deletion CODE_OF_CONDUCT.md → Code of Conduct.md
@@ -78,4 +78,4 @@ For answers to common questions about this code of conduct, see
https://www.contributor-covenant.org/faq


Copyright (c) 2019, John Steinhable
Copyright (c) 2019, The UnTruth team
61 changes: 61 additions & 0 deletions Idea/0-Algorithm_Proposal.md
@@ -0,0 +1,61 @@
# Proposed Algorithm

The critical thinking model that people use to judge a news story can also guide an algorithm that detects fake news. The program can be written as several modules, each carrying out exactly one of the steps below.

##### Critical Thinking Model:

1. Read the headline.
2. Read the entire article.
3. Don’t believe a word of anything you read until you check facts and check sources.
4. Are the sources and facts credible? Why or why not?
5. Do a quick search engine scan to see who else has covered the story.
6. Do you see two sides (or more) to the article?
7. Are you being spun? Do you feel manipulated?
8. Are other credible news outlets covering the story?
9. Is this story a potential fake news story?


### Implementation


#### Read the headline
The headline gives the program a rough idea of the story. The headline can be reverse-searched on the top search engines, and the data from similar headlines gathered into a heap. The program will also look up data about the source website to estimate the legitness_score of that source.
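
As a rough illustration of gathering similar headlines into a heap, the sketch below ranks candidate headlines by string similarity and keeps the closest few. It assumes the candidate headlines have already been fetched from a search engine; `similarity` and `most_similar_headlines` are hypothetical helper names, and `difflib` is only one possible way to compare headlines.

```python
import heapq
from difflib import SequenceMatcher
from typing import List, Tuple


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio between two headlines, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def most_similar_headlines(headline: str, candidates: List[str], k: int = 3) -> List[Tuple[float, str]]:
    """Return the k candidates most similar to `headline`, best match first."""
    # Assumption: `candidates` came from a search step that is not shown here.
    scored = ((similarity(headline, c), c) for c in candidates)
    return heapq.nlargest(k, scored)


if __name__ == "__main__":
    found = [
        "Trump meets Johnson at G7 summit",
        "Local bakery wins award",
        "Trump and Johnson meet at G7",
    ]
    for score, text in most_similar_headlines("Trump meets Johnson at G7", found, k=2):
        print(round(score, 2), text)
```

`heapq.nlargest` avoids sorting the full candidate list, which matters only once the search step returns many results.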



#### Read the entire article
The next step involves scanning the whole article word by word and finding relevant patterns that help classify the article as fake or legit. The motive of the article can then be compared with the headline to predict whether misleading_title returns True or False.
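
One simple way to approximate the headline-versus-article comparison is word overlap. The sketch below is an assumption, not the project's method: it flags a title as misleading when fewer than half of its content words appear in the article. The stopword list, the `content_words` helper and the `min_overlap` threshold are all illustrative.

```python
import re

# Assumption: a tiny illustrative stopword list, not a complete one.
STOPWORDS = {"a", "an", "the", "of", "to", "in", "on", "at", "and", "or", "is", "are", "for", "with"}


def content_words(text):
    """Lowercase the text and return its words minus the stopwords."""
    return {w for w in re.findall(r"[a-z0-9']+", text.lower()) if w not in STOPWORDS}


def misleading_title(headline, article, min_overlap=0.5):
    """Return True when too few of the headline's content words occur in the article."""
    head = content_words(headline)
    if not head:
        return False  # nothing to compare against
    overlap = len(head & content_words(article)) / len(head)
    return overlap < min_overlap


if __name__ == "__main__":
    body = "The city council of Paris approved a new budget for road repairs."
    print(misleading_title("Aliens land in Paris", body))
    print(misleading_title("Paris council approves budget", body))
```

Exact word matching misses inflected forms such as "approves" and "approved", so a real module would probably need stemming or a semantic comparison.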



#### Don’t believe a word of anything you read until you check facts and check sources
The overall trust_score of the article always remains -1 until all the scores are calculated, i.e. the program considers the news fake until it has completely processed it. It therefore gives no preference to BBC.com over FakeNews.com; both are considered fake initially.
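
A minimal sketch of this "distrust by default" rule follows. It assumes the component scores are numbers between 0 and 1 and that the final trust_score is their mean; the proposal specifies neither. Apart from legitness_score, the component names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# Assumption: only legitness_score is named in the proposal; the rest are placeholders.
REQUIRED_SCORES = ("legitness_score", "source_score", "coverage_score")


@dataclass
class ArticleScores:
    # Every component starts as None, meaning "not calculated yet".
    scores: Dict[str, Optional[float]] = field(
        default_factory=lambda: {name: None for name in REQUIRED_SCORES}
    )

    @property
    def trust_score(self) -> float:
        """Stay at -1 until every component exists, then return their mean."""
        values = list(self.scores.values())
        if any(v is None for v in values):
            return -1
        return sum(values) / len(values)


if __name__ == "__main__":
    article = ArticleScores()
    print(article.trust_score)  # still -1: nothing has been processed
    for name in REQUIRED_SCORES:
        article.scores[name] = 0.5
    print(article.trust_score)
```

Because trust_score is derived rather than stored, no source can receive a trusted score before all modules have run.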



#### Are the sources and facts credible? Why or why not?
The source of the current article, the author and the images in the article are reverse-searched to establish the credibility of the source. This includes the history of posts from the same author, and whether the images in the article are original or carried over from other sources and articles.



#### Do a quick search engine scan to see who else has covered the story. *


#### Do you see two sides (or more) to the article?
This step may involve checking whether the article compares one entity with another, for example political parties. The job of the program here is to determine what is being talked about and what it is compared with, e.g. an article constantly comparing males and females.



#### Are you being spun? Do you feel manipulated?
The next part helps determine whether the article is biased towards one side. In the example above, if the article is about males and females, the program checks whether the comparison favours one side over the other and calculates the bias_score. When the article favours females, the bias_score is +1 for females and -1 for males. An unbiased article is reflected by a bias_score totalling 0.
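
The scoring rule can be sketched as below. The proposal does not say how favour is detected, so this example assumes an earlier step has already tagged each relevant sentence with the side it favours. It also assumes an unbiased article gives both sides a score of 0. `bias_scores` is a hypothetical helper name.

```python
from collections import Counter


def bias_scores(favoured_mentions, side_a, side_b):
    """Map each side to +1, -1 or 0 depending on which side is favoured more often."""
    # Assumption: `favoured_mentions` lists the favoured side once per tagged sentence.
    counts = Counter(favoured_mentions)
    a, b = counts[side_a], counts[side_b]
    if a == b:
        return {side_a: 0, side_b: 0}
    winner, loser = (side_a, side_b) if a > b else (side_b, side_a)
    return {winner: 1, loser: -1}


if __name__ == "__main__":
    tags = ["females", "females", "males"]
    print(bias_scores(tags, "females", "males"))
```

The two scores sum to 0 in every case, so a consumer has to inspect the per-side values rather than the total to tell a biased article from a balanced one.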



#### Are other credible news outlets covering the story?


#### Is this story a potential fake news story?
Finally, after everything is taken into consideration, the parameters are used to label the article as fake or legit.



2 changes: 1 addition & 1 deletion idea/1. Headline.md → Idea/1. Headline.md
@@ -1,6 +1,6 @@
# 1. Headline

### I deas for executing a headline rating
### Ideas for executing a headline rating

#### 1.1 Trigger word list
Check the words contained in the headline against a predetermined list of words.
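
A trigger word check could look like the sketch below. The word list is a made-up placeholder, since the actual list is not part of this commit, and `trigger_words_in` is a hypothetical helper name.

```python
import re

# Assumption: placeholder entries; the project's real trigger word list is not shown.
TRIGGER_WORDS = {"shocking", "unbelievable", "secret", "miracle"}


def trigger_words_in(headline, trigger_words=TRIGGER_WORDS):
    """Return the trigger words found in the headline, sorted alphabetically."""
    words = re.findall(r"[a-z']+", headline.lower())
    return sorted(set(words) & set(trigger_words))


if __name__ == "__main__":
    print(trigger_words_in("SHOCKING secret doctors won't tell you"))
    print(trigger_words_in("Council approves budget"))
```

The number of matches, or their share of the headline's words, could then feed into the headline rating.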
2 changes: 1 addition & 1 deletion LICENSE → License
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
MIT License

Copyright (c) 2019 John Steinhable
Copyright (c) 2019 The UnTruth team

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
File renamed without changes.
4 changes: 0 additions & 4 deletions functions/summary/config.py

This file was deleted.

19 changes: 0 additions & 19 deletions functions/summary/readme.md

This file was deleted.

3 changes: 0 additions & 3 deletions functions/summary/requirements.txt

This file was deleted.

95 changes: 0 additions & 95 deletions functions/summary/summary.py

This file was deleted.

5 changes: 0 additions & 5 deletions functions/url/config.py

This file was deleted.

23 changes: 0 additions & 23 deletions functions/url/readme.md

This file was deleted.

3 changes: 0 additions & 3 deletions functions/url/requirements.txt

This file was deleted.

55 changes: 0 additions & 55 deletions functions/url/url.py

This file was deleted.

55 changes: 55 additions & 0 deletions headline.py
@@ -0,0 +1,55 @@
import urllib.request
from bs4 import BeautifulSoup
from textblob.classifiers import NaiveBayesClassifier
from textblob import TextBlob

class title:

    # Initialisations
    def __init__(self):
        self.news_url="https://edition.cnn.com/2019/08/25/politics/trump-g7-boris-johnson-emmanuel-macron/index.html"


    def extract_headline(self):
        self.net_con=True  # Expecting the Internet connection to be working initially
        try:
            news_page=urllib.request.urlopen(self.news_url)
            soup = BeautifulSoup(news_page,'html.parser')
            headline_in_html=soup.find('h1')
            headline=headline_in_html.text.strip()
            return headline

        except urllib.error.URLError:
            print("\nCONNECTION ERROR: There may be a connection problem. Please check if the device is connected to the Internet")
            self.net_con=False  # Value update if the program is unable to connect


    # Train the classifier on training_data.csv and classify the headline
    def train_data(self, headline):
        try:
            with open('training_data.csv','r') as td:
                cl=NaiveBayesClassifier(td,format='csv')
            sentiment=cl.classify(headline)
            return sentiment

        except Exception:
            if not self.net_con:
                pass  # The connection error has already been reported
            else:
                print("\n\nProgram Error")


    def headline_category(self,headline,sentiment):

        analyse_headline=TextBlob(headline)
        print("\n"+"Headline:",headline,"\n")
        print("Headline Sentiment:",sentiment,"\n\n")

    def main(self):
        hdln=self.extract_headline()
        sntmnt=self.train_data(hdln)
        self.headline_category(hdln,sntmnt)

if __name__=='__main__':
    do_ya_thing=title()
    do_ya_thing.main()
5 changes: 0 additions & 5 deletions main.py

This file was deleted.
