Headline Module Commit

Learning-Python-Team · Aug 25, 2019 · 2e57784
1 parent e7e9f53
Showing 16 changed files with 220 additions and 215 deletions.
diff --git a/CODE_OF_CONDUCT.md b/Code of Conduct.md
@@ -78,4 +78,4 @@ For answers to common questions about this code of conduct, see
 https://www.contributor-covenant.org/faq
 
 
-Copyright (c) 2019, John Steinhable
+Copyright (c) 2019, The UnTruth team
diff --git a/Idea/0-Algorithm_Proposal.md b/Idea/0-Algorithm_Proposal.md
@@ -0,0 +1,61 @@
+# Proposing Algorithm
+
+The critical thinking model that people apply to news can also be applied by a program, giving an algorithm that detects fake news. The program can be written in several parts, with each module carrying out exactly one of the steps below.
+
+##### Critical Thinking Model:
+
+1. Read the headline.
+2. Read the entire article.
+3. Don’t believe a word of anything you read until you check facts and check sources.
+4. Are the sources and facts credible? Why or why not?
+5. Do a quick search engine scan to see who else has covered the story.
+6. Do you see two sides (or more) to the article?
+7. Are you being spun? Do you feel manipulated?
+8. Are other credible news outlets covering the story?
+9. Is this story a potential fake news story?
+
+
+### Implementation
+
+
+#### Read the headline
+The headline gives the program a rough idea of the story. The headline may be reverse-searched on the top search engines, with the data from similar headlines gathered into a heap. The program will also look up data on the source website to estimate the legitness_score of that source.
+
+
+
+#### Read the entire article
+The next step involves scanning the whole article word by word and finding patterns that may help classify the article as fake or legit. The motive of the article may then be compared with the headline to predict whether misleading_title returns True or False.
+
+
+
+#### Don’t believe a word of anything you read until you check facts and check sources
+The initial overall trust_score of the article always remains -1 until all the scores are calculated, i.e. the program considers the news fake until it has completely processed it. No preference is given to BBC.com over FakeNews.com; both are considered fake initially.
+
+
+
+#### Are the sources and facts credible? Why or why not?
+The source of the current article, the author, and the images in the article are reverse-searched to establish the credibility of the source. This covers the history of posts from the same author, and whether the images in the article are original or carried over from other sources and articles.
+
+
+
+#### Do a quick search engine scan to see who else has covered the story. *
+
+
+#### Do you see two sides (or more) to the article?
+This step may involve checking whether the article compares one entity with another, for example political parties. The job of the program here is to determine what is being discussed and what it is compared with, e.g. an article constantly comparing Males and Females.
+
+
+
+#### Are you being spun? Do you feel manipulated?
+The next part helps determine whether the article is biased towards one side. In the above example of an article about Males and Females, the program checks whether the comparison favours one side over the other and calculates the bias_score. When the article favours females, the bias_score is shown as +1 for females and -1 for males. An unbiased article is reflected by a bias_score totalling 0.
+
+
+
+#### Are other credible news outlets covering the story?
+
+
+#### Is this story a potential fake news story?
+Finally, after everything is taken into consideration, the parameters are used to label the article as fake or legit.
+
+
+
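A minimal sketch of how the scores named in the proposal could be held together, assuming a container class. `ArticleAssessment`, `set_bias`, `finalise`, and the penalty values are hypothetical and do not come from the commit; the proposal does not say how the parameters are combined, so the combination rule here is only a placeholder. What the sketch does take from the proposal is that `trust_score` stays at -1 until every score is available, and that the favoured side gets +1 and the other side -1.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

UNPROCESSED = -1  # initial trust_score from the proposal: nothing is trusted yet


@dataclass
class ArticleAssessment:
    """Hypothetical container for the scores named in the proposal."""

    trust_score: float = UNPROCESSED
    legitness_score: Optional[float] = None   # assumed to lie in [0, 1]
    misleading_title: Optional[bool] = None
    bias_score: Dict[str, int] = field(default_factory=dict)

    def set_bias(self, favoured: str, other: str) -> None:
        # +1 for the favoured side and -1 for the other, as in the proposal.
        self.bias_score = {favoured: 1, other: -1}

    def is_complete(self) -> bool:
        # Every module must have reported before the article can be trusted.
        return (
            self.legitness_score is not None
            and self.misleading_title is not None
            and bool(self.bias_score)
        )

    def finalise(self) -> float:
        if not self.is_complete():
            return self.trust_score  # still -1: treated as fake
        # Placeholder combination rule (an assumption, not the authors' method).
        score = self.legitness_score
        if self.misleading_title:
            score -= 0.5
        if any(value != 0 for value in self.bias_score.values()):
            score -= 0.25
        self.trust_score = max(0.0, score)
        return self.trust_score
```

With this shape, BBC.com and FakeNews.com start from the same `trust_score` of -1, and only a fully processed article can move away from it.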
diff --git a/idea/1. Headline.md b/Idea/1. Headline.md
@@ -1,6 +1,6 @@
 # 1. Headline
 
-### I deas for executing a headline rating
+### Ideas for executing a headline rating
 
 #### 1.1 Trigger word list
 Check the words contained in the headline against a pre determined list of words.  
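The trigger word check in 1.1 could look roughly like the sketch below. The word list, `trigger_word_hits`, and `trigger_word_ratio` are hypothetical; the actual list is not part of this commit, and the ratio is only one possible way to turn the hits into a rating.

```python
import re
from typing import FrozenSet, List

# Hypothetical trigger words; the project's real list is not shown in the diff.
TRIGGER_WORDS: FrozenSet[str] = frozenset(
    {"shocking", "unbelievable", "secret", "exposed"}
)


def tokenize(headline: str) -> List[str]:
    # Lower-case the headline and keep alphabetic tokens only.
    return re.findall(r"[a-z']+", headline.lower())


def trigger_word_hits(
    headline: str, trigger_words: FrozenSet[str] = TRIGGER_WORDS
) -> List[str]:
    # Return the headline tokens that appear in the predetermined list.
    return [token for token in tokenize(headline) if token in trigger_words]


def trigger_word_ratio(
    headline: str, trigger_words: FrozenSet[str] = TRIGGER_WORDS
) -> float:
    # Share of headline tokens that are trigger words; 0.0 for an empty headline.
    tokens = tokenize(headline)
    if not tokens:
        return 0.0
    return len(trigger_word_hits(headline, trigger_words)) / len(tokens)
```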

diff --git a/LICENSE b/License
@@ -1,6 +1,6 @@
 MIT License
 
-Copyright (c) 2019 John Steinhable
+Copyright (c) 2019 The UnTruth team
 
 Permission is hereby granted, free of charge, to any person obtaining a copy
 of this software and associated documentation files (the "Software"), to deal

diff --git a/README.md b/Read Me.md
diff --git a/functions/summary/config.py b/functions/summary/config.py
diff --git a/functions/summary/readme.md b/functions/summary/readme.md
diff --git a/functions/summary/requirements.txt b/functions/summary/requirements.txt
diff --git a/functions/summary/summary.py b/functions/summary/summary.py
diff --git a/functions/url/config.py b/functions/url/config.py
diff --git a/functions/url/readme.md b/functions/url/readme.md
diff --git a/functions/url/requirements.txt b/functions/url/requirements.txt
diff --git a/functions/url/url.py b/functions/url/url.py
diff --git a/headline.py b/headline.py
@@ -0,0 +1,55 @@
+import urllib.request
+from bs4 import BeautifulSoup
+from textblob.classifiers import NaiveBayesClassifier
+from textblob import TextBlob
+
+class title:
+
+    #Initialisations
+    def __init__(self): 
+        self.news_url="https://edition.cnn.com/2019/08/25/politics/trump-g7-boris-johnson-emmanuel-macron/index.html"
+
+
+    def extract_headline(self):
+        self.net_con=True #Expecting Internet Connection to be working initially
+        try:
+            news_page=urllib.request.urlopen(self.news_url)   
+            soup = BeautifulSoup(news_page,'html.parser')
+            headline_in_html=soup.find('h1')
+            headline=headline_in_html.text.strip()
+            return headline
+
+        except urllib.error.URLError:
+            print("\nCONNECTION ERROR: There may be a connection problem. Please check that the device is connected to the Internet")
+            self.net_con=False #Value update if the program is unable to connect
+
+
+    #Adding Training Data
+    def train_data(self, headline):
+        try:
+            with open('training_data.csv','r') as td:
+                cl=NaiveBayesClassifier(td,format='csv')
+                sentiment=cl.classify(headline)
+                return sentiment
+
+        except Exception: #Avoid a bare except, which would also swallow KeyboardInterrupt
+            if not self.net_con:
+                pass
+            else:
+                print("\n\nProgram Error")
+
+
+    def headline_category(self,headline,sentiment):
+        if headline is None: #No headline was fetched, so there is nothing to report
+            return
+        print("\n"+"Headline:",headline,"\n")
+        print("Headline Sentiment:",sentiment,"\n\n")
+
+    def main(self):
+        hdln=self.extract_headline()
+        sntmnt=self.train_data(hdln)
+        self.headline_category(hdln,sntmnt)
+
+if __name__=='__main__':
+    do_ya_thing=title()
+    do_ya_thing.main()
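A sketch of the headline extraction step with the parsing separated from the network call, so that it can be exercised without an Internet connection. It swaps BeautifulSoup for the standard library's `html.parser` purely to stay dependency-free; `headline_from_html`, `fetch_headline`, and the `opener` parameter are hypothetical names, not part of `headline.py`. As in the commit, the headline is taken to be the text of the first `h1` element, and a `URLError` yields no headline.

```python
import urllib.error
import urllib.request
from html.parser import HTMLParser
from typing import Callable, List, Optional


class _FirstH1Parser(HTMLParser):
    """Collects the text of the first <h1> element only."""

    def __init__(self) -> None:
        super().__init__()
        self._inside = False
        self._done = False
        self._parts: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "h1" and not self._done:
            self._inside = True

    def handle_endtag(self, tag):
        if tag == "h1" and self._inside:
            self._inside = False
            self._done = True  # ignore any later <h1> elements

    def handle_data(self, data):
        if self._inside:
            self._parts.append(data)

    @property
    def headline(self) -> Optional[str]:
        text = "".join(self._parts).strip()
        return text or None


def headline_from_html(html: str) -> Optional[str]:
    # Pure function: no network access, so it is easy to test.
    parser = _FirstH1Parser()
    parser.feed(html)
    parser.close()
    return parser.headline


def fetch_headline(
    url: str, opener: Callable = urllib.request.urlopen
) -> Optional[str]:
    # Return None instead of raising when the page cannot be reached.
    try:
        with opener(url) as response:
            html = response.read().decode("utf-8", errors="replace")
    except urllib.error.URLError:
        return None
    return headline_from_html(html)
```

Returning `None` on a connection failure lets the caller skip classification explicitly instead of tracking a separate `net_con` flag.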
diff --git a/main.py b/main.py