Image-Text-Summarizer

It is mainly useful when reading novels, stories, or any other text made up of long paragraphs. The reader takes a picture of a paragraph and feeds it to the model, which returns a summary of that paragraph. This makes reading easier and saves the reader time.

Technologies Used

Optical Character Recognition (OCR)
Natural Language Processing (NLP)

Make a file ocr.py in the project folder.
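
The README does not show the contents of ocr.py, so the following is only a minimal sketch of what such a script could look like, assuming it reads the image with OpenCV and passes it to pytesseract. The function names (build_parser, extract_text, main) and the grayscale plus Otsu thresholding step are illustrative assumptions, not the author's code. The --image flag matches the run command used later in this README.

```python
import argparse


def build_parser():
    """Command-line interface matching `python3 ocr.py --image <path>`."""
    parser = argparse.ArgumentParser(
        description="Extract text from an image with Tesseract OCR."
    )
    parser.add_argument("--image", help="path to the image to read")
    return parser


def extract_text(image_path):
    """Return the text that pytesseract recognizes in the given image."""
    # Third-party imports live here so the usage message still works
    # before the dependencies are installed.
    import cv2
    import pytesseract

    image = cv2.imread(image_path)
    if image is None:  # cv2.imread returns None instead of raising
        raise FileNotFoundError(f"could not read image: {image_path}")

    # Assumed preprocessing: grayscale + Otsu thresholding, which often
    # helps OCR on photographed pages. The original script may differ.
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)
    return pytesseract.image_to_string(binary)


def main(argv=None):
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.image is None:
        parser.print_usage()
        return
    # Printing to stdout lets the shell redirect the text into story.txt.
    print(extract_text(args.image))


if __name__ == "__main__":
    main()
```

Because the recognized text is printed to standard output, redirecting the output with > writes it to a file of your choice.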

Set Up Virtual Environment

$ virtualenv venv --python=python3.6

$ source venv/bin/activate

Install dependencies

Pillow: pip3 install Pillow
Pytesseract: pip3 install pytesseract
OpenCV: pip3 install opencv-python
NLTK: pip3 install nltk
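
Note that pytesseract is a wrapper around the Tesseract OCR engine, which is a separate system program and is not installed by pip. As a quick sanity check, a small script along these lines could report what is still missing. The file layout and the names REQUIRED_MODULES and missing_requirements are hypothetical and not part of the project.

```python
import importlib.util
import shutil

# pip package name -> import name (they differ for Pillow and OpenCV).
REQUIRED_MODULES = {
    "Pillow": "PIL",
    "pytesseract": "pytesseract",
    "opencv-python": "cv2",
    "nltk": "nltk",
}


def missing_requirements():
    """Return the names of packages or programs that could not be found."""
    missing = [
        package
        for package, module in REQUIRED_MODULES.items()
        if importlib.util.find_spec(module) is None
    ]
    # pytesseract calls the `tesseract` executable, so it must be on PATH.
    if shutil.which("tesseract") is None:
        missing.append("tesseract (system binary)")
    return missing


if __name__ == "__main__":
    missing = missing_requirements()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All dependencies found.")
```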

Run ocr.py

python3 ocr.py --image images/story1.jpg > story.txt

story.txt

This file contains all the text extracted from the image story1.jpg by OCR with pytesseract.

Make a new file summarize.py

summarize.py

This file uses Python's NLTK library to remove stop words and punctuation, together with NLTK's word and sentence tokenizers.
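
The README names the NLTK pieces used but not the scoring method, so the sketch below assumes a simple frequency-based extractive summary: each sentence is scored by how often its non-stop-word tokens occur in the whole text, and the top few sentences are kept in their original order. The function names and the default of three sentences are illustrative assumptions. NLTK also needs its data downloaded once, for example with nltk.download("punkt") and nltk.download("stopwords"); newer NLTK releases may ask for "punkt_tab" instead.

```python
import sys
from collections import Counter


def score_sentences(tokenized_sentences, stop_words):
    """Score each sentence by the summed frequency of its content words."""
    skip = set(stop_words)
    content = [
        [
            word.lower()
            for word in words
            # Drop stop words and tokens made only of punctuation.
            if word.lower() not in skip and any(ch.isalnum() for ch in word)
        ]
        for words in tokenized_sentences
    ]
    freq = Counter(word for words in content for word in words)
    return [sum(freq[word] for word in words) for words in content]


def pick_top(sentences, scores, limit):
    """Keep the `limit` highest-scoring sentences, in their original order."""
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    return [sentences[i] for i in sorted(ranked[:limit])]


def summarize(text, limit=3):
    """Return an extractive summary of `text` (assumed approach, see above)."""
    # Imported here so the helpers above work without NLTK installed.
    from nltk.corpus import stopwords
    from nltk.tokenize import sent_tokenize, word_tokenize

    sentences = sent_tokenize(text)
    tokenized = [word_tokenize(sentence) for sentence in sentences]
    scores = score_sentences(tokenized, stopwords.words("english"))
    return " ".join(pick_top(sentences, scores, limit))


def main(argv):
    if len(argv) != 2:
        print("usage: python3 summarize.py story.txt", file=sys.stderr)
        return
    with open(argv[1], encoding="utf-8") as handle:
        print(summarize(handle.read()))


if __name__ == "__main__":
    main(sys.argv)
```

Keeping the selected sentences in their original order tends to make the summary read more naturally than listing them by score.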

Run summarize.py

python3 summarize.py story.txt > summary.txt

summary.txt

This file contains the summary of the text file story.txt.
