LPCVC-2020 Sample Solution

Overview

This is the sample solution for the Low Power Computer Vision Challenge (LPCVC) 2020 Video Track. It serves only as a baseline, and many improvements can be made on top of it to further optimize performance.

The proposed solution is made up of three blocks. The first block (sampling block) takes in a video file and determines which frames are worth running detection and recognition on. This sample solution does so by inspecting the motion vectors from the H.264 encoding of the video to pick out stationary I-frames. The second block (detection block) performs word detection on the frames selected by the sampling block. This sample solution uses the EAST detector. Lastly, the third block (recognition block) performs optical character recognition (OCR) on the cropped words. The sample solution provides two choices: Connectionist Temporal Classification (CTC) or Attention OCR.
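The three blocks above can be pictured as a simple chain of callables. The following is only a minimal sketch of that structure, not the repository's actual code: `run_pipeline`, `sample_frames`, `detect_words`, and `recognize_word` are hypothetical names introduced for illustration, and the real blocks are passed in by the caller.

```python
from typing import Any, Callable, Iterable, List, Tuple

# Hypothetical types for illustration; the real code may represent these differently.
Frame = Any
Box = Tuple[int, int, int, int]


def run_pipeline(
    video_path: str,
    sample_frames: Callable[[str], Iterable[Frame]],
    detect_words: Callable[[Frame], Iterable[Box]],
    recognize_word: Callable[[Frame, Box], str],
) -> List[List[str]]:
    """Chain the three blocks: sampling -> detection -> recognition.

    Returns one list of recognized words per sampled frame.
    """
    results = []
    for frame in sample_frames(video_path):
        # Detection yields word boxes; recognition reads each cropped word.
        words = [recognize_word(frame, box) for box in detect_words(frame)]
        results.append(words)
    return results
```

With stub callables, `run_pipeline("video.mp4", lambda p: ["f1"], lambda f: [(0, 0, 1, 1)], lambda f, b: f.upper())` returns `[["F1"]]`. Passing the blocks in as arguments keeps each one replaceable, which matches the idea that this baseline is meant to be improved block by block.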

Setup

1. Clone the code from the master branch:

   git clone https://github.com/tanliyon/lpcvc-2020.git

2. Download the model files for the EAST detector, CTC, and Attention OCR:
   - EAST-Detector
   - CTC
   - Attention-Encoder
   - Attention-Decoder

3. Install the dependencies:

   pip install -r requirements.txt

   Note that lanms might not work on Windows.

4. Check the directory structure. It should be:

   lpcvc-2020
   |_wrapper.py
   |_detector.pth
   |_ctc.pth
   |_encoder.pth
   |_decoder.pth
   |_(all other folders pulled from master)
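The directory layout above can be verified with a short script before running anything. This is a minimal sketch and not part of the repository: `missing_files` is a hypothetical helper, and the file list is copied from the structure shown above.

```python
from pathlib import Path

# File names taken from the expected directory structure in this README.
REQUIRED_FILES = ["wrapper.py", "detector.pth", "ctc.pth", "encoder.pth", "decoder.pth"]


def missing_files(repo_root, required=REQUIRED_FILES):
    """Return the required files that are not present directly under repo_root."""
    root = Path(repo_root)
    return [name for name in required if not (root / name).is_file()]


if __name__ == "__main__":
    # Assumes the script is run from inside the cloned lpcvc-2020 folder.
    missing = missing_files(".")
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All required files are in place.")
```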

Usage

The call syntax is:

python main.py video_file_path.mp4 question_file_path.txt

To switch between the two recognition options, set the USE_ATTN_OCR flag in main.py. The SHOW_BOXES flag controls whether the detection output is saved to a folder, and the SHOW_TEXT flag controls whether the recognition predictions are printed to stdout.
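As a rough illustration of how the three flags interact, the sketch below maps flag values to the behavior described above. It is not taken from main.py: the default values shown are assumptions, and `describe_run` is a hypothetical helper used only to make the mapping explicit.

```python
# Hypothetical stand-ins for the flags in main.py; the actual defaults are not known here.
USE_ATTN_OCR = False
SHOW_BOXES = False
SHOW_TEXT = True


def describe_run(use_attn_ocr, show_boxes, show_text):
    """Summarize what a run with the given flag values would do."""
    return {
        "recognizer": "Attention OCR" if use_attn_ocr else "CTC",
        "save_detection_output": show_boxes,
        "print_predictions": show_text,
    }


print(describe_run(USE_ATTN_OCR, SHOW_BOXES, SHOW_TEXT))
```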

Notes

Currently, the solution takes a long time to run because of the number of frames it runs inference on. To test on only a portion of the video, run the code for a set amount of time, comment out the line frames_list = iFRAMES(video_path) in wrapper.py, and then run the code again.
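A different way to shorten a test run is to cap the number of frames passed on to detection and recognition. This is an alternative to the procedure above, not something the repository provides: `take_first` is a hypothetical helper, and the sketch assumes the value returned by iFRAMES can be iterated like a list of frames.

```python
from itertools import islice


def take_first(frames, max_frames):
    """Return at most max_frames items from any iterable of frames."""
    if max_frames < 0:
        raise ValueError("max_frames must be non-negative")
    return list(islice(frames, max_frames))


# Possible use in wrapper.py (assumption: iFRAMES returns an iterable of frames):
# frames_list = take_first(iFRAMES(video_path), 50)
```

This only reduces inference time; if frame extraction itself is the slow step, the cap will not help.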

References

Low Power Computer Vision Challenge (LPCVC) 2020 Video Track
EAST Detector
Connectionist Temporal Classification (CTC)
Attention OCR
