update
eshoyuan committed Sep 8, 2022
1 parent 6a211b9 commit b46f3e4
Showing 10 changed files with 39 additions and 1 deletion.
36 changes: 36 additions & 0 deletions README.md
@@ -0,0 +1,36 @@
## Usage

1. Install PaddlePaddle.

You can refer to https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.6/doc/doc_en/quickstart_en.md#1-installation.

2. Install third-party libraries.

`pip install -r requirements.txt`

3. Prepare the dataset.

Unzip the archive into `./data/`. Your directory and file structure should look like this:

```
Tiktok_OCR
└───data
    │   train.txt
    │   eval.txt
    │
    ├───train_set_random
    │   │   v0d00fg10000cb9gacjc77u3gp5qggd0_0_.jpg
    │   │   ...
    │
    └───test_set_random
        │   v0d00fg10000cb9gacjc77u3gp5qggd0_2_.jpg
        │   ...
```

4. Execute the shell script.

`sh run.sh`
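
Before running the script, the layout from step 3 can be sanity-checked. The snippet below is a minimal sketch and not part of the repository: the helper name `missing_entries` is hypothetical, and it only checks that the four entries shown in the tree exist under `./data/`.

```python
from pathlib import Path
from typing import List

# Entries taken from the directory tree in step 3.
EXPECTED_ENTRIES = ["train.txt", "eval.txt", "train_set_random", "test_set_random"]


def missing_entries(data_dir: str) -> List[str]:
    """Return the expected files or folders that are absent from data_dir."""
    root = Path(data_dir)
    return [name for name in EXPECTED_ENTRIES if not (root / name).exists()]
```

Calling `missing_entries("./data")` from the `Tiktok_OCR` directory returns an empty list when the archive was unpacked as described.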


4 changes: 3 additions & 1 deletion README_cn.md
@@ -90,6 +90,8 @@ Text-Image-Augmentation: mainly the following three transformations, designed for text recognition

Pseudo Label: Test-set predictions whose confidence exceeds a given threshold are used as pseudo labels, and the model is retrained on them.
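
The pseudo-label selection can be sketched as follows. This is an illustration under stated assumptions, not the repository's code: the function names, the `(image_path, text, confidence)` tuple layout, the default threshold of 0.9, and the tab-separated `path<TAB>label` output format are all assumptions.

```python
from typing import Iterable, List, Tuple

# Assumed prediction layout: (image_path, predicted_text, confidence).
Prediction = Tuple[str, str, float]


def select_pseudo_labels(
    predictions: Iterable[Prediction], threshold: float = 0.9
) -> List[Tuple[str, str]]:
    """Keep predictions whose confidence is strictly greater than threshold."""
    return [(path, text) for path, text, conf in predictions if conf > threshold]


def format_label_lines(pairs: Iterable[Tuple[str, str]]) -> List[str]:
    """Render (path, text) pairs as tab-separated label lines (assumed format)."""
    return [f"{path}\t{text}" for path, text in pairs]
```

The selected lines would then be appended to the training label file before retraining. A higher threshold yields fewer but cleaner pseudo labels.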

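SVTR-Large: Previously I would have worried about a large model overfitting, but two months ago I came across https://www.zhihu.com/question/356398589, which suggests that larger models generally generalize better, and the results here bear that out.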

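removing STN: STN itself should not affect performance. It was removed mainly because its default size turned out to be unsuitable, and because the text in this task is fairly regular.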

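Because the server went down in the last few days, there was no time to combine Pseudo Label with SVTR-Large (removing STN). The model ensemble also contained only one SVTR-Large (removing STN) model, scoring 0.646; the others were the original SVTR-Large models scoring around 0.625. The voting result gave the final score of 0.66465.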
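
The voting step of the ensemble can be sketched as a per-image majority vote over the models' predicted strings. This is an assumption about how the vote works, not the repository's implementation: the function name `majority_vote` and the dict-per-model input are hypothetical, and ties are broken in favour of the model listed first.

```python
from collections import Counter
from typing import Dict, List


def majority_vote(model_outputs: List[Dict[str, str]]) -> Dict[str, str]:
    """Pick the most frequent predicted text per image across models.

    Ties go to the text seen first, so list the strongest model first.
    """
    votes: Dict[str, Counter] = {}
    for outputs in model_outputs:
        for image, text in outputs.items():
            votes.setdefault(image, Counter())[text] += 1
    # max() returns the first maximal key; Counter keeps insertion order.
    return {image: max(counter, key=counter.get) for image, counter in votes.items()}
```

With this tie-breaking rule, placing the 0.646 model first in the list lets it decide any image on which all models disagree.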
@@ -100,7 +102,7 @@ removing STN: STN itself should not affect performance; STN is removed here mainly because
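2. SOTA weakly supervised object localization. DDT is a fairly old algorithm. The original plan was to use PSOL, of which DDT is the first step, but DDT worked well enough that nothing else was tried. The gain here is unknown, and it may not help.
3. Domain adaptation. This probably would not help much, because the blur used in data augmentation can have a similar effect. Overall, the domain shift in this task is not severe: the final accuracy was 0.75 on the validation set versus 0.646 on the test set, which is not a large gap.
4. Hyperparameter tuning. This should matter quite a lot: the SVTR model has many tunable parameters, and experiments showed they have a large effect, but there was no chance to tune them carefully.
5. RotNet self-supervised pretraining. According to the CVPR 2021 paper [What if we only use real datasets for scene text recognition? toward scene text recognition with fewer labels](http://openaccess.thecvf.com/content/CVPR2021/html/Baek_What_if_We_Only_Use_Real_Datasets_for_Scene_Text_CVPR_2021_paper.html), this is even better than MoCo.
5. RotNet self-supervised pretraining. According to the paper [What if we only use real datasets for scene text recognition? toward scene text recognition with fewer labels](http://openaccess.thecvf.com/content/CVPR2021/html/Baek_What_if_We_Only_Use_Real_Datasets_for_Scene_Text_CVPR_2021_paper.html), this is even better than MoCo.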



Binary file removed imgs/douyu-frame-example_0da1dc04.png
Binary file not shown.
Binary file removed imgs/image-20220908204913076.png
Binary file not shown.
Binary file removed imgs/image-20220908210111319.png
Binary file not shown.
Binary file removed imgs/image-20220908212615794.png
Binary file not shown.
Binary file removed imgs/image-20220908213152888.png
Binary file not shown.
Binary file removed imgs/image-20220908213432357.png
Binary file not shown.
Binary file removed imgs/img1.png
Binary file not shown.
File renamed without changes.
