Commit c879001

v0.5.0 adaption (#813)
* v0.5.0 adaption
* ci fix
* fix doc version
1 parent 21f0713 commit c879001

36 files changed, +389 -351 lines changed

README.md

+1
@@ -46,6 +46,7 @@ mindspore versions.
 | mindocr | mindspore |
 |:-------:|:-----------:|
 | main | master |
+| 0.5 | 2.5.0 |
 | 0.4 | 2.3.0/2.3.1 |
 | 0.3 | 2.2.10 |
 | 0.1 | 1.8 |
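
For readers who want to check a pairing programmatically, the compatibility table in this hunk could be expressed as a small lookup. This is a minimal sketch and not part of the commit: the names `MINDOCR_TO_MINDSPORE`, `supported_mindspore` and `is_listed_pair` are hypothetical, and the data is transcribed from the table rows only.

```python
# Compatibility matrix transcribed from the README table in this commit.
# Keys are MindOCR release lines; values are the MindSpore versions listed for them.
MINDOCR_TO_MINDSPORE = {
    "main": ("master",),
    "0.5": ("2.5.0",),
    "0.4": ("2.3.0", "2.3.1"),
    "0.3": ("2.2.10",),
    "0.1": ("1.8",),
}


def supported_mindspore(mindocr_version: str) -> tuple:
    """Return the MindSpore versions listed for a MindOCR release line.

    Accepts "0.5", "0.5.0" or "v0.5.0"; only major.minor is used for the lookup.
    """
    key = mindocr_version.lstrip("v")
    if key != "main":
        key = ".".join(key.split(".")[:2])
    try:
        return MINDOCR_TO_MINDSPORE[key]
    except KeyError:
        raise ValueError(
            f"no MindSpore pairing listed for MindOCR {mindocr_version!r}"
        ) from None


def is_listed_pair(mindocr_version: str, mindspore_version: str) -> bool:
    """True if the pair appears in the table (hypothetical helper, exact string match)."""
    return mindspore_version in supported_mindspore(mindocr_version)
```

The lookup uses exact string matching on purpose: the table lists specific versions, and nothing in the commit says that neighbouring patch releases are supported.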

README_CN.md

+1
@@ -46,6 +46,7 @@ MindOCR是一个基于[MindSpore](https://www.mindspore.cn/) 框架开发的OCR
 | mindocr | mindspore |
 |:-------:|:-----------:|
 | main | master |
+| 0.5 | 2.5.0 |
 | 0.4 | 2.3.0/2.3.1 |
 | 0.3 | 2.2.10 |
 | 0.1 | 1.8 |

configs/cls/mobilenetv3/README.md

+16 -18
@@ -31,25 +31,11 @@ Currently we support the 0 and 180 degree classification. You can update the par

 </div>

+## Requirements

-## Results
-
-| mindspore | ascend driver | firmware | cann toolkit/kernel |
-|:---------:|:---------------:|:------------:|:-------------------:|
-| 2.3.1 | 24.1.RC2 | 7.3.0.1.231 | 8.0.RC2.beta1 |
-
-MobileNetV3 is pretrained on ImageNet. For text direction classification task, we further train MobileNetV3 on RCTW17, MTWI and LSVT datasets.
-
-Experiments are tested on ascend 910* with mindspore 2.3.1 graph mode
-<div align="center">
-
-| **model name** | **cards** | **batch size** | **img/s** | **accuracy** | **config** | **weight** |
-|----------------|-----------|----------------|-----------|--------------|-----------------------------------------------------|------------------------------------------------|
-| MobileNetV3 | 4 | 256 | 5923.5 | 94.59% | [yaml](cls_mv3.yaml) | [ckpt](https://download.mindspore.cn/toolkits/mindocr/cls/cls_mobilenetv3-92db9c58.ckpt) |
-</div>
-
-
-
+| mindspore | ascend driver | firmware | cann toolkit/kernel |
+|:----------:|:--------------:|:--------------:|:-------------------:|
+| 2.5.0 | 24.1.0 | 7.5.0.3.220 | 8.0.0.beta1 |

 ## Quick Start

@@ -128,6 +114,18 @@ Please set the checkpoint path to the arg `ckpt_load_path` in the `eval` section
 python tools/eval.py -c configs/cls/mobilenetv3/cls_mv3.yaml
 ```

+## Performance
+
+MobileNetV3 is pretrained on ImageNet. For text direction classification task, we further train MobileNetV3 on RCTW17, MTWI and LSVT datasets.
+
+Experiments are tested on ascend 910* with mindspore 2.5.0 graph mode
+<div align="center">
+
+| **model name** | **cards** | **batch size** | **img/s** | **accuracy** | **config** | **weight** |
+|----------------|-----------|----------------|-----------|--------------|-----------------------------------------------------|------------------------------------------------|
+| MobileNetV3 | 4 | 256 | 5923.5 | 94.59% | [yaml](cls_mv3.yaml) | [ckpt](https://download.mindspore.cn/toolkits/mindocr/cls/cls_mobilenetv3-92db9c58.ckpt) |
+</div>
+
 ## References

 <!--- Guideline: Citation format GB/T 7714 is suggested. -->

configs/cls/mobilenetv3/README_CN.md

+16 -15
@@ -31,22 +31,11 @@ MobileNetV3[[1](#参考文献)]于2019年发布,这个版本结合了V1的deep

 </div>

+### Requirements

-## Results
-
-| mindspore | ascend driver | firmware | cann toolkit/kernel |
-|:---------:|:---------------:|:------------:|:-------------------:|
-| 2.3.1 | 24.1.RC2 | 7.3.0.1.231 | 8.0.RC2.beta1 |
-
-MobileNetV3 is pretrained on ImageNet. In addition, we further trained it on the RCTW17, MTWI and LSVT datasets for the text direction classification task.
-
-Experimental results on ascend 910* in graph mode, with mindspore 2.3.1
-<div align="center">
-
-| **model name** | **cards** | **batch size per card** | **img/s** | **accuracy** | **config** | **weight** |
-|-------------|--------|------------|-----------|---------|----------------------|------------------------------------------------------------------------------------------|
-| MobileNetV3 | 4 | 256 | 5923.5 | 94.59% | [yaml](cls_mv3.yaml) | [ckpt](https://download.mindspore.cn/toolkits/mindocr/cls/cls_mobilenetv3-92db9c58.ckpt) |
-</div>
+| mindspore | ascend driver | firmware | cann toolkit/kernel |
+|:----------:|:--------------:|:--------------:|:-------------------:|
+| 2.5.0 | 24.1.0 | 7.5.0.3.220 | 8.0.0.beta1 |


 ## Quick Start
@@ -128,6 +117,18 @@ model:
 python tools/eval.py -c configs/cls/mobilenetv3/cls_mv3.yaml
 ```

+## Performance
+
+MobileNetV3 is pretrained on ImageNet. In addition, we further trained it on the RCTW17, MTWI and LSVT datasets for the text direction classification task.
+
+Experimental results on ascend 910* in graph mode, with mindspore 2.5.0
+<div align="center">
+
+| **model name** | **cards** | **batch size per card** | **img/s** | **accuracy** | **config** | **weight** |
+|-------------|--------|------------|-----------|---------|----------------------|------------------------------------------------------------------------------------------|
+| MobileNetV3 | 4 | 256 | 5923.5 | 94.59% | [yaml](cls_mv3.yaml) | [ckpt](https://download.mindspore.cn/toolkits/mindocr/cls/cls_mobilenetv3-92db9c58.ckpt) |
+</div>
+
 ## References

 <!--- Guideline: Citation format GB/T 7714 is suggested. -->

configs/det/dbnet/README.md

+2 -4
@@ -56,13 +56,11 @@ combination of these two modules leads to scale-robust feature fusion.
 DBNet++ performs better in detecting text instances of diverse scales, especially for large-scale text instances where
 DBNet may generate inaccurate or discrete bounding boxes.

-
 ## Requirements

 | mindspore | ascend driver | firmware | cann toolkit/kernel |
 |:----------:|:--------------:|:--------------:|:-------------------:|
-| 2.3.1 | 24.1.RC2 | 7.3.0.1.231 | 8.0.RC2.beta1 |
-
+| 2.5.0 | 24.1.0 | 7.5.0.3.220 | 8.0.0.beta1 |

 ## Quick Start

@@ -290,7 +288,7 @@ msrun --worker_num=2 --local_worker_num=2 python tools/train.py --config configs
 # Based on verification,binding cores usually results in performance acceleration.Please configure the parameters and run.
 msrun --bind_core=True --worker_num=2 --local_worker_num=2 python tools/train.py --config configs/det/dbnet/db_r50_icdar15.yaml
 ```
-**Note:** For more information about msrun configuration, please refer to [here](https://www.mindspore.cn/tutorials/experts/en/r2.3.1/parallel/msrun_launcher.html).
+**Note:** For more information about msrun configuration, please refer to [here](https://www.mindspore.cn/docs/en/master/model_train/parallel/msrun_launcher.html).

 The training result (including checkpoints, per-epoch performance and curves) will be saved in the directory parsed by the arg `ckpt_save_dir` in yaml config file. The default directory is `./tmp_det`.

configs/det/dbnet/README_CN.md

+2 -3
@@ -45,8 +45,7 @@ DBNet++在检测不同尺寸的文本方面表现更好,尤其是对于尺寸

 | mindspore | ascend driver | firmware | cann toolkit/kernel |
 |:----------:|:--------------:|:--------------:|:-------------------:|
-| 2.3.1 | 24.1.RC2 | 7.3.0.1.231 | 8.0.RC2.beta1 |
-
+| 2.5.0 | 24.1.0 | 7.5.0.3.220 | 8.0.0.beta1 |

 ## Quick Start

@@ -271,7 +270,7 @@ msrun --worker_num=2 --local_worker_num=2 python tools/train.py --config configs
 # Verified: binding cores brings a performance speedup in most cases. Please configure the parameters and run.
 msrun --bind_core=True --worker_num=2 --local_worker_num=2 python tools/train.py --config configs/det/dbnet/db_r50_icdar15.yaml
 ```
-**Note:** For more information about msrun configuration, please refer to [here](https://www.mindspore.cn/tutorials/experts/zh-CN/r2.3.1/parallel/msrun_launcher.html).
+**Note:** For more information about msrun configuration, please refer to [here](https://www.mindspore.cn/docs/zh-CN/master/model_train/parallel/msrun_launcher.html).

 The training results (including checkpoints, per-epoch performance and curves) will be saved in the path configured by the `ckpt_save_dir` parameter in the yaml config file. The default is `./tmp_det`.

configs/det/dbnet/README_CN_PP-OCRv3.md

+1 -1
@@ -338,7 +338,7 @@ msrun --worker_num=4 --local_worker_num=4 python tools/train.py --config configs
 # Verified: binding cores brings a performance speedup in most cases. Please configure the parameters and run.
 msrun --bind_core=True --worker_num=4 --local_worker_num=4 python tools/train.py --config configs/det/dbnet/db_mobilenetv3_ppocrv3.yaml
 ```
-**Note:** For more information about msrun configuration, please refer to [here](https://www.mindspore.cn/tutorials/experts/zh-CN/r2.3.1/parallel/msrun_launcher.html).
+**Note:** For more information about msrun configuration, please refer to [here](https://www.mindspore.cn/docs/zh-CN/master/model_train/parallel/msrun_launcher.html).


 * Single-card training

configs/det/east/README.md

+9 -1
@@ -29,6 +29,14 @@ EAST uses regression for the position and rotation angle of the text box, enabli
 4. **Text detection branch**:
 After determining the location and size of the text region, EAST further classifies these regions as text or non-text areas. For this purpose, a fully convolutional text branch is employed for binary classification of the text areas.

+
+## Requirements
+
+| mindspore | ascend driver | firmware | cann toolkit/kernel |
+|:----------:|:--------------:|:--------------:|:-------------------:|
+| 2.5.0 | 24.1.0 | 7.5.0.3.220 | 8.0.0.beta1 |
+
+
 ## Quick Start

 ### Installation
@@ -128,7 +136,7 @@ msrun --worker_num=8 --local_worker_num=8 python tools/train.py --config configs
 # Based on verification,binding cores usually results in performance acceleration.Please configure the parameters and run.
 msrun --bind_core=True --worker_num=8 --local_worker_num=8 python tools/train.py --config configs/det/east/east_r50_icdar15.yaml
 ```
-**Note:** For more information about msrun configuration, please refer to [here](https://www.mindspore.cn/tutorials/experts/en/r2.3.1/parallel/msrun_launcher.html).
+**Note:** For more information about msrun configuration, please refer to [here](https://www.mindspore.cn/docs/en/master/model_train/parallel/msrun_launcher.html).

 The training result (including checkpoints, per-epoch performance and curves) will be saved in the directory parsed by the arg `ckpt_save_dir` in yaml config file. The default directory is `./tmp_det`.

configs/det/east/README_CN.md

+2 -2
@@ -33,7 +33,7 @@ EAST的整体架构图如图1所示,包含以下阶段:

 | mindspore | ascend driver | firmware | cann toolkit/kernel |
 |:----------:|:--------------:|:--------------:|:-------------------:|
-| 2.3.1 | 24.1.RC2 | 7.3.0.1.231 | 8.0.RC2.beta1 |
+| 2.5.0 | 24.1.0 | 7.5.0.3.220 | 8.0.0.beta1 |

 ## Quick Start

@@ -132,7 +132,7 @@ msrun --worker_num=8 --local_worker_num=8 python tools/train.py --config configs
 # Verified: binding cores brings a performance speedup in most cases. Please configure the parameters and run.
 msrun --bind_core=True --worker_num=8 --local_worker_num=8 python tools/train.py --config configs/det/east/east_r50_icdar15.yaml
 ```
-**Note:** For more information about msrun configuration, please refer to [here](https://www.mindspore.cn/tutorials/experts/zh-CN/r2.3.1/parallel/msrun_launcher.html).
+**Note:** For more information about msrun configuration, please refer to [here](https://www.mindspore.cn/docs/zh-CN/master/model_train/parallel/msrun_launcher.html).

 The training results (including checkpoints, per-epoch performance and curves) will be saved in the path configured by the `ckpt_save_dir` parameter in the yaml config file. The default is `./tmp_det`.

configs/det/psenet/README.md

+2 -2
@@ -25,7 +25,7 @@ The overall architecture of PSENet is presented in Figure 1. It consists of mult

 | mindspore | ascend driver | firmware | cann toolkit/kernel |
 |:----------:|:--------------:|:--------------:|:-------------------:|
-| 2.3.1 | 24.1.RC2 | 7.3.0.1.231 | 8.0.RC2.beta1 |
+| 2.5.0 | 24.1.0 | 7.5.0.3.220 | 8.0.0.beta1 |

 ## Quick Start

@@ -156,7 +156,7 @@ msrun --worker_num=8 --local_worker_num=8 python tools/train.py --config configs
 msrun --bind_core=True --worker_num=8 --local_worker_num=8 python tools/train.py --config configs/det/psenet/pse_r152_icdar15.yaml

 ```
-**Note:** For more information about msrun configuration, please refer to [here](https://www.mindspore.cn/tutorials/experts/en/r2.3.1/parallel/msrun_launcher.html).
+**Note:** For more information about msrun configuration, please refer to [here](https://www.mindspore.cn/docs/en/master/model_train/parallel/msrun_launcher.html).


 The training result (including checkpoints, per-epoch performance and curves) will be saved in the directory parsed by the arg `ckpt_save_dir` in yaml config file. The default directory is `./tmp_det`.
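
The same msrun link change recurs in every detection README in this commit, in both the `en` and `zh-CN` variants. A minimal sketch of how such a rewrite could be scripted is shown here; it is not taken from the commit, and `OLD_MSRUN_LINK`, `NEW_MSRUN_LINK` and `migrate_msrun_links` are hypothetical names. Only the two URL shapes visible on the removed and added lines are assumed.

```python
import re

# Old msrun launcher link, as it appears on the removed lines:
#   https://www.mindspore.cn/tutorials/experts/<lang>/r2.3.1/parallel/msrun_launcher.html
OLD_MSRUN_LINK = re.compile(
    r"https://www\.mindspore\.cn/tutorials/experts/(?P<lang>en|zh-CN)/r2\.3\.1/"
    r"parallel/msrun_launcher\.html"
)

# Replacement taken from the added lines; the language segment is carried over.
NEW_MSRUN_LINK = (
    r"https://www.mindspore.cn/docs/\g<lang>/master/model_train/parallel/msrun_launcher.html"
)


def migrate_msrun_links(text: str) -> tuple:
    """Rewrite old msrun launcher links.

    Returns a (new_text, count) pair, where count is the number of links changed.
    """
    return OLD_MSRUN_LINK.subn(NEW_MSRUN_LINK, text)
```

Returning the substitution count makes it easy to report which files were touched, for example by reading each README, calling `migrate_msrun_links`, and writing the file back only when the count is non-zero.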

configs/det/psenet/README_CN.md

+2 -2
@@ -25,7 +25,7 @@ PSENet的整体架构图如图1所示,包含以下阶段:

 | mindspore | ascend driver | firmware | cann toolkit/kernel |
 |:----------:|:--------------:|:--------------:|:-------------------:|
-| 2.3.1 | 24.1.RC2 | 7.3.0.1.231 | 8.0.RC2.beta1 |
+| 2.5.0 | 24.1.0 | 7.5.0.3.220 | 8.0.0.beta1 |

 ## Quick Start

@@ -155,7 +155,7 @@ msrun --worker_num=8 --local_worker_num=8 python tools/train.py --config configs
 # Verified: binding cores brings a performance speedup in most cases. Please configure the parameters and run.
 msrun --bind_core=True --worker_num=8 --local_worker_num=8 python tools/train.py --config configs/det/psenet/pse_r152_icdar15.yaml
 ```
-**Note:** For more information about msrun configuration, please refer to [here](https://www.mindspore.cn/tutorials/experts/zh-CN/r2.3.1/parallel/msrun_launcher.html).
+**Note:** For more information about msrun configuration, please refer to [here](https://www.mindspore.cn/docs/zh-CN/master/model_train/parallel/msrun_launcher.html).

 The training results (including checkpoints, per-epoch performance and curves) will be saved in the path configured by the `ckpt_save_dir` parameter in the yaml config file. The default is `./tmp_det`

configs/kie/vi_layoutxlm/README.md

+19 -31
@@ -40,43 +40,20 @@ After obtaining αij from the original self-attention layer, considering the lar
 <em> Figure 1. LayoutXLM(LayoutLMv2) architecture [<a href="#References">1</a>] </em>
 </p>

-## Results
-<!--- Guideline:
-Table Format:
-- Model: model name in lower case with _ seperator.
-- Context: Training context denoted as {device}x{pieces}-{MS mode}, where mindspore mode can be G - graph mode or F - pynative mode with ms function. For example, D910x8-G is for training on 8 pieces of Ascend 910 NPU using graph mode.
-- Top-1 and Top-5: Keep 2 digits after the decimal point.
-- Params (M): # of model parameters in millions (10^6). Keep 2 digits after the decimal point
-- Recipe: Training recipe/configuration linked to a yaml config file. Use absolute url path.
-- Download: url of the pretrained model weights. Use absolute url path.
--->
-
-### Accuracy
-
-| mindspore | ascend driver | firmware | cann toolkit/kernel |
-|:---------:|:---------------:|:------------:|:-------------------:|
-| 2.3.1 | 24.1.RC2 | 7.3.0.1.231 | 8.0.RC2.beta1 |
-
-According to our experiments, the performance and accuracy evaluation([Model Evaluation](#33-Model-Evaluation)) results of training ([Model Training](#32-Model-Training)) on the XFUND Chinese dataset are as follows:
-
-
-Experiments are tested on ascend 910* with mindspore 2.3.1 graph mode
-<div align="center">
-
-| **model name** | **cards** | **batch size** | **img/s** | **hmean** | **config** | **weight** |
-|----------------|-----------|----------------|-----------|-----------|-----------------------------------------------------|------------------------------------------------|
-| LayoutXLM | 1 | 8 | 73.26 | 90.34% | [yaml](../layoutxlm/ser_layoutxlm_xfund_zh.yaml) | [ckpt](https://download.mindspore.cn/toolkits/mindocr/layoutxlm/ser_layoutxlm_base-a4ea148e.ckpt) |
-| VI-LayoutXLM | 1 | 8 | 110.6 | 93.31% | [yaml](../layoutxlm/ser_layoutxlm_xfund_zh.yaml) | [ckpt](https://download.mindspore.cn/toolkits/mindocr/layoutxlm/ser_layoutxlm_base-a4ea148e.ckpt) |
-</div>
-
+## Requirements

+| mindspore | ascend driver | firmware | cann toolkit/kernel |
+|:----------:|:--------------:|:--------------:|:-------------------:|
+| 2.5.0 | 24.1.0 | 7.5.0.3.220 | 8.0.0.beta1 |

 ## Quick Start
-### Preparation

-#### Installation
+### Installation
+
 Please refer to the [installation instruction](https://github.com/mindspore-lab/mindocr#installation) in MindOCR.

+### Dataset preparation
+
 #### Dataset Download

 [The XFUND dataset](https://github.com/doc-analysis/XFUND) is used as the experimental dataset. The XFUND dataset is a multilingual dataset proposed by Microsoft for the Knowledge-Intensive Extraction (KIE) task. It consists of seven datasets, each containing 149 training samples and 50 validation samples.
@@ -168,7 +145,18 @@ Recognition results are as shown in the image, and the image is saved as`inferen
 <em> example_ser.jpg </em>
 </p>

+## Performance
+
+According to our experiments, the performance of evaluation on the XFUND Chinese dataset are as follows:
+
+Experiments are tested on ascend 910* with mindspore 2.5.0 graph mode
+<div align="center">

+| **model name** | **cards** | **batch size** | **img/s** | **hmean** | **config** | **weight** |
+|----------------|-----------|----------------|-----------|-----------|-----------------------------------------------------|------------------------------------------------|
+| LayoutXLM | 1 | 8 | 73.26 | 90.34% | [yaml](../layoutxlm/ser_layoutxlm_xfund_zh.yaml) | [ckpt](https://download.mindspore.cn/toolkits/mindocr/layoutxlm/ser_layoutxlm_base-a4ea148e.ckpt) |
+| VI-LayoutXLM | 1 | 8 | 110.6 | 93.31% | [yaml](../layoutxlm/ser_layoutxlm_xfund_zh.yaml) | [ckpt](https://download.mindspore.cn/toolkits/mindocr/layoutxlm/ser_layoutxlm_base-a4ea148e.ckpt) |
+</div>

 ## References
 <!--- Guideline: Citation format GB/T 7714 is suggested. -->
