Open
32 commits
d98361a
add audio type explain.
zxcd Mar 3, 2025
de5954e
fiux
zxcd Mar 3, 2025
e49b58f
update meta.py (#4043)
cuicheng01 May 19, 2025
7405958
3.0 fix (#4044)
cuicheng01 May 19, 2025
e84f71d
3.0 fix (#4045)
cuicheng01 May 19, 2025
17eca7c
fix pipeline doc and update visual rst (#4051)
liuhongen1234567 May 20, 2025
1893782
Fix parallel inference docs (#4052)
Bobholamovic May 20, 2025
30e6998
update README (#4053)
cuicheng01 May 20, 2025
7e1b81c
Update README (#4055)
cuicheng01 May 20, 2025
5ea4969
add ocrv5 rec model (#4058)
zhangyubo0722 May 20, 2025
336ec38
update docker (#4047)
changdazhou May 21, 2025
da316b8
support use_textline_orientation for ppchatocrv4 (#4068)
changdazhou May 22, 2025
a1f401e
[Docs] Update parallel inference documentation (#4072)
Bobholamovic May 27, 2025
c25acfe
Update paddle2onnx version (#4074)
Bobholamovic May 23, 2025
f4e1c1b
[Docs] Fix device docs (#4063)
Bobholamovic May 21, 2025
646fd51
add PP-LCNet_x1_0_textline_ori
May 26, 2025
92d28cb
mv fonts to cache dir
changdazhou May 27, 2025
938a773
bugfix
changdazhou May 27, 2025
499ae94
Fix docs (#4095)
Bobholamovic May 28, 2025
097a803
fix RGB channel in table recognition result of PP-StructureV3
TingquanGao May 27, 2025
9f1fd2f
update pdx version from 3.0rc1 to 3.0.x in readme
TingquanGao May 27, 2025
b99f546
update pdx installation cmd
TingquanGao May 27, 2025
c70f116
[TEMP] support mkldnn block list to avoid trige error in PP-Structure…
TingquanGao May 28, 2025
7fcd89a
Fix bugs (#4083)
Bobholamovic May 29, 2025
53456fc
Update PP-ChatOCRv4 interface (#4069)
Bobholamovic May 29, 2025
92dee40
fix mkldnn (#4098)
zhangyubo0722 May 29, 2025
f868b72
Fix layout params (#4094)
Bobholamovic May 29, 2025
ae5ac2e
change '736,min' -> '64,min' (#4101)
leo-q8 May 29, 2025
5e17382
【cherry-pick】fix textline models (#4107)
zhangyubo0722 May 30, 2025
09b5459
update version (#4113)
cuicheng01 May 30, 2025
7ecfb58
update version (#4114)
cuicheng01 May 30, 2025
3c1bcdc
Merge branch 'PaddlePaddle:release/3.0' into doc
zxcd Jun 3, 2025
31 changes: 31 additions & 0 deletions .github/workflows/deploy_docs.yml
@@ -0,0 +1,31 @@
name: Develop Docs
on:
  push:
    branches: # Branches whose updates trigger a site deployment
      - release/3.0
permissions:
  contents: write
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Configure Git Credentials
        run: |
          git config user.name github-actions[bot]
          git config user.email 41898282+github-actions[bot]@users.noreply.github.com
      - uses: actions/setup-python@v5
        with:
          python-version: 3.x
      - run: echo "cache_id=$(date --utc '+%V')" >> $GITHUB_ENV
      - uses: actions/cache@v4
        with:
          key: mkdocs-material-${{ env.cache_id }}
          path: .cache
          restore-keys: |
            mkdocs-material-
      - run: pip install mike mkdocs-material jieba mkdocs-git-revision-date-localized-plugin mkdocs-git-committers-plugin-2 mkdocs-git-authors-plugin mkdocs-static-i18n mkdocs-minify-plugin
      - run: |
          git fetch origin gh-pages --depth=1
          mike deploy --push --update-aliases 3.0 latest
          mike set-default --push latest
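The cache step in the workflow above keys on `cache_id`, which `date --utc '+%V'` sets to the ISO 8601 week number, so the exact key changes once per week; on a miss, `restore-keys` lets the job fall back to the most recent cache whose key starts with `mkdocs-material-`. A minimal Python sketch of the same key derivation follows. The function name `weekly_cache_key` is hypothetical and is not part of the workflow.

```python
from datetime import datetime, timezone


def weekly_cache_key(now: datetime, prefix: str = "mkdocs-material-") -> str:
    """Mirror `date --utc '+%V'`: zero-padded ISO 8601 week number in UTC.

    `weekly_cache_key` is an illustrative helper, not something the
    workflow itself defines.
    """
    iso_week = now.astimezone(timezone.utc).isocalendar()[1]
    return f"{prefix}{iso_week:02d}"


if __name__ == "__main__":
    # Print the key the workflow would compute if it ran right now.
    print(weekly_cache_key(datetime.now(timezone.utc)))
```

Because the key carries only the week number and not the year, a key from the same week of a previous year would match again if that cache still existed.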
50 changes: 26 additions & 24 deletions README.md
@@ -35,29 +35,31 @@ PaddleX 3.0 is a low-code development tool built on the PaddlePaddle framework; it integrates

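## 📣 Recent Updates

🔥🔥 **2025.5.20: PaddleX v3.0.0 released.** Core upgrades are as follows:
- **Major capability releases:**
- **Release of the text recognition model PP-OCRv5**: OCR accuracy across all scenarios improves by 13%. A single model supports 5 text types (Simplified Chinese, Traditional Chinese, Chinese Pinyin, English, and Japanese), with marked gains on Chinese and English handwriting, vertical text, and rare characters. Try it in the [online demo](https://aistudio.baidu.com/community/app/91660/webUI?source=appCenter).
- **Release of the document parsing solution PP-StructureV3**: Strengthens layout region detection, table recognition, Chinese and English formula recognition, and multi-column reading-order restoration, and adds chart understanding. On the OmniDocBench leaderboard, PP-StructureV3 reaches SOTA on overall edit distance for both Chinese and English. Try it in the [online demo](https://aistudio.baidu.com/community/app/518494/webUI?source=appCenter).
- **Optimized PP-ChatOCRv4**: Natively supports ERNIE 4.5T; combined with the newly added PP-DocBee2, key information extraction accuracy improves by 15.7 percentage points over the previous generation. Try it in the [online demo](https://aistudio.baidu.com/community/app/518493/webUI?source=appCenter).
- **Inference optimizations:**
- The general OCR, general layout parsing v3, formula recognition, seal text recognition, and document image preprocessing pipelines support batch size > 1, so multiple pages can be processed at once.
- 17 pipelines, including general OCR and general layout parsing v3, support multi-GPU parallel inference; sample code for multi-process parallel pipeline inference has been added.

🔥🔥 **2025.4.22: PaddleX v3.0.0rc1 released.** This version fully adapts to the official PaddlePaddle 3.0 release. Core upgrades are as follows:

- **Full adaptation to new PaddlePaddle 3.0 features**: Supports compiler training, enabled by appending `-o Global.dy2st=True` to the training command. On GPUs, most models train more than 10% faster, and a few models train more than 30% faster. For inference, models are fully adapted to the PaddlePaddle 3.0 intermediate representation (PIR), giving more flexible extensibility and compatibility; the static graph model file name changes from `xxx.pdmodel` to `xxx.json`.
- **New PaddlePaddle self-developed multimodal large model for document image understanding, PP-DocBee**: On academic and internal business document understanding benchmarks, PP-DocBee reaches SOTA among models of the same parameter scale. It applies to document QA scenarios such as financial reports, research reports, contracts, manuals, and laws and regulations.
- **Full support for ONNX format models, with model format conversion via the Paddle2ONNX plugin.**
- **Upgraded high-performance inference:**
- **Added support for ONNX and OM format models:** PaddleX can intelligently select the model format as needed;
- **Expanded supported pipelines and modules:** All single-function modules and pipelines that use static graph inference can use the high-performance inference plugin to improve inference performance;
- **Support for 3 configuration methods (CLI, API, and configuration file):** Supports finer-grained configuration; users can enable and disable the high-performance inference plugin at sub-pipeline and sub-module granularity.

- **Expanded multi-hardware support:**
- **NPU: The number of models fully validated on Ascend has increased to 200. In addition, common pipelines such as general OCR, image classification, and object detection support OM model format inference, with inference speed improving by 113.8%-226.4%; inference deployment is supported on Atlas 200 and Atlas 300 series products.**
- **GCU: Enflame has been formally included in the PaddlePaddle regular release process and has completed PaddleX ecosystem adaptation. Training and inference are supported for 90 models.**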

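🔥🔥 **2025.5.20: PaddleX v3.0.0 released.** Compared with PaddleX v2.x, the core upgrades are as follows:

**Rich model library:**
- **Extensive models:** PaddleX 3.0 includes 270+ models covering image (video) classification/detection/segmentation, OCR, speech recognition, time series, and other scenarios.
- **Mature solutions:** Built on this model library, PaddleX 3.0 **provides a range of important and mature AI solutions, including general document parsing, key information extraction, document understanding, table recognition, and general image recognition.**

**Unified inference interface and rebuilt deployment capabilities:**
- **Standardized inference interface** that reduces API differences across model types, lowering the learning cost for users and speeding up enterprise adoption.
- **Multi-model composition:** complex tasks can be handled by conveniently combining different models, achieving a 1+1>2 effect.
- **Upgraded deployment: deployments of different models can be managed with unified commands, with support for multi-GPU inference and multi-GPU, multi-instance serving deployment.**

**Full adaptation to PaddlePaddle framework 3.0:**
- **Full adaptation to new PaddlePaddle 3.0 features:** Supports compiler training, enabled by appending `-o Global.dy2st=True` to the training command. On GPUs, most models train more than 10% faster, and a few models train more than 30% faster. For inference, models are fully adapted to the PaddlePaddle 3.0 intermediate representation (PIR), giving more flexible extensibility and compatibility; the static graph model file name changes from `xxx.pdmodel` to `xxx.json`.
- **Full support for ONNX format models:** Model format conversion is supported via the Paddle2ONNX plugin.

**Support for flagship capabilities:**
- **Supports PP-OCRv5's model-chaining logic along with multi-hardware inference, multi-backend inference, and serving deployment.**
- **Supports PP-StructureV3's complex serial and parallel model logic, for the first time chaining 15 models in series and in parallel into a complex multi-model pipeline. Its accuracy reaches SOTA on the OmniDocBench leaderboard.**
- **Supports PP-ChatOCRv4's large-model chaining logic; combined with ERNIE 4.5Turbo and the newly added PP-DocBee2, key information extraction accuracy improves by 15.7 percentage points over the previous generation.**

**Multi-hardware support:**
- **Overall support for training and inference on NVIDIA, Intel, Apple M-series, Kunlunxin, Ascend, Cambricon, Hygon, Enflame, and other chips.**
- **On Ascend, the number of fully adapted models reaches 200,** and the number of models supporting OM high-performance inference reaches 21. Important model solutions such as PP-OCRv5 and PP-StructureV3 are also supported.
- On Kunlunxin, important classification, detection, and OCR models (including PP-OCRv5) are supported.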

## 🔠 Explanation of Pipeline

@@ -547,7 +549,7 @@ Every PaddleX pipeline supports local **quick inference**, and some models are supported on [AI

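### 🛠️ Installation

> ❗Before installing PaddleX, make sure you have a basic **Python runtime environment** (note: Python 3.8 through Python 3.12 are currently supported). PaddleX 3.0-rc1 depends on PaddlePaddle 3.0.0 or later; make sure the versions match before use.
> ❗Before installing PaddleX, make sure you have a basic **Python runtime environment** (note: Python 3.8 through Python 3.12 are currently supported). PaddleX 3.0.x depends on PaddlePaddle 3.0.0 or later; make sure the versions match before use.

* **Install PaddlePaddle**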
```bash
@@ -565,7 +567,7 @@ python -m pip install paddlepaddle==3.0.0 -i https://www.paddlepaddle.org.cn/pac
* **Install PaddleX**

```bash
pip install paddlex[base]==3.0.0
pip install "paddlex[base]==3.0.1"
```

> ❗ For more installation methods, see the [PaddleX Installation Guide](https://paddlepaddle.github.io/PaddleX/latest/installation/installation.html)
46 changes: 22 additions & 24 deletions README_en.md
@@ -42,32 +42,30 @@ PaddleX 3.0 is a low-code development tool for AI models built on the PaddlePadd

Core upgrades are as follows:

- **Major Capability Releases:**
- **Launch of the groundbreaking text recognition model PP-OCRv5**: Achieves a 13% improvement in OCR accuracy across all scenarios. A single model now supports 5 types of text (Simplified Chinese, Traditional Chinese, Chinese Pinyin, English, and Japanese), with significant enhancements in recognizing handwritten fonts, vertical text, and rare characters in both Chinese and English. You can experience it immediately in the [online demo](https://aistudio.baidu.com/community/app/91660/webUI?source=appCenter).

- **Launch of the groundbreaking document parsing solution PP-StructureV3**: Enhanced capabilities in layout area detection, table recognition, Chinese and English formula recognition, and restoration of multi-column reading order, with added abilities for chart understanding. PP-StructureV3 achieves state-of-the-art (SOTA) levels in both Chinese and English editing distances on the OmniDocBench leaderboard. Experience it in the [online demo](https://aistudio.baidu.com/community/app/518494/webUI?source=appCenter).

- **Optimization of PP-ChatOCRv4**: Supports the Ernie 4.5T. Combined with PP-DocBee2, it shows a 15.7 percentage point improvement in key information extraction accuracy compared to the previous generation. Experience it in the [online demo](https://aistudio.baidu.com/community/app/518493/webUI?source=appCenter).
- **Rich Model Library:**
- **Extensive Model Coverage:** PaddleX 3.0 includes **270+ models**, covering diverse scenarios such as image/video classification/detection/segmentation, OCR, speech recognition, time series analysis, and more.
- **Mature Solutions:** Built on this robust model library, PaddleX 3.0 offers **critical and production-ready AI solutions**, including general document parsing, key information extraction, document understanding, table recognition, and general image recognition.

- **Inference Capability Optimization:**
- The general OCR, PP-StructureV3, formula recognition, seal text recognition, and document image preprocessing pipelines support setting batch size >1, allowing multiple pages to be processed at once.

- 17 pipelines, including general OCR and PP-StructureV3, now support multi-GPU parallel inference. Sample code for multi-process parallel inference has been added.
- **Unified Inference API & Enhanced Deployment Capabilities:**
- **Standardized Inference Interface:** Reduces API fragmentation across model types, lowering the learning curve for users and accelerating enterprise adoption.
- **Multi-Model Composition:** Complex tasks can be efficiently tackled by combining different models, achieving synergistic performance (1+1>2).
- **Upgraded Deployment:** Unified commands now manage deployments for diverse models, supporting **multi-GPU inference** and **multi-instance serving deployments**.

- **Full Compatibility with PaddlePaddle Framework 3.0:**
- **Leveraging New Paddle 3.0 Features:**
- Compiler-accelerated training: Enable by appending `-o Global.dy2st=True` to training commands. **Most GPU-based models see >10% speed gains, with some exceeding 30%.**
- Inference upgrades: Full adaptation to Paddle 3.0’s Program Intermediate Representation (PIR) enhances flexibility and compatibility. Static graph models now use `xxx.json` instead of `xxx.pdmodel`.
- **ONNX Model Support:** Seamless format conversion via the Paddle2ONNX plugin.

🔥 **2025.4.22, PaddleX v3.0.0rc1 major upgrade.** This version fully adapts to PaddlePaddle 3.0.0, with the following core upgrades:
- **Flagship Capabilities:**
- **PP-OCRv5:** Powers **multi-hardware inference, multi-backend support, and serving deployments** for this industry-leading OCR system.
- **PP-StructureV3:** Orchestrates **15+ models** in hybrid (serial/parallel) pipelines, achieving **SOTA accuracy on OmniDocBench**.
- **PP-ChatOCRv4:** Integrates with **PP-DocBee2 and ERNIE 4.5Turbo**, boosting key information extraction accuracy by **15.7 percentage points** over the previous generation.

- **Adapts to New Features of PaddlePaddle 3.0**: Supports compiler training, which can be enabled by appending `-o Global.dy2st=True` to the training command. On GPUs, the training speed of most models can be improved by over 10%, and for a few models, the improvement can exceed 30%. For inference, the models are fully adapted to PaddlePaddle 3.0's Intermediate Representation (PIR) technology, offering more flexible extensibility and compatibility. The file names for inference model have been changed from `xxx.pdmodel` to `xxx.json`.
- **Newly Added Self-developed MLLM for Document Image Understanding, PP-DocBee**: PP-DocBee has achieved SOTA performance among models with similar parameter sizes on academic and internal business scenario document understanding evaluation benchmarks. It can be applied to document QA scenarios such as financial reports, research reports, contracts, manuals, and legal regulations.
- **Full Support for ONNX Format Models, with Support for Model Format Conversion via the Paddle2ONNX Plugin.**
- **Enhanced High-Performance Inference**:
- **Added Support for ONNX and OM Format Models**: PaddleX can intelligently select the model format based on needs;
- **Expanded Supported Pipelines and Modules**: All single modules and pipelines for inference model can use the high-performance inference plugin to improve inference performance;
- **Support for 3 Configuration Methods: CLI, API, and Configuration Files**: Enables more granular configuration, allowing users to enable and disable the high-performance inference plugin at the sub-pipeline and sub-module level.

- **Expanded Multi-Hardware Support**:
- **NPU: The number of models fully validated on Ascend NPU has increased to 200. Additionally, common pipelines such as general OCR, image classification, and object detection support OM model format inference, with inference speed improvements ranging from 113.8% to 226.4%. Inference deployment is supported on Atlas 200 and Atlas 300 series products.**
- **GCU: Enflame has been officially integrated into the PaddlePaddle regular release system, completing the adaptation of the PaddleX ecosystem. Supports the training and inference of 90 models.**
- **Multi-Hardware Support:**
- **Broad Compatibility:** Training and inference supported on **NVIDIA, Intel, Apple M-series, Kunlunxin, Ascend, Cambricon, Hygon, Enflame**, and more.
- **Ascend-Optimized:** **200+ fully adapted models**, including **21 OM-accelerated inference models**, plus key solutions like PP-OCRv5 and PP-StructureV3.
- **Kunlunxin-Optimized:** Critical classification, detection, and OCR models (including PP-OCRv5) are fully supported.
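Two of the Paddle 3.0 items above are mechanical enough to sketch: appending `-o Global.dy2st=True` to a training command, and the static graph file rename from `xxx.pdmodel` to `xxx.json`. The helpers below are hypothetical and are not PaddleX APIs; the base command passed in is a placeholder, and the default file stem `inference` is an assumption about how an exported model directory is laid out.

```python
from pathlib import Path
from typing import List, Optional

COMPILER_FLAG = "Global.dy2st=True"


def with_compiler_training(train_cmd: List[str]) -> List[str]:
    """Return a copy of train_cmd with `-o Global.dy2st=True` appended once.

    Hypothetical helper: it only edits an argument list and does not
    run or validate the training command.
    """
    if COMPILER_FLAG in train_cmd:
        return list(train_cmd)
    return [*train_cmd, "-o", COMPILER_FLAG]


def static_graph_file(model_dir: Path, stem: str = "inference") -> Optional[Path]:
    """Locate the static graph file, preferring the Paddle 3.0 `.json` name.

    The stem "inference" is an assumed default; pass the real stem if
    the exported files are named differently.
    """
    for suffix in (".json", ".pdmodel"):
        candidate = model_dir / f"{stem}{suffix}"
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    # Placeholder command; substitute the actual training entry point.
    print(with_compiler_training(["python", "main.py", "-c", "config.yaml"]))
    print(static_graph_file(Path(".")))
```

Checking for `.json` first means a directory that holds both formats resolves to the newer file, which is a design choice of this sketch and not documented behaviour.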


## 🔠 Explanation of Pipeline
@@ -555,7 +553,7 @@ In addition, PaddleX provides developers with a full-process efficient model tra

### 🛠️ Installation

> ❗Before installing PaddleX, please ensure that you have a basic **Python runtime environment** (Note: Currently supports Python 3.8 to Python 3.12). The PaddleX 3.0-rc1 version depends on PaddlePaddle version 3.0.0 and above. Please make sure the version compatibility is maintained before use.
> ❗Before installing PaddleX, please ensure that you have a basic **Python runtime environment** (Note: Currently supports Python 3.8 to Python 3.12). The PaddleX 3.0.x version depends on PaddlePaddle version 3.0.0 and above. Please make sure the version compatibility is maintained before use.

* **Installing PaddlePaddle**

@@ -574,7 +572,7 @@ python -m pip install paddlepaddle==3.0.0 -i https://www.paddlepaddle.org.cn/pac
* **Installing PaddleX**

```bash
pip install paddlex[base]==3.0.0
pip install "paddlex[base]==3.0.1"
```

> ❗For more installation methods, refer to the [PaddleX Installation Guide](https://paddlepaddle.github.io/PaddleX/latest/en/installation/installation.html).
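The installation note above states two constraints: Python 3.8 to 3.12, and a PaddlePaddle version that matches the installed PaddleX. A minimal sketch for checking an environment against them follows, using only the standard library. The function names are hypothetical, and the distribution names `paddlepaddle` and `paddlex` are taken from the install commands shown in this diff.

```python
import sys
from importlib import metadata
from typing import Optional, Tuple

# Supported interpreter range as stated in the installation note.
MIN_PY, MAX_PY = (3, 8), (3, 12)


def python_supported(version: Tuple[int, int] = sys.version_info[:2]) -> bool:
    """True if (major, minor) falls within the documented 3.8 to 3.12 range."""
    return MIN_PY <= tuple(version) <= MAX_PY


def installed_version(dist: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("Python supported:", python_supported())
    for dist in ("paddlepaddle", "paddlex"):
        print(dist, installed_version(dist) or "not installed")
```

GPU builds of PaddlePaddle are published under the separate distribution name `paddlepaddle-gpu`, so a `not installed` result for `paddlepaddle` does not by itself mean the framework is missing.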