Feature Description
I would like to propose a new PaddleOCR Tool Node for Flowise.
The tool should allow users to extract text from uploaded images, PDFs, and documents directly within Flowise workflows. The extracted text can then be passed to LLMs, Agents, Memory Nodes, Vector Stores, and other Flowise components.
This would simplify document-processing workflows and eliminate the need for external OCR integrations.
Feature Category
New Node/Component
Problem Statement
Currently, users need to rely on external OCR services or custom implementations to extract text from images, PDFs, and documents before processing them with LLMs.
This adds complexity to workflows and creates dependencies on third-party tools.
A native OCR solution within Flowise would make document-processing pipelines easier to build and maintain.
Proposed Solution
Introduce a PaddleOCR Tool Node that can:
• Accept uploaded images, PDFs, and documents.
• Extract text using PaddleOCR.
• Return the extracted text as output.
• Support multiple OCR languages.
• Optionally provide OCR confidence scores.
• Optionally support GPU acceleration.
Proposed Parameters:
Required:
• Uploaded File (Image, PDF, or Document)
Optional:
• OCR Language Selection
• GPU Acceleration Toggle
• Confidence Score Output
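To make the parameter list above concrete, here is a rough sketch of how the inputs could be modeled and validated. It is written in Python only for illustration; an actual Flowise node would define its inputs in the project's own node format. The class name, field names, defaults, and the list of accepted file extensions are all assumptions, not part of any existing API.

```python
from dataclasses import dataclass
from pathlib import Path

# Hypothetical set of accepted file types; the proposal does not fix this list.
SUPPORTED_EXTENSIONS = {".png", ".jpg", ".jpeg", ".pdf"}


@dataclass(frozen=True)
class OcrNodeParams:
    """Illustrative parameter bundle for the proposed node (all names are assumptions)."""

    file_path: str                    # required: the uploaded file
    language: str = "en"              # optional: OCR language selection
    use_gpu: bool = False             # optional: GPU acceleration toggle
    include_confidence: bool = False  # optional: confidence score output

    def __post_init__(self) -> None:
        # Reject unsupported uploads early, before any OCR work starts.
        suffix = Path(self.file_path).suffix.lower()
        if suffix not in SUPPORTED_EXTENSIONS:
            raise ValueError(f"Unsupported file type: {suffix or '(none)'}")


# Only the file is required; every other option falls back to its default.
params = OcrNodeParams(file_path="receipt.JPG", language="en")
```

Keeping the optional settings as defaulted fields means a user only has to supply the file, which matches the required/optional split proposed above.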
Example Workflow:
- User uploads a receipt image or PDF invoice.
- PaddleOCR Tool extracts the text.
- Extracted text is passed to an LLM.
- The LLM identifies fields such as Merchant Name, Date, Invoice Number, and Total Amount.
- Structured information is returned to the user.
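The workflow above could be sketched roughly as follows. This is an illustration under stated assumptions, not a proposed implementation: the OCR step is replaced by hard-coded placeholder lines, `OcrLine`, `join_text`, and `build_extraction_prompt` are hypothetical helpers, and the real PaddleOCR result format is not modeled here. The sketch only shows the hand-off from OCR output to an LLM prompt, with an optional confidence threshold.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class OcrLine:
    """One recognized line of text with its confidence (hypothetical shape)."""

    text: str
    confidence: float


def join_text(lines: List[OcrLine], min_confidence: float = 0.0) -> str:
    """Keep lines at or above the threshold and join them in reading order."""
    kept = [line.text for line in lines if line.confidence >= min_confidence]
    return "\n".join(kept)


def build_extraction_prompt(ocr_text: str, fields: List[str]) -> str:
    """Build the prompt that asks the LLM for the requested fields."""
    field_list = ", ".join(fields)
    return (
        f"Extract the following fields from the document text: {field_list}.\n"
        "Return a JSON object with one key per field.\n\n"
        f"Document text:\n{ocr_text}"
    )


# Placeholder OCR output standing in for a scanned receipt (made-up values).
sample_lines = [
    OcrLine("ACME Store", 0.98),
    OcrLine("Date: 2024-01-15", 0.95),
    OcrLine("#@!%", 0.21),  # low-confidence noise, dropped by the threshold
    OcrLine("Total: 42.50", 0.97),
]

text = join_text(sample_lines, min_confidence=0.5)
prompt = build_extraction_prompt(
    text, ["Merchant Name", "Date", "Invoice Number", "Total Amount"]
)
```

In a Flowise flow, the `prompt` string would correspond to the text handed from the OCR node to the downstream LLM node.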
Mockups or References
PaddleOCR GitHub Repository:
https://github.com/PaddlePaddle/PaddleOCR
Official Documentation:
https://www.paddleocr.ai/
Example Use Cases:
• Receipt Processing
• Invoice Extraction
• Document Digitization
• OCR for RAG Pipelines
Additional Context
PaddleOCR is an open-source OCR toolkit that offers lightweight models and supports multiple languages. According to its documentation, it can handle scanned documents, tables, forms, and complex document layouts.
Adding native PaddleOCR support would greatly benefit users building AI agents, document-processing workflows, and RAG applications within Flowise.
If the core team is open to this idea, I would be happy to implement it and submit a Pull Request.