Add PaddleOCR Tool Node for Multi-Format Text Extraction (Image/PDF/Doc) #6473

@Akthar24

Feature Description

I would like to propose a new PaddleOCR Tool Node for Flowise.

The tool should allow users to extract text from uploaded images, PDFs, and documents directly within Flowise workflows. The extracted text can then be passed to LLMs, Agents, Memory Nodes, Vector Stores, and other Flowise components.

This would simplify document-processing workflows and eliminate the need for external OCR integrations.

Feature Category

New Node/Component

Problem Statement

Currently, users need to rely on external OCR services or custom implementations to extract text from images, PDFs, and documents before processing them with LLMs.

This adds complexity to workflows and creates dependencies on third-party tools.

A native OCR solution within Flowise would make document-processing pipelines easier to build and maintain.

Proposed Solution

Introduce a PaddleOCR Tool Node that can:

• Accept uploaded images, PDFs, and documents.
• Extract text using PaddleOCR.
• Return the extracted text as output.
• Support multiple OCR languages.
• Optionally provide OCR confidence scores.
• Optionally support GPU acceleration.
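To make the proposed output concrete, here is a minimal sketch of how per-line OCR results could be turned into the tool's return value, with confidence scores included only on request. The `OcrLine` and `OcrToolOutput` shapes and the `formatOcrOutput` helper are hypothetical names for illustration. They are not part of Flowise or PaddleOCR, and the assumption that scores lie in [0, 1] would need checking against the actual engine.

```typescript
// Hypothetical shapes for illustration; not Flowise or PaddleOCR APIs.
interface OcrLine {
  text: string;       // one recognized line of text
  confidence: number; // recognition score, assumed to lie in [0, 1]
}

interface OcrToolOutput {
  text: string;           // all recognized lines joined with newlines
  confidences?: number[]; // present only when the caller asks for scores
}

// Turn per-line OCR results into the value the tool would return.
function formatOcrOutput(lines: OcrLine[], includeConfidence: boolean): OcrToolOutput {
  const text = lines.map((line) => line.text).join("\n");
  if (!includeConfidence) {
    return { text };
  }
  return { text, confidences: lines.map((line) => line.confidence) };
}
```

Keeping the plain `text` field always present would let downstream nodes (LLMs, vector stores) consume the output without caring whether scores were requested.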

Proposed Parameters:

Required:
• Uploaded File (Image, PDF, or Document)

Optional:
• OCR Language Selection
• GPU Acceleration Toggle
• Confidence Score Output
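The parameters above could be validated and given defaults roughly as follows. This is a sketch under stated assumptions: every name here (`RawOcrNodeInput`, `resolveParams`, `classifyFile`) is hypothetical, the default language `"en"` is a guess, and the extension lists are placeholders, since the proposal does not enumerate the supported formats.

```typescript
// Hypothetical parameter shapes; names and defaults are assumptions.
type FileKind = "image" | "pdf" | "document";

interface RawOcrNodeInput {
  fileName?: string;           // required: the uploaded file
  language?: string;           // optional: OCR language selection
  useGpu?: boolean;            // optional: GPU acceleration toggle
  includeConfidence?: boolean; // optional: confidence score output
}

interface OcrNodeParams {
  fileName: string;
  fileKind: FileKind;
  language: string;
  useGpu: boolean;
  includeConfidence: boolean;
}

// Assumed extension lists; the proposal does not say which formats are accepted.
const IMAGE_EXTENSIONS = ["png", "jpg", "jpeg", "bmp", "tiff"];
const DOCUMENT_EXTENSIONS = ["doc", "docx"];

// Classify the upload by its file extension (case-insensitive).
function classifyFile(fileName: string): FileKind {
  const dot = fileName.lastIndexOf(".");
  const ext = dot >= 0 ? fileName.slice(dot + 1).toLowerCase() : "";
  if (ext === "pdf") return "pdf";
  if (IMAGE_EXTENSIONS.indexOf(ext) !== -1) return "image";
  if (DOCUMENT_EXTENSIONS.indexOf(ext) !== -1) return "document";
  throw new Error(`Unsupported file type: ${fileName}`);
}

// Enforce the required file and fill in defaults for the optional settings.
function resolveParams(raw: RawOcrNodeInput): OcrNodeParams {
  if (!raw.fileName) {
    throw new Error("Uploaded File is required");
  }
  return {
    fileName: raw.fileName,
    fileKind: classifyFile(raw.fileName),
    language: raw.language ?? "en",
    useGpu: raw.useGpu ?? false,
    includeConfidence: raw.includeConfidence ?? false,
  };
}
```

Defaulting GPU acceleration and confidence output to off keeps the simplest case (upload a file, get text back) free of any configuration.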

Example Workflow:

  1. User uploads a receipt image or PDF invoice.
  2. PaddleOCR Tool extracts the text.
  3. Extracted text is passed to an LLM.
  4. The LLM identifies fields such as Merchant Name, Date, Invoice Number, and Total Amount.
  5. Structured information is returned to the user.
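Steps 3 to 5 of this workflow could look something like the sketch below: one helper wraps the OCR text in an extraction prompt, and another parses the model's reply into the four fields. The prompt wording, the JSON key names, and both function names are assumptions made for illustration. The LLM call itself is left out because it depends on whichever model node the flow uses.

```typescript
// Hypothetical helpers; the prompt wording and JSON keys are assumptions.
interface InvoiceFields {
  merchantName: string | null;
  date: string | null;
  invoiceNumber: string | null;
  totalAmount: string | null;
}

const FIELD_KEYS = ["merchantName", "date", "invoiceNumber", "totalAmount"] as const;

// Step 3: wrap the OCR text in an instruction for the LLM.
function buildExtractionPrompt(ocrText: string): string {
  return [
    "Extract the following fields from the document text below.",
    `Reply with a single JSON object using exactly these keys: ${FIELD_KEYS.join(", ")}.`,
    "Use null for any field that is not present.",
    "",
    "Document text:",
    ocrText,
  ].join("\n");
}

// Steps 4-5: parse the LLM reply into structured fields.
// Missing or non-scalar values become null; numbers are kept as strings.
function parseInvoiceFields(llmReply: string): InvoiceFields {
  const parsed: unknown = JSON.parse(llmReply);
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    throw new Error("Expected a JSON object in the LLM reply");
  }
  const record = parsed as Record<string, unknown>;
  const pick = (key: string): string | null => {
    const value = record[key];
    if (typeof value === "string") return value;
    if (typeof value === "number") return String(value);
    return null;
  };
  return {
    merchantName: pick("merchantName"),
    date: pick("date"),
    invoiceNumber: pick("invoiceNumber"),
    totalAmount: pick("totalAmount"),
  };
}
```

In practice the reply may not be valid JSON, so a real node would likely need a retry or a structured-output parser around `parseInvoiceFields`.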

Mockups or References

PaddleOCR GitHub Repository:
https://github.com/PaddlePaddle/PaddleOCR

Official Documentation:
https://www.paddleocr.ai/

Example Use Cases:
• Receipt Processing
• Invoice Extraction
• Document Digitization
• OCR for RAG Pipelines

Additional Context

PaddleOCR is lightweight, open-source, highly accurate, and supports multiple languages. It can handle scanned documents, tables, forms, and complex document layouts.

Adding native PaddleOCR support would greatly benefit users building AI agents, document-processing workflows, and RAG applications within Flowise.

If the core team is open to this idea, I would be happy to implement it and submit a Pull Request.

Labels

enhancement (New feature or request)
