Native support for PDFs #682

stefan-kp · 2025-02-17T14:54:46Z

First of all, congratulations on such a great library!

Most models support PDF analysis out of the box, but if I understand correctly, smolagents currently supports only images. So the only way to use a PDF is to convert it to images first, which is a bit cumbersome. Does anyone have an idea how PDFs could be handled more easily?

Here is how I do this now:

from smolagents import CodeAgent, LiteLLMModel
from pdf2image import convert_from_path
import os

# Agent setup. The model_id is a placeholder; use any vision-capable
# model that LiteLLM supports.
model = LiteLLMModel(model_id="gpt-4o")
agent = CodeAgent(tools=[], model=model)

# Poppler path (macOS, Homebrew); adjust the version as needed
POPPLER_PATH = '/opt/homebrew/Cellar/poppler/25.02.0/bin'


def analyze_pdf(pdf_path):
    abs_pdf_path = os.path.abspath(pdf_path)

    try:
        # Convert the PDF into a list of PIL images, one per page
        pages = convert_from_path(
            abs_pdf_path,
            poppler_path=POPPLER_PATH  # Required on macOS
        )

        # Analyze all pages in a single agent run
        response = agent.run(
            f"Please analyze the {len(pages)} pages of this PDF and describe what you see in detail. "
            "Include information about text content, layout, any tables or figures, "
            "and the overall structure of the document.",
            images=pages
        )
        print(f"Analysis of {len(pages)} pages:", response)
    except Exception as e:
        print(f"Error: {e}")
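
The hard-coded, versioned poppler path in the snippet above breaks whenever Homebrew upgrades poppler. One possible way around that is sketched below. It is a minimal sketch under stated assumptions: `find_poppler_path` is a hypothetical helper (not part of smolagents or pdf2image), and the two default candidate directories assume Homebrew's unversioned `opt` links on Apple Silicon and Intel Macs. It looks for poppler's `pdftoppm` binary in those directories and otherwise falls back to whatever is on `PATH`.

```python
import os
import shutil

# Assumed default locations: Homebrew's unversioned "opt" links
# (Apple Silicon first, then Intel). Adjust for your own setup.
DEFAULT_CANDIDATES = (
    "/opt/homebrew/opt/poppler/bin",
    "/usr/local/opt/poppler/bin",
)


def find_poppler_path(candidates=DEFAULT_CANDIDATES):
    """Return a directory containing poppler's pdftoppm binary, or None.

    Hypothetical helper for illustration only.
    """
    # Check the candidate directories in order
    for directory in candidates:
        if os.path.isfile(os.path.join(directory, "pdftoppm")):
            return directory

    # Fall back to searching PATH
    exe = shutil.which("pdftoppm")
    if exe is None:
        return None
    return os.path.dirname(os.path.abspath(exe))
```

The result could then be passed as `poppler_path=find_poppler_path()` to `convert_from_path`. If the helper returns `None`, pdf2image falls back to looking for poppler on `PATH` itself.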
